Category Archives: Wikidata

A Scanner Rusty

One of my most-used WikiVerse tools is PetScan. It is a complete re-write of several other PHP-based tools, in C++ for performance reasons. PetScan has turned into the Swiss Army Knife of doing things with Wikipedia, Wikidata, and other projects. But PetScan has also developed a few issues over time. It is suffering from the […]

Batches of Rust

QuickStatements is a workhorse for Wikidata, but it has had a few problems of late. One of those is bad performance with batches. Users can submit a batch of commands to the tool, and these commands are then run on the Labs server. This mechanism has been bogged down for several reasons: Batch processing written in […]
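
By way of illustration, a minimal sketch of what such a batch of commands can look like, in the tab-separated QuickStatements syntax (fields are tab-separated in the real format, shown here with spaces; Q4115189 is the Wikidata sandbox item and the values are placeholders):

    CREATE
    LAST      Len   "A new test item"
    LAST      P31   Q5
    Q4115189  P31   Q5
    Q4115189  P569  +2000-01-01T00:00:00Z/11

CREATE starts a new item, LAST refers to the item just created, and each remaining line adds a label (Len: English label) or a statement (item, property, value) to the given item.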

Bad credentials

So there was an issue with QuickStatements on Friday. As users of that tool will know, you can run QuickStatements either from within your browser, or “in the background” from a Labs server. Originally, these “batch edits” were performed as QuickStatementsBot, mentioning the batch and the user who submitted it in the edit summary. Later, […]

The Corfu Projector

I recently spent a week on Corfu. I was amazed by the history, the culture, the traditions, and, of course, the food. I was, however, appalled by the low coverage of Corfu localities on Wikidata. While I might be biased, living in the UK where every postbox is a historic monument, a dozen or so […]

Papers on Rust

I have written before about my attempts to use Rust with MediaWiki. This post is an update on my progress. I started out writing a MediaWiki API crate to talk to MediaWiki installations from Rust. I was then pointed to a wikibase crate by Tobias Schönberg and others, to which I subsequently contributed […]
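
As a rough illustration of the kind of call such a crate wraps (not the crate's actual interface), here is a minimal Rust sketch that queries the Wikidata Action API directly, assuming reqwest = { version = "0.11", features = ["blocking", "json"] } and serde_json = "1" in Cargo.toml:

    use std::error::Error;

    fn main() -> Result<(), Box<dyn Error>> {
        // Ask the Wikidata Action API for the English label of Q42.
        let client = reqwest::blocking::Client::new();
        let resp: serde_json::Value = client
            .get("https://www.wikidata.org/w/api.php")
            .query(&[
                ("action", "wbgetentities"),
                ("ids", "Q42"),
                ("props", "labels"),
                ("languages", "en"),
                ("format", "json"),
            ])
            .send()?
            .json()?;

        // Drill into the JSON response; missing keys simply yield null.
        if let Some(label) = resp["entities"]["Q42"]["labels"]["en"]["value"].as_str() {
            println!("Q42 is labelled: {}", label);
        }
        Ok(())
    }

An API crate essentially packages this request/response handling (parameters, continuation, tokens, error handling) behind a typed interface.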

Inventory

The Cleveland Museum of Art recently released 30,000 images of art under CC-Zero (~public domain). Some of the good people on Wikimedia Commons have begun uploading them there, to be used by Wikipedia and Wikidata, among others. But how to find the relevant Wikipedia article (if there is one) or Wikidata item for such a […]

What else?

Structured Data on Commons is approaching. I have done a bit of work on converting Infoboxes into statements, that is, generating structured data. But what about using it? What could that look like? Inspired by a recent WMF blog post, I wrote a simple demo of what you might call “auto-categorisation”. You can try […]

Match point

Mix’n’match is one of my more popular tools. It contains a number of catalogs, each in turn containing hundreds or even millions of entries, that could (and often should!) have a corresponding Wikidata item. The tool offers various ways to make it easier to match an entry in a catalog to a Wikidata item. While […]

Wikipedia, Wikidata, and citations

As part of an exploratory census of citations on Wikipedia, I have generated a complete (yeah, right) list of all scientific publications cited on Wikispecies and the English and German Wikipedias. This is based on the rendered HTML of the respective articles, and tries to find DOIs, PubMed IDs, and PubMed Central IDs. The list is kept […]
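
As a hedged sketch of what the extraction step can look like (the actual tool may work differently), here is a short Rust example that pulls DOI-like strings out of rendered HTML, assuming regex = "1" in Cargo.toml:

    use regex::Regex;

    fn main() {
        // Common (imperfect) DOI pattern: "10.", a registrant code, "/", a suffix.
        let doi_re = Regex::new(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+").unwrap();

        // Stand-in for the rendered HTML of an article.
        let html = r#"<a href="https://doi.org/10.1371/journal.pone.0000217">a cited paper</a>"#;

        for m in doi_re.find_iter(html) {
            println!("found DOI: {}", m.as_str());
        }
    }

A real scan would also match PubMed and PubMed Central ID patterns and de-duplicate the results across articles.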

Judgement Day

At the dawn of Wikidata, I wrote a tool called “Terminator”. Not just because I wanted to have one of my own, but as a pun on the term “term”, used in the database table name (“wb_term”) where Wikidata labels, descriptions, and aliases are stored. The purpose of the tool is to find important (by […]