Skip to content

Category Archives: Wikidata

Papers on Rust

I have written about my attempt with Rust and MediaWiki before. This post is an update on my progress. I started out writing a MediaWiki API crate to be able to talk to MediaWiki installations from Rust. I was then pointed to a wikibase crate by Tobias Schönberg and others, to which I subsequently contributed […]

Inventory

The Cleveland Museum of Art recently released 30,000 images of art under CC-Zero (~public domain). Some of the good people on Wikimedia Commons have begun uploading them there, to be used, amongst others, by Wikipedia and Wikidata. But how to find the relevant Wikipedia article (if there is one) or Wikidata item for such a […]

What else?

Structured Data on Commons is approaching. I have done a bit of work on converting Infoboxes into statements, that is, to generate structured data. But what about using it? What could that look like? Inspired by a recent WMF blog post, I wrote a simple demo on what you might call “auto-categorisation”. You can try […]

Match point

Mix’n’match is one of my more popular tools. It contains a number of catalogs, each in turn containing hundreds or even millions of entries, that could (and often should!) have a corresponding Wikidata item. The tool offers various ways to make it easier to match an entry in a catalog to a Wikidata item. While […]

Wikipedia, Wikidata, and citations

As part of an exploratory census of citations on Wikipedia, I have generated a complete (yeah, right) list of all scientific publications cited on Wikispecies, English and German Wikipedia. This is done based on the rendered HTML of the respective articles, and tries to find DOIs, PubMed, and PubMed Central IDs. The list is kept […]

Judgement Day

At the dawn of Wikidata, I wrote a tool called “Terminator”. Not just because I wanted to have one of my own, but as a pun on the term “term”, used in the database table name (“wb_term”) where Wikidata labels, descriptions, and aliases are stored. The purpose of the tool is to find important (by […]

More topics

After my recent blog post about the TopicMatcher tool, I had quite a few conversations about the general area of “main topic”, especially relating to the plethora of scientific publications represented on Wikidata. Here’s a round-up of related things I did since: As a first attempt, I queried all subspecies items from Wikidata, searched for […]

On Topic

Wikidata already contains a lot of information about topics – people, places, concepts etc. It also contains topics that have a topic, e.g., a painting of a person, a biographical article about someone, a scientific publication about a species. Ideally, Wikidata also describes the connection between the work and the subject. Such connections can be […]

The Quickening

My QuickStatements tool has been quite popular, in both version 1 and 2. It appears to be one of the major vectors of adding large amounts of prepared information to Wikidata. All good and well, but, as will all well-used tools, some wrinkles appear over time. So, time for a do-over! It has been a […]

The File (Dis)connect

I’ll be going on about Wikidata, images, and tools. Again. You have been warned. I have written a few image-related Wikimedia tools over the years (such as FIST, WD-FIST, to name two big ones), because I believe that images in articles and Wikidata items are important, beyond their adorning effect. But despite everyone’s efforts, images […]