Category Archives: Wikidata

Everybody scrape now!

If you like Wikidata and working on lists, you probably know my Mix’n’match tool, to match entries in external catalogs to Wikidata. And if you are really into these things, you might have tried your luck with the import function, to add your own catalog. But the current import page has some drawbacks: You need […]

The flowering ORCID

As part of my Large Datasets campaign, I have now downloaded and processed the latest data from ORCID. This yielded 655,706 people (47,435 or 7% in Wikidata), and 13,438,786 publications (1,079,305 or 8% in Wikidata) with a DOI or PubMed ID (to be precise, these are publications-per-person, so the same paper might be counted multiple times; however, […]

In my last blog post “The Big Ones”, I wrote about my attempts to import large, third-party datasets, and to synchronize those with Wikidata. I have since imported three datasets (BNF, VIAF, GND), and created a status page to keep a public record of what I did, and what I am trying to do. I have run a […]

The Big Ones

Update: After fixing an import error, and cross-matching of BNF-supplied VIAF data, 18% of BNF people are matched in Wikidata. This has been corrected in the text. My mix’n’match tool holds a lot of entries from third-party catalogs – 21,795,323 at the time of writing. That’s a lot, but it doesn’t cover “the big ones” – […]

ORCID mania

ORCID is an increasingly popular service to disambiguate authors of scientific publications. Many journals and funding bodies require authors to register their ORCID ID these days. Wikidata has a property for ORCID, however, only ~2400 items have an ORCID property at the moment of writing this blog post. That is not a lot, considering Wikidata […]

Comprende!

tl;dr: I wrote a quiz interface on top of a MediaWiki/WikiBase installation. It ties together material from Wikidata, Commons, and Wikipedia, to form a new educational resource. I hope the code will eventually be taken up by a Wikimedia chapter, as part of an OER strategy. The past: There have been many attempts in the WikiVerse to […]

Mix’n’match post-mortem

So this, as they say, happened. On 2016-12-27, I received an update on a Mix’n’match catalog that someone had uploaded. That update had improved names and descriptions for the catalog. I try to avoid such updates, because I made the import function so I do not have to deal with every catalog myself, and also because […]

All your locations are belong to us

A recent push for a UK photography contest reminded me of an issue I have begrudged for quite a while. On the talk page for that contest, I pointed to several tools of mine, dealing with images and locations. But they only show aspects of those, like “Wikidata items without images”. What about the others? WDQS can show maps of […]

Livin’ on the edge

A few days ago, Lydia posted about the first prototype of the new structured data system for Commons, based on Wikidata technology. While this is just a first step, structured data for Commons seems finally within reach. And that brings home the reality of over 32 million files on Commons, all having unstructured data about them, in the […]

WDQ, obsolete?

For a few years now, I have been running the WikiData Query tool (WDQ) to provide query functionality for Wikidata. Nowadays, the (confusingly similarly named) SPARQL-based WDQS is the “official” way to query Wikidata. WDQS has been improving a lot, and while some of my tools still support WDQ, I deliberately left that option out of new […]