
On Topic

Wikidata already contains a lot of information about topics – people, places, concepts etc. It also contains items that themselves have a topic, e.g., a painting of a person, a biographical article about someone, a scientific publication about a species. Ideally, Wikidata also describes the connection between the work and its subject. Such connections can be tremendously useful in many contexts, including GLAM and scientific research.

This kind of work generally cannot be done by bots, as this would require machine-readable, reliable source data to begin with. However, manually finding items about works, then finding the matching subject item on Wikidata, is somewhat tedious. Thus, I give you TopicMatcher!

In a nutshell, I prepare a list of Wikidata items that are about creative works – paintings, biographical articles, books, scientific publications. Then, I try to guesstimate which Wikidata item they are (mainly) about. Finally, you can get one of these “work items” and its potential subjects, with buttons to connect them. At the moment, I have biographical articles looking for a “main subject”, and paintings lacking a “depicts” statement. That comes to a total of 13,531 “work items”, with 54,690 potential matches.
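To make the first step more concrete, here is a minimal sketch of how a list of work items lacking a given statement could be requested from the Wikidata query service. This is not the tool's actual query; the function name `works_missing_property` and its parameters are hypothetical, and only the Wikidata identifiers (P31 "instance of", P180 "depicts", Q3305213 "painting") are real. The sketch only builds the SPARQL text and does not send it anywhere.

```python
# Real Wikidata identifiers; everything else in this sketch is illustrative.
INSTANCE_OF = "P31"    # "instance of"
DEPICTS = "P180"       # "depicts"
PAINTING = "Q3305213"  # "painting"


def works_missing_property(work_class: str, missing_property: str, limit: int = 100) -> str:
    """Build a SPARQL query for items of `work_class` that lack `missing_property`.

    Hypothetical helper: the real candidate generation may work differently.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    lines = [
        "SELECT ?work WHERE {",
        f"  ?work wdt:{INSTANCE_OF} wd:{work_class} .",
        # Keep only works that have no statement at all for the property.
        f"  FILTER NOT EXISTS {{ ?work wdt:{missing_property} [] }}",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(works_missing_property(PAINTING, DEPICTS, limit=10))
```

Guessing the subject for each work item is the harder part, and nothing in this sketch attempts it.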

You will get the expected information about the work item and the potential matches, using my trusty AutoDesc. You also get a preview of the painting (if there is an image) and a search function. Below that is a page preview that differs with context; depending on the work item, you could get

  • a Wikisource page, for biographical articles hosted there
  • a GLAM page, if the item has a statement with an external reference that can be used to construct a URL
  • a publication page, using PMC, DOI, or PubMed IDs
  • the Wikidata page of the item, if nothing else works
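The fallback order in the list above could be sketched roughly as follows. This is an assumption-laden illustration, not the tool's code: the function `preview_url`, its arguments, and the exact priority among the first three options are hypothetical (only "Wikidata page if nothing else works" is stated above). The property IDs P932 (PMCID), P356 (DOI) and P698 (PubMed ID) are real, and the sketch assumes PMC IDs are stored without the "PMC" prefix and that GLAM URL templates use `$1` as the placeholder, as Wikidata formatter URLs do.

```python
from typing import Mapping, Optional


def preview_url(
    item_id: str,
    claims: Mapping[str, str],
    wikisource_url: Optional[str] = None,
    formatter_urls: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a preview page for a work item (hypothetical helper).

    claims maps property IDs to a single string value;
    formatter_urls maps external-ID properties to URL templates containing "$1".
    """
    # 1. A Wikisource page, if the item has one.
    if wikisource_url:
        return wikisource_url
    # 2. A GLAM page, built from an external ID and a URL template.
    for prop, template in (formatter_urls or {}).items():
        if prop in claims:
            return template.replace("$1", claims[prop])
    # 3. A publication page, using PMC, DOI, or PubMed IDs.
    if "P932" in claims:
        return f"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC{claims['P932']}/"
    if "P356" in claims:
        return f"https://doi.org/{claims['P356']}"
    if "P698" in claims:
        return f"https://pubmed.ncbi.nlm.nih.gov/{claims['P698']}/"
    # 4. The Wikidata page of the item, if nothing else works.
    return f"https://www.wikidata.org/wiki/{item_id}"
```

For example, `preview_url("Q42", {})` falls through every branch and returns the item's own Wikidata page.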

The idea of the page preview is to find more information about the work, which will allow you to determine the correct match. If there are no suggested subjects in the database, a search is performed automatically, in case new items have been created since the last update.

Once you are done with the item, you can click “Done” (which marks the work item as finished, so it is not shown again), or “Skip”, to keep the item in the pool. Either way, you will get another random item; the reward for good work is more work…
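The difference between “Done” and “Skip” can be illustrated with a small in-memory model. This is only a sketch of the described behaviour; the class `WorkPool` and its methods are hypothetical, and the real tool keeps its state in a database rather than in a Python set.

```python
import random
from typing import Iterable, Optional


class WorkPool:
    """Toy model of the pool of open work items (hypothetical)."""

    def __init__(self, work_ids: Iterable[str], seed: Optional[int] = None) -> None:
        self._open = set(work_ids)
        self._rng = random.Random(seed)

    def next_item(self) -> Optional[str]:
        """Return a random open work item, or None if the pool is empty."""
        if not self._open:
            return None
        # Sort first so that results are reproducible for a given seed.
        return self._rng.choice(sorted(self._open))

    def done(self, work_id: str) -> None:
        """Mark the item as finished; it is never shown again."""
        self._open.discard(work_id)

    def skip(self, work_id: str) -> None:
        """Leave the item in the pool; it may come up again later."""
        # Intentionally a no-op: skipping changes nothing about the pool.


pool = WorkPool(["Q1", "Q2", "Q3"], seed=1)
item = pool.next_item()
pool.done(item)  # this item will not be returned by next_item() again
```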

At the top of the page are some filtering options, if you prefer to work on a specific subset of work items. The options are a bit limited for now, but should improve when the database grows to encompass new types of works and subjects.

Alternatively, you can also look for potential works that cover a specific subject. George Washington is quite popular.

I have chosen the current candidates because they are computationally cheap and reasonably accurate to generate. However, I hope to expand to more work and subject areas over time. Scientific articles that describe species come to mind, but the queries to generate candidate matches are quite slow.

If you have ideas for queries, or just work/subject areas, or even some candidate lists, I would be happy to incorporate those into the tool!

One Comment

  1. I think a good starting point in general would be topics that have already been tagged as the topic of other works.

    “Scientific articles that describe species” (or other taxa) have been on top of my wish list for a while. Again, would be good to start with those that have already been used as “main subject” tags, and then branch out from there (e.g. sister taxa with the same rank and the same parent taxon or some such, or co-occurring with some of those starting taxa, or published by the same authors or in the same journals etc.)

    Wednesday, July 4, 2018 at 02:12 | Permalink
