Reductionism

While I do occasionally write Wikimedia tools “to order”, I wrote quite a few of them because I required (or just enjoyed) the functionality myself. One thing I like to do is add images to Wikidata, using WD-FIST. Recently, I started to focus on a specific list: people with awards (of any kind). People with awards are, in general, more likely to have an image; also, it can be satisfying to see a “job list” shrink over time. So for this one, I logged some data points:

[Image: Screen Shot 2015-06-24 at 11.24.54]

Over the last 2-3 weeks, even my sporadic use of the tool has reduced the list by 1/4 (note the plateau when Labs was offline!). Some thoughts along the way:

  • The list of item candidates is re-calculated on every page load, and is not stable. As awards are more likely to be added to than removed from items, the total list of people with awards is likely to be longer today than it was at the beginning of this exercise.
  • I cannot take credit for all of this reduction; images that others added to Wikidata independently, and that happened to land on items on this list, likewise reduce the number of items on the list.
  • Not all of the items I “dealt with” now have an image; many had their candidate images suppressed thanks to a recently implemented function. I use it when none of the Wikipedia candidate images for a person actually depict the person, but instead show a navbox icon or something associated with the person (a sculpture made by the person, a house the person lived in, etc.).
  • Many items were “dealt with” by setting a “grave image”. These seem to be surprisingly (to me at least) popular on Wikipedia, especially for people from the former Soviet Union, for some reason.
  • I skipped many items where either the item label or the image name is in non-Latin characters. Oddly enough, I can match images to items quite well if both are in the same (non-Latin) script, by visual comparison 😉
  • I also skipped many items where a candidate image shows multiple people. I tried my hand at generating cropped images for specific people with the excellent CropTool, but that remains quite slow compared to the usual WD-FIST actions. That might improve if I can find a way to pre-fill the CropTool values (e.g. “create new image with this name”).
  • Based on a gut feeling, the “low-hanging fruit” will probably run out at ~10-15K items.
  • Statues of people are a sore point for me; sometimes, I use close-ups of statues as an image of the person, when no proper image is available. I’m not sure if that is the right thing to do; it often seems to capture the likeness of the person (at least, better than “no image”), but somehow it feels like cheating…
  • There should be a “pictures of people” project somewhere, making prioritized lists of people to get an image for, and then systematically “hunting them down” (e.g. asking these people or their heirs for free images, checking other free image sources in print and online, grouping them by “likely event” where they could show up in the future, etc.).
  • I could really use some help for the “Cyrillic people”, towards the end of the list.
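
As an aside, the per-item suppression mentioned in the list above can be pictured as a simple filter. The Python below is only a minimal sketch, not WD-FIST's actual code. It assumes (going by the description in the comments below) that suppressing hides only the images suggested at that moment, and only for that one item. The function names, the in-memory set, and the example item IDs and file names are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical store: (item ID, image name) pairs that should not be suggested again.
Suppressed = Set[Tuple[str, str]]


def suppress_current(suppressed: Suppressed, item_id: str, shown_images: Iterable[str]) -> None:
    """Record every image currently suggested for one item as suppressed for that item."""
    for image in shown_images:
        suppressed.add((item_id, image))


def visible_candidates(suppressed: Suppressed, item_id: str, candidates: Iterable[str]) -> List[str]:
    """Return the candidate images for an item, minus those suppressed for that item."""
    return [image for image in candidates if (item_id, image) not in suppressed]


if __name__ == "__main__":
    suppressed: Suppressed = set()

    # The "yellow button": hide what is shown for the (made-up) item Q1 right now.
    suppress_current(suppressed, "Q1", ["Navbox icon.svg", "Sculpture.jpg"])

    # A portrait added to Wikipedia later is a new candidate, so it still shows up.
    print(visible_candidates(suppressed, "Q1", ["Navbox icon.svg", "Sculpture.jpg", "Portrait.jpg"]))

    # The same file can still be suggested for a different item.
    print(visible_candidates(suppressed, "Q2", ["Sculpture.jpg"]))
```

Keying on the (item, image) pair rather than on the item alone is what keeps an item on the list once a real portrait turns up.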

6 Comments

  1. Hsarrazin wrote:

    Hey, I myself am a fan of this tool, trying to reduce lists by nationalities. 😀

    If you need help with Cyrillic names, just leave me a message on my WD user page (under the name I used here).

    As for the new function that allows removing images that do not depict the person, does it take into account which pictures are concerned? I mean, if a real portrait is added on a WP site, will it be proposed, or will the item be permanently removed from WD-FIST?

    Wednesday, June 24, 2015 at 12:50 | Permalink
  2. Nemo @ BEIC wrote:

    Interesting!
    https://tools.wmflabs.org/fist/wdfist/?wdq=claim%5b1343:3639582%5d&depth=3&language=it&project=wikipedia&no_images_only=1&prefilled=1

    Pity that for the images I care about (for BEIC), mainly images of books, there is no automated or automatable insertion by any template in Wikimedia land, so a Wikidata property would be useless.

    I’d really love a tool like wd-fist telling me whether an image is used on a wiki but not on the corresponding page on another wiki. We have https://github.com/abartov/glamify/ but that’s a bit limited.

    Wednesday, June 24, 2015 at 13:26 | Permalink
  3. Magnus wrote:

    @Hsarrazin: It’s not specific to people, just in the context of my little “project” here. Basically, the little yellow button will prevent the images suggested at that moment for that specific item from being shown again for that item. Some of the images might be suggested for other items, and if a new image is added to Wikipedia, that will show up again.

    Wednesday, June 24, 2015 at 15:40 | Permalink
  4. Magnus wrote:

    @Nemo: I could write a tool that does what you propose, but it would generate a lot of noise: Navbox icons, multiple pictures of the same item (person) used on different wikis.

    If infoboxes generated “no image” categories, one could subset those easily, but then again, infoboxes should really use Wikidata images as a default fallback…

    Wednesday, June 24, 2015 at 15:45 | Permalink
  5. Nemo wrote:

    Would there be much noise even if the images/articles were searched starting from a category of files on Commons?

    Thursday, June 25, 2015 at 18:46 | Permalink
  6. Magnus wrote:

    @Nemo Oh you mean starting from the image/list of images, not from the Wikipedia pages/Wikidata items? Yeah, that might work. Lemme think…

    Friday, June 26, 2015 at 09:32 | Permalink