
Scaling up Wikidata editing

Wikidata continues to grow. The growth is not in the number of items (which is roughly limited by the total number of Wikipedia articles at the moment) but in labels for different languages and, of course, statements. These statements are supplied by two groups of editors: the bots, adding vast numbers of statements based on programmatically defined criteria, and the humans, painstakingly adding one statement at a time. While some statements need to be researched and take a human to investigate, basic statements (e.g. “this item is a human”) are still missing from many items. JavaScript-based tools try to ease the monotonous task of adding these by hand, but they still work one item at a time and do not really scale.

Víðarr

Through the recent addition of OAuth to all Wikimedia projects, an opportunity has presented itself to ease this burden. Wikimedians can now authorize certain tools to edit on their behalf, under their control. This keeps the responsibility for edits where it always was, in the hands of the editing user, while simplifying the mechanics of editing, uploading, or otherwise modifying a wiki. Thus, I created WiDaR, the WikiData Remote editor (and Norse deity). Widar, by himself, does not come with a user interface, except the authorization link (which you need to click to use it). It serves as a central conduit for other tools to edit Wikidata on your behalf.

AutoList, reloaded

The first tool to use Widar is the improved AutoList. If you have signed up to Widar, you can now use checkboxes on AutoList to set a property:item statement (aka “claim”) on Wikidata for up to 50 items at a time. Select the items you wish to modify, and use the “Claim!” button. Enter the property and target item numbers (e.g. 31 and 5 for P31 = “instance of” and Q5 = “human”, to mark the items as human beings) in the new dialog, click “Set claims”, and you’re done. A new window will open, representing the Widar batch operation of setting your new claims, which will take a few seconds. At the same time, the items you had selected are removed from AutoList so you can process the next batch. There is no need to wait for Widar to finish your previous batch; batches can run in parallel.
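To make the batch mechanics concrete, here is a minimal Python sketch of what one “Claim!” operation amounts to. It is not Widar's or AutoList's actual code. It assumes the claims are created through the Wikibase API action wbcreateclaim, and the helper names (chunk, claim_params) and the item ids are made up for illustration. The sketch only builds the request parameters; the edit token and the OAuth signing that a real edit needs are left out.

```python
import json
from typing import Dict, Iterator, List

BATCH_SIZE = 50  # the per-batch limit mentioned for AutoList


def chunk(items: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def claim_params(item: str, prop: int, target: int) -> Dict[str, str]:
    """Build wbcreateclaim parameters for one item-valued statement.

    Hypothetical helper: a real request also needs an edit token
    and an OAuth-signed POST, both omitted here.
    """
    return {
        "action": "wbcreateclaim",
        "entity": item,
        "property": f"P{prop}",
        "snaktype": "value",
        "value": json.dumps({"entity-type": "item", "numeric-id": target}),
        "format": "json",
    }


# 120 made-up item ids standing in for a checked AutoList selection.
selected = [f"Q{n}" for n in range(1000, 1120)]

# One list of request parameters per batch: P31 ("instance of") = Q5 ("human").
batches = [[claim_params(q, 31, 5) for q in batch] for batch in chunk(selected)]
print([len(b) for b in batches])  # [50, 50, 20]
```

Each inner list corresponds to one batch window; since the batches do not depend on each other, nothing stops them from being submitted in parallel.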

AutoList now also supports lists based on a Wikipedia category tree. That way, you can generate a “pre-selected” item list (e.g. based on [[Category:Fictional cats]]), and then set Wikidata claims on the respective items appropriately. Remember, all Wikidata edits will be attributed to your account, and tagged with the Widar tool. While it is tempting to fire-and-forget batches, please take a few seconds to check your selection, so that wrong statements are not added to Wikidata.

The narrow path?

Unsurprisingly, the potential for abuse of this type of tool has not escaped me. However, my experience with TUSC has taught me that there seems to be little vandalism that cannot be handled on-wiki with both built-in tools and third-party tools. Also, a technology like OAuth that allows for mass-edits by users also allows for mass-reverts by administrators; even if this is initially abused in some cases, proper countermeasures exist or can be developed. Potential vandalism should not be used as a “look, terrorists!” club; otherwise, Wikipedia (and Wikidata!) would not exist.
