Is it a bot? Is it a user?

So I was recently blocked on Wikidata. That was one day after I passed a quarter million edits there. These two events are related, in an odd way. I have been using one of my tools to perform some rudimentary mass-adding of information; specifically, the tool was adding “instance of:human” to all Wikidata items in the English Wikipedia category “living people”. I had been running this for over a day before (there were ~50K items missing this basic fact!), but eventually, someone got annoyed with me for flooding Recent Changes, and I was blocked when I didn’t reply on-Wiki quickly enough.

I’ve since been un-blocked, and I throttled the tool, now waiting 10 seconds between every (!) edit. No harm done, but I believe it is an early sign of a larger controversy: Was I running an “unauthorized bot”, as the message on my talk page was titled? I don’t think I was. Let me explain.
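For the curious, the throttling amounts to something like the following. This is only a minimal sketch, not the tool's actual code: `run_throttled`, `apply_edit` and the injectable `sleep` parameter are names made up for illustration, and `apply_edit` stands in for whatever actually submits one edit under the user's OAuth credentials.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_throttled(
    items: Iterable[T],
    apply_edit: Callable[[T], R],
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Apply one edit per item, pausing between consecutive edits.

    `apply_edit` is a hypothetical callback that performs a single edit.
    `sleep` is injectable so the pause can be replaced in tests.
    """
    results: List[R] = []
    for index, item in enumerate(items):
        # No pause before the first edit; wait before every later one.
        if index > 0:
            sleep(delay_seconds)
        results.append(apply_edit(item))
    return results
```

With the default delay, ~50K edits would take roughly 500,000 seconds, or close to six days, which is why I would rather not keep the throttle forever.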

Bots have been with Wikipedia and other Wikimedia projects almost since the beginning. Almost as old are complaints about bots, the best known probably being Rambot’s addition of >30K auto-generated stubs on English Wikipedia. Besides making every second article on Wikipedia about a town in the U.S. no one had ever heard of (causing exceptional dull-age in the “random page” function), it also flooded Recent Changes, which eventually led to bot policies and the bot flag, which hides bot edits from the default Recent Changes view. These days, bots account for a large share of Wikipedia editing; I seem to remember that most Wikipedia edits are actually done by bots, ranging from talk page archiving to vandal fighting.

So how was my mass-adding of information different from Rambot’s? Rambot was written to serve a very specific purpose: Construct plain-text descriptions of towns from a dataset, then add these to Wikipedia. It was run once, for that specific purpose, by its creator. Other bots, like those that automatically revert certain types of vandalism, run without any supervision at the time (which is the whole point, in that case).

Herein lies the separation: Yes, I did write the tool, and I did operate it, but in two different roles. That is, anyone with a Wikidata user name can use that tool, under their own user name, via OAuth. Also, while the tool does perform an algorithmically defined function, it is not really constrained to a purpose, as a “classic” bot would be. That alone would most likely disqualify it from getting a “bot permission” on Wikidata (unless the mood has really changed for the better there since the last time I tried). Certainly, there are overlaps between what a bot does and what my tool does; that does not justify putting the “bot” label on it, just because it’s the only label you’ve got.

To be sure, no one (as far as I know) disputed that the edits were actually correct (unlike Rambot, which added a few thousand “broken” articles initially). And the fact that ~50K Wikidata items about living people were not even “tagged” as being about people surely highlights the necessity for such edits. Certainly, no one would object to me getting a list of items that need the “instance of:human” statement, and adding it to each of them manually. All the tool does is make such editing easier and faster for me.

Now, there is the issue of me “flooding” the Recent Changes page. I do agree that this is an issue (which is why I’m throttling the tool at the moment). I have filed a bug report to address it, so I can eventually remove the throttling again. Then users, bots, and OAuth-based tools can live in harmony again.