Accountability and truth on Wikipedia and beyond
It is rare for a peer-reviewed journal article in the humanities to attract tens of thousands of views within its first weeks of publication and to catalyse involvement by Wikipedia’s Arbitration Committee. But that is just what happened this February, when Prof. Jan Grabowski and Dr Shira Klein published their explosive peer-reviewed journal article, “Wikipedia’s Intentional Distortion of the History of the Holocaust”. In their extensive study of 25 Wikipedia articles and nearly 300 talk pages, noticeboards and arbitration cases, Grabowski and Klein argued that “a group of committed Wikipedia editors have been promoting a skewed version of history on Wikipedia, one touted by right-wing Polish nationalists, which whitewashes the role of Polish society in the Holocaust and bolsters stereotypes about Jews.”
Since the essay was published, several errors and problematic sources have been removed, and a few weeks later Wikipedia’s Arbitration Committee initiated a case in response. In its recently released ruling, ArbCom subjected involved editors to topic bans that can be appealed after a year. It also introduced a “reliable sourcing restriction” in the topic area, requiring that “the use of non-academic sources requires consensus on the talk page of the article or in a discussion at the Reliable Sources Noticeboard”. The Arbitration Committee summarily dismissed claims that Klein and Grabowski’s article violated Wikipedia’s policies, but did take the unusual step of requesting that the Wikimedia Foundation produce a white paper on research ethics for academics studying Wikipedia.
While the bans were relatively severe by the Committee’s standards, it is remarkable that the arbitrators shied away from the most severe sanction available. Four of them supported a site ban on Volunteer Marek, the most aggressive and recalcitrant of the censured editors, which would have forbidden him from contributing to any Wikimedia project indefinitely. These four felt that, after a long history of previous bans, Volunteer Marek was a lost cause. The remaining seven demurred, and Volunteer Marek has been allowed to remain on the platform, albeit without the right to contribute to articles on ‘World War II in Poland and the History of Jews in Poland, broadly construed’, and without the right to revert another editor’s contribution more than once. It remains to be seen how effectively Wikipedia’s administrators can police these bans: Volunteer Marek edits Wikipedia 14 times a day.
Beyond the question of incivility (of which there was a hefty dose from multiple quarters), there seems to be widespread agreement that there were serious errors in the pages related to Polish history during WWII. Although an involved editor like Piotrus agrees that “Wikipedia’s coverage of this topic area has some errors and biases”, he has vehemently objected to the way he is represented in Grabowski and Klein’s essay. He is no Holocaust denier, he writes, and his “occasional errors” were neither intentional nor made in bad faith. Connecting editors’ usernames to their real names and naming their employers, as Grabowski and Klein did in their essay, is, according to Piotrus, unethical.
This case demonstrates yet again the high stakes of Wikipedia editing. It also raises important questions about the accountability of Wikipedia and of the researchers studying it. Is it enough that Wikipedia (via ArbCom) can govern only the behaviour and conduct of editors? Are these checks and balances enough to keep Wikipedia free from disinformation? Or is Shira Klein right when, responding to the ArbCom ruling, she wrote that “by avoiding the issue of historical truth and focusing on civility, Wikipedia sent a clear message: ‘There’s no problem with falsifying the past; just be nice about it.’”?
In the conclusion of their essay, Grabowski and Klein write that they had recommended that the Wikimedia Foundation look into the case of Polish-Jewish history, as it had done with the Croatian Wikipedia, but the WMF responded that it would only step in if editors asked for support or if a community did not have “sufficient oversight procedures to address a matter”. Grabowski and Klein felt they needed to name editors because those editors authored content that was untrue and potentially harmful, and because they lack confidence in Wikipedia’s ability to provide redress when its highest body (ArbCom) enforces only conduct rather than truth.
ArbCom was asked to intervene in the same topic area in 2019. It instituted a number of topic and interaction bans, as well as the same reliable-sources remedy it reinstituted in the latest case. The question is whether ArbCom’s remedies are working to keep Wikipedia free of the errors that creep into “hot” areas, which encourage incivility and drive out balanced views. And if not, what should the Wikimedia Foundation, the Wikipedia community and researchers of Wikipedia do about it? These are still open questions, and we know of no research that has interrogated whether, in cases of significant real-world ideological disputes, Wikipedia’s enforcement measures work in the long term. This, it seems, is a question ripe for study in the context of larger efforts to regulate platforms that propagate misinformation and other harmful content. Wikipedia has a number of excellent mechanisms for keeping out harmful content, but it is by no means perfect, as this case shows us yet again.
wikihistories chief investigator
wikihistories book club
This week, the wikihistories team read Katy Weathington and Jed R. Brubaker’s open access article, “Queer Identities, Normative Databases: Challenges to Capturing Queerness On Wikidata,” Proceedings of the ACM on Human-Computer Interaction 7, no. CSCW1 (2023): 84:1-84:26, https://doi.org/10.1145/3579517.
Tamson’s response: Wikidata in the history of sexual categorisation
Gender, for historians, is a socially and culturally constructed phenomenon. It is not neutral or natural, but something that has a history and a politics. In a book first published in 1990, the historian Thomas Laqueur argued that binary understandings of sex came to dominate our understanding of the body only in the eighteenth century. This was part of a shift away from a single “one-sex” model of anatomy, in which women were believed to share the same basic body as men (albeit an imperfect version of it), their lack of heat meaning that their genitals were held inside rather than turned out: the vagina was an inverted penis. The change in thinking was not only a result of new scientific and medical knowledge, suggested Laqueur, but also a way to establish a clear and rigid hierarchy between the sexes. It cannot be mere coincidence that this was the same period in which the Encyclopédie emerged. Edited initially by Denis Diderot, it was published in France between 1751 and 1772 and aimed to achieve a comprehensiveness of the kind identified by Hilary Anne Clark. It also relied on new forms of categorisation, with its famous introduction by D’Alembert presenting a taxonomy of human knowledge that displaced the existing primacy given to religion and theology with one focused on reason.
In their article, “Queer Identities, Normative Databases”, Katy Weathington and Jed R. Brubaker examine the limitations of open collaborative databases such as Wikidata in representing queer identities. “The queer conceptualization of gender”, they posit, “is inherently poorly suited to, and in some cases incapable of, being transformed into data.” And yet queer identities are transformed into data on a daily basis, as the debates and case studies Weathington and Brubaker discuss so clearly show. Individuals are assigned sex and gender categories which may or may not map onto their stated or felt preference, and queer people almost exclusively carry the burden of having a sexual orientation recorded. According to the authors, although 86.7% of the population identifies as heterosexual, only 3% are identified as such on Wikidata. This has significant effects. Not only do “the items, categories, and properties that are codified in a database structure exert informational power on subjects, which is especially noticeable for queer people”, but “queer individuals and communities can become bound to the structures and schema of databases they interact with (sometimes unknowingly or unwillingly)”. Resisting and contesting these structures has been part of queer theory and activism since its inception as a political and intellectual project.
While Weathington and Brubaker propose some avenues for action, the implications of their study are significant for those interested in Wikipedia and its sister project, Wikidata. In refusing structured gender categories, queer identities present a fundamental challenge to the epistemic logic of the open, collaborative and highly structured database. “What does it mean,” ask the authors, “for the database, a bastion of normative standardization, to be queered, to resist its own normative impulses?” Would it, as they half speculate, become something that is “too big to be read by a human, and too inconsistent to be processed by a computer”? Or would it – as the Encyclopédie once did – institute a new regime of categorisation even as it upended the old? In Laqueur’s story, despite the new post-eighteenth-century prominence of the “two-sex” model of difference, the “one-sex” plot did not totally disappear, and the two plots came to overlap, with the distinction between them particularly indistinct in the thinking of Freud. The notion of the “stable body”, argues Laqueur, “that seems to lie at the basis of modern notions of sexual difference is also the product of particular, historical, cultural moments.” It comes in and out of focus to serve specific cultural and social ends. Which perhaps might lead us to ask of Wikidata: what work is standardisation doing, and for whom, at this moment in time?
wikihistories chief investigator
Michael’s response: towards queer computing
Weathington and Brubaker paint a dismal picture of gender identity in Wikidata. They make the case that the Wikimedians who maintain the database have neither the knowledge nor the tools at their disposal to portray queer people in an appropriate way. They give two reasons for this failure.
The first reason is ideology. A particular group of Wikimedians with passionately held beliefs about the biological determination of sex have successfully resisted calls to open up the database schema, while a broader ideology of ‘archival completeness’ has limited the ability of trans people to determine their own identity online. Tamson writes that such biological determinism and the dream of archival completeness have a long and intertwined history. Wikidata is hardly the first attempt to categorise people according to a two-sex model of the body.
The second reason is technology. According to Weathington and Brubaker, databases cannot be queer. Queerness is anti-categorical. To queer something is to break it apart, to deconstruct it, to open it up, to release and unfold it in ways that evade categorisation. But databases, argue Weathington and Brubaker, ‘inherently traffic in standardized, analyzable data’ (p. 18). To queer a database would be to destroy it: ‘If there is no longer the core assumption of a consistent organizational structure, the database loses all practical value.’ (p. 21)
Weathington and Brubaker make a reasonable case that Wikidata currently has a strict and decidedly ‘normative’ categorical scheme. But is it true that databases are inevitably rigid? Is it true that ‘queerness’ and ‘scalability’ can never come together? Must all computation forever be straight?
I am not so sure. Fifty years ago, Marvin Minsky wrote that ‘programming is a good medium for expressing poorly understood and sloppily formulated ideas’. Around the same time, John McCarthy invented the ‘amb’ operator, which specifically allows the computer to be ambiguous: I think the truth must be one of these possibilities, but I do not know which one. There is in fact a rich history of nondeterministic computing, which continues into the present. Wikidata’s key competitor is not another strictly categorised knowledge graph but a vast neural network that categorises nothing: ChatGPT models the world as a vast array of continuous numbers that are impossible for a human to interpret as an ordered tree of discrete categories. The fact that ChatGPT is also prone to cisheteronormativity indicates that the problem may lie not with the technology, but with its constitution into a socio-technical system in a particular place and time.
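McCarthy’s ‘amb’ is conventionally realised as a backtracking search over declared possibilities. A minimal Python sketch of the idea (the names `amb`, `solve` and the example constraint are our own illustrations, not McCarthy’s notation): the program declares several possible values and leaves it to the machine to find a combination that satisfies a constraint.

```python
from itertools import product

def amb(*choices):
    """Declare a set of possibilities without committing to any one of them."""
    return list(choices)

def solve(choice_lists, constraint):
    """Search every combination of declared possibilities and return the
    first one satisfying the constraint, or None if none does."""
    for combo in product(*choice_lists):
        if constraint(*combo):
            return combo
    return None

# 'The truth must be one of these, but I do not know which one':
# find an x and y, from the declared possibilities, whose product is 12.
result = solve([amb(1, 2, 3, 4), amb(3, 6, 9, 12)],
               lambda x, y: x * y == 12)
# result → (1, 12)
```

The point of the sketch is that ambiguity is stated up front and resolved only when (and if) a constraint forces it, which is the seed of the nondeterministic tradition mentioned above.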
The hopeful question I would draw from Weathington and Brubaker’s discussion is this: how can computing be queered? It is hard to imagine how Wikidata could be queered. As Weathington and Brubaker suggest, the only likely solution for Wikidata in the long run is for queer people to accept some compromise with cisheteronormative users of the platform, to enable some measure of reform.
But underneath Wikidata, the Wikibase software already contains the seeds for another vision of data. Wikibase itself does not require that claims or properties or vocabularies be limited to some strictly normalized set. Wikibase does not impose the logic of ‘relational schemas’ and ‘scalability’ that Weathington and Brubaker observe at work in Wikidata itself. Indeed, part of the point of Wikibase is to abstract away the underlying ‘relational schema’, allowing users to create flexible, schemaless data on top of it. Wikibase does not require that only 102 people on earth have the authority to decide the properties of things (p. 7). There is no reason in principle why every individual in a Wikibase instance couldn’t have their own unique set of properties, which it is the task of a human, or a nondeterministic heuristic algorithm, to try and understand.
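To make the kind of flexibility at stake concrete, here is a minimal Python sketch of a Wikibase-style data model. The property labels, helper functions and example people are illustrative inventions, not real Wikidata properties: the point is simply that an item is a bag of property–value statements, optionally qualified, and nothing obliges two items to share a schema.

```python
def make_item(label):
    # An item is just a label plus an open-ended list of statements.
    return {"label": label, "statements": []}

def add_statement(item, prop, value, qualifiers=None):
    # Any property, any value, with optional qualifiers; no fixed schema.
    item["statements"].append({
        "property": prop,
        "value": value,
        "qualifiers": qualifiers or {},
    })

alice = make_item("Alice Example")
add_statement(alice, "gender identity", "genderfluid",
              qualifiers={"as of": "2023", "self-described": True})
add_statement(alice, "pronouns", ["she/her", "they/them"])

bob = make_item("Bob Example")
add_statement(bob, "occupation", "archivist")

# Each item carries only the statements that make sense for it;
# Alice and Bob need not share a single property.
```

Under this reading, the rigidity Weathington and Brubaker diagnose lives in Wikidata’s community-enforced vocabulary, not in the statement model itself.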
If we abandon technological determinism (again), then perhaps we can dream of a queer computer. If we can’t, then it is hard to be hopeful about our apparently inevitable digital future.
wikihistories chief investigator
If you’d like to take part in July’s book club by contributing 200–500 words on our chosen piece, send us an email! We’ll next be reading the classic “Can History Be Open Source? Wikipedia and the Future of the Past” by the late Professor Roy Rosenzweig.