Cross-cultural neutrality about controversy? The Wikipedia case

The Tower of Babel - Lucas van Valckenborch (1595)

The project

Hi! I’m Marta, and my thesis project is a cross-linguistic analysis of how a controversial topic is treated and framed in different editions of Wikipedia. The aim of the research is to discover whether the debate around a controversial page coincides or differs across languages, and therefore how strongly the culture associated with each language influences the content that is produced. The final output of the project is a tool (a set of visualizations) that lets readers visually compare several aspects of the discussion built around a chosen controversial theme and increases their awareness of its different cultural interpretations.

Cross-linguistic Wikipedia: NPOV and cultural background

Wikipedia, as an online encyclopedia whose contents are produced and updated entirely by users, may be considered and studied as a mirror of modern society. One of the most important requirements in producing content is maintaining a neutral point of view (NPOV) that avoids personal opinion, a difficult goal when articles are written collaboratively.

Wikipedia is currently available in 288 language editions (April 2015), and each edition is organized and developed separately from the others, based on the cultural predispositions of the countries it gives voice to. The neutrality of articles should give content an objective and universal value that goes beyond the origins and ideologies of individual editors. However, even when an article is written in accordance with the NPOV policy, there are cases where the definition of the topic can’t be separated from the way different cultural backgrounds regard it.

The aim of my thesis is to verify whether the debate around a controversial topic coincides or differs across language editions, and therefore how strongly the culture of origin influences the generation and treatment of the content. The output of the project is a tool for comparing, across editions, various aspects of the debate on a chosen controversial topic.

Bibliography of cross-linguistic studies about Wikipedia

Many studies have made Wikipedia their object of research, analyzing the pros and cons of this collaboratively generated archive of knowledge. Only recently has the cross-linguistic aspect of Wikipedia gained interest in the sociological community, and this has led to a proliferation of studies, even though comparative research is still limited.

The studies analyzing the cross-cultural aspect of Wikipedia can essentially be divided into two groups (Bao, Hecht et al., 2012): projects studying multilingual behaviour on Wikipedia and projects that attempt to equalize information between the editions. But how different are the approaches of communities in generating shared knowledge? Is it possible to speak of a “global consensus” (Hecht & Gergle, 2010) in a particular encyclopedic context such as Wikipedia?

A detailed analysis of the existing research projects on cross-cultural Wikipedia made it possible to identify four recurrent thematic areas of study, defined by the aspects examined:


Semantic analysis / text similarity

The primary aim of many studies is to see whether different language communities tend to describe the same topic in a similar manner. The method used for this purpose is the extraction and comparison of concepts or entities, identifiable through article links (Bao, Hecht et al., 2012; Hecht & Gergle, 2010; Massa & Scrinzi, 2011), followed by a comparison of how they are used (Adafre & De Rijke, 2006). This kind of study can bring out the stereotypes that one nation or culture holds about another and show whether the external view of a culture coincides with that culture’s view of itself (Laufer, Flöck et al., 2014), or it can be used to observe the cultural interpretation of, and level of involvement in, a particular topic (Kumar, Coggins et al., 2013; Otterbacher, 2014).
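To make the idea of comparing extracted concepts more concrete, here is a minimal sketch in Python. It is not the method of any of the cited studies: it simply assumes that the concepts linked from an article have already been extracted for each language edition and mapped to shared identifiers, and it measures their overlap with the Jaccard index. The function names and the example data are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of concepts two editions have in common (0 = none, 1 = identical)."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def compare_editions(concepts_by_lang: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Jaccard similarity for every pair of language editions."""
    return {
        (l1, l2): jaccard(concepts_by_lang[l1], concepts_by_lang[l2])
        for l1, l2 in combinations(sorted(concepts_by_lang), 2)
    }


# Made-up concept sets for one hypothetical article, for illustration only.
# Assumption: concepts are already aligned across languages (same identifier
# means same concept).
example = {
    "en": {"history", "law", "religion", "ethics"},
    "it": {"history", "law", "religion"},
    "de": {"history", "medicine"},
}

for pair, score in compare_editions(example).items():
    print(pair, round(score, 2))
```

A score close to 1 would suggest that two editions describe the topic through largely the same concepts, while a score close to 0 would point to different framings. Aligning the concepts across languages is the hard part, and it is left out of this sketch.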

Information quality / equality

The quality of collaboratively produced information is one of the most common doubts raised about Wikipedia. Studies that address this problem through cross-linguistic analysis try to determine quality indicators for each community, such as the amount of information provided (Adar, Skinner et al., 2009; Callahan & Herring, 2011) or the development of the category structure (Hammwöhner, 2007), and compare them between editions. Since Wikipedia’s content is produced by culturally divergent communities, a certain dissonance can emerge in interests, quality criteria and definitions of neutrality, which do not always coincide between editions (Laufer, Flöck et al., 2014; Stvilia, Al-Faraj et al., 2009; Crisostomo, 2012; Rogers & Sendijarevic, 2012).

Policy and regulation

Each edition is self-managed, not necessarily in connection with the others, and this is reflected in both the definition and the application of standards. The pages that explain the rules and characteristics of the platform, such as those on the production of content by bots (Livingstone, 2014) and the maintenance of a neutral point of view (Callahan, 2014), show a strong connection with the cultural background of the community behind the content and follow its real-life behaviours and conduct.

Culture and behavior

Other research focuses instead on a qualitative analysis of the different ways of managing the production of content and of the priority the community gives to certain issues. Collaboration between users follows different rules of conduct: attitudes of courtesy, the management of disagreements or conflicts and the attention paid to the accuracy of content reflect the culture and customs of the community (Hara, Shachaf et al., 2010). Many studies have noted similar conduct among nations with similar scores on Hofstede’s “cultural dimensions”: power distance, individualism versus collectivism, masculinity versus femininity, and uncertainty avoidance (Pfeil, Zaphiris et al., 2006; Nemoto & Gloor, 2011). Cultural background also determines the tendency to concentrate attention, and therefore discussion, on certain issues rather than others (Yasseri, Spoerri et al., 2014): topics “close to home” tend to receive more interest (Otterbacher, 2014) and, in cross-cultural discussions, the cultures involved are determined to impose their own version on the others (Rogers & Sendijarevic, 2012; Callahan, 2014).


Some of these studies have produced a tool as their final output: these tools make it possible to compare articles or threads from various editions and display their overlaps and discrepancies visually. They break down language barriers, allowing cross-cultural consultation of different topics and offering users a complete overview of the information. They also help Wikipedia readers build a greater awareness of the cultural differences that exist in society. Here is a brief description of the existing tools (including those still in development):

Manypedia

An online tool that compares two language versions of the same article side by side. Manypedia extracts some basic data from both pages, such as the most frequent words, the total number of edits and editors, the creation date of the article, the date of the last edit and the most active editors, enabling the user to compare the salient features of the two articles.

Omnipedia

With this tool the reader can compare 25 language editions of Wikipedia simultaneously, highlighting similarities and differences. Information is aligned to give a preview of how concepts are used in each language, removing the obstacles posed by language barriers. The entities (links) related to the most discussed topics are extracted and turned into an interactive visualization that shows which topics are mentioned by which language editions and the sentences of the page where they appear.

SearchCrystal

A set of visualizations that displays the differences and overlaps between the 100 most discussed articles identified in 10 language editions of Wikipedia, in relation to the number of edits.

Wikipedia Cross-lingual image analysis

Given the URL of a Wikipedia article, the tool finds all the languages in which the article is available and collects the images present in the pages. The output is a panoramic view of the topic through images, which provides a cultural interpretation of the topic for each language.

Terra incognita

Terra incognita collects and maps the articles with a geographic location in more than 50 language editions. The maps highlight cultural prejudices, unexpected areas of focus, overlaps between the geographies of each language and how the focus of linguistic communities has been shaped over time. The project allows different features to be compared between languages through filters: language (each language has an assigned colour), intersection (shows the location of articles appearing in several editions), translated articles, and links (highlights areas whose articles have been translated into several languages).

Open questions

The common aim of these studies is to discover whether there is a meeting point between cultures and their points of view as they are told on Wikipedia. The technical analysis of terms combined with social studies helps to achieve this goal, but what kind of horizontal research and comparison should be carried out to dig deeper into a cross-cultural controversy? Is there other research available that introduces new aspects of cross-cultural analysis?


Adafre S.F. and De Rijke M., Finding Similar Sentences across Multiple Languages in Wikipedia, 2006

Adar E., Skinner M. and Weld D.S., Information Arbitrage Across Multi-lingual Wikipedia, 2009

Bao P., Hecht B., Carton S., Quaderi M., Horn M. and Gergle D., Omnipedia: Bridging the Wikipedia Language Gap, 2012

Callahan E. and Herring S.C., Cultural Bias in Wikipedia Content on Famous Persons, 2011

Callahan E., Cross-linguistic neutrality: Wikipedia’s neutral point of view from a global perspective, 2014

Crisostomo A., Examining perspectives on “Abortion” within the EU through Wikipedia, 2012

Hammwöhner R., Interlingual aspects of Wikipedia’s quality, 2007

Hara N., Shachaf P. and Hew K., Cross-cultural analysis of the Wikipedia community, 2010

Hecht B. and Gergle D., Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories, 2009

Hecht B. and Gergle D., The Tower of Babel Meets Web 2.0, 2010

Kumar S., Coggins G., Mc Monagle S., Schlögl S., Liao H.T., Stevenson M., Bardelli F. and Ben-David A., Cross-lingual Art Spaces on Wikipedia, 2013

Laufer P., Flöck F., Wagner C. and Strohmaier M., Mining cross-cultural relations from Wikipedia – A study of 31 European food cultures, 2014

Livingstone R., Immaterial editors: Bots and bot policies across global Wikipedia, 2014

Massa P. and Scrinzi F., Exploring Linguistic Points of View of Wikipedia, 2011

Massa P. and Zelenkauskaite A., Gender gap in Wikipedia editing – A cross language comparison, 2014

Nemoto K. and Gloor P.A., Analyzing Cultural Differences in Collaborative Innovation Networks by Analyzing Editing Behavior in Different-Language Wikipedias, 2011

Otterbacher J., Our News? Their Events?, 2014

Pfeil U., Zaphiris P. and Ang C.S., Cultural Differences in Collaborative Authoring of Wikipedia, 2006

Rogers R. and Sendijarevic E., Neutral or National Point of View? A Comparison of Srebrenica articles across Wikipedia’s language versions, 2012

Stvilia B., Al-Faraj A. and Yi Y.J., Issues of cross-contextual information quality evaluation: The case of Arabic, English, and Korean Wikipedias, 2009

Yasseri T., Spoerri A., Graham M. and Kertész J., The most controversial topics in Wikipedia: A multilingual and geographical analysis, 2014
