Behind the scenes: visualizing debates on Wikipedia

How is consensus reached on Wikipedia? During the 2013 DensityDesign course, a group of students analyzed the different positions on abortion as a family planning method.

To identify how people with different positions interact, part of their work focused on the Italian Wikipedia page “Dibattito sull’Aborto” (Abortion Debate). Wikipedia is a place where knowledge is built through the collaboration of many contributors who don’t necessarily share the same point of view: the result is neutral content built through negotiation.

The video below was part of the final keynote presenting their project, “Unborn discussion”, and is an animated synthesis of the visual report they produced.

We found the way our students analyzed and presented the topic really effective, so we asked them to explain the design process they used both to analyze and to visualize it.
The project was carried out by Alberto Barone, Maria Luisa Bertazzoni, Martina Elisa Cecchi, Elisabetta Ghezzi and Alberto Grammatico.

Observation and confusion – First phase

By searching the page’s revision history on Wikipedia, we collected the whole pool of changes and took it into account for the processing phase. The first step consisted in excluding the changes made by wiki-bots, the spelling corrections and the formatting changes, as they did not alter the meaning of the page: this left 147 relevant changes out of the original 289. These changes were already described by year, month, day, hour, number of bytes (added or removed) and author. We then investigated the changes with a high number of bytes added or removed, along with their principal authors. The first striking observation was that these big changes were not accompanied by other related changes in the preceding or following days, and that their authors were seldom the same. Furthermore, the authors of these big changes had not been contributing since the birth of the page: it soon became clear that focusing on a single author or a single change was not the right strategy to find a way through this maze.
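As a rough illustration of this filtering step, here is a minimal Python sketch. The students worked by hand and in spreadsheets, so this is not their actual procedure: the `Revision` fields, the `kind` labels and the sample rows are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    timestamp: str   # date and time of the edit
    author: str      # user name or IP address
    byte_delta: int  # bytes added (positive) or removed (negative)
    kind: str        # hypothetical manual label: "content", "bot", "spelling", "format"


# Labels treated as not altering the meaning of the page (assumed names).
NON_SEMANTIC = {"bot", "spelling", "format"}


def relevant_changes(revisions):
    """Keep only the revisions that alter the meaning of the page."""
    return [r for r in revisions if r.kind not in NON_SEMANTIC]


def largest_changes(revisions, n=3):
    """Return the n revisions with the largest absolute byte change."""
    return sorted(revisions, key=lambda r: abs(r.byte_delta), reverse=True)[:n]


# Made-up sample rows, for illustration only.
history = [
    Revision("2006-03-01T10:00", "UserA", 1200, "content"),
    Revision("2006-03-01T10:05", "SomeBot", 12, "bot"),
    Revision("2006-03-02T18:30", "192.0.2.1", -300, "content"),
    Revision("2006-03-04T09:15", "UserB", 4, "spelling"),
]

kept = relevant_changes(history)
print(len(kept), "relevant changes out of", len(history))
```

Sorting by absolute byte change, as `largest_changes` does, is one simple way to surface the "big" edits mentioned above, whether they add or remove content.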

Img. 01 First bar chart of edits in time

Understanding and creating a method – Second phase

Interestingly, in the first three years of the page, from 2006 to 2008, there were more edits and more large changes (in bytes) than in the other years: the focus of the analysis therefore shifted from the author of the edit to the number of edits and their size in bytes. The first part of the page analyzed in this way was the page index. Since the page was started, the index has been modified four times: the first three times only some paragraphs were removed or changed, while the fourth time it was completely rearranged.

The best way to analyze the page changes was then to concentrate on a single paragraph and compare its edits and orientation through time. Following this method, it was easier for us to compare the edits made to the same paragraph, to see which words or sentences were changed and how each change affected the orientation of the page. The orientation of each change was classified as pro life, neutral/pro life, neutral, neutral/pro choice or pro choice, according to the meaning the change gave to the remaining text. Once the procedure was established, every paragraph from 2006 to 2012 was rated and analyzed in this same way.

This method revealed that changes involving a high number of bytes, such as the rewriting of a whole paragraph, happened mostly as the result of a debate that eventually converged on a version shared by the community. As for small edits, we saw a lot of continuous adding and removing of small amounts of bytes: these are called “edit wars”, and they mirror the diversity of the points of view. The interesting point was that an edit war was often made of highly oriented changes, but its result was frequently a neutral final edit: this confirms the negotiated development that we hypothesized as the basis of the growth of Wikipedia pages.

The last phenomenon we could observe went beyond edits and edit wars: the disruptive attitude of some users, who added off-topic comments and insults.
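The back-and-forth of small edits described above was identified by reading the page history, but the pattern can also be sketched in code. The following Python example is only an illustration under stated assumptions: the function name, the 200-byte threshold and the minimum run length of three edits are arbitrary choices, not values taken from the project.

```python
def find_edit_wars(byte_deltas, max_size=200, min_length=3):
    """Find runs of small edits that alternate between adding and removing bytes.

    byte_deltas: chronological list of byte changes for one paragraph.
    Returns a list of (start, end) index pairs, with end exclusive.
    max_size and min_length are assumed thresholds, to be tuned by hand.
    """
    runs = []
    start = 0
    for i in range(1, len(byte_deltas) + 1):
        continues = (
            i < len(byte_deltas)
            and abs(byte_deltas[i]) <= max_size
            and abs(byte_deltas[i - 1]) <= max_size
            # Opposite signs: one edit adds what the previous removed, or vice versa.
            and byte_deltas[i] * byte_deltas[i - 1] < 0
        )
        if not continues:
            if i - start >= min_length:
                runs.append((start, i))
            start = i
    return runs


# Made-up sequence: one large edit, a small back-and-forth, another large edit.
deltas = [1500, 40, -40, 38, -38, 900, -20]
print(find_edit_wars(deltas))
```

A sketch like this only flags candidates: deciding whether a run is really an edit war, and what orientation each edit carries, still requires reading the text that was added or removed.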

Img. 02 Excel tables of paragraph categorizations

Visualizing the Wikipedia process – Third phase

Img. 03 Sketches of the visualizations

The first visualization aims to give a general idea of the orientation of the changes from the origin of the page to the present. We compared both the number of changes and the corresponding size in bytes, according to whether they add or delete content. This comparison showed that, although the majority of changes were neutral, there was also a considerable number of oriented changes, especially ones adding content with a neutral/pro life or pro life orientation. That led us to conclude that the negotiation proceeds through the addition of partisan content imposing a point of view, followed by a restoration of neutrality.
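The comparison behind this first visualization amounts to a simple aggregation, which can be sketched in Python as follows. The five orientation labels come from the classification described earlier; the function name, the data layout and the sample values are assumptions made up for the example.

```python
ORIENTATIONS = [
    "pro life",
    "neutral/pro life",
    "neutral",
    "neutral/pro choice",
    "pro choice",
]


def summarize(changes):
    """Aggregate changes by orientation.

    changes: iterable of (orientation, byte_delta) pairs.
    Returns, for each orientation, the number of changes and the total
    bytes added and removed.
    """
    summary = {o: {"count": 0, "added": 0, "removed": 0} for o in ORIENTATIONS}
    for orientation, delta in changes:
        row = summary[orientation]
        row["count"] += 1
        if delta >= 0:
            row["added"] += delta
        else:
            row["removed"] += -delta
    return summary


# Made-up sample, for illustration only.
sample = [
    ("neutral", 120),
    ("neutral", -80),
    ("pro life", 400),
    ("neutral/pro life", 250),
    ("pro choice", -60),
]

for orientation, row in summarize(sample).items():
    print(orientation, row)
```

Keeping added and removed bytes in separate totals mirrors the distinction the visualization makes between changes that add content and changes that delete it.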

Img. 04 Numbers and bytes of changes over the years

The second visualization starts from the importance of the chronological sequence of the 147 changes: the result is a simple bar chart with time on the x axis and the number of bytes added (positive) or removed (negative) on the y axis, with each bar marked by its orientation classification.

Moreover, the stretching and compression of the time axis shows that the negotiation was denser from 2006 to 2008: with time the discussion faded and the page achieved a certain stability, offering a more shared and neutral vision of the topic.

Comprehension is guided by a line that goes up and down through the edits and edit wars: the idea comes from the continuous adding and removing of the same bytes, like a kind of tennis match. We also decided to look at contemporary events related to the theme of abortion; interestingly, we noticed that the discussion on the Wikipedia page intensifies in the days following relevant abortion-related events, such as news or comments on the topic from prominent people (belonging to the Church or to other organizations).
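One simple way to check this kind of correspondence is to count the edits that fall in a window after each event. The Python sketch below is only an illustration: the seven-day window, the function name and all the dates are assumptions invented for the example, not data from the project.

```python
from datetime import date, timedelta


def edits_after_events(edit_dates, event_dates, window_days=7):
    """For each event, count the edits made on the event day or in the
    window_days that follow it. window_days is an assumed parameter."""
    window = timedelta(days=window_days)
    return {
        event: sum(1 for d in edit_dates if event <= d <= event + window)
        for event in event_dates
    }


# Made-up dates, for illustration only.
edits = [date(2008, 2, 2), date(2008, 2, 3), date(2008, 2, 20), date(2008, 6, 1)]
events = [date(2008, 2, 1), date(2008, 5, 1)]

for event, count in edits_after_events(edits, events).items():
    print(event.isoformat(), count)
```

A count like this shows coincidence in time only; it does not by itself establish that an event caused the edits that follow it.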

Img. 05 Changes in time and correspondence with topic-related events

Another visualization was then created as part of the one described above, to explain in detail which words or sentences were added or removed, and their orientation.

Img. 06 Zoom in some interesting edit wars

The last visualization concerns the relationship between the orientation of the changes and the type of user making them. The result is a pie chart showing that registered Wikipedia members mostly change the page in a neutral way. Unregistered users, identified by their IP address, have instead made proportionally more pro life or neutral/pro life modifications.
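The proportions behind such a pie chart can be sketched in Python. Since unregistered contributors appear in the page history under their IP address, the example below treats any author name that parses as an IP address as unregistered; this heuristic, the function names and the sample data are assumptions for illustration.

```python
import ipaddress
from collections import Counter


def user_type(author):
    """Classify an author as "unregistered" if the name is an IP address."""
    try:
        ipaddress.ip_address(author)
        return "unregistered"
    except ValueError:
        return "registered"


def orientation_shares(changes):
    """changes: iterable of (author, orientation) pairs.

    Returns {user_type: {orientation: share}}, where the shares for each
    user type sum to 1.
    """
    counts = {}
    for author, orientation in changes:
        counts.setdefault(user_type(author), Counter())[orientation] += 1
    return {
        kind: {o: n / sum(c.values()) for o, n in c.items()}
        for kind, c in counts.items()
    }


# Made-up sample, for illustration only.
sample_changes = [
    ("UserA", "neutral"),
    ("UserB", "neutral"),
    ("UserA", "pro choice"),
    ("192.0.2.10", "pro life"),
    ("192.0.2.44", "neutral"),
]

shares = orientation_shares(sample_changes)
print(shares)
```

Computing shares within each user type, instead of raw counts, is what allows the two groups to be compared "proportionally" even when one group has made far more edits than the other.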

Img. 07 Changes by authors
