posted by Alessandra Facchin
Wednesday, May 8th, 2019

Mapping and representing informal transport: the state of the art

This research is a preliminary step for an upcoming project. Its focus is to describe the state of the art of Informal Transport Mapping in Sub-Saharan Africa.

Informal transport is defined in the literature as:

“Paratransit-type services provided without official sanction.”

R. Cervero, A. Golub, Informal Transport: A global perspective

In most sub-Saharan African cities where this kind of service exists, there is no official mapping of bus lines and stops. However, there are a few projects in which communities of users or researchers have tried to map informal transport within a city.

To find these examples, we started by using “Informal Transport Mapping” as a query for a Google search, focusing only on visual representations such as transit maps. One of the results is the following.

F. Oyatogun’s tweet

This map shows the local names that informal transport has in African countries. We reframed the Google query by combining each local name with “mapping.” This led us to grassroots projects initiated by informal transport users or by local communities in need of proper information on this kind of service.

The following are some examples of the visual maps we found; other projects can be found here.

Digital Matatus

Authors: Columbia University, MIT, University of Nairobi, Groupshot
Location: Nairobi, Kenya
Year: 2012
Status: Ongoing
Map Type: Topological
Format: PDF

Project Description
Digital Matatus is a collaboration between Kenyan and American universities. The project takes advantage of mobile phone technology to collect data about Matatu infrastructure.
Students from the University of Nairobi collected route data using a GPS app; stops were then identified through the experience of students and commuters and through visual cues such as signs.
The data needed to be cleaned and formatted into GTFS. Since calendars, timetables and fares are not available for Matatus, this led to the creation of a new, more flexible GTFS standard for paratransit. Google agreed to update the GTFS file format in use and used the data collected during this project as a test for displaying informal transport on Google Maps.
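As a rough illustration of what formatting a GPS trace into GTFS can look like, the sketch below writes a handful of points as rows of a GTFS shapes.txt file. It is not the project's own pipeline: the route identifier and the coordinates are made up for the example, and only the four shapes.txt column names come from the GTFS specification.

```python
import csv
import io

# Hypothetical GPS trace for one route: (latitude, longitude) pairs in travel order.
trace = [(-1.2864, 36.8172), (-1.2841, 36.8215), (-1.2803, 36.8250)]

SHAPE_FIELDS = ["shape_id", "shape_pt_lat", "shape_pt_lon", "shape_pt_sequence"]


def trace_to_shape_rows(shape_id, points):
    """Turn an ordered list of (lat, lon) points into GTFS shapes.txt rows."""
    return [
        {
            "shape_id": shape_id,
            "shape_pt_lat": lat,
            "shape_pt_lon": lon,
            "shape_pt_sequence": sequence,
        }
        for sequence, (lat, lon) in enumerate(points, start=1)
    ]


def write_shapes(rows, out):
    """Write the rows as CSV with the shapes.txt header."""
    writer = csv.DictWriter(out, fieldnames=SHAPE_FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = trace_to_shape_rows("route_demo", trace)  # "route_demo" is a placeholder id
buffer = io.StringIO()
write_shapes(rows, buffer)
shapes_csv = buffer.getvalue()
print(shapes_csv)
```

A shape only describes the path of a route; stops, and the frequencies that replace fixed timetables, would live in other GTFS files.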

Map Description
The Digital Matatus map is a topological map that shows a schematic diagram of Matatu routes. The routes are grouped into coloured lines according to their destination. Landmarks and significant points of Nairobi are highlighted on the map. The map is combined with a list of all Matatu stations and of every Matatu line that stops at each station.
The map is only available for print, but web and app versions are being developed.

Minibus taxi Routes

Authors: WhereIsMyTransport
Location: Cape Town, South Africa
Year: 2017
Status: Completed
Map Type: Topological, Thematic
Format: PDF, Web

Project Description
WhereIsMyTransport is a South African public transport data and technology startup. In 2017 they launched the Cape Town Taxi Project: in three weeks they used their own technology to collect data about every minibus route, along with common stopping points, frequencies and fares.
By integrating formal and informal transport data, WhereIsMyTransport aims to map all the transit infrastructure of Cape Town. All the data are available through their own API, so anyone can build transport apps or design maps.
WhereIsMyTransport also designed a printable map with the data collected.

Map Description
The WhereIsMyTransport printable map is a topological map that shows a schematic diagram of minibus routes and stops combined with formal transport stops. Each route has a number and a colour, but these markers are not used in the actual transport network. It’s not clear whether lines with the same colour are related. The map is divided into four tables: Table A shows the most active minibus routes, while Tables B, C and D show the minibus routes of different areas.
The WhereIsMyTransport API shows formal and informal transport stops on a map provided by Mapbox. All the other data are available as JSON files.

The Bus Map Project

Authors: Chadi Faraj, Jad Baaklini
Location: Beirut, Lebanon
Year: 2015
Status: Ongoing
Map Type: Topological, Thematic
Format: PDF, Web

Project Description
The Bus Map Project is a grassroots initiative, started by Chadi Faraj and Jad Baaklini, which aims to map Lebanon’s formal and informal transport infrastructure. The data used to design the map were collected by ordinary citizens: the ultimate aim of the project is to create a community around the idea of collective mapping. This community should be able to keep the map up to date and to involve more and more people in the activity.
To collect the data, apps such as Open GPS, Gaia GPS and Trails were employed; the data were then exported as GPX, KML and KMZ files and incorporated into the map.
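To give an idea of what such an export contains, here is a minimal sketch that reads the track points out of a GPX 1.1 file using only the Python standard library. The sample document and its coordinates are invented for the example; a real export from any of the apps above will carry more fields (time, elevation and so on).

```python
import xml.etree.ElementTree as ET

# Namespace used by GPX 1.1 documents.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Invented sample: one track with two points.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>demo line</name>
    <trkseg>
      <trkpt lat="33.8938" lon="35.5018"></trkpt>
      <trkpt lat="33.8950" lon="35.5060"></trkpt>
    </trkseg>
  </trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return the (lat, lon) pairs of every track point, in document order."""
    root = ET.fromstring(gpx_text)
    return [
        (float(point.get("lat")), float(point.get("lon")))
        for point in root.findall(".//gpx:trkpt", GPX_NS)
    ]


points = read_track_points(SAMPLE_GPX)
print(points)
```

The resulting list of coordinates is the raw material from which a bus line can be drawn on a map.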

Map Description
The Bus Map Project produced two kinds of maps. The web-based one is a thematic map of the city of Beirut showing the bus lines: every line has its own colour and number. Bus stops are not visible on this map, and timetables are not available since the buses do not run on a regular schedule; landmarks of Beirut are highlighted. For each bus line, information such as the fare, the duration and the distance of the whole journey is available. The printable map is a topological map showing a schematic diagram of the bus lines, with bus stops included as well. Each bus line has its own colour and a number or name. The map also carries instructions on how to use informal transport.

The Chapas Project

Authors: Joaquín Romero de Tejada
Location: Maputo, Mozambique
Year: 2013
Status: Ongoing
Map Type: Topological
Format: PDF

Project Description
The Chapas Project was started in 2013 by Joaquín Romero de Tejada with the purpose of mapping informal transport infrastructure in Maputo. The data were collected during fieldwork and are not available for download. The map has been updated every year since 2016.

Map Description
The Chapas Project map is a topological map showing a schematic diagram of the informal transport infrastructures. The bus lines are grouped by colour according to their starting point or route. Bus stops are highlighted and listed, with coordinates and other information.

Candongueiros de Luanda

Authors: Development Workshop, Jon Schubert
Location: Luanda, Angola
Year: 2011
Status: Completed
Map Type: Topological
Format: PNG

Project Description
Candongueiros de Luanda is the outcome of a Development Workshop program in 2011. There is no information on how the data were collected, and the data are not available either.

Map Description
The Candongueiros de Luanda map is a topological map showing a schematic diagram of bus routes. Every route has its own colour. No bus stops or Luanda landmarks appear on the map.

Dar es Salaam Dala Dalas mapping

Authors: Dar Ramani Huria, Ally
Location: Dar es Salaam, Tanzania
Year: 2016
Status: Ongoing
Map Type: Thematic
Format: Web

Project Description
Mapping Dala Dalas in Dar es Salaam is a collaboration between Dar Ramani Huria and Ally. The initiative aims to collect information on informal transport infrastructure and layer it with data already collected by Dar Ramani Huria on seasonal flooding in the city. The data are collected by members of the community equipped with tools like OpenMapKit and OpenDataKit, then processed by Dar Ramani Huria and uploaded to OpenStreetMap. Ally helped to format the data on informal transport infrastructure. The data are available on the Dar Ramani Huria website and on OpenStreetMap.

Map Description
There are three demo maps on the Ally website. Each is based on the same thematic map of Dar es Salaam and shows different information. Two of them show Dala Dala routes: the routes are all the same colour, and the information conveyed is which streets are most travelled by buses. One of these two maps is static; the other is animated and displays the Dala Dala routes in an unclear sequence. The third map shows the bus stops, but does not specify which bus lines stop at each of them.

Transports au Mali

Authors: JungleBus, OpenStreetMap community of Mali
Location: Bamako, Mali
Year: 2017
Status: Ongoing
Map Type: Thematic
Format: Web

Project Description
This mapping initiative was started by the JungleBus founder and the OpenStreetMap community of Mali. The project aims to map as many Sotrama routes as possible to give the population of Bamako better information on informal transport infrastructure. The wiki doesn’t describe how the data are collected. It can be assumed that collection is a grassroots effort by OpenStreetMap users, who then update the website. The data are available for download from OpenStreetMap.

Map Description
The map is a thematic map: Sotrama routes and stops are displayed on a map of Bamako. The bus lines have different colours and names. It is not clear whether different bus lines have been grouped under the same colour and name according to some variable. Two different kinds of stops are displayed: bus stops and taxi stops.


AccraMobile3

Authors: Accra Metropolitan Assembly, JungleBus, Agence Française du Développement, Transitec, OpenStreetMap community of Ghana
Location: Accra, Ghana
Year: 2017
Status: Completed
Map Type: Thematic
Format: Web

Project Description
AccraMobile3 was an initiative that involved the Accra Metropolitan Assembly, JungleBus, Agence Française du Développement, Transitec and the OpenStreetMap community of Ghana. The collaboration lasted from July until September 2017, but the work is being continued by the OpenStreetMap community. The project aims to map all the routes of informal transport in Accra. How the data are collected is not described, but the project wiki gives instructions on how to format the data before uploading them to OpenStreetMap. All the data are accessible from the OpenStreetMap website.

Map Description
The AccraMobile3 map is a thematic map of the city of Accra. All the Trotro lines are displayed at once in the same colour. Selecting a specific route in the right panel highlights that line and its stops in a different colour. There are no markers or names to distinguish routes and bus stops.

Technologies used to display data

The data obtained during these projects are mostly displayed in printable transit maps, but some projects also keep the data in digital form so that it’s easier to keep the maps updated. None of the projects described has a functioning mobile app.

  • OpenStreetMap is a collaborative project to create a map of the world. Anyone can edit the map on OpenStreetMap and it is completely open source. However, the primary output of this project is the data it generates.
  • Google Maps is a web mapping service developed by Google. The maps are not editable by users, since Google uses copyrighted map data for Google Maps. To let external websites embed Google Maps, Google launched the Google Maps API. With this service it is possible to add layers other than the ones provided by Google.

Both map services have been used in two different ways. Some projects only display the data they have, combining these map services with other tools. Other projects store the data using these map services and keep everything updated with them.

Other tools are:

  • Mapbox is a provider of custom online maps for websites and applications. Its data sources are OpenStreetMap and NASA.
  • OpenLayers is an open source JavaScript library for displaying map data. It provides an API for building rich web-based geographic applications.

There are also consortia dedicated to geodata tools and standards. None of the projects above involved them, but we came across them during the research:

  • Open Source Geospatial Foundation is a non-profit and non-governmental organisation whose mission is to support and promote the collaborative development of open geospatial technologies and data.
  • Open Geospatial Consortium is an international not for profit organization committed to making quality open standards for the global geospatial community. These standards are made through a consensus process and are freely available for anyone to use to improve sharing of the world’s geospatial data.

Technologies used to collect data

Most of the projects described used GPS mobile apps to collect information about informal transport. In projects where grassroots users or large communities are involved, it is not specified how the data are collected.

  • OpenMapKit is an extension of OpenDataKit. ODK tools allow anyone to create offline mobile surveys for field data collection. The data are then uploaded to a server when the mobile phone is online again. The OMK extension is an Android application for browsing OpenStreetMap features to create and edit OSM tags.
  • Open GPS Tracker is a small device which can be plugged into a prepaid mobile phone to make it a GPS tracker. The Tracker responds to text message commands, detects motion, and sends you its exact position. The tracker outputs a file for Google Maps or any mapping software. The Tracker firmware is open source and user-customizable.
  • Gaia GPS and Trails are hiking apps. One of their functions is to record the hiking trails and trips that a user follows.


GTFS: General Transit Feed Specification, which defines a common format for public transportation schedules and associated geographic information.

Thematic Map: a type of map that focuses on a specific theme or subject area. These kinds of maps stress spatial variation of one or a small number of geographic distributions. source

Topological Map: a type of diagram that has been simplified so that only vital information remains and unnecessary detail has been removed. These maps lack scale, and distance and direction are subject to change and variation, but the relationship between points is maintained.

posted by admin
Thursday, April 18th, 2019

Collaboration against disinformation: a summit organised by First Draft

From March 17th to 19th we took part, with a group of students, in the Collaboration Against Disinformation boot camp held in Milan. It was part of a series of summits across Europe. The summit was organised by First Draft, a non-profit organisation that supports journalists, academics, and technologists working on challenges relating to informational trust in the digital age. The summits aim to intensify cross-border collaboration on tackling information disorder as part of the new CrossCheck International Initiative.

The opportunity to participate in this summit comes from the project A Field Guide to “Fake News” and Other Information Disorders, a joint collaboration promoted by the Public Data Lab which put us in contact with First Draft.

DensityDesign Lab was invited to join the three-day summit with two goals: to share our approach to information visualization and to facilitate discussion groups on the topic. The meeting was structured in two main parts. The first was dedicated to short presentations by experts from different fields, meant to convey the scale of the problem. Afterwards the approach became more practical, and the sessions turned into tutorials and conversations covering the different fields of the opening speeches.

A group of five Master’s degree students (Edoardo Guido, Matteo Banal, Francesca Grignani, Gabriele Wiedenmann and Elena Aversa), one PhD candidate (Beatrice Gobbo) and one PhD (Ángeles Briones) from our Lab took part in the event to propose a different method for data-driven research based on visualization. In particular, DD Lab arranged three moments: a speech presenting the findings of a master’s thesis about the circulation of disinformation through the Italian Facebook network, followed by two interactive sessions introducing data collection methods and visualization tools.

Circulation of disinformation through the Italian Facebook

In the plenary session Elena Aversa presented the findings of her thesis. Starting from a Facebook page sharing disinformation, she developed protocols to track the main sources of the misleading news and understand their impact on the public. The presentation allowed the research to be shared with an audience of journalists and other experts, starting a fruitful discussion. If you are interested in knowing more about her thesis, you can visit the project page here.

Track and communicate the circulation of information disorders

During the first session, Ángeles Briones went through the different chapters of A Field Guide to “Fake News” and Other Information Disorders, presenting how protocols that combine digital tools can effectively support research. We also wanted to engage the audience on a deeper level, letting them try out some of the tools mentioned in the Guide. The result was an interesting discussion about how participants’ own research could be developed and enriched through such methods.

Simulating recipes

The second session, held by PhD candidate Beatrice Gobbo, was even more interactive for the audience. Focusing mainly on CrowdTangle, Netvizz, and RAWGraphs, she combined them in a short tutorial covering a simple data collection and visualization task. This hands-on approach helped some of those present to visualize their own datasets by the end of the meeting.

The summit ended with a speech dedicated to the CrossCheck platform launched by First Draft. This is an online collaborative verification tool bringing together several partners around the world and created to accurately report false, misleading and confusing claims that circulate online.  

This boot camp gave us the opportunity to share knowledge and work with experts from different areas, and it was another occasion to appreciate the fundamental role that data visualization has in the analysis, comprehension, and explanation of social phenomena.

posted by Michele Invernizzi
Monday, November 20th, 2017

FaST – Fashion Sensing Technology


FaST – Fashion Sensing Technology is a project meant to design, experiment with, and implement an ICT tool that could monitor and analyze the activity of Italian emerging Fashion brands on social media. FaST aims at providing SMEs in the Fashion industry with the ability to better understand and measure the behaviours and opinions of consumers on social media, through the study of the interactions between brands and their communities, as well as support a brand’s strategic business decisions.

After a crisis that deeply hit Western economies and revealed the weaknesses of previous paradigms, the current challenge – for a mature sector like Fashion – is to reconnect social communities with their territories and their material culture. The tremendous potential created by a new generation of technologies – including digital production, social media and IoTs – offers powerful tools to pursue these goals.
Given the importance of Fashion as an economic and cultural resource for the Lombardy Region and Italy as a whole, the project aims to leverage the opportunities offered by a hybrid fashion-digital value chain, in order to design a tool that would allow the codification of new organizational models. Furthermore, the project wants to promote process innovation within the fashion industry with a customer-centric approach, as well as the design of services that could update and innovate both creative processes and the retail channel which, as of today, is at the core of the sustainability and competitiveness of brands and companies on domestic and international markets.

FaST – Fashion Sensing Technology is a project supported by Regione Lombardia through the European Regional Development Fund (grant: “Smart Fashion & Design”). The project is being developed by Politecnico di Milano – Design dept. and Electronics, Information and Bioengineering dept. – in collaboration with Wemanage Group, Studio 4SIGMA, and CGNAL.

posted by Tommaso Elli
Wednesday, July 26th, 2017

“Data Walk” workshop – by Yanni Loukissas

Dear readers, the following post describes the outcomes of our latest experimental workshop, held in May 2017 at the Politecnico di Milano.
We were lucky enough to have Yanni Loukissas (from Georgia Tech) on our side: he not only outlined the structure of the experience and led the classes, but also wrote the text below, so our most sincere thanks go to him.

Enjoy the reading!


How can design put data in its place?

During a week in early May, I led a workshop at the Politecnico di Milano to investigate this question. A group of twenty-nine Communication Design students participated. They worked in small teams to explore the design opportunities and challenges posed by situating data in an unfamiliar setting: a walk.

Today, in 2017, we often think of data and place as being independent of one another (Loukissas 2016). Data are infinitely small, technical, and abstract while places are spatial, social and experiential. Nevertheless, data and place are linked. Indeed, “data walk” is not an oxymoron. For data can only be made and made sense of in places of production and display.

Data don’t live on the head of a pin, or a hard drive. Easy enough to say. Over the course of five days, I asked students to show how data and place are connected by designing walks that would take us through both data sets and data settings. Walking became the algorithm that allowed us to connect data and place in a series of surprisingly evocative encounters.

Student projects explored how practices of both collecting and displaying data might be critically reimagined through the structure of their walks. Walks gave rise to creative data collections: unconscious head motions, street stickers, discarded cigarette butts, minute changes in the skin. Walks also placed those data within and around the Bovisa campus of the Politecnico as: a prosthetic for walking, a soundscape, an ironic art installation, and a self-monitoring app. This range of rigorous and whimsical experiments helped us all reconsider the relationship between data and place.

The “Data Walks” workshop would not have been possible without the invitation and support of Paolo Ciuccarelli, professor and director of Communication Design at the Politecnico di Milano and Tommaso Elli, my teaching assistant and guide to Bovisa for the week. Additional thanks to all the members of Density Design Lab, who provided inspiration, moral support, and feedback.

The workshop was structured by a pair of exercises. Both explored the relationship between data and place, but in different ways:

Exercise 1. Walking for data collection

In the first exercise, students explored where data come from and how they are shaped by specific origins. This was not an abstract exercise. They learned hands-on how to make their own data sets. First, they identified a public route along which to collect data. Then, they selected a neglected or invisible subject encountered along that route as the focus of their data collection. I offered a selection of unorthodox procedures to help students develop reflexive collecting practices, such as “don’t categorize,” “use irregular measures,” “note absences,” “rely on your judgment,” and “record your own presence.” Students created and stored data using Flickr (a web app which captures images + metadata tags). Each data set captured (in its own way) what the original route looked like from the perspective of the selected subject. In the process, students learned just how human data collection can be.

Exercise 2. Walking for data display

In the second exercise, students considered the context in which new audiences might encounter their original data set. In this follow-up assignment, students learned how data are affected by local settings for display. Still working in groups, they created short (~5 minute) data walks using the sets collected in exercise 1. Their data walks were both physical and informational traversals. They simultaneously took us through spaces and through data sets, calling attention to the relationship between the data and their surrounding context. Each group chose a procedure to help shape their walk: “narrate,” “materialize,” “participate,” “layer,” or “zoom.” The results were evaluated on three criteria. Concept: what does the data walk help us learn about the local setting? Experience: what does the data walk feel like? And technique: how is the data walk made?


(Image by Tommaso Elli)


Your light data project
Team: Nicolò Fabio Banfi, Sofia Chiarini, Giulia Corona, Alessandra Del Nero

Group 1 collected data about the play of light and shadow on a walk through the Bovisa campus. They devised a prototype mobile app that guided walkers from the main entrance to the design building, assessing variations in lighting conditions along the way. Their data walk prompted us to reflect on how light affects our bodies, through changes in the look, feel, and underlying processes within our skin.


Decoding the street colours
Team: Long Zhang, Valeria Brienza, Teng Yilin, Margot Llobera

Group 2 created a collection of 360-degree photographs, taken at points along the path from the Bovisa train station to the steps of the design building, then deconstructed those photographs into a series of place-based color spectra. Finally, the group assembled the spectra into an imaginative walking guide to the colors of Bovisa.


Watch your butts
Team: Barbara Nardella, Francesco Cosmai, Francesco Giudice, Giulia Zerbini

Group 3 gathered discarded cigarette butts and displayed them ironically as annotated art pieces along the promenade from the Bovisa train station. A 2015 Italian law recently redefined cigarette butts discarded in the street as “trash.” The group’s data walk prompted us to reflect on the general category of trash and the specific stories that individual cigarette butts might tell.


Team: Valeria Sonia Aufiero, Andrea Benedetti, Simone Costagliola, Alessandro Zotta

Group 4 assembled data on street stickers posted by local bands around Bovisa and subsequently reimagined those stickers as nodes in a sound walk. In their working prototype, each sticker functioned as a virtual speaker that diffuses music. As you got closer to a band’s sticker, the volume of their track increased. When you walked through areas dense with stickers, multiple music tracks played over one another, creating an unexpectedly entertaining cacophony.
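The proximity behaviour described above can be sketched in a few lines. This is not the group's prototype, only one simple way to model it: positions are assumed to be planar coordinates in metres, and the 30-metre audible radius, the linear fall-off and the band names are all invented for the example.

```python
import math


def sticker_volume(listener, sticker, audible_radius=30.0):
    """Volume between 0.0 and 1.0 that grows linearly as the listener approaches."""
    distance = math.dist(listener, sticker)
    return max(0.0, 1.0 - distance / audible_radius)


def audible_tracks(listener, stickers, audible_radius=30.0):
    """Map each band to its volume, keeping only the stickers within earshot."""
    volumes = {
        band: sticker_volume(listener, position, audible_radius)
        for band, position in stickers.items()
    }
    return {band: volume for band, volume in volumes.items() if volume > 0.0}


# Hypothetical sticker positions, in metres from an arbitrary origin.
stickers = {"band_a": (0.0, 0.0), "band_b": (15.0, 0.0), "band_c": (200.0, 0.0)}
mix = audible_tracks((0.0, 0.0), stickers)
print(mix)
```

Where several stickers fall within the radius, several tracks are returned at once, which is the overlap that produces the cacophony described by the group.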


Museum of details
Team: Alicia Gonzalez, Emanuele Innocenti, Nikita Kulikov, Ludovico Pincini, Yining Zou

Group 5 led us on a circular walk through a “museum of details” composed of artifacts left on the balconies surrounding a prominent round-about in Bovisa. The project encouraged interactions between walkers and owners of the artifacts through the use of a simple augmented reality experience and postcards, which invited comments or even offers to purchase the artifacts left out for view.

Mind your step
Team:  Sara Batisti, Vincenzo Bisceglia, Nicola Cerioli, Mattia Virtuani

Group 6 asked us to attend, reflexively, to the way we walk. How often do you watch your feet? Are you aware of your surroundings? Using the accelerometer in an iPhone in coordination with a GoPro, the group constructed an effective method for reflecting on one’s walking habits during the daily commute from the Bovisa station to the Politecnico gates.


Team: Alessia Bissolotti, Mara Cominardi, Serena Del Nero, Marco Mezzadra

Group 7 led us on an archeological investigation of cracks in the architecture of Bovisa, and prompted us to consider their significance as markers of both historical and environmental change. Their prototype mobile app treated cracks as opportunities to peel back the surface of the city and peer into its past.


Dipartimento di Design, Politecnico di Milano
May 8-12, 2017

Post by Yanni Loukissas (Visiting Instructor), special Thanks to Paolo Ciuccarelli (Program Director) and Tommaso Elli (Teaching Assistant).

For more information on this workshop and other activities hosted by the Local Data Design Lab, contact

posted by Michele Mauri
Monday, May 23rd, 2016

“From Mind to Reality” Workshop

The week of May 2nd we experimented with a new workshop format, engaging students with different backgrounds – Communication Design, Product Design, Design & Engineering – and asking them to design and prototype an object able to make live data streams “tangible”.

The aim of the workshop was to explore the potential of combining rapid-prototyping techniques with information visualization knowledge, trying to go beyond the “flat” nature of information visualization.

Each student team identified a live data source (e.g. personal social network feed, real-time environmental data, personal services…) and created a tangible experience of the data, exploiting the potential of materials, structures and shapes.

Students had at their disposal the Polifactory space and machinery (3D printers, CNC laser cutters etc.).

Despite the short time available, all the groups managed to create a working prototype connected to a data stream. Below you can see the results.

I would like to thank Fondazione Politecnico for having made the workshop possible, and the Polifactory staff for their active support to students and for the spaces. Finally, I want to thank Monica Bordegoni and Marina Carulli, who co-tutored the workshop with me.


Each group developed a working prototype, and we asked them to create a short video presenting the project. Below you can see the results.


BeWave is a weather station that literally shows sea conditions in real time. It’s a design product made for surfers from all around the world: through its dedicated app, it is possible to turn on the device, set the location and choose between real-time data and forecast. Using a live stream of data from a portal for surfers, BeWave shows wave height and period (paddle movements), wind speed and direction (neopixel ring) and tide height (neopixel strip).
Chiara Riente, Kacper Pietrzykowski, Lorenzo Positano, Maria Elena Besana, Marius Hölter


Our product helps to find the most popular and crowded clubs in real time by tracking the tweets that contain the names of the venues.
The object is composed of:
– A wooden base that contains the circuits and the cables;
– A wooden map of Milan, produced with the laser cutter;
– 3D-printed scale models of the buildings, placed at their real locations on the map, with LEDs inside them.
The number of tweets is visualized through the blinking of coloured LEDs: the more a club is tweeted about, the faster its LEDs blink.
Carlo Colombo, Erika Inzitari, Hanife Hicret Yildiz, Maarja Lind, Piero Barbieri
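The relationship between tweets and blinking speed can be expressed as a small function. The sketch below is only one possible mapping, not the one used in the prototype: the inverse formula, the two-second base interval and the 0.1-second floor are assumptions made for the example.

```python
def blink_interval(tweet_count, base_interval=2.0, min_interval=0.1):
    """Seconds between blinks: more tweets give a shorter interval, down to a floor."""
    if tweet_count < 0:
        raise ValueError("tweet_count cannot be negative")
    return max(min_interval, base_interval / (1 + tweet_count))


# Hypothetical tweet counts for three clubs.
counts = {"club_a": 0, "club_b": 3, "club_c": 1000}
intervals = {club: blink_interval(count) for club, count in counts.items()}
print(intervals)
```

The floor keeps a very popular club from blinking faster than the eye, or the hardware, can follow.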


We created SpotiLights, which is inspired by classic disco balls. It is designed to make house parties more engaging by building on things people already do at parties: SpotiLights changes its behavior according to what people do on Spotify and Instagram during a party. The host just creates a Spotify playlist and shares it with friends. After that, every time someone adds a song to the shared playlist, the color of the LEDs changes, based on the color assigned to that person. The more songs are added to the playlist, the faster the LEDs blink. In addition, every time somebody posts a photo on Instagram using the hashtag “#SpotiLights”, the spinning speed of the inner solid increases.
Ghazaleh Afrahi, Jennifer Monclou, Lucia Cosma, Pietro Cedone, Sebastian Forero Hernandez


Twogethere is a clock for two people living in the same house, and it works thanks to geolocation. Tracking the distance of the connected mobile phone, its hand shows the movements of a person who is leaving home or coming back. The maximum angle is 180°, which stands for a fixed distance that could be defined as a city area.
While the right half of the clock marks distances, the left half is a lamp. Its intensity increases as the person comes back home, so a greater distance from home corresponds to less light. When the hand gets close to 0°, the clock emits a sound to catch attention.
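
The mapping from distance to hand angle and lamp intensity can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the 10 km maximum distance, the linear mapping and the function names `hand_angle` and `lamp_intensity` are hypothetical and are not taken from the Twogethere prototype.

```python
def hand_angle(distance_km: float, max_distance_km: float = 10.0) -> float:
    """Map the distance from home to a hand angle between 0 and 180 degrees.

    The linear mapping and the 10 km default are assumptions for illustration.
    """
    # Clamp the distance so the hand never leaves the 0-180 degree range.
    clamped = max(0.0, min(distance_km, max_distance_km))
    return 180.0 * clamped / max_distance_km


def lamp_intensity(distance_km: float, max_distance_km: float = 10.0) -> float:
    """Return the lamp brightness in [0, 1]: brighter when closer to home."""
    return 1.0 - hand_angle(distance_km, max_distance_km) / 180.0
```

With these assumptions, a person halfway to the maximum distance would put the hand at 90° and the lamp at half intensity.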

Carola Barnaba, Chiara Bonsignore, Delin Hou, Eyleen Carolina Camargo, Qiji Ni


Historically, personal finance could be a matter of managing the cash in one’s wallet or purse and withdrawing more from an account when it was depleted.
With the advent of credit cards, debit cards, PayPal and other electronic payment systems, personal finance has become less and less tangible as it becomes more integrated with digital communication and online commerce. It is now quite simple to spend more than intended or to have transactions go unnoticed. Our aim is to reconnect the physical world to one’s sense of personal finance.
We hope to do this by visually and physically demonstrating an individual’s expenditures over the course of one week. To achieve this we will use thousands of small spheres to represent the funds in one’s budget: 1 sphere = 1 euro. A large reservoir of “budget” will be allowed to flow out according to the rate of one’s spending. Not only will one’s expenditures be demonstrated for the seven days, but each day will be individually quantified in “daily” vials and made available for relative comparison at the end of the week. To do this we will use some of the same technologies that helped to remove tangibility from personal finance.
The process begins when our user makes an electronic transaction of some kind. A data stream is created beginning with the user’s bank, which is configured to provide notifications of any banking activity. This notification contains the time of the transaction, the amount withdrawn and the balance remaining in the account. The message alert is configured to arrive at a web-based mail parsing service. A rule parses the email for the amount withdrawn and makes it available through an API along with a unique ID code that is used to identify the transaction.
Our device will consist of a reservoir of spheres positioned above a carousel of 7 vials, one for each day of the week. Between the reservoir and the carousel sits a customized “valve” that precisely controls the deployment of spheres from the “budget”. The carousel motion is actuated by a servo, as is the valve. A third servo agitates the spheres to aid operation.
A NodeMCU running the Arduino bootloader provides direct control of all three servo motors. Upon powering up, the NodeMCU attempts to connect to its familiar wireless network. Once a connection is established, it immediately connects to an NTP time server to establish the current day of the week. Once this is complete, the carousel’s position is set. At this point an HTTP library and a GET request are used to connect to our API. If a new transaction is detected, the value of the expenditure is read and the appropriate number of spheres is deployed. Once complete, the day of the week and the API are continually monitored for any updates.
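
The core of this loop (recognise a new transaction by its ID, convert the amount into spheres, drop them into the vial for the current day) can be sketched as follows. The sketch is in Python rather than the Arduino code running on the NodeMCU, and it leaves out the network and servo parts entirely. The class name `Dispenser`, the method `handle` and the choice to round amounts to the nearest euro are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dispenser:
    """Hypothetical model of the sphere dispenser logic (no hardware involved)."""

    seen_ids: set = field(default_factory=set)
    vials: list = field(default_factory=lambda: [0] * 7)  # one vial per weekday

    def handle(self, transaction_id: str, amount_eur: float, weekday: int) -> int:
        """Process one transaction and return the number of spheres released."""
        if transaction_id in self.seen_ids:
            return 0  # this transaction was already deployed
        self.seen_ids.add(transaction_id)
        # 1 sphere = 1 euro; rounding to the nearest euro is our assumption.
        spheres = round(amount_eur)
        self.vials[weekday] += spheres
        return spheres
```

Keeping the set of seen IDs is what lets the device poll the API continually without releasing the same expenditure twice.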


Davide Pedone, Lorenzo Piazzoli, Michael Barocca, Oliviero Spinelli, Pietro Tordini


@the_polifactory NIDO’s LED lit bulbs represent friends on #socialmedia. Colors are set according to their feelings based on hashtags.

Andrea Lacavalla, Beatrice Gobbo, Jelena Milutinovic, Karen Rodriguez, Michele Invernizzi


Unstable is a domestic object designed to help people who spend a lot of time working on a laptop every day to keep their working sessions under control.
The idea came from a personal, daily experience: all of us are used to staring at the screen for many hours and losing track of time, which is not good for our health, especially for our eyes. In fact, there are work policies that recommend a 15-minute break for every 2 hours of work.
Unstable aims to make you realize when it’s time for a break by making it impossible to keep working: after 2 hours it automatically becomes unstable.

Giulia Piccoli Trapletti, Laura Toffetti, Maddalena Bernasconi, Matteo Montecchia, Riccardo Gualzetti


A sedentary lifestyle is one with no or irregular physical activity, and it’s a growing problem in our society.
To fight this bad habit, we connected a stool to personal movement data: if the user doesn’t reach a defined number of steps, the stool reacts by changing shape.
Lazarus, our smart stool, turns its flat surface into a series of solids of different heights. This uncomfortable configuration both shows the user that they have moved too little and forces them to stand up and move.

Carlo Alberto Giordan, Lucrezia Lopresti, Luobin Huang, Mauro Abbattista, Xiaoqian Liu


Pollenair is a pollution awareness lamp.
The product alerts its owner to the condition of a city’s air.
The lamp portrays four distinct emotions. Its mood swings are represented by different positions and colours.
Pollenair lets you know whether your city is being eco-friendly without even opening your window.
The information is collected from the data source and transmitted as a live stream. With the keyboard you can choose a city and compare pollution figures around the world.

Camila Borrero, Chang Ge, Chiara Cirella, Inês Filipe, Prateek Chopra

posted by Giorgio Uboldi
Friday, July 17th, 2015

San Marino Design Workshop

Within the rich agenda of the San Marino Design Workshop 2015, organized by the Università degli Studi di San Marino and the Università IUAV di Venezia, I was invited to hold an intensive five-day workshop about Open Data and Data Visualization.

The aim of the workshop was to understand how the data released by the public administration (not in Open data format yet), combined with other data sources and the use of data visualization, can help to unveil and describe unexpected aspects of a territory.

In the first phase of the workshop the 13 participants, divided into 4 groups, had to examine the datasets released by the public administration of San Marino on different topics (tourism, commercial activities, demography, employment, etc.) and choose a subject to explore through data visualization. In a second phase the students had to think about other possible data sources (social media, data released by civic organizations, etc.) that could be combined with the initial datasets in order to broaden the research and find interesting insights.
The students who decided to focus on tourism in San Marino, for example, combined the datasets about tourist flows and accommodation with the metadata of the pictures taken in San Marino and uploaded to Flickr, in order to see which areas of the Republic were most photographed and how tourists move across the territory.

To see the results and descriptions of the works, download the posters produced by the students (only in Italian, sorry).

During the workshop the students had the possibility to test some new features and charts of the new version of RAW (more news coming soon).

I would like to thank:
The Università degli Studi di San Marino and the Università IUAV di Venezia and all the people who organized the workshop.
Elisa Canini, the tutor who helped me with the organization and the teaching activities of the workshop.
All the participants who attended the workshop with great enthusiasm: Lucia Tonelli, Bucchi Maria Cecilia, Grenzi Paola, Bocci Chiara, Falsetti Duccio, Leurini Luca, Marazzo Hillary, Barone Raffaella, Lampredi Luigi, Mosciatti Raffaele, Ponsillo Nunzia, Sotgiu Maria Chiara, Moccia Antonio, Michela Claretti.

posted by Marta Croce
Thursday, April 30th, 2015

Depiction of cultural points of view on homosexuality using Wikipedia as a proxy

Preliminary research questions

Which aspects of a controversy can a semi-automated analysis return? Is it possible to depict the different cultural points of view on a controversial topic by analysing Wikipedia? Can the results be used to improve awareness of the existing differences among cultural approaches to the controversy?

These are the principal questions that I took as the starting point of my research project and that I hope to be able to answer with the publication of the results obtained.

Structure of the research

CHOICE OF A CONTROVERSIAL TOPIC – As a first step, I chose a subject that carries different connotations depending on the specific cultural backgrounds of communities: homosexuality seems to be a critical theme for many cultures, and each nation has its own level of tolerance of it. The meeting between this separate handling of the phenomenon and the self-maintenance and self-management of Wikipedia’s editions is a favourable occasion to observe not the emergence of different points of view but the simultaneous existence of several neutral definitions of homosexuality.

RELEVANCE CHECK – The second step is an examination of how much consideration the theme actually receives, verifying how many editions have a dedicated article about it, followed by the selection of eight European communities which take a different stance on homosexuality (according to the reports of the ILGA-Europe association on the human rights situation of LGBTI people in Europe).

DATA COLLECTION – The corresponding pages in each linguistic edition were investigated by recording data on various features not directly connected with content. The data collected through quantitative, semantic and qualitative analyses made it possible to draw a cross-cultural comparison on several aspects of the debate (such as users, edits and macro-areas of discussion) and returned some information about each community’s approach to homosexuality. The comparison is structured in five points of interest: development of the discussion over time, behaviour and characteristics of users, elements of discussion, cultural interest in the topic, and self or global focus.

Questions and Answers – First results

1. Presence of a dedicated article in editions

How important is the phenomenon, and what level of familiarity/tolerance do different cultures have with it?

Presence and status of the article (protected, semi-protected, controlled, good article, FA/featured article) are the first indicators of a discrepancy in the activity generated by the theme. Homosexuality is present in 106 language editions out of 277; in a few editions the page is protected, semi-protected or controlled. Only in four languages has the topic been awarded “featured article” status on the main page.

Does the same topic mean the same information?

Comparison of basic information about the pages: entry date and quantity of information contained in each selected edition. Is there any connection between the length of the description and the degree of acceptance of homosexuality in each culture? As a measure of tolerance I used data from the reports of the ILGA-Europe association on the human rights situation of LGBTI people in Europe. While there is no clear relationship, the Russian edition stands out for the size of its page relative to its degree of tolerance.

How long after the article’s creation did the discussion start? How many editors also participate in talk pages?

The birth of a talk page is linked to the users’ need to find an agreement on the correctness of the content to share. Can the shrinking gap between the article’s creation date and the birth of its talk page be viewed as an indicator of a different need to debate? This interval decreases with time: the talk page of the English article was introduced a year and a half after the article, while in other editions it took less than twelve months. The reduction was probably due to an increase in the noise generated by homosexuality, whose criticality in 2001 was not the same as in 2004: the number of laws worldwide relating to homosexual persons is increasing. The percentage of editors who participate in the talk page also shows a divergence in the degree of involvement in the controversy. In the Russian page, which is connected to the nation with the lowest tolerance toward homosexuality, the users’ need to bring the debate to a second level seems to be stronger than in the others.
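
As a minimal sketch of the first indicator, the interval between the creation of an article and the creation of its talk page can be computed from the two dates. The function name `talk_page_gap_days` is hypothetical, and the dates in the usage note are placeholders chosen only to show the calculation, not values from the dataset.

```python
from datetime import date


def talk_page_gap_days(article_created: date, talk_created: date) -> int:
    """Return the number of days between article creation and talk page creation."""
    return (talk_created - article_created).days
```

For example, `talk_page_gap_days(date(2001, 1, 1), date(2002, 7, 1))` gives 546 days, roughly a year and a half.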

2. Development of discussion in time

Does the discussion follow a regular pattern or show irregular peaks? Do periods of high dispute coincide with high involvement?

A coincidence of trends would suggest an intersection of interests between communities and therefore an increase in discussion linked to events of cross-cultural relevance. However, the analysis of edits per month made on the homosexuality pages brings to light a debate that is clearly far from being connected between editions. Each edition follows its own line of debate, whose peaks could be linked to particular local events or issues that are not relevant for the other cultures. The causes of a conflict in a page only rarely cross local barriers to become cross-cultural disputes.

Which period has attracted the most participation? Do many edits coincide with high involvement?

A significant gap between the number of edits and the number of participating editors signals a heated dispute, identifiable as an edit war. When the two quantities are equal, users behave co-productively, participating without arguing; a wide gap, on the other hand, indicates a long negotiation inside the community and highlights the users’ difficulty in finding a common neutral view on the definition of homosexuality. In the English and French pages the number of editors involved is similar to the total number of edits, which means that users participate equally in the periods of most intense discussion. In the other cases, a more critical perception of the theme causes a more heated dispute.

Which linguistic edition of the article generates more debate?

The presence of many discordant views on the neutrality of the content causes an increase in edits and therefore a higher average number of edits per user. The data analysis makes it possible to identify the pages with a stronger tendency to debate: the editors of the Russian and Hungarian pages seem to be less prone to compromise when it comes to the definition of homosexuality.
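
The average number of edits per user can be sketched as below. This assumes the revision history of a page is available as a simple list of editor names, one entry per edit. The function name `edits_per_editor` and the input format are our own assumptions, not the tooling actually used for the analysis.

```python
from collections import Counter


def edits_per_editor(edit_log):
    """Return the average number of edits per distinct editor.

    edit_log: iterable of editor names, one entry per edit (assumed format).
    """
    counts = Counter(edit_log)
    if not counts:
        return 0.0  # no edits, so no debate to measure
    return sum(counts.values()) / len(counts)
```

A value close to 1 means that almost every edit comes from a different editor, while higher values point to the same users returning to the page again and again.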

First conclusions

These first analyses of the relevance and chronological development of the controversy associated with homosexuality show that there isn’t an evident overlap of discussion between editions. The different cultural perception of homosexuality influences the participation of users in the debate and, as a consequence, produces disconnected periods of discussion between editions: each timeline seems to reflect its own community’s disputes on the topic and not an involvement in a common discussion with others.

posted by Marta Croce
Thursday, April 9th, 2015

Cross-cultural neutrality about controversy? The Wikipedia case

The Tower of Babel - Lucas van Valckenborch (1595)

The project

Hi! I’m Marta and my thesis project consists of a cross-linguistic analysis of how a controversial topic is treated and framed in different editions of Wikipedia. The aim of the research is to discover whether the debate linked to a controversial page coincides or differs across languages, and therefore how strongly the culture associated with each language affects the generation of content. The final output of the project is a tool (a set of visualizations) that makes it possible to visually compare several aspects of the discussion built around a chosen controversial theme and to increase readers’ general awareness of its different cultural interpretations.

Cross-linguistic Wikipedia: NPOV and cultural background

Wikipedia, as an online encyclopedia whose contents are produced and updated entirely by users, may be considered and studied as a mirror of modern society. One of the most important conditions required in the production of content is the maintenance of a neutral point of view (NPOV) to avoid any personal opinion, a difficult aim for collaborative article production.

Wikipedia is currently available in 288 language editions (April 2015), and each edition is organized and developed separately from the others, based on the cultural predispositions of the countries it gives voice to. The neutrality of articles should give content an objective and universal value that goes beyond the origins and ideologies of each editor. However, even when an article is written in accordance with the NPOV policy, there are cases where the definition of the topic can’t be separated from the way different cultural backgrounds regard it.

The aim of my thesis is to verify whether the debate linked to a controversial topic coincides or differs across language editions, and therefore how strongly the culture of origin affects the generation and treatment of content. The output of the project is a tool that makes it possible to compare, between editions, various aspects of the debate on a chosen controversial topic.

Bibliography of cross-linguistic studies about Wikipedia

Many studies have made Wikipedia their object of study, analyzing the pros and cons of this collaboratively generated archive of knowledge. Only recently has the cross-linguistic aspect of Wikipedia gained interest in the sociological community, and this has led to a proliferation of studies, even if comparative research is still limited.

The studies analyzing the cross-cultural aspect of Wikipedia can be substantially divided into two groups (Bao & Hecht, 2012): projects studying multilingual behaviour on Wikipedia and projects that attempt to equalize information between the editions. But how different are the approaches of communities in generating shared knowledge? Is it possible to speak of a “global consensus” (Hecht & Gergle, 2010) in a particular encyclopedic context such as Wikipedia?

A detailed analysis of the existing research projects on cross-cultural Wikipedia has made it possible to identify four recurrent thematic areas of study, defined by the aspects examined:


Semantic analysis / text similarity

The primary aim of many studies is to see whether different language communities tend to describe the same topic in a similar manner. For this purpose, the method used is the extraction and comparison of concepts / entities, identifiable with interlinks (Bao, Hecht et al., 2012; Hecht & Gergle, 2010; Massa & Scrinzi, 2011), followed by a comparison of their use (Adafre & De Rijke, 2006). This kind of study is able to bring out the stereotypes that a nation or culture holds towards another and whether the external point of view on a culture coincides with that culture’s idea of itself (Laufer, Flöck et al., 2014), or to observe the cultural interpretation of and level of involvement in a particular topic (Kumar, Coggins, Mc Monagle et al., 2013; Otterbacher, 2014).

Information quality / equality

The quality of information in collaboratively produced knowledge is one of the most commonly raised doubts about Wikipedia. Studies that deal with this problem through cross-linguistic analysis try to determine quality indicators for each community, such as the amount of information provided (Adar & Skinner et al., 2009; Callahan & Herring, 2011) or the development of the category structure (Hammwöhner, 2007), and compare them between editions. Since Wikipedia’s content is produced by divergent cultural communities, a sort of dissonance can emerge in interests, quality criteria and definitions of neutrality, which do not always coincide between editions (Laufer, Flöck et al., 2014; Stvilia, Al-Faraj et al., 2009; Crisostomo, 2012; Rogers & Sendijarevic, 2012).

Policy and regulation

Each edition manages itself, not necessarily in connection with the others, and this is reflected in both the definition and the application of standards. The pages used to explain rules and characteristics of the platform, such as the production of content by bots (Livingstone, 2014) and the maintenance of a neutral point of view (Callahan, 2014), show a strong connection with the cultural background of the community behind the contents and follow its behaviours and conduct in real life.

Culture and behavior

Other research focuses instead on a qualitative analysis of the different ways of managing the production of content and the priority given by the community to certain issues. The collaboration between users follows different rules of conduct: attitudes of courtesy, the management of disagreements or conflicts and the attention to the accuracy of content reflect the culture and customs of the community (Hara, Shachaf et al., 2010). Many studies have noted a correspondence of conduct among nations with a similar score on Hofstede’s “cultural dimensions”: power distance, individualism versus collectivism, masculinity versus femininity, and uncertainty avoidance (Pfeil, Zaphiris et al., 2006; Nemoto & Gloor, 2011). The cultural background also determines the tendency to concentrate attention, and therefore discussion, on certain issues rather than others (Yasseri, Spoerri et al., 2014): topics “close to home” tend to receive more interest (Otterbacher, 2014) and, in cases of cross-cultural discussion, the cultures involved are determined to impose their own version on the others (Rogers & Sendijarevic, 2012; Callahan, 2014).


Some of these studies have produced a tool as their final output: these tools make it possible to compare articles/threads from various editions and to show their overlaps and discrepancies visually. They break down language barriers, allowing a cross-cultural consultation of different topics and offering users a complete overview of the information. They also help to build in the Wikipedia reader a greater awareness of the cultural differences existing in society. Here is a brief description of the existing instruments (or those in development):

Manypedia

Online tool that allows users to compare two language versions of the same article side by side. Manypedia extracts some basic data from both pages, such as the most frequent words, the total number of edits and editors, the article’s creation date, the date of the last edit and the most active editors, enabling the user to compare the salient features of the two articles.

Omnipedia

With this tool the reader can compare 25 language editions of Wikipedia simultaneously, highlighting similarities and differences. Information is aligned to give an overview of how concepts are used in each language, eliminating the obstacles due to language barriers. The entities (links) connected to the most discussed topics are extracted and translated into an interactive visualization that shows which topics are mentioned by each linguistic edition and the sentences of the page where they appear.

SearchCrystal

Set of visualizations that displays differences and overlaps between the 100 most discussed articles, identified by number of edits, in 10 language editions of Wikipedia.

Wikipedia Cross-lingual image analysis

By entering a URL linked to a Wikipedia article, the tool tracks all the languages in which this article is available and collects the images present in the pages. The output is a panoramic view of the topic through images: it provides a cultural interpretation of the topic in relation to the different languages.

Terra incognita

Terra incognita collects and maps the articles with a geographic location in more than 50 language editions. The maps highlight cultural prejudices, unexpected areas of focus, overlaps between the geographies of each language and how the focus of linguistic communities was shaped over time. The project allows the comparison of different features between languages through filters: language (each language has an assigned colour), intersection (shows the location of articles appearing in several editions), translated articles, links (highlights areas whose articles have been translated into several languages).

Open questions

The common aim of these studies is to discover, if it exists, a meeting point between cultures and their points of view as they are told on Wikipedia. Technical analysis of terms combined with social studies helps to achieve this goal, but what kind of horizontal research and comparison should be made to dig deeper into a cross-cultural controversy? Are there other studies available that introduce new aspects of cross-cultural analysis?


Adafre S.F. and De Rijke M., Finding Similar Sentences across Multiple Languages in Wikipedia, 2006

Adar E., Skinner M. and Weld D.S., Information Arbitrage Across Multi-lingual Wikipedia, 2009

Bao P., Hecht B., Carton S., Quaderi M., Horn M. and Gergle D., Omnipedia: Bridging the Wikipedia Language Gap, 2012

Callahan E. and Herring S.C., Cultural Bias in Wikipedia Content on Famous Persons, 2011

Callahan E., Cross-linguistic neutrality: Wikipedia’s neutral point of view from a global perspective, 2014

Crisostomo A., Examining perspectives on “Abortion” within the EU through Wikipedia, 2012

Hammwöhner R., Interlingual aspects of Wikipedia’s quality, 2007

Hara N., Shachaf P. and Hew K., Cross-cultural analysis of the Wikipedia community, 2010

Hecht B. and Gergle D., Measuring Self-Focus Bias in Community-Maintained Knowledge Repositories, 2009

Hecht B. and Gergle D., The Tower of Babel Meets Web 2.0, 2010

Kumar S., Coggins G., Mc Monagle S., Schlögl S., Liao H.T., Stevenson M., Bardelli F. and Ben-David A., Cross-lingual Art Spaces on Wikipedia, 2013

Laufer P., Flöck F., Wagner C. and Strohmaier M., Mining cross-cultural relations from Wikipedia – A study of 31 European food cultures, 2014

Livingstone R., Immaterial editors: Bots and bot policies across global Wikipedia, 2014

Massa P. and Scrinzi F., Exploring Linguistic Points of View of Wikipedia, 2011

Massa P. and Zelenkauskaite A., Gender gap in Wikipedia editing – A cross language comparison, 2014

Nemoto K. and Gloor P.A., Analyzing Cultural Differences in Collaborative Innovation Networks by Analyzing Editing Behavior in Different-Language Wikipedias, 2011

Otterbacher J., Our News? Their Events?, 2014

Pfeil U., Zaphiris P. and Ang C.S., Cultural Differences in Collaborative Authoring of Wikipedia, 2006

Rogers R. and Sendijarevic E., Neutral or National Point of View? A Comparison of Srebrenica articles across Wikipedia’s language versions, 2012

Stvilia B., Al-Faraj A. and Yi Y.J., Issues of cross-contextual information quality evaluation: The case of Arabic, English, and Korean Wikipedias, 2009

Yasseri T., Spoerri A., Graham M. and Kertész J., The most controversial topics in Wikipedia: A multilingual and geographical analysis, 2014

posted by Giulio Fagiolini
Friday, September 5th, 2014

The Big Picture

A visual exploration of the reciprocal image of Italy and China observed through the lens of Digital Methods.

After Borders and Visualizing Controversies in Wikipedia, I introduce here The Big Picture, my M.Sc. thesis for the Master in Communication Design at Politecnico di Milano. Together with the project, we also introduce the website showcasing the research. The project has been carried out under the supervision of professor Paolo Ciuccarelli and the co-supervision of YANG Lei, Curator and Exhibition Director at China Millennium Monument Museum of Digital Art of Beijing.


The project, starting from my personal experience of living in China for more than one year, aims to examine the peculiarities of the narrative of both countries in one another’s web space. It consists in the collection, categorisation and visualisation of 4,800 images from the reciprocal national internet domains of Italy and China.

The exponential growth of non-professional and professional media producers has created a new cultural situation as well as a challenge to our normal ways of tracking and studying culture (Manovich, 2009). Thanks to this massive production of data we are able to make a number of analyses that were not possible previously. In a context where the language barrier represents a big obstacle, images can be the medium for cultural analysis by taking advantage of both the visual properties and their intrinsic storytelling capabilities.

The questions we were interested in were, first, whether we could use the collection of images found in the reciprocal web of Italy and China as a tool to investigate the perception of respective national identities, and, second, what kind of insights these images would provide.


The background to this research combines two approaches developed by the Digital Methods Initiative of Amsterdam and the Software Studies Initiative of New York. The first method, which considers the digital sphere both as a measure of the impact of new technologies on the user and as a resource used by the real world as a political and social space (Weltevrede 2009), introduces the term “online groundedness” in an effort to conceptualise the research that follows the medium, to capture its dynamics and make grounded claims about cultural and societal change (Rogers 2013, 38). The second approach focuses on research into software and the way computational methods can be used for the analysis of massive data sets and data flows in order to analyse large collections of images. “If media are ‘tools for thought’ through which we think and communicate the results of our thinking to others, it is logical that we would want to use the tools to let us think verbally, visually, and spatially.”(Manovich 2013, 232)

Selection of the Sources

Having decided to examine the perceived identities of these nations in their mutual web-spaces through images and to pay close attention to how this identity is “broadcast”, search engines, being a crucial point of entrance and exploration of the web, seemed a natural place to start. The two main sources for the collection of data were therefore the two main image-search engines of the two countries. Google’s position as the main search engine in Italy (we refer here specifically to the national domain) is mirrored by Baidu in China, which commands about two-thirds of the booming search market there[7]. To add a further layer to the research, we employed Google’s advanced search instruments to conduct a second series of queries limited to a selection of domains concerning specific news websites that carried particular meaning for either country. Thus the collection included 2,400 images for each data set, obtained by searching for the translated name of one nation in the other nation’s web space: 900 images retrieved directly from the respective search engine and 300 from each of five different news websites scraped via the search engine.
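
The size of the collection can be checked with simple arithmetic. The sketch below assumes that 300 images were taken from each of the five news websites, which is the reading consistent with the totals of 2,400 images per data set and 4,800 overall. The constant names are ours.

```python
DIRECT_SEARCH_IMAGES = 900   # images taken straight from the image-search engine
NEWS_SITES = 5               # news websites queried through the search engine
IMAGES_PER_NEWS_SITE = 300   # assumed to be per site, to match the stated totals

images_per_country = DIRECT_SEARCH_IMAGES + NEWS_SITES * IMAGES_PER_NEWS_SITE
total_images = 2 * images_per_country  # one data set for Italy, one for China
```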

Data Collection

In order to ensure that research on the images was as objective as possible, it was crucial to isolate it from personal computer and search engine use. Some rules were implemented for this purpose:

  • Log out from any Google service
  • Delete all customisation and localization services related to social networks and browser history
  • Empty the search engine’s cache

Because data collection from the Chinese web was done in mainland China, it was not necessary to use a proxy or other software to simulate the originating location of the queries. Each query was conducted from the country of the specific domain. The collection of images was carried out between 01-15/02/2013 for images pertaining to China, and between 01-15/03/2013 for images regarding Italy. The period in question is fundamental for the analysis of the content. The results show a combination of collective memories, everyday narratives and the peculiarities of each day: a sampling of separate moments, seasons, amplifications and contractions of time as they appeared at the instant in which they were harvested.

Data Processing

Before beginning to visualise, it was necessary to understand all the data enclosed in the images. We first measured the properties of each image using the QTIP digital image processing application, which provided us with measurement files listing the mean values of brightness, hue and saturation for each image. Then, to provide a qualitative dimension to the research, the selected images were manually categorised. They were organised into a hierarchical and multiple taxonomy. This allowed us to track the characteristics of each image and identify the main thematic clusters. We ended up with around 100 sub-categories belonging to eight main categories: Architecture, Disaster report, Economics, Nature, Non-photo, Politics, Society, and Sport.
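
To give an idea of what such a measurement involves, here is a minimal sketch that computes the mean hue, saturation and brightness of an image given as a list of RGB pixels. It uses Python’s standard `colorsys` module as a stand-in and is not how QTIP works internally. The function name `mean_hsb` and the plain arithmetic mean of the hue (which ignores that hue is circular) are simplifying assumptions.

```python
import colorsys


def mean_hsb(pixels):
    """Return the mean (hue, saturation, brightness) of an image, each in [0, 1].

    pixels: non-empty list of (r, g, b) tuples with values from 0 to 255.
    """
    # Convert every pixel from RGB to HSV (V is used here as brightness).
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    # Average each of the three channels separately.
    return tuple(sum(channel) / n for channel in zip(*hsv))
```

Three numbers per image are enough to place every picture on a chart by its visual features.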


The first intention was to take a step back and compare the images of the two datasets in relation to their visual features. We relied on the Cultural Analytics tools and techniques developed by the Software Studies Initiative at the University of California, San Diego. By exploring large image sets in relation to multiple visual dimensions and using high resolution visualisations, the Cultural Analytics approach allows us to detect patterns which are not visible with standard interfaces for media viewing. In contrast with standard media visualisations which represent data as points, lines, and other graphical primitives, Cultural Analytics visualisations show all the images in a composition.

These representations allow us to easily identify the points of continuity and discontinuity between the visual features of the two data sets, while selective ImageMontages quantify the differences at each step of the value. As we can see from the visualisations, each nation has a specific Local Colour: visual attributes and dominant tones, which relate to specific cultural territories.

A specific visual model was then developed to visualise the categories and their subcategories. It shows the main category as the central bubble, around which the sub-keywords are arranged in circles so that relevant issues can be identified. Each image is tagged with one or more keywords or sub-keywords, and the size of each bubble is proportional to the number of images tagged with that keyword or sub-keyword.
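The counting behind such a bubble model can be sketched in a few lines of Python. The names `keyword_counts` and `bubble_radius` and the input format are hypothetical, and the choice to make the bubble area (rather than the radius) proportional to the count is an assumption, since the text does not say which dimension was scaled.

```python
import math
from collections import Counter


def keyword_counts(image_tags):
    """Count how many images carry each keyword.

    `image_tags` maps an image id to the keywords assigned to it
    (hypothetical structure; one image may have several keywords).
    """
    counts = Counter()
    for tags in image_tags.values():
        # set() guards against the same keyword listed twice on one image.
        counts.update(set(tags))
    return counts


def bubble_radius(count, scale=1.0):
    """Radius of a bubble whose AREA is proportional to `count` (assumption)."""
    return scale * math.sqrt(count)
```

For example, a keyword attached to four images gets a bubble twice as wide as a keyword attached to one image, so that the areas stay in a 4:1 ratio.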

In order to compare the relevance of each keyword to each of the sources, we made a series of bar charts. Each one represents the profile of a single source. In this way we could easily contrast the different “vocations” of the sources by highlighting the space given to each topic.

The Website

The conclusion of our experimental project has been the creation and development of the website where the main visualisations have been collected. In the process of creating this interface our focus has remained on the same idea from which this project originated: to increase awareness of the way we see and the way we are seen by a culture radically different from our own. This was done by making a tool which makes the topic comprehensible to outsiders, without the need for simplification, as well as to specialists in the field.

From a data visualisation point of view, the biggest challenge was to find an appropriate structure: simplified enough to show the big picture emerging from the data and detailed enough to preserve all the interesting details in the data. We acted on this in two ways: first, we decided to set up the narration consistently on a comparative level; and second, to give the user a tool for a multifaceted exploration of data. Keeping the visualisation and the storytelling on a comparative level helped to keep the exploration clean and structured, which also enabled us to explain each level of the research.

The narrative leads the user into a more in-depth engagement with the data, where their own hypotheses can be formulated and tested. To make this possible we built the exploration tool, a personal instrument for navigating the data set. It aims to enrich current interfaces with additional visual cues about the relative weights of metadata values, as well as how that weight differs from the global metadata distribution.

To conclude, we can say that the work allows the user not only to explore all the singular elements of the database but also to focus on the database as a whole. We hope that this work will provide insight into the big picture for the general reader while offering the specialist a practical tool to test hypotheses and intuitions. As the title states, the overall purpose and outcome is to show a big picture including all the facets that make it unique.

Full Thesis

For any comment or suggestion please feel free to contact me or the DensityDesign Lab.

posted by Giovanni Magni
Sunday, August 3rd, 2014

Borders. A geo-political atlas of film’s production.

Hi there! I’m Giovanni, and with this post I would like to officially present the final version of Borders, my master’s degree project, developed within DensityDesign, in particular with Professor Paolo Ciuccarelli and the research fellows Giorgio Uboldi and Giorgio Caviglia (at the time of writing a postdoctoral researcher at Stanford University).

From the beginning, the idea was to create a visual analysis of cinema and everything related to this industry, but the first questions that came to mind were: what can we actually do on this topic? What can we actually visualize?

Over the past years DensityDesign has produced some minor projects on this topic (link #01, link #02), and a simple search turns up plenty of other attempts on the web. The problem was that all the projects we looked at had limitations: most of them relied on small datasets or did not answer a proper research question and, above all, none of them showed the relevance of the film industry and how it affects society and social dynamics.

Fascinated by some maps I had the chance to see during the months of research, I started to think about a way to visualize how cinema can bring countries closer, even when they are not geographically close. The basic idea was that the film industry involves thousands of shared productions and collaborations (between actors, for example, or directors, or companies), and that we could try to visualize these collaborations and make them clear with new maps.

After a long process of revising our goals and research questions, we decided to focus on the relevance of the film industry within society over the last century, using data collected online to visualize how relations between countries evolved over time. The aim was to use cinema as a key to read society: the dense network of collaborations within this industry generates new proximity indexes between countries and, starting from them, new maps that show the economic and political dynamics inside “Hollywood” and a sort of new world based on how the film industry developed relations and connections over the last 100 years.

Once we had decided what to do, the second step was to find enough data to build a relevant analysis. There are many platforms where you can find information about movies, such as Rotten Tomatoes and IMDB. We selected two main sources for this project, the Internet Movie Database and Wikipedia. Both are based on user-generated content, which gives us the chance to see how movies enter the collective imagination and attract global interest.

The first caught our attention thanks to an open subset of the whole archive (link), which contains data on more than a million films and is updated roughly every six months. The second let us analyse this industry across different cultures and language versions and, thanks to its APIs and the related DBpedia portal, is essentially a huge container of metadata about movies.


Starting from the huge IMDB archive, we decided to focus on the kinds of information that reflect economic and political aspects, and selected four specific datasets:

– Locations (all known locations, film by film – 774,687 records)
– Companies (all the companies involved in the production, film by film – 1,632,046 records)
– Release Dates (all the release dates in each country, film by film – 932,943 records)
– Languages (the languages appearing in each film – 1,008,384 records)

    After a huge cleaning process (God bless whoever invented Python) I proceeded to generate the proximity indexes mentioned above. The process is laborious but conceptually simple: every index is created by counting how many times the movies of one country have a connection with another country. For example, a proximity value between France and Germany is the number of times German companies have been involved in the production of French movies, or the total number of French film locations shot on German territory. For each of the four datasets we selected, I calculated this index for every possible pair of countries (roughly 200 × 200) with the idea of using it later in Gephi (a network visualization software) as the edge weight between nodes (nations).
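The counting step described above can be sketched as follows. This is not the original script: the function names, the `(film_country, linked_country)` record format and the decision to skip domestic connections are assumptions made for illustration. The `Source`, `Target` and `Weight` headers are the column names commonly used when importing an edge table into Gephi.

```python
import csv
from collections import Counter


def proximity_index(records):
    """Count connections between pairs of countries.

    `records` is an iterable of (film_country, linked_country) pairs, one
    per connection found in a dataset, e.g. a French film that involved a
    German company yields ("France", "Germany"). Hypothetical format.
    """
    weights = Counter()
    for film_country, linked_country in records:
        # Assumption: a country's connections with itself are not of interest.
        if film_country != linked_country:
            weights[(film_country, linked_country)] += 1
    return weights


def write_edge_table(weights, fileobj):
    """Write the weighted pairs as a CSV edge table to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])
```

Running `proximity_index` once per dataset (locations, companies, release dates, languages) would give four separate edge tables, one per network.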


    “Where” a shot is taken is a choice that depends on various factors; two of them are production costs and the need to move to a specific place required by the film’s plot. An entire cast moves to a different location either to follow the film’s theme, which can require specific places and sets, or to save on production costs by moving to places where, for various reasons, shooting is cheaper.

    Analysing the whole list of locations recorded on IMDB, the aim is to visualize which countries benefit from these dynamics and how nations behave differently in this import/export of shooting.

    At the same time, the same information supports an additional analysis of individual countries: we can visualize the percentage of locations in a foreign country relative to the total number of locations recorded in the archive and see how nations behave differently (next figure) or, for example, consider the production of a single nation and see where around the world it has shot, generating “individual” maps.
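The percentage of foreign locations per country can be computed with a short Python sketch. The function name and the `(production_country, location_country)` record format are hypothetical and stand in for whatever the cleaned locations dataset actually looked like.

```python
from collections import defaultdict


def foreign_location_share(locations):
    """Percentage of each country's recorded locations that lie abroad.

    `locations` is an iterable of (production_country, location_country)
    pairs, one per recorded location (hypothetical format).
    """
    total = defaultdict(int)
    foreign = defaultdict(int)
    for production_country, location_country in locations:
        total[production_country] += 1
        if location_country != production_country:
            foreign[production_country] += 1
    # Every key in `total` has at least one record, so no division by zero.
    return {c: 100.0 * foreign[c] / total[c] for c in total}
```

Filtering the same records down to one production country and counting by `location_country` would give the data for an "individual" map of that nation.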


    A study of collaborations between national productions and companies from other countries again shows an economic side of this world. The most interesting part of this analysis is a network of countries more or less attracted to each other according to a value that counts how many times a particular connection occurred (for example, the number of times Italian movies involved Spanish companies). As we see in the next figure, this network is dominated by Western and economically more developed countries; it essentially shows the importance of each national film industry within global production.

    At the same time it is interesting to focus on smaller economic systems and geographic areas, showing the historical evolution of their internal dynamics. In the next figures we can see how the situation on the European continent has evolved and changed markedly over time:

    And how the situation changed in a single country such as Canada, showing the percentage of Canadian companies involved in production, decade by decade:


    Our view was that the themes addressed within a national film production are strongly connected to the history of the country and to the events in which the nation has been involved. A strong presence of a foreign language in the dialogue of a country’s movies could therefore represent a link, a connection between the cultures and nations concerned.

    The bipartite network in the next figure shows how countries and languages arrange themselves mutually according to the connections between them, generating new clusters and showing relationships developed over time. It is important to point out that, to highlight this feature, the link between a nation and its own mother tongue has not been included in the network: this value is numerically much bigger than any other connection and would force the network into an uninteresting shape.
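Building the weighted country-to-language edges while leaving out each nation's own language can be sketched like this. The function name, the record format and the `native_languages` lookup are all hypothetical; how the original work decided which language counts as a country's mother tongue is not stated.

```python
from collections import Counter


def country_language_edges(film_languages, native_languages):
    """Weighted edges of a country-language bipartite network.

    `film_languages`: iterable of (country, language) pairs, one for each
    appearance of a language in a film of that country (hypothetical).
    `native_languages`: dict mapping a country to the set of languages
    treated as its own (hypothetical lookup table).
    """
    edges = Counter()
    for country, language in film_languages:
        # Skip the dominant country/own-language link, as described above.
        if language in native_languages.get(country, set()):
            continue
        edges[(country, language)] += 1
    return edges
```

Without the filter, a pair such as Italy and Italian would outweigh every other edge and pull the layout around it.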


    In this case the available data turned out to be messy and confusing compared with the previous datasets. Tracking the release dates of movies in different countries is not easy, and it reveals another peculiarity: the IMDB archive holds complete data for the biggest and most famous productions, while data for small national systems and less important movies are incomplete or not significant.

    To develop a correct analysis of global movie distribution it was necessary to take a step back and base it on a reliable set of data. Specifically, we decided to focus on the distribution of American movies around the world, since in the database they are far more numerous than those of other countries and their release dates are better recorded. Furthermore, we decided not to evaluate data on TV programs and TV series, which follow different, specific distribution paths.

    We thought that the best way to check for trends over time was to visualize, for each decade, how many American movies were released in each other nation and how long after the American release date (in days of delay), producing a sort of measure of economic and cultural distance between the United States (which can be considered the leading nation) and every other country. The assumption is that a movie is released earlier where there is more interest and therefore more chance of profiting from it. The visualization shows how distribution got faster decade by decade, from the 1980s, when American movies were released in other countries after an average delay of at least six months, to the present, when Hollywood movies are released almost everywhere in the world within three months of the American premiere.
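The delay measure can be illustrated with a small Python sketch. The function name, the `(film_id, country, release_date)` record format, the `"USA"` label and the rule of assigning a film to the decade of its home release are assumptions, not a description of the original processing.

```python
from collections import defaultdict
from datetime import date


def average_delay_by_decade(releases, home="USA"):
    """Mean release delay in days, per (decade, country), relative to `home`.

    `releases` is an iterable of (film_id, country, release_date) records
    with `release_date` a datetime.date (hypothetical format).
    """
    releases = list(releases)
    home_dates = {}
    for film, country, day in releases:
        # Keep the earliest home release if several are recorded.
        if country == home and (film not in home_dates or day < home_dates[film]):
            home_dates[film] = day
    delays = defaultdict(list)
    for film, country, day in releases:
        if country == home or film not in home_dates:
            continue
        premiere = home_dates[film]
        decade = premiere.year // 10 * 10  # assumption: decade of home release
        delays[(decade, country)].append((day - premiere).days)
    return {key: sum(values) / len(values) for key, values in delays.items()}
```

A negative value would mean that films reached a country before their home release, which the averaging handles without special treatment.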


    In this last section we examined how the films of each country are represented in the different language versions of Wikipedia through their pages; the goal was to gauge the overall interest in national productions by evaluating the number of pages devoted to them on each Wikipedia.

    To collect the necessary data we used both DBpedia and the encyclopedia’s APIs. We counted, on every Wikipedia version, how many movies from every country have a page of their own, and used this value (combined with the page size) to create a proximity index between nations and to generate a bipartite network and some minor visualizations.

    Since all the sources the data come from are based on user-generated content, what we see in these visualizations is an image of global interest in cinema rather than a visual representation of an official productions database. It would be interesting to repeat the same process on some kind of “official data” and see what the differences between the two versions are.

    What we have is a sort of thematic atlas that could be built on many other kinds of data (music, literature...) while keeping its purpose: to be an observation of society (and its global evolution) through the information coming from an artistic movement!

    For any comment or suggestion please feel free to contact me or the DensityDesign Lab.

    To close this post, some work-in-progress pictures:

