Data for Development Senegal Challenge: Analyzing the Migration of the Senegalese People during Religious Festivals and its Consequences on Health Issues using Orange Mobile Dataset

Beginning in April, the Orange Group and Sonatel launched an innovation challenge called “Data for Development Senegal,” which aims to improve societal development using big data. According to the launch site, 5 “priority subject matters” have been defined: health, agriculture, transport/urban planning, energy, and national statistics. The winning teams will be chosen in April 2015 and rewarded on the basis of advances in anonymization algorithms, data mining, data display, and data cross-referencing. In addition to development progress, the organization seeks to improve the “ecosystem of local companies and start-ups”.

My team’s project, “Analyzing the Migration of the Senegalese People during Religious Festivals and its Consequences on Health Issues using Orange Mobile Dataset”, was accepted for participation earlier this year. The project is an interdisciplinary research collaboration between the fields of computer science and public health, and the team is composed of researchers from the U.S. and Senegal. Through our analysis, we seek to create interactive data visualizations that track the movements of the Senegalese population during various religious festivals and relate them to health issues in the affected regions. The purpose of this research is to combat the disease outbreaks that occur during mass pilgrimages throughout the year. These outbreaks are often due to sanitation shortfalls in water storage and septic systems. By paying special attention to the most prominent religious festivals in the area, we hope to correlate specific locations with disease outbreaks so that we can better suggest appropriate actions. Our data visualizations will help convey this information and raise awareness.

The released data includes the latitude and longitude of the caller and recipient cell towers, as well as caller identification, call time, and call duration. By mapping the Call Detail Records (CDR) for specific dates, we can estimate the number of individuals moving to and from festival cities at any given time. We are also using health data received from the Ministry of Health for the years 2010 and 2011. Due to the recent Ebola outbreak, however, retrieving the health records for 2012 has not been possible. To work around this, we will implement a machine learning algorithm that estimates disease outbreaks for the missing year. As for the data visualization process, we have already established contact with Andrew Hill of CartoDB, which we plan to use for our mapping needs. By the end of the project, we hope to have a successful modeling system that various Senegalese health departments can use to help limit the spread of disease during future festival migrations.
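To make the CDR counting step more concrete, here is a minimal Python sketch of one way it could be done. It is not our actual pipeline. It assumes each record is a dictionary with hypothetical field names (`caller_id`, `call_time`, `caller_lat`, `caller_lon`), and it counts the distinct callers whose tower lies within a chosen radius of a festival city on each day. The coordinates, dates, and radius below are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def daily_unique_callers(records, city_lat, city_lon, radius_km):
    """Count distinct callers seen at towers near the city, per calendar day."""
    callers_by_day = defaultdict(set)
    for rec in records:
        dist = haversine_km(rec["caller_lat"], rec["caller_lon"],
                            city_lat, city_lon)
        if dist <= radius_km:
            day = datetime.fromisoformat(rec["call_time"]).date()
            callers_by_day[day].add(rec["caller_id"])
    return {day: len(ids) for day, ids in callers_by_day.items()}


# Made-up sample records (hypothetical schema and values, not real CDR data).
records = [
    {"caller_id": "a", "call_time": "2013-01-01T10:00:00",
     "caller_lat": 14.85, "caller_lon": -15.88},
    {"caller_id": "a", "call_time": "2013-01-01T12:00:00",
     "caller_lat": 14.85, "caller_lon": -15.88},
    {"caller_id": "b", "call_time": "2013-01-01T11:00:00",
     "caller_lat": 14.86, "caller_lon": -15.87},
    {"caller_id": "c", "call_time": "2013-01-01T11:00:00",
     "caller_lat": 14.69, "caller_lon": -17.44},  # far from the city
    {"caller_id": "b", "call_time": "2013-01-02T09:00:00",
     "caller_lat": 14.85, "caller_lon": -15.88},
]

# Illustrative city location and a 20 km radius (both assumptions).
counts = daily_unique_callers(records, 14.85, -15.88, 20.0)
for day in sorted(counts):
    print(day, counts[day])
```

Comparing these daily counts for a festival date against an ordinary day gives a rough picture of the inflow to a city. Distinct callers are only a proxy for the number of people present, since not everyone places a call every day.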

For more information about the Data for Development Senegal Challenge, please