Data for Development: One Step Forward, Two Steps Back

Samantha Bradshaw

Series: Governing the Internet: Chaos, Control or Consensus?

February 24, 2014

Governments, information and telecommunication companies, international organizations and humanitarian aid agencies are embracing data as a new tool that will revolutionize the way we address some of the world’s most serious problems, including food shortages and price volatility, financial crises, disease outbreaks and human rights violations. Real-time data from cellular providers, social media networks and other similar sources can be mined and analyzed, allowing policy makers to gain unique insights into these problems. Information from these so-called “big data” sets can be used to develop better policy; however, there are privacy and security risks. Without sufficient resources to identify, analyze and address these risks as big data use increases, the pursuit of data for development could become counterproductive in terms of improving the life chances of the world’s poorest, moving us one step forward and two steps back.

Globally, people are connecting through technology at an unprecedented rate. In 2008, for the first time in history, the number of devices that connected to the Internet exceeded the world population and by 2012 there were more than six billion mobile phone subscriptions (GCN 2013). As more people connect to the Internet, our interactions with digital products and services create a wealth of data that is collected and stored by private corporations. This data includes: personal information, such as an individual’s name, credit card number or Internet Protocol address; lifestyle information, such as an individual’s race, religion or sexual orientation; and behavioural information, such as an individual’s Internet viewing and purchasing habits. While some of this data collection is incidental, much of it is an intrinsic part of the business models driving the global information economy.

Big data is “an umbrella term that represents the massive volume and variety of data that is created and stored on the Internet” (Bradshaw, Harris and Zeifman 2013). Unlike traditional sources of information such as statistics, household surveys or census data, which are effective at tracking long-term trends, big data can generate a real-time picture for decision makers (Global Pulse 2012). Enabled by the Internet and mobile technology, activities such as phone calls and text messages, interaction with social media platforms, filling out online surveys, making online purchases and creating user-generated content all produce stored data, which can be collected and analyzed in aggregate, and in real time. With this rapid growth in rich, high-volume data sources, high-powered analytics can uncover a wide variety of patterns, trends and correlations about groups and individuals, and open up significant opportunities to make robust inferences about the world.

Recognizing this, a number of international organizations such as the World Economic Forum, the UN’s Global Pulse and the Organisation for Economic Co-operation and Development have begun to embrace data analytics as a tool for innovating development strategies and providing humanitarian aid.

However, the private sector plays a large role in collecting and storing much of this data, treating it as an increasingly valuable commodity that drives much of the global information economy. As a result, a lot of the big data that might be put to use in the service of various public goods remains locked away behind the firewalls and intellectual property protections of private information and telecommunication companies. As big data analytics is increasingly discussed as a tool for development, pressure is growing from the so-called “open data” movement for private firms to open up their data mines for public research.

Open data is data that “anyone is free to use, reuse and redistribute” (Office for the Coordination of Humanitarian Affairs [OCHA] 2013). Unlike big data, which can be privately owned or have varying levels of access, open data is free from copyright and can be shared within the public space (Global Pulse 2013). The movement supporting open data often frames it as a means of further enhancing scientific and social research, as well as providing governments and organizations with accountability and transparency mechanisms.

Global Pulse, an initiative launched by the United Nations that has been at the forefront of this movement, put forward the concept of “data philanthropy,” whereby “corporations would take the initiative to anonymize their data sets and provide this data to social innovators to mine the data for insights, patterns, and trends in real time or near real time” (Global Pulse 2013). This notion was extended by the World Economic Forum, which describes a data ecosystem where “actors in the public, private and development sectors…recognize the mutual benefits of creating and maintaining a ‘data commons’ in which this information benefits society as a whole while protecting individual security and privacy” (World Economic Forum 2012). While data commons and data philanthropy are well-intended efforts to improve the world, there are important privacy and security risks to consider.

Privacy is one of the most sensitive issues when it comes to accessing, utilizing and securing data. First, Internet users, who are the primary producers of data, may be unaware that they are producing data; they may also be unaware of what their data is being used for and by whom. People routinely consent to terms of service agreements and complete online surveys and forms, such as health questionnaires or store loyalty program applications, without fully realizing how their data might be used or misused. Furthermore, many people who are aware of the privacy risks associated with using the Internet and mobile technology are unable to avoid them, both because terms of service agreements give companies permission to collect and store an individual’s data and because there is a lack of alternative services without these privacy costs. While consumers consent to these agreements, in reality they have little to no ability to negotiate the contracts themselves. Because many Internet services, such as search engines, email or social media, have become an essential part of society, opting out of some terms of service agreements essentially amounts to opting out of the economy and digital public sphere.

Second, data is often sold or distributed to third parties without an individual’s knowledge of where the data is going and for what purposes it will be used. The situation becomes even more complicated when data is sold on tertiary markets, as it becomes significantly more difficult for individuals to track the location of their personal information. While most companies purchase data as a major marketing asset, governments and non-governmental organizations have also been known to buy and sell data (Bradshaw, Harris and Zeifman 2013).

These issues should raise important privacy concerns for individuals around the world. For example, all that is needed to identify an individual American citizen 87 percent of the time is “the triple identifier” of birthday, gender and postal code (Buytendijk and Heiser 2013). Additionally, even if data is anonymized before it is sold, recent studies indicate it is fairly easy to de-anonymize information (Letouzé 2012). This means that even if anonymized data is collected and then shared for good faith purposes — such as providing an early warning signal for mass atrocities or crimes against humanity — an individual’s data could still easily be traced back to them.
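The logic behind the “triple identifier” can be illustrated with a minimal sketch. It assumes a small, entirely made-up data set in which names have been removed but date of birth, gender and postal code remain; the records and the function name `unique_fraction` are hypothetical and are not drawn from any of the studies cited here. The sketch simply counts how many records carry a combination of these three attributes that no other record shares, since those are the records that anyone holding a second, named data set with the same fields could potentially match.

```python
from collections import Counter

# Hypothetical, made-up records: (date of birth, gender, postal code).
# Names and other direct identifiers are absent, as in an "anonymized" release.
records = [
    ("1980-03-14", "F", "02139"),
    ("1980-03-14", "F", "02139"),
    ("1975-11-02", "M", "02139"),
    ("1990-07-21", "F", "10001"),
    ("1962-01-30", "M", "60614"),
]


def unique_fraction(rows):
    """Return the share of rows whose attribute combination appears exactly once."""
    if not rows:
        return 0.0
    counts = Counter(rows)
    unique_rows = sum(1 for row in rows if counts[row] == 1)
    return unique_rows / len(rows)


# Three of the five toy records are one of a kind on these three attributes.
print(unique_fraction(records))  # 0.6
```

In this toy example, three of five records are unique on the three attributes. The 87 percent figure cited above refers to the American population as a whole, not to this illustration.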

When information can be traced back to a particular individual or group of individuals, it can put these people at risk. Information that identifies humanitarian aid workers or individuals who protest against oppressive government regimes, acts of violence or human rights violations can be used by governments or armed groups for retribution. This occurred when the Egyptian government used mobile call logs to track down dissenters in the aftermath of anti-government food protests in 2008 (Ahmed et al. 2009); when the Taliban threatened to target foreign aid workers responding to the floods in Pakistan in 2010 (OCHA 2013); and when, in 2014, the Ukrainian government used mobile and GPS technology to send protestors the following text message: “Dear subscriber, you are registered as a participant in a mass riot” (Walker and Grytsenko 2014).

Data proliferation could pose other even more perverse security risks. Many countries are characterized by “historical divisions, ethnic conflicts and other social and cultural vulnerability that heighten the risk that big and open data will be misused” (Nyst 2013). For example, discrimination and persecution could occur if de-anonymized data pertaining to an individual’s sexual orientation, religious beliefs or political affiliations were made openly or easily available. In a context such as this, opening up data could further empower those who are already empowered and disenfranchise others, ultimately leading to more human rights abuses and closed, undemocratic societies.

It is worth remembering the role that punch-card technology played in facilitating the Holocaust, by providing a means for the Nazis to collect and collate information on the German population, and deliberately target a subset.[1] Of course, this technology continued to be used in other parts of the world without such horrific outcomes and, ultimately, with significant benefit. As with previous developments in data collection and analysis, however, we must ensure that we carefully think through how things could go wrong with big data, and build robust mechanisms to safeguard privacy and security.

As individuals across the planet connect to the Internet and to mobile networks, creating more and more data, there are opportunities to make new observations and inferences about the world and, one hopes, craft better policy outcomes. There is, however, an absence of legal and technical safeguards for data protection, privacy and accountability, particularly in many developing countries where most of this new connectivity and data production is happening. Relevant policy actors need to soberly assess these risks to privacy and security, and move quickly to fill current governance gaps. Otherwise, as we step forward to embrace big and open data’s significant potential to make policy better, we risk undermining opportunities for people in developing countries.

Works Cited

Ahmed et al. 2009. “Threats to Mobile Phone Users’ Privacy.” www.engr.mun.ca/~mhahmed/privacy/mobile_phone_privacy_report.pdf.

Black, Edwin. 2012. “IBM’s Role in the Holocaust — What the New Documents Reveal.” The Huffington Post. February 27. www.huffingtonpost.com/edwin-black/ibm-holocaust_b_1301691.html.

Bradshaw, Samantha, Kyle Harris and Hyla Zeifman. 2013. “Big Data, Big Responsibilities: Recommendations to the Privacy Commissioner on Canadian Privacy Rights in a Digital Age.” CIGI Junior Fellows Policy Brief No. 8. Waterloo: CIGI. www.cigionline.org/sites/default/files/no8_0.pdf.

Buytendijk, Frank and Jay Heiser. 2013. “Confronting the Privacy and Ethical Risks of Big Data.” Financial Times. September 24. www.ft.com/intl/cms/s/0/105e30a4-2549-11e3-b349-00144feab7de.html#axzz2tyMtfrLK.

Global Pulse. 2013. “Big Data for Development: A Primer.” June. www.unglobalpulse.org/sites/default/files/Primer%202013_FINAL%20FOR%20PRINT.pdf.

GCN. 2013. “Tracking the Evolution of Big Data: A Timeline.” May 28. http://gcn.com/articles/2013/05/30/gcn30-timeline-big-data.aspx.

Letouzé, Emmanuel. 2012. “Big Data for Development: Challenges and Opportunities.” UN Global Pulse. www.unglobalpulse.org/sites/default/files/BigDataforDevelopment-UNGlobalPulseJune2012.pdf.

Nyst, Carly. 2013. “Data for Development: The New Conflict Resource?” Privacy International. www.privacyinternational.org/blog/data-for-development-the-new-conflict-resource.

OCHA. 2013. Humanitarianism in the Network Age. https://docs.unocha.org/sites/dms/Documents/WEB%20Humanitarianism%20in%20the%20Network%20Age%20vF%20single.pdf.

Reaves, Jessica. 2001. “IBM: Haunted by Nazi-Era Activities?” Time Magazine. February 13. http://content.time.com/time/nation/article/0,8599,99249,00.html.

Walker, Shaun and Oksana Grytsenko. 2014. “Text Messages Warn Ukraine Protestors They Are ‘Participants in Mass Riot.’” The Guardian. January 21. www.theguardian.com/world/2014/jan/21/ukraine-unrest-text-messages-protesters-mass-riot.

World Economic Forum. 2012. “Big Data, Big Impact: New Possibilities for International Development.” www3.weforum.org/docs/WEF_TC_MFS_BigDataBigImpact_Briefing_2012.pdf.

[1] For more information, see Reaves (2001) and Black (2012).

Part of Series

Governing the Internet: Chaos, Control or Consensus?

Internet governance involves highly complex, transboundary governance challenges in a rapidly evolving technical environment. Identifying effective policy options that can balance competing interests and conflicting values requires foresight and analysis. Governing the Internet presents timely expert opinion from CIGI staff and a variety of guest authors on governance options across a range of vital Internet governance issues.

About the Author

Samantha Bradshaw

Samantha Bradshaw is a CIGI fellow and assistant professor in new technology and security at American University.