- Data has been referred to as the “gold” in the era of big data; however, there are several important differences between data and gold, including the marginal cost of reproduction and usage, and the relationship between scale and value.
- Establishing property rights would make it clear what personal data individuals are willing to have collected, and for what price.
- Corporations would then need to ask permission, pay to collect and use the data, and provide both data and cash options for the use of their services.
- The lack of an international system for the regulation of data presents an opportunity: nations that develop strategic policy that allows for effective data use while ensuring the integrity of data and property rights would gain significant comparative advantage.
More than a decade ago, data supposedly became the new “gold” of our “age of big data” (Nissenbaum 2004; World Economic Forum 2012; Rotella 2012). If data has become the source of all new riches, then it is critical for societies wishing to secure economic growth and prosperity to devise a data strategy.
Focusing on economics and growth, there are a few significant differences between data and gold. This essay will focus on two: the marginal cost of reproduction and usage, and the relationship between scale and value.1 Once data has been collected and stored, the marginal cost of creating another copy — and the cost of transporting it to the other side of the world and back — approach zero. For gold, the cost of producing another unit is very similar to the cost of producing the first, and transportation costs are both high and distance dependent. Further, the usage of gold is absolutely exclusive. If a unit of gold is used to make a gold watch, the same unit of gold cannot be used for anything else without first melting the watch. However, if data is used to build a financial algorithm, it can also be used to build a marketing algorithm or even a different financial algorithm. All these algorithms would work perfectly well at the same time.2
A firm might not want others to have access to the data as its competitors might be able to develop a better algorithm using the same data. Nonetheless, not only can multiple firms and individuals use the same data, but doing so does not diminish its usefulness or hurt the data in any way. Additionally, multiple usages of the same data can be done in different locales and times, that is, either concurrently or many years into the future.
Here lies another important characteristic of data: it does not lose its value with time or use. Indeed, it might be worth more over time if it can be linked with other data, which leads to yet another difference between data and gold. No matter how much gold an economic actor possesses, its worth per unit stays the same. The opposite is true of data: the more data one has, the more valuable each piece of that data and the data overall are. Indeed, for the purpose of training neural networks (what is now called artificial intelligence and machine learning), those who have more data have an unassailable advantage over others. The reason is that with current techniques, the more data used to “train” a specific neural network, the better the algorithm it produces. As a result, it is already questionable whether anyone can compete with incumbents such as Google (Alphabet), Facebook, Microsoft or Amazon (Arrieta Ibarra et al. 2017; Porter 2018; Duhigg 2018; Khan 2017; Radinsky 2015).
Last, but certainly not least, there is one more economic difference between gold and data. Most of the commercial uses of gold, and the business models around them, are well-known, but since data is the raw material for innovation, there is little reason to believe we know how it will be used in the future, what the real value of different kinds of data will be or even what the business models will look like. The only certainty about data is that for the foreseeable future, there will be significant experimentation. Indeed, the locales where most of the experimentation will occur are more likely to reap the associated economic growth benefits. This is an area of economic similarity between data and gold: the places where gold is processed have enjoyed sustained growth, not the places where gold has been mined.
These several inherent differences between data and gold can serve as the basic principles for a data strategy from the point of view of economic growth. These should not diminish — or even be prioritized over — the societal concerns of a data strategy.
The Need to Establish the Market for Data
If data is the main resource for growth and innovation, policy should ensure that well-functioning data markets with efficient price-setting mechanisms exist to enable the optimal allocation of resources, incentivizing growth and innovation.3 However, for any economic transaction to happen, there is a need to establish property rights, decide what they entail and set the rules about the transfer of said property rights in whole or in part.
Currently, in most countries, such rules either do not exist or are, at best, aspirational. The result is that one side of the equation, namely the corporations that gather the data, has de facto full and exclusive property rights, unlimited in time and usage, over the data it gathers. It is here that the confusion between data issues and privacy is the most damaging to society and economic growth.
In today’s economy and society, not entering a website is not a viable option. The issue is not whether the user is aware their life is now coded to become the commodity called data. Instead, questions arise around who has a right to collect what data, who has the right to define what the data is used for and how (if at all) the data can be used.
These are classic issues of defining property rights (Coase 1960; 2013; Posner and Weyl 2017).4 Indeed, by establishing property rights, it would be immediately clear what personal data people are willing (or unwilling) to have collected and, at least as importantly, for what price. Having markets that put prices on data would also have the wonderful effect of optimizing the allocation of resources to the collection, acquisition and processing of data, resulting in a positive impact on economic growth.
The current situation is by far the worst imaginable for citizens, locales and future economic growth. Data is gathered by organizations, mostly for-profit corporations, and unless specifically noted (for example, in the health-care field) it belongs to the gatherer, who can then utilize it for free without any time or place limitations, while enjoying full exclusivity (that is, they can deny anyone else access to the data and/or sell it to whomever they wish on whatever terms they deem most beneficial). Further, they are not required to let people know what data they have collected, whether it is accurate, where and how they store it, how they use it, whether they sell it or to whom they sell it. If this sounds eerily similar to the conditions that turned the relatively minor issue of higher-than-expected subprime mortgage defaults in the United States into the Great Recession of 2008, that is because it is. With data, however, there is more collection, trade and storage, and even less is known about who owns and uses what elements of the data, the quality and accuracy of both the data and the algorithms built on top of it, where the data is stored and how safe it is.
Modern life involves a frenzy of data collection. Presently, each private corporation does its best to collect at least the same amount of data as other companies, and then prevent others from having access to that data. From smart watches to mobile devices, computers, televisions, home alarms, heating and cooling systems, cars or fitness equipment, the same data is being collected again and again by different competing corporations. As a consequence, the lives of citizens in modern democracies are under such intense surveillance by multiple organizations that it makes the data collection efforts described by George Orwell in his dystopian novel 1984 look like a semi-professional attempt by benevolent amateurs (Orwell 1949). Furthermore, neither citizens nor their communities see any of the economic growth benefits that are the fruits of the intensive efforts to gather, process and utilize their data.
Establishing clear property rights for data would solve most of these issues. With clear and full property rights given to individuals, corporations will have to ask for permission, pay to collect and use the data and accurately price their services since individuals will now have a choice to pay with either cash or data. For example, under such conditions, Facebook will have to offer users the option to pay for the usage of their app, in which case Facebook will not be allowed to collect their data. Thus, Facebook will need to put a price that reflects its valuation of the data it loses access to. In addition, there would be a clear incentive and need to keep accurate data storage facilities — the quality and accuracy of the data can then be checked and assured. It would be clear who owns what data and how it is used and stored. Most importantly, the data would only have to be collected once.5
From the point of view of regional economic growth and innovation policy, establishing property rights for data is especially important due to two inherent qualities of data: increased value to scale and the fact that data is a non-rivalrous good. The latter refers to the fact that data can be used at different times by many users for many purposes without diminishing the ability of others to use it.6 The great uncertainty about the future uses of data, and about the business models and opportunities associated with them, means that access must be ensured for yet-to-exist companies and entrepreneurs who will try to develop yet-to-be-thought-about products; otherwise, the basis of future innovation and innovators will be undermined. Unless access to this data is ensured, the future and present companies and entrepreneurs of a locale that is not already the home base of a leading incumbent will have diminished chances of being able to scale up.
In short, a critical component of a national or regional data strategy is establishing property rights and rules of usage, with an eye on future access in addition to the present. The most elegant solution would be to grant people full property rights to their personal data and to establish a fully transparent open-source licensing system with limited access/usage rights to data gathered as part of public or semi-public activities, such as transportation services (run by either public or private companies) or smart cities.7 Significant experimentation should be conducted on various models, from full open-source to two-level licensing, where a license to use is granted to the gatherer in exchange for sharing the data with current and future local citizens and companies, either for free or for a nominal fee. Thus, for example, app services, such as Waze, and transportation-for-pay services, such as Uber and Lyft, which operate in various cities, should make their data readily accessible to cities and their residents in exchange for the right to use it. With regard to personal data, this can be collected in a universal reservoir (which will be either centralized or fragmented depending on security and efficiency concerns) and citizens can then check its accuracy and allocate (for a price) the right to use it. For that to work, full transparency on who asks for access to this data is needed.
While many, especially industry lobbyists, might argue these conditions are so complex that they are technologically unfeasible or so cumbersome that they are unworkable, reality has already proved them wrong. These conditions currently underlie Estonia’s e-government policy, which is considered the most advanced and competitive in the world. Indeed, Estonia’s data strategy is now a competitive advantage that the country skillfully uses to lure international business and talent to make Estonia their base of operations (Heller 2017). Further, market solutions already exist. Two examples of such systems are Solid (social linked data), developed by Tim Berners-Lee and his collaborators at the Massachusetts Institute of Technology (MIT), and OpenPDS, developed by researchers at the MIT Media Lab.8
Technological and many of the regulatory issues have already been ironed out at least once, making this policy feasible with regard to both public and private services. Further, as Estonia has already proven, being the leader grants significant comparative advantage.
Establishing the Rules around Data Gathering and Usage
Another key issue is the need to establish rules around who is allowed to collect what data and for what purpose. This also includes enabling accurate pricing mechanisms depending on the level of data collection and right of usage. Solutions to this can be seen as deciding on a point on a continuum from a free, unregulated market-based system to a licensed data-gathering regime. At one end of this continuum, companies and individuals are allowed to collect data if given permission by the users. In turn, these companies would provide either data or cash options for the use of their services (such as an app). The role of the government in this system is then to ensure that a repository (either publicly or privately managed) exists that accurately reflects all data that is collected. This repository will provide the ability to check for accuracy and adherence to the “collected once” principle, as well as the current licensing and approvals status. Thus, for example, if a user opts to pay with data for using fitness app X, regulations will enable the repository system to record the transaction, what data is collected (not the data itself), the extent to which the individual has allowed the company to use the data and all further transactions on the data. Because users hold the property rights to their own data, they can pay with the same data for other uses, and while allowing the fitness app to collect and use specific data, they might not grant the company a license to sell the data to third parties. The system, therefore, needs to keep an accurate record of all requests for data, by whom and for what reason. It must also ensure that all individuals can know exactly what data has been collected about them, verify or challenge it, and see an accurate map of all the transactions and licensing agreements they approved, as well as all requests for the data and who made them.
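To make the repository idea more concrete, the following is a minimal sketch in Python of the kind of record-keeping described above. It is an illustration under stated assumptions, not a description of any existing system: the names `License`, `Repository`, `record_collection`, `request_access` and `report_for` are hypothetical, and the sketch stores only metadata about what is collected and licensed, never the data itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class License:
    """Hypothetical record of the rights an owner grants to one holder."""
    holder: str
    data_kind: str
    may_resell: bool = False  # resale to third parties is off unless granted


@dataclass
class Repository:
    """Hypothetical registry of collections, licenses and access requests."""
    # (owner, data_kind) -> the collector that gathered it first
    collected: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # owner -> licenses that owner has granted
    licenses: Dict[str, List[License]] = field(default_factory=dict)
    # every request as (owner, requester, data_kind, reason, granted)
    requests: List[Tuple[str, str, str, str, bool]] = field(default_factory=list)

    def record_collection(self, owner: str, collector: str,
                          data_kind: str, may_resell: bool = False) -> bool:
        """Register a pay-with-data transaction; refuse a repeat collection."""
        key = (owner, data_kind)
        if key in self.collected:
            return False  # "collected once": this data was already gathered
        self.collected[key] = collector
        self.licenses.setdefault(owner, []).append(
            License(collector, data_kind, may_resell))
        return True

    def request_access(self, owner: str, requester: str,
                       data_kind: str, reason: str) -> bool:
        """Log who asked for what and why, and whether a license covers it."""
        granted = any(lic.holder == requester and lic.data_kind == data_kind
                      for lic in self.licenses.get(owner, []))
        self.requests.append((owner, requester, data_kind, reason, granted))
        return granted

    def report_for(self, owner: str) -> dict:
        """Everything an individual can inspect about their own data."""
        return {
            "collected": sorted(kind for (who, kind) in self.collected
                                if who == owner),
            "licenses": list(self.licenses.get(owner, [])),
            "requests": [r for r in self.requests if r[0] == owner],
        }


repo = Repository()
repo.record_collection("alice", "fitness_app_x", "heart_rate")
repo.record_collection("alice", "smart_watch_y", "heart_rate")  # refused
repo.request_access("alice", "ad_broker_z", "heart_rate", "marketing")
print(repo.report_for("alice"))
```

In this sketch the second collector is turned away because the same data point is already on record, and the unlicensed broker's request is logged but not granted, so the owner can later see that it was made. A real system would also need identity verification, pricing and dispute handling, none of which are modelled here.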
On the other end of the spectrum for a data collection regime is a system similar to the one that currently regulates professional service providers, such as medical doctors, accountants and lawyers. This system would grant certain organizations and individuals a data-gatherer license and only these organizations would be allowed to collect individual data. The role of the government would be to ensure a repository is kept that includes what data is collected by whom and what transactions and requests have occurred. This would also allow for accuracy checks and reviews between the systems.
With regard to security, it should not necessarily be the role of government to actively supply security. However, no matter what system of data collection is chosen, it is the role of government to set and ensure minimal security standards. Further, since data is property, there is an urgent need to determine both criminal and civil penalties in cases of theft, misuse and neglect.
It should be immediately obvious that for such a system to work, there will be a need to manage the transfer of data to different jurisdictions while ensuring property rights will not be infringed.
Establishing International Rules
Data, once collected, is information, and information not only travels immediately at very low cost, but also, in many cases, should be allowed to travel easily.9 Nonetheless, while there is currently a sophisticated international system of trade that regulates the movement of goods, services and capital, there is no such system with regard to data. As long as data was assumed to have no value, this oversight was somewhat understandable. This is no longer the case.
Further, if society wants to develop robust, transparent markets for data based on clearly defined property rights, there is an urgent need to define a regime that would respect different societies’ decisions on the allocation of property rights and data collection. Such a system needs to be flexible enough to ensure maximum innovation and utilization of data, while ensuring the integrity of data and property rights.
First (Regulatory) Mover Advantage
It is important to note that current thought-leadership in this area is missing. This presents a unique opportunity, since jurisdictions with a fair, principled and efficient system not only gain a significant comparative advantage with far-reaching economic consequences, but also stand a chance to influence the design of the international system. By doing so, these countries would, in effect, ensure that their norms and views on how society should look will be the building blocks of the next global innovation economy. This would also have the side benefit of creating significant advantages for their own companies and entrepreneurs, who will be well-versed in how to operate in such a system. A similar advantage is now granted to American companies with regard to the global intellectual property rights regime.
The future of economic growth is data. Countries, including Canada, that want to prosper need to develop strategic data policies. Those who do this well, and quickly enough, stand to gain enormous prosperity for their citizens. Those that do not should hope that they will not become the next (data) mining ghost towns.
1 The marginal cost of production is the change in costs associated with a unit increase in production. Similarly, the marginal cost of reproduction is the cost of duplicating one unit of data once it is obtained.
2 The technical term for a good with such properties is a non-rivalrous good.
3 In a well-functioning market, clear price signals are required to indicate the appropriate value of a product, which coordinates the supply and demand for a commodity. For this, complete information is required among a large number of buyers and sellers with homogeneous goods. Incomplete information between buyers and sellers necessitates regulation to approximate effective price signals to coordinate production and consumption.
4 R. H. Coase (1960) suggested that even with the implementation of property rights, there could be a social cost or externalities, thus leading to conflict between property owners. Because bargaining involves transaction costs, it is imperative that a third party settle the distribution through clearly demarcated property rights. As Elinor Ostrom and Charlotte Hess (2007, 4) suggested, property rights, “depend on the existence of enforcement of a set of rules that define who has the right to undertake which activities on their own initiative and how the returns from that activity will be allocated.”
5 The “collected once” principle states that every point of data can be collected only once. Accordingly, if a fitness device collected and stored a user’s vital signs throughout the day, their watch, smartphone and smart home will not be allowed to do so again (and again, and again) that day. A working example of the collected once principle is Estonia’s e-government policy. As part of their well-developed e-government program, the authorities are allowed to collect specific data about citizens only once. This data can be shared within government departments and with businesses, but only after obtaining approval from the citizens for each transaction, which reduces the intense surveillance of citizens by multiple digital platforms, websites and applications. Estonian citizens also have complete control over their data: they can see who is asking for it, question why it is needed and approve its use by a given requester (Priisalu and Ottis 2017; Liiv 2017). This policy has been advocated by the EU Commission as a part of its single data market strategy and its e-government action plan. It was also adopted in a European Council resolution in 2013 (European Commission 2016, 3; European Council 2013, 4).
6 Increased value to scale means that the more data one possesses, the higher the value of that data.
7 On the importance of full transparency, see Fung, Graham and Weil (2007).
8 For more on the Solid system, see https://solid.mit.edu/. For more on the OpenPDS system, see http://openpds.media.mit.edu/#architecture.
9 The rationale behind allowing free movement of data is that it reduces the costs for business and consumers and reduces the regulatory burden of digital platforms operating in different countries. Data localization policies could require firms to set up data centres or set up local servers, thus imposing costs on firms (Selby 2017). As an example, some content on Netflix and Amazon cannot be streamed in certain countries. This translates into a welfare loss for consumers in those countries as well as for producers in the country where the content is produced (Pop 2015). The European Union adheres to this rationale in its communication on the free movement of data across Europe, suggesting that free movement of data would help businesses adopt cloud technologies; it even goes as far as quantifying that it would benefit the EU economy by €8 billion a year through cost savings and efficiency gains (European Union 2017, 7).
Arrieta Ibarra, Imanol, Leonard Goff, Diego Jiménez Hernández, Jaron Lanier and E. Glen Weyl. 2017. “Should We Treat Data as Labor? Moving Beyond ‘Free.’” SSRN Scholarly Paper ID 3093683. https://papers.ssrn.com/abstract=3093683.
Breznitz, Dan and Vincenzo Palermo. 2018. “Privacy, Innovation and Regulation: Examining the Impact of the European ‘Cookie Law’ on Technological Trajectories.” Working Paper. https://papers.ssrn.com/abstract=3136789.
Coase, R. H. 1960. “The Problem of Social Cost.” The Journal of Law and Economics 3: 1–44.
———. 2013. “The Federal Communications Commission.” The Journal of Law and Economics 56 (4): 879–915. https://doi.org/10.1086/674871.
Duhigg, Charles. 2018. “The Case Against Google.” The New York Times, February 20. www.nytimes.com/2018/02/20/magazine/the-case-against-google.html.
European Commission. 2016. “EU eGovernment Action Plan 2016-2020.” Communication 179. Brussels: European Commission.
European Council. 2013. Conclusions of the European Council. Conclusions 169/13. October 25. Brussels: European Council. www.consilium.europa.eu/uedocs/cms_data/docs/pressdata/en/ec/139197.pdf.
European Union. 2017. “Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions ‘Building a European Data Economy.’” Brussels: European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=COM%3A2017%3A9%3AFIN.
Fung, Archon, Mary Graham and David Weil. 2007. Full Disclosure: The Perils and Promise of Transparency. Cambridge, UK: Cambridge University Press.
Heller, Nathan. 2017. “Estonia, the Digital Republic.” The New Yorker, December 18 and 25. www.newyorker.com/magazine/2017/12/18/estonia-the-digital-republic.
Khan, Lina M. 2017. “Amazon’s Antitrust Paradox.” Yale Law Journal 126 (3): 710–805.
Liiv, Innar. 2017. “Welcome to E-Estonia, the Tiny Nation That’s Leading Europe in Digital Innovation.” The Conversation, April 4. http://theconversation.com/welcome-to-e-estonia-the-tiny-nation-thats-leading-europe-in-digital-innovation-7444.
Nissenbaum, H. 2004. “Privacy as Contextual Integrity.” Washington Law Review 79: 119–54.
Orwell, George. 1949. 1984. London, UK: Penguin.
Ostrom, Elinor and Charlotte Hess. 2007. “Private and Common Property Rights.” SSRN Scholarly Paper ID 1304699. Rochester, NY: Social Science Research Network. https://papers.ssrn.com/abstract=1304699.
Pop, Valentina. 2015. “Interview: ‘You Can’t Use 18th Century Law for a Digital World.’” EU Observer. February 26. https://euobserver.com/economic/127800.
Porter, Eduardo. 2018. “Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It?” The New York Times, March 6. www.nytimes.com/2018/03/06/business/economy/user-data-pay.html.
Posner, Eric A. and E. Glen Weyl. 2017. “Property Is Only Another Name for Monopoly.” Journal of Legal Analysis 9 (1): 51–123. https://doi.org/10.1093/jla/lax001.
Priisalu, Jaan and Rain Ottis. 2017. “Personal Control of Privacy and Data: Estonian Experience.” Health and Technology 7 (4): 441–51. https://doi.org/10.1007/s12553-017-0195-1.
Radinsky, Kira. 2015. “Data Monopolists Like Google Are Threatening the Economy.” Harvard Business Review, March 2. https://hbr.org/2015/03/data-monopolists-like-google-are-threatening-the-economy.
Rotella, Perry. 2012. “Is Data The New Oil?” Forbes, April 2. www.forbes.com/sites/perryrotella/2012/04/02/is-data-the-new-oil/.
Selby, John. 2017. “Data Localization Laws: Trade Barriers or Legitimate Responses to Cybersecurity Risks, or Both?” International Journal of Law and Information Technology 25 (3): 213–32. https://doi.org/10.1093/ijlit/eax010.
World Economic Forum. 2012. “Big Data, Big Impact: New Possibilities for International Development.” January 22. www.weforum.org/reports/big-data-big-impact-new-possibilities-international-development.