Data Governance in the Digital Age

Preventing Big Data Discrimination in Canada: Addressing Design, Consent and Sovereignty Challenges

Published: April 12, 2018

Authors: Jonathan Obar and Brenda McPhail

Key Points

  • Big-data-driven automated decision making expands and exacerbates discrimination. Policy addressing the role of big data in our lives will convey to the world how we, as a country, ensure the rights of the most vulnerable among us, respect the rights of Indigenous Nations and persist as a moral leader.
  • To prevent big data discrimination, Canada’s national data strategy must address the challenge of biased data sets, algorithms and individuals; address the mismanagement of Indigenous data; and address the overarching challenge of consent failures.
  • Canada must ensure the auditability and accountability of data sets, algorithms and individuals implicated in data-driven decision making. This should include efforts to protect Canadian data and decision-making systems from foreign interference, and efforts to ensure Canadian law governs Canadian data.

Big data presents Canada with a civil rights challenge for the twenty-first century.

The data-driven systems of the future, privileging automation and artificial intelligence, may normalize decision-making processes that “intensify discrimination, and compromise our deepest national values” (Eubanks 2018, 12).

The challenge is to check our innovation preoccupation, to slow down and look around, and say, we cannot allow big data discrimination to happen. Not here.

As the data-driven economy zooms forward — and along with it the desire for Canadian innovation and relevance — we must remember, as we rush toward could we, to always ask ourselves should we?

The policy determined in the coming years about the role of big data in our lives will speak volumes to how we, as a country, ensure the rights of the most vulnerable among us, respect the rights of Indigenous Nations and persist as a moral leader in the world. We cannot ignore these most important of responsibilities as we strive for innovation at home and recognition abroad.


In a brief essay, it is impossible to capture all that is being cautioned in the growing number of academic works with titles such as Automating Inequality (Eubanks 2018), Algorithms of Oppression (Noble 2018), Weapons of Math Destruction (O’Neil 2017), Broken Windows, Broken Code (Scannell 2016) and “Privacy at the Margins” (Marwick and boyd 2018a). The goal here is to acknowledge, as was noted in an aptly named Obama-era White House report entitled Big Data: Seizing Opportunities, Preserving Values: “big data analytics have the potential to eclipse longstanding civil rights protections in how personal information is used in housing, credit, employment, health, education, and the marketplace” (The White House 2014, 3).

To this list, we can add immigration, public safety, policing and the justice system as additional contexts where algorithmic processing of big data impacts civil rights and liberties.

The following section introduces overarching big data discrimination concerns. The second section outlines five big data discrimination challenges, along with policy recommendations to aid prevention. The final section provides a counter to the “new oil” rhetoric and a call to check the innovation preoccupation in light of the civil rights challenge.

Big Data Discrimination: Overarching Concerns

Automated, data-driven decision making requires personal data collection, management, analysis, retention, disclosure and use. At each point in the process, we are all susceptible to inaccuracies, illegalities and injustices. We may all be unfairly labelled as “targets or waste” (Turow 2012, 88), and suffer consequences at the bank, our job, the border, in court, at the supermarket and anywhere that data-driven decision making determines eligibility (Pasquale 2015). The quickly changing procedures for determining and implementing labels from myriad data points and aggregations must be scrutinized, as policy struggles to keep up with industry practice (Obar and Wildman 2015). (See Figure 1 for a list of problematic data broker labels identified by the US Senate.)

Figure 1: Data Broker Consumer Profile Categories

Source: US Senate (2013).

While this threatens us all, the research is clear: vulnerable communities are disproportionately susceptible to big data discrimination (Gangadharan, Eubanks and Barocas 2014; Newman 2014; Barocas and Selbst 2016; Madden et al. 2017). One revealing account comes from a study detailed in the recent book Automating Inequality, which identifies how these systems not only discriminate, but also reinforce marginality:

What I found was stunning. Across the country, poor and working-class people are targeted by new tools of digital poverty management and face life-threatening consequences as a result. Automated eligibility systems discourage them from claiming public resources that they need to survive and thrive. Complex integrated databases collect their most personal information, with few safeguards for privacy or data security, while offering almost nothing in return. Predictive models and algorithms tag them as risky investments and problematic parents. Vast complexes of social service, law enforcement, and neighborhood surveillance make their every move visible and offer up their behavior for government, commercial, and public scrutiny.
(Eubanks 2018, 11)

People of colour; lesbian, gay, bisexual, transgender and queer communities; Indigenous communities; the disabled; the elderly; immigrants; low-income communities; children; and many other traditionally marginalized groups are threatened by data discrimination at rates differing from the privileged (Marwick and boyd 2018b). Add to this the concern that vulnerability is not a static position, but one amplified by context. As Alice E. Marwick and danah boyd (ibid., 1160) argue, “When people are ill…the way they think about and value their health data changes radically compared with when they are healthy. Women who are facing the abuse of a stalker find themselves in a fundamentally different position from those without such a threat. All too often, technology simply mirrors and magnifies these problems, increasing the pain felt by the target…. Needless to say, those who are multiply marginalized face even more intense treatment.”

In crafting a national data strategy, the government must acknowledge this quickly unfolding reality and ensure the protection of fundamental Canadian values.

Policy Recommendations: Preventing Big Data Discrimination

Addressing Discriminatory Data Sets

A common error in thinking about big data is that everything is new. The technology seems new, the possibility seems new and the data seems new. In reality, many of the historical data sets populated in health care, public safety, criminal justice and financial contexts (to name a few) are being integrated into new big data systems, and along with them, the built-in biases of years of problematic collection (see, for example, Pasquale 2015; Scannell 2016). At the same time, new data, whether flowing from millions of sensors and trackers or scraped from the data trails generated by lives lived online, may well perpetuate and amplify existing bias unless we actively guard against it.

Biased Policing, Biased Police Data

While so-called predictive policing based on big data analysis is relatively new in Canada, literature from the United States presents cautionary findings (Hunt, Saunders and Hollywood 2014; Angwin et al. 2016; Lum and Isaac 2016; Joh 2016; Scannell 2016). In particular, there is concern that biased policing techniques (for example, broken windows policing, including stop and frisk1) contribute to biased police data. The use of historical data sets in new analyses, and the maintenance of biased policing techniques to generate new data, raise considerable concerns for civil rights in general, and automated criminal justice efforts in particular.
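The feedback dynamic that Lum and Isaac (2016) caution against can be illustrated with a deliberately simple simulation. The numbers and the allocation rule below are hypothetical, not drawn from any study: the sketch only shows that when patrols are dispatched according to recorded incidents, and records accumulate only where patrols are present, an initial recording bias locks in even though the underlying crime rates are identical.

```python
def runaway_allocation(true_crime, recorded, rounds=10):
    """Toy feedback loop: each round, patrols go to the district with the
    most recorded incidents, and only patrolled districts generate new
    records. Equal true crime rates cannot correct the initial skew."""
    recorded = list(recorded)
    for _ in range(rounds):
        target = recorded.index(max(recorded))  # dispatch by past records
        recorded[target] += true_crime[target]  # incidents logged only where patrolled
    return recorded

# Two districts with identical true crime rates; district 0 starts with
# a slight recording bias (12 vs. 10 logged incidents).
print(runaway_allocation(true_crime=[10, 10], recorded=[12, 10]))
# district 0 ends with 112 recorded incidents; district 1 still has 10
```

In this toy model the disparity is entirely an artifact of the data-collection process, which is precisely why auditing the provenance of police data sets matters before they feed automated systems.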

Canada is not immune to problematic policing practice2 and biased police data. For example, a Toronto Star analysis of 10 years’ worth of data regarding arrests and charges for marijuana possession, acquired from the Toronto Police Service, revealed black people with no criminal history were three times more likely to be arrested than white people with similar histories (Rankin, Contenta and Bailey 2017). Not coincidentally, this is similar to the rate at which black people are subject to police stops, or “carding” (ibid.).


These findings were reinforced by statements from former Toronto Police Chief Bill Blair, who said, “I think there’s a recognition that the current enforcement disproportionately impacts poor neighbourhoods and racialized communities” (quoted in Proudfoot 2016). He later added that “the disparity and the disproportionality of the enforcement of these laws and the impact it has on minority communities, Aboriginal communities and those in our most vulnerable neighbourhoods” is “[o]ne of the great injustices in this country” (quoted in Solomon 2017).

To prevent big data discrimination, Canada’s national data strategy must acknowledge the challenge of biased data sets. Addressing this challenge might involve a combination of strategies for eliminating biases in historical and new data sets, being critical of data sets from entities not governed by Canadian law (see Andrew Clement’s contribution to this essay series) and developing policy that promotes lawful decision-making practices (i.e., data use) mandating accountability for entities creating and using data sets for decision making.

Addressing Discrimination by Design

Algorithms that analyze data and automate decision making are also a problem. Data scientist Cathy O’Neil now famously referred to biased algorithms wreaking havoc on “targets or waste” (Turow 2012, 88) as “weapons of math destruction” (O’Neil 2017). The scholarship is clear: writing unbiased algorithms is difficult and, often by design or error, programmers build in misinformation, racism, prejudice and bias, which “tend to punish the poor and the oppressed in our society, while making the rich richer” (ibid., 3). In addition, many of these algorithms are proprietary, so there are challenges in looking under the hood and toward ensuring public accountability (Pasquale 2015; Kroll et al. 2017).
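One concrete form such auditing can take is a disparate-impact check on a system’s outputs. The sketch below is a minimal, hypothetical example (the decisions and group names are invented); it applies the US EEOC’s “four-fifths” guideline, under which a group whose favourable-outcome rate falls below 80 percent of the reference group’s rate is flagged for potential adverse impact.

```python
def selection_rates(outcomes):
    """outcomes: dict mapping group -> list of 0/1 decisions (1 = favourable)."""
    return {g: sum(v) / len(v) for g, v in outcomes.items()}

def disparate_impact_ratio(outcomes, reference_group):
    """Ratio of each group's favourable-outcome rate to the reference
    group's rate; values below 0.8 flag potential adverse impact under
    the US EEOC four-fifths guideline."""
    rates = selection_rates(outcomes)
    ref = rates[reference_group]
    return {g: r / ref for g, r in rates.items()}

# Hypothetical loan-approval decisions from an automated system.
decisions = {
    "group_a": [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],  # 80% approved
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],  # 40% approved
}
ratios = disparate_impact_ratio(decisions, reference_group="group_a")
# group_b's ratio is 0.5, well below the 0.8 threshold
```

An audit of outputs like this is necessarily partial — it cannot explain why an opaque, proprietary algorithm discriminates — but it can establish that it does, which is the first step toward accountability.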

To prevent big data discrimination, Canada’s national data strategy must acknowledge the challenge of biased algorithms. Addressing this challenge might involve a combination of strategies for auditing and eliminating biases in algorithms, being critical of algorithms from entities not governed by Canadian law and developing policy that promotes lawful decision-making practices (i.e., data use) mandating accountability for entities creating and using algorithms for decision making.

While a Canadian vision is necessary, international scholarship and policy initiatives may inform its development. American attempts to determine “algorithmic ethics” (see Sandvig et al. 2016) and to address algorithmic transparency and accountability (see Pasquale 2015; Kroll et al. 2017) may be of assistance. In particular, New York City’s legislative experiment, an “algorithmic accountability bill” (Kirchner 2017), might inform the development of oversight mechanisms.

Addressing People Who Want to Discriminate

Even if the data sets and the algorithms were without bias, some individuals might still want to discriminate. This old concern is also amplified at a time when digital tools aid and abet those circumventing anti-discrimination law in areas such as job recruitment (Acquisti and Fong 2015). This means that methods for protecting against big data discrimination must not just monitor the technology, but also the people interpreting the outputs.

To prevent big data discrimination, Canada’s national data strategy must acknowledge the challenge of biased individuals. Addressing this challenge might involve a combination of strategies for auditing and enforcing lawful data use, implementing what the Obama White House referred to as the “no surprises rule” (The White House 2014, 56) suggesting that data collected in one context should not be reused or manipulated for another, and being critical of decisions from entities not governed by Canadian law.
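The “no surprises rule” is, at bottom, a purpose-limitation check. A minimal sketch of how such a check might be enforced in software follows; the types and field names are illustrative assumptions, not an existing API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    value: str
    collected_for: frozenset  # purposes disclosed at collection time

def use_allowed(record, purpose):
    """Permit a use only if it matches a purpose disclosed when the
    data was collected (the 'no surprises' rule)."""
    return purpose in record.collected_for

email = Record("jane@example.com", collected_for=frozenset({"billing"}))
print(use_allowed(email, "billing"))    # True: disclosed at collection
print(use_allowed(email, "marketing"))  # False: a surprise re-use
```

The point of the sketch is that purpose limitation is technically trivial to enforce once purposes are recorded alongside the data; the hard part is the policy mandate requiring entities to record and respect them.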

Addressing the Oppression of Indigenous Nations

The British Columbia First Nations Data Governance Initiative (2017) released a report in April 2017 identifying five concerns suggesting historical mismanagement of Indigenous data. These concerns are quoted below, updated with edits provided by Gwen Phillips, citizen of the Ktunaxa Nation and champion of the BC First Nations Data Governance Initiative:

It is equally important to recognize that nation states have traditionally handled and managed Indigenous data in the following ways: 1. Methods and approaches used to gather, analyze and share data on Indigenous communities has reinforced systemic oppression, barriers and unequal power relations; 2. Data on Indigenous communities has typically been collected and interpreted through a lens of inherent lack, with a focus on statistics that reflect disadvantage and negative stereotyping; 3. Data on Indigenous communities collected by nation state institutions has been of little use to Indigenous communities, further distancing Nations from the information; 4. Data on Indigenous communities collected by the nation state government has been assumed to be owned and therefore controlled by said government; 5. With a lack of a meaningful Nation-to-Nation dialogue about data sovereignty.
(Ibid., 3)

The report also emphasizes the following recommendation: “The time for Canada to support the creation of Indigenous-led, Indigenous Data Sovereignty charter(s) is now. The Government of Canada’s dual stated commitment to the reconciliation process and becoming a global leader on open government presents a timely opportunity. This opportunity should be rooted in a Nation-to-Nation dialogue, with Indigenous Nations setting the terms of the ownership and stewardship of their data as it best reflects the aspirations and needs of their peoples and communities” (ibid.).

To prevent big data discrimination, Canada’s national data strategy must respect the rights of Indigenous Nations. How we address this issue should be viewed as central to our national ethics and moral leadership in the world. These concerns ought to be addressed in conjunction with, but also independently of, all other Canadian approaches to preventing big data discrimination.

Addressing Consent Failures

Consent is the “cornerstone” of Canadian privacy law (Office of the Privacy Commissioner [OPC] 2016). Consent is generally required for lawful data collection and use and, in theory, stands as a fundamental protection against misuse. There are two problems with the consent model as it relates to big data discrimination. First, current mechanisms for engaging individuals in privacy and reputation protections produce considerable consent failures. This is captured well by the “biggest lie on the internet” anecdote — “I agree to the terms and conditions” (Obar and Oeldorf-Hirsch 2016). Scholarship suggests that people do not read, understand or even engage with consent materials. The growing list of reasons includes: the length, number and complexity of policies (McDonald and Cranor 2008; Reidenberg et al. 2015), user resignation (Turow, Hennessy and Draper 2015), the tangential nature of privacy deliberation to user demands (Obar and Oeldorf-Hirsch 2016), and even the political economic motivations of service providers, manifested often via the clickwrap3 (Obar and Oeldorf-Hirsch 2017; 2018). Second, entities may not know how they want to use data at the time of collection, leading to vague consent provisions for retention, disclosure, research and aggregation (Lawson, McPhail and Lawton 2015; Clement and Obar 2016; Parsons 2017; Obar and Clement 2018). In this context, it is impossible for individuals to anticipate the ways their information might be used and re-used, never mind be aware of the potential for big data discrimination. In sum, when it comes to delivering privacy and reputation protections, while consent remains a strong place to start, it is also a terrible place to finish. On their own, consent mechanisms leave users incapable of challenging the complex threats expanded and exacerbated by big data (see Nissenbaum 2011; Solove 2012; Obar 2015; OPC 2016; 2017).


To prevent big data discrimination, Canada’s national data strategy must acknowledge the challenge of consent failures. Addressing this challenge might involve a combination of strategies for supporting new consent models and procedures at home and abroad, strengthening purpose specification requirements for data use, and ensuring lawful data use in all consequential data-driven decision-making processes, including eligibility determinations.

Canadian leadership should draw from extensive OPC consent consultations (OPC 2017), as well as from international efforts. In particular, in May 2018, the European Union will begin enforcing enhanced consent requirements through its General Data Protection Regulation.4 The requirements and outcomes should be evaluated as Canada develops its national data strategy.

Big Data Is Not the “New Oil”

So, no, big data is not the “new oil” for the Canadian economy. The beings whose bodies made the oil we burn died millions of years ago. You and I, and all persons in Canada, are not fuel — we are living human beings. Canada’s Charter of Rights and Freedoms grants us all “the right to the equal protection and equal benefit of the law without discrimination and, in particular, without discrimination based on race, national or ethnic origin, colour, religion, sex, age or mental or physical disability.”5

We must ensure that those who wield big data respect these rights. We must check our innovation preoccupation. Let us pursue policy with our eyes open, always with the goal of persisting as a moral leader in this world. We must protect the rights of the vulnerable. We must respect the rights of Indigenous Nations. We must prevent big data discrimination. That is the civil rights challenge of the twenty-first century.


1 Broken windows policing refers to the controversial policing technique whereby officers engage in aggressive enforcement efforts for smaller alleged crimes, with the goal, supposedly, of deterring larger crimes (Harcourt 2009). Stop and frisk is one example of broken windows policing, where officers “temporarily detain someone they suspect of a crime, and…‘pat down’ suspects they think might be armed” (Butler 2014, 57). Furthermore, “because the ‘reasonable suspicion’ standard that authorizes stops and frisks is lenient, the police have wide discretion in who they detain and frisk.…[and these] are probably the most common negative interactions that citizens have with the police (many more people get detained than arrested). For example, in New York City, in 2012, the police conducted 532,911 stops and frisks” (ibid.). Assertions that both techniques are discriminatory toward minority populations, in the United States, in particular, have contributed to the suggestion that, for example, “stop and frisk is, in the United States, a central site of inequality, discrimination, and abuse of power” (ibid., 57).
2 The Black Experience Project (BEP) suggests about black participants from the Greater Toronto Area: “(they) are more likely to be stopped in public than to be helped by the police, and younger Black males are particularly likely to experience police harassment. Not surprisingly, BEP participants almost unanimously condemn the way in which Black people are treated by the local police” (The Environics Institute 2017, 4).
3 The clickwrap is “a digital prompt that enables the user to provide or withhold their consent to a policy or set of policies by clicking a button, checking a box, or completing some other digitally-mediated action suggesting 'I agree' or 'I don’t agree'" (Obar and Oeldorf-Hirsch 2018). There are concerns that clickwraps support circumvention of consent materials such as privacy and terms of service policies of social media services (ibid.).
4 See
5 See

Works Cited

Acquisti, Alessandro and Christina M. Fong. 2015. “An experiment in hiring discrimination via online social networks.”

Angwin, Julia, Jeff Larson, Surya Mattu and Lauren Kirchner. 2016. “Machine Bias.” ProPublica, May 23.

Barocas, Solon and Andrew D. Selbst. 2016. “Big data’s disparate impact.” California Law Review 104: 671–732.

British Columbia First Nations Data Governance Initiative. 2017. “Decolonizing Data: Indigenous Data Sovereignty Primer.” April.

Butler, Paul. 2014. “Stop and frisk and torture-lite: police terror of minority communities.” Ohio State Journal of Criminal Law 12: 57–69.

Clement, Andrew and Jonathan A. Obar. 2016. “Keeping internet users in the know or in the dark: An analysis of the data privacy transparency of Canadian internet carriers.” Journal of Information Policy 6 (1): 294–331.

Eubanks, Virginia. 2018. Automating Inequality: How High-Tech Tools Profile, Police and Punish the Poor. New York, NY: St. Martin’s Press.

Gangadharan, Seeta P., Virginia Eubanks and Solon Barocas, eds. 2014. Data and Discrimination: Collected Essays.

Harcourt, Bernard E. 2009. Illusion of Order: The False Promise of Broken Windows Policing. Cambridge, MA: Harvard University Press.

Hunt, Priscillia, Jessica Saunders and John S. Hollywood. 2014. “Evaluation of the Shreveport predictive policing experiment.” Rand Research Reports.

Joh, Elizabeth E. 2016. “The new surveillance discretion: Automated suspicion, big data, and policing.” Harvard Law and Policy Review 10 (1): 15–42.

Kirchner, Lauren. 2017. “New York City moves to create accountability for algorithms.” Ars Technica, December 19.

Kroll, Joshua A., Joanna Huey, Solon Barocas, Edward W. Felten, Joel R. Reidenberg, David G. Robinson and Harlan Yu. 2017. “Accountable algorithms.” University of Pennsylvania Law Review 165: 633–705.

Lawson, Philippa, Brenda McPhail and Eric Lawton. 2015. The Connected Car: Who is in the Driver’s Seat? BC Freedom of Information and Privacy Association.

Lum, Kristian and William Isaac. 2016. “To predict and serve?” Significance 13 (5): 14–19.

McDonald, Aleecia M. and Lorrie Faith Cranor. 2008. “The Cost of Reading Privacy Policies.” I/S: A Journal of Law and Policy for the Information Society 4: 540–65.

Madden, Mary, Michele Gilman, Karen Levy and Alice Marwick. 2017. “Privacy, poverty, and Big Data: A matrix of vulnerabilities for poor Americans.” Washington University Law Review 95: 53–125.

Marwick, Alice E. and danah boyd, eds. 2018a. “Privacy at the Margins.” Special issue, International Journal of Communication 12.

———. 2018b. “Understanding privacy at the margins: Introduction.” In “Privacy at the Margins,” Alice E. Marwick and danah boyd, eds. Special issue, International Journal of Communication 12: 1157–65.

Newman, Nathan. 2014. “How Big Data Enables Economic Harm to Consumers, Especially to Low-Income and Other Vulnerable Sectors of the Population.”

Nissenbaum, Helen. 2011. “A contextual approach to privacy online.” Daedalus 140 (4): 32–48.

Noble, Safiya U. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York, NY: New York University Press.

Obar, Jonathan A. 2015. “Big Data and The Phantom Public: Walter Lippmann and the fallacy of data privacy self-management.” Big Data & Society 2 (2): 1–16.

Obar, Jonathan A. and Andrew Clement. 2018. “Keeping internet users in the know or in the dark: Data privacy transparency of Canadian internet service providers.”

Obar, Jonathan A. and Anne Oeldorf-Hirsch. 2016. “The biggest lie on the internet: Ignoring the privacy policies and terms of service policies of social networking services.”

———. 2017. “Clickwrap impact: Quick-join options and ignoring privacy and terms of service policies of social networking services.” In Proceedings of the 8th International Conference on Social Media & Society. July. Association for Computing Machinery.

———. 2018. “The clickwrap: A political economic tool for manufacturing consent on social media.”

Obar, Jonathan A. and Steven S. Wildman. 2015. “Social media definition and the governance challenge: An introduction to the special issue.” Telecommunications Policy 39 (9): 745–50.

OPC. 2016. “Consent and privacy: A discussion paper exploring potential enhancements to consent under the Personal Information Protection and Electronic Documents Act.” May.

———. 2017. “2016-17 Annual Report to Parliament on the Personal Information Protection and Electronic Documents Act and the Privacy Act.” September.

O’Neil, Cathy. 2017. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY: Broadway Books.

Parsons, Christopher. 2017. “The (in)effectiveness of voluntarily produced transparency reports.” Business & Society.

Pasquale, Frank. 2015. The Black Box Society: The Secret Algorithms that Control Money and Information. Cambridge, MA: Harvard University Press.

Proudfoot, Shannon. 2016. “Bill Blair: The former top cop in charge of Canada’s pot file.” Maclean’s, September 29.

Rankin, Jim, Sandro Contenta and Andrew Bailey. 2017. “Toronto marijuana arrests reveal ‘startling’ racial divide.” Toronto Star, July 6.

Reidenberg, Joel R., Travis Breaux, Lorrie Faith Cranor, Brian French, Amanda Grannis, James T. Graves, Fei Liu, Aleecia McDonald, Thomas B. Norton, Rohan Ramanath, N. Cameron Russell, Norman Sadeh and Florian Schaub. 2015. “Disagreeable privacy policies: Mismatches between meaning and users’ understanding.” Berkeley Technology Law Journal 30 (1): 39–68.

Sandvig, Christian, Kevin Hamilton, Karrie Karahalios and Cedric Langbort. 2016. “When the algorithm itself is a racist: Diagnosing ethical harm in the basic components of software.” International Journal of Communication 10: 4972–90.

Scannell, R. Joshua. 2016. “Broken windows, broken code.” August 29.

Solomon, Evan. 2017. “A bad trip: Legalizing pot is about race.” Maclean’s, April 14.

Solove, Daniel J. 2012. “Introduction: Privacy self-management and the consent dilemma.” Harvard Law Review 126: 1880–1903.

The Environics Institute. 2017. “The Black Experience Project in the GTA: Executive Summary.”

The White House. 2014. Big Data: Seizing Opportunities, Preserving Values.

Turow, Joseph. 2012. The Daily You: How the New Advertising Industry Is Defining Your Identity and Your Worth. New Haven, CT: Yale University Press.

Turow, Joseph, Michael Hennessy and Nora Draper. 2015. “The trade off fallacy: How marketers are misrepresenting American consumers and opening them up to exploitation.”

US Senate. 2013. “A review of the data broker industry: Collection, use and sale of consumer data for marketing purposes.” Committee on Commerce, Science and Transportation. Office of Oversight and Investigations Majority Staff. December 18.

The opinions expressed in this article/multimedia are those of the author(s) and do not necessarily reflect the views of CIGI or its Board of Directors.
