How Clearview AI Could Violate Copyright Law

Clearview AI scraped countless copyright-protected images from social media sites to develop a commercial facial recognition technology. Could copyright law be used to dismantle its service?

March 10, 2020

The recent controversy over the use of Clearview AI’s facial recognition technology by police services in Canada has largely focused on the significant privacy issues it raises. The federal privacy commissioner, Daniel Therrien, and his provincial counterparts have launched a joint investigation into Clearview AI’s data-related practices. Commissioner Therrien is also investigating the RCMP’s use of Clearview AI under the Privacy Act.

There is no doubt that the privacy issues are important. However, they are not the only legal issues raised by Clearview AI’s technology; copyright issues should be considered too.

Clearview AI provides facial recognition services, largely to law enforcement. It has a massive database of images that are scraped from social media sites, as well as from other “publicly accessible” webpages on the internet. It uses proprietary algorithms to search this database of images for matches, then communicates a match to its clients, along with hyperlinks to the internet-based sources of any matched images.

The person who owns the copyright for a particular photograph, including one that may get scraped into a massive image database, has certain rights to control the reproduction and distribution of the work. For photographs posted to online social media sites, the question is what protection these copyrights actually provide. A related question is how digital-era laws may fail individuals when exceptions in those laws are successfully leveraged in the design of troubling business models.

Data scraping involves the use of bots to harvest data in bulk from the internet. Most commercial websites will have terms of service that prohibit data scraping, but these contractual provisions have proven somewhat challenging to enforce. Copyright provides a stronger basis for enforcement, but ownership of copyright in these images is diffuse. In the early days of social media, platform companies toyed with the idea of having users assign copyright in the content they posted to the platform. Unsurprisingly, this was not popular. The standard approach today is to leave copyright with the user. The platform obtains a licence that meets its own commercial needs. Interestingly, if platforms did hold copyright in all of their publicly accessible content, they would be in a much better position to resist the kind of wholesale data scraping engaged in by companies like Clearview AI.

Generally, for a photograph, the owner of copyright is the person who took the photograph. Ownership issues are straightforward, then, in cases where people have taken the photographs that they upload to social media. At first glance, these individuals have their copyrights infringed when their photographs are scraped and saved in Clearview AI’s database. Case law in Canada has found that the scraping of copyright-protected photographs from the internet for commercial purposes is infringement (see, for example, Century 21 v. Rogers Communications and Trader v. CarGurus). Absent any other defence, Clearview might argue that the database of images constitutes fair use; however, fair use is a contextual inquiry, and much about this context might militate against such a finding. The accompanying likely breach of privacy rights (at least in Canada and the European Union) might even be a relevant contextual factor.

Clearview AI, however, relies upon an exception in the US Digital Millennium Copyright Act (DMCA) of 1998. The DMCA amended the US Copyright Act. Among other changes, the DMCA added a new section 512 to the law, which included an exception to protect online “service providers.” Such companies facilitate the communication of information over the internet or host user-provided content. The law was amended to exempt them from liability “by reason of the provider referring or linking users to an online location containing infringing material or infringing activity, by using information location tools, including a directory, index, reference, pointer, or hypertext link.” This essentially describes the function of a search engine. Clearview AI undoubtedly believes that it qualifies as a service provider offering an information location tool. It directs those with copyright concerns to follow the notice and takedown protocol prescribed by the legislation. An operator of an information location tool who receives such notice can remove the offending content, and it is not liable for damages unless it fails to do so. A similar exception exists in Canadian copyright law (see section 41.25 of the Copyright Act).

On the face of it, Clearview AI seems to be far more than a simple search engine: it has created its own database out of content scraped from the internet, and it offers facial recognition services to its clients. Nevertheless, its activities, at least superficially, are analogous to those of a “service provider.” Whether it can ultimately rely on the service provider exception has important implications for the legitimacy of its business model. If the images within the database infringe copyright and the service is not an information location tool, liability for statutory damages could be steep. There is also potential for class action litigation, although proof of copyright ownership could be a complicating factor. In any event, if the exception is not available, copyright law has the potential to derail the use of images that people share on social media to build massive facial recognition databases for commercial exploitation. It is not privacy law, but it could have a similar impact.

Clearview AI has found a strategy to make use of vast quantities of copyright-protected images available on the internet to create a commercial artificial intelligence (AI) facial recognition service. In doing so, it relies upon provisions that were added to copyright law for two purposes: to make it possible for content providers to operate without fear of liability for the kinds of infringements over which they had very little control (for example, when users post infringing content to a company’s platform without its awareness), and to facilitate the search for content over the internet. Clearview AI has leveraged the law to place the onus on individual copyright holders to take action while shielding itself from liability.

If, in fact, Clearview AI is wrong about the application of copyright law in this case, then copyright law could be a useful tool to dismantle its service, and to prevent others from similarly exploiting user-contributed content online. If it is not wrong, and can squeeze itself into the statutory exception, then this technology is likely just a harbinger of more like it to come. The situation exposes the risks and challenges of legislating in a rapidly changing digital environment. As technology evolves, provisions that seemed important or essential in one context become loopholes to be exploited in another, very different context. Perhaps robust privacy protection is the answer to this conundrum, but that too remains a work in progress.


About the Author

Teresa Scassa is a CIGI senior fellow. She is also the Canada Research Chair in Information Law and Policy and a full professor at the University of Ottawa’s Law Faculty, where her groundbreaking research explores issues of data ownership and control.