How Transparency Reporting Could Incentivize Irresponsible Content Moderation

December 10, 2019
Facebook Chief Executive Officer Mark Zuckerberg turns back and smiles after arriving for a hearing before the House Financial Services Committee on Capitol Hill on October 23, 2019. (AP Photo/Susan Walsh)

In the past few years, social media companies have been generating headlines with a new type of report released alongside their quarterly earnings — one about transparency. Facebook’s latest transparency reports, released in mid-November, even garnered headlines in prominent papers such as The Washington Post.

Transparency reports are common for many industries. Google’s transparency report page lists other companies, universities or foundations (such as Wikimedia) that produce these documents. Transparency reporting is a worthy initiative and an important step in changing the relationship between companies, governments and civil society. But we should be aware that transparency reporting’s reliance on statistics can create problematic incentives and unforeseen consequences.  

The reports are full of graphs and statistics on different types of horrifying content — including child sexual exploitation material, hate speech, bullying and harassment. In the third quarter of 2019, Facebook dealt with tens of millions of posts violating different content policies, including 11.6 million posts with child nudity or child sexual exploitation material. Facebook noted that this was a significant increase from the previous quarter, when it had addressed 6.9 million items in this category. The report emphasized that the figures don’t reflect an increase in policy-violating material; rather, they reflect Facebook’s improved detection. The report also estimated the prevalence of this material, claiming that it was at most 0.04 percent in the third quarter: “Out of every 10,000 views on Facebook or Instagram in Q3 2019, we estimate that no more than four of those views contained content that violated that policy.”
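To make the arithmetic behind a prevalence figure like that concrete, here is a minimal sketch of how such an estimate could be produced by auditing a random sample of views. The function and the sample numbers are hypothetical illustrations, not Facebook’s actual methodology.

```python
import math

def estimate_prevalence(violating_views_in_sample: int, sample_size: int,
                        z: float = 1.96) -> tuple[float, float]:
    """Estimate the share of sampled views that showed policy-violating content,
    with a simple normal-approximation upper bound on that share."""
    p_hat = violating_views_in_sample / sample_size
    margin = z * math.sqrt(p_hat * (1 - p_hat) / sample_size)
    return p_hat, p_hat + margin

# Hypothetical audit: 30 violating views found in a random sample of 100,000 views.
p_hat, upper = estimate_prevalence(30, 100_000)
print(f"Point estimate: {p_hat:.4%}, upper bound: {upper:.4%}")
# An upper bound of roughly 0.04% is what a claim such as
# "no more than four views out of every 10,000" corresponds to.
```

The point is simply that a prevalence number is a statistical estimate built on sampling choices the reader never sees, which is part of why the metrics themselves deserve scrutiny.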

No one wants platforms to enable horrifying material involving sexual exploitation of children. But we should also be aware that statistics can create flawed incentives. At present, the platforms determine the metrics they report on. They also change these metrics for each report. In this developing field, it is understandable that we are still figuring out how to measure what matters, but, in the meantime, companies are grading their own homework.

Metrics define incentives. Any statistical metric can create false binary choices and offer overly simplistic solutions to complex problems, as we have seen in other arenas. For example, in the late 1990s, when Tony Blair was prime minister of the United Kingdom, the Labour government introduced a cap on class sizes, believing that this would improve pupils’ performance. This focus often meant pouring money into pushing classes of 31 pupils down to the cap of 30. Yet the singular focus on this statistic was too simplistic. In 2015, the head of the Organisation for Economic Co-operation and Development’s Programme for International Student Assessment (PISA) called it a myth that smaller classes mean better performance. Effective teachers matter more. And measuring better teaching is generally a qualitative exercise, rather than a statistical one.

For social media companies, the statistics at present focus on content removal. On the most basic level, if platforms get praised for removing more content, they may be incentivized to over-delete in other, less clear-cut categories. The reports do not give examples of removed content, for obvious reasons. But that makes it hard to judge the accuracy of removals.

The statistics may also incentivize companies to use more artificial intelligence (AI) systems to detect violations. Facebook, for example, highlights the percentage of content found through AI. In November, the platform shared that, because of AI systems, it finds 80 percent of hate speech content proactively, up from 68 percent in the previous report. This may sound encouraging, but we should be aware that relying on AI can exacerbate existing problems. We already have some evidence that these types of AI content tools disproportionately affect marginalized communities, including people of colour and women. One recent study found that Google’s AI tool to detect toxic comments frequently classified comments in African-American English as toxic. As a result, AI detection tools may shut out some of the very communities who have found social media to be a powerful place to organize. In the worst-case scenario, a cycle emerges: develop systems to categorize and detect content that violates policies; report on the amount of that policy-violating content removed; empower systems to remove more of the content and generate “improved” statistics for the next report; disproportionately remove content from marginalized groups; and repeat.
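To illustrate the kind of disparity that research describes, here is a minimal sketch of how an auditor might compare how often a classifier wrongly flags non-toxic posts across different groups of users. The data, labels and function are hypothetical; this is not drawn from the cited study or from any platform’s actual tooling.

```python
from collections import defaultdict

def false_positive_rate_by_group(posts):
    """posts: iterable of (group, flagged_as_toxic, actually_toxic) tuples.
    Returns, per group, the share of non-toxic posts that were wrongly flagged."""
    wrongly_flagged = defaultdict(int)
    benign = defaultdict(int)
    for group, flagged, toxic in posts:
        if not toxic:
            benign[group] += 1
            if flagged:
                wrongly_flagged[group] += 1
    return {g: wrongly_flagged[g] / benign[g] for g in benign if benign[g]}

# Hypothetical audit sample of moderation decisions on non-toxic posts.
sample = [
    ("dialect_a", True, False), ("dialect_a", False, False), ("dialect_a", True, False),
    ("dialect_b", False, False), ("dialect_b", False, False), ("dialect_b", True, False),
]
print(false_positive_rate_by_group(sample))
# A markedly higher rate for one group is the kind of disparity the research describes.
```

A headline statistic such as “80 percent found proactively” says nothing about how those errors are distributed, which is exactly the information a disparity audit would surface.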

Earlier this year, I analyzed the transparency reports for NetzDG, Germany’s Network Enforcement Act, which requires social media companies to remove flagged posts that violate German speech laws within 24 hours. The NetzDG requires companies that receive over 100 complaints to publish regular transparency reports. The reports detail numbers such as how many posts were flagged and how many were removed. These were the first transparency reports on content removal required by law anywhere in the world.

The NetzDG reports have become an important test case for such documents. First, they remind us that we should be wary of trusting companies’ self-reported numbers. The only fine issued under NetzDG was to Facebook in July 2019 for under-reporting content violations in its transparency reports. Because other transparency reports are voluntary, there are no clear independent audit mechanisms.

Second, we might think beyond statistics to other forms of transparency, such as transparency of process. For me, the most useful parts of the NetzDG reports were not the numbers, but the descriptions of how companies approached adjudication and removal. The numbers told me how many posts were flagged and how many were removed. But with relatively little context, there was little I could do with those numbers. For example, I did not know how many pieces of content Germans had posted during that period, making it very hard to assess whether the percentage of content flagged was increasing or decreasing. The short descriptions of processes mattered far more. I learned how many content moderators each company used, and I learned that companies generally removed the material for breaching their terms of service rather than for contravening German speech law. This meant that users would have to appeal to the companies for content to be restored, rather than to a German court.
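To see why the missing denominator matters, consider a minimal sketch with hypothetical numbers: the same count of flagged posts implies very different flag rates depending on how much content was posted overall, and the reports give only the numerator.

```python
def flag_rate(flagged_posts: int, total_posts: int) -> float:
    """Share of all posts in a reporting period that were flagged."""
    return flagged_posts / total_posts

# Hypothetical: the same 250,000 flags against two different (unreported) baselines.
print(f"{flag_rate(250_000, 500_000_000):.4%}")    # 0.0500% if 500 million posts were made
print(f"{flag_rate(250_000, 2_000_000_000):.4%}")  # 0.0125% if 2 billion posts were made
```

Without the denominator, a rise or fall in flagged posts cannot be read as a rise or fall in the underlying problem.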

The German law required transparency by numbers, but it would be more useful to include transparency of process. Process transparency could include, for starters, the number of content moderators employed by social media companies and the number of staff employed by third-party contractors. At the International Grand Committee hearing held in Dublin in early November, the head of global policy management for Facebook, Monika Bickert, could not say what percentage of content moderators were employed directly by Facebook and what percentage worked for contractors.

To continue with this one relatively small example: we did not even know the working conditions for content moderators before investigations by journalists such as Casey Newton and scholars such as Sarah T. Roberts. These investigations have revealed that content moderators often suffer severe psychological consequences from looking at difficult material all day. A few months after Newton published two investigative pieces on one third-party contractor moderating content for Facebook, that company did not renew its contracts with Facebook and will stop working in content moderation by the end of 2019. This raises broader questions of whether there should be greater transparency into the process of hiring content moderation firms, their labour practices and the support offered to workers. Even more generally, should we know more about whether content moderators are trained to understand local contexts? This is particularly crucial for categories like hate speech, where a phrase or image that is seemingly innocuous in one country can be hate speech in another.

Proposals to make transparency a pillar of regulation are in the works. One French proposal recommends creating a regulatory body that could mandate and enforce transparency and accountability from the largest social media companies. This regulator could also establish reasonable access for third-party researchers that maintained user privacy.

Transparency matters, but it also matters that we get the right kinds of transparency. Improved transparency reporting and regulation would not solve the many problems posed by digital platforms. It would, however, enable policy makers to write evidence-based policy as another step toward improved platform governance.

The opinions expressed in this article/multimedia are those of the author(s) and do not necessarily reflect the views of CIGI or its Board of Directors.

About the Author

Heidi Tworek is a CIGI senior fellow and an expert on platform governance, the history of media technologies, and health communications. She is a Canada Research Chair, associate professor of history and public policy, and director of the Centre for the Study of Democratic Institutions at the University of British Columbia, Vancouver campus.