AI talk: How to bell the cat?

Aug. 11, 2023 / By V. “Juggy” Jagannathan, PhD

In this blog, I explore the state of affairs in labeling content generated by artificial intelligence (AI). This is an incredibly hard problem to solve.


Labeling AI-generated content

Fundamentally, we want to know how a specific piece of content was created, whether it is audio, video, image, text or a combination of these modalities. Generative AI technology has advanced so rapidly that some are wondering whether machines now have consciousness. They don't. The tech can be used to create fake audio that eerily resembles real people. Even video can now be made to look real. Art is no longer the exclusive domain of artists. And chat capabilities have passed the long-standing Turing test. Not knowing whether something is real or fake can undermine trust in society. This article from Pew Research, which gathers multiple opinions on current technology trends, underscores, or rather bemoans, how trust is being eroded. When one cannot distinguish between truth and fiction, what will happen to our democratic institutions? A Wired article describes a fake news story featuring an image of an explosion at the Pentagon that caused a real dip in the stock market, highlighting the real threats this kind of misinformation can pose.

Another major reason to label content is the evolving theory of intellectual property protections. Some leading legal scholars are positing that AI-generated content, such as the output of DALL-E or Stable Diffusion, should not be given copyright protection: such protections are meant for humans, not machines. And what kind of patent protection should be given to AI-generated drugs? Should they receive the same protection as drugs discovered by humans? These are emerging issues without clear answers. What is clear, however, is the need to know how content or solutions were created.

Of course, content created with the help of AI can be true or false. The same is true of human-generated content! Labeling is simply the first step in establishing the provenance of information. The Wired article mentioned above goes through a litany of the issues involved. The problem is also not binary: content is rarely purely AI-generated or purely human. Frequently it is a mix, part human and part AI, particularly when multiple modalities are combined. Can we label which parts are AI-generated and which are not? Or should we just seek to identify whether any AI-generated content is part of the artifact?

Approaches to labeling content

Watermarking is one approach to tagging content. It has been around for a while, and many organizations use it to protect audio, video and image content. Text generated by large language models (LLMs) is a different story. Techniques have been developed to watermark such text, and they rely on the mathematical properties of how LLMs generate text. It turns out they can be easily defeated: simply take the AI-generated text and have another AI system paraphrase it.
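To make the idea concrete, here is a minimal, illustrative sketch of one published family of techniques, the "green list" watermark: at each step, a hash of the preceding token splits the vocabulary into "green" and "red" halves, and sampling is biased toward green tokens. A detector that knows the hashing scheme counts green tokens and computes a z-score. Everything below (the toy vocabulary, the stand-in sampler, the parameter values) is invented for illustration; a real implementation biases the actual logits of an LLM.

import hashlib
import random

# Toy vocabulary and parameters for this sketch; a real LLM has a
# vocabulary of tens of thousands of tokens.
VOCAB = [f"tok{i}" for i in range(1000)]
GREEN_FRACTION = 0.5   # fraction of the vocabulary marked "green" each step
Z_THRESHOLD = 4.0      # detection threshold on the z-score

def green_list(prev_token: str) -> set:
    """Derive the 'green' half of the vocabulary from a hash of the
    previous token, so the detector can reproduce the same split."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(len(VOCAB) * GREEN_FRACTION)))

def watermarked_sample(prev_token: str, rng: random.Random) -> str:
    """Stand-in for the LLM's sampling step. A real implementation adds
    a bias to the logits of green tokens; here we mimic that by choosing
    a green token 90 percent of the time."""
    greens = green_list(prev_token)
    pool = list(greens) if rng.random() < 0.9 else VOCAB
    return rng.choice(pool)

def z_score(tokens: list) -> float:
    """How far the observed green-token count sits above what chance
    (GREEN_FRACTION) would predict, in standard deviations."""
    n = len(tokens) - 1
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    expected = n * GREEN_FRACTION
    std = (n * GREEN_FRACTION * (1 - GREEN_FRACTION)) ** 0.5
    return (hits - expected) / std

rng = random.Random(42)
generated = ["tok0"]
for _ in range(200):
    generated.append(watermarked_sample(generated[-1], rng))

unmarked = [rng.choice(VOCAB) for _ in range(201)]

for name, tokens in [("watermarked", generated), ("unmarked", unmarked)]:
    z = z_score(tokens)
    print(f"{name}: z = {z:.1f}, flagged = {z > Z_THRESHOLD}")

This also shows why paraphrasing is such an effective attack: it replaces and reorders tokens, pushing the green-token count back toward the fraction expected by chance, so the z-score collapses.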

Now there is a concerted effort by the U.S. government to get all the major players to watermark AI-generated content. All the usual suspects have pledged to do so: OpenAI, Microsoft, Google, Meta, Amazon and others. One approach gaining traction is the effort by a coalition promoting a new watermarking standard.

Coalition for Content Provenance and Authenticity (C2PA) 

This coalition is promoting a new standard, as an opt-in approach, that uses cryptography to digitally sign content. A certification authority (CA) issues credentials to organizations, which then use those credentials to digitally sign every component of a piece of content. Audio, video and images thus carry signatures that can be independently verified to establish provenance: who generated the content. It is up to users to decide whether to trust the actor who signed the content, but at least the signer is known. An MIT Technology Review article shows how this watermark (signature) can work in practice: a video carries a mark in the top right corner, and when a user clicks or hovers over it, the complete provenance information appears, showing how the content was generated, by whom and when.
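Stripped of the manifest format and certificate handling, the core mechanism is an ordinary digital signature. Below is a minimal sketch of the sign-and-verify step using an Ed25519 key pair and the Python cryptography package. To be clear, this is not the actual C2PA API: the real standard binds keys to CA-issued certificates and embeds a signed manifest in the media file itself, all of which is omitted here.

# Minimal sketch of the sign-and-verify step behind content credentials.
# Requires the `cryptography` package (pip install cryptography).
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In C2PA, this key would be tied to a certificate from a certification
# authority; here we just generate a throwaway key pair.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

content = b"...the bytes of an image, audio clip or video segment..."
signature = private_key.sign(content)

# Anyone holding the public key (obtained via the signer's certificate)
# can check that the content is exactly what the signer published.
try:
    public_key.verify(signature, content)
    print("signature valid: provenance established")
except InvalidSignature:
    print("signature invalid")

# Flipping a single byte invalidates the signature, which is what makes
# post-signing tampering detectable.
tampered = b"X" + content[1:]
try:
    public_key.verify(signature, tampered)
except InvalidSignature:
    print("tampered copy rejected")

Note that the signature proves who signed the content and that it has not changed since; it says nothing about whether the content is true, which is why trust in the signer remains the user's judgment call.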

Concluding thoughts 

We are still in the Wild West period for generative AI. C2PA-based efforts are still in their infancy, and not all major players have embraced the approach. In my opinion, any opt-in approach can only be effective if it is mandated by regulation.

I am always looking for feedback, and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.

“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.