AI talk: Belebele, NotebookLM and Freakonomics AI

Sept. 15, 2023 / By V. “Juggy” Jagannathan, PhD

This week, I blog about three different stories, two of which involve Big Tech. The third is about the Freakonomics podcast, which has done a three-part series on artificial intelligence (AI) dubbed “How to think about AI,” covering a range of interesting topics.

Meta’s Belebele Benchmark 

The word belebele caught my attention as it conjured up in my mind Punjabi songs in Indian movies, which involve a lot of dancing and shouting of the term belebele. But apparently, the word actually means “big, large, fat, great” in the Bambara language, native to the country of Mali! So, what is it with this benchmark?

Well, Meta is on a roll. They released Llama 2 two months ago. Llama 2 was released with a fairly open license, which organizations can use to develop large language model (LLM) applications. Just how open this model is was the topic of an earlier blog. Then, a few weeks later, Meta released AudioCraft.

AudioCraft is another open-source application of LLMs – this one generates music from text prompts. Google has a similar solution you can experiment with but has not released it in the open-source arena. A few weeks later, Meta released SeamlessM4T. It is a multi-tool for machine translation! You can do speech-to-speech, speech-to-text, text-to-speech and text-to-text between 100 different languages! It can also handle input speech that mixes multiple languages. Such mixing is particularly prevalent in bilingual households, where conversations flow seamlessly from one language to another. Certainly, this is true in my house, where I keep switching between two languages. Again, Meta seems to be open sourcing it, but for research purposes only this time. Following that, Meta released Code Llama, another open-source release, which auto-generates software code and competes against GitHub Copilot and Google’s Vertex AI.

Now, the release of the Belebele benchmark is different from all the releases above. This is a dataset – a multilingual one at that – covering 122 languages and dialects. It can be used to evaluate translation engines and question answering in multiple languages. It can also be used to fine-tune language models. It supports an impressive array of languages, including my mother tongue, Tamil.


Google’s NotebookLM 

In my last blog, I talked about an AI assistant being developed by the Khan Academy to tutor young kids. Google has now released something similar but with a different use case and workflow. I saw a blog on this front with a fairly hyperbolic claim: “…an AI product that could kill the schooling system we know.” 

That caught my attention. And then, I looked at Google’s staid blog on their site. So, what did they release that caused such excitement? Well, it’s a notebook app infused with AI. The notebook can be used to take notes and to store teachers’ class material and anything else related to the class the student is taking. Given this material, the student can ask the notebook, instead of the teacher, any question and get a detailed response back. It’s the same tutoring concept as Khanmigo – except the NotebookLM response is grounded in just the class material. The claim in the blog above that it will change how teachers teach and how students learn decades into the future is probably true. However, I am not sure I like the idea of students interacting with laptops while attending live lectures. I say, interact with your AI assistants on your own time!

Freakonomics – How to think about AI 

A long time ago, I read a few books by Levitt and Dubner titled Freakonomics – intriguing lessons in economics. I was and still am a big fan of their work. I was, however, unaware that they have a podcast series to explore the hidden side of everything. When my friend sent me a link to their podcasts about AI, I decided to check it out.

Turns out, it was a three-part series, “How to think about AI,” hosted by a guest host, Adam Davidson, who is famous for his podcast, Planet Money on NPR, another one of my favorite shows. The podcast series is well done, exploring the promise and perils of AI. The first episode has an interesting discussion on whether AI can tell a joke! The second ranges from the fear that AI might take away jobs to what the future holds. The last episode, from just a week ago, has an interview with the CEO of Anthropic, with a focus on how to make AI behave ethically – a topic we covered in one of my earlier blogs. The podcast series ends with some prognoses and advice. I like this remark by Adam: “This is my one recommendation. Don’t sit it out. Get to know it. You can hate it. You can love it. You can have all sorts of mixed feelings. But the more you understand it, the better prepared you’ll be.”

Acknowledgement 
My friend and classmate Murali sent me the link to the AI podcast series.

I am always looking for feedback, and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.

“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.