AI talk: Is open open and Khanmigo

Aug. 25, 2023 / By V. “Juggy” Jagannathan, PhD

This edition of AI talk explores the meaning of the term “open” in the context of software. Then I pivot to look at the fascinating work of the Khan Academy and its new artificial intelligence (AI) bot Khanmigo. 

How open is open source? 

When Meta released Llama 2 last month, it proclaimed the release of “the next generation of our open-source large language model” – the key word being “open-source.” That created a firestorm of reactions from the open source community. One such response is the blog post “Is Llama 2 open source? No – and perhaps we need a new definition of open …” MIT Technology Review also recently published a detailed synopsis of what is happening in the open source community, titled “The future of open source is still very much in flux.” So, what gives? What is really happening here? 

For reference, you can check out Meta’s license terms here. One clause, intended to screen out the likes of Google, states that anyone with more than 700 million monthly active users must request a license from Meta. This impacts just a few behemoths, but is still considered anti-competitive in the open source community. Another restrictive covenant, one that impacts practically anyone, is the stricture that the model may not be used to train other large language models unless they are derivatives of Llama. 

The basic reason for some of the outrage is that Meta is just one company. It cannot unilaterally declare what is open source and what is not. Accepted norms demand that the license go through the Open Source Initiative (OSI) license review process. There is also the more extreme version of open source from the Free Software Foundation. You can read the differences between OSI and this other, more liberal model here.

But Meta should be lauded for releasing models that make it possible for a whole range of applications to be built and commercialized. The European Union AI Act proposal has a lot of implications for the foundation models used in developing any solution. First and foremost is transparency. What data was used to train the model? What was done to mitigate risks? How was it evaluated? Some of this information is available in the detailed paper Meta released, but not all. What legislation will eventually be adopted is anybody’s guess at this point. Stanford Human-Centered AI hosted a webinar on this topic that exposes the ongoing debate on the EU legislation. It is quite instructive. There is an argument for an open source exception from regulation for foundation models. But that raises the question: What exactly is open source? 


Khanmigo 

Last week I came across a podcast featuring Bill Gates talking with the founder of the Khan Academy, Sal Khan. I am a huge fan of the Khan Academy and have been supporting them for more than a decade. I also enjoy their back-to-basics videos on almost any topic under the sun. So, I was curious to see what they are up to now. Earlier in the year I saw a great demo of how the Khan Academy was adopting GPT-4 to help students learn.

What is new now? In the podcast with Bill Gates, I found out some interesting titbits about Sal Khan. I was pleasantly surprised to learn he likes Bollywood movies! I do too! Another personal fact that got my attention: He is an artist with a penchant for drawing. I have always been amazed at his ability to draw in his videos explaining an incredible range of topics. In the initial years after founding the Khan Academy, Sal Khan created thousands of tutoring videos, all by himself. Now, of course, they have an army of people creating tutoring content and the tools from his academy are being used around the world and in classrooms.

The Khan Academy is releasing a new tool, dubbed “Khanmigo,” which harnesses generative AI capabilities. Check out the video that explains the goal of this tool, which is quite ambitious: a personalized tutor, AI assistant and coach that can help a student learn and understand any topic. In this video, Sal refers to Bloom’s 2 sigma problem. What is that problem? Benjamin Bloom, an educational psychologist, showed that an average student with one-to-one tutoring performed at a level two standard deviations (2 sigma) better than an average student taught in a classroom. That is huge. So, the goal of Khanmigo is to be that tutor and provide one-to-one tutoring, lifting the levels of all students to become masters of whatever subjects they choose to learn.
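To get a feel for how large a two standard deviation gain is, here is a minimal sketch in Python. It assumes classroom scores follow a normal (bell curve) distribution, which is a simplification for illustration only. The function name percentile_from_sigma is my own and does not come from Bloom or the Khan Academy.

```python
import math


def percentile_from_sigma(z: float) -> float:
    """Share of a normal distribution that falls below a score
    z standard deviations above the mean (assumes normality)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # A student sitting 2 standard deviations above the classroom average
    share_below = percentile_from_sigma(2.0)
    print(f"{share_below:.1%}")  # prints 97.7%
```

Under that assumption, the average tutored student would score higher than roughly 98 percent of the students taught in a conventional classroom.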

The Khan Academy is working on guardrails to make this tool effective and useful, given all the current limitations of this technology. If Khan is successful, this will be a huge benefit to the underserved children among us and around the world. I, for one, will be cheering him on from the sidelines.

I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.

“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.