From 3M Health Information Systems
AI talk: Whew! What a year for AI! And Fei-Fei Li
In this last blog for 2023, we take a tour of momentous happenings in the world of artificial intelligence (AI) through my 2023 AI talk blog journey. We also review the heartwarming immigration story penned by the AI research scientist Fei-Fei Li.
The year that was – 2023
We have just reached the one-year milestone of the release of ChatGPT. The jaw-dropping capability that ChatGPT exhibited set the stage for what was to come. Many criticized it as “fluent BS” – but everyone could see the beginning of a new era. Google researchers reported that the capabilities of large language models (LLMs) emerge with scale – how long a model is trained, how much data it is trained on and, most importantly, how many parameters it has. Parameter counts have now gone beyond a trillion! Attempts to understand how and why LLMs perform so brilliantly started in earnest. Stanford’s Human-Centered Artificial Intelligence Institute began its journey exploring what they dubbed “foundation models.”
The popularity of ChatGPT, and the exponential rate at which its use grew, forced Google to release its own LLM – Bard. Meta released its Llama model as well. Furor over LLMs began to build. OpenAI published an analysis of what this technology means for the future of labor and employment. Their take: it will impact most job categories, particularly white-collar ones.
Prominent researchers from across the world, including deep learning pioneers Geoff Hinton and Yoshua Bengio, advocated caution in exploring LLMs – they were among the 30,000 signatories of an open letter from the Future of Life Institute. Coaxing the right answer from chatbots became the national pastime! It gave rise to an entirely new field, prompt engineering – though it could hardly be called engineering, since it was largely a trial-and-error process. Still, it spawned a new category of jobs, and a well-paid one at that! Concerns about the negative uses of LLMs led the European Union to formulate the EU AI Act. Among other provisions, the act promotes transparency in the creation of models – an attempt to understand what data was used and how it was used.
Divergent camps regarding LLMs appeared in the scientific community: one claiming the technology is a harbinger of the end of human civilization (admittedly an extreme view), the other more sanguine about the technology’s arc. This year’s Munk debate explored these divergent viewpoints. Yann LeCun’s position is that transparency in model creation and deployment is an antidote to the technology’s negative impacts. True to this vision, Meta released its next version, Llama 2. Unlike its predecessor, Llama 2 was released under a fairly permissive license, open for anyone to use or adapt for commercial and non-commercial purposes. It has been downloaded almost a million times from Hugging Face since its release this summer.
One oft-repeated concern with generative AI is the detection of malicious fake content – deepfake images, audio and video. Detecting and labeling AI-generated content is an active area of ongoing research. The release of GPT-4 in the spring of 2023 created an ecosystem of applications across countless domains. One interesting and substantive example was Khan Academy’s release of Khanmigo (powered by GPT-4), with the ambition of providing private tutoring to students and adults alike. ChatGPT and the subsequent GPT-4 shared one obvious limitation: their knowledge had a cutoff date – the date at which their training data ended. It was clear from the start that this limitation needed to be addressed by using web search for current content. The new version of GPT-4 does use web search to augment its results.
Another limitation of LLMs is that they cannot provide meaningful results for queries about private corporate or proprietary information. To address this limitation, a new category of solutions has emerged – retrieval augmented language models (RALM). Retrieval augmentation is a technique that can be used with any language model, large or small, providing a vehicle for building custom solutions to specific problems using proprietary data.
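The retrieval-augmentation idea is simple enough to sketch. Below is a minimal, illustrative Python example of the pattern: retrieve the most relevant proprietary documents for a query, then prepend them to the prompt the language model sees. Everything here is a hypothetical stand-in – the toy corpus, the word-overlap scorer, and the prompt template are assumptions for illustration; a real system would use an embedding-based vector store and an actual LLM call.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# The corpus, the overlap-based retriever, and the prompt template are
# illustrative stand-ins; a production system would use embeddings,
# a vector database, and a real language model API.

def tokenize(text):
    """Lowercase, whitespace-split token set (stand-in for real text processing)."""
    return set(text.lower().split())

# Toy "proprietary" corpus the base model has never seen.
DOCUMENTS = [
    "Policy 12 covers reimbursement for telehealth visits.",
    "Form HC-7 must be filed within 30 days of discharge.",
    "Coding audits are scheduled every fiscal quarter.",
]

def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    q = tokenize(query)
    scored = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return scored[:k]

def build_prompt(query, docs):
    """Prepend the retrieved passages so the model answers from them, not from memory."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

query = "When is form HC-7 due?"
prompt = build_prompt(query, retrieve(query, DOCUMENTS))
print(prompt)  # This prompt would then be sent to the language model of choice.
```

The key design point is that the model itself never changes: only the prompt does, which is why the technique works with any language model, large or small.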
Now we are at the end of 2023. Fittingly, to bookend the year, Google has released its latest LLM – Gemini. The new model exhibits astonishing multimodal capability. Google DeepMind researchers demonstrated one of its capabilities by having it process 200,000 scientific papers released over the past few years, extracting and updating a table that scientists had manually curated over a decade. It did so in a matter of hours, understanding both the text and the images.
These new and rapidly evolving capabilities frequently bring up the specter of artificial general intelligence. In his book, “The Coming Wave,” one of the original founders of DeepMind, Mustafa Suleyman, sounds alarm bells about the impact of this profoundly transformative technology. The White House executive order on “The Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence” is squarely designed to address and mitigate the negative impacts of generative AI. And here is the most fascinating and troubling fact: no one yet really knows how or why these models perform the way they do.
We are truly venturing into uncharted waters. One thing is certain: this technology will continue to astound and concern us.
Fei-Fei Li and her immigrant journey
I have read a few books now by and about AI pioneers, but I didn’t really have any preconceived notion of what to expect from Dr. Fei-Fei Li’s book, The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI.
I was blown away by the personal journey she describes in its pages – an immigrant journey spanning three decades. I am an immigrant as well and can relate to some aspects of it, such as being poor and working manically hard, but her story is deeply moving. She was brought to the U.S. as a teenager by parents explicitly seeking a better life for her. Thrown into high school with minimal proficiency in English, she found a mentor in a math teacher who proved pivotal to her weathering the alien environment. Her innate intelligence helped her survive and thrive. Her mother, it turns out, was key to her remaining steadfastly focused on her path to becoming a research scientist despite lucrative offers from Wall Street and other commercial entities.
Her journey to create ImageNet was fascinating. Her perseverance, and her belief that big data is at the core of developing intelligent solutions, led to this open-source resource of millions of categorized images. In 2012, the ImageNet classification challenge submission by University of Toronto researchers Alex Krizhevsky, Ilya Sutskever and Geoff Hinton blew away the competition. Their novel approach was to use a deep learning algorithm to classify images into object categories, launching the new machine learning paradigm into the stratosphere. Fei-Fei’s research journey had her crisscrossing between the East and West Coasts before finally landing at Stanford. In 2018 she testified in Congress about the perils and promise of AI. Now she leads the Stanford HAI Institute.
She is still very young and will undoubtedly contribute a lot more to the field of AI, but perhaps her penchant for promoting human-centric AI will turn out to be her most significant contribution to society.
I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.
“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.