AI Talk: Brain and language models

December 3, 2021 / By V. “Juggy” Jagannathan, PhD

In this blog, I want to highlight one theme that I found interesting from the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), one that explores the connection between the brain and language models. Large language models (LLMs) like GPT-3 have been used to tackle a wide variety of tasks. They are particularly good at predicting the next word in a sentence, which is basically how they are trained. How LLMs work, and how that relates to the way our brain processes language, is a new and fascinating area of research.

2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)

This year, I attended EMNLP 2021 virtually. This was the first time the conference was held in a hybrid format, with 3,000+ attending virtually and about five hundred in person at a venue in Punta Cana, Dominican Republic. More than 1,250 short and long papers were presented at this conference, along with 20+ technical tracks and 23 workshops on all sorts of NLP-related themes. It is hard to keep up with the wealth of information created and presented at just this one conference, not to mention the scores of other conferences advancing research in AI. All the papers presented are publicly available, and you can peruse the massive collection on the Association for Computational Linguistics (ACL) website.

Now, let's get back to the topic in question: the link between the brain and language models. Professor Evelina Fedorenko, of the Department of Brain and Cognitive Sciences at MIT, gave a keynote speech at EMNLP. She started with the fundamental question: What is the role of language? Is it to enable us to think complex thoughts, or is it primarily to support communication? Through a series of experiments using functional magnetic resonance imaging (fMRI) on human subjects, she sheds light on the subject. fMRI studies have now identified specific areas in the brain that are activated when we engage in language tasks. The same imaging technique shows that different areas of the brain, and not the language areas, are engaged when cognitive reasoning is called for. The implication? Language processing is different from cognitive thinking. Complex thoughts are not handled by the language machinery in humans. Now, that is an interesting conclusion.

Her work goes further. Transformer architectures and pre-trained language models have become one of the hottest areas in AI and natural language generation and understanding. GPT-3 models have astounded us with their functionality, but a big criticism leveled against these models is that they cannot reason or think. They also cannot do math, nor do they have common sense. Turns out, Professor Fedorenko argues, neither can the language part of our brain! The portion of our brain that handles language is not handling complex reasoning either. She also shows, with various experiments, that language models like GPT can actually predict which areas of the brain will be activated during predictive tasks. When predicting the next word, the brain pays attention to only roughly 8-10 words of context. This is the brain we are talking about. The same is mostly true of LLMs as well. Unbelievable.

Another related talk that I want to focus on was one by Professor Willem Zuidema from the University of Amsterdam, Institute for Logic, Language and Computation. His was an invited talk in a workshop called “BlackboxNLP: Analyzing and interpreting neural networks for NLP.” His presentation was titled “Language, Brains and Interpretability.” The underlying thread of this talk was that human brains are just as much a black box as current deep neural networks. It’s possible that the same approaches used to make neural networks explainable and interpretable can eventually be applied to our brain matter as well! In his talk, he referred to the work discussed above from Professor Fedorenko on the ability of LLMs to predict the location of brain activity. His group has been investigating the similarity of representations of linguistic concepts between the brain and language models using fMRI and other brain scans. Representational similarity analysis (RSA), as it is called, shows that there are indeed regions in our brain where concept representations are held. When these representations are translated to vector form (embeddings), they can be shown to behave similarly to embeddings derived from LLMs. The group is investigating various interpretability techniques and hopes to use this research to shed light on how humans process language.
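
For readers curious what representational similarity analysis looks like in practice, here is a minimal sketch in Python. It is not the code or data from Professor Zuidema's group. The arrays below are random toy data standing in for brain responses and model embeddings, and the names (rdm, rsa_score, brain_responses, model_embeddings) are my own illustrative choices. The idea is to build, for each system, a matrix of how dissimilar every pair of stimuli looks to that system, and then check how well the two matrices agree.

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the feature vectors (rows) of every pair of stimuli."""
    return 1.0 - np.corrcoef(features)

def upper_triangle(matrix):
    """Flatten the entries above the diagonal (each stimulus pair once)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def rank(values):
    """Simple ranking; ties are not averaged, which is fine for
    continuous toy data where exact ties do not occur in practice."""
    return np.argsort(np.argsort(values)).astype(float)

def rsa_score(a, b):
    """Spearman-style correlation between the two RDMs' upper triangles."""
    ranks_a = rank(upper_triangle(rdm(a)))
    ranks_b = rank(upper_triangle(rdm(b)))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

# Toy data (an assumption for illustration, not real recordings):
# 12 stimuli share a hidden 5-dimensional structure that both the
# "model" and the "brain" observe through different random projections.
rng = np.random.default_rng(0)
n_stimuli = 12
latent = rng.normal(size=(n_stimuli, 5))
model_embeddings = latent @ rng.normal(size=(5, 50))
brain_responses = (latent @ rng.normal(size=(5, 30))
                   + 0.1 * rng.normal(size=(n_stimuli, 30)))
unrelated = rng.normal(size=(n_stimuli, 30))

score_related = rsa_score(model_embeddings, brain_responses)
score_unrelated = rsa_score(model_embeddings, unrelated)
score_self = rsa_score(model_embeddings, model_embeddings)

print("model vs. brain-like data:", round(score_related, 3))
print("model vs. unrelated data: ", round(score_unrelated, 3))
print("model vs. itself:         ", round(score_self, 3))
```

Notice that the two systems never need to have the same number of dimensions. RSA compares only the pattern of pairwise dissimilarities across stimuli, which is what makes it a convenient bridge between voxels in a scanner and embeddings in a network.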

At the end of the day, we can walk away with the following insights: LLMs are quite similar to our brains in how language is processed and handled. LLMs are not, however, good at common sense or cognitive reasoning. Neither is the language processing section of our brain. Essentially, this implies we need separate structures to handle common sense reasoning in silicon architectures, just as we have separate neural architecture to deal with our cognitive abilities.

I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.