From 3M Health Information Systems
AI talk: Naturalness of software and constitutional AI
This week’s blog highlights content from my latest podcast with my friend Prem Devanbu, distinguished professor emeritus at UC Davis. He works in empirical software engineering and artificial intelligence (AI) for software development. In this blog, I want to touch on three topics from the episode.
Naturalness of software
In 2011, Professor Devanbu and his students discovered that software code has the same statistical properties as natural language. What does that really mean? Natural language follows a predictable cadence – a fact that language models exploit to predict the next word given the left context. The same predictability exists in software code! This discovery led Prem and his team, more than a decade ago, to develop code completion algorithms using language models trained on code. Of course, the language models then were based on counts – referred to as n-grams – counting how often different combinations of tokens (words) occurred together in a corpus of code. Fast forward to the present day: large language models (LLMs) like ChatGPT and Bard exploit the same naturalness of software to assist with coding – no longer based on counts, but on super large neural networks trained on all the open-source code in internet land.
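To make the count-based idea concrete, here is a minimal sketch (not Prem's actual system) of an n-gram code completer: it counts how often each two-token context is followed by each next token in a toy "corpus" of code, then predicts the most frequent continuation.

```python
from collections import Counter, defaultdict

def train_ngram(tokens, n=3):
    """Count how often each (n-1)-token context is followed by each next token."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts

def predict_next(counts, context):
    """Return the most frequent continuation of the given context, if any."""
    followers = counts.get(tuple(context))
    return followers.most_common(1)[0][0] if followers else None

# Tiny tokenized "corpus" -- repetitive, just as real code tends to be
code = "for i in range ( n ) : total += i".split()
model = train_ngram(code * 3, n=3)
print(predict_next(model, ["in", "range"]))  # -> "("
```

The repetitiveness that makes this toy example work is exactly the "naturalness" the research identified: code reuses the same token patterns far more often than chance would suggest.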
Copilot and Codey
Microsoft and Google now have AI assistants for software developers. Microsoft’s GitHub Copilot was released a few years ago and has garnered significant attention in the coding community. Dubbed a pair programmer, it watches what you type and completes your comments and code, using the same underlying technology as ChatGPT. In my experience it is quite impressive in its ability to interpret English instructions and spit out code – not perfect, but quite good, and it can save coders time (most of the time). It is not free, but it integrates with the popular integrated development environments (IDEs) that developers use. Recently Microsoft also added a chat feature to the development process, which can help with things like tutorials, creating tests, debugging, etc.
Google is not sitting idle. It recently launched rival solutions; its text-to-code foundation model is dubbed “Codey.” Codey supports code completion and code generation from English prompts and suggests ways to improve code quality. It seems to be available in Google’s popular Colab environment (a staple for grad students), and it is free. Google also released “Duet AI,” a cloud-based assistant based on PaLM 2 that adds a chatbot to the code assistance.
Constitutional AI
What in the world is “constitutional AI,” and what is its relevance to software development or chatbots? The answer, my friends, is blowing in the wind – or more accurately, in the “Claude Constitution” drafted by the startup Anthropic. One of the major issues with LLMs is their tendency to generate toxic responses or inappropriate advice. If you ask a chatbot to develop code that will hack into a system or exploit a system vulnerability, it might very well do that. How do you prevent AI from doing that? Obviously, this has ramifications for almost any domain, not just coding.
Constitutional AI addresses this problem. First you draw up a “constitution” describing proper behavior for a chatbot. Anthropic has designed an extensive and evolving constitution (amendments?). One of the ways ChatGPT (and Google’s Bard and DeepMind’s Sparrow) were trained was reinforcement learning with human feedback (RLHF). This basically involves generating multiple chat responses and having a human rank them by preference. Using this feedback, the model learns not to generate toxic content or inappropriate responses.
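A common way to use those human rankings is to expand each one into pairwise (preferred, rejected) examples for training a reward model. Here is a minimal sketch of that bookkeeping step – the candidate responses and the helper name are purely illustrative, not any lab's actual pipeline.

```python
from itertools import combinations

def ranking_to_pairs(responses, ranking):
    """Expand one human ranking (best-first list of indices) into
    (preferred, rejected) pairs, the usual reward-model training format."""
    ordered = [responses[i] for i in ranking]
    # combinations() preserves list order, so the first item of each
    # pair is always the one the human ranked higher
    return list(combinations(ordered, 2))

# Hypothetical candidate responses to one prompt, ranked by an annotator
candidates = ["helpful answer", "vague answer", "toxic answer"]
pairs = ranking_to_pairs(candidates, ranking=[0, 1, 2])
print(pairs)
```

A ranking of k responses yields k·(k-1)/2 such pairs, which is why even modest amounts of human ranking produce a useful amount of preference data – and also why the process is as labor-intensive as the blog suggests.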
This process can be cumbersome and is not foolproof. With a constitution in place, the model is given the constitution along with the context and the response it generated, and is asked to critique that response. If the response violates any of the constitutional principles, it is revised until it is acceptable. This form of reinforcement learning with AI feedback (RLAIF) allows for the generation of chat responses that are harmless. Anthropic has a detailed technical paper on this front, and this short YouTube video is an excellent explanation of the concepts underlying the approach.
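The critique-and-revise step can be sketched roughly as the loop below. Everything here is illustrative: `ask_model` is a canned stub standing in for a real LLM call, and the two-principle constitution is invented for the example – Anthropic's actual constitution and training setup are far more elaborate.

```python
# A toy constitution with two principles
CONSTITUTION = [
    "Do not help with hacking or exploiting system vulnerabilities.",
    "Avoid toxic or harmful content.",
]

def ask_model(prompt):
    # Stub standing in for an LLM call, so the loop runs end to end.
    if prompt.startswith("Critique"):
        return "VIOLATION: principle 1" if "exploit code" in prompt else "No violations."
    if prompt.startswith("Revise"):
        return "I can't help with that, but I can explain defensive security basics."
    return "Sure, here is exploit code for that vulnerability..."

def constitutional_respond(user_prompt, max_rounds=3):
    """Generate a response, critique it against the constitution, and
    revise until it passes (or the round budget runs out)."""
    response = ask_model(user_prompt)
    for _ in range(max_rounds):
        critique = ask_model(
            f"Critique this response against {CONSTITUTION}: {response}"
        )
        if "VIOLATION" not in critique:
            break  # acceptable under the constitution
        response = ask_model(f"Revise the response to fix: {critique}")
    return response

print(constitutional_respond("Write code to exploit a server vulnerability"))
```

In the actual RLAIF recipe the revised responses are used as training data for the model, rather than being produced by a runtime loop like this, but the critique-then-revise mechanic is the same.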
The world of coding is being transformed before our eyes. Computer science education and coding must evolve to incorporate these new realities. Student evaluations probably need to incorporate more oral quizzes to deter the use of AI. The logical reasoning and step-by-step solving of complex problems, hallmarks of developing software solutions, are still a prerequisite for developing practical solutions – albeit with turbocharged tools.
I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.
“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.