AI Talk: Code generation and COVID-19 complications

August 23, 2021 / By V. “Juggy” Jagannathan, PhD

Code generation

Over the years I have programmed in many languages, starting with Fortran IV in the early seventies. In those days, we had to punch the program instructions into cards and feed them into a card reader. A bunch of these programs were then fed to a mainframe computer, and half a day later you could get a printout of your results, assuming your program was bug-free, which was a rare occurrence. Programming languages have evolved significantly over the years, but the goal is always the same: Make the specification of what needs to be accomplished as close to natural language as possible.

AI research company OpenAI is going after this holy grail: software specification in plain English. They are releasing a new tool called Codex that takes advantage of the GPT-3 language model. It can generate code in dozens of different languages, but is most proficient in the language of choice for machine learning enthusiasts: Python. Watch a demo of OpenAI researchers executing simple instructions to create a simple game here. Because the English statements are being converted to code, the code generated is already well documented, which means no more complaining to your coder that their code is incomprehensible. This system is not going to replace coders anytime soon, but is intended to make them more efficient.

What happens when there is a problem, either in the generated code or in the logic of the person specifying what needs to be done? OpenAI’s complete technical paper evaluates how well the model performs in converting English specifications to code. The language model underlying its code generation capabilities starts with GPT-3 and is fine-tuned on 54 million public software repositories. They have created multiple models of varying sizes, the largest one being a 12 billion parameter model.

How do you determine whether or not the code is good? Well, they have 164 handwritten programming problems, dubbed the HumanEval dataset. Each problem comes with unit test code that determines if a program works correctly on specified input. For each problem specification, they generate k code samples, where k is 1, 10, or 100, and the problem counts as solved if any of the samples passes the tests. With the largest model, they get 72 percent of the problems right when the model is allowed 100 tries. That’s actually not bad. After all, it is the machine that is producing the code, so who cares if it tries 100 or 1,000 times!
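The "k tries" scoring described above can be sketched in a few lines of Python. This is a minimal sketch, not OpenAI's evaluation code: it assumes you have already generated n samples for each problem and counted how many (c) passed the unit tests, and it uses a standard combinatorial estimate of the chance that at least one of k randomly chosen samples passes. The function names and the toy counts are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes the tests.

    n: total samples generated for one problem
    c: how many of those n samples passed the unit tests
    k: number of tries being scored (k <= n)
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k must include a pass.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results, k):
    """Average pass@k over (n, c) pairs, one pair per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Made-up counts for three problems, 100 samples each (not real Codex data).
    toy_results = [(100, 0), (100, 3), (100, 60)]
    for k in (1, 10, 100):
        print(f"pass@{k}: {benchmark_score(toy_results, k):.3f}")
```

A problem that no sample ever solves contributes zero no matter how large k gets, while a problem solved by even a few samples climbs toward one as k grows. That is why the score at 100 tries is so much higher than the score at a single try.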

The business of writing code automatically is actually an evolving competition. Another group of researchers, from EleutherAI, is attempting to replicate GPT. They have released a model called GPT-J. According to OpenAI, GPT-J got 28 percent correct in 100 tries on the HumanEval dataset.

The paper also raises concerns over bias in the generated code, including its embedded comments. Considering the language model is trained on a large swath of internet data, that is not really surprising. Nonetheless, the release of Codex represents an important milestone in the code generation landscape.

COVID-19 complications

Researchers from nference, Inc. and Mayo Clinic have just released a detailed retrospective study analyzing EHR data of a cohort of 1,803 hospitalized patients with COVID-19. It is a sad state of affairs that such a substantial cohort can be put together and a study released even while the pandemic morphs and rages on.

In any case, what was the main objective of the study? To analyze the impact of pre-existing comorbidities on hospitalized COVID-19 patients. This study will help inform caregivers on what to anticipate and how to take care of such patients. The researchers collected six months of data from the EHRs of multiple Mayo Clinic locations last year. More than one million clinical documents were collected to cover one year of data prior to a positive COVID-19 test and three months of data after such a test. A baseline data set of an additional 1,803 hospitalized patients who tested negative for COVID-19 provided the control group.

A BERT-based transformer deep learning model was used to extract the 21 most common comorbidities and 20 common complications. What are the findings from the study? The most common complication that occurs within 30 days of a positive COVID-19 test is pleural effusion. The comorbidity that leads to the most complications is hypertension. It is interesting to note that deep learning models have become a mainstay in most studies.

Acknowledgement

The story on COVID-19 complications was suggested by my colleague, Deanna Berkowitz.

I am always looking for feedback and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.