AI Talk: Tracking the COVID-19 pandemic

Oct. 16, 2020 / By V. “Juggy” Jagannathan, PhD

As COVID-19 cases continue to creep up, I wanted to assess the state of analytics with respect to the disease. AI and machine learning have been central to the data analysis process and continue to help researchers see what’s happening on this front.

Symptom, treatment analysis

I saw a nice summary published last month about massive data initiatives used for pandemic analytics. The report catalogs a number of consortiums that focus on different aspects of the pandemic. These include registries and mobile apps that track virus symptoms in the population, those that track diagnostic efficacy using a combination of genomic and imaging modalities, and those that track treatment efficacy. One of the research organizations mentioned is OpenSAFELY, a UK-based group studying COVID-19 data. They analyzed the records of over 18 million patients to determine who was at the greatest risk of dying (and probably of being hospitalized, as well) after contracting COVID-19. They highlighted a number of risk factors, ranging from obesity and old age to major comorbid conditions such as renal failure, cancer and cardiac problems.

Forecasting models

When it comes to disease forecasting, there are an astounding number of ongoing efforts. COVID-19 ForecastHub is a good source for many of these models. It tracks 42 different models and also has an ensemble model that combines them all. It is also the source of the data published by CDC. FiveThirtyEight, an analytics firm which is part of ABC News, has a nice visualization of a subset of these models. You can see various predictions from the different models for the next few weeks in their visualization. Obviously, each model uses a different set of assumptions and the predictions vary significantly. But it is instructive to look at the bounds of these predictions, i.e., the predicted high and the predicted low. The sad fact is, even if you take the predicted low, there is a steady increase in the number of deaths reported. These predictions, like the weather predictions we tend to see for, say, hurricanes, provide a region or band where the number of deaths is expected to fall.
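To make the idea of a prediction band and an ensemble concrete, here is a minimal sketch in Python. The model names and numbers are made up for illustration, and taking the median across models is just one simple way to combine forecasts; it is an assumption here, not a description of how COVID-19 ForecastHub builds its ensemble.

```python
from statistics import median

# Hypothetical weekly forecasts of cumulative deaths from three models.
# These numbers are illustrative only, not real forecast data.
forecasts = {
    "model_a": [220_000, 225_000, 231_000, 236_000],
    "model_b": [219_000, 223_000, 227_000, 230_000],
    "model_c": [221_000, 228_000, 236_000, 245_000],
}

def summarize(forecasts):
    """Return one (low, ensemble, high) tuple per forecast week.

    low/high are the bounds across models; the ensemble value is the
    median across models (an assumed, simple combination rule).
    """
    per_week = zip(*forecasts.values())  # group the models' values by week
    return [(min(week), median(week), max(week)) for week in per_week]

if __name__ == "__main__":
    for week, (low, mid, high) in enumerate(summarize(forecasts), start=1):
        print(f"week {week}: low={low:,} ensemble={mid:,} high={high:,}")
```

In this toy data even the low end of the band rises every week, which is the pattern described above.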

Model performance

How good are these predictions? There is more data now and these models are continually forecasting, so it should be relatively straightforward to figure out if past predictions proved true. The best source of this information comes from Youyang Gu, who started one of the first forecasting efforts based entirely on machine learning. Gu’s forecasting is pretty impressive, considering it is a solo effort! He recently published a blog post that gives us a window into the impact this work had on his life: six months of collecting public data and producing daily forecasts, all by himself. This particular post is also his farewell to the forecasting effort. However, he has taken it upon himself to evaluate each forecasting model on the accuracy of its predictions. You can look at his evaluation here. He compares all the models with a baseline, which is simply a projection of a trend line from the previous two weeks. At a glance you can tell which models have performed well and which have failed miserably. All you have to do is look at the color: red=bad, green=good. The consistent winner in the model predictions? The ensemble model developed by COVID-19 ForecastHub!
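A baseline comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "trend line" is taken to be a least-squares straight line through the last 14 daily values, and accuracy is measured with mean absolute error. Gu's actual baseline and scoring may differ, and all function names and numbers below are hypothetical.

```python
def fit_line(ys):
    """Least-squares slope and intercept for ys observed at x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return slope, y_mean - slope * x_mean

def baseline_forecast(history, horizon, window=14):
    """Project the trend of the last `window` days forward by `horizon` days."""
    recent = history[-window:]
    slope, intercept = fit_line(recent)
    return [intercept + slope * (len(recent) + h) for h in range(horizon)]

def mean_abs_error(predicted, actual):
    """Average absolute gap between predicted and observed values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

if __name__ == "__main__":
    # Illustrative numbers only, not real data.
    history = [1000 + 10 * day for day in range(14)]  # past two weeks
    actual = [1145, 1160, 1180]                       # what happened next
    model = [1143, 1158, 1175]                        # a hypothetical model's forecast

    baseline = baseline_forecast(history, horizon=len(actual))
    verdict = "green" if mean_abs_error(model, actual) < mean_abs_error(baseline, actual) else "red"
    print(f"model vs. baseline: {verdict}")
```

A model earns its keep only if its error is lower than that of the naive projection; otherwise the extra modeling effort added nothing.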

Unfortunately, all the models are pointing towards more infections, especially now that the flu season is upon us. The silver lining, though, is that whatever precautions one takes to avoid getting COVID-19 will also help in avoiding the flu!

I am always looking for feedback and if you would like me to cover a story, please let me know. “See something, say something!” Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.