Millions of recent responses to a survey created at Carnegie Mellon — and shared via Facebook and Google — could soon help us understand where the next big outbreaks of COVID-19 will be.
The survey asks Facebook and Google users to self-report symptoms associated with the coronavirus (you can respond even if you don’t have symptoms; that helps too). Since many people are staying home and not visiting doctors’ offices and hospitals, information about the spread of symptomatic infections isn’t readily available from any other source.
The data collected will be made public at CMU’s COVIDcast website, and Facebook is also making the aggregated survey information from its users available. Later this week, the COVIDcast site will have interactive heat maps of the U.S. showing survey results.
Facebook’s Mark Zuckerberg just wrote about the project in The Washington Post. He said that Facebook is partnering with the University of Maryland to expand the survey globally.
So far, CMU is seeing about a million responses per week from Facebook users. Every day last week, 600,000 users of Google Opinion Rewards and AdMob apps answered the CMU survey.
“I’m very happy with both the Facebook and Google survey results,” says Ryan Tibshirani, associate professor of statistics and machine learning, and co-leader of Carnegie Mellon’s Delphi COVID-19 response team. “They both have exceeded my expectations.”
Combined with other data such as medical claims and test results, the survey responses will allow the CMU Delphi team to estimate disease activity in a way that better reflects reality than positive coronavirus tests alone.
Relying only on positive tests may not give a complete picture of the COVID-19 pandemic’s spread, notes Roni Rosenfeld, co-leader of the CMU Delphi research group, because of limited testing capacity, reporting delays and other factors.
The researchers say they now have data for a county-level breakdown of COVID-19 activity in the 601 U.S. counties that have at least 100,000 people.
The Delphi research group has grown to include 30 faculty, students and volunteers. It has been working on forecasting nationwide influenza outbreaks for years and was designated last year by the U.S. Centers for Disease Control and Prevention (CDC) as one of two National Centers of Excellence for Influenza Forecasting. It’s adapting its flu forecasting abilities to COVID-19 at the CDC’s request.
Delphi uses two forecasting approaches, both of which have been effective in predicting the spread of the flu. Crowdcast is a “wisdom of the crowds” approach that aggregates weekly estimates submitted by volunteers. The other approach uses machine learning to recognize patterns in health care data.
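To make the “wisdom of the crowds” idea concrete, here is a minimal Python sketch of combining volunteer estimates into a single weekly figure. It is an illustration only: the article does not say how Crowdcast combines submissions, so the use of the median, the function name `aggregate_crowd_estimates` and the sample numbers are all assumptions.

```python
from statistics import median


def aggregate_crowd_estimates(estimates_by_week):
    """Combine volunteer estimates into one value per week.

    Assumption: the median is used as the aggregate because it is
    robust to a few extreme guesses. The real Crowdcast method may differ.
    """
    return {
        week: median(values)
        for week, values in estimates_by_week.items()
        if values  # skip weeks with no submissions
    }


# Hypothetical volunteer estimates (e.g., percent of doctor visits
# for flu-like illness); these numbers are made up for illustration.
submissions = {
    "week 1": [2.1, 2.4, 1.9],
    "week 2": [2.8, 3.5, 3.0],
    "week 3": [],  # no volunteers responded
}

crowd_forecast = aggregate_crowd_estimates(submissions)
print(crowd_forecast)
```

The median is chosen here because a single wildly high or low guess does not move it much; a mean or a weighted average would be equally plausible choices.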
The survey data collected is anonymous and is not shared with Google or Facebook. Since 2016, Google Health Trends has been providing CMU with information about users’ searches for the flu, and it is now providing information about searches for COVID-19-related terms as well. A major healthcare provider and Quidel, a diagnostic test provider, are also providing anonymous COVID-19 test statistics. Five more data sources will be added in the coming weeks.
“The data they provide is priceless and will give us greater confidence once we are able to begin our forecasts for this deadly disease,” says Rosenfeld.