Data from 175K COVID-19 patients fuels predictive severity model

A study published in JAMA Network Open this week used the largest data repository of COVID-19 patients in the United States to develop a model to predict clinical severity based on first-day admission data.  

The research relied on roughly two million medical records stored in the Data Enclave of the National COVID Cohort Collaborative, or N3C.   

As the researchers explained, “This cohort study characterizes the largest U.S. COVID-19 cohort to date, including 174,568 adults who tested positive for SARS-CoV-2.”  


The study was the first to use the N3C database, which is specifically designed to support research on COVID-19.   

The N3C was developed by the National Center for Advancing Translational Sciences, the hub for research of this kind at the National Institutes of Health. As of December 2020, the N3C release set included information from 1,926,526 patients from 34 sites across the United States.   

The NCATS website notes that as of July 2021, the electronic health record repository included data from 6.3 million patients.

“This cohort is racially and ethnically diverse and geographically distributed,” said researchers. “We evaluated COVID-19 severity and associated clinical and demographic factors over time and used machine learning to develop a clinically useful model that accurately predicts severity using data from the first day of hospital admission.”  

Of the roughly 175,000 adults who tested positive for the coronavirus included in the study, 18.6% were hospitalized. 

Of those, 6,565, or about one in five, had what researchers called a “severe clinical course”: invasive ventilatory support, extracorporeal membrane oxygenation, discharge to hospice, or death.  

The team found that inpatient mortality decreased over time, from March and April 2020 to September and October 2020. Treatment patterns also changed, with use of antimicrobial and immunomodulatory medications shifting over the course of the pandemic.   

By using the N3C data, researchers were able to develop accurate machine learning models to predict clinical severity based on data available on the first calendar day of admission, with the most powerful predictors being patient age and widely available vital sign and laboratory values. 
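The study's actual models and feature set are not reproduced here, but the general approach can be sketched as a simple classifier trained on first-day admission features. The example below fits a plain logistic regression on synthetic data; the feature names (age, oxygen saturation, C-reactive protein), cutoffs, and coefficients are illustrative assumptions, not values from the paper.

```python
import math
import random

random.seed(0)

def make_patient():
    """Generate one synthetic patient record with first-day features.

    Features are stand-ins for the kinds of predictors the study cites
    (age, vital signs, lab values); the relationship below is assumed
    for illustration only: older age, lower SpO2, and higher CRP raise
    the chance of a severe course.
    """
    age = random.uniform(20, 90)
    spo2 = random.uniform(80, 100)   # oxygen saturation (%)
    crp = random.uniform(0, 200)     # C-reactive protein (mg/L)
    logit = 0.06 * (age - 55) - 0.25 * (spo2 - 94) + 0.015 * (crp - 50)
    severe = 1 if random.random() < 1 / (1 + math.exp(-logit)) else 0
    return [age, spo2, crp], severe

data = [make_patient() for _ in range(2000)]
X = [features for features, _ in data]
y = [label for _, label in data]

# Standardize each feature so a single learning rate works for all of them.
cols = list(zip(*X))
means = [sum(c) / len(c) for c in cols]
stds = [max((sum((v - m) ** 2 for v in c) / len(c)) ** 0.5, 1e-9)
        for c, m in zip(cols, means)]
Xs = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def predict(w, b, row):
    """Logistic model: probability of a severe clinical course."""
    return 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, row)) + b)))

# Fit by batch gradient descent on the log loss.
w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.1
for _ in range(300):
    gw, gb = [0.0, 0.0, 0.0], 0.0
    for row, label in zip(Xs, y):
        err = predict(w, b, row) - label
        for i in range(3):
            gw[i] += err * row[i]
        gb += err
    w = [wi - lr * gwi / len(Xs) for wi, gwi in zip(w, gw)]
    b -= lr * gb / len(Xs)

accuracy = sum((predict(w, b, row) >= 0.5) == bool(label)
               for row, label in zip(Xs, y)) / len(Xs)
print(f"training accuracy: {accuracy:.2f}")
```

A production tool would instead be trained on real N3C records, validated out-of-sample, and calibrated per site, which is the "additional work at deploying healthcare systems" the authors flag.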

Although the team noted the models could act as a basis for generalizable clinical decision support tools, they cautioned that developing such tools would require additional work at the healthcare systems deploying them.  


Given COVID-19’s devastating effect on the world, informaticists and researchers have ramped up efforts to use artificial intelligence to treat patients more effectively.   

Earlier this year, researchers at Northwell’s Feinstein Institutes for Medical Research developed an AI-powered predictive tool intended to assess patients for their risk of COVID-19 respiratory failure within 48 hours.

Others at MIT used AI to find drugs that could be repurposed for COVID-19.

But not every model is equally effective. An audit undertaken by a team at the University of Washington found that AI systems aimed at detecting COVID-19 in chest radiographs sometimes failed when tested in new hospitals.   

“Because this approach to data collection has also been used to obtain training data for the detection of COVID-19 in computed tomography scans and for medical imaging tasks related to other diseases, our study reveals a far-reaching problem in medical-imaging AI,” wrote those researchers.  


“Developed under the intense time pressure of a health crisis, earlier data aggregation efforts may not have been designed to support future research,” observed researchers in the JAMA Network Open study.   

“The N3C Data Enclave provides transparent, easily shared, versioned, and fully auditable data and analytic provenance,” they said.


Kat Jercich is senior editor of Healthcare IT News.
Twitter: @kjercich
Healthcare IT News is a HIMSS Media publication.
