The Million Veteran Program (MVP) was launched in 2011 by the US Department of Veterans Affairs (VA) to enroll at least 1 million veterans in a longitudinal cohort to better understand how genes, lifestyle, military experience, and environmental exposures interact to influence health and illness and ultimately enable precision health care. The MVP has established a national, centralized infrastructure for recruitment and enrollment, biospecimen and data collection and storage, data generation and curation, and secure data access. When the COVID-19 pandemic hit in 2020, the MVP was leveraged to support research using the following key infrastructure components: (1) the MVP recruitment and enrollment platform to provide support for COVID-19 vaccine and treatment trials and to collect COVID-19 data from MVP participants; (2) MVP Phenomics for COVID-19 research data cleaning and curation, assistance with the development of a VA Severity Index for COVID-19, and the formation of 6 scientific working groups to coordinate COVID-19 research questions; and (3) the VA/MVP and US Department of Energy (DOE) partnership to assist in responding to COVID-19 research questions identified by the US Food and Drug Administration (FDA). This article describes these infrastructure components in more detail and highlights key findings from the MVP COVID-19 research efforts.
MVP Infrastructure
The Veterans Health Administration (VHA) Office of Research and Development (ORD) oversaw efforts to develop the VA Coronavirus Research Volunteer List (the COVID-19 registry). To support the registry, the MVP leveraged its infrastructure to facilitate a rapid response. The MVP is designed as a full-service and centralized recruitment and enrollment platform. This includes MVP office oversight; MVP coordinating centers that manage the centralized platform; an information center that handles inbound and outbound calls; an informatics system built for recruitment and enrollment monitoring and tracking; and a network of more than 70 participating MVP sites with dedicated staff to conduct recruitment and enrollment activities. The MVP used its informatics infrastructure to support secure data storage for the registry volunteer information. MVP coordinating center staff worked with the COVID-19 registry to invite > 125,000 MVP participants from approximately 20 MVP sites. Additionally, MVP information center staff made > 4000 calls to prospective registry volunteers. This work resulted in 1300 volunteers agreeing to be
New Data Collection
The MVP protocol was approved by the VA Central Institutional Review Board (IRB) in 2011. As part of initial enrollment in MVP, participants consented to recontact for additional self-report information along with access to their electronic health record (EHR). This allows for the linkage of EHR and survey response data, thus providing a comprehensive understanding of health history before and after a self-reported COVID-19 diagnosis. Between May 2020 and September 2021, the MVP COVID-19 survey was distributed to existing MVP participants via mail, telephone, and email with the ability to complete the survey by paper and pencil or through the MVP online system. Dissemination of the survey was approved by the VA Central IRB in 2020, with nearly 730,000 eligible MVP participants contacted. As of June 2022, 255,737 MVP participants (35% of the eligible cohort) had completed the survey; 86% completed a paper survey while 14% completed it online. Respondents were primarily older (≥ 65 years); 90% were male; close to 7% reported Hispanic ethnicity, and 11% reported Black race.
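The response figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the MVP analysis pipeline: the denominator is the rounded "nearly 730,000" quoted in the text, the paper and online counts are back-calculated from the reported percentages and are therefore approximate, and all variable and function names are illustrative assumptions.

```python
# Figures quoted in the text. The denominator is approximate ("nearly 730,000"),
# so the computed rate is an estimate, not an official MVP statistic.
ELIGIBLE_CONTACTED = 730_000
COMPLETED = 255_737
PAPER_SHARE = 0.86   # reported share completing the paper survey
ONLINE_SHARE = 0.14  # reported share completing the survey online


def response_rate(completed: int, contacted: int) -> float:
    """Return the fraction of contacted participants who completed the survey."""
    if contacted <= 0:
        raise ValueError("contacted must be positive")
    return completed / contacted


rate = response_rate(COMPLETED, ELIGIBLE_CONTACTED)

# Back-calculated counts by completion mode (approximate, derived from rounded percentages).
approx_paper = round(COMPLETED * PAPER_SHARE)
approx_online = round(COMPLETED * ONLINE_SHARE)

print(f"response rate: {rate:.1%}")
print(f"approx. paper completions: {approx_paper:,}")
print(f"approx. online completions: {approx_online:,}")
```

The computed rate is consistent with the 35% reported above; the per-mode counts are estimates only, because the source reports percentages rather than counts.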
Findings from this survey provide insight into pandemic experiences not consistently captured in EHRs, including psychosocial aspects, such as social and emotional support and loss of tangible and intangible resources, and COVID-19–related behaviors, such as social distancing and self-protective practices.1 MVP COVID-19 survey data combined with veteran EHRs, responses to other MVP surveys, and genetic data enable MVP researchers to better understand epidemiological, clinical, and psychosocial aspects of the disease. Future COVID-19 studies may use self-reported survey responses to enrich understanding of the effects of the disease on a veteran’s daily life and possibly to validate existing EHR COVID-19 diagnoses and hospitalization findings. This comprehensive data resource provides a unique opportunity to identify new targets for disease prevention, treatment, and management with an emphasis on individual variability in genes, environment, and lifestyle.
COVID-19 Research
In early 2020, the burden of COVID-19 on the US was unprecedented, and little was known about risk factors for severe COVID-19 and deaths. The MVP Phenomics team quickly responded with a large-scale phenome-wide association study (PheWAS) of >
To broaden disease progression data curation and fit the specific needs of the VA, we operationalized and validated the World Health Organization clinical severity scale and used VA EHR data to create the VA Severity Index for COVID-19 (VASIC).3 The VASIC category is now part of the MVP core data repository, where large volumes of data from multiple activities are integrated through an automated process to create monthly research-ready data cubes. These activities include extensive data curation, mapping, phenotyping, and adjudication to capture oxygen supplementation status and other treatment-related procedures, which are processed and understood in real time. The data cubes were provisioned to MVP COVID-19 researchers. In addition, the VASIC scale variable is now integrated within the larger VA system for all researchers to use as part of the VA’s wider COVID-19 initiative. The VA Centralized Interactive Phenomics Resource (CIPHER) phenomics library now hosts the details of VASIC, codes, metadata, and related COVID-19 data products for all VA communities. In partnership with CIPHER and other internal and external COVID-19 initiatives, the MVP continues to play an integral part for the VA and beyond in the development of a phenomics algorithm for long COVID, or post-acute COVID-19 syndrome (PACS).
Host Genetics in COVID-19
As the SARS-CoV-2 virus continued to spread globally, it became clear that the symptoms and severity of infection varied across a broad spectrum, from asymptomatic carriage to severe symptoms in 1 or more organ systems, sometimes resulting in death. This variability suggested that host genetics and other host factors may play a role in determining the severity of COVID-19 infection. The MVP dataset, with genetic and health information on > 600,000 MVP participants, provided an ideal resource to explore host contributions to COVID-19.
In late spring 2020, the MVP executive committee issued a call to the MVP research community to propose study aims around the COVID-19 pandemic that could leverage the phenotypic and genetic data and resources. The MVP quickly formed 6 rapid-response scientific working groups. Their mission was to cultivate collaboration and inclusivity and to coordinate COVID-19 research questions. A steering committee was composed of the MVP executive committee, staff from computational environments, working group cochairs, and an administrator, who was responsible for daily oversight of the working groups. In addition, the ORD COVID-19 steering committee reviewed and approved research activities to ensure scientific rigor, as well as alignment with overall ongoing research activities.
The MVP COVID-19 working groups included dozens of researchers who used MVP data to identify disease mechanisms; understand the impact of host genetics on susceptibility, morbidity, and mortality; and identify potential targets for treatments and therapies. The working groups were further supported by MVP analysts to work cross-functionally on genomics, phenomics, statistical genetics, and PheWAS. Each working group chair was responsible for prioritizing concepts and moving them forward in coordination with the MVP and ORD COVID-19 steering committees. An overview of the MVP COVID-19 working groups follows (Table).4-9
Druggable genome. This working group researched drug-repurposing opportunities to prevent severe COVID-19, defined as hospitalization with oxygen therapy (high flow), intubation, mechanical ventilation, vasopressors, dialysis, or death from COVID-19; and to prevent complications in patients hospitalized with COVID-19.
Pharmacogenomics. This working group focused on 2 main aims: the impact of apolipoprotein L1 risk variants on acute kidney injury (AKI) and death in Black veterans with COVID-19; and pharmacogenetic analysis of remdesivir-induced liver chemistry abnormalities.
Disease mechanisms. Understanding the underlying pathways and mechanisms behind COVID-19 has been a difficult but important challenge for the scientific community. This working group investigated specific genetic markers and their effects on COVID-19, including polygenic predisposition to venous thromboembolism associated with increased COVID-19 susceptibility; renal comorbidities, new AKI, and unfavorable outcomes among COVID-19–positive sickle cell trait carriers; and a mucin 5B, oligomeric mucus/gel-forming (MUC5B) gene polymorphism and its protective effects in COVID-19 infection.
Genomics for risk prediction, polygenic risk scores, and mendelian randomization. Risk prediction for COVID-19 has been widely studied, mostly with a focus on comorbidities and preexisting conditions. The MVP cohort provided a unique opportunity to understand how genetic information can enhance our understanding of COVID-19 risk. This working group focused on: (1) ABO blood group typing and the protective effects of the O blood group on COVID-19 infection; (2) polygenic risk scores and COVID-19 outcomes; (3) human leukocyte antigen typing and COVID-19 outcomes; and (4) a transcriptome-wide association study of COVID-19–positive MVP participants.
Genome-Wide Association Study (GWAS) and Downstream Analysis. This working group performed GWAS of the main COVID-19 outcomes. The GWAS results revealed new genetic loci and pointed to candidate genes for further investigation. The results were used by other MVP COVID-19 working groups for their activities and also contributed to external collaborations, such as the COVID-19 Host Genetics Initiative.
COVID-19–Related PheWAS. This working group focused on understanding the potential clinical significance of genetic variants associated with susceptibility to, or outcomes of, COVID-19 infection. The group worked to identify traits that share genetic variants associated with severe COVID-19 from the Host Genetics Initiative. The group also studied the phenotypic consequences of acquired mosaic chromosomal alterations, with early data linking them to COVID-19 susceptibility.
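The severe COVID-19 definition used by the druggable genome working group can be written as a simple predicate. The following is a minimal sketch under stated assumptions, not the working group's actual phenotyping code: the record type and its field names are hypothetical, and it assumes one reading of the definition, namely death from COVID-19 or hospitalization accompanied by at least 1 of the listed interventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CovidEpisode:
    """Hypothetical per-patient flags; field names are illustrative only."""
    hospitalized: bool = False
    high_flow_oxygen: bool = False
    intubation: bool = False
    mechanical_ventilation: bool = False
    vasopressors: bool = False
    dialysis: bool = False
    died_of_covid: bool = False


def is_severe(episode: CovidEpisode) -> bool:
    """Assumed reading: death from COVID-19, or hospitalization with any listed intervention."""
    interventions = (
        episode.high_flow_oxygen,
        episode.intubation,
        episode.mechanical_ventilation,
        episode.vasopressors,
        episode.dialysis,
    )
    return episode.died_of_covid or (episode.hospitalized and any(interventions))


# Usage: a hospitalized patient who needed dialysis meets the definition,
# while hospitalization alone does not under this reading.
print(is_severe(CovidEpisode(hospitalized=True, dialysis=True)))
print(is_severe(CovidEpisode(hospitalized=True)))
```

In practice, each flag would have to be derived from curated EHR data, which is the role the text assigns to the MVP Phenomics curation work.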
COVID-19 Research Partnerships
In 2016, the VA and DOE formed an interagency partnership known as Computational Health Analytics for Medical Precision to Improve Outcomes Now (CHAMPION) to demonstrate the power of combining the VA EHR system, MVP genetic data, and clinical research expertise with DOE high-performance computing infrastructure and artificial intelligence expertise. The VA EHR captures longitudinal care information on veterans with records that go back decades. Furthermore, the VA covers the costs of medications and
The DOE Oak Ridge National Laboratory (ORNL) in Tennessee securely maintains this rich database for the VA. The ORNL Summit supercomputer can complete trillions of calculations per second to provide critical and timely analyses, applying the most advanced and powerful artificial intelligence methods, which would not be possible in more conventional research settings. CHAMPION taught the VA and DOE how to bring their disparate research cultures together for innovative collaborative investigation. Moreover, this collaboration produced a cadre of VA and DOE scientists familiar with VA patient data and experienced in conducting joint research successfully and integrating omics data with clinical data for a better mechanistic understanding. Because of this preexisting collaboration between the VA and DOE, interagency teams were prepared at the start of the COVID-19 pandemic.10-15
Other recently completed studies have developed and validated short-term mortality indices in individuals with COVID-19 based on their preexisting conditions, assessed the generalizability of VA COVID-19 experiences to the US population, and evaluated the effectiveness of hydroxychloroquine with and without azithromycin in VA patients with COVID-19.12,15 A recent study demonstrated the benefit of prophylactic anticoagulation at initial hospitalization.14
The VA also provided the FDA with daily reports on aggregate VA COVID-19 cases and their distribution across the VA system, demographics of VA patients with COVID-19, and analyses of predictive models for positive test results and death. The VA regularly sent the FDA aggregated data showing patterns of medication use and retrospective analyses of the effectiveness of certain medications (including remdesivir and some antithrombotic agents). The FDA used these data along with other data to understand the scope of the pandemic and to predict drug shortages or needs for additional medical equipment, including ventilators.
Limitations
For the most part, MVP infrastructure and partnerships were efficiently leveraged to significantly advance our understanding of the biological basis of COVID-19 and to develop treatments and vaccines. However, there were a few limitations that may have slowed timely and optimal outcomes. An issue not limited to the MVP or VA was the continual evolution of the pandemic and its response. This included evolving definitions of disease, symptomatology, testing, vaccines, and public health recommendations. Keeping pace with the emerging knowledge from these domains was a struggle for the entire scientific community. A more discrete limitation was the number of participants in the MVP with positive COVID-19 test results and positive symptoms; however, this was mitigated by partnering with other groups like the COVID-19 Host Genetics Initiative to increase study participant numbers. Finally, there were logistical and regulatory challenges associated with coordination of national clinical trial recruitment across a VA system with > 100 discrete hospitals.
Conclusions
Having a centralized infrastructure for recruitment and enrollment, including a national research volunteer registry, information center, research staff, and coordinating centers, can allow for expedited enrollment in vaccine and treatment trials in the face of future public health emergencies. VA assets, including its rich EHR and MVP, the world’s largest genomic cohort, have contributed to improving our understanding and management of COVID-19.
1. Whitbourne SB, Nguyen XT, Song RJ, et al. Million Veteran Program’s response to COVID-19: survey development and preliminary findings. PLoS One. 2022;17(4):e0266381. doi:10.1371/journal.pone.0266381
2. Song RJ, Ho YL, Schubert P, et al. Phenome-wide association of 1809 phenotypes and COVID-19 disease progression in the Veterans Health Administration Million Veteran Program. PLoS One. 2021;16(5):e0251651. doi:10.1371/journal.pone.0251651
3. Galloway A, Park Y, Tanukonda V, et al. Impact of COVID-19 severity on long-term events in US veterans using the Veterans Affairs Severity Index for COVID-19 (VASIC). J Infect Dis. 2022;226(12):2113-2117. doi:10.1093/infdis/jiac182
4. Gaziano L, Giambartolomei C, Pereira AC, et al. Actionable druggable genome-wide Mendelian randomization identifies repurposing opportunities for COVID-19. Nat Med. 2021;27(4):668-676. doi:10.1038/s41591-021-01310-z
5. Hung AM, Sha SC, Bick AG, et al. APOL1 risk variants, acute kidney injury, and death in participants with African ancestry hospitalized with COVID-19 from the Million Veteran Program. JAMA Intern Med. 2022;182(4):386-395. doi:10.1001/jamainternmed.2021.8538
6. Verma A, Huffman JE, Gao L, et al. Association of kidney comorbidities and acute kidney failure with unfavorable outcomes after COVID-19 in individuals with the sickle cell trait. JAMA Intern Med. 2022;182(8):796-804. doi:10.1001/jamainternmed.2022.2141
7. Verma A, Tsao NL, Thomann LO, et al. A phenome-wide association study of genes associated with COVID-19 severity reveals shared genetics with complex diseases in the Million Veteran Program. PLoS Genet. 2022;18(4):e1010113. doi:10.1371/journal.pgen.1010113
8. Peloso GM, Tcheandjieu C, McGeary JE, et al. Genetic loci associated with COVID-19 positivity and hospitalization in White, Black, and Hispanic Veterans of the VA Million Veteran Program. Front Genet. 2022;12:777076. doi:10.3389/fgene.2021.777076
9. Verma A, Minnier J, Wan ES, et al. A MUC5B gene polymorphism, rs35705950-T, confers protective effects against COVID-19 hospitalization but not severe disease or mortality. Am J Respir Crit Care Med. 2022;206(10):1220-1229. doi:10.1164/rccm.202109-2166OC
10. Garvin MR, Alvarez C, Miller JI, et al. A mechanistic model and therapeutic interventions for COVID-19 involving a RAS-mediated bradykinin storm. eLife. 2020;9:e59177. doi:10.7554/eLife.59177
11. Rentsch CT, Kidwai-Khan F, Tate JP, et al. Patterns of COVID-19 testing and mortality by race and ethnicity among United States veterans: A nationwide cohort study. PLoS Med. 2020;17(9):e1003379. doi:10.1371/journal.pmed.1003379
12. King JT, Yoon JS, Rentsch CT, et al. Development and validation of a 30-day mortality index based on pre-existing medical administrative data from 13,323 COVID-19 patients: the Veterans Health Administration COVID-19 (VACO) Index. PLoS One. 2020;15(11):e0241825. doi:10.1371/journal.pone.0241825
13. Joubert W, Weighill D, Kainer D, et al. Attacking the opioid epidemic: determining the epistatic and pleiotropic genetic architectures for chronic pain and opioid addiction. SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. Dallas, TX, USA, 2018:717-730. doi:10.1109/SC.2018.00060
14. Rentsch CT, Beckman JA, Tomlinson L, et al. Early initiation of prophylactic anticoagulation for prevention of COVID-19 mortality: a nationwide cohort study of hospitalized patients in the United States. BMJ. 2021;372:n311. doi:10.1136/bmj.n311
15. Gerlovin H, Posner DC, Ho YL, et al. Pharmacoepidemiology, machine learning, and COVID-19: an intent-to-treat analysis of hydroxychloroquine, with or without azithromycin, and COVID-19 outcomes among hospitalized US Veterans. Am J Epidemiol. 2021;190(11):2405-2419. doi:10.1093/aje/kwab183
The Million Veteran Program (MVP) was launched in 2011 by the US Department of Veterans Affairs (VA) to enroll at least 1 million veterans in a longitudinal cohort to better understand how genes, lifestyle, military experience, and environmental exposures interact to influence health and illness and ultimately enable precision health care. The MVP has established a national, centralized infrastructure for recruitment and enrollment, biospecimen and data collection and storage, data generation and curation, and secure data access. When the COVID-19 pandemic hit in 2020, the MVP was leveraged to support research utilizing the following key infrastructure components: (1) MVP recruitment and enrollment platform to provide support for COVID-19 vaccine and treatment trials and to collect COVID-19 data from MVP participants; (2) using MVP Phenomics for COVID-19 research data cleaning and curation, assisting with the development of a VA Severity Index for COVID-19, and forming 6 scientific working groups to coordinate COVID-19 research questions; and (3) the VA/MVP and US Department of Energy (DOE) partnership to assist in responding to COVID-19 research questions identified by the US Food and Drug Administration (FDA). This article describes these infrastructure components in more detail and highlights key findings from the MVP COVID-19 research efforts.
MVP Infrastructure
The Veterans Health Administration (VHA) Office of Research and Development (ORD) oversaw efforts to develop the VA Coronavirus Research Volunteer List (the COVID-19 registry). To support the registry, the MVP leveraged its infrastructure to facilitate a rapid response. The MVP is designed as a full-service and centralized recruitment and enrollment platform. This includes MVP office oversight; MVP coordinating centers that manage the centralized platform; an information center that handles inbound and outbound calls; an informatics system built for recruitment and enrollment monitoring and tracking; and a network of more than 70 participating MVP sites with dedicated staff to conduct recruitment and enrollment activities. The MVP used its informatics infrastructure to support secure data storage for the registry volunteer information. MVP coordinating center staff worked with the COVID-19 registry to invite > 125,000 MVP participants from approximately 20 MVP sites. Additionally, MVP information center staff made > 4000 calls to prospective registry volunteers. This work resulted in 1300 volunteers agreeing to be
New Data Collection
The MVP protocol was approved by the VA Central Institutional Review Board (IRB) in 2011. As part of initial enrollment in MVP, participants consented to recontact for additional self-report information along with access to their electronic health record (EHR). This allows for the linkage of EHR and survey response data, thus providing a comprehensive understanding of health history before and after a self-reported COVID-19 diagnosis. Between May 2020 and September 2021, the MVP COVID-19 survey was distributed to existing MVP participants via mail, telephone, and email with the ability to complete the survey by paper and pencil or through the MVP online system. Dissemination of the survey was approved by the VA Central IRB in 2020, with nearly 730,000 eligible MVP participants contacted. As of June 2022, 255,737 MVP participants (35% of the eligible cohort) had completed the survey; 86% completed a paper survey while 14% completed it online. Respondents were primarily older (≥ 65 years); 90% were male; close to 7% reported Hispanic ethnicity, and 11% reported Black race.
Findings from this survey provide insight into pandemic behaviors not consistently captured in EHRs, such as psychosocial aspects, including social and emotional support, loss of tangible and intangible resources, as well as COVID-19–related behaviors, such as social distancing and self-protective practices.1 MVP COVID-19 survey data combined with veteran EHRs, responses to other MVP surveys, and genetic data enable MVP researchers to better understand epidemiological, clinical, and psychosocial aspects of the disease. Future COVID-19 studies may use self-reported survey responses to enrich understanding about the effects of the disease on a veteran’s daily life, and possibly validate existing EHR COVID-19 diagnoses and hospitalization findings. This comprehensive data resource provides a unique opportunity to identify new targets for disease prevention, treatment, and management with an emphasis on individual variability in genes, environment, and lifestyle.
COVID-19 Research
In early 2020, the burden of COVID-19 on the US was unprecedented, and little was known about risk factors for severe COVID-19 and deaths. The MVP Phenomics team quickly responded with a large-scale phenome-wide association study (PheWAS) of >
To broaden disease progression data curation and fit the specific needs of the VA, we operationalized and validated the World Health Organization clinical severity scale and used VA EHR data to create the VA Severity Index for COVID-19 (VASIC).3 The VASIC category is now part of the MVP core data repository, where volumes of data from multiple activities are integrated through an automated process to create monthly research-ready data cubes. These activities include extensive data curation, mapping, phenotyping, and adjudication that are performed to curate oxygen supplementation status and other procedures related to treatment that are processed and understood in real time. The data cubes were provisioned to MVP COVID-19 researchers. In addition, the VASIC scale variable is now integrated within the larger VA system for all researchers to use as part of its wider COVID-19 initiative. The VA Centralized Interactive Phenomics Resource (CIPHER) phenomics library now hosts the details of VASIC, codes, metadata, and related COVID-19 data products for all VA communities. In partnership with CIPHER and other internal and external COVID-19 initiatives, the MVP continues to play an integral part for the VA and beyond in the development of a phenomics algorithm for long COVID, or post-acute COVID-19 syndrome (PACS).
Host Genetics in COVID-19
As the SARS-CoV-2 virus continued to spread globally, it became clear that the symptoms and severity of infection experienced by patients varied across a broad spectrum, from being asymptomatic carriers to experiencing severe symptoms in 1 or more organ systems in the body, resulting in death. This variability suggested that host genetics and other host factors may play a role in determining the severity of COVID-19 infection. The MVP dataset, with genetic and health information on > 600,000 MVP participants, provided an ideal dataset to explore host contributions to COVID-19.
In late spring 2020, the MVP executive committee issued a call to the MVP research community to propose study aims around the COVID-19 pandemic that could leverage the phenotypic and genetic data and resources. The MVP quickly formed 6 rapid-response scientific working groups. Their mission was to cultivate collaboration and inclusivity and to coordinate COVID-19 research questions. A steering committee composed of the MVP executive committee, staff from computational environments, working group cochairs, and an administrator, who was responsible for daily oversight of the working groups. In addition, the ORD COVID-19 steering committee reviewed and approved research activities to ensure scientific rigor, as well as alignment with overall ongoing research activities.
The MVP COVID-19 working groups included dozens of researchers who used MVP data to identify disease mechanisms; understand the impact of host genetics on susceptibility, morbidity, and mortality; and identify potential targets for treatments and therapies. The working groups were further supported by MVP analysts to work cross-functionally on genomics, phenomics, statistical genetics, and PheWAS. Each working group chair was responsible for prioritizing concepts and moving them forward in coordination with the MVP and ORD COVID-19 steering committees. An overview of the MVP COVID-19 working groups follows (Table).4-9
Druggable genome. This working group researched drug-repurposing opportunities to prevent severe COVID-19, defined as hospitalization with oxygen therapy (high flow), intubation, mechanical ventilation, vasopressors, dialysis, or death from COVID-19; and prevent complications in patients hospitalized by COVID-19.
Pharmacogenomics. This working group focused on 2 main aims: the impact of apolipoprotein L1 risk variants on acute kidney injury (AKI) and death in Black veterans with COVID-19; and pharmacogenetic analysis of remdesivir-induced liver chemistry abnormalities.
Disease mechanisms. Understanding the underlying pathways and mechanisms behind COVID-19 has been a difficult but important challenge overall in the scientific community. This working group investigated specific genetic markers and effects on COVID-19, including polygenic predisposition to venous thromboembolism associated with increased COVID-19 susceptibility; renal comorbidities and new AKI and unfavorable outcomes among COVID-19–positive sickle cell trait carriers; and mucin 5B, oligomeric mucus/gel-forming gene polymorphism, and protective effects in COVID-19 infection.
Genomics for risk prediction, polygenic risk scores, and mendelian randomization. Risk prediction for COVID-19 has been widely studied mostly aiming at comorbidities and preexisting conditions. The MVP cohort provided a unique opportunity to understand how genetic information can enhance our understanding of COVID-19 risk. This working group focused on: (1) ABO blood group typing and the protective effects of the O blood group on COVID-19 infection; (2) polygenic risk scores and COVID-19 outcomes; (3) human leukocyte antigen typing and COVID-19 outcomes; and (4) a transcriptome-wide association study of COVID-19–positive MVP participants.
Genome-Wide Association Study (GWAS) and Downstream Analysis. This working group performed GWAS of the main COVID-19 outcomes. Results from GWAS unveiled new genetic loci to suggest further investigation on these candidate genes. The results were used by other MVP COVID-19 working groups for their activities. The results also contributed to external collaborations, such as the COVID-19 Host Genetics Initiative.
COVID-19–Related PheWAS. This working group focused on understanding the potential clinical significance of genetic variants associated with susceptibility to, or outcomes of, COVID-19 infection. They worked to identify traits that share genetic variants associated with severe COVID-19 from the Host Genetics Initiative. The group also studied the phenotypic consequences of acquired mosaic chromosomal alterations with early data linking to COVID-19 susceptibility.
COVID-19 Research Partnerships
In 2016, the VA and DOE formed an interagency partnership known as Computational Health Analytics for Medical Precision to Improve Outcomes Now (CHAMPION) to demonstrate the power of combining the VA EHR system, MVP genetic data, and clinical research expertise with DOE high-performance computing infrastructure and artificial intelligence expertise. The VA EHR captures longitudinal care information on veterans with records that go back decades. Furthermore, the VA covers the costs of medications and
The DOE Oak Ridge National Laboratory (ORNL) in Tennessee securely maintains this rich database for the VA. The ORNL Summit supercomputer can complete trillions of calculations per second to provide critical and timely analyses, applying the most advanced and powerful artificial intelligence methods, which would not be possible in more conventional research settings. CHAMPION taught the VA and DOE how to bring their disparate research cultures together for innovative collaborative investigation. Moreover, this collaboration produced a cadre of VA and DOE scientists familiar with VA patient data and experienced in conducting joint research successfully and integrating omics data with clinical data for a better mechanistic understanding. Because of this preexisting collaboration between the VA and DOE, interagency teams were prepared at the start of the COVID-19 pandemic.10-15
Other recently completed studies have developed and validated short-term mortality indices in individuals with COVID-19 based on their preexisting conditions, assessed the generalizability of VA COVID-19 experiences to the US population, and evaluated the effectiveness of hydroxychloroquine with and without azithromycin in VA patients with COVID-19.12,15 A recent study demonstrated the benefit of prophylactic anticoagulation at initial hospitalization.14
The VA also provided the FDA with daily reports on aggregate VA COVID-19 cases and their distribution across the VA system, demographics of VA patients with COVID-19, and analyses of predictive models for positive test results and death. The VA regularly sent the FDA aggregated data showing patterns of medication use and retrospective analyses of the effectiveness of certain medications (including remdesivir and some antithrombotic agents). The FDA used these data along with other data to understand the scope of the pandemic and to predict drug shortages or needs for additional medical equipment, including ventilators.
Limitations
For the most part, MVP infrastructure and partnerships were efficiently leveraged to significantly advance our understanding of the biological basis of COVID-19 and to develop treatments and vaccines. However, a few limitations may have slowed timely and optimal outcomes. One issue, not limited to the MVP or VA, was the continual evolution of the pandemic and the response to it. This included evolving definitions of disease, symptomatology, testing, vaccines, and public health recommendations. Keeping pace with the emerging knowledge from these domains was a struggle for the entire scientific community. A more discrete limitation was the limited number of MVP participants with positive COVID-19 test results and symptoms; however, this was mitigated by partnering with other groups, such as the COVID-19 Host Genetics Initiative, to increase study participant numbers. Finally, there were logistical and regulatory challenges associated with coordinating national clinical trial recruitment across a VA system with > 100 discrete hospitals.
Conclusions
Having a centralized infrastructure for recruitment and enrollment, including a national research volunteer registry, information center, research staff, and coordinating centers, can allow for expedited enrollment in vaccine and treatment trials in the face of future public health emergencies. VA assets, including its rich EHR and the MVP, the world’s largest genomic cohort, have contributed to improving our understanding and management of COVID-19.
1. Whitbourne SB, Nguyen XT, Song RJ, et al. Million Veteran Program’s response to COVID-19: survey development and preliminary findings. PLoS One. 2022;17(4):e0266381. doi:10.1371/journal.pone.0266381
2. Song RJ, Ho YL, Schubert P, et al. Phenome-wide association of 1809 phenotypes and COVID-19 disease progression in the Veterans Health Administration Million Veteran Program. PLoS One. 2021;16(5):e0251651. doi:10.1371/journal.pone.0251651
3. Galloway A, Park Y, Tanukonda V, et al. Impact of COVID-19 severity on long-term events in US veterans using the Veterans Affairs Severity Index for COVID-19 (VASIC). J Infect Dis. 2022;226(12):2113-2117. doi:10.1093/infdis/jiac182
4. Gaziano L, Giambartolomei C, Pereira AC, et al. Actionable druggable genome-wide Mendelian randomization identifies repurposing opportunities for COVID-19. Nat Med. 2021;27(4):668-676. doi:10.1038/s41591-021-01310-z
5. Hung AM, Sha SC, Bick AG, et al. APOL1 risk variants, acute kidney injury, and death in participants with African ancestry hospitalized with COVID-19 from the Million Veteran Program. JAMA Intern Med. 2022;182(4):386-395. doi:10.1001/jamainternmed.2021.8538
6. Verma A, Huffman JE, Gao L, et al. Association of kidney comorbidities and acute kidney failure with unfavorable outcomes after COVID-19 in individuals with the sickle cell trait. JAMA Intern Med. 2022;182(8):796-804. doi:10.1001/jamainternmed.2022.2141
7. Verma A, Tsao NL, Thomann LO, et al. A phenome-wide association study of genes associated with COVID-19 severity reveals shared genetics with complex diseases in the Million Veteran Program. PLoS Genet. 2022;18(4):e1010113. doi:10.1371/journal.pgen.1010113
8. Peloso GM, Tcheandjieu C, McGeary JE, et al. Genetic loci associated with COVID-19 positivity and hospitalization in White, Black, and Hispanic Veterans of the VA Million Veteran Program. Front Genet. 2022;12:777076. doi:10.3389/fgene.2021.777076
9. Verma A, Minnier J, Wan ES, et al. A MUC5B gene polymorphism, rs35705950-T, confers protective effects against COVID-19 hospitalization but not severe disease or mortality. Am J Respir Crit Care Med. 2022;206(10):1220-1229. doi:10.1164/rccm.202109-2166OC
10. Garvin MR, Alvarez C, Miller JI, et al. A mechanistic model and therapeutic interventions for COVID-19 involving a RAS-mediated bradykinin storm. Elife. 2020;9:e59177. doi:10.7554/eLife.59177
11. Rentsch CT, Kidwai-Khan F, Tate JP, et al. Patterns of COVID-19 testing and mortality by race and ethnicity among United States veterans: a nationwide cohort study. PLoS Med. 2020;17(9):e1003379. doi:10.1371/journal.pmed.1003379
12. King JT, Yoon JS, Rentsch CT, et al. Development and validation of a 30-day mortality index based on pre-existing medical administrative data from 13,323 COVID-19 patients: the Veterans Health Administration COVID-19 (VACO) Index. PLoS One. 2020;15(11):e0241825. doi:10.1371/journal.pone.0241825
13. Joubert W, Weighill D, Kainer D, et al. Attacking the opioid epidemic: determining the epistatic and pleiotropic genetic architectures for chronic pain and opioid addiction. SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. Dallas, TX, USA, 2018:717-730. doi:10.1109/SC.2018.00060
14. Rentsch CT, Beckman JA, Tomlinson L, et al. Early initiation of prophylactic anticoagulation for prevention of COVID-19 mortality: a nationwide cohort study of hospitalized patients in the United States. BMJ. 2021;372:n311. doi:10.1136/bmj.n311
15. Gerlovin H, Posner DC, Ho YL, et al. Pharmacoepidemiology, machine learning, and COVID-19: an intent-to-treat analysis of hydroxychloroquine, with or without azithromycin, and COVID-19 outcomes among hospitalized US Veterans. Am J Epidemiol. 2021;190(11):2405-2419. doi:10.1093/aje/kwab183