Decoding Ebola: Next-Generation Sequencing of the Ebola Genome for the FDA-ARGOS Database
An MCMi Regulatory Science Profile
Project: Next-generation Ebola sequencing for the FDA-ARGOS Database
Principal Investigator: Heike Sichtig, PhD
FDA Center: FDA Center for Devices and Radiological Health (CDRH), Division of Digital Health
Researchers: Contribute samples to FDA-ARGOS for free sequencing and analysis
The FDA-ARGOS team and collaborators are searching for unique, hard-to-source microbes such as biothreat organisms, emerging pathogens, and clinically significant bacterial, viral, fungal, and parasitic genomes. We aim to collect sequence information for a minimum of 5 isolates per species. Most-wanted organism list (PDF, 93 KB)
The FDA-ARGOS team employs a hybrid sequencing approach using Illumina and PacBio sequencing technologies pictured below to create the microbial reference sequences.
Each PacBio SMRT Cell contains 150,000 zero-mode waveguides enabling the sequencing of single DNA molecules.
The PacBio RS II platform has revolutionized bacterial genome sequencing. With consistently growing read lengths and increased yield, generating complete bacterial genome sequences has become the standard in the field.
The Illumina HiSeq flowcell is coated in oligonucleotides enabling the capture and sequencing of more than 2.5 billion DNA fragments per run.
The high-throughput, short-read Illumina HiSeq platform is capable of generating more than 1,000 draft-grade bacterial genome sequences per week. (Photos courtesy of the University of Maryland, Institute for Genome Sciences)
Why it's Important
When you need a test to confirm disease in an outbreak, you need it fast. Global public health providers need additional infrastructure and tools like rapid diagnostic tests to combat emerging threats, including Ebola and antimicrobial-resistant pathogens. Next-generation sequencing technologies show promise to improve rapid diagnostic tests, and have the potential to speed development of vaccines, therapeutics and diagnostic devices, all of which would enable quicker actions to protect public health.
About the Project
Early Ebola symptoms can mirror many other diseases. Hospital and clinical labs—those closest to patients—need a way to quickly diagnose or rule out Ebola infection. Rapid diagnosis will enable faster treatment and minimize exposing health care workers to potentially infectious specimens.
Currently, clinical labs use specific tests to identify pathogens (e.g., PCR assays, which identify certain genes, such as those containing antimicrobial resistance markers).1 These tests often require specialized equipment, and can take hours or days, depending on the test and the lab’s capabilities.
Next-generation gene sequencing (NGS) technologies have the potential to innovate how diagnostic testing is performed by allowing rapid identification of an infectious disease without a priori suspicion of the etiological agent. In other words, labs would be able to run diagnostic tests without knowing what disease(s) to test for. However, before hospital labs can conduct this type of test, the infrastructure to support NGS must be developed. A fundamental limitation that must be addressed is “developing reference databases of bacterial genomes, something that shows what the population structure for a particular organism looks like.”2 Libraries of reference sequences for other disease-causing agents like fungi, protozoa, and viruses will also be required for comprehensive diagnosis.
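As a rough illustration of why reference databases matter for agnostic testing, the sketch below assigns a sequencing read to whichever reference shares the most short subsequences (k-mers) with it. This is a toy example under stated assumptions, not the method used by FDA-ARGOS or by any particular diagnostic: the function names, the value of k, and the reference sequences are all made up for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, references, k=8):
    """Return (best_reference_name, fraction_of_read_kmers_shared).

    `references` maps a name to a reference sequence. A read that shares
    no k-mers with any reference is reported as (None, 0.0).
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return None, 0.0
    best_name, best_score = None, 0.0
    for name, ref in references.items():
        shared = len(read_kmers & kmers(ref, k))
        score = shared / len(read_kmers)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical toy references (not real genome sequences).
references = {
    "organism_A": "ATGGCGTACGTTAGCCGATAGGCTTACGATCGGATCCA",
    "organism_B": "TTGACCGGTATTCCGGAATTGGCCAATTGGTTCAAGGC",
}
read = "CGTTAGCCGATAGGCTTACG"
print(classify_read(read, references))
```

A read can only be identified if a suitable reference is present: with an empty or incomplete `references` dictionary, the same read would go unassigned, which is the gap that curated reference libraries are meant to close.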
Both industry and FDA need these types of databases to advance the development of new technologies, and in particular next-generation sequencing diagnostics. Regulatory-grade reference sequence standards are urgently needed to help diagnose—and rule out—Ebola infection.
In this two-year project, begun in May 2015, FDA will expand the existing regulatory-grade reference database FDA-ARGOS (FDA dAtabase for Regulatory Grade micrObial Sequences). FDA will add reference sequences for Ebola, closely related filoviruses such as Marburg, and organisms that cause infections with symptoms similar to Ebola. Research on many of these organisms requires high biosafety containment levels, such as BSL-4 labs, which are not readily available to industry. Fortunately, sequencing only requires the nucleic acids of these organisms, which may be obtained from samples that have been rendered non-infectious.
Working with other government partners, FDA will acquire nucleic acids from these organisms, transfer them to sequencing facilities, and place the resulting high-quality NGS data in publicly accessible NCBI databases. Diagnostic manufacturers and FDA can then use computer simulations to supplement actual testing when assessing how well new diagnostics perform. This would allow in silico reference analysis, rather than live-organism studies, to be used to assess the performance of a diagnostic device.
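A minimal sketch of what an in silico check could look like is shown below. It asks whether a hypothetical assay probe sequence is found in a set of target reference sequences (inclusivity) and is absent from near-neighbor sequences (exclusivity). The probe, the sequences, and the function names are invented for illustration, and exact string matching is a deliberate simplification; real evaluations would tolerate mismatches and use alignment tools.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def probe_hits(probe, references):
    """Map each reference name to True if the probe matches either strand."""
    probe = probe.upper()
    rc = reverse_complement(probe)
    return {
        name: (probe in seq.upper() or rc in seq.upper())
        for name, seq in references.items()
    }


def in_silico_performance(probe, targets, near_neighbors):
    """Return (inclusivity, exclusivity) as fractions between 0 and 1."""
    target_hits = probe_hits(probe, targets)
    neighbor_hits = probe_hits(probe, near_neighbors)
    inclusivity = sum(target_hits.values()) / len(target_hits)
    exclusivity = sum(not hit for hit in neighbor_hits.values()) / len(neighbor_hits)
    return inclusivity, exclusivity


# Hypothetical toy data (not real viral sequences).
probe = "GCTAGCTAGGATCC"
targets = {
    "target_isolate_1": "ATGGCTAGCTAGGATCCGATCG",   # probe on forward strand
    "target_isolate_2": "TTGGATCCTAGCTAGCAAC",      # probe on reverse strand
    "target_isolate_3": "ATGGCTAGCTTGGATCCGATCG",   # one-base variant, missed
}
near_neighbors = {
    "neighbor_1": "ATGACCTTGGAACCGTTAGC",
    "neighbor_2": "CCGGTTAACCGGTTAACC",
}
inclusivity, exclusivity = in_silico_performance(probe, targets, near_neighbors)
print(f"inclusivity={inclusivity:.2f}, exclusivity={exclusivity:.2f}")
```

The missed third isolate shows why several isolates per species are useful in a reference set: a single reference would not reveal that the probe fails on a variant.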
What are the outcomes?
There is a significant gap, and in some cases a total absence, in the public domain of high-quality Ebola genomic sequence data and of sequences from diseases that might be mistaken for Ebola. This project will develop the genomic sequencing data and standards needed for diagnosing and ruling out Ebola infection. Specific aims include:
- Evaluating genomic sequence information gaps, and identifying material sources.
- Developing acceptance criteria for quality sequences to be included in the database, staging sequencing and assembling draft genomes that meet the quality criteria, and re-evaluating high-quality sequences against test panels to confirm quality parameters.
- Establishing a portal through NCBI containing only sequence deposits that meet FDA-defined quality parameters. FDA will disseminate this information to clinical end users and test developers, and encourage use of the genomic sequence quality criteria and standards.
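To make the idea of acceptance criteria concrete, the sketch below computes a few common assembly metrics (contig count, N50, and fraction of ambiguous N bases) and compares them with thresholds. The thresholds and function names are hypothetical placeholders; the actual FDA-defined quality parameters are not stated on this page and are not reproduced here.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def assembly_metrics(contigs):
    """Summarize a draft assembly given as a list of contig strings."""
    lengths = [len(contig) for contig in contigs]
    total = sum(lengths)
    n_count = sum(contig.upper().count("N") for contig in contigs)
    return {
        "contigs": len(contigs),
        "total_length": total,
        "n50": n50(lengths),
        "n_fraction": n_count / total if total else 0.0,
    }


def meets_criteria(metrics, max_contigs, min_n50, max_n_fraction):
    """Check metrics against thresholds (all thresholds are assumptions)."""
    return (
        metrics["contigs"] <= max_contigs
        and metrics["n50"] >= min_n50
        and metrics["n_fraction"] <= max_n_fraction
    )


# Hypothetical toy assembly: three short contigs.
contigs = ["ACGT" * 50, "ACGTN" * 10, "AC" * 10]
metrics = assembly_metrics(contigs)
print(metrics)
print(meets_criteria(metrics, max_contigs=5, min_n50=100, max_n_fraction=0.05))
```

Keeping the metric calculation separate from the pass/fail decision means the same metrics can be re-evaluated if the thresholds are revised.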
Technical Details: About the Database
Infectious disease NGS-based diagnostics differ from human genome NGS in several ways:
- Absolute need for immediate and actionable results
- Broad range of specimen types (e.g., urine, blood, CSF, stool, sputum, and others)
- Large diversity of the infectious disease agents possibly present within a single specimen
- Dynamic nature of infectious disease agents (spatio-temporal)
As part of the FDA-ARGOS project, FDA received and sequenced three Ebola Makona isolates from the beginning of the 2014-2015 West African Ebola outbreak as challenge materials (Public Health Agency of Canada (PHAC) isolates C05, C07 and C15). We performed three different sequencing approaches (shotgun, IGS-designed amplicon, and RACE) using Illumina sequencing technology.
Metadata, raw reads, qualified assemblies and consensus sequences are available as part of the public FDA-ARGOS database accessible through NCBI. FDA presented results at the Filovirus Animal Non-Clinical Group (FANG) meeting in May 2015 on the Fort Detrick campus.
These Ebola Makona genomic sequences are regulatory-grade, high-quality sequences for use in medical countermeasure (MCM) diagnostic test, vaccine, and therapeutic development. FDA also received genomic sequences from the filovirus panel that includes historical Ebola isolates from the U.S. Army Medical Research Institute of Infectious Diseases (USAMRIID), and we have several other Ebola isolates in the pipeline for sequencing or qualification into FDA-ARGOS.
Project Updates
In 2015, FDA awarded additional funding to the Genomics Resource Center (GRC) at the University of Maryland (UMD) School of Medicine, Institute for Genome Sciences, to fill in existing sequence information gaps, and formulate a quality standard for diagnostic applications. The GRC will generate hybrid draft genome sequences of 1,380 bacterial, viral, fungal, and parasite pathogen isolates. These new genome sequences—including Ebola genomes—will extend and enhance the 552 genome sequences already in progress.3
In 2016, FDA awarded additional funding to the UMD GRC for genome sequencing of mosquito-borne viral pathogens in support of FDA-ARGOS. This work will support and inform Zika response activities.4
Featured Publication
FDA-ARGOS is a database with public quality-controlled reference genomes for diagnostic use and regulatory science
Nature Communications 10, Article number: 3313 (2019)
From the PI
Given the trend to make technology gadgets smaller and more portable, one can imagine NGS as an implantable or wearable device in the future. What if we could detect an infection before signs and symptoms manifest? And what if NGS could be used as a detective tool for completely unknown agents? NGS technology for infectious disease diagnostics has the potential to revolutionize the field through its foremost ability to detect emerging infectious disease agents using agnostic (metagenomics) sequencing, especially in an emergency situation.
The diagnostic detection ability of NGS-based tests requires access to high-quality, regulatory-grade microbial genomic reference sequences of the infectious disease target and its near neighbors. The FDA-ARGOS project has funding to add 1,500 high-quality medical countermeasure and clinically relevant microbial sequences to the public domain.
A hybrid sequencing approach using Illumina and PacBio sequencing technologies is used to create the microbial reference sequences. These reference sequences support development and validation of infectious disease diagnostic tests, and have the potential to speed development of vaccines and therapeutics.
Photo: FDA-ARGOS Team (from right: Uwe Scherf, Sally Hojvat, Heike Sichtig, Brittany Goldberg, Kevin Snyder). Credit: Diane Garrett, FDA
Dr. Heike Sichtig is a principal investigator and lead technical regulatory scientist in FDA’s Office of In Vitro Diagnostics and Radiological Health in the Division of Microbiology Devices. She is leading the highly collaborative effort to develop FDA-ARGOS. She joined the Division of Microbiology Devices in 2012 and is primarily focusing on enabling NGS-based technologies for clinical diagnostics. She is part of a multidisciplinary team that is developing and implementing the concepts for validation and evaluation of NGS-based infectious disease diagnostic devices.
Dr. Sichtig obtained a BS/MS in Computer Science/Statistics from Kean University in 2002 and 2003, and a PhD in Biomedical Engineering from Binghamton University in 2009. She did her postdoctoral training at the University of Florida/Genetics Institute in Gainesville, FL in pathogen signatures, transcriptional regulation and epigenetics. Dr. Sichtig received the Commissioners’ Special Citation award in 2014 and 2015, and several other awards related to FDA review and regulatory science work.
Partnerships
FDA is collaborating with other organizations on this research, including the University of Maryland Genomics Resource Center (GRC), in the Institute for Genome Sciences (IGS).
University GRC directors Dr. Lisa Sadzewicz and Luke Tallon are leading the university's IGS component of the FDA-ARGOS project.
This project builds on FDA collaborations with partners including:
- National Institutes of Health (NIH) National Center for Biotechnology Information (NCBI) and National Institute of Allergy and Infectious Diseases (NIAID)
- Department of Defense (DoD) U.S. Army Medical Research Institute of Infectious Diseases (USAMRIID), Critical Reagents Program (CRP), and Defense Threat Reduction Agency (DTRA)
- Institute for Genome Sciences (IGS) at the University of Maryland School of Medicine
- Lawrence Livermore and Los Alamos National Laboratories (LLNL and LANL)
References and Further Reading
- More from FDA about FDA-ARGOS, including a description with additional resources, facts about the project, and a list of collaborators
- Burkholderia genomes generated via DTRA’s Threat Characterization Consortium, in a similar project
- Chain PS, Grafham DV, Fulton RS, et al. Genome project standards in a new era of sequencing. Science. 2009 Oct 9;326(5950):236-237.
- Gire SK, Goba A, Andersen KG, et al. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science. 2014 Sep 12;345(6202):1369-1372.
- Koehler JW, Hall AT, Rolfe PA, et al. Development and evaluation of a panel of filovirus sequence capture probes for pathogen detection by next-generation sequencing. PLoS One. 2014 Sep 10;9(9): e107007.
- Ladner JT, Beitzel B, Chain PSG, et al. Standards for sequencing viral genomes in the era of high-throughput sequencing. mBio. 2014 May-Jun;5(3): e01360-14.
- Sichtig H, Minogue T, Yan Y, et al. FDA-ARGOS is a database with public quality-controlled reference genomes for diagnostic use and regulatory science. Nat Commun. 2019 Jul;10:3313.
- Underwood A, Green J. Call for a quality standard for sequence-based assays in clinical microbiology: necessity for quality assessment of sequences used in microbial identification and typing. J Clin Microbiol. 2011 Jan;49(1):23-26.
- Ware L. Sequencing Outbreaks [Internet]. New York: Biotechniques. 2014 Aug 25 [cited 2015 Aug 25].
Footnotes
1. Ware L. Sequencing Outbreaks [Internet]. New York: Biotechniques. 2014 Aug 25 [cited 2015 Aug 25].
2. Ware, L.
3. This additional sequencing work was awarded to UMD under contract HHSF223201510106C on September 18, 2015, for $2,703,304.
4. This additional sequencing work was awarded to UMD under contract HHSF223201610073C on September 5, 2016, for $303,283.
Definitions
CDRH – FDA Center for Devices and Radiological Health – CDRH facilitates medical device innovation by advancing regulatory science, providing industry with predictable, consistent, transparent, and efficient regulatory pathways, and assuring consumer confidence in devices marketed in the U.S.
FDA-ARGOS – FDA dAtabase for Regulatory Grade micrObial Sequences, supporting development and validation of infectious disease diagnostic tests
NGS – next-generation sequencing (also known as high-throughput sequencing) is a term used to describe new technologies for sequencing DNA and RNA faster and at a lower cost than older sequencing methods
PCR – polymerase chain reaction – a molecular biology technique that amplifies specific segments of DNA, used for a variety of applications including pathogen detection and DNA sequencing
Presentations
- February 1, 2017: FDA’s Role and Tools for ID-NGS Diagnostics (PDF, 1.7 MB) - presentation at the Department of Homeland Security Sequencing Meeting (Washington, DC)
- March 16-17, 2017: A Biocompute Object For FDA-ARGOS Reference Genomes (PDF, 827 KB) - presentation at the NIH HTS Computational Standards for Regulatory Sciences Workshop (Bethesda, MD)
- June 1, 2017: FDA-ARGOS Microbial Reference Genomes for Regulatory Use: Zika and Ebola - presentation at the 2017 FDA Science Forum by Heike Sichtig, PhD (webinar recording - this presentation is at 1:11:39-1:29:13)
- November 8-9, 2017: FDA's Role in Building the ID NGS Diagnostic Toolkit (PDF, 2 MB) - presentation at the Fraunhofer Life Science Symposium, Leipzig, Germany
- March 10-15, 2019: Diagnostic Genomes & the Precision FDA Biothreat Challenge (PDF, 2.2 MB) - presentation at Molecular Med Tri-Con 2019, San Francisco, CA
- September 9-10, 2019: Genomics Data Sharing and GWG Roadmap (webcast recording, 1 hour, 23 minutes) - presentation at the 7th Annual FDA Scientific Computing Days (Silver Spring, MD)
Related Links
- Database for Reference Grade Microbial Sequences (FDA-ARGOS)
The purpose of this website is to provide information about FDA-ARGOS, and to support and foster the development of ID-NGS technology.
- FDA-ARGOS Database (NCBI)
FDA dAtabase for Regulatory Grade micrObial Sequences
- FDA ARGOS Team – DoD USAMRIID Collaboration on Biothreat Detection
PrecisionFDA challenge
- Ebola Preparedness and Response Updates from FDA
- In Vitro Diagnostics
In vitro diagnostics are tests that can detect diseases, conditions, or infections