Table 4 Challenges for the integration of metagenomics into public health

From: Metagenomics for pathogen detection in public health

| Challenge | Description | Relevance | Solution | Reference(s) |
| --- | --- | --- | --- | --- |
| Multiple technologies | Next-generation sequencing can be performed on multiple platforms, each with different characteristics and each constantly under improvement | Difficulty comparing results from different platforms and with those from older techniques; universal approach not yet possible; continuously evolving technology requires a skilled workforce rather than established pipelines | Pipelines must be constantly updated to account for new techniques; different platforms should be utilized depending on the question asked | [74, 76, 92] |
| Computational resources | Our ability to generate DNA sequence data has rapidly surpassed our computational ability to analyze the data | Significant requirements for storage of DNA sequence; assembling and identifying short reads from next-generation sequencing is computationally intensive | Perform analysis using a staged approach; cloud computing | [69, 93] |
| Suitable reference databases | Multiple reference databases are available, which may generate different results depending on the database used | Certain features of a metagenomic sample might be missed if the wrong database is used; limited by the diversity represented in each database | The Human Microbiome Project (HMP) aims to sequence multiple reference genomes associated with the human body; HMP currently has a total of 6,500 reference sequences generated | [94] |
| Short read lengths | Read lengths depend on the sequencing platform used | Makes de novo assembly more complicated; more difficult to identify large-scale genomic variations and repetitive regions | Read lengths are continually increasing; third-generation sequencing platforms promise much longer read lengths | [92, 95] |
| Causation | Finding a pathogen in a disease sample does not imply causation | Important to determine causation before changing public health management; false association can lead to costly, useless or even potentially harmful therapies | Follow-up studies are required, for example using animal models or serological or epidemiological methods; results must be independently validated | [11, 75, 96] |
| Contamination | Metagenomics can detect contaminants from cell cultures, reagents and laboratory equipment | Contaminants may be incorrectly associated with the disease of interest | Negative controls must be used; researchers must consider the plausibility of the findings; results must be independently validated | [97] |
| Privacy | Host nucleic acids are almost always sequenced in metagenomics studies | Host genetic sequences are confidential; human subjects might be traceable from their DNA sequences | Host DNA to be available only to researchers in HMP; only microbiome data are released to the public | [92, 98] |