Human genomics and preparedness for infectious threats

Public health preparedness requires effective surveillance of and rapid response to infectious disease outbreaks. Inclusion of research activities within the outbreak setting provides important opportunities to maximize limited resources, to enhance gains in scientific knowledge, and ultimately to increase levels of preparedness. With rapid advances in laboratory technologies, banking and analysis of human genomic specimens can be conducted as part of public health investigations, enabling valuable research well into the future.


Introduction
Despite major progress toward understanding infectious agents and controlling their spread, new and evolving infectious diseases, as well as old diseases in new contexts, continue to pose threats to humans worldwide. In 1992, the Institute of Medicine published an influential report calling attention to the emergence and re-emergence of human pathogens as a consequence of such factors as evolutionary changes in infectious agents and their human and non-human hosts; alterations in host behaviors and travel; and naturally occurring and man-made shifts in ecology, geography, and environment [1]. During the following decade, renewed concern about microbial threats to health spurred new investments in scientific research and public health infrastructure. In 2003 the Institute of Medicine published a report entitled Microbial Threats to Health, which highlighted the need for a global approach to preparedness [2]. That same year, the severe acute respiratory syndrome (SARS) epidemic acutely challenged the response capacity of scientists and public health officials across the globe [3,4]. Advances in high-throughput genome sequencing technology played a pivotal role in identifying the novel coronavirus associated with SARS and in facilitating the development of assays for diagnosis and control [5].

Technological advances
Public health investigations of infectious diseases have relied increasingly on molecular epidemiology since the introduction of restriction fragment length polymorphism (RFLP) analysis in the 1980s. The first full genome sequence for a human bacterial pathogen, Haemophilus influenzae, was completed in 1995 [6]. Since then, the development of sequencing technologies has made genomic analysis of emerging pathogens easier, faster, and less expensive; instead of taking months or weeks, such investigations can often be accomplished in days. Recently, Musser and Shelburne reviewed a decade of progress in pathogenomic analysis of group A streptococcus infections, made possible by technical advances, including low-cost DNA sequencing, microarray technology, and high-throughput proteomics [7]. Application of these techniques has uncovered new virulence factors and provided insights into bacterial-host interactions, which are important for preventing invasive infections and developing effective vaccines.
More recently, within days of the initial identification of the first cases of 2009 pandemic influenza A (H1N1) in spring 2009, scientists had identified the origin of all eight influenza virus gene segments. Within two weeks, the Centers for Disease Control and Prevention (CDC) began to distribute RT-PCR diagnostic test kits to public health laboratories under a quickly granted emergency use authorization by the US Food and Drug Administration [8-10].
Developments in informatics have been crucial for the successful application of genomics to infectious disease research [11]. Making research data freely accessible in continuously updated, online databases further enhances their utility for public health investigation. Such resources recently allowed researchers to compare sequences of the pandemic 2009 H1N1 influenza virus with other influenza viruses, to quickly identify potentially important features [12]. As part of its Influenza Virus Resource [13], the National Center for Biotechnology Information (NCBI) has created a specific resource for H1N1 influenza genome sequence data [14]; as of September 2009, the database contained more than 1,100 different nucleotide sequences and over 400 full-length viral genomes for 2009 H1N1 influenza viruses. NCBI's Entrez Genome database provides whole genome sequences for more than 1,000 organisms, including Homo sapiens, as well as bacteria, viruses, and parasites that cause human disease [15]. The CDC has quickly shared virus sequence data on public websites [16]. Other research organizations, such as the Viral Bioinformatics Resource Center and the Wellcome Trust's Sanger Institute, have also developed repositories of genomic information of public health importance [17,18].

Nicole F Dowling*, Marta Gwinn* and Alison Mawle†
Addresses: *Office of Public Health Genomics, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA. †National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA 30333, USA.
Correspondence: Nicole F Dowling. Email: ndowling@cdc.gov
Abbreviations: CDC, Centers for Disease Control and Prevention; GWAS, genome-wide association study; NCBI, National Center for Biotechnology Information; RFLP, restriction fragment length polymorphism; RT-PCR, reverse-transcriptase polymerase chain reaction; SARS, severe acute respiratory syndrome.

Advances in genomics and the completion of the Human Genome and HapMap Projects have opened the door to research on the role of human genetic variation in the population distribution, transmission, and severity of infectious diseases. Published studies in 'human genome epidemiology' have been tracked since 2001 in a CDC-sponsored database, HuGE Navigator, which is freely available online for quick searching on human genes, diseases, and environmental factors, including pathogens [19,20]. Of the more than 40,000 studies in the database, most focus on chronic diseases; however, more than 2,000 so far are related to infectious diseases, including several genome-wide association studies (GWASs). In recent years, GWASs have become a powerful tool for systematically searching the human genome for novel associations with infectious diseases, including tuberculosis, malaria, and HIV [21-24].

Preparedness for research
Public health efforts to control and prevent infectious diseases are based on epidemiologic and laboratory surveillance systems that detect and monitor disease incidence, define pathogen characteristics, and track cases 'by person, place and time' [25]. Molecular epidemiology has long been a mainstay of public health surveillance; now, increasingly powerful molecular methods, including sequencing of whole pathogen genomes, help investigators to identify epidemiologically related cases, describe patterns of transmission, pinpoint sources of infection, and explain antimicrobial resistance [26-33]. Combining the methods of molecular pathogenomics with population genetics and epidemiology can provide new insights into the episodic behavior of epidemics of familiar pathogens, such as group A streptococcus [7]. Archived biological samples can also provide new insights into the emergence and evolution of infectious threats, from HIV [34] to influenza [35]. Building the capacity for human biological sample collection into existing surveillance networks has the potential to facilitate a more comprehensive, population-based evaluation of genomic and environmental determinants of health outcomes. For example, a meta-analysis of surveillance cohorts from Arizona, Colorado, California, and Illinois demonstrated that human homozygosity for CCR5delta32 (a non-functional variant of chemokine receptor CCR5) is consistently associated with symptomatic West Nile virus infection [36]. When disease surveillance is conducted independently in multiple jurisdictions, the capacity to share biological samples and epidemiologic and clinical data provides important infrastructure for research.
Laboratory infrastructure for pathogen genotyping and sequencing is readily adaptable to the analysis of human genomes; indeed, human genome studies can now be accomplished in the time that pathogen studies required only a decade ago. Additional planning and investment are needed to support the collection and storage of specimens that allow for comprehensive evaluation of key attributes of pathogen and host. In the United States, public health agencies routinely collect, store, and analyze data from individuals to identify and control public health threats; regulations regarding privacy and human subjects' research protections do not always apply to these activities [37]. Preparedness for research in the public health setting should explicitly address protection of human subjects, for example, through development of research protocols that have been pre-approved by review boards to pave the way for systematic research and data collection.
In an infectious disease outbreak, the immediate priority is to limit and contain the threat to human health and wellbeing. Research during and following the outbreak can also be important for developing effective treatments and preventive measures, and for guiding public health policies. Recently, research conducted within ongoing investigations of H1N1 infection has demonstrated that pregnant women may be at elevated risk for complications from infection [38]. These findings have led to recommendations to prioritize pregnant women for vaccination and, if infected, for antiviral therapy. Several published candidate gene studies of susceptibility to viral and bacterial diseases have demonstrated the feasibility and utility of integrating host genomics into epidemiologic studies and surveillance [36,39-42]. The greatest public health impact of such research may be through accelerating the development of preventive vaccines; however, novel diagnostic tests developed for clinical use may also prove useful for epidemic monitoring and triage during outbreaks. For example, scientists recently demonstrated that human gene expression profiles are an effective tool for determining the etiology of respiratory infections, providing a striking example of rapid translation from basic research to potential clinical and public health application [43].
In a recent editorial titled 'Epidemic science in real time', Fineberg and Wilson underscored the critical importance of 'conducting the right science and communicating expert judgment' to 'enable policies to be adjusted appropriately as an epidemic scenario unfolds' [44]. They emphasized that in times of diminishing public health resources, scientists from diverse disciplines (epidemiology, laboratory, social sciences) must work together to respond to immediate threats and follow through with research to understand key attributes of the affected populations and the disease process. The results of such research are needed to inform policy, to develop treatments and interventions, and to update and adjust recommendations as the state of knowledge changes.

Conclusions
Integrating genomics research into the context of public health surveillance and response can help maximize the use of limited resources, enhance the exchange and growth of scientific knowledge, and increase preparedness for infectious threats. Such research should be based on sound protocols that protect human subjects. Specimens should be processed and banked, enabling future research on genetic variation of both pathogen and host, as well as gene expression profiles, proteomics, and other measures. Epidemiologic data about environment and behaviors should be collected and stored to support additional analysis of gene-environment interactions. Such efforts will require a shift in culture and broadening of traditional public health definitions of preparedness and response, research, and collaboration.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
NFD, MG, and AM were involved in drafting the manuscript and have given final approval of the version to be published.