The genomics of preterm birth: from animal models to human studies

Preterm birth (delivery at less than 37 weeks of gestation) is the leading cause of infant mortality worldwide. So far, the application of animal models to understand human birth timing has not substantially revealed mechanisms that could be used to prevent prematurity. However, with amassing data implicating an important role for genetics in the timing of the onset of human labor, the use of modern genomic approaches, such as genome-wide association studies, rare variant analyses using whole-exome or genome sequencing, and family-based designs, holds enormous potential. Although some progress has been made in the search for causative genes and variants associated with preterm birth, the major genetic determinants remain to be identified. Here, we review insights from and limitations of animal models for understanding the physiology of parturition, recent human genetic and genomic studies to identify genes involved in preterm birth, and emerging areas that are likely to be informative in future investigations. Further advances in understanding fundamental mechanisms, and the development of preventative measures, will depend upon the acquisition of greater numbers of carefully phenotyped pregnancies, large-scale informatics approaches combining genomic information with information on environmental exposures, and new conceptual models for studying the interaction between the maternal and fetal genomes to personalize therapies for mothers and infants. Information emerging from these advances will help us to identify new biomarkers for earlier detection of preterm labor, develop more effective therapeutic agents, and/or promote prophylactic measures even before conception.

preterm infants born at 36 weeks than those born at 34 weeks [5], infants categorized as late preterm are more likely than their term counterparts to experience difficulties with feeding, jaundice and respiratory distress [6] (Table 1). In addition to being particularly prone to infection as well as to metabolic and respiratory complications, late preterm infants are six times more likely to die within their first week of life and three times more likely to die within their first year of life compared with infants born at term [3].
Initiation of parturition before 37 weeks of gestation, most often occurring for an unidentifiable reason [7], may manifest as spontaneous labor or preterm premature rupture of membranes [7,8]. Controversy exists as to whether these two phenotypes reflect the same underlying pathology or distinct mechanisms. Although the pathogenesis of preterm labor is not well understood, a plethora of maternal risk factors associated with a higher incidence of preterm birth have been identified, including classification as black or African-American, maternal stress, maternal age, tobacco use and surgical intervention for cervical disease [7,9,10] (Table 2).
One important maternal risk factor for preterm labor may be grounded in genetics (Table 2). The risk of a woman giving birth preterm is almost doubled if her sister has given birth to a preterm infant [11]. Mothers who were themselves born preterm are more likely to deliver preterm [12]. A mother's risk of delivering an infant preterm is four times higher if one of her previous children was delivered preterm [11]. However, the risk for a woman delivering preterm is unaffected by the history of preterm children of her partner with other women or by members of her partner's family [13].
To begin to tease apart the pathogenesis of preterm labor, mouse, rat, guinea pig, sheep and non-human primate models have been useful. However, each model organism has certain disadvantages that make it difficult to portray and study human preterm labor accurately [14,15]. Because of the disadvantages of drawing comparisons between the pathophysiology of preterm birth in animal models and humans, coupled with the strong genetic foundation of preterm labor, it may be most effective to use human genetics and genomics to elucidate the mechanisms underlying preterm birth.
Understanding the determinants of a healthy pregnancy is necessary for transformative advances in women's and infants' health. A healthy pregnancy requires both conserved structural and, more uniquely, temporal components. The temporal components, that is, the programming that determines birth timing, will serve as the focus of this review. We will describe mechanistic findings that have emerged from animal studies, their limitations as applied to human parturition, evidence for genetic contributors to the risk for preterm birth in humans, and available results from human genetic and genomic investigations. We conclude with potential areas that could prove fruitful in further elucidating the genomics of preterm birth.

Animal models: uses and limitations
The use of animal models to study the events leading up to and throughout birth has provided significant insight into the mechanisms regulating parturition, at term and preterm. However, the applicability of current animal models of parturition to the physiological mechanisms of human pregnancy and birth has been limited, as the means by which these different species regulate and initiate parturition differ from each other and from humans.
Perhaps the longest established animal model for birth timing is the sheep. The study of parturition in sheep is particularly relevant to human birth for three reasons: the gestation length and number of offspring per gestation are closer to those of humans than in most of the common models in use; the sheer size of pregnant ewes and their fetuses makes experimental manipulation easy; and a shift in the site of progesterone production from the corpus luteum of the ovary to the placenta occurs during pregnancy in both women and ewes [16]. However, in contrast to the human simplex uterus, sheep have a bicornuate uterus, allowing them to maintain one or two fetuses per gestation, and a cotyledonary placenta as opposed to the discoid placenta found in primates, indicating that the mechanisms controlling parturition in sheep may be different from those in humans. Previous studies in ewes have also shown that parturition events depend on fetal regulation through the hypothalamic-pituitary-adrenal axis that ultimately results in a decrease in circulating maternal progesterone levels, that is, overt progesterone withdrawal [16], which does not occur in human parturition, limiting the usefulness of sheep for modeling the events of human parturition and preterm birth (Table 3).
Table fragment: Patent ductus arteriosus: average of 46% between 22 and 28 weeks of gestation [73]. Bronchopulmonary dysplasia: variation from 12 to 32% in infants born at less than 32 weeks of gestation [74].
Mice have proven useful for furthering our understanding of the events leading up to parturition owing to the ease with which the mouse genome can be manipulated. Genes with the potential to influence parturition have been targets for the generation of knockout models.
Several components of the cascade of events that occur during birth are conserved in both mice and humans (Figure 1), including prostaglandins, which serve as uterine contractile agonists, and contraction-associated proteins (CAPs), which activate the myometrium and facilitate its response to stimulants [17]. Mice deficient in enzymes necessary for the synthesis of prostaglandins (cytoplasmic phospholipase A2 (cPLA2) and cyclooxygenase 1 (COX1), for example) have shown delayed labor, which can be reversed by treatment with a progesterone receptor antagonist or by administration of exogenous prostaglandin F2α (PGF2α), which stimulates luteolysis (degeneration of corpus luteum function for progesterone synthesis) [18–21]. The significance of prostaglandins as uterine contractile agonists has been shown in studies of mice with reduced 15-hydroxyprostaglandin dehydrogenase (15-HPGD), an enzyme responsible for the metabolism of PGF2α as well as prostaglandin E2 (PGE2), the expression of which has been shown to decrease in the chorionic trophoblast of women in labor [22–25]. These mice deliver a day early without luteolysis, as demonstrated by a lack of progesterone withdrawal-induced labor [26]. At term, the expression of CAPs such as the oxytocin receptor (OXTR) and connexin 43 (CX43) in humans and mice increases in the myometrium [27]. Although the loss of oxytocin and OXTR in mice does not alter parturition [28–30], the concomitant loss of oxytocin and COX1 in mice leads to prolonged parturition that starts at normal term. This suggests that oxytocin maintains a luteotrophic role opposing the luteolytic role of COX1, which affects the presence of the contractile agonist PGF2α, allowing for normal progesterone withdrawal in these mice [21].
The loss of CX43 (responsible for coordinating contractions in myometrial cells during labor) in smooth muscle tissues leads to slightly delayed parturition, even though these mice experienced normal upregulation of OXTR and progesterone withdrawal [31]. Despite the similarities, the physiology of pregnancy and birth differs significantly between mice and humans. Mice have a bicornuate uterus and tend to have large litters (typically ranging from six to eight pups), and thus the mechanisms of uterine activation are likely to differ from those of humans. The primary source of steroid hormone production in mice is the corpora lutea of the ovaries throughout pregnancy, whereas humans undergo a luteal-placental shift of steroid production during pregnancy. Additionally, before parturition mice undergo progesterone withdrawal, limiting the relevance of pregnancy and parturition studies of this model organism to humans (Table 3).
Although guinea pigs are not as commonly used in research as mice and rats, the study of parturition in guinea pigs has revealed this process to be more similar to that of humans than it is in mice or rats. Throughout gestation up until parturition, guinea pigs maintain high levels of maternal serum progesterone and, consequently, do not require progesterone withdrawal to initiate labor, similar to human labor. Also like humans, guinea pigs have a hemomonochorial type of placentation, and the placenta, after the first four weeks of gestation, is the predominant source of progesterone [15]. Although a 7X coverage genome sequence has recently become available, the use of guinea pigs to model human parturition and preterm birth is limited by a long gestation (67 days) relative to other rodent models, which increases the time required to perform experiments, as well as by limited protein and cDNA reagents and suboptimal molecular genetic techniques for this model system (Table 3).
Non-human primates present a more analogous model organism for the study of parturition and preterm birth, as their reproductive biology is most similar to that of humans. Rhesus macaques, like humans, do not undergo maternal serum progesterone withdrawal at term. Great apes and humans experience a continual rise in progesterone concentration throughout the pregnancy, peaking at term, in comparison to Old World monkeys, which have basally low, unchanging levels of progesterone, and New World monkeys, which undergo progesterone withdrawal [27]. Although non-human primates are physiologically representative model organisms for understanding human parturition and preterm birth, the expense and time necessary to maintain and study these animals, their long gestation periods (greater than 5 months), the unfeasibility of genetic manipulation, and the lack or limitation of optimized reagents make their use less practical (Table 3).
Although the currently employed model organisms for parturition and preterm birth have their benefits, perhaps the best model for human parturition and preterm birth is humans themselves. Analyses of sequences across human, chimpanzee and mouse genomes indicate that one of the most divergent functional categories includes the genes involved in reproduction [32]. The use of computational biology coupled with the rapidly expanding availability of mammalian genomes opens the possibility of isolating rapidly diverging genes or conserved non-coding sequences that may be relevant to human parturition, and identifying interesting variants in genome-wide association studies, which would provide potential targets that may determine birth outcome [27].

A role for genetics in human preterm birth
Preterm birth is etiologically complex, with contributions from the environment and genetics, which could involve both maternal and fetal genomes [33]. The analysis of birth timing concordance in the offspring of twins demonstrated that both maternal and fetal genetic effects contributed to defining the variance in gestational age (Table 4) [33]. Studies of correlations between genetic relatedness and trait concordance have calculated that genetic effects account for 25 to 40% of variation in fetal growth rate and gestation duration; additionally, birth weight, small size for gestational age, and preterm birth had significant genetic contributions [34]. Comparisons between full siblings and half siblings facilitate studies of variable relatedness within a shared environment, allowing for the study of genetic contribution as well as differences between maternal genetic contribution (calculated at 14%) and fetal genetic contribution (calculated at 11%) [35].
Preterm birth is a trait that appears to be transmitted primarily in a matrilineal manner across generations; the risk of a woman having a preterm delivery is increased if her mother, full sisters or maternal half-sisters have had preterm deliveries, but is unaffected by the occurrence of preterm deliveries in her paternal half-sisters or in members of her partner's family [13]. Being born before 37 weeks of gestation increases a woman's own risk of preterm delivery by almost 20% [11], and having a previous preterm delivery confers an increased risk of recurrent preterm delivery [13]. Although the paternal contribution to birth timing in the context of fetal genetic influences has been somewhat controversial, data suggest a smaller role for paternal compared with maternal fetal genes in birth timing [36,37].
On the basis of this evidence of a genetic component in the timing of birth, genetic studies have investigated both maternal and fetal genomes, as both genotypes can affect perinatal outcomes [38]. Thus far, most human genetics

Candidate gene studies
Perhaps the most commonly studied pathways for potential candidate genes are those involved in immunity and inflammation, as inflammatory factors have been suggested to play a role early in the transition from quiescence to active labor in term as well as preterm birth [42]. The induction of pro-inflammatory mediators has been implicated in the onset of parturition, particularly tumor necrosis factor (TNF)-α and its receptors (TNFR1 and TNFR2). Although some studies have detected polymorphisms in these genes that alter the risk for preterm birth in either the mother or fetus, the results have generally failed to be replicated [39] or have not been generalized across populations [43,44]. Analysis of pro-inflammatory and anti-inflammatory interleukins and their receptors has generally revealed no consistent association with preterm birth, with mixed results for interleukin (IL)-4, IL-6 and IL-10 as well as for interleukin receptors 1 (IL1R) and 6 (IL6R) [39]. Examination of other cytokines and pathogen recognition genes has similarly not provided compelling evidence for association with preterm birth.
Another critical process for the initiation of parturition involves the transition of the uterus from a quiescent tissue, influenced by mediators that inhibit contraction, to a synchronously contracting tissue. Searching for candidate genes involved in premature uterine contraction has revealed the potential association with preterm birth risk of two polymorphisms in the gene encoding the β2-adrenergic receptor (ADRB2), which is responsible for modulating uterine muscle contractions through the promotion of smooth muscle relaxation in the uterus [45–47]. Analysis of another molecule involved in uterine contraction, the dopamine receptor D2 (DRD2), did not reveal an association with preterm birth [40]. Furthermore, no links with preterm birth were established for the prostaglandin pathway genes prostaglandin E receptor 2 (PTGER2), prostaglandin E synthase (PTGES) or prostaglandin F receptor (PTGFR), despite the activity of prostaglandins in promoting uterine contractility [40].
Figure 1 (caption, partial). These changes in prostaglandin metabolism lead to elevated prostaglandin F2α (PGF2α), which acts on the ovarian corpus luteum to decrease circulating progesterone (P4). This systemic progesterone withdrawal results in induction of contraction-associated proteins (CAPs) and transition of the uterine myometrium from a quiescent to an actively contractile state. (b) In human pregnancy, labor initiation is associated with induction of amnion COX2 and placental corticotropin-releasing hormone (CRH) and a reduction in chorion HPGD. These changes in prostaglandin metabolism and peptide signaling are associated with increased amnion prostaglandin E2 (PGE2), pro-inflammatory cytokines and estradiol. This pro-inflammatory milieu is hypothesized to cause 'functional' progesterone withdrawal (circulating progesterone does not fall), or progesterone resistance, followed by induction of CAPs and labor. Note that several fundamental differences between human and mouse parturition exist beyond the differences in systemic progesterone regulation at term. Murine gestation is multi-fetal, whereas human gestation is predominantly single fetus. In mice, the sites of prostaglandin and progesterone synthesis are maternal, whereas in human pregnancy, the primary sites of prostaglandin and progesterone synthesis in late gestation are fetal tissues. Adapted from [27], Ratajczak
As maintenance of pregnancy requires proper functioning of the placenta, genes involved in angiogenesis and thrombosis have also been targets for association with preterm birth and placental dysfunction. One small study described a potential association between preterm birth risk and an intronic polymorphism in the vascular endothelial growth factor (VEGF) gene, which is responsible for inducing angiogenesis [48], but this has not been replicated in larger cohorts. Studies of hemostasis genes have yielded mixed results: an identified association [40] for the gene encoding one factor (factor V, F5) was not replicated [45,49]; some studies identified genes encoding factors associated in infants only (factor XIII, F13A1; thrombomodulin, THBD) [45,50], or in both mothers and infants (factor VII, F7); and others were not associated with preterm birth at all (factor II, F2; protein C receptor, endothelial, PROCR; protein C, PROC; and tissue factor pathway inhibitor, TFPI) [45,50].
Although some advances have been made in studies of the genetic contributions to preterm birth using candidate gene approaches, concrete causal links between these polymorphisms and preterm birth have not been established. Recent meta-analyses have been performed and summarized, providing a useful resource for investigators interested in the genetics of preterm birth [51,52]. However, through the use of genome-wide tools and analysis techniques, previously unanticipated mechanisms may be revealed.

Genomic approaches to preterm birth
To elucidate new genes and pathways involved in normal and pathological parturition, non-biased, genome-wide approaches offer substantial promise. Below, we describe complementary approaches to identify both common and rare variants that increase risk for preterm birth, and their associated findings.

Genome-wide association studies
Genome-wide association studies (GWASs) are an unbiased method for the discovery of new functional mechanistic pathways in disease processes [53] such as preterm birth and abnormal fetal growth. Through meta-analysis of six GWASs, variants in adenylyl cyclase type 5 (ADCY5), which has pleiotropic effects on glucose regulation, and variants near cyclin L1 (CCNL1), which may be involved in pre-mRNA splicing and RNA processing, have been shown to be associated with fetal growth and birth weight [54]. Furthermore, by looking genome-wide at single nucleotide polymorphisms (SNPs) in a prioritized list that arose from a comparative analysis of genes showing divergence between humans and other mammals, a novel association between the follicle-stimulating hormone receptor gene (FSHR) and preterm birth was revealed [55]. GWAS data for approximately 1,000 preterm birth cases and 1,000 controls are currently available for analysis from the Danish National Birth Cohort through the database of Genotypes and Phenotypes (dbGaP) [56]. However, in this dataset, gene variants with genome-wide statistical significance have not been found. Recent pathway-based, rather than gene-based, analyses using a custom database of genes curated from existing literature for preterm birth (dbPTB) have implicated several potentially relevant pathways that can be used to elucidate further gene-gene and gene-environment interactions [52,57]. Further meta-analysis with emerging new data from other groups should provide additional power for the detection of associations. Modest sample sizes, along with the inability to rigorously subphenotype preterm births into more homogeneous groups based on gestational age or potential etiology, have limited the impact of GWASs so far.
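To make the notion of genome-wide statistical significance concrete, the following is a minimal sketch of a single-SNP allelic case-control test. It is not the analysis used in the studies cited above: the function names, the example allele counts, and the conventional threshold of 5 × 10^-8 are assumptions introduced here purely for illustration.

```python
import math

# Conventional genome-wide threshold; an assumption, not a value from the cited studies.
GENOME_WIDE_ALPHA = 5e-8


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value). Counts are alleles, so each genotyped
    individual contributes two alleles to their group.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    if 0 in (row_case, row_ctrl, col_alt, col_ref):
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / (
        row_case * row_ctrl * col_alt * col_ref
    )
    # For 1 degree of freedom, the upper tail of the chi-square distribution
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """True when the p-value passes the (assumed) genome-wide threshold."""
    return p_value < alpha


# Hypothetical SNP: 1,000 cases and 1,000 controls (2,000 alleles per group),
# with a risk-allele frequency of 30% in cases versus 25% in controls.
chi2, p = allelic_chi_square(600, 1400, 500, 1500)
print(f"chi2 = {chi2:.2f}, p = {p:.1e}, significant = {is_genome_wide_significant(p)}")
```

In this hypothetical example, a five-percentage-point difference in allele frequency yields a p-value on the order of 10^-4, far from the assumed threshold, which illustrates why a cohort of this size has limited power to detect variants of modest effect.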

Mitochondrial genetics
One underexplored avenue for determining maternal risk for preterm birth involves the influence of the mitochondrial genome. The high mutation rate of mitochondrial DNA (mtDNA), together with the fact that most of its encoded proteins are evolutionarily conserved, allowing for the selection of neutral or beneficial variants, has generated interest in defining human mtDNA variations and their roles in human biology [58]. Two mitochondrial genome scans in two European populations, examining links between mitochondrial genotypes and preterm birth, did not reveal any associations with preterm delivery and its related outcomes [59]. Perhaps one of the most significant limitations of the study was the lack of data regarding predefined haplogroups. As mtDNA is transmitted uniparentally and does not recombine, pathogenic, functional and neutral variants can interact and are often linked with one another [58]. Thus, a multitude of SNPs can accumulate along branches of a haplogroup, which may alter the significance of other subhaplogroups; hence, testing individual SNPs gives an inadequate picture of the evolutionary and functional role of mtDNA in an individual's haplotype [58].

Rare variant analysis
As the variants detected by GWASs have typically explained only a minor proportion of the heritability of complex diseases, it may prove more informative to pursue the identification and association of rare variants with disease [60]. Although fewer examples have been reported than for common SNPs, rare variants have been shown to play a role in complex traits [60]. So far, the analysis of rare variants associated with preterm birth has been limited. In one study exploiting candidate gene linkage for 33 genes in 257 families, nonparametric and parametric analyses performed on 99 SNPs in premature infants and mothers of premature infants identified a moderate association with preterm birth of corticotropin-releasing hormone receptor 1 (CRHR1) and cytochrome P450 2E1 (CYP2E1) in affected infants, and of ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1), insulin-like growth factor binding protein 3 (IGFBP3), TNF receptor-associated factor 2 (TRAF2) and 7-dehydrocholesterol reductase (DHCR7) in mothers [61]. DNA sequence analysis was performed for CRHR1 and TRAF2 to detect novel potentially causative mutations, but no new variants were detected [61].

Family-based designs
The use of family-based designs increases the power to detect associations, controls for heterogeneity/population stratification, and might elucidate the effects of allele origin as well as transmission of phenotypes of disease modulation, allowing for the study of parent-of-origin effects [62]. Family-based linkage analysis, combined with case-control association studies, has been used to determine a susceptibility haplotype for preterm birth in Finnish multiplex (multiple individuals affected in different generations) and nuclear families [63]. Through the use of parametric linkage analysis, genetic factors that influence preterm birth in either the fetus or mother may be elucidated by accounting for both potential mechanisms of action of the preterm birth phenotype: being born preterm (an affected fetus or an infant phenotype); or giving birth preterm (an affected mother phenotype) (Figure 2) [59]. This method, followed by a case-control study and haplotype segregation analysis, identified a novel susceptibility gene (IGF1R) that, through the fetal genome, results in a predisposition to preterm birth [59]. Additionally, two X-linked genes were implicated when preterm birth was the studied phenotype: the androgen receptor (AR) and IL-2 receptor γ subunit (IL2RG) genes, which are located near a significant linkage signal locus (Xq13.1) [64]. Further analysis revealed that long CAG repeats in the AR gene were overrepresented, and short repeats underrepresented, in preterm individuals in comparison to term individuals, implicating this as a potential fetal susceptibility gene for preterm birth [64].
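Repeat length of the kind compared for the AR gene can be measured directly from sequence data. The sketch below counts the longest uninterrupted run of CAG units in a DNA string. The function name and the toy sequences are hypothetical, and no cut-off between 'long' and 'short' repeats is implied, because none is given in the text above.

```python
import re


def longest_cag_repeat(sequence):
    """Return the number of CAG units in the longest uninterrupted CAG run.

    The sequence is treated as a plain DNA string on the coding strand;
    case is ignored. Returns 0 when no CAG triplet is present.
    """
    seq = sequence.upper()
    # Each match is a maximal run of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)


# Toy sequences (not real AR alleles): flanking bases around 21 and 12 repeats.
allele_a = "ATG" + "CAG" * 21 + "GAAACT"
allele_b = "ATG" + "CAG" * 12 + "GAAACT"
print(longest_cag_repeat(allele_a), longest_cag_repeat(allele_b))
```

Comparing the distribution of such counts between preterm and term individuals would then be a separate statistical step, which is not sketched here.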

Conclusions and future approaches
We have summarized recent findings from model organisms that have defined components of birth timing mechanisms for those species, and the evidence for variable maternal and fetal contributions to the process depending upon the species investigated. We further provide evidence that genetic variation contributes to preterm birth risk in humans and, as a corollary, that genomic approaches should prove fruitful for revealing key components of the molecular machinery controlling the duration of human gestation. Although currently employed genomic approaches have been useful in starting to unravel the complexity of birth timing, much remains to be learned by applying more thorough genome sequencing methods. One such approach is whole-exome sequencing [65]. Several efforts are currently underway in this regard. An initial report from our group analyzing exomes of ten mothers from families with recurrent instances of preterm birth, including two mother-daughter pairs, found that the complement/coagulation factor pathway was enriched in harboring rare variants [66]. Whole-genome sequencing may provide the opportunity for the identification of regulatory elements as well as amino acid changes, although the scale of the necessary bioinformatic analysis remains formidable [67]. Large-scale genome sequencing studies, such as the Inova collaboration, are currently being planned and initiated to investigate de novo mutations in infants and contributions of the maternal genome to preterm birth [68]. The availability and synthesis of genomic data from preterm infants, controls and parents will significantly advance current understanding of the genomics of birth timing, paving the way for the determination of the mechanisms regulating the timing of parturition. What must not be overlooked in these future efforts is rigorous phenotyping and ascertainment of environmental exposures. Preterm birth is a very heterogeneous phenotype. Suitable numbers of individuals with more restricted phenotypes, including gestational age windows at birth, type of spontaneous delivery (spontaneous labor versus premature rupture of membranes), associated fetal growth characteristics or maternal characteristics, will greatly accelerate gene identification.
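The statement that a pathway is enriched for rare variants can be illustrated with a one-sided hypergeometric test. This is a generic sketch and is not claimed to be the method used in the exome report cited above; the function name, its parameters and the gene counts in the example are all hypothetical.

```python
from math import comb


def pathway_enrichment_p(total_genes, pathway_genes, hit_genes, pathway_hits):
    """One-sided hypergeometric p-value for pathway over-representation.

    total_genes   -- number of genes assayed (the background)
    pathway_genes -- number of background genes annotated to the pathway
    hit_genes     -- number of background genes carrying a rare variant
    pathway_hits  -- number of pathway genes carrying a rare variant

    Returns P(X >= pathway_hits) when hit_genes are drawn at random,
    without replacement, from the background.
    """
    upper = min(pathway_genes, hit_genes)
    denominator = comb(total_genes, hit_genes)
    numerator = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, hit_genes - k)
        for k in range(pathway_hits, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 20,000 genes assayed, 70 in the pathway,
# 400 genes carrying a rare variant, 8 of them in the pathway.
print(f"p = {pathway_enrichment_p(20000, 70, 400, 8):.2e}")
```

A real analysis would also need to account for gene length, multiple testing across pathways, and relatedness among the sequenced individuals, none of which this sketch attempts.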
To facilitate time-efficient collection of subject samples (mother, father and infant at a minimum; maternal grandparents and more extended family structure when likely to be informative) and phenotype data, more extended international collaborations with comprehensive, standardized definitions and data fields are essential [69]. Ultimately, gene discovery in humans will require the innovative application of noninvasive methods during pregnancy, such as functional imaging, proteomics and metabolomics, as well as the generation and investigation of new preclinical model organisms and ex vivo systems, to establish mechanisms and define potential prophylactic interventions for preterm birth.