Mitochondrial DNA copy number can influence mortality and cardiovascular disease via methylation of nuclear DNA CpGs

Background Mitochondrial DNA copy number (mtDNA-CN) has been associated with a variety of aging-related diseases, including all-cause mortality. However, the mechanism by which mtDNA-CN influences disease is not currently understood. One such mechanism may be through regulation of nuclear gene expression via the modification of nuclear DNA (nDNA) methylation. Methods To investigate this hypothesis, we assessed the relationship between mtDNA-CN and nDNA methylation in 2507 African American (AA) and European American (EA) participants from the Atherosclerosis Risk in Communities (ARIC) study. To validate our findings, we assayed an additional 2528 participants from the Cardiovascular Health Study (CHS) (N = 533) and Framingham Heart Study (FHS) (N = 1995). We further assessed the effect of experimental modification of mtDNA-CN through knockout of TFAM, a regulator of mtDNA replication, via CRISPR-Cas9. Results Thirty-four independent CpGs were associated with mtDNA-CN at genome-wide significance (P < 5 × 10− 8). Meta-analysis across all cohorts identified six mtDNA-CN-associated CpGs at genome-wide significance (P < 5 × 10− 8). Additionally, over half of these CpGs were associated with phenotypes known to be associated with mtDNA-CN, including coronary heart disease, cardiovascular disease, and mortality. Experimental modification of mtDNA-CN demonstrated that modulation of mtDNA-CN results in changes in nDNA methylation and gene expression of specific CpGs and nearby transcripts. Strikingly, the “neuroactive ligand receptor interaction” KEGG pathway was found to be highly overrepresented in the ARIC cohort (P = 5.24 × 10− 12), as well as the TFAM knockout methylation (P = 4.41 × 10− 4) and expression (P = 4.30 × 10− 4) studies. Conclusions These results demonstrate that changes in mtDNA-CN influence nDNA methylation at specific loci and result in differential expression of specific genes that may impact human health and disease via altered cell signaling.

FHS is approved by the IRB at Boston University Medical Center. CHS recruited participants from Medicare lists at four sites and IRBs at each site were involved in human subjects approval.

Residual Bootstrapping
The steps taken were as follows: 1) Residuals were derived from the full model, 2) Fitted values were derived from the null model (model without mtDNA-CN as independent factor), 3) The residuals from Step 1 were resampled and added to the fits from Step 2, 4) Each resulting matrix from Step 3 was run as pseudonull input in the formula lme(pseudonull~CN+covariates) to refit the full model and obtain null statistics, 4a) The most extreme P-value was pulled from each iteration, 4b) The resulting 100 most extreme P-values were ranked from least to most significant and the 95th value was chosen to be the 'genome-wide significance level' for the corresponding cohort.

mtDNA-CN Estimation using Quantitative PCR
Briefly, each well consisted of a VIC-labeled, primer limited assay specific to a mitochondrial target (MT-ND1), and a FAM-labeled assay specific to a region of the nuclear genome selected for being non-repetitive (RPPH1). Each sample was run in triplicate on a 384 well plate in a 10 µL reaction containing 20 ng of DNA. The cycle threshold (Ct) value was determined from the amplification curve for each target by the ABI Viia7 software. A ΔCt value was computed for each well as the difference between the Ct for the RPPH1 target and the Ct for the MT-ND1 target, as a measure of mtDNA copy number relative to nuclear DNA copy number. For samples with a standard deviation of ΔCt for the three replicates >0.5, an outlier replicate was identified and excluded. If the ΔCt standard deviation remained >0.5 after exclusion, the sample was completely excluded from future analyses. Replicates with Ct values for MT-ND1 >28, Ct values for RPPH1 >5 standard deviations from the mean, or ΔCt values >3 standard deviations from the mean of the plate were removed. Additionally, due to an observed linear increase in ΔCt value by the order in which the replicate was pipetted onto the plate, a linear regression was used to correct for pipetting order. Plate effects are controlled for by performing a linear regression whereby the plate a sample is run on is treated as a random effect.

Methylation Analysis
Quality control was performed in in the minfi R package (Aryee et al. 2014) (version 1.12.0, http://www.bioconductor.org/packages/release/bioc/html/minfi.html). Samples with low median intensities of below 10.5 (log2) across the methylated and unmethylated channels, samples with a proportion of probes falling detection of greater than 0.5%, samples with QC probes falling greater than three standard deviations from the mean, sex-check mismatches, failed concordance with prior genotyping or > 0.5% of probes with a detection P-value > 0.01 were removed. Probes with >1% of values below detection were removed. In total, 11 samples were removed for sample QC resulting in a sample of 323 European-ancestry and 326 African-American samples. Methylation values were normalized using the SWAN quantile normalization method (Maksimovic et al. 2012). Since white blood cell proportions were not directly measured in CHS they were estimated from the methylation data using the Houseman method (Houseman et al. 2014).

Mendelian Randomization meQTL Analysis
Haplotype phasing was performed using ShapeIt (Delaneau et al. 2013) and imputation was performed using IMPUTE2 (Howie et al. 2012). SNPs were filtered for allele frequency >0.05, and imputation quality >0.4. Genotypes were imputed to the 1000G reference panel (Phase I, version 3). The same covariates used for the ARIC EWAS analysis were used to call meQTLs as well as the addition of genotyping PCs (4 for EA, 10 for AA). Only meQTLs which had an individual cohort P value >0.05 were included in the meta-analysis.
A linear model was used for MatrixEQTL and a cis meQTL was defined as having a distance less than 100 kb. Only cis meQTLs derived from the six CpGs of interest and which met a cohort-specific permuted P-value cutoff (Permuted P: EA=7.84x10 -4 , AA=9.12x10 -4 ) or a permuted meta-analysis P-value cutoff (Permuted P, fixed effects (FE) model: 3.97x10 -5 ) were retained for use in Mendelian randomization.

Phenotype Analysis
Adjudicated events between visit 1 and the baseline visit for this study were considered prevalent events. Analyses of prevalent and incident events in CHS were adjusted for age, sex, clinic site and batch.
In ARIC, prevalent coronary heart disease (CHD) was defined as history of myocardial infarction (MI) or cardiac procedures (heart or arterial surgery, coronary bypass, or angioplasty). Cardiovascular disease (CVD) was defined as either CHD or stroke. Prevalent stroke was defined as stroke at baseline. For all phenotypes, prevalent disease was a combination of self-report at visit 1 plus adjudicated events between visit 1 and the baseline visit. Incident CHD was defined as the first incident MI or death owing to CHD. Incident stroke was defined as the first nonfatal stroke or death owing to stroke. In ARIC, the mean follow-up time was 20.6 years in the EA cohort and 18.1 years in the AA cohort. Follow-up for incident events was administratively censored at December 31, 2016.
CHS and FHS followed similar phenotype definitions as ARIC. For FHS, the mean follow-up time was 6.0 years and individuals were removed if follow-up years equaled 0, FHS events were adjudicated through 12/2016. In CHS, prevalent CVD/CHD was excluded during sampling and events were adjudicated through June 30, 2015. The follow up time for incident events from the time of methylation measurement was 23 years.

DNA Isolation
DNA extraction was performed on harvested HEK293T TFAM knockout cells using the AllPrep DNA/RNA Mini Kit (Qiagen #80204) following the manufacturer's protocol (passage 32 and 37). DNA was eluted in 100 µL ultrapure water. DNA was quantified using a Nanodrop 1000. Low purity samples were subjected to ethanol precipitation.

RNA Isolation
Total RNA was extracted from confluent T75 culture flasks of TFAM CRISPR Negative Control and KO cell lines using the AllPrep DNA/RNA/Protein Kit (Qiagen #80004) (passage 45). RNA was extracted using the provided kit manual/instructions for RNA extraction, except all microcentrifuge spins were performed at 10,000 x g. RNA was eluted twice in 50 uL molecular biology grade water and stored in a -80C freezer.

mtDNA-CN Estimation on TFAM Knockout Cell Lines
qPCR was used to measure mtDNA-CN as described above for CHS in section "mtDNA-CN Estimation using Quantitative PCR". qPCR to determine TFAM DNA quantity 20 ng of total DNA from each cell line was used as input for a 10 µL volume reaction. TFAM DNA were amplified using TaqMan probe Hs00273372_s1 (20x, FAM-labeled, Applied Biosystems #4331182). GAPDH DNA served as a housekeeping reference control and was amplified with probe Hs03929097_g1 (20x, VIC-labeled, Applied Biosystems #4448489). Both probes were multiplexed together and all qPCR reactions were conducted at 50 o C for 2 min, 95 o C for 10 min, and then 40 cycles of 95 o C for 15 sec and 60 o C for 1 min. DNA quantity was determined using double delta cycle threshold using GAPDH as the reference and displayed a 66% reduction of the TFAM gene.

TFAM Expression Assay
cDNA synthesis was performed with the SuperScript III First-Strand Synthesis System for RT-PCR (ThermoFisher #18080-051) following the manufacturer's protocol. 1.5 µg of total RNA from each cell line was used as input and primed with 50 ng random hexamers using the appropriate incubation conditions from the manufacture's protocol.
Following completed cDNA synthesis, samples were quantified using the Qubit ssDNA assay kit (Invitrogen #Q10212) and Qubit 2.0 Fluorometer. Synthesized cDNA was then diluted to 10 ng/µL using ultrapure water and stored in -20 o C. qPCR to determine TFAM gene expression for TFAM KO 20 ng of synthesized cDNA from each cell line was used as input for a 10 µL volume reaction. TFAM cDNA were amplified using TaqMan probe Hs00273372_s1 (20x, FAMlabeled, Applied Biosystems #4331182). GAPDH cDNA served as a housekeeping reference control and was amplified with probe Hs03929097_g1 (20x, VIC-labeled, Applied Biosystems #4448489). Both probes were multiplexed together and all qPCR reactions were conducted at 50 o C for 2 min, 95 o C for 10 min, and then 40 cycles of 95 o C for 15 sec and 60 o C for 1 min. Expression fold change was determined using double delta cycle threshold using GAPDH as the housekeeping reference control.

Total Protein Extraction
Total protein lysates from HEK293T TFAM CRISPR knockout cell lines were extracted using ice-cold radioimmunoprecipitation assay buffer (RIPA) buffer supplemented with Halt Protease and Phosphatase Inhibitor Cocktail (Thermo Scientific #78440) (passage 35). Protein concentrations were quantified using the Pierce BCA Protein Assay Kit (Thermo Scientific #23227) and lysates were stored at -80 o C.

Western Blotting
Equal amounts of each lysate were diluted 1:1 with 2x Laemmli Sample Buffer (Bio-Rad #161-0737) supplemented with 5% β-mercaptoethanol. Samples were then heated at 95 o C for 5 minutes to denature the proteins. 30 µg of each protein lysate was separated on a 12% polyacrylamide Mini-PROTEAN TGX Gel (Bio-Rad #456-1044) and then transferred to a PVDF membrane (Bio-Rad #1704156) using the Trans-Blot Turbo Transfer System. The membrane was blocked overnight at 4 o C in Tris-Buffered Saline and Tween 20 (TBST) containing 5% nonfat milk with gentle shaking. After blocking, the membrane was incubated with rabbit anti-TFAM primary antibody diluted 1:40,000 in 5% milk (Abcam #ab176558) and rabbit anti-β-Tubulin primary antibody diluted 1:3,000 in 5% milk (Invitrogen #PA5-27552) for 1 hour at room temperature with gentle shaking. The membrane was washed 5-times with TBST after primary antibody incubation, then incubated with goat anti-rabbit secondary antibody conjugated with horseradish peroxidase (1:20,000 dilution, Abcam #ab97080) in the dark for 1 hour at room temperature with shaking. Signals were visualized by enhanced chemiluminescent substrate (SuperSignal West Pico PLUS, Thermo Scientific #34577) and photographed digitally using the ChemiDoc-It 2 Imager. ImageJ was used to quantify protein abundance (normalized to tubulin control).

Library Preparation and Sequencing
Specifically, total RNA is converted to cDNA and size selected to 150 to 200 bp in length with 3' or 5' overhangs. End repair is performed where 3' to 5' exonuclease activity of enzymes removes 3' overhangs and the polymerase activity fills in the 5' overhangs. An 'A' base is then added to the 3' end of the blunt phosphorylated DNA fragments which prepares the DNA fragments for ligation to the sequencing adapters, which have a single 'T' base overhang at their 3' end. Ligated fragments are subsequently size selected through purification using SPRI beads and undergo PCR amplification techniques to prepare the 'libraries'. The BioAnalyzer is used for quality control of the libraries

A.
B. C.

A.
B.