High-resolution temporal profiling of the human gut microbiome reveals consistent and cascading alterations in response to dietary glycans

Background Dietary glycans, widely used as food ingredients and not directly digested by humans, are of intense interest for their beneficial roles in human health through shaping the microbiome. Characterizing the consistency and temporal responses of the gut microbiome to glycans is critical for rationally developing and deploying these compounds as therapeutics. Methods We investigated the effect of two chemically distinct glycans (fructooligosaccharides and polydextrose) through three clinical studies conducted with 80 healthy volunteers. Stool samples, collected at dense temporal resolution (~ 4 times per week over 10 weeks) and analyzed using shotgun metagenomic sequencing, enabled detailed characterization of participants’ microbiomes. For analyzing the microbiome time-series data, we developed MC-TIMME2 (Microbial Counts Trajectories Infinite Mixture Model Engine 2.0), a purpose-built computational tool based on nonparametric Bayesian methods that infer temporal patterns induced by perturbations and groups of microbes sharing these patterns. Results Overall microbiome structure as well as individual taxa showed rapid, consistent, and durable alterations across participants, regardless of compound dose or the order in which glycans were consumed. Significant changes also occurred in the abundances of microbial carbohydrate utilization genes in response to polydextrose, but not in response to fructooligosaccharides. Using MC-TIMME2, we produced detailed, high-resolution temporal maps of the microbiota in response to glycans within and across microbiomes. Conclusions Our findings indicate that dietary glycans cause reproducible, dynamic, and differential alterations to the community structure of the human microbiome.

2. The concentration of the perturbing compound is assumed subject-specific, and is captured using a pharmacokinetic model 3. When a perturbation is inferred to affect a bacterial species, the perturbation modulates the baseline dynamics with a magnitude driven by the subject-specific concentration of the perturbing compound passed through a non-linear transformation To share component 3 across microbes, we employ a double-Dirichlet Process (dDP). At the top level, the dDP forms clusters of perturbation responses, each parametrizing the non-linear transformation of the perturbing compound, as well as the time period when the perturbation is active. At the bottom level, the dDP forms clusters of bacterial trajectories, incorporating biological knowledge with a prior that encourages (but does not enforce) co-clustering of phylogenetically related species. These microbe groups are themselves assigned to perturbation clusters, allowing a single perturbation cluster to potentially capture phylogenetically distant microbes.

Input data and initial filtering
The input data to the model consists of sequencing counts, dosing information, and phylogenetic distances between microbes. For each subject s, we denote by y so (t s,i ) the sequencing counts for OTU o at subject-specific time points t s,i . The dosing information consists of subject-specific doses d si and their times of administration a si .
Before being analyzed by the model, initial filters are applied to remove bacterial time-series of no interest. An abundance filter is used to remove lowabundance species, keeping only those species which have a specified mean abundance in at least a specified number or proportion of subjects. A counts filter is then used to remove any individual time-series which does not maintain some minimum number of counts in some specified number of consecutive time points.
x so q so y so z so Figure 1: The MC-TIMME2 model depicted using plate notation. For brevity, some higher-level priors have been excluded. Random variables are indicated by open circles; observed data are indicated by shaded circles. Arrows indicate conditional independence relationships. Rounded rectangles indicate repeated structure. Solid squares indicate factors (functions) that link variables. The variables relevant to microbe clustering are highlighted in blue, those for perturbation clusters are highlighted in red, and the pharmacokinetics are indicated in green. φ 1 , π 1 and z so indicate the Dirichlet Process concentration parameters, mixture weights, and cluster assignments for microbe clustering, and φ 2 , π 2 and e k for perturbation clustering.

Dirichlet process
Our model employs Dirichlet process mixture models [1,2]. This section includes an overview of the Dirichlet process. Mixture weights C 1 , C 2 , . . . are generated from a "stick-breaking" construction with concentration parameter φ.
Atom locations η * 1 , η * 2 , . . . are drawn from a base distribution G 0 . Then, is a draw from a Dirichlet process with concentration parameter φ and base distribution G 0 , or Θ ∼ DP(φ, G 0 ) . To specify a Dirichlet process mixture model, we introduce cluster assignment variables Z n for each data point X n , and a density p which takes an atom location as a parameter.

Double Dirichlet process for microbes and perturbations
In our model of the responses of the microbiota to perturbations, two levels of Dirichlet process mixture model clustering are employed. The microbe clustering Dirichlet process has a concentration parameter φ 1 and mixture weights π 1 . Cluster assignments for each microbe o in subject s are designated by z so . The microbe clusters are themselves assigned either to a null perturbation effect (with prior probability 1−θ perturb ) or to perturbation clusters in the top level of the dDP. φ 2 and π 2 indicate the concentration parameter and mixture weights for the perturbation DP, and e k is the assignment to a perturbation cluster for microbe cluster k. Phylogenetic information is incorporated into microbe clustering using a potential function ψ. Each microbe cluster chooses a representative member λ k . Then, the likelihood for each OTU time-series being assigned to a microbe cluster incorporates the phylogenetic distance between that OTU and the representative. Letting Ω m indicate the perturbation parameters for perturbation cluster m, Letting d(o 1 , o 2 ) denote the patristic distance between species o 1 and o 2 on a phylogenetic tree, the potential function is given by with hyperparameters ζ 0 and ζ 1 .

Pharmacokinetics
Data for the subject-specific doses d si and their times of administration a si are available as inputs to the model. The level of the compound over time c in each subject s is modeled using first-order pharmacokinetics, with a subject-specific elimination rate κ s :

Baseline logistic growth dynamics
In the absence of a perturbation, each species is assumed to follow a stochastic logistic growth model with a growth rate α so and a self-limiting term β so . The latent trajectory for species o in subject s is x so (t s,i ) ∼ Normal(µ so (t s,i ), ∆ s,i σ 2 so (t s,i )) The time steps are denoted by ∆ s,i = t s,i − t s,i−1 . The process variance σ 2 so is modeled as potentially having a different value on and off the perturbation period.
To model the non-negativity of microbiome data, while still maintaining inference efficiency, we apply a relaxation method which has been introduced previously for microbiome dynamics [3]. Alongside the latent dynamical trajectory x so (t), there is a latent auxiliary trajectory q so (t) which is strictly nonnegative (from its prior distribution) but closely coupled to x so (t) via a normal distribution: (q so (t) − x so (t)) ∼ Normal(0, ϕ 2 so (t s,i )) The coupling variance ϕ 2 is modeled as scaling with magnitude, with global parameters v aux 1 and v aux 2 .

Perturbation and persistent effect
We assume the baseline growth rate can be modulated by a perturbation. Each microbe cluster k is probabilistically assigned to either a null perturbation effect (with prior probability 1 − θ perturb ) or to a perturbation cluster e k from the perturbation Dirichlet process. Each perturbation cluster m has a shape parameter r m , which controls the shape of a transfer function h: h(·; r m ) reshapes the subject-specific pharmacokinetic profile c s into a perturbation trajectory for each subject. Additionally, each perturbation cluster has a parameter γ 1m which sets the magnitude of the perturbation. The period of time when a perturbation is active is given by a step function w sm (t) for each subject, which is parametrized by an on-time and relative duration specific to the perturbation cluster. Each perturbation cluster m has an on-time t on m given as a number of days relative to the first administration time of the compound. Additionally, the perturbation cluster m has a relative duration u m , which specifies the perturbation duration as a proportion of each subject's full feeding period duration (the time from the first dose administration to the last administration). Assuming that t = 0 is the first administration time (min i (a si ) = 0), Each time-series may optionally have a persistent effect which continues to modulate the dynamics after the compound has been excreted. The parameters for these persistent effects are specific to each OTU time-series. The prior probability for the persistent effect being active is given by θ persist . An indicator variable b so specifies whether or not the persistent effect is present, and γ 2so controls the magnitude of the persistent effect. The period when the persistent effect is active is given by a step function p so (t), which turns on at the last dose administration and continues for t persist so days.
The modulated growth rate from the compound perturbation and the persistent effect is A microbe may have a persistent effect even when no perturbation cluster is selected, in which case the modulated growth rate is

Observation model and overdispersion parameter
The observed data are the sequencing counts y so (t s,i ), which can be modeled by a negative binomial distribution [4], with the mean depending on the auxiliary trajectory q so (t) and the total counts for the sample.
Here, ν s denotes the total number of counts and B s the bacterial concentration, which can be assumed constant when unknown.
We use a separate model applied to shallow shotgun replicates data to learn values for the overdispersion parameter . The replicates data consists of counts y io for species o in replicate i, with each species assumed to have one latent mean relative abundance µ o across all replicates. We model this data with a Dirichlet process infinite mixture model of negative binomial components, with each component having its own single value for the overdispersion parameter. This model is similar to that found in [5], which used a finite mixture of negative binomial components. Each species o in the replicates is assigned to a negative binomial cluster by an indicator variable z o , so that y io |z o ∼ Neg-Bin(η io , zo ). The sample specific mean is η io = ν i µ o , where ν i is the total number of counts in sample i.
Inference is performed for this overdispersion mixture model using the replicates data, and a consensus clustering is obtained consisting of n mixture weights π noise j and overdispersion parameters j (the procedure to obtain a consensus clustering from MCMC samples using agglomerative clustering is described below). To use these inferred values in the full time-series inference, we marginalize over assignments to the consensus clusters, assuming that

Zero-inflation of auxiliary trajectory
At time points t s,i where the observed counts y so (t s,i ) are zero, we allow for the possibility that the auxiliary trajectory at that time point is a structural zero (q so (t s,i ) = 0) with a zero-inflation model. At each time point with zero counts, the auxiliary trajectory probabilistically chooses whether to be a structural zero (indicated by ξ so (t s,i ) = 0), with prior probability given by a global parameter θ zero . Therefore, the distribution of q can be written y so (t s,i ) = 0 and ξ so (t s,i ) = 0 (29)

Priors
For both Dirichlet process concentration parameters, we assume a Gamma prior For both the probability of selecting any perturbation θ perturb and the probability of selecting a persistent effect θ persist , we assume a beta prior. In both cases, a beta prior with hyperparameters α and β can be interpreted as a prior observation of α − 1 present effects and β − 1 null effects. For the perturbation selection, hyperparameters (1, 50) are used, equivalent to an observation of about 50 microbe clusters all with no effect. For the persistent effect selection, the hyperparameters are (1, 3000), equivalent to observing about 3000 OTU time-series (about the number getting past the filter) with no persistent effect.
A truncated normal distribution is used as the prior for the subject-specific elimination rates.
The auxiliary trajectory has a uniform prior restricting it to positive values.
The growth rate and the self-limiting terms both have normal priors.
The process variance parameters have inverse gamma priors. v so and v pert so have: v so ∼ Inv. Gamma(0.00001, 1) The prior probability of the auxiliary trajectory choosing to be zero has a weak beta prior.
θ zero ∼ Beta(1, 1) The transfer function shape parameter r m has an exponential prior The perturbation cluster effect strengths and the persistent effect strengths use a normal prior.
The perturbation clusters are intended to capture sustained responses to the compound, and not short-term spikes. Therefore, truncated normal distributions are used as the priors for the on-time and duration, with the prior for the on-time requiring that perturbations appear within 20 days of the start of dosing. t on m ∼ Trunc. Normal(0, 20, 0, 20) For the perturbation durations u m , a truncated normal prior is also used, which requires that perturbations last at least 40% of the feeding period and places more weight on long perturbations.
The prior for the duration of the persistent effect is also a truncated normal distribution, which requires that persistent effects last at least 10 days.
The hyperparameters ζ 0 and ζ 1 for the phylogenetic term in microbe clustering take fixed values which are calibrated by comparing the strength of the phylogenetic prior at different taxonomic levels to the strength of the Dirichlet process term which controls the probability of a microbe being assigned to a cluster based on the number of members in that cluster. In particular, amongst the species which pass initial filtering settings of 5e-4 mean abundance in 25% of subjects in our FOS study, the median distance between those in the same genus is 0.063 and those in the same family is 0.13. By requiring that the distance of 0.063 is equivalent to a cluster size of 5 members and the distance 0.13 is equivalent to a cluster size of 4 members, we get ζ 0 = 3.33 and ζ 1 = 1.82.

Inference
Inference for all parameters in the model is performed using a Markov Chain Monte Carlo method. The method uses Gibbs steps for variables with conjugate priors and Metropolis-Hastings (MH) steps otherwise. The overview of the main sampling loop is:

Latent trajectory and auxiliary trajectory
First consider the case where y so (t s,i ) = 0. In this case, zero-inflation of q so (t s,i ) is not possible at this time point. At each such time point t s,i , the joint posterior distribution for the latent trajectory x so (t s,i ) and the latent auxiliary trajectory q so (t s,i ) is likely to exhibit high correlation. Therefore, we achieve efficient sampling of these parameters using a multivariate Metropolis-Hastings step with adaptive proposal. MH steps of this type are described in [6].
In the following equations, the indices s and o are excluded for brevity. At each time point t i , a multivariate normal jumping proposal distribution is used: The proposal covariance Σ prop (t i ) is set equal to the empirical covariance of the accepted trajectory and auxiliary trajectory samples at time point t i , multiplied by a scalar which is tuned to adjust the acceptance rate towards 0.234, the optimal acceptance rate for multivariate MH [6]. During the first 75% of the burn-in period, Σ prop (t i ) is updated every 20 iterations based on the acceptance rate and samples of the last 200 steps, and for the rest of the burn-in period and after the end of the burn-in period it remains constant for all remaining iterations. The acceptance probability for this jump depends on the unnormalized joint posterior for x(t i ) and q(t i ). The function l is the data likelihood: l(y; q, , π noise ) = j π noise j Neg-Bin y; ν q B , j The unnormalized posterior conditional on all other parameters Ω is Because the proposal is symmetric, the acceptance probability is calculated as the ratio The values for the trajectory and auxiliary trajectory for this step are given by with probability min(1, r accept ) At the time points where y(t i ) = 0, a structural zero of the auxiliary trajectory is possible. At these time points, an update step may also performed for the choice of a structural zero or not. This update is achieved using reversible jump MH step between two models [6]. Let M 0 denote the model with q(t i ) = 0 and M 1 denote the model with q(t i ) = 0. At each iteration, either a step is made to update the parameters within the current model or a jump is proposed to the other model. Letting J k,k * denote the probability of considering a move from model k to model k * , we use J 0,0 = J 1,1 = J 0,1 = J 1,0 = 0.5.
For the move within model M 1 , the same update as described above is used. For the move within model M 0 , a single variable MH step with the same acceptance ratio as given in (49) is used, without the terms involving q.
In the case that the current model is M 0 and a jump is proposed to M 1 , the reversible jump MH step is used. The augmenting variable u is drawn from J(u), which is a truncated normal u ∼ Trunc. Norm(0, 1, 0.1, 10000) Then, the parameter for model M 1 is defined by setting q(t i ) = u and leaving all other parameters at the same values. The acceptance ratio for this move is With probability min(1, r) the move to M 1 with the given value of q(t i ) is accepted. In the case that the current model is M 1 and a jump is proposed to M 0 , an augmenting variable is not required, as the proposal just sets q(t i ) = 0. Then the acceptance ratio is

Logistic growth parameters
The growth rate α so and the self-limiting term β so have conjugate Normal priors, so they are sampled using straightforward Gibbs steps.

Process variance parameters
The process variance parameters have conjugate inverse Gamma priors, so they are sampled using standard Gibbs steps.

Persistent effect selection and effect magnitude
The persistent effect magnitude γ 2so has a conjugate Normal prior, and is updated with a Gibbs step. The indicator variable controlling the presence of the persistent effect is updated with a Gibbs step.

Microbe cluster assignments
To sample assignments in the double Dirichlet Process, we use a Gibbs sampling method similar to [7]. In the first step, we sample assignments of bacterial trajectories to microbe clusters, marginalizing over the probabilities for the perturbation clusters.
In these equations, F m (x so ) denotes the likelihood of the trajectory x so under the perturbation cluster m: where µ (m) so indicates the µ computed with the perturbation parameters from perturbation cluster m: Conditional on the other parameters, the posterior distribution for the cluster assignment variable z so is The probability for a new microbe cluster k new is calculated by marginalizing over assignments to the perturbation clusters, including the null effect cluster, the existing perturbation clusters, and the possibility of a new perturbation cluster: (where N microbe is the total number of microbe clusters which have chosen to have a perturbation effect). Due to a lack of conjugacy for the perturbation cluster transfer shape, on-time, and duration, the likelihood of a new perturbation cluster F m new (x so ) is calculated by drawing a sample from the prior for the perturbation cluster parameters (effect, transfer function shape, on-time, and duration), as in Neal's Algorithm 8 for Dirichlet Process sampling [2]. When the sample from the categorical distribution specified by (56) yields a new microbe cluster k new , this new cluster is immediately assigned to a perturbation cluster according to (57):

Perturbation effect selection
For each microbe cluster, a Gibbs step is used to update the indicator variable for whether or not an effect is present.

Perturbation cluster assignments
In the second part of the dDP Gibbs sampler, microbe clusters are assigned to perturbation clusters or to a null effect perturbation cluster. Effect selection is sampled as described above, and microbe clusters which choose no effect are automatically assigned to the null cluster. For those microbe clusters which choose to have an effect, the posterior for the assignment to a perturbation cluster is calculated based on the likelihood of all the microbe cluster's members under each perturbation cluster. As in sampling for the microbe clusters, the case of a new cluster is handled by drawing a sample from the prior for the transfer function shape, on-time, duration, and perturbation effect.

Microbe cluster prototypes
The microbe cluster prototype is updated using a Gibbs step. Conditional on the cluster assignments, the posterior for the prototype member λ k is given by

Transfer function shape parameter
The transfer function shape parameters r m are updated with MH steps. The proposal is The acceptance ratio is

Perturbation effect
The perturbation effects have a conjugate prior, and are updated with Gibbs steps.

Pharmacokinetic elimination rate
The pharmacokinetic elimination rates κ s are updated using MH steps. A normal jumping proposal is used.
The proposal variances σ 2 κs are adjusted to tune the acceptance rate towards 0.234. Every 200 iterations during the first 75% of the burn-in period, the proposal variances are tuned based on the acceptance rate of the previous 200 steps for κ s .
The acceptance ratio for the proposal κ

Dirichlet process concentration parameters
The concentration parameters for each level of the Dirichlet process are sampled according to the auxiliary variable method described in [8].

Selection prior probabilities
Both θ perturb and θ persist have a conjugate Beta prior and are updated by standard Gibbs steps.
6 Analysis of MCMC samples 6

.1 Convergence of MCMC
We assess convergence of the MCMC chain using theR diagnostic described in [6].

Bayes factors
For each time-series, a Bayes factor for whether or not a perturbation was present is computed. The Bayes factor is the ratio of the posterior odds to the prior odds: where θ perturb,1 and θ perturb,2 indicate the two hyperparameters of the beta prior on θ perturb .