Designing string-of-beads vaccines with optimal spacers
Genome Medicine volume 8, Article number: 9 (2016)
Abstract
String-of-beads polypeptides allow convenient delivery of epitope-based vaccines. The success of a polypeptide relies on efficient processing: constituent epitopes need to be recovered while avoiding neoepitopes from epitope junctions. Spacers between epitopes are employed to ensure this, but spacer selection is nontrivial.
We present a framework to determine optimally the length and sequence of a spacer through multiobjective optimization for human leukocyte antigen class I restricted polypeptides. The method yields string-of-beads vaccines with flexible spacer lengths that increase the predicted epitope recovery rate fivefold while reducing the immunogenicity from neoepitopes by 44 % compared to designs without spacers.
Background
One of the most promising approaches to rational vaccine design uses so-called epitope-based vaccines (EVs). Vaccines based on T-cell epitopes, short immunogenic peptide sequences derived from antigens, offer several advantages over traditional whole attenuated or subunit vaccines [1]. Unlike traditional vaccines, EVs do not contain potentially infectious material, and the selection of peptides can be tailored to address the genetic variation of pathogens and that of a target population or of an individual patient. Well-established techniques for peptide synthesis guarantee rapid high-quality production and economical storage of the final vaccine [1].
Rational development of EVs relies on bioinformatics for prediction of viable epitopes. Machine-learning methods, such as probabilistic models, neural networks, and support vector machines, are routinely used with high accuracy for epitope prediction [2–5]. Different algorithms have been suggested as well for selecting an optimal set of epitopes for EV design, each emphasizing different aspects of EVs [6–10]. Among these approaches is OptiTope, a mathematical framework that relies on integer linear programming, which can easily be adapted to many different settings and types of EVs [8, 11].
Nevertheless, the stability and delivery of EVs remain major obstacles. Several strategies have been explored in clinical studies and range from administration of peptide cocktails to assembly of selected peptides into polypeptides [12]. One popular approach concatenates the epitope sequences, like beads on a string, to create a string-of-beads vaccine (SBV, Fig. 1a). The efficacy of an SBV depends on the processing of the polypeptide such that the majority of desired T-cell epitopes are recovered and subsequently presented by human leucocyte antigen (HLA) molecules. A major factor for optimal recovery is the correct cleavage of the epitopes. It has been shown that recovery of the epitopes is strongly linked to the ordering of the peptides within the SBV due to its influence on the cleavage probability [13]. An unfavorable order can lead to miscleaved peptides and thus to an ineffective vaccine (Fig. 1b). Furthermore, new cleavage sites and neoepitopes can arise from non-native sequences at junctions between epitopes and/or spacers. These neoepitopes can also have detrimental effects [14] (Fig. 1b).
To improve the recovery of epitopes in SBVs, several groups have suggested the use of spacer sequences between epitopes [15–17] (Fig. 1a). However, it is unclear how to determine the optimal length and amino acid sequence of a spacer to exploit fully its potential. Furthermore, with increasing spacer length, the problem of induced neoepitopes and newly arising cleavage sites becomes increasingly challenging. In addition, experimentally testing designs to determine an optimal SBV, even without considering spacer sequences, quickly becomes infeasible. A dozen epitopes can be combined into about half a billion (12!) distinct SBV sequences. Considering additional spacer sequences with flexible length increases the possibilities many times over. For instance, allowing spacer sequences up to a length of three for 12 epitopes results in over 44 trillion possible designs. For simplicity, most SBV designs have so far used fixed spacer sequences. Until now, only a few computational approaches have been proposed to address the epitope assembly problem (i.e., the problem of choosing the right epitope order). Vider-Shalit et al. suggested a genetic algorithm that simultaneously performs epitope selection and assembly [6]. Toussaint et al. reduced the epitope assembly problem to the well-known traveling salesperson problem (TSP) and solved it heuristically or optimally via integer linear programming [7]. Neither of these approaches considers spacer sequences, though.
In this work, we propose an approach to determine a provably optimal spacer sequence of fixed length for a given HLA-I restricted epitope pair. We also extend the formulation to determine the optimal spacer length and combine this approach with that of Toussaint et al. [7] to design an optimal SBV with flexible spacer sequences. Additionally, we account for the problem of arising neoepitopes and cleavage sites by formulating the problem of designing a spacer sequence as a multiobjective optimization problem that maximizes the recovery probability of the desired epitopes, minimizes the immunogenicity of neoepitopes, and (optionally) minimizes the cleavage probability at nonjunction sites at the same time. We focus our efforts solely on HLA-I antigen processing, since computational prediction methods for proteasomal cleavage and HLA-I binding are well established. The cleavage-site prediction models are used for designing spacer sequences and for ordering the therapeutic epitopes of the SBV to increase their cleavage likelihood artificially, whereas the HLA-I binding prediction models are used to hinder the formation of neoepitopes at the epitope–spacer interfaces. Note that an experimental determination of such an optimal design is virtually impossible due to the vast number of possible designs; a computational approach is, thus, indispensable.
Our results indicate there is a strong increase in the number of correctly cleaved epitopes and a decrease in the neoimmunogenicity of the complete construct compared to SBV designs with commonly used fixed spacers and optimally arranged SBVs without spacer sequences.
Methods
Optimization problem from an immunological perspective
The goal of the optimization is to design an SBV based on a given set of N epitopes. The SBV construct will contain all epitopes, but the ordering of the epitopes, as well as the length and sequence of the N – 1 spacers between these epitopes, is variable. The SBV is designed in a way that (a) maximizes the recovery of the epitopes while (b) minimizing the production of undesired neoepitopes.
More formally: Given a set E of N epitopes e _{1}, …, e _{ N }, we specify an optimal spacer s _{ ij } of length k defined over the alphabet of amino acids Σ that connects two epitopes \( e_i \in \Sigma^{|e_i|} \) and \( e_j \in \Sigma^{|e_j|} \) as the sequence that maximizes the likelihood of it being cleaved at the respective junction cleavage sites c _{ i } and c _{ j } of the two epitopes. This increases the likelihood of recovering all desired epitopes (Fig. 1b), which in turn increases the likelihood of them being loaded and presented on HLA-I molecules. If only a few epitopes are correctly processed and neoepitopes are formed (Fig. 1b), the influence of these neoepitopes on the immunological processes should be minimized, so that the risk of undesired immune responses is reduced. This can be achieved by designing the spacer sequences in such a way that the potential neoepitopes spanning the connected epitopes e _{ i }, e _{ j } and their spacer s _{ ij } are minimally immunogenic. To approach this problem computationally, proteasomal cleavage and immunogenicity prediction models are needed. In T-cell epitope prediction, proteasomal cleavage prediction was found to have a minor impact on prediction performance [18, 19]. However, in the context of in silico string-of-beads design, its impact is much more pronounced. Here, accurate cleavage prediction is important for predicting the recovery probabilities of the desired epitopes of the SBV, maximizing the individual cleavage probability by rearranging the order of the epitopes, and optimizing spacer sequences. These effects have been shown to be essential for a vaccine’s efficacy in several experimental studies [13–16].
In the following, we describe the prediction models used and derive the mathematical formulation to tackle the problem of designing an SBV with flexible spacer sequences. It should be mentioned that the developed framework is restricted to linear prediction methods. Nonlinear prediction models, like artificial neural networks (e.g., NetMHC [3]), or even more complex prediction approaches like the one proposed by Zhang et al. [19], would lead to a nonconvex, nonlinear mixed-integer optimization problem that cannot be solved efficiently and optimally even for small instances [20]. Furthermore, the linear prediction methods have to be fully integrated into the optimization framework to be able to solve the corresponding optimization problem efficiently. Integrated linear methods for epitope and cleavage prediction are listed in “Implementation”.
Cleavage site model
For cleavage site prediction, we employ the position-specific scoring matrix (PSSM) ϕ _{ C }(∙) proposed by Dönnes et al., which uses four C-terminal amino acids and two N-terminal amino acids to predict a cleavage site. It has been shown to give quite robust and generalizable predictions [18].
We define the cleavage objective of spacer s _{ ij } and epitope pair e _{ i }, e _{ j } as the linear combination of the individual cleavage likelihoods of site c _{ i } and c _{ j } predicted by the PSSM ϕ _{ C }:
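Consistent with the symbol definitions that follow, one way to write this objective is sketched below. The equal (unit) weights on the two terms, the zero-based window offset m, and the name C for the objective are assumptions made for illustration.

```latex
\[
C(e_i, s_{ij}, e_j) \;:=\; \sum_{m=0}^{n_c-1} \phi_C\bigl(S[i_c+m],\, m\bigr) \;+\; \sum_{m=0}^{n_c-1} \phi_C\bigl(S[j_c+m],\, m\bigr)
\]
```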
Here S := e _{ i } s _{ ij } e _{ j } denotes the concatenated sequence of a spacer and its enclosing epitope pair e _{ i } and e _{ j }. S[x] indicates the x-th character of sequence S, n _{ c } represents the number of amino acids used to predict a cleavage site, and i _{ c }, j _{ c } denote the start of the segments used to predict the cleavage likelihoods at site c _{ i } and c _{ j }, respectively. The PSSM ϕ _{ C } is a 20 × n _{ c } matrix, where each row represents an amino acid and each column the position within a sequence of length n _{ c }. The entry ϕ _{ C }(a, i) of an amino acid a at position i represents the influence of an amino acid at a particular position on the cleavage likelihood. Thus, the log-likelihood of being cleaved is obtained by summing over the entries of ϕ _{ C } for a given sequence of length n _{ c }.
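The window-sum scoring described above can be illustrated with a minimal Python sketch. The matrix `PHI_C` below holds random toy weights and is not the published PCM matrix; the function names and the placement of each window (four residues upstream and two downstream of the junction) are assumptions made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_C = 6       # window length n_c: four residues upstream and two downstream of the cut
UPSTREAM = 4  # assumed number of window residues that precede the cleavage site

# Toy stand-in for phi_C: random weights, NOT the published PCM matrix.
_rng = random.Random(0)
PHI_C = {a: [_rng.uniform(-1.0, 1.0) for _ in range(N_C)] for a in AMINO_ACIDS}


def window_score(seq, start):
    """Sum the PSSM entries over the n_c residues beginning at index `start`."""
    return sum(PHI_C[seq[start + m]][m] for m in range(N_C))


def cleavage_objective(e_i, spacer, e_j):
    """Sum of the window scores at the two junction sites c_i and c_j."""
    s = e_i + spacer + e_j
    i_c = len(e_i) - UPSTREAM                # window around the C-terminus of e_i
    j_c = len(e_i) + len(spacer) - UPSTREAM  # window around the N-terminus of e_j
    return window_score(s, i_c) + window_score(s, j_c)
```

For example, `cleavage_objective("IMDQVPFSV", "AAY", "YLEPGEVTV")` scores one epitope pair with the AAY spacer. With an empty spacer the two windows coincide under this placement, so the same window is counted twice.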
Immunogenicity model
Our immunogenicity model is based on the formulation proposed by Toussaint et al., which assumes that each epitope independently influences the immune response with respect to a target population or individual represented by a set of HLA alleles H [8]. The contribution of an HLA allele h ∈ H is directly proportional to the probability p _{ h } of the allele occurring within any patient of the target population H. We, thus, obtain
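A form consistent with the symbols defined in the next sentence is sketched below. The hinge max(0, ·) around the threshold τ _{ h } and the zero-based window index i are assumptions; only the weighting by p _{ h } and the sum over alleles and windows follow directly from the description.

```latex
\[
I(S) \;:=\; \sum_{h \in H} p_h \sum_{i=0}^{n-n_e} \max\Bigl(0,\; \phi_I\bigl(S[i \,..\, i+n_e-1],\, h\bigr) - \tau_h\Bigr)
\]
```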
where S is the input sequence of length n. ϕ _{ I }(∙) represents a linear model predicting the immunogenicity of an epitope of length n _{ e } for an HLA allele h ∈ H and τ _{ h } characterizes the threshold of the HLA allele. For the immunogenicity predictor, we use SYFPEITHI, a PSSM generated from naturally processed HLA ligands [2].
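As an illustration of how such a population-weighted score could be evaluated, the following Python sketch slides a window of length n _{ e } over a sequence. The scoring function `epitope_score` uses arbitrary toy weights and is not SYFPEITHI; the thresholding by max(0, score − τ _{ h }) is an assumption, as are all function and parameter names.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_E = 9  # epitope length n_e scored by the toy model


def epitope_score(peptide, allele):
    """Toy linear (PSSM-like) score with arbitrary integer weights; NOT SYFPEITHI."""
    salt = sum(ord(c) for c in allele)
    return sum((AMINO_ACIDS.index(a) * (pos + 1) + salt) % 7 - 3
               for pos, a in enumerate(peptide))


def immunogenicity(seq, allele_probs, thresholds):
    """Population-weighted sum of thresholded window scores over all n_e-mers of seq."""
    total = 0.0
    for allele, p_h in allele_probs.items():
        tau_h = thresholds[allele]
        for start in range(len(seq) - N_E + 1):
            score = epitope_score(seq[start:start + N_E], allele)
            # Assumed hinge: only scores above the allele threshold contribute.
            total += p_h * max(0.0, score - tau_h)
    return total
```

Here `allele_probs` maps each allele name to its probability p _{ h } and `thresholds` maps it to τ _{ h }; sequences shorter than n _{ e } contribute nothing.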
Problem definition as multiobjective optimization
From the discussion of the previous sections, it becomes apparent that for successfully designing a spacer sequence s _{ ij } for an epitope pair e _{ i }, e _{ j }, one has to consider multiple design goals. On the one hand, the spacer sequence should be designed to maximize the cleavage probabilities of the cleavage sites c _{ i } and c _{ j }. On the other hand, it should also minimize the neoimmunogenicity I(∙) of the complete sequence S := e _{ i } s _{ ij } e _{ j }. Such problems can be conveniently described as multiobjective optimization problems. Solving a multiobjective optimization problem yields Pareto-optimal solutions that represent trade-offs between all objective functions.
Most approaches for solving multiobjective optimization problems use scalarization techniques combining the different objectives [21]. A common approach linearly combines the objectives weighted by a coefficient reflecting the designers’ preferences. However, identifying the best weights is difficult because (a) the numerical properties of the objective functions usually differ and (b) the effect of the defined weights is hard to determine a priori.
Since our stated problem exhibits a clear ordering of the objectives with respect to their priority, namely junction-cleavage likelihood over neoimmunogenicity, the problem of finding a Pareto-optimal solution can be significantly simplified by applying lexicographical ordered optimization (LO). In LO, the objectives are ordered based on their importance and several single objective problems of the following form are iteratively solved:
where i represents the priority of the objective function, and f _{ j }(x _{ j } ^{*}) the optimum of the j-th objective function found at the j-th iteration [22]. Note that after the first iteration, f _{ j }(x _{ j } ^{*}) does not necessarily obtain the same solution as the independent optimization of f _{ j }(x), since new constraints have been added to the problem formulation.
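In the standard lexicographic form described by Marler and Arora [22], written here for minimization, the i-th problem reads as follows. The feasible set X and the number of objectives M are symbols introduced here for illustration.

```latex
\[
\min_{x \in X} \; f_i(x)
\quad \text{s.t.} \quad
f_j(x) \le f_j\bigl(x_j^{*}\bigr), \quad j = 1, \dots, i-1, \;\; i > 1,
\qquad i = 1, \dots, M
\]
```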
Spacer design with fixed length
We now formulate the problem of designing a spacer of fixed length k as a bi-objective mixed-integer linear program (ILP). We represent each position i and amino acid a of the concatenated sequence of spacer and epitope pairs with a binary decision variable x _{ i,a }. Additionally, we allow all 20 amino acids to appear within the spacer sequence. A constraint has to be added to allow only one amino acid per position. The complete Pareto formulation has, thus, the following form:
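A formulation along these lines can be sketched as below, under stated assumptions: the auxiliary variables y _{ i,h } linearize an assumed hinge max(0, score − τ _{ h }) in the immunogenicity term, ϕ _{ I }(h, a, m) denotes the per-position entry of the linear immunogenicity model for allele h, and positions are indexed from zero. At epitope positions the set of allowed amino acids contains only the residue of the epitope.

```latex
\[
\begin{aligned}
\max_{x} \quad & \sum_{m=0}^{n_c-1} \sum_{a \in S_{i_c+m}} \phi_C(a, m)\, x_{i_c+m,\,a}
  \;+\; \sum_{m=0}^{n_c-1} \sum_{a \in S_{j_c+m}} \phi_C(a, m)\, x_{j_c+m,\,a} \\
\min_{x,\,y} \quad & \sum_{h \in H} p_h \sum_{i=0}^{n-n_e} y_{i,h} \\
\text{s.t.} \quad & \sum_{a \in S_i} x_{i,a} = 1 && \forall\, i \in \{0, \dots, n-1\} \\
& y_{i,h} \ge \sum_{m=0}^{n_e-1} \sum_{a \in S_{i+m}} \phi_I(h, a, m)\, x_{i+m,\,a} - \tau_h && \forall\, i,\; \forall\, h \in H \\
& y_{i,h} \ge 0, \quad x_{i,a} \in \{0, 1\}
\end{aligned}
\]
```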
where S _{ i } denotes the set of amino acids allowed at position i.
Following the LO definition, we solve two consecutive ILPs to yield a lexicographically optimal solution:
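Written compactly, the two problems can be sketched as follows, where C(x) is the junction cleavage objective, I(x) the neoimmunogenicity, and X the set of binary assignments with exactly one amino acid per position. The symbols C^{*} and X are shorthand introduced here.

```latex
\[
\begin{aligned}
\text{(P1)} \quad & C^{*} = \max_{x \in X} \; C(x) \\
\text{(P2)} \quad & \min_{x \in X} \; I(x) \quad \text{s.t.} \quad C(x) \ge \alpha\, C^{*}
\end{aligned}
\]
```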
Here, we restrict P2 to obtain at least α ∈ [0, 1] fraction of the maximal cleavage score achieved by solving P1. α represents the tradeoff between cleavage likelihood and the likelihood of decreasing the immunogenicity score.
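The two-stage procedure can be illustrated without an ILP solver by exhaustively enumerating all spacers of a short fixed length. The sketch below swaps the ILP for brute-force enumeration and uses arbitrary toy scoring functions (not PCM or SYFPEITHI); the window placement and all names are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_cleavage(e_i, spacer, e_j):
    """Toy linear junction-cleavage score (arbitrary weights, not PCM)."""
    s = e_i + spacer + e_j
    starts = (len(e_i) - 4, len(e_i) + len(spacer) - 4)  # assumed window starts
    return sum(((AMINO_ACIDS.index(s[st + m]) * 7 + 3 * m) % 11 - 5) / 5.0
               for st in starts for m in range(6))


def toy_immunogenicity(seq, n_e=9, tau=4):
    """Toy neo-immunogenicity: thresholded 9-mer scores (arbitrary weights)."""
    total = 0.0
    for st in range(len(seq) - n_e + 1):
        score = sum((AMINO_ACIDS.index(seq[st + m]) * (m + 2)) % 5 - 2
                    for m in range(n_e))
        total += max(0, score - tau)
    return total


def design_spacer(e_i, e_j, k, alpha=0.99):
    """Lexicographic design of a length-k spacer by enumeration (feasible for small k)."""
    candidates = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Stage P1: maximize the junction cleavage score.
    cleavage = {s: toy_cleavage(e_i, s, e_j) for s in candidates}
    c_star = max(cleavage.values())
    # Stage P2: minimize immunogenicity among spacers close to the P1 optimum.
    # For c_star > 0 this bound equals alpha * c_star; the abs() form is an
    # assumption made here so that the bound stays feasible when c_star < 0.
    bound = c_star - (1.0 - alpha) * abs(c_star)
    feasible = [s for s in candidates if cleavage[s] >= bound]
    best = min(feasible,
               key=lambda s: (toy_immunogenicity(e_i + s + e_j), -cleavage[s]))
    return best, cleavage[best], toy_immunogenicity(e_i + best + e_j)
```

Calling `design_spacer("IMDQVPFSV", "YLEPGEVTV", 2, alpha=0.9)` enumerates 400 candidate spacers. Enumeration grows as 20^k, which is why an ILP formulation is preferable for longer spacers.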
String-of-beads design with spacers of flexible length
To design a complete string-of-beads with flexible spacer lengths, the introduced LO formulation is iteratively solved for each pair e _{ i }, e _{ j } ∈ E of epitopes with varying spacer length k ∈ {0, …, K}. The design with the highest minimum of both cleavage site likelihoods is selected and the scores obtained are used to initialize a fully connected and directed graph, where the negative cleavage scores represent the weights of the edges between epitope pairs. Following Toussaint et al., a TSP instance is formulated based on this graph by adding a node that represents the N- and C-termini of the SBV and connecting it with all other nodes with zero edge weights (Fig. 2). Solving this formulated TSP instance yields an optimal ordering of the epitopes. Together with the optimized spacers, we thus obtain an optimal sequence for the entire vaccine construct. The description of the algorithm in pseudocode can be found in Additional file 1.
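The selection and ordering steps can be sketched in Python as follows. This is a brute-force illustration for small epitope sets, not the Lin–Kernighan–Helsgaun heuristic or an ILP; the function names and the input layout (a matrix `weights[i][j]` holding the negative cleavage score of the best spacer for the ordered pair i, j) are assumptions.

```python
from itertools import permutations


def pick_spacer(candidates):
    """From (spacer, score_ci, score_cj) tuples, pick the one whose weaker
    junction score, min(score_ci, score_cj), is highest."""
    return max(candidates, key=lambda c: min(c[1], c[2]))


def order_epitopes(weights):
    """Solve the TSP with an added zero-weight terminus node by enumeration.

    weights[i][j] is the edge weight for placing epitope j directly after
    epitope i. Returns the best epitope order and its total cost.
    """
    n = len(weights)
    dummy = n  # extra node for the N-/C-termini; all of its edges weigh zero
    best_order, best_cost = None, float("inf")
    for order in permutations(range(n)):
        tour = (dummy,) + order + (dummy,)
        cost = sum(0 if dummy in (a, b) else weights[a][b]
                   for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost
```

Because the terminus node contributes zero weight, the optimal tour corresponds to the cheapest directed path through all epitopes. Enumeration is only practical for a handful of epitopes, which is why a heuristic or exact TSP solver is needed for larger sets.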
Nonjunction cleavage site minimization
Besides the maximization of the junction cleavage likelihood, minimizing the likelihood of being cleaved at any other position will also improve the recovery probability of the therapeutic epitopes. Nonjunction cleavage sites are partly influenced by the length of the spacer sequence and the epitope pairing. Therefore, we treat the minimization of nonjunction cleavage sites as an optional third design goal and add to the sequence of consecutively solved ILPs a third optimization problem of the form:
Here again, α and β represent the tradeoffs between the three objective functions. The influence of α and β on cleavage likelihood, neoimmunogenicity, and nonjunction cleavage likelihood is depicted in Additional file 2.
Implementation
To solve the problem efficiently, the spacer design was parallelized and the TSP solution was approximated using the Lin–Kernighan–Helsgaun heuristic [23]. The model was implemented in Python 2.7 using Pyomo 4.0 [24] and solved with ILOG CPLEX 12.5 (www.ilog.com). The complete framework was integrated into EpiToolKit, a web-based platform for rational vaccine design. It can be accessed at www.epitoolkit.de under Spacer Design [25]. The source code and example files can be found at https://github.com/FRED2/OptiVac. The implementations currently support SYFPEITHI [2], BIMAS [26], SMM [27], and SMM-PMBEC [28] for epitope prediction, and PCM [18] and ProteaSMM [29] for proteasomal cleavage prediction. The statistical analysis was conducted using R (www.r-project.org). Statistical significance was considered at a significance level of 0.05. Data used in the statistical analysis can be found in Additional files 3 and 4.
Results
Designed spacers increase cleavage likelihood and decrease neoimmunogenicity
To validate the model performance, 1000 random epitope pairs, predicted for proteins of the cytomegalovirus strain AD169 (UniProt Proteome ID UP000008991), were generated and spacers of length 1–6 designed and optimized for the HLA distribution of the European population using α = 0.99. The fold change in cleavage likelihood as well as neoimmunogenicity were compared for concatenated epitopes without spacers, with a commonly used fixed spacer (AAY) [16, 30, 31], and with optimally determined spacers (Fig. 3).
For each spacer length, a significant increase in cleavage likelihood could be observed for epitope pairs with optimized spacers compared to epitope pairs without spacers (paired one-sided Wilcoxon rank-sum test, Bonferroni corrected). In addition, the optimized spacers outperformed the constructs with a fixed spacer after a length of two (paired one-sided Wilcoxon rank-sum test, Bonferroni corrected). The maximum increase in cleavage likelihood was achieved with a spacer length of four, which is not surprising since the applied cleavage model uses four C-terminal and two N-terminal amino acids to predict a cleavage site. The use of optimal spacer sequences resulted in a 7.7-fold increase in cleavage likelihood compared to epitope pairs without spacer sequences and a twofold increase compared to epitope pairs with a fixed AAY spacer.
In addition, significant improvements could be observed in terms of reduced neoimmunogenicity when using optimized spacers compared to both designs with fixed spacers and without spacers (paired one-sided Wilcoxon rank-sum test, Bonferroni corrected). With increasing spacer length, the immunogenicity decreased when using optimal spacer sequences. An average neoimmunogenicity reduction of 1.9-fold and 2.7-fold could be achieved at a spacer length of four compared to epitope pairs without spacers and fixed spacers, respectively. Detailed results can be found in Additional file 3.
String-of-beads designs with optimal spacers improve epitope recovery
A pool of epitopes was produced. The epitopes were predicted to bind to at least one HLA allele present in a European population. Out of this pool, random sets of size l ∈ {3, 5, 10, 15, 20, 25, 30} were selected. The optimal ordering was determined for the string-of-beads construct without (SBV) and with spacer sequences (SBV_{spacer}) for a maximum spacer length of k = 6 amino acids. Additionally, ten randomly ordered strings-of-beads with fixed AAY spacers (SBV_{AAY}) for the given epitope set were generated. This procedure was repeated 50 times for each set size. The junction cleavage likelihood averaged over the number of arising junction sites, the fraction of recovered epitopes (i.e., epitopes with preceding and succeeding C-terminal cleavage sites with positive cleavage score), as well as the neoimmunogenicity of the complete construct normalized by the number of included epitopes were compared between the strings-of-beads with a spacer, without spacer sequences, and the average performance of the random constructs with fixed spacers (Fig. 4).
The average junction cleavage scores of SBV_{spacer} and SBV_{AAY} were stable and well above the cleavage threshold of 0.0 for all set sizes, with an average score of 1.74 ± 0.63 and 0.73 ± 0.53, respectively. The average junction cleavage score for SBV decreased with increasing set sizes and was below the cleavage threshold even for small set sizes with an average score of −0.85 ± 1.09. This was also reflected in the percentage of recovered epitopes. SBV exhibited a decreasing recovery with increasing set sizes with an average of 15.4 ± 24.3 %, while SBV_{spacer} and SBV_{AAY} achieved a stable average recovery of 78.3 ± 16.2 % and 62.7 ± 15.2 % corresponding to a fivefold and fourfold increase, respectively. SBV_{spacer} also consistently outperformed SBV_{AAY}, both in cleavage likelihood (2.38-fold increase) and recovery rate (1.25-fold increase).
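The reported fold changes follow directly from the averages quoted above, as the short arithmetic check below shows. The dictionary keys and the helper `fold_change` are names chosen here for illustration; only the mean values are taken from the text.

```python
# Mean values as reported in the text (standard deviations omitted).
recovery_pct = {"SBV": 15.4, "SBV_spacer": 78.3, "SBV_AAY": 62.7}
junction_score = {"SBV_spacer": 1.74, "SBV_AAY": 0.73}


def fold_change(value, baseline):
    """Ratio of a value to its baseline."""
    return value / baseline


# Recovery relative to the spacer-less design.
spacer_vs_none = fold_change(recovery_pct["SBV_spacer"], recovery_pct["SBV"])
aay_vs_none = fold_change(recovery_pct["SBV_AAY"], recovery_pct["SBV"])

# Optimized spacers relative to fixed AAY spacers.
cleavage_vs_aay = fold_change(junction_score["SBV_spacer"], junction_score["SBV_AAY"])
recovery_vs_aay = fold_change(recovery_pct["SBV_spacer"], recovery_pct["SBV_AAY"])
```

The first two ratios are roughly 5.1 and 4.1, matching the stated fivefold and fourfold increases, and the last two round to 2.38 and 1.25.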
The differences in neoimmunogenicity were not as strong, which is expected due to the chosen value of α. SBV_{spacer} consistently achieved a lower neoimmunogenicity score (average 1.88 ± 0.59) than SBV (average 3.37 ± 0.93) and SBV_{AAY} (average 4.31 ± 0.99), resulting in a decrease of 44.2 % and 56.8 %, respectively.
The optimal spacer length averaged at 3.23 ± 0.50 amino acids. The run time for instances with 30 epitopes was 5 min on average (maximum 5.6 min) on current commodity hardware (12-core Intel Xeon E5-2620 running at 2 GHz). Detailed results can be found in Additional file 4.
Commonly used spacer designs tend to be worse than optimal designs
Several spacer sequences have been proposed in various settings ranging from a prophylactic vaccine to therapeutic cancer vaccine studies [15, 16, 30, 32–34]. However, these spacer sequences are not universally applicable and their usefulness is dependent on the epitope pairs they connect. To show the potential efficacy of the proposed model, we compared multiepitope studies that used spacers with our in silico designed spacers in terms of epitope recovery and induced neoepitopes. An epitope was considered recovered if its preceding and succeeding cleavage sites were likely to be cleaved, as predicted by PCM (i.e. PCM score > 0.0). Neoepitope prediction was performed with SYFPEITHI using the default threshold (i.e. SYFPEITHI score ≥ 20). Additionally, we computed the optimal ordering and selection of the experimental spacers similar to the approach in [35].
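The two evaluation criteria can be expressed as a small Python sketch. It assumes that the cleavage scores flanking each epitope and the binding scores of junction-spanning windows have already been computed by an external predictor; the function names and input layout are hypothetical.

```python
def recovery_fraction(site_scores, cleavage_threshold=0.0):
    """Fraction of epitopes whose preceding and succeeding cleavage-site
    scores both exceed the threshold (PCM score > 0.0 in the text).

    site_scores: list of (preceding_score, succeeding_score), one per epitope.
    """
    if not site_scores:
        return 0.0
    recovered = sum(1 for pre, post in site_scores
                    if pre > cleavage_threshold and post > cleavage_threshold)
    return recovered / len(site_scores)


def count_neoepitopes(window_scores, binding_threshold=20):
    """Number of junction-spanning windows scoring at or above the binding
    threshold (SYFPEITHI score >= 20 in the text)."""
    return sum(1 for score in window_scores if score >= binding_threshold)
```

Note the asymmetry between the criteria: cleavage uses a strict inequality, whereas the binding threshold is inclusive.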
Levy et al. proposed a therapeutic multiepitope polypeptide consisting of HLA-A*02:01 restricted modified epitopes derived from different melanoma-associated antigens (gp100:209–217(210M): IMDQVPFSV; gp100:280–288(288V): YLEPGEVTV; Mart-1:27–35(27L): LAGIGILTV; tyrosinase:368–376(370D): YMDGTMSQV) and showed the proteasome-dependent efficacy in vitro using the peripheral blood mononuclear cells of healthy donors and patients undergoing treatment [30]. To combine the selected peptides, a natively derived spacer sequence (RKSY(L)) as well as experimentally derived spacers (AAY and ALL/SSL) were used. The selected epitopes were included multiple times in the polypeptide combined with the different spacers to maximize the recovery probability. Therefore, we compared the different segments of the vaccine that were connected with the same spacer sequences (Fig. 5). Detailed results of the neoepitope and cleavage site predictions can be found in Additional file 5.
In general, the optimal SBV design outperformed the experimentally used spacer sequences both in terms of therapeutic epitope recovery and in reduced neoepitope appearance. With the designed spacers, 100 % of therapeutic epitopes could be recovered without generating neoepitopes spanning the spacer sequences. The experimentally used spacers, on the other hand, either generated neoepitopes or were not able to recover an essential amount of the therapeutic epitopes. With the spacer RKSY(L), only one out of four epitopes could be recovered, and ALL induced five neoepitopes spanning the spacer. The Mart-1-derived epitope and the combination of SLL and AAY generated neoepitopes and resulted in the recovery of one out of four epitopes only. Even the design with optimally ordered epitopes and selected experimental spacer sequences could not recover all epitopes and introduced neoepitopes. To establish the effect of different (linear) epitope prediction methods, the comparison was repeated with different methods (BIMAS [26] and SMM [27]). The recovery analysis was again performed with PCM, and default thresholds for BIMAS (predicted T _{1/2} ≥ 100) and SMM (predicted IC_{50} ≤ 500 nM) were used for neoepitope detection. All therapeutic epitopes could be recovered using the in silico designed spacers with a smaller or equal number of neoepitopes compared to the best experimentally used spacer sequence. While there are differences in detail between the methods, their overall behavior remained the same. Differences can be attributed to variations in the prediction accuracy of the methods (Additional files 5 and 6).
Similar results could be observed for the SBV construct proposed by Ding et al. [15] (Additional files 7 and 8). The proposed SBV was composed of T-cell epitopes derived from the hepatitis B virus X protein, which were combined with different spacer sequences to reduce the number of junction neoepitopes. With the in silico designed spacer sequences, all therapeutic epitopes could be recovered without introducing neoepitopes, whereas the experimentally used spacers induced neoepitopes and were not able to recover all therapeutic epitopes.
Conclusion
In this work, we propose a mathematical model for designing spacer sequences of flexible length for SBVs by exploiting existing proteasomal cleavage and epitope prediction methods. We combined the model with a TSP approach for optimal epitope ordering. We also addressed the problem of neoepitopes and nonjunction cleavage sites arising from spacer sequences and the order of the epitopes within the string-of-beads by extending the formulation with two additional objective functions. To solve the multiobjective optimization problem efficiently, we employ lexicographical optimization techniques.
The efficacy of the model was shown by comparing the recovery rates and neoimmunogenicity of optimal designs with commonly used fixed spacer sequences and spacerless designs. In each case, the optimal design led to increased predicted epitope recovery and reduced generation of neoantigens.
We also compared experimentally tested string-of-beads designs that used spacer sequences with our optimized designs. The experimentally used spacer sequences were often suboptimally chosen for the connecting epitopes. As a consequence, there were neoepitopes spanning the spacer sequences or proteasomal cleavage could not be guided to cleave the therapeutic epitopes correctly. In contrast, the in silico designed string-of-beads with optimally determined spacers showed improved cleavage patterns and reduced neoimmunogenicity. Often all therapeutic epitopes could be correctly cleaved without introducing neoepitopes.
An obvious limitation of the current method is its reliance on computational models for proteasomal cleavage and epitope prediction. While models for HLA class I binding prediction exhibit a high accuracy, proteasomal cleavage models still leave room for improvements [36]. Currently, the approach is restricted to HLA class I epitopes but could be effortlessly extended once a cleavage prediction method for HLA-II ligands becomes available. In addition, the framework is designed flexibly enough to replace the underlying proteasomal cleavage prediction method, once more reliable computational prediction models are published. An experimental validation of selected optimal spacer designs is a nontrivial task. It cannot be performed as exhaustively as our computational study: the number of possible designs is simply too large. An experimental validation will thus, most likely, be limited to comparing only a few selected optimal designs to fixed spacer or spacerless designs. Such validation is planned as future work together with experimental partners.
In conclusion, our method is a first framework that optimally designs both epitope order and spacers for SBV design. The mathematical method employs state-of-the-art prediction methods, but does not depend on specific methods. Our model predicts an increased recovery of desired epitopes and a reduced production of neoepitopes compared to both fixed spacer and spacerless designs.
Abbreviations
EV: epitope-based vaccine
HLA: human leucocyte antigen
ILP: integer linear program
LO: lexicographical ordered optimization
PSSM: position-specific scoring matrix
SBV: string-of-beads vaccine
TSP: traveling salesperson problem
References
 1.
Purcell AW, McCluskey J, Rossjohn J. More than one reason to rethink the use of peptides in vaccine design. Nat Rev Drug Discov. 2007;6(5):404–14.
 2.
Rammensee HG, Bachmann J, Emmerich NPN, Bachor OA, Stevanović S. SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics. 1999;50(3–4):213–19.
 3.
Lundegaard C, Lamberth K, Harndahl M, Buus S, Lund O, Nielsen M. NetMHC3.0: accurate web accessible predictions of human, mouse and monkey MHC class I affinities for peptides of length 8–11. Nucleic Acids Res. 2008;36 suppl 2:W509–12.
 4.
Dönnes P, Elofsson A. Prediction of MHC class I binding peptides, using SVMHC. BMC Bioinform. 2002;3(1):25.
 5.
Singh H, Raghava G. ProPred: prediction of HLADR binding sites. Bioinformatics. 2001;17(12):1236–7.
 6.
ViderShalit T, Raffaeli S, Louzoun Y. Virusepitope vaccine design: informatic matching the HLAI polymorphism to the virus genome. Mol Immunol. 2007;44(6):1253–61.
 7.
Toussaint NC, Maman Y, Kohlbacher O, Louzoun Y. Universal peptide vaccines – optimal peptide vaccine design based on viral sequence conservation. Vaccine. 2011;29(47):8745–53.
 8.
Toussaint NC, Dönnes P, Kohlbacher O. A mathematical framework for the selection of an optimal set of peptides for epitopebased vaccines. PLoS Comput Biol. 2008;4(12):e1000246.
 9.
Lundegaard C, Buggert M, Karlsson A, Lund O, Perez C, Nielsen M, editors. PopCover: a method for selecting of peptides with optimal population and pathogen coverage. Proceedings of the 1st ACM International Conference on Bioinformatics and Computational Biology; 2010. ACM.
 10.
Fischer W, Perkins S, Theiler J, Bhattacharya T, Yusim K, Funkhouser R, et al. Polyvalent vaccines for optimal coverage of potential Tcell epitopes in global HIV1 variants. Nat Med. 2007;13(1):100–6.
 11.
Toussaint NC, Kohlbacher O. OptiTope – a web server for the selection of an optimal set of peptides for epitopebased vaccines. Nucleic Acids Res. 2009;37 suppl 2:W617–22.
 12.
Sette A, Fikes J. Epitopebased vaccines: an update on epitope identification, vaccine design and delivery. Curr Opin Immunol. 2003;15(4):461–70.
 13.
Cornet S, Miconnet I, Menez J, Lemonnier F, Kosmatopoulos K. Optimal organization of a polypeptidebased candidate cancer vaccine composed of cryptic tumor peptides with enhanced immunogenicity. Vaccine. 2006;24(12):2102–9.
 14.
Livingston BD, Newman M, Crimi C, McKinney D, Chesnut R, Sette A. Optimization of epitope processing enhances immunogenicity of multiepitope DNA vaccines. Vaccine. 2001;19(32):4652–60.
 15.
Ding FX, Wang F, Lu YM, Li K, Wang KH, He XW, et al. Multiepitope peptide‐loaded virus‐like particles as a vaccine against hepatitis B virus–related hepatocellular carcinoma. Hepatology. 2009;49(5):1492–502.
 16.
Velders MP, Weijzen S, Eiben GL, Elmishad AG, Kloetzel PM, Higgins T, et al. Defined flanking spacers and enhanced proteolysis is essential for eradication of established tumors by an epitope string DNA vaccine. J Immunol. 2001;166(9):5366–73.
 17.
Kreiter S, Vormehr M, van de Roemer N, Diken M, Löwer M, Diekmann J, et al. Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature. 2015;520(7549):692–6.
 18.
Dönnes P, Kohlbacher O. Integrated modeling of the major events in the MHC class I antigen processing pathway. Protein Sci. 2005;14(8):2132–40.
 19.
Zhang W, Niu Y, Zou H, Luo L, Liu Q, Wu W. Accurate prediction of immunogenic T-cell epitopes from epitope sequences using the genetic algorithm-based ensemble learning. PLoS ONE. 2014;10(5):e0128194.
 20.
Hemmecke R, Köppe M, Lee J, Weismantel R. Nonlinear integer programming. 50 years of integer programming 1958–2008. Berlin Heidelberg: Springer; 2010. p. 561–618.
 21.
Ehrgott M. A discussion of scalarization techniques for multiple objective integer programming. Ann Oper Res. 2006;147(1):343–60.
 22.
Marler RT, Arora JS. Survey of multiobjective optimization methods for engineering. Struct Multidiscip Optim. 2004;26(6):369–95.
 23.
Helsgaun K. General k-opt submoves for the Lin–Kernighan TSP heuristic. Math Program Comput. 2009;1(2–3):119–63.
 24.
Hart WE, Watson JP, Woodruff DL. Pyomo: modeling and solving mathematical programs in Python. Math Program Comput. 2011;3(3):219–60.
 25.
Schubert B, Brachvogel HP, Jürges C, Kohlbacher O. EpiToolKit – a web-based workbench for vaccine design. Bioinformatics. 2015;31(13):2211–3. doi:10.1093/bioinformatics/btv116.
 26.
Parker KC, Bednarek MA, Coligan JE. Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains. J Immunol. 1994;152(1):163–75.
 27.
Peters B, Sette A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. BMC Bioinform. 2005;6(1):132.
 28.
Kim Y, Sidney J, Pinilla C, Sette A, Peters B. Derivation of an amino acid similarity matrix for peptide: MHC binding and its application as a Bayesian prior. BMC Bioinform. 2009;10(1):394.
 29.
Tenzer S, Peters B, Bulik S, Schoor O, Lemmel C, Schatz M, et al. Modeling the MHC class I pathway by combining predictions of proteasomal cleavage, TAP transport and MHC class I binding. Cell Mol Life Sci. 2005;62(9):1025–37.
 30.
Levy A, Pitcovski J, Frankenburg S, Elias O, Altuvia Y, Margalit H, et al. A melanoma multiepitope polypeptide induces specific CD8+ T-cell response. Cell Immunol. 2007;250(1):24–30.
 31.
Aurisicchio L, Fridman A, Bagchi A, Scarselli E, La Monica N, Ciliberto G. A novel minigene scaffold for therapeutic cancer vaccines. Oncoimmunology. 2014;3(1):e27529.
 32.
Bazhan S, Karpenko L, Ilyicheva T, Belavin P, Seregin S, Danilyuk N, et al. Rational design based synthetic polyepitope DNA vaccine for eliciting HIV-specific CD8+ T cell responses. Mol Immunol. 2010;47(7):1507–15.
 33.
Moss SF, Moise L, Lee DS, Kim W, Zhang S, Lee J, et al. HelicoVax: epitope-based therapeutic Helicobacter pylori vaccination in a mouse model. Vaccine. 2011;29(11):2085–91.
 34.
Depla E, Van der Aa A, Livingston BD, Crimi C, Allosery K, De Brabandere V, et al. Rational design of a multiepitope vaccine encoding T-lymphocyte epitopes for treatment of chronic hepatitis B virus infections. J Virol. 2008;82(1):435–50.
 35.
Seyed N, Taheri T, Vauchy C, Dosset M, Godet Y, Eslamifar A, et al. Immunogenicity evaluation of a rationally designed polytope construct encoding HLA-A*0201 restricted epitopes derived from Leishmania major related proteins in HLA-A2/DR1 transgenic mice: steps toward polytope vaccine. PLoS ONE. 2014;9(10):e108848. doi:10.1371/journal.pone.0108848.
 36.
Calis JJ, Reinink P, Keller C, Kloetzel PM, Keşmir C. Role of peptide processing predictions in T cell epitope identification: contribution of different prediction programs. Immunogenetics. 2014;67(2):85–93.
Acknowledgements
This project received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement 633592 (APERIM). OK acknowledges funding from the Deutsche Forschungsgemeinschaft (SFB685/B1).
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
BS developed and implemented the method. BS and OK wrote the paper. OK designed the study. Both authors read and approved the final manuscript.
Additional files
Additional file 1:
Algorithm for string-of-beads design with flexible spacer sequences. A pseudocode description of the algorithm that determines the optimal ordering of epitopes and spacers for a string-of-beads vaccine. (PDF 1056 kb)
Additional file 2:
Influence of α and β on cleavage likelihood, neoimmunogenicity, and nonjunction cleavage likelihood exemplified for spacers of length three. Cleavage likelihood and neoimmunogenicity decrease linearly with α. For the conservatively chosen α = 0.99, β influences neoimmunogenicity only marginally. Once α is further decreased, β influences neoimmunogenicity in a nonlinear manner. Similar behavior can be seen for the nonjunction cleavage likelihood. It decreases linearly with α and nonlinearly with β. (PDF 51 kb)
Additional file 3:
Detailed results for comparing epitope pairs with and without spacers. Detailed results for the comparison of epitope pairs with spacers and without spacers including sequences of the paired epitopes and designed spacers, predicted cleavage likelihoods of the two induced cleavage sites, and the combined cleavage likelihood, as well as the neoimmunogenicity of the epitope pair–spacer construct. (XLS 1236 kb)
Additional file 4:
Detailed results for comparing string-of-beads vaccines of different lengths with and without spacers. Detailed results for the comparison of string-of-beads constructs with and without spacers, including the string-of-beads sequences, predicted cleavage likelihoods, neoimmunogenicity, number of neoepitopes, and the recovery rate of the desired epitopes. (XLS 299 kb)
Additional file 5:
Detailed prediction results for the polypeptide proposed by Levy et al. Detailed results of the neoepitope and cleavage site prediction analysis performed with PCM for cleavage site prediction, and with SYFPEITHI, BIMAS, and SMM for neoepitope prediction for the polypeptide of Levy et al. (XLS 76 kb)
Additional file 6:
Comparison of different epitope prediction methods for in silico spacer design based on the polypeptide proposed by Levy et al. Spacer sequences were constructed with SYFPEITHI, BIMAS, and SMM. Cleavage prediction was performed with PCM, classifying a site as cleaved if its score was greater than zero. The epitope thresholds used for neoepitope detection were SYFPEITHI score ≥ 20, BIMAS T_{1/2} ≥ 100, and SMM ≤ 500 nM. Red bars represent predicted epitopes and the intensity indicates overlapping epitopes at that position. The blue rectangles represent predicted C-terminal cleavage sites. Spacer sequences are marked in red. A tick indicates the start position of a predicted nine-mer epitope. Although the different prediction methods yielded different spacer sequences, the overall result remained the same: the in silico designed spacers were superior in terms of recovered epitopes and neoepitope formation. (PDF 1198 kb)
Additional file 7:
Comparison of experimentally used and in silico designed spacers based on the polypeptide proposed by Ding et al. Red bars represent predicted epitopes and the intensity indicates overlapping epitopes at that position. The blue rectangles represent predicted C-terminal cleavage sites. Spacer sequences are marked in red. A tick indicates the start position of a predicted nine-mer epitope. Epitope and cleavage site prediction were performed with SYFPEITHI and PCM, respectively. A nine-mer was predicted as an epitope if its predicted score was equal to or above a threshold of 20 (default threshold of SYFPEITHI). A cleavage site was considered cleaved if the predicted PCM score was above zero. An epitope was defined as recovered if both the preceding and succeeding cleavage sites were predicted to be cleaved. (PDF 581 kb)
Additional file 8:
Detailed prediction results of the polypeptide proposed by Ding et al. Detailed results of the neoepitope and cleavage site prediction analysis performed with SYFPEITHI and PCM on the polypeptide of Ding et al. (XLS 43 kb)
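The evaluation rules stated for Additional files 6 and 7 (a site counts as cleaved if its PCM score is above zero, a nine-mer counts as an epitope if its SYFPEITHI score is at least 20, and an epitope counts as recovered if both flanking sites are cleaved) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the boundary-indexed score dictionary, the toy scoring function, and the example sequence are all hypothetical, and treating the two termini of the polypeptide as already free ends is an assumption the source does not state.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # half-open (start, end) residue indices into the polypeptide


def is_cleaved(boundary: int, length: int, scores: Dict[int, float],
               threshold: float = 0.0) -> bool:
    """Boundary b lies between residues b-1 and b.

    Assumption: the polypeptide termini (0 and length) count as free ends.
    Boundaries without a predicted score are treated as not cleaved.
    """
    if boundary in (0, length):
        return True
    return scores.get(boundary, float("-inf")) > threshold


def recovery_rate(spans: Sequence[Span], length: int,
                  scores: Dict[int, float]) -> float:
    """Fraction of intended epitopes whose two flanking sites are both cleaved."""
    if not spans:
        return 0.0
    recovered = [
        (start, end) for start, end in spans
        if is_cleaved(start, length, scores) and is_cleaved(end, length, scores)
    ]
    return len(recovered) / len(spans)


def neoepitope_starts(seq: str, spans: Sequence[Span],
                      score: Callable[[str], float],
                      threshold: float = 20.0, k: int = 9) -> List[int]:
    """Start positions of k-mers at or above threshold that are not intended epitopes."""
    intended = set(spans)
    return [
        i for i in range(len(seq) - k + 1)
        if (i, i + k) not in intended and score(seq[i:i + k]) >= threshold
    ]


# Toy data (hypothetical): two nine-mers joined by a three-residue spacer.
seq = "ACDEFGHIK" + "HHA" + "LMNPQRSTV"
spans = [(0, 9), (12, 21)]
scores = {9: 1.2, 12: -0.4}  # stand-ins for per-boundary cleavage scores


def toy_score(peptide: str) -> float:
    """Stand-in for an epitope predictor; not a real scoring matrix."""
    return 25.0 if peptide[-1] in "AV" else 0.0


print(recovery_rate(spans, len(seq), scores))
print(neoepitope_starts(seq, spans, toy_score))
```

For a predictor where lower values indicate stronger binding, such as the SMM threshold of ≤ 500 nM, the comparison in `neoepitope_starts` would need to be inverted.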
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Keywords
 Space Sequence
 Space Length
 Human Leucocyte Antigen
 Epitope Prediction
 Travel Salesperson Problem