Table 1 Summary of binding affinity (“Binders”) and eluted ligand (“Ligands”) data sets used in this work

From: Footprints of antigen processing boost MHC class II natural ligand predictions

| Reference | Source | Allele | L11–19 |
| --- | --- | --- | --- |
| DR1 BA | Jensen et al. [36] | DRB1*01:01 | 9987 |
| DR15 BA | Jensen et al. [36] | DRB1*15:01 | 4466 |
| DR51 BA | Jensen et al. [36] | DRB5*01:01 | 4840 |

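| Reference | Source | Allele | Cell | Unique | GC | L11–19 | Random |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DR1 Ph | Ooi et al. [26] | DRB1*01:01 | Human | 5131 | 4786 | 3992 | 38115 |
| DR1 Pm | Ooi et al. [26] | DRB1*01:01 | Mouse | 5744 | 5561 | 5385 | 55710 |
| DR1 Sm | Clement et al. [24] | DRB1*01:01 | Mouse | 3216 | 3112 | 2963 | 30510 |
| DR15 Ph | Ooi et al. [26] | DRB1*15:01 | Human | 2782 | 1590 | 1390 | 12870 |
| DR51 Ph | Ooi et al. [26] | DRB5*01:01 | Human | | 1087 | 989 | 9315 |
| DR15 Pm | Ooi et al. [26] | DRB1*15:01 | Mouse | 4810 | 4486 | 4229 | 42030 |
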
Binders (upper table): data set reference name (“Reference”), data source (“Source”), MHC restriction (“Allele”), and number of sequences in the length range of 11 to 19 amino acids (“L11–19”). Ligands (lower table): data set reference name (“Reference”), data source (“Source”), MHC restriction (“Allele”), cell line species (“Cell”), number of unique sequences present in the data set before filtering (“Unique”) and after filtering with GibbsCluster (“GC”), number of sequences in the 11–19mer range (“L11–19”), and number of random negative sequences added for training (“Random”). Note that the split of the Ooi et al. human data (DR15 Ph/DR51 Ph) was made using GibbsCluster as described in the text.