Table 2 Overview of all datasets used in the analysis

From: Diagnostic Evidence GAuge of Single cells (DEGAS): a flexible deep transfer learning framework for prioritizing cells in relation to disease

| Study | Dataset | Sample size | Data type | Attribute |
| --- | --- | --- | --- | --- |
| Simulation | Simulated cellsᵃ | 5000 cells | scRNA-seq | Cell type |
| Simulation | Simulated patientsᵃ | 600 patients | RNA-seq | Disease status |
| Glioblastoma | Patel et al., 2014 [33] | 532 cells (5 patients) | scRNA-seq (SMART-seq) | None |
| Glioblastoma | TCGA GBM [34] | 111 patients | Microarray | GBM subtype |
| Alzheimer’s disease | AIBS | 47,396 cells (11 patients) | scRNA-seq (SMART-seq) | Brain cell types |
| Alzheimer’s disease | Grubman et al., 2019 [36] | 13,214 cells (12 patients) | snRNA-seq (10x Genomics) | AD and normal brain cell types |
| Alzheimer’s disease | Mathys et al., 2019 [15] | 5288 cellsᵇ (48 patients) | snRNA-seq (10x Genomics) | AD and normal brain cell types |
| Alzheimer’s disease | MSBB [35] | 682 samples (221 patients) | RNA-seq | AD diagnosis |
| Multiple myeloma | MMRF [47] | 647 patients | RNA-seq | PFS |
| Multiple myeloma | IUSM, Chen et al. 2021 [46] | 22,968 cells (4 patients) | scRNA-seq (10x Genomics) | Subtype cluster (Subtype 1-5) |
| Multiple myeloma | Ledergor et al., 2019 [45] | 13,440 cells (35 patients) | scRNA-seq (MARS-seq) | Malignancy (NHIP, MGUS, SMM, MM) |
| Multiple myeloma | Zhan et al., 2006 [44] | 559 patients | Microarray | OS |

ᵃ The simulated patients were generated from the simulated cells by combining known proportions of cell types. “None” denotes the lack of labels for the cells/samples in a given dataset. ᵇ Cells were down-sampled from the total number of cells because some cell types were over-represented. Abbreviations: The Cancer Genome Atlas (TCGA), Glioblastoma Multiforme (GBM), Allen Institute for Brain Science (AIBS), Mount Sinai/JJ Peters VA Medical Center Brain Bank (MSBB), Multiple Myeloma Research Foundation (MMRF), Indiana University School of Medicine (IUSM), Alzheimer’s disease (AD), progression-free survival (PFS), overall survival (OS), normal hip (NHIP), monoclonal gammopathy of undetermined significance (MGUS), smoldering multiple myeloma (SMM), multiple myeloma (MM), RNA sequencing (RNA-seq), single-cell RNA-seq (scRNA-seq), and single-nucleus RNA-seq (snRNA-seq).
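The footnote says the simulated patients were built from the simulated cells by combining known proportions of cell types. One way this kind of mixing could look is sketched below. It is an illustration only, not the authors' procedure: the function name `simulate_patients`, its parameters, the multinomial sampling step, and the use of a per-gene mean are all assumptions introduced here.

```python
import numpy as np


def simulate_patients(cell_expr, cell_types, proportions,
                      cells_per_patient=100, seed=0):
    """Build one bulk-like profile per row of `proportions` (hypothetical helper).

    cell_expr:   (n_cells, n_genes) expression matrix
    cell_types:  (n_cells,) integer cell-type label per cell, 0..n_types-1
    proportions: (n_patients, n_types) known mixing proportions, rows sum to 1
    """
    rng = np.random.default_rng(seed)
    n_types = proportions.shape[1]
    # Index the cells of each type once so sampling stays cheap.
    by_type = [np.flatnonzero(cell_types == t) for t in range(n_types)]
    patients = np.empty((proportions.shape[0], cell_expr.shape[1]))
    for i, p in enumerate(proportions):
        # Assumed step: draw how many cells of each type this patient gets.
        counts = rng.multinomial(cells_per_patient, p)
        picked = np.concatenate([
            rng.choice(by_type[t], size=c, replace=True)
            for t, c in enumerate(counts)
        ])
        # Assumed step: average the sampled cells into one profile.
        patients[i] = cell_expr[picked].mean(axis=0)
    return patients
```

A patient whose proportions put all weight on one cell type ends up with the average profile of cells sampled from that type, which gives a simple sanity check for the mixing step.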