Fig. 2 | Genome Medicine

Fig. 2

From: Ontology-aware deep learning enables ultrafast and interpretable source tracking among sub-million microbial community samples from hundreds of niches

ONN4MST’s prediction accuracies are among the best across different datasets and different biome layers, and its performance does not depend heavily on the number of biomes or the number of samples in the dataset. a The five datasets (Combined, Human, Water, Soil, and FEAST), which vary in complexity, provide source tracking tasks of different difficulty. The complexity of a dataset is positively associated with its number of biomes and its Shannon diversity, and negatively associated with its number of samples. For example, source tracking tasks on the Soil dataset are difficult because it has a medium number of biomes and a small number of samples. b ROC curves of ONN4MST and other methods on all five datasets. c The number of samples, the Shannon diversity, and the source tracking results of different methods for the five datasets. The number of samples in each dataset is shown with blue bars, the Shannon diversity of each dataset is shown with red boxes, and the AUC of several methods on each dataset is shown with dashed lines. d The AUC of all methods on all five datasets. e The number of biomes and the source tracking results of different methods at different layers for the Combined dataset. The samples involved in each biome ontology layer are shown with blue bars, and the AUC of different methods on each layer is shown with dashed lines. f The AUC of all methods at different layers. Abbreviations: ONN4MST_FS, ONN4MST using selected features
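
The caption relies on two standard quantities, Shannon diversity and the area under the ROC curve (AUC). The sketch below shows the textbook definitions of both and is meant only as a reading aid. It is not the authors' code: the function names `shannon_diversity` and `roc_auc` are hypothetical, the natural logarithm is an assumption, and the caption does not say how abundances were aggregated per dataset or how AUC was averaged over biome labels.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over nonzero abundances.

    Assumption: natural log; the figure may use a different base.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under these definitions, a community whose abundance is spread evenly over more taxa has a higher Shannon index, and an AUC of 1.0 means every true source was scored above every non-source.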