Module | Output | Format(s) | Description | Tab for visualization/download |
---|---|---|---|---|
Read quality analysis and improvement | FastQC report (raw reads) | .html | FastQC graphical quality reports for raw read files uploaded | Samples/extra info |
| | FastQC report (quality processed reads) | .html | FastQC graphical quality reports for quality processed reads | Samples/extra info |
| | Quality processed reads (1P and 2P) | .fastq.gz | Uploaded reads after quality improvement using Trimmomatic | Samples/extra info |
Type and sub-type identification | Influenza type and sub-type/lineage | graphical | INSaFLU detects the influenza A and B types, as well as all currently defined influenza A subtypes (18 hemagglutinin subtypes and 11 neuraminidase subtypes) and the two influenza B lineages (Yamagata and Victoria) | Samples/type and subtype (output also included in each project’s “Sample list” table) |
| | Draft assembly | .fasta | Draft de novo assembly used for type and sub-type/lineage identification. “Influenza-specific” contigs are assigned both to the corresponding viral segment number and to a related reference influenza virus (see next output). | Samples/extra info/type and subtype/lineage identification > “Draft assembly” |
| | Assignment of viral segments and references | .tsv | Tab-separated file in which each “influenza-specific” NODE (or contig) is assigned both to the corresponding viral segment number (“GENE” column) and to a related reference influenza virus (“ACCESSION” column). | Samples/extra info/type and subtype/lineage identification > “Seg./Ref. to contigs” |
Variant detection and consensus generation | Annotated reference file | .gbk | Uploaded reference genome (in .fasta) annotated using Prokka | References/GenBank file |
| | Mapping file | .bam/graphical | Binary file storing reads aligned to a reference sequence (multi-mapping and unmapped reads are not included); the index is also provided (.bam.bai). “.bam” files can be explored in situ using the Integrative Genomics Viewer (IGV). | Projects/show project results/show sample detail results/mapping file by IGV |
| | Annotated variants (SNPs and indels) per sample | .tab/.vcf | List of annotated variants assumed in the consensus sequences (for each sample)* | Projects/show project results/show sample detail results/mapping file by IGV |
| | Annotated variants (SNPs and indels) per project | .tsv | Compiles all lists of annotated variants assumed in the consensus sequences* | Projects/show project results/project “name” > variants |
| | Consensus sequences per sample (for the pool of loci) | .fasta | A version of the reference sequence with all validated variants replaced. Note: sequences are generated only for loci with 100% of their length covered by ≥ 10-fold.* | Projects/show project results/show sample detail results |
Coverage analysis | Coverage report per project | .tsv | Compiles the coverage reports for each sample, including the following data: mean depth of coverage per locus, % of locus size covered by at least 1-fold and % of locus size covered by at least 10-fold. | Projects/show project results/project “name” > coverage |
| | Coverage report per sample per locus (interactive color-coded statistics) | graphical | Green: % of locus size covered by at least 1-fold = 100% and % of locus size covered by at least 10-fold = 100%; Yellow: % of locus size covered by at least 1-fold = 100% and % of locus size covered by at least 10-fold < 100%; Red: % of locus size covered by at least 1-fold < 100% and % of locus size covered by at least 10-fold < 100% | Projects/show project results |
| | Coverage report per sample per locus (plot) | graphical | Plot of the depth of coverage throughout each locus | Projects/show project results |
Alignment/phylogeny | Consensus nucleotide alignments per locus | .fasta/.nex/graphical | Locus-specific consensus nucleotide alignments. Note 1: consensus sequences are generated only for loci with 100% of their length covered by ≥ 10-fold. Note 2: The “.fasta” files can be directly uploaded, together with associated metadata (“Sample_list.tsv”), to visualization tools, such as PHYLOViZ. | Projects/show project results/nucleotide alignments by MSAViewer |
| | Consensus nucleotide alignments (whole genome) | .fasta/.nex/graphical | Consensus nucleotide alignments of the “whole genome” sequences (i.e., upon concatenation of all individual loci). Note 1: whole-genome sequences are generated only for samples in which every locus has 100% of its length covered by ≥ 10-fold. Note 2: The “.fasta” files can be directly uploaded, together with associated metadata (“Sample_list.tsv”), to visualization tools, such as PHYLOViZ. | Projects/show project results/nucleotide alignments by MSAViewer |
| | Consensus amino acid alignments per encoded protein | .fasta/.nex/graphical | Consensus amino acid alignments per encoded protein. Note: sequences are generated only for loci with 100% of their length covered by ≥ 10-fold. | Projects/show project results/amino acid alignments by MSAViewer |
| | Phylogenetic tree per locus | .nwk/.tree/graphical | Maximum likelihood phylogenetic tree for each locus-specific nucleotide alignment. Note: The “.nwk” and “.tree” phylogenetic trees can be directly uploaded, together with associated metadata (“Sample_list.csv”), to visualization tools, such as Microreact and Phandango, respectively. | Projects/show project results/phylogenetic trees by PhyloCanvas |
| | Phylogenetic tree (whole genome) | .nwk/.tree/graphical | Maximum likelihood phylogenetic tree for the alignments of the “whole-genome” sequences (upon concatenation of all individual loci). Note: The “.nwk” and “.tree” phylogenetic trees can be directly uploaded, together with associated metadata (“Sample_list.csv”), to visualization tools, such as Microreact and Phandango, respectively. | Projects/show project results/phylogenetic trees by PhyloCanvas |
Intra-host minor variant detection (and uncovering of putative mixed infections) | Annotated minor intra-host single nucleotide variants (iSNVs) per project | .tsv | Compiles all lists of detected and annotated minor iSNVs (i.e., SNVs displaying intra-sample variation at a frequency between 1 and 50%). | Projects/show project results/intra-host minor variants annotation and uncovering of mixed infections |
| | Plots of the proportion of iSNVs at frequencies 1–50% (minor iSNVs) and 50–90% | graphical | Plots the proportion of iSNVs at frequencies of 1–50% (minor iSNVs) and 50–90%. You may inspect this plot to uncover infections with influenza viruses presenting clearly distinct genetic backgrounds (so-called “mixed infections”). INSaFLU flags samples as “putative mixed infections” if they fulfill the following cumulative criteria: the ratio of the number of iSNVs at frequency 1–50% (minor iSNVs) to the number at frequency 50–90% falls within the range 0.5–2.0, and the sum of these two categories of iSNVs exceeds 20. Alternatively, to account for mixed infections involving extremely different viruses (e.g., A/H3N2 and A/H1N1), the flag is also displayed when the sum of the two categories of iSNVs exceeds 100, regardless of the first criterion. | |
Extra | List of samples per project | .csv/.tsv | List of samples per project (compiles all samples’ metadata and additional INSaFLU outputs). This file can be directly uploaded, together with associated alignment or phylogenetic data, to visualization tools, such as PHYLOViZ, Microreact and Phandango. | Projects/show project results/sample list |
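
The two decision rules stated in the table (the color code of the per-locus coverage report and the “putative mixed infection” flag) can be illustrated with a short Python sketch. This is not INSaFLU's own code: the function names and arguments are hypothetical, the range 0.5–2.0 is assumed to include its end points, and a sample with no iSNVs at 50–90% is assumed to fail the ratio criterion because the ratio is undefined.

```python
def coverage_color(pct_1x: float, pct_10x: float) -> str:
    """Color code for one locus, following the table's coverage rules.

    pct_1x  -- % of locus size covered by at least 1-fold
    pct_10x -- % of locus size covered by at least 10-fold
    (hypothetical argument names, chosen for illustration)
    """
    if pct_1x >= 100 and pct_10x >= 100:
        return "green"
    if pct_1x >= 100:
        # Fully covered at 1-fold, but not at 10-fold.
        return "yellow"
    # Not fully covered even at 1-fold (so 10-fold is also below 100%).
    return "red"


def is_putative_mixed_infection(n_minor: int, n_50_90: int) -> bool:
    """Apply the two flagging criteria described in the table.

    n_minor -- number of iSNVs at frequency 1-50% (minor iSNVs)
    n_50_90 -- number of iSNVs at frequency 50-90%
    """
    total = n_minor + n_50_90

    # Alternative criterion: very high iSNV count, regardless of the ratio.
    if total > 100:
        return True

    # Cumulative criteria: total above 20 AND ratio within 0.5-2.0.
    # Assumption: bounds are inclusive; an undefined ratio (n_50_90 == 0)
    # is treated as failing the criterion.
    if total > 20 and n_50_90 > 0:
        ratio = n_minor / n_50_90
        return 0.5 <= ratio <= 2.0

    return False


if __name__ == "__main__":
    print(coverage_color(100.0, 98.5))
    print(is_putative_mixed_infection(15, 10))
```

For example, a sample with 15 minor iSNVs and 10 iSNVs at 50–90% has a sum of 25 and a ratio of 1.5, so it meets both cumulative criteria, whereas a sample with 30 and 5 (ratio 6.0) does not.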