Ingenio NGS Run Analysis Report

Section 1: Workflow Summary

Samples were processed according to the workflow shown in Figure 1.1. Poly(A)-tail selection was used to isolate mRNA from the extracted RNA for reverse transcription to cDNA. Samples were ligated with Illumina TruSeq RNA CD barcode adapters and assessed for quality on an Agilent TapeStation and a Thermo Fisher Qubit prior to sequencing.

Figure 1.1: RNA Sequencing Process

Section 2: Sequencing Overview

Following the wet-lab workflow, samples were sequenced on an Illumina NextSeq 550 instrument and demultiplexed into FASTQ files using Local Run Manager bcl2fastq v1.8.4. FASTQ files were assessed for read type, quantity, and quality before proceeding with downstream analysis.

Sample ID   Total Read Pairs   Total Bases   Unique Exonic Rate   Bases >Q30 Rate
Sample1A    32441484           7438531874    68.1%                93.4%
Sample1B    32010165           7342693975    61.1%                93.8%
Sample2A    28250897           6738027868    75.4%                94%
Sample2B    28718004           6375599965    42.6%                92.8%
Sample3A    26598050           6347449662    74%                  93.9%
Sample3B    27951890           6622742671    71.3%                94%
Sample4A    27237497           6357419924    67.3%                94%
Sample4B    30260137           6956166950    67.2%                94.4%

Table 2.1: Sample Sequencing Statistics.

Figure 2.1: Breakdown of RNA types, including rRNA content as a measure of contamination.

Section 3: Sample Quality Review

FastQC v0.11.5 was run to assess read quality and generate per-sample run metrics for review. The charts below summarize Phred quality scores across read positions (Figure 3.1), the distribution of mean per-read Phred quality scores for each sample (Figure 3.2), and the distribution of per-read GC content (%) for each sample (Figure 3.3).

Figure 3.1: Per Base Sequence Quality.

Figure 3.2: Per Sequence Quality Scores.

Figure 3.3: Per Sequence GC Content.
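For readers less familiar with the metrics plotted above, the following is a minimal Python sketch of how a Phred score relates to base-call error probability (Q = -10 * log10(p)) and how per-read GC content can be computed. It is not part of the analysis pipeline: the read sequence and quality string are invented for illustration, the Phred+33 encoding is assumed, and the helper names are hypothetical. The sketch counts bases at Q30 or above, which may differ slightly from the ">Q30" convention used in Table 2.1.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to the base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming the Phred+33 ASCII offset."""
    return [ord(ch) - 33 for ch in quality_string]


def gc_content(sequence):
    """Return the percentage of G and C bases in a read (0.0 for an empty read)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


# Hypothetical read, invented for illustration only.
read_seq = "GATTACAGGC"
read_qual = "IIIIIIII?5"

scores = decode_phred33(read_qual)
# Fraction of bases in this read with a quality score of 30 or higher.
q30_fraction = sum(q >= 30 for q in scores) / len(scores)

print("Error probability at Q30:", phred_to_error_prob(30))
print("Fraction of bases at Q30 or above:", q30_fraction)
print("GC content (%):", gc_content(read_seq))
```

A score of Q30 corresponds to an expected error rate of 1 in 1,000 base calls, which is why the Q30 rate is commonly used as a headline quality metric.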

Section 4: Alignment Overview

Using Cutadapt v3.2, reads of low quality or insufficient length were removed, and the remaining reads were trimmed to remove Illumina read-end artifacts. Trimmed reads were aligned to the Mus musculus GRCm39 reference genome with STAR v2.7.8a. The resulting BAM (Binary Alignment Map) files were assessed for alignment metrics.

Sample ID   Total Mapped Pairs   MEND Pairs   Duplicate Mapping Rate   Alternative Alignments   Genes Detected
Sample1A    24920790             11013369     35.1%                    5327953                  19610
Sample1B    24593638             10362724     31.1%                    6061740                  17850
Sample2A    22551624             12235349     28%                      4645845                  17408
Sample2B    21359721             6712908      26.2%                    7065037                  23006
Sample3A    21250632             11475450     27%                      4649572                  17220
Sample3B    22168051             11377332     28%                      4593455                  19875
Sample4A    21299899             10252892     28.5%                    4575012                  20058
Sample4B    23292745             10572972     32.4%                    6327077                  17435

Table 4.1: Alignment Statistics.

Figure 4.1: Read QC per sample.

Figure 4.2: Mapping rate per sample.
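As a rough cross-check of the mapping rates, the sketch below divides the Total Mapped Pairs in Table 4.1 by the Total Read Pairs in Table 2.1 for each sample. This is an assumption about how a mapping rate could be derived from the reported numbers. The figure in this report may instead use post-trimming read counts as the denominator, in which case the values would differ. The function name is hypothetical.

```python
# Total Read Pairs, copied from Table 2.1.
total_read_pairs = {
    "Sample1A": 32441484, "Sample1B": 32010165,
    "Sample2A": 28250897, "Sample2B": 28718004,
    "Sample3A": 26598050, "Sample3B": 27951890,
    "Sample4A": 27237497, "Sample4B": 30260137,
}

# Total Mapped Pairs, copied from Table 4.1.
total_mapped_pairs = {
    "Sample1A": 24920790, "Sample1B": 24593638,
    "Sample2A": 22551624, "Sample2B": 21359721,
    "Sample3A": 21250632, "Sample3B": 22168051,
    "Sample4A": 21299899, "Sample4B": 23292745,
}


def mapping_rate(mapped, total):
    """Return mapped pairs as a percentage of total pairs (assumed definition)."""
    if total <= 0:
        raise ValueError("total read pairs must be positive")
    return 100.0 * mapped / total


rates = {
    sample: mapping_rate(total_mapped_pairs[sample], total_read_pairs[sample])
    for sample in total_read_pairs
}

for sample, rate in rates.items():
    print(f"{sample}: {rate:.1f}% of raw read pairs mapped")
```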

Section 5: Differential Expression Overview

Following alignment review, gene-level read counts were generated using featureCounts v1.6. Samples were grouped according to the provided specifications and compared for differential expression with DESeq2 v1.2.4.0. Only genes with an adjusted p-value (p.adj) < 0.05 and |log2FoldChange| > 1 were considered significant. Additional enrichment and pathway analyses were performed to assess larger-scale functional changes.

Comparison             Significant Genes   Significant Ontologies   Significant KEGG Pathways
Treatment vs Control   22551               673                      17
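
The significance criteria stated in this section (p.adj < 0.05 and |log2FoldChange| > 1) can be illustrated with a short Python sketch. This is not the code used to produce the counts above. The gene names and values are invented, and the dictionary keys only mirror the column names DESeq2 typically reports. Genes whose adjusted p-value is missing (DESeq2 can report NA after independent filtering) are treated here as not significant, which is an assumption.

```python
import math


def is_significant(result, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the report's thresholds: p.adj < 0.05 and |log2FoldChange| > 1."""
    padj = result["padj"]
    lfc = result["log2FoldChange"]
    # Missing values (represented as NaN) are treated as not significant (assumption).
    if math.isnan(padj) or math.isnan(lfc):
        return False
    return padj < padj_cutoff and abs(lfc) > lfc_cutoff


# Hypothetical results, invented for illustration only.
results = [
    {"gene": "GeneA", "log2FoldChange": 2.3, "padj": 0.001},   # passes both cutoffs
    {"gene": "GeneB", "log2FoldChange": 0.4, "padj": 0.0001},  # fold change too small
    {"gene": "GeneC", "log2FoldChange": -1.8, "padj": 0.03},   # passes (down-regulated)
    {"gene": "GeneD", "log2FoldChange": 3.1, "padj": 0.20},    # p.adj too high
    {"gene": "GeneE", "log2FoldChange": 1.5, "padj": float("nan")},  # no p.adj
]

significant = [r["gene"] for r in results if is_significant(r)]
print("Significant genes:", significant)
```

Both cutoffs are strict inequalities, so a gene with a log2 fold change of exactly 1 or a p.adj of exactly 0.05 is excluded.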