Ingenio Diagnostics is a proud member of Genesis Drug Discovery & Development (GD3), a fully integrated CRO providing services to support drug discovery programs of our clients from target discovery through IND filing and managing Phase I-III clinical trials.


SERVICES

Diversity Assessment

High-throughput sequencing and taxonomic classification are essential for microbiome research projects. Our complete 16S microbiome analysis includes 16S sequencing, taxonomic classification of organisms, sample diversity statistics, and the option to assess the beta diversity between groups of samples (e.g. populations, treatments, sites). Project-specific analyses are available upon request.

We currently offer amplicon-based 16S sequencing for the V1-2, V3, V4, and V6 regions. A custom mix of variable regions can also be ordered.

View a typical 16S Microbiome Analysis report
16S Microbiome Sample

Ingenio NGS Differential Expression Analysis Report

Mouse Run Report

Report generated on November 10, 2021

Submitted Samples

Sample names and analysis groups were taken directly from the submission form. The table was created with the kableExtra R package[1].

Number of Samples Submitted: 10

Run Report

Sequencer Used: Illumina MiniSeq
NGS Sequencing Method: Paired-End Amplicon Sequencing
Bacterial Region Sequenced: 16S (V3, V4, and V6)
Amplicon Size: 160-300bp
Mock Communities Used: BEI Mock Even and Mock Staggered Communities for 16S

Total Reads: 7,871,350
Total Paired Reads: 3,935,675
Minimum Paired Read Count: 237,641
Maximum Paired Read Count: 410,966
Average Read Size: 128 bases

Read Counts Per Sample

Raw paired-end read counts for each sample are reported in the following bar graph and corresponding table. All samples must pass the minimum read-count threshold for low-abundance bacterial organisms to be reported successfully. The submitted samples are labeled with the user-provided IDs from the submission form. Submitted samples are listed alphabetically, and duplicates are denoted with an underscore and numbered sequentially. The bar graph was generated with the ggplot2 and plotly R packages[2, 3], and the table was created with the kableExtra R package[1].

Paired Read Counts Per Sample Graph

Read Counts After Filtering

Paired-end reads were filtered with Fastp[4], and new FastQ files were generated from the passing reads. To ensure proper alignment and classification, read pairs with ambiguous bases in the forward or reverse read were removed from the read count. Additionally, read pairs with fewer than 50 bases in either direction were filtered out. The resulting read counts after Fastp filtering are shown in the bar graph and table below. Raw read counts are shown as blue bars and read counts after Fastp filtering as yellow bars. The percentages of paired reads passing Fastp filtering are reported in the table. The bar graph was generated with the ggplot2 and plotly R packages[2, 3], and the table was created with the kableExtra R package[1].
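The two filtering criteria described above can be illustrated with a minimal Python sketch. This is not Fastp itself and not the pipeline used for the report; the function names are hypothetical, and it assumes that an ambiguous base is recorded as `N` in the read sequence.

```python
MIN_LENGTH = 50  # minimum number of bases required in each direction (from the report)


def pair_passes(forward: str, reverse: str, min_length: int = MIN_LENGTH) -> bool:
    """Return True if both mates are long enough and contain no ambiguous base.

    Assumption: ambiguous bases are encoded as 'N'.
    """
    for read in (forward, reverse):
        if len(read) < min_length:
            return False
        if "N" in read.upper():
            return False
    return True


def percent_passing(pairs) -> float:
    """Percentage of read pairs that survive the filter."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    kept = sum(pair_passes(forward, reverse) for forward, reverse in pairs)
    return 100.0 * kept / len(pairs)


# Hypothetical read pairs: one clean pair, one with an ambiguous base,
# one with a short forward read, and a second clean pair.
example_pairs = [
    ("ACGT" * 15, "TGCA" * 15),
    ("ACGT" * 15, "TGNA" * 15),
    ("ACGT" * 5, "TGCA" * 15),
    ("ACGT" * 15, "TGCA" * 15),
]
print(percent_passing(example_pairs))  # 50.0
```

A pair is discarded as a unit, so a single failing mate removes both reads from the count.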

Filtered Read Counts Per Sample Graph

Read Counts After Mothur Filtering

Read alignment and taxonomic classification were done using the Mothur pipeline for 16S data[5]. Paired-end reads were used to generate a Fasta file that was used for 16S analysis. The Fasta sequences were aligned to the reference 16S V4 region obtained from the SILVA database[6–8]. Aligned sequences were then classified using the RDP database[9]. Classified reads were filtered to remove sequences outside of the 16S region, sequences with a non-bacterial classification, and poor-quality sequences. The remaining read counts are reported in the following graph and table. The percentages of reads passing Mothur filtering are listed in the table. The bar graph was generated with the ggplot2 and plotly R packages[2, 3], and the table was created with the kableExtra R package[1].

Mothur Filtered Read Counts Per Sample Graph

Group Diversity

The sequences were divided into operational taxonomic units (OTUs) based on their taxonomic classifications. Group diversity (beta diversity) metrics assess the similarity and dissimilarity of OTU composition between groups of samples. Because read counts vary between samples, all read counts were normalized to match the sample with the lowest read count, via subsampling without replacement, prior to statistical analysis. Diversity heatmaps and bar charts were generated using the total read count per sample.
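The normalization step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Mothur implementation: the function names, the sample names, and the fixed random seed are all hypothetical.

```python
import random
from collections import Counter


def subsample_counts(otu_counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    # Expand the counts into one label per read, then sample from that pool.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(reads, depth))


def normalize_to_minimum(samples, seed=0):
    """Subsample every sample down to the read count of the smallest sample."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    depth = min(sum(counts.values()) for counts in samples.values())
    return {
        name: subsample_counts(counts, depth, rng)
        for name, counts in samples.items()
    }


# Hypothetical OTU tables for two samples with different sequencing depths.
samples = {
    "sample_A": {"OTU1": 5, "OTU2": 5},
    "sample_B": {"OTU1": 30, "OTU3": 10},
}
normalized = normalize_to_minimum(samples)
print({name: sum(counts.values()) for name, counts in normalized.items()})
```

After normalization every sample has the same total read count, so richness and distance measures are not driven by sequencing depth alone.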

Principal Component Analysis

Principal Component Analysis (PCA) was done using normalized read counts as described above. Groups were predefined on the sample submission form, and colored accordingly. Mock samples were also included for this analysis and separated as an independent group. A PCA was done to reduce the dimensions of the final OTU table, and the first two principal components (PC1 and PC2) are plotted below using the plotly R package[3].

V3 PCOA Graph
V4 PCOA Graph
V6 PCOA Graph

Phylum Level OTU Heatmap

Samples were clustered based on phylum level OTUs and the heatmap is shown below. OTU read counts were log10-transformed, so values are shown as log10(total reads), and infinite values were set to zero. The sample and OTU dendrograms were ordered using optimal leaf ordering (OLO) from the heatmaply R package[10] and the heatmap was visualized using the plotly R package[3].
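The transformation used for the heatmap values can be written as a short Python sketch. The function name is hypothetical; the only behavior taken from the report is the log10 transform and the replacement of infinite values with zero.

```python
import math


def log10_counts(counts):
    """log10-transform read counts; a zero count (log10 = -infinity) maps to 0."""
    return [math.log10(c) if c > 0 else 0.0 for c in counts]


print(log10_counts([0, 1, 10, 1000]))
```

Note that under this convention a count of 0 and a count of 1 both appear as 0 in the heatmap, because log10(1) is also 0.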

V3 heatmap Graph
V4 heatmap Graph
V6 heatmap Graph

AMOVA

A Statistics section will only be included if there is more than one group for the project.

The Analysis of Molecular Variance (AMOVA) is used to detect the variance of molecular markers between two groups. The groups used for these analyses were predefined on the sample submission form. Normalized read counts from genus level OTUs were used for this section. Mothur uses an asterisk to denote significance (p-value < 0.01). All the tables in this section were generated using the kableExtra R package[1].

Bar Charts

The bar charts in this section visualize group diversity and taxonomic classification. The Mock Communities were assessed for accurate classification and compared to the expected read counts from the BEI data sheet. The phylum level composition and the genus level composition are provided for all submitted samples. All bar graphs were generated using the ggplot2 and plotly R packages[2, 3].

V3 Composition

V3 Mock Graph
V3 Genera Graph
V3 Phyla Graph

V4 Composition

V4 Mock Graph
V4 Genera Graph
V4 Phyla Graph

V6 Composition

V6 Mock Graph
V6 Genera Graph
V6 Phyla Graph

Sample Diversity

Sample Diversity (alpha diversity) summarizes the composition of the microbial community within a sample using measurements for its richness (number of taxonomic groups) and/or evenness (distribution of abundances of the groups). The calculations for Sample Diversity were done using normalized read counts as described in the Group Diversity section.

Rarefaction Curves

Rarefaction and rarefaction curves are used to measure OTU richness within a sample. The number of OTUs is plotted against the number of sequences, and the resulting curves suggest the richness of each sample. Sequences were subsampled 100 times to generate the rarefaction curves below. Rarefaction curves typically flatten after a steep incline for samples that have been sufficiently sequenced. Low diversity samples will reach their peak and flatten out much earlier. Very high diversity samples, or samples that have not been sequenced completely, will continue to rise without flattening.
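A rarefaction curve of this kind can be sketched in Python. This is an illustrative reimplementation, not the Mothur code: the function name, the example counts, and the chosen depths are hypothetical, and only the 100 subsampling iterations come from the report.

```python
import random


def rarefaction_curve(otu_counts, depths, iterations=100, seed=0):
    """Mean number of distinct OTUs observed at each subsampling depth.

    Each depth is subsampled `iterations` times without replacement.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        total = 0
        for _ in range(iterations):
            total += len(set(rng.sample(reads, depth)))
        curve.append((depth, total / iterations))
    return curve


# Hypothetical sample with three OTUs and 100 reads in total.
counts = {"OTU1": 50, "OTU2": 30, "OTU3": 20}
curve = rarefaction_curve(counts, depths=[1, 10, 100])
for depth, mean_otus in curve:
    print(depth, mean_otus)
```

With only three OTUs the curve reaches its plateau quickly, which is the early-flattening shape described above for low diversity samples.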

V3 Rarefaction Graph
V4 Rarefaction Graph
V6 Rarefaction Graph
Diversity Index

Sample diversity can also be assessed using Simpson’s Diversity Index and Shannon’s Diversity Index. Simpson’s Diversity Index measures the probability that any two individuals drawn at random from a community belong to different species/OTUs (dominance and richness). Shannon’s Diversity Index describes the diversity and species/OTU richness of a sample in one metric. All the tables in this section were generated using the kableExtra R package[1].
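Both indices can be computed from a sample's OTU read counts, as in the Python sketch below. The formulas are common textbook forms and are an assumption here, since the report does not state which variant was calculated: Simpson's index is taken as the probability that two reads drawn without replacement belong to different OTUs, and Shannon's index uses the natural logarithm. The function names are hypothetical.

```python
import math


def simpson_diversity(counts):
    """Probability that two reads drawn without replacement differ in OTU.

    Assumed form: 1 - sum(n_i * (n_i - 1)) / (N * (N - 1)).
    """
    n_total = sum(counts)
    if n_total < 2:
        raise ValueError("need at least two reads")
    same_otu_pairs = sum(n * (n - 1) for n in counts)
    return 1.0 - same_otu_pairs / (n_total * (n_total - 1))


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    n_total = sum(counts)
    return -sum((n / n_total) * math.log(n / n_total) for n in counts if n > 0)


even = [5, 5]     # two equally abundant OTUs
single = [10, 0]  # one OTU dominates completely
print(simpson_diversity(even), shannon_index(even))
print(simpson_diversity(single), shannon_index(single))
```

Both indices are zero when a single OTU accounts for every read, and both increase as reads are spread more evenly across more OTUs.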

Taxonomic Classification

Sunburst Plots

The sunburst plots for the groups and individual samples were generated using the plotly R package[3].

V3 Classification
V3 Mock Group
V3 Mock Even
V3 Mock Staggered
V3 JAX Group
V3 JAX 1
V3 JAX 2
V3 JAX 3
V3 JAX 4
V3 JAX 5
V3 TAC Group
V3 TAC 1
V3 TAC 2
V3 TAC 3
V3 TAC 4
V4 Classification
V4 Mock Group
V4 Mock Even
V4 Mock Staggered
V4 JAX Group
V4 JAX 1
V4 JAX 2
V4 JAX 3
V4 JAX 4
V4 JAX 5
V4 TAC Group
V4 TAC 1
V4 TAC 2
V4 TAC 3
V4 TAC 4
V6 Classification
V6 Mock Group
V6 Mock Even
V6 Mock Staggered
V6 JAX Group
V6 JAX 1
V6 JAX 2
V6 JAX 3
V6 JAX 4
V6 JAX 5
V6 TAC Group
V6 TAC 1
V6 TAC 2
V6 TAC 3
V6 TAC 4
Phylum Level Pie Charts

The pie charts show the phylum level composition for the groups defined on the submission sheet as well as for individual samples. These plots were generated using the plotly R package[3].

V3 Pie Charts
V3 JAX Group
V3 JAX 1
V3 JAX 2
V3 JAX 3
V3 JAX 4
V3 JAX 5
V3 TAC Group
V3 TAC 1
V3 TAC 2
V3 TAC 3
V3 TAC 4
V4 Pie Charts
V4 JAX Group
V4 JAX 1
V4 JAX 2
V4 JAX 3
V4 JAX 4
V4 JAX 5
V4 TAC Group
V4 TAC 1
V4 TAC 2
V4 TAC 3
V4 TAC 4
V6 Pie Charts
V6 JAX Group
V6 JAX 1
V6 JAX 2
V6 JAX 3
V6 JAX 4
V6 JAX 5
V6 TAC Group
V6 TAC 1
V6 TAC 2
V6 TAC 3
V6 TAC 4
Differential OTUs

Differential OTU tables were generated using LEfSe. The LDA Effect Size (LEfSe) method is an algorithm for high-dimensional microbiome biomarker discovery. LEfSe uses the Kruskal-Wallis test, the Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers of groups[11].

Mouse feces JAX

Classification Table

Tables with read counts and full taxonomic breakdowns. Regions are separated into tabs and the link for the csv can be found at the bottom of each table. These tables were created using the reactable R package[12].

References

  1. Zhu H. kableExtra: Construct complex table with ’kable’ and pipe syntax. 2021. https://CRAN.R-project.org/package=kableExtra.
  2. Wickham H. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York; 2016. https://ggplot2.tidyverse.org.
  3. Sievert C. Interactive web-based data visualization with R, plotly, and shiny. Chapman and Hall/CRC; 2020. https://plotly-r.com.
  4. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90. doi:10.1093/bioinformatics/bty560.
  5. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology. 2009;75:7537–41. doi:10.1128/AEM.01541-09.
  6. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research. 2012;41:D590–6. doi:10.1093/nar/gks1219.
  7. Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, et al. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Research. 2013;42:D643–8. doi:10.1093/nar/gkt1209.
  8. Glöckner FO, Yilmaz P, Quast C, Gerken J, Beccati A, Ciuprina A, et al. 25 years of serving the community with ribosomal RNA gene reference databases and tools. Journal of Biotechnology. 2017;261:169–76. doi:10.1016/j.jbiotec.2017.06.1198.
  9. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, et al. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Research. 2013;42:D633–42. doi:10.1093/nar/gkt1244.
  10. Galili T, O’Callaghan A, Sidi J, et al. heatmaply: An R package for creating interactive cluster heatmaps for online publishing. Bioinformatics. 2017. doi:10.1093/bioinformatics/btx657.
  11. Segata N, Izard J, Waldron L, Gevers D, Miropolsky L, Garrett WS, et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011;12:R60.
  12. Lin G. Reactable: Interactive data tables based on ’react table’. 2020. https://CRAN.R-project.org/package=reactable.

Sample Submission Guidelines

DNA for 16S Sequencing

  • Extraction via a kit designed for microbiome analysis (e.g. ZymoBIOMICS DNA Kit)
  • Optical Density (260/280): 1.8-2.0
  • No RNA contamination
  • Suspended in DNase-free water or a buffer such as 10 mM Tris, Qiagen EB, or TE
  • At least 10 µl of DNA at a concentration of > 5 ng/µl
  • Shipped with dry ice

Stool and Soil Samples for 16S Sequencing

  • Flash freeze as soon as collected
  • A minimum of 0.5 g of stool or soil
  • Stored at -80 °C prior to shipping
  • Shipped with dry ice
  • Alternative: collect samples with Zymo DNA/RNA Shield and ship at ambient temperature