Numerical ecology

The primary analysis of a metabarcoding experiment processes a set of FASTQ files (raw reads) to generate:

  • a set of representative sequences (either Amplicon Sequence Variants, ASVs, or Operational Taxonomic Units, OTUs)
  • a feature table (or contingency table): a matrix of counts of hits against each representative sequence per sample

Additionally:

  • taxonomic annotation of each representative sequence
  • a phylogenetic tree of the representative sequences
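
As a rough illustration of how these outputs fit together, the sketch below builds toy versions of them in base R. All sequence identifiers, sample names, counts, taxa, and branch lengths are invented for the example and do not come from a Dadaist2 run.

```r
# Toy feature table: rows are representative sequences, columns are samples.
# All names and counts are invented for illustration.
feature_table <- matrix(
  c(120,  30,   0,
     15,  60,  45,
      0,  10,  80),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2", "S3"))
)

# Taxonomic annotation: one row per representative sequence.
taxonomy <- data.frame(
  Phylum = c("Firmicutes", "Bacteroidetes", "Proteobacteria"),
  Genus  = c("Lactobacillus", "Bacteroides", "Escherichia"),
  row.names = rownames(feature_table)
)

# Phylogenetic tree in Newick format, with tips named after the sequences.
tree_newick <- "((ASV1:0.1,ASV2:0.2):0.05,ASV3:0.3);"

# Sequencing depth per sample, and number of sequences observed in each sample.
depth    <- colSums(feature_table)
observed <- colSums(feature_table > 0)

print(depth)
print(observed)
```

The row names of the feature table, the row names of the taxonomy, and the tip labels of the tree all refer to the same representative sequences, which is what lets downstream tools join them.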

These files can be analysed using the principles of numerical ecology, for example with the tools described below.

MicrobiomeAnalyst

MicrobiomeAnalyst is available both as an R package and as a web server. It performs a range of exploratory analyses and statistical tests, such as:

  • Compositional profiling
  • Comparative analysis
  • Functional analysis
  • Taxon Set Enrichment Analysis

A protocol published in Nature Protocols is available.

Rhea

Dadaist2 implements the Rhea workflow to normalize the feature table, analyse alpha and beta diversity, and generate taxonomy bar plots.

Lagkouvardos I, Fischer S, Kumar N, Clavel T. (2017) Rhea: a transparent and modular R pipeline for microbial profiling based on 16S rRNA gene amplicons. PeerJ 5:e2836 https://doi.org/10.7717/peerj.2836

Dadaist2 produces a Rhea subdirectory containing the input files needed to follow the full Rhea protocol. In addition, some steps (those that require no assumptions about the experiment) are performed automatically:

  • Normalization (this can be invoked independently via dadaist2-normalize)
  • Alpha diversity (this can be invoked independently via dadaist2-alpha)
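
To give a feel for what these two automatic steps compute, here is a minimal base R sketch. It assumes normalization by scaling every sample to the smallest sequencing depth in the dataset, and it reports richness together with Shannon-based indices. It is an illustration on made-up counts, not the code run by dadaist2-normalize or dadaist2-alpha, whose options and outputs may differ.

```r
# Made-up counts: rows are representative sequences, columns are samples.
counts <- matrix(
  c(50, 10,
    30, 10,
    20,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2"))
)

# Normalization (assumed scheme): convert each sample to proportions,
# then rescale to the smallest sequencing depth found in the dataset.
depth      <- colSums(counts)
normalized <- sweep(counts, 2, depth, "/") * min(depth)

# Shannon index of one sample, computed on the sequences actually observed.
shannon_index <- function(x) {
  p <- x[x > 0] / sum(x)   # relative abundances of the observed sequences
  -sum(p * log(p))
}

# Alpha diversity per sample.
richness          <- colSums(normalized > 0)
shannon           <- apply(normalized, 2, shannon_index)
shannon_effective <- exp(shannon)   # effective number of species

print(round(normalized, 1))
print(data.frame(richness,
                 shannon = round(shannon, 3),
                 shannon_effective = round(shannon_effective, 2)))
```

After this kind of scaling every sample has the same total, so diversity values can be compared across samples with different sequencing depths.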

PhyloSeq

PhyloSeq is an R package that supports a variety of analyses of microbiome datasets.

Dadaist2 conveniently produces a phyloseq object that can be loaded with:

library(phyloseq)
ps <- readRDS("phyloseq.rds")
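
Once the object is in memory, it can be inspected with the standard phyloseq accessors. The sketch below assumes the phyloseq package is installed and uses base R's readRDS() to load the file. So that it runs without a Dadaist2 output directory, it first writes a toy object to a temporary file; the counts, sample names, and the Group column are invented for the example.

```r
# Requires the phyloseq package (Bioconductor); skipped if it is not installed.
if (requireNamespace("phyloseq", quietly = TRUE)) {
  library(phyloseq)

  # Stand-in for the file written by Dadaist2: a toy object saved to a temporary path.
  counts <- matrix(c(50, 10, 30, 9, 20, 1), nrow = 3, byrow = TRUE,
                   dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2")))
  metadata <- data.frame(Group = c("A", "B"), row.names = c("S1", "S2"))
  toy <- phyloseq(otu_table(counts, taxa_are_rows = TRUE), sample_data(metadata))
  rds_path <- tempfile(fileext = ".rds")
  saveRDS(toy, rds_path)

  # With real data, rds_path would point to the phyloseq.rds file instead.
  ps <- readRDS(rds_path)

  print(ps)                 # summary of the components stored in the object
  print(ntaxa(ps))          # number of representative sequences
  print(sample_names(ps))   # sample identifiers
  print(estimate_richness(ps, measures = c("Observed", "Shannon")))
}
```

A real Dadaist2 object may carry more components than this toy one (for instance taxonomy or a tree), in which case accessors such as tax_table() and phy_tree() apply as well.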