Notes on Metagenomics
- Estimate abundance of different microbes from amplicon sequencing or shotgun sequencing
- alignment based methods vs. kmer based methods
- metaphlan
- kraken
- metaphlan2
- kraken2
- marker gene based methods vs. marker gene independent methods
- metaphlan2, krakenuniq
- kraken2
- nucleotide level analysis vs. protein level analysis
- kraken
- kraken2
- Kaiju
- benchmark study
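To make the k-mer based idea above concrete, here is a toy sketch of k-mer voting followed by a naive abundance estimate. It is not how kraken or kraken2 are implemented (those tools use a taxonomy-aware index); the reference sequences, the read set, and the function names `build_index`, `classify`, and `estimate_abundance` are all made up for illustration, and k-mers shared between references are simply ignored.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = {}
    for name, seq in references.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def classify(read, index, k):
    """Assign a read to the reference hit by most of its unambiguous k-mers."""
    votes = Counter()
    for kmer in kmers(read, k):
        hits = index.get(kmer, set())
        if len(hits) == 1:  # toy simplification: skip k-mers shared between references
            votes[next(iter(hits))] += 1
    return votes.most_common(1)[0][0] if votes else None


def estimate_abundance(reads, index, k):
    """Fraction of classified reads assigned to each reference."""
    counts = Counter(classify(read, index, k) for read in reads)
    counts.pop(None, None)  # drop unclassified reads
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Hypothetical toy data, not real genomes.
references = {
    "taxon_A": "ACGTACGTTGCAAGCTTAGC",
    "taxon_B": "TTGACCGGTATCCGGAATCG",
}
reads = ["ACGTTGCAAG", "GCAAGCTTAG", "CCGGTATCCG", "GGGGGGGGGG"]
k = 5

index = build_index(references, k)
abundance = estimate_abundance(reads, index, k)
print(abundance)
```

Counting reads this way ignores genome length and read-mapping ambiguity, which dedicated profilers handle explicitly.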
Assembly
- metaspades: memory intensive
Tools for binning
- https://bitbucket.org/berkeleylab/metabat/src/master/
Strategy for removing contaminations
- 2014, BMC Biology, Reagent and laboratory contamination can critically impact sequence-based microbiome analyses
- 2019, Trends in Microbiology, Contamination in Low Microbial Biomass Microbiome Studies: Issues and Recommendations
- 2019, PLOS Computational Biology, Recentrifuge: Robust comparative analysis and contamination removal for metagenomics
Genome similarity
- unifrac distance
- weighted unifrac distance
Community analysis based on pairwise distance
- PCoA Analysis
- PCoA (principal coordinates analysis) and MDS (multidimensional scaling) are essentially the same thing (although the term MDS seems to be seldom used by the microbiology community …)
- project each sample in a distance matrix into a low-dimensional space while attempting to preserve the pairwise distances
- See
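The projection described in the PCoA notes above can be sketched with NumPy as classical (metric) MDS: double-centre the squared distance matrix and keep the leading eigenvectors. This is a minimal sketch under that assumption; the function name `pcoa` and the toy points are made up for illustration, and negative eigenvalues (which arise with non-Euclidean distances) are simply clipped to zero.

```python
import numpy as np


def pcoa(dist, n_components=2):
    """Classical PCoA on a symmetric distance matrix.

    Returns the sample coordinates and the eigenvalues of the kept axes.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances (Gower centring).
    centring = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centring @ (d ** 2) @ centring
    # eigh returns eigenvalues in ascending order, so sort them descending.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each axis by the square root of its (non-negative) eigenvalue.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0, None))
    return coords, eigvals


# Toy example: four points on a 3 x 4 rectangle, with Euclidean distances.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords, eigvals = pcoa(dist)
# Pairwise distances between the projected samples.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.allclose(recovered, dist))
```

For Euclidean input in two dimensions the two axes reproduce the distances exactly; for ecological distances such as unifrac the low-dimensional plot is only an approximation.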
- permanova analysis
- For a given grouping (real grouping or shuffled grouping), a pseudo-F statistic can be calculated
- The P value is estimated by shuffling the group labels many times and counting how often the shuffled pseudo-F is at least as large as the observed one
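The two PERMANOVA steps above can be sketched as follows. This is a minimal one-factor version, assuming the usual sums-of-squares formulation computed directly from the distance matrix; the function names `pseudo_f` and `permanova`, the permutation count, and the simulated samples are illustrative choices, not taken from any particular package.

```python
import numpy as np


def pseudo_f(dist, labels):
    """Pseudo-F statistic for one grouping of the samples."""
    d2 = np.asarray(dist, dtype=float) ** 2
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    a = len(groups)
    # The full matrix counts every pair twice, hence the factor 2.
    ss_total = d2.sum() / (2 * n)
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        ss_within += d2[np.ix_(idx, idx)].sum() / (2 * len(idx))
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Observed pseudo-F and permutation P value."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    # Recompute the statistic for shuffled group labels.
    f_perm = np.array(
        [pseudo_f(dist, rng.permutation(labels)) for _ in range(n_perm)]
    )
    # Add 1 to both counts so the observed grouping is included.
    p_value = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p_value


# Simulated data: two clearly separated groups of 10 samples each.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=(10, 2))
group_b = rng.normal(5.0, 1.0, size=(10, 2))
points = np.vstack([group_a, group_b])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
labels = ["a"] * 10 + ["b"] * 10

f_obs, p_value = permanova(dist, labels)
print(f_obs, p_value)
```

With 999 permutations the smallest attainable P value is 0.001, so the number of permutations sets the resolution of the test.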
Differential analysis
- 2020, Genome Biology, Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data
Functional annotation
- eggnog: http://eggnog.embl.de/
Visualization
Genome annotation
- CDS prediction for prokaryotes
- CDS prediction for eukaryotes