Notes on Metagenomics
- Estimate abundance of different microbes from amplicon sequencing or shotgun sequencing
- alignment based methods vs. k-mer based methods
    - metaphlan
    - kraken
    - metaphlan2
    - kraken2
- marker gene based methods vs. marker gene independent methods
    - metaphlan2, krakenuniq
    - kraken2
- nucleotide level analysis vs. protein level analysis
    - kraken
    - kraken2
    - Kaiju
- benchmark study
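
To make the k-mer based idea above concrete, here is a toy sketch in Python. It is not how kraken or kraken2 are implemented (those map k-mers to lowest common ancestors in a taxonomy); it only votes with k-mers that occur in a single reference and then turns per-read assignments into relative abundances. The reference sequences, taxon names and function names are all made up for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = defaultdict(set)
    for taxon, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(taxon)
    return index


def classify_read(read, index, k):
    """Return the taxon with the most taxon-specific k-mer hits, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        taxa = index.get(kmer, ())
        if len(taxa) == 1:  # ignore k-mers shared between taxa
            votes[next(iter(taxa))] += 1
    if not votes:
        return None  # unclassified read
    # Ties are broken arbitrarily in this toy version.
    return votes.most_common(1)[0][0]


def relative_abundance(reads, index, k):
    """Fraction of classified reads assigned to each taxon."""
    counts = Counter(classify_read(read, index, k) for read in reads)
    counts.pop(None, None)  # drop unclassified reads
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


# Hypothetical toy references and reads (not real genomes).
references = {
    "taxon_A": "ACGTACGTTGCAAGCT",
    "taxon_B": "TTGGCCAATTGGCCAA",
}
index = build_index(references, k=4)
reads = ["ACGTACGT", "CGTTGCAA", "GGCCAATT", "TTTTTTTT"]
abundance = relative_abundance(reads, index, k=4)
print(abundance)
```

Real tools differ in many ways (much longer k-mers, compact databases, taxonomy-aware assignment, genome length correction), so this only shows the basic lookup-and-count idea.
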
 
Assembly
- metaspades: memory intensive
 
Tools for binning
- https://bitbucket.org/berkeleylab/metabat/src/master/
 
Strategies for removing contamination
- 2014, BMC Biology, Reagent and laboratory contamination can critically impact sequence-based microbiome analyses
- 2019, Trends in Microbiology, Contamination in Low Microbial Biomass Microbiome Studies: Issues and Recommendations
- 2019, PLOS Computational Biology, Recentrifuge: Robust comparative analysis and contamination removal for metagenomics
 
Genome similarity
- unifrac distance
 - weighted unifrac distance
 - PCoA Analysis
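
PCoA on a pairwise distance matrix (for example the UniFrac distances listed above) can be sketched with NumPy as follows. This is a minimal sketch of classical PCoA (double-centring the squared distances, then an eigendecomposition), not the code of any particular package. The function name `pcoa` and the toy points are assumptions for illustration, and plain Euclidean distances stand in for UniFrac here.

```python
import numpy as np


def pcoa(dist, n_components=2):
    """Classical PCoA of a symmetric (n x n) distance matrix.

    Returns the sample coordinates and the corresponding eigenvalues.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the axes with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean distances can give negative eigenvalues; clip them to 0.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return coords, eigvals


# Toy example: four points in the plane, Euclidean distances.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords, eigvals = pcoa(dist, n_components=2)
print(coords)
```

Because the toy distances are Euclidean and the points lie in a plane, two axes are enough to reproduce the pairwise distances; with real community distances the first axes only approximate them.
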
 
Community analysis based on pairwise distance
- PCoA Analysis
    - PCoA (principal coordinates analysis) and classical MDS (multidimensional scaling) are essentially the same thing (although the term MDS seems to be seldom used by the microbiology community …)
    - project each sample in a distance matrix to a low dimensional space, attempting to preserve the pairwise distances
    - See
- permanova analysis
    - For a given grouping (real grouping or shuffled grouping), a pseudo-F statistic can be calculated
    - A p-value can be calculated by shuffling the group labels many times and counting how often the shuffled pseudo-F is at least as large as the observed one
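
The permanova procedure above (pseudo-F for a grouping, then a p-value from shuffled labels) can be sketched as follows. This is a minimal one-factor version assuming the usual sum-of-squares formulation on squared distances; the function names, the number of permutations, the seed and the toy data are assumptions for illustration, not taken from any specific package.

```python
import numpy as np


def pseudo_f(dist, labels):
    """Pseudo-F statistic for one grouping of the samples."""
    d2 = np.asarray(dist, dtype=float) ** 2
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_permutations=999, seed=0):
    """Observed pseudo-F and a permutation p-value."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = pseudo_f(dist, labels)
    # Count shuffled groupings whose pseudo-F reaches the observed value.
    hits = sum(
        pseudo_f(dist, rng.permutation(labels)) >= observed
        for _ in range(n_permutations)
    )
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value


# Toy example: two well separated groups on a line.
values = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
dist = np.abs(values[:, None] - values[None, :])
labels = ["a", "a", "a", "b", "b", "b"]
observed, p_value = permanova(dist, labels)
print(observed, p_value)
```

With only six samples there are few distinct groupings, so the p-value cannot get very small however clear the separation is; real analyses need enough samples per group for the permutation test to be informative.
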
 
 
Differential analysis
- 2020, Genome Biology, Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data
 
Functional annotation
- eggnog: http://eggnog.embl.de/
 
Visualization
Genome annotation
- CDS prediction for prokaryotes
 - CDS prediction for eukaryotes