Alignment free method for large scale genomic sequence analysis

Approximate jaccard index calculation

k-mer sampling

LSH in other fields

Approximate member query and efficient counting

  • Bloom filter
    • Query whether an element exists in a large set. May generate false positive, but never false negative.
    • An alternative to the memory intensive hash table
    • A m bits array, K hash functions
    • For a new instance, each hash function map input to one of the positions in 1..m
  • Some variant
    • counting bloom filter
    • spectral bloom filter
  • Quotient filter

Reading List