GenMap: Ultra-fast Computation of Genome Mappability
by Christopher Pockrandt, Mai Alzamel, Costas S Iliopoulos, Knut Reinert
Abstract:
Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e., with up to e mismatches. We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position. GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
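Illustration (not part of the published abstract):
The (k, e)-mappability definition above can be made concrete with a small brute-force sketch. This is not GenMap's algorithm, which is far faster. It is a quadratic-time illustration of the definition only. The function names are hypothetical, and the sketch assumes a single sequence, forward strand only, with mismatches measured by Hamming distance.

```python
def within_mismatches(a, b, e):
    """Return True if equal-length strings a and b differ in at most e positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > e:
                return False
    return True


def naive_mappability(genome, k, e):
    """Brute-force (k, e)-mappability for every k-mer start position.

    For each position i, count how many k-mers in the genome match the
    k-mer at i with up to e mismatches (the k-mer itself is included),
    and report the reciprocal of that count.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    values = []
    for query in kmers:
        count = sum(1 for other in kmers if within_mismatches(query, other, e))
        values.append(1.0 / count)
    return values


if __name__ == "__main__":
    # ACGT occurs twice exactly, so its two positions get mappability 0.5.
    print(naive_mappability("ACGTACGT", 4, 0))
```

A value of 1.0 marks a k-mer that is unique under the chosen mismatch budget. Smaller values indicate repeats.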
Reference:
GenMap: Ultra-fast Computation of Genome Mappability (Christopher Pockrandt, Mai Alzamel, Costas S Iliopoulos, Knut Reinert), In Bioinformatics, 2020. (btaa222)
Bibtex Entry:
@article{10.1093/bioinformatics/btaa222,
    author = {Pockrandt, Christopher and Alzamel, Mai and Iliopoulos, Costas S and Reinert, Knut},
    title = "{GenMap: Ultra-fast Computation of Genome Mappability}",
    journal = {Bioinformatics},
    year = {2020},
    month = {04},
    abstract = "{Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e., with up to e mismatches. We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position. GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.}",
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa222},
    url = {https://doi.org/10.1093/bioinformatics/btaa222},
    note = {btaa222},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa222/33012140/btaa222.pdf},
}