License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2017.7
URN: urn:nbn:de:0030-drops-76501
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7650/
Go to the corresponding LIPIcs Volume Portal


Girotto, Samuele ; Comin, Matteo ; Pizzi, Cinzia

Fast Spaced Seed Hashing

pdf-format:
LIPIcs-WABI-2017-7.pdf (0.6 MB)


Abstract

Hashing k-mers is a common function across many bioinformatics applications and it is widely used for indexing, querying and rapid similarity search. Recently, spaced seeds, a special type of pattern that accounts for errors or mutations, are routinely used instead of k-mers. Spaced seeds allow to improve the sensitivity, with respect to k-mers, in many applications, however the hashing of spaced seeds increases substantially the computational time. Hence, the ability to speed up hashing operations of spaced seeds would have a major impact in the field, making spaced seed applications not only accurate, but also faster and more efficient.

In this paper we address the problem of efficient spaced seed hashing. The proposed algorithm exploits the similarity of adjacent spaced seed hash values in an input sequence in order to efficiently compute the next hash. We report a series of experiments on NGS reads hashing using several spaced seeds. In the experiments, our algorithm can compute the hashing values of spaced seeds with a speedup, with respect to the traditional approach, between 1.6x to 5.3x, depending on the structure of the spaced seed.

BibTeX - Entry

@InProceedings{girotto_et_al:LIPIcs:2017:7650,
  author =	{Samuele Girotto and Matteo Comin and Cinzia Pizzi},
  title =	{{Fast Spaced Seed Hashing}},
  booktitle =	{17th International Workshop on Algorithms in Bioinformatics (WABI 2017)},
  pages =	{7:1--7:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-050-7},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{88},
  editor =	{Russell Schwartz and Knut Reinert},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/7650},
  URN =		{urn:nbn:de:0030-drops-76501},
  doi =		{10.4230/LIPIcs.WABI.2017.7},
  annote =	{Keywords: k-mers, spaced seeds, efficient hashing}
}

Keywords: k-mers, spaced seeds, efficient hashing
Collection: 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)
Issue Date: 2017
Date of publication: 11.08.2017


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI