License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2018.3
URN: urn:nbn:de:0030-drops-93051
Go to the corresponding LIPIcs Volume Portal

Prezza, Nicola ; Pisanti, Nadia ; Sciortino, Marinella ; Rosone, Giovanna

Detecting Mutations by eBWT

LIPIcs-WABI-2018-3.pdf (0.5 MB)


In this paper we develop a theory describing how the extended Burrows-Wheeler Transform (eBWT) of a collection of DNA fragments tends to cluster together the copies of nucleotides sequenced from a genome G. Our theory accurately predicts how many copies of any nucleotide are expected inside each such cluster, and how an elegant and precise LCP array based procedure can locate these clusters in the eBWT.
Our findings are very general and can be applied to a wide range of different problems. In this paper, we consider the case of alignment-free and reference-free SNPs discovery in multiple collections of reads. We note that, in accordance with our theoretical results, SNPs are clustered in the eBWT of the reads collection, and we develop a tool finding SNPs with a simple scan of the eBWT and LCP arrays. Preliminary results show that our method requires much less coverage than state-of-the-art tools while drastically improving precision and sensitivity.

BibTeX - Entry

  author =	{Nicola Prezza and Nadia Pisanti and Marinella Sciortino and Giovanna Rosone},
  title =	{{Detecting Mutations by eBWT}},
  booktitle =	{18th International Workshop on Algorithms in  Bioinformatics (WABI 2018)},
  pages =	{3:1--3:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-082-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{113},
  editor =	{Laxmi Parida and Esko Ukkonen},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-93051},
  doi =		{10.4230/LIPIcs.WABI.2018.3},
  annote =	{Keywords: BWT, LCP Array, SNPs, Reference-free, Assembly-free}

Keywords: BWT, LCP Array, SNPs, Reference-free, Assembly-free
Collection: 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)
Issue Date: 2018
Date of publication: 02.08.2018
Supplementary Material:

DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI