License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2017.15
URN: urn:nbn:de:0030-drops-76582
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7658/
Mao, Shunfu ;
Mohajer, Soheil ;
Ramachandran, Kannan ;
Tse, David ;
Kannan, Sreeram
abSNP: RNA-Seq SNP Calling in Repetitive Regions via Abundance Estimation
Abstract
Variant calling, and in particular calling SNPs (Single Nucleotide Polymorphisms), is a fundamental task in genomics. While existing packages offer excellent performance when calling SNPs covered by uniquely mapped reads, they suffer in loci where the reads are multiply mapped and are unable to make any reliable calls there. Variants in multiply mapped loci can arise, for example, in long segmental duplications, and can play an important role in evolution and disease.
In this paper, we develop a new SNP caller named abSNP, which offers three innovations. (a) abSNP calls SNPs from RNA-Seq data. Since RNA-Seq data is primarily sampled from gene regions, this method is inexpensive. (b) abSNP is able to make calls in repetitive gene regions by carefully exploiting the quality scores of multiply mapped reads. (c) abSNP exploits a specific feature of RNA-Seq data, namely the varying abundance of different genes, in order to identify which repetitive copy a particular read is sampled from.
We demonstrate that the proposed method offers significant performance gains on repetitive regions in simulated data. In particular, the algorithm is able to achieve near-perfect sensitivity on high-coverage SNPs, even when multiply mapped.
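The idea in points (b) and (c) can be illustrated with a toy calculation. The sketch below is not the abSNP algorithm itself, which the abstract does not specify. It only assumes a simple model in which base errors are independent, a miscalled base is equally likely to be any of the three other bases, and the prior probability that a read comes from a copy is proportional to that copy's abundance. All function and variable names are hypothetical.

```python
def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def read_likelihood(read, quals, copy_seq):
    """P(read | copy) under the assumed independent-error model."""
    likelihood = 1.0
    for base, q, ref in zip(read, quals, copy_seq):
        e = phred_to_error(q)
        # A matching base is a correct call; a mismatch is one of 3 wrong bases.
        likelihood *= (1 - e) if base == ref else e / 3
    return likelihood


def origin_posterior(read, quals, copies, abundances):
    """Posterior over candidate copies: abundance prior times read likelihood.

    copies: dict mapping copy name -> reference sequence aligned to the read
    abundances: dict mapping copy name -> positive relative abundance
    """
    weights = {
        name: abundances[name] * read_likelihood(read, quals, seq)
        for name, seq in copies.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Two near-identical copies that differ only at the last position.
copies = {"copy_A": "ACGT", "copy_B": "ACGA"}
abundances = {"copy_A": 1.0, "copy_B": 9.0}
posterior = origin_posterior("ACGT", [30, 30, 30, 30], copies, abundances)
```

In this toy setting a high-quality base at the differing position outweighs the ninefold abundance advantage of the other copy. When the read matches both copies equally well, the posterior reduces to the abundance ratio, which is the situation where abundance information matters most.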
BibTeX - Entry
@InProceedings{mao_et_al:LIPIcs:2017:7658,
author = {Shunfu Mao and Soheil Mohajer and Kannan Ramachandran and David Tse and Sreeram Kannan},
title = {{abSNP: RNA-Seq SNP Calling in Repetitive Regions via Abundance Estimation}},
booktitle = {17th International Workshop on Algorithms in Bioinformatics (WABI 2017)},
pages = {15:1--15:14},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-050-7},
ISSN = {1868-8969},
year = {2017},
volume = {88},
editor = {Russell Schwartz and Knut Reinert},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7658},
URN = {urn:nbn:de:0030-drops-76582},
doi = {10.4230/LIPIcs.WABI.2017.15},
annote = {Keywords: RNA-Seq, SNP Calling, Repetitive Region, Multiply Mapped Reads, Abundance Estimation}
}
Keywords: RNA-Seq, SNP Calling, Repetitive Region, Multiply Mapped Reads, Abundance Estimation
Collection: 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)
Issue Date: 2017
Date of publication: 11.08.2017