License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2018.6
URN: urn:nbn:de:0030-drops-93082
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2018/9308/


Dey, Tamal K. ; Mandal, Sayan

Protein Classification with Improved Topological Data Analysis



Abstract

Automated annotation and analysis of protein molecules have long been a topic of interest due to immediate applications in medicine and drug design. In this work, we propose a topology-based, fast, scalable, and parameter-free technique to generate protein signatures.
We build an initial simplicial complex using information about the protein's constituent atoms, including their radii and existing chemical bonds, to model the hierarchical structure of the molecule. Simplicial collapse is used to construct a filtration, from which we compute persistent homology. This information constitutes our signature for the protein. In addition, we demonstrate that this technique scales well to large proteins. Our method shows sizable time and memory improvements compared to other topology-based approaches. We use the signature to train a protein domain classifier. Finally, we compare this classifier against models built from state-of-the-art structure-based protein signatures on standard datasets and achieve a substantial improvement in accuracy.
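
To make the idea of a persistence-based signature concrete, the following is a minimal sketch, not the method of the paper. It assumes a plain Vietoris-Rips filtration on point coordinates (no atomic radii, no bond information, no simplicial collapse) and computes only 0-dimensional persistent homology with a union-find structure. The function name `zero_dim_persistence` and the coordinates in `toy_atoms` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Return (birth, death) pairs for H0 of a Vietoris-Rips filtration.

    Every point is born at scale 0. A connected component dies at the
    length of the edge that merges it into another component, and exactly
    one component survives forever (death = infinity).
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of increasing length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # This edge merges two components, so one of them dies here.
            parent[root_i] = root_j
            pairs.append((0.0, length))

    # The last remaining component never dies.
    pairs.append((0.0, float("inf")))
    return pairs


# Hypothetical toy coordinates standing in for atom centers.
toy_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 2.0, 0.0)]
signature = zero_dim_persistence(toy_atoms)
```

The resulting list of (birth, death) pairs is the kind of object that can be turned into a fixed-length feature vector for a classifier. Higher-dimensional homology, which captures loops and voids, requires a full boundary-matrix reduction and is omitted here.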

BibTeX - Entry

@InProceedings{dey_et_al:LIPIcs:2018:9308,
  author =	{Tamal K. Dey and Sayan Mandal},
  title =	{{Protein Classification with Improved Topological Data Analysis}},
  booktitle =	{18th International Workshop on Algorithms in Bioinformatics (WABI 2018)},
  pages =	{6:1--6:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-082-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{113},
  editor =	{Laxmi Parida and Esko Ukkonen},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2018/9308},
  URN =		{urn:nbn:de:0030-drops-93082},
  doi =		{10.4230/LIPIcs.WABI.2018.6},
  annote =	{Keywords: topological data analysis, persistent homology, simplicial collapse, supervised learning, topology based protein feature vector, protein classification}
}

Keywords: topological data analysis, persistent homology, simplicial collapse, supervised learning, topology based protein feature vector, protein classification
Collection: 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)
Issue Date: 2018
Date of publication: 02.08.2018
Supplementary Material: http://web.cse.ohio-state.edu/~dey.8/proteinTDA

