License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following:
DOI: 10.4230/LIPIcs.WABI.2017.11
URN: urn:nbn:de:0030-drops-76556
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7655/
Hu, Chenyue W. ;
Li, Hanyang ;
Qutub, Amina A.
Shrinkage Clustering: A Fast and Size-Constrained Algorithm for Biomedical Applications
Abstract
Motivation: Many common clustering algorithms require a two-step process that limits their efficiency. The algorithms must be run repeatedly and combined with a model selection criterion in order to determine both the number of clusters present in the data and the corresponding cluster memberships. As biomedical datasets increase in size and prevalence, there is a growing need for new methods that are more convenient to implement and more computationally efficient. In addition, it is often essential to obtain clusters of sufficient sample size to make the clustering result meaningful and interpretable for subsequent analysis.
Results: We introduce Shrinkage Clustering, a novel clustering algorithm based on matrix factorization that simultaneously finds the optimal number of clusters while partitioning the data. We report its performance across multiple simulated and actual datasets, and demonstrate its strength in accuracy and speed when applied to subtyping cancer and brain tissues. In addition, the algorithm offers a straightforward solution to clustering with cluster size constraints. Given its ease of implementation, computing efficiency and extensible structure, we believe Shrinkage Clustering can be applied broadly to solve biomedical clustering tasks, especially when dealing with large datasets.
BibTeX - Entry
@InProceedings{hu_et_al:LIPIcs:2017:7655,
author = {Chenyue W. Hu and Hanyang Li and Amina A. Qutub},
title = {{Shrinkage Clustering: A Fast and Size-Constrained Algorithm for Biomedical Applications}},
booktitle = {17th International Workshop on Algorithms in Bioinformatics (WABI 2017)},
pages = {11:1--11:13},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-050-7},
ISSN = {1868-8969},
year = {2017},
volume = {88},
editor = {Russell Schwartz and Knut Reinert},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7655},
URN = {urn:nbn:de:0030-drops-76556},
doi = {10.4230/LIPIcs.WABI.2017.11},
annote = {Keywords: Clustering, Matrix Factorization, Cancer Subtyping, Gene Expression}
}
Keywords: Clustering, Matrix Factorization, Cancer Subtyping, Gene Expression
Collection: 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)
Issue Date: 2017
Date of publication: 11.08.2017