License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following:
DOI: 10.4230/LIPIcs.WABI.2020.2
URN: urn:nbn:de:0030-drops-127911
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12791/
Zabeti, Hooman; Dexter, Nick; Safari, Amir Hosein; Sedaghat, Nafiseh; Libbrecht, Maxwell; Chindelevitch, Leonid
An Interpretable Classification Method for Predicting Drug Resistance in M. Tuberculosis
Abstract
Motivation: The prediction of drug resistance and the identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, are challenging problems. Modern methods based on testing against a catalogue of previously identified mutations often yield poor predictive performance. On the other hand, machine learning techniques have demonstrated high predictive accuracy, but many of them lack the interpretability needed to identify the specific mutations that lead to resistance. We propose a novel technique, inspired by the group testing problem and Boolean compressed sensing, which yields highly accurate predictions and interpretable results at the same time.
Results: We develop a modified version of the Boolean compressed sensing problem for identifying drug resistance, and implement its formulation as an integer linear program. This allows us to characterize the predictive accuracy of the technique and select an appropriate metric to optimize. A simple adaptation of the problem also allows us to quantify the sensitivity-specificity trade-off of our model under different regimes. We test the predictive accuracy of our approach on a variety of antibiotics commonly used to treat tuberculosis and find that its accuracy is comparable to that of standard machine learning models and that it points to several genes with previously identified associations with drug resistance.
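Illustration: The idea behind the Results paragraph can be sketched on a toy example. The code below is not the authors' implementation and does not use an integer linear programming solver; it is a minimal sketch that exhaustively searches for a small set of mutations whose logical OR predicts resistance, which is the kind of rule a Boolean compressed sensing formulation looks for. The function names (predict, fit_rule), the parameters (lam, fp_weight, fn_weight), the use of 0/1 misclassification counts as error terms, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

def predict(rows, rule):
    """Predict resistant (1) if an isolate carries any mutation in the rule."""
    return [int(any(row[j] for j in rule)) for row in rows]

def fit_rule(rows, labels, lam=0.1, fp_weight=1.0, fn_weight=1.0):
    """Exhaustively find the rule minimizing
    lam * (rule size) + fp_weight * (false positives) + fn_weight * (false negatives).
    Hypothetical helper: feasible only for a handful of features."""
    n_features = len(rows[0])
    best_rule, best_cost = (), float("inf")
    for k in range(n_features + 1):
        for rule in combinations(range(n_features), k):
            preds = predict(rows, rule)
            fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
            fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
            cost = lam * k + fp_weight * fp + fn_weight * fn
            if cost < best_cost:  # strict comparison keeps the first, smallest rule on ties
                best_rule, best_cost = rule, cost
    return best_rule, best_cost

# Toy data (made up): rows are isolates, columns are presence/absence of 3 mutations.
rows = [
    [1, 0, 0],  # resistant
    [0, 1, 0],  # resistant
    [1, 1, 0],  # resistant
    [0, 0, 1],  # susceptible
    [0, 0, 0],  # susceptible
]
labels = [1, 1, 1, 0, 0]

rule, cost = fit_rule(rows, labels)
print("selected mutations:", rule, "cost:", cost)
```

In this sketch, raising fn_weight relative to fp_weight favors sensitivity, and raising fp_weight favors specificity, which loosely mirrors the trade-off mentioned above. A realistic data set would need an actual integer linear programming solver rather than enumeration.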
BibTeX - Entry
@InProceedings{zabeti_et_al:LIPIcs:2020:12791,
author = {Hooman Zabeti and Nick Dexter and Amir Hosein Safari and Nafiseh Sedaghat and Maxwell Libbrecht and Leonid Chindelevitch},
title = {{An Interpretable Classification Method for Predicting Drug Resistance in M. Tuberculosis}},
booktitle = {20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
pages = {2:1--2:18},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-161-0},
ISSN = {1868-8969},
year = {2020},
volume = {172},
editor = {Carl Kingsford and Nadia Pisanti},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2020/12791},
URN = {urn:nbn:de:0030-drops-127911},
doi = {10.4230/LIPIcs.WABI.2020.2},
annote = {Keywords: Drug resistance, whole-genome sequencing, interpretable machine learning, integer linear programming, rule-based learning}
}
Keywords: Drug resistance, whole-genome sequencing, interpretable machine learning, integer linear programming, rule-based learning
Collection: 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Issue Date: 2020
Date of publication: 25.08.2020
Supplementary Material: https://github.com/hoomanzabeti/TB_Resistance_RuleBasedClassifier