License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2020.18
URN: urn:nbn:de:0030-drops-128077
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2020/12807/


Frisby, Trevor S. ; Langmead, Christopher J.

Fold Family-Regularized Bayesian Optimization for Directed Protein Evolution


Abstract

Directed Evolution (DE) is a technique for protein engineering that involves iterative rounds of mutagenesis and screening to search for sequences that optimize a given property (e.g., binding affinity to a specified target). Unfortunately, the underlying optimization problem is under-determined, so mutations introduced to improve the specified property may come at the expense of unmeasured but nevertheless important properties (e.g., subcellular localization). We seek to address this issue by incorporating a fold-specific regularization factor into the optimization problem. The regularization factor biases the search towards designs that resemble sequences from the fold family to which the protein belongs. We applied our method to a large library of protein GB1 mutants with binding affinity measurements to IgG-Fc. Our results demonstrate that the regularized optimization problem produces more native-like GB1 sequences with only a minor decrease in binding affinity. Specifically, the log-odds of our designs under a generative model of the GB1 fold family are 41-45% higher than those obtained without regularization, with only a 7% drop in binding affinity. Thus, our method is capable of making a trade-off between competing traits. Moreover, we demonstrate that our active-learning-driven approach reduces the wet-lab burden of identifying optimal GB1 designs by 67%, relative to recent results from the Arnold lab on the same data.
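
The idea of biasing the search towards family-like sequences can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact form of the regularization factor, so the additive upper-confidence-bound score, the weights `beta` and `lam`, the per-position `profile`, and the toy candidates below are all assumptions made for illustration. In practice the predicted mean and standard deviation would come from a surrogate such as a Gaussian process, and the log-odds from a generative model of the fold family.

```python
import math

def family_log_odds(seq, profile, background=1.0 / 20, floor=1e-3):
    """Log-odds of seq under a per-position residue profile (hypothetical
    stand-in for a generative fold-family model) versus a uniform background."""
    return sum(
        math.log(profile[i].get(aa, floor) / background)
        for i, aa in enumerate(seq)
    )

def regularized_score(mean, std, log_odds, beta=1.0, lam=0.1):
    """Upper-confidence-bound acquisition plus a weighted family log-odds term.
    The additive form and the weights are assumptions, not the paper's formula."""
    return mean + beta * std + lam * log_odds

def pick_next(candidates, profile, beta=1.0, lam=0.1):
    """Return the candidate sequence with the highest regularized score.
    candidates: list of (sequence, predicted mean, predicted std)."""
    best = max(
        candidates,
        key=lambda c: regularized_score(
            c[1], c[2], family_log_odds(c[0], profile), beta, lam
        ),
    )
    return best[0]

# Toy two-position profile and surrogate predictions (made-up numbers).
profile = [{"V": 0.6, "A": 0.2}, {"D": 0.7, "G": 0.1}]
candidates = [
    ("VD", 1.0, 0.1),  # family-like, slightly lower predicted affinity
    ("WW", 1.2, 0.1),  # unlike the family, slightly higher predicted affinity
]

unregularized_choice = pick_next(candidates, profile, lam=0.0)
regularized_choice = pick_next(candidates, profile, lam=0.1)
```

With `lam=0.0` the choice depends only on predicted affinity, whereas a positive `lam` trades a small amount of predicted affinity for a sequence that scores higher under the family profile, which mirrors the trade-off described in the abstract.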

BibTeX - Entry

@InProceedings{frisby_et_al:LIPIcs:2020:12807,
  author =	{Trevor S. Frisby and Christopher J. Langmead},
  title =	{{Fold Family-Regularized Bayesian Optimization for Directed Protein Evolution}},
  booktitle =	{20th International Workshop on Algorithms in Bioinformatics (WABI 2020)},
  pages =	{18:1--18:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-161-0},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{172},
  editor =	{Carl Kingsford and Nadia Pisanti},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/12807},
  URN =		{urn:nbn:de:0030-drops-128077},
  doi =		{10.4230/LIPIcs.WABI.2020.18},
  annote =	{Keywords: Protein design, Bayesian Optimization, Gaussian Process Regression, Regularization}
}

Keywords: Protein design, Bayesian Optimization, Gaussian Process Regression, Regularization
Collection: 20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Issue Date: 2020
Date of publication: 25.08.2020


Published by LZI