Opportunities and challenges for machine learning-assisted enzyme engineering


Enzymes can be engineered at the level of their amino acid sequences to optimize key properties such as expression, stability, substrate range, and catalytic efficiency─or even to unlock new catalytic activities not found in nature. Because the search space of possible proteins is vast, enzyme engineering usually involves discovering an enzyme starting point that has some level of the desired activity followed by directed evolution to improve its “fitness” for a desired application. Recently, machine learning (ML) has emerged as a powerful tool to complement this empirical process. ML models can contribute to (1) starting point discovery by functional annotation of known protein sequences or generating novel protein sequences with desired functions and (2) navigating protein fitness landscapes for fitness optimization by learning mappings between protein sequences and their associated fitness values. In this Outlook, we explain how ML complements enzyme engineering and discuss its future potential to unlock improved engineering outcomes.

ACS Central Science