ProteinGym

ProteinGym is a collection of benchmarks for comparing how well models predict the effects of protein mutations. The benchmarks are organized by mutation type (substitutions vs. indels), ground-truth source (deep mutational scanning (DMS) assays vs. clinical annotations), and training regime (zero-shot vs. supervised).

DMS Benchmarks

Clinical Benchmarks

Find the GitHub repository for the benchmark here

Find the ProteinGym paper here

This project has been developed by:

Pascal Notin, Aaron Kollasch, Daniel Ritter, Lood Van Niekerk, Steffanie Paul, Han Spinner, Nathan Rollins, Ada Shaw, Rose Orenbuch, Ruben Weitzman, Jonathan Frazer, Mafalda Dias, Dinko Franceschi, Yarin Gal, and Debora Marks



OATML - Oxford Applied and Theoretical Machine Learning Group

Marks Lab - Harvard Medical School