Benchmark Scores
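Rank | Model name | Model type | Avg. Spearman | Std. error of diff. to best score* | Activity | Binding | Expression | Organismal fitness | Stability | Low MSA depth | Medium MSA depth | High MSA depth | Human | Other eukaryote | Prokaryote | Virus | Mutation depth 1 | Mutation depth 2 | Mutation depth 3 | Mutation depth 4 | Mutation depth 5+ | Model details | Reference |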
1 | ProSST (K=2048) | Hybrid - Structure & PLM | 0.507 | 0.0 | 0.476 | 0.445 | 0.53 | 0.431 | 0.653 | 0.473 | 0.511 | 0.578 | 0.516 | 0.573 | 0.549 | 0.454 | 0.521 | 0.394 | 0.317 | 0.277 | 0.332 | ProSST (K=2048) | Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Ziyi Zhou, Huiqun Yu, Wanli Ouyang, Liang Hong, Bingxin Zhou, Pan Tan. (2024). ProSST: Protein language modeling with quantized structure and disentangled attention. bioRxiv. |
2 | ProSST (K=4096) | Hybrid - Structure & PLM | 0.498 | 0.008 | 0.444 | 0.472 | 0.507 | 0.416 | 0.652 | 0.477 | 0.488 | 0.579 | 0.497 | 0.574 | 0.547 | 0.44 | 0.505 | 0.426 | 0.388 | 0.342 | 0.408 | ProSST (K=4096) | Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Ziyi Zhou, Huiqun Yu, Wanli Ouyang, Liang Hong, Bingxin Zhou, Pan Tan. (2024). ProSST: Protein language modeling with quantized structure and disentangled attention. bioRxiv. |
3 | ProSST (K=1024) | Hybrid - Structure & PLM | 0.485 | 0.006 | 0.433 | 0.436 | 0.499 | 0.414 | 0.642 | 0.466 | 0.473 | 0.58 | 0.483 | 0.568 | 0.539 | 0.436 | 0.492 | 0.434 | 0.373 | 0.341 | 0.403 | ProSST (K=1024) | Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Ziyi Zhou, Huiqun Yu, Wanli Ouyang, Liang Hong, Bingxin Zhou, Pan Tan. (2024). ProSST: Protein language modeling with quantized structure and disentangled attention. bioRxiv. |
4 | ProSST (K=512) | Hybrid - Structure & PLM | 0.471 | 0.006 | 0.423 | 0.431 | 0.479 | 0.394 | 0.629 | 0.446 | 0.458 | 0.566 | 0.472 | 0.55 | 0.529 | 0.408 | 0.476 | 0.41 | 0.354 | 0.314 | 0.383 | ProSST (K=512) | Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Ziyi Zhou, Huiqun Yu, Wanli Ouyang, Liang Hong, Bingxin Zhou, Pan Tan. (2024). ProSST: Protein language modeling with quantized structure and disentangled attention. bioRxiv. |
5 | PoET (200M) | Hybrid - Alignment & PLM | 0.47 | 0.009 | 0.494 | 0.396 | 0.466 | 0.475 | 0.519 | 0.488 | 0.472 | 0.515 | 0.482 | 0.541 | 0.464 | 0.491 | 0.466 | 0.299 | 0.419 | 0.395 | 0.418 | PoET (200M) | Truong, Timothy F. and Tristan Bepler. PoET: A generative model of protein families as sequences-of-sequences. NeurIPS. |
6 | ProSST (K=128) | Hybrid - Structure & PLM | 0.469 | 0.007 | 0.417 | 0.44 | 0.473 | 0.387 | 0.628 | 0.451 | 0.453 | 0.561 | 0.466 | 0.545 | 0.523 | 0.415 | 0.472 | 0.418 | 0.376 | 0.332 | 0.394 | ProSST (K=128) | Mingchen Li, Yang Tan, Xinzhu Ma, Bozitao Zhong, Ziyi Zhou, Huiqun Yu, Wanli Ouyang, Liang Hong, Bingxin Zhou, Pan Tan. (2024). ProSST: Protein language modeling with quantized structure and disentangled attention. bioRxiv. |
* Non-parametric bootstrap standard error of the difference between the Spearman performance of a given model and that of the best overall model (i.e., TranceptEVE), computed over 10k bootstrap samples from the set of proteins in the ProteinGym substitution benchmark.
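
The bootstrap standard error described in the footnote can be sketched roughly as follows. This is a minimal illustration, not the benchmark's own code: it assumes one Spearman value per protein for each model, aligned by protein, and resamples proteins with replacement. The function name `bootstrap_se_of_difference`, its parameters, and the example numbers are hypothetical; the actual benchmark may aggregate scores differently before resampling.

```python
import numpy as np


def bootstrap_se_of_difference(model_scores, best_scores, n_boot=10_000, seed=0):
    """Bootstrap standard error of the mean per-protein score difference.

    model_scores, best_scores: per-protein Spearman values for the evaluated
    model and the reference (best) model, in the same protein order.
    (Hypothetical helper; the input layout is an assumption.)
    """
    model_scores = np.asarray(model_scores, dtype=float)
    best_scores = np.asarray(best_scores, dtype=float)
    if model_scores.ndim != 1 or model_scores.shape != best_scores.shape:
        raise ValueError("inputs must be 1-D arrays of equal length")

    # Paired difference per protein: reference model minus evaluated model.
    diffs = best_scores - model_scores
    n = diffs.size

    # Resample proteins with replacement, n_boot times.
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, n, size=(n_boot, n))

    # Mean difference within each bootstrap replicate.
    boot_means = diffs[idx].mean(axis=1)

    # The spread of the replicate means estimates the standard error.
    return float(boot_means.std(ddof=1))


if __name__ == "__main__":
    # Made-up per-protein scores, for illustration only.
    best = [0.52, 0.61, 0.47, 0.55, 0.49]
    model = [0.50, 0.58, 0.49, 0.51, 0.45]
    print(bootstrap_se_of_difference(model, best))
```

Because the resampling is paired (the same proteins are drawn for both models in each replicate), the estimate reflects uncertainty in the difference itself rather than in each model's score separately. The reference model compared against itself gives a standard error of exactly zero.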