Towards A Generative Protein Evolution Machine with DPLM-Evo

Published in ICML, 2026

DPLM-Evo is an edit-based discrete diffusion protein language model that supports substitutions, insertions, and deletions. It introduces an upsampled latent alignment space for modeling variable-length insertions and deletions, and uses a contextualized noising kernel to learn biologically plausible substitution patterns. DPLM-Evo achieves state-of-the-art single-sequence mutation effect prediction on ProteinGym, while also enabling variable-length generation and post-editing of existing proteins.

Keywords: edit-based discrete diffusion models, contextualized noising kernel, variable-length modeling, latent alignment, mutation effect prediction

Recommended citation: Xinyou Wang*, Liang Hong, Jiasheng Ye, Zaixiang Zheng, Yu Li, Shujian Huang, and Quanquan Gu. (2026). "Towards A Generative Protein Evolution Machine with DPLM-Evo." International Conference on Machine Learning.