Elucidating the Design Space of Multimodal Protein Language Models
Published in ICML, 2025
DPLM-2.1 identifies three core bottlenecks in structure modeling for multimodal protein language models: information loss caused by structure discretization, suboptimal index-based learning objectives, and the absence of explicit geometric modeling. It addresses them through improved generative modeling, geometric modules, and representation learning, including residual diffusion, bitwise modeling, a hybrid flow-based sampler, and architectures that incorporate geometric priors.
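As a toy illustration of the first bottleneck (this is not the paper's tokenizer), the sketch below quantizes continuous per-residue features against a random codebook; the data, codebook size, and feature dimension are all hypothetical. The nonzero reconstruction error is the information lost once a structure is reduced to discrete token indices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-residue structure features (64 residues, 3-D each).
coords = rng.normal(size=(64, 3))

# A small random codebook stands in for a structure tokenizer's vocabulary.
codebook = rng.normal(size=(16, 3))

# Index-based discretization: each residue keeps only its nearest code index.
dists = np.linalg.norm(coords[:, None, :] - codebook[None, :, :], axis=-1)
tokens = dists.argmin(axis=1)  # shape (64,), integers in [0, 16)

# Decoding can only recover codebook entries, not the original features.
recon = codebook[tokens]
rmse = np.sqrt(np.mean((coords - recon) ** 2))
print(f"quantization RMSE: {rmse:.3f}")  # nonzero: information was lost
```

With a finite codebook the reconstruction error cannot reach zero, which is the lossiness that residual and bitwise modeling aim to recover.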
Code: https://github.com/bytedance/dplm
Keywords: multimodal protein language models, structure tokenization, geometric representation learning, protein folding
Recommended citation: Xinyou Wang*, Cheng-Yen Hsieh*, Daiheng Zhang, Dongyu Xue, Fei Ye, Shujian Huang, Zaixiang Zheng, and Quanquan Gu. (2025). "Elucidating the Design Space of Multimodal Protein Language Models." Proceedings of the 42nd International Conference on Machine Learning, 24156-24175.
Download Paper
