Rapid advances in generative artificial intelligence have revolutionized biological modeling across domains such as protein science, genomics, and single-cell biology. However, existing surveys often organize applications by molecule type or research task, overlooking methodological convergence and cross-modal innovation. This paper presents a unified methodological perspective that highlights the fundamental technical commonalities across biological modalities. We systematically organize recent advances in generative modeling for biology through the lens of core machine learning paradigms, from language models (LMs) and diffusion models to their emerging hybrid architectures. Our work reveals how techniques initially developed for one molecular type (e.g., protein design) can be effectively transferred to others (e.g., RNA engineering), and identifies a convergence trend in which discrete diffusion models and iterative language models emerge as different facets of a unified generative framework. We trace the evolution from domain-specific models to multi-modal biological foundation models and agent-based systems. By emphasizing methodological connections rather than applications, we aim to accelerate cross-domain innovation and make the field more accessible to the broader machine learning community. We conclude by identifying promising research directions in which techniques that have succeeded in one biological domain remain unexplored in others, offering a roadmap for future advances in generative biology.
Figure 2. An overview of language-modeling pipelines for biological generation, illustrating how diverse biological inputs are tokenized, processed by a Transformer backbone, and trained under either masked bidirectional-context modeling or autoregressive causal modeling objectives.
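To make the two training objectives in this pipeline concrete, here is a minimal PyTorch sketch, assuming a toy amino-acid vocabulary and a small TinyTransformerLM defined only for illustration (not any published model): the same backbone is trained either to reconstruct masked positions from bidirectional context or to predict each token from its left context.

import torch
import torch.nn as nn

# Toy amino-acid vocabulary plus special tokens (illustrative; real models use richer vocabularies).
AA = "ACDEFGHIKLMNPQRSTVWY"
vocab = {tok: i for i, tok in enumerate(["<pad>", "<mask>", "<bos>"] + list(AA))}

def tokenize(seq):
    return torch.tensor([vocab["<bos>"]] + [vocab[a] for a in seq])

class TinyTransformerLM(nn.Module):
    """Minimal Transformer backbone with a token-prediction head (positional encodings omitted for brevity)."""
    def __init__(self, d_model=64, n_heads=4, n_layers=2, causal=False):
        super().__init__()
        self.embed = nn.Embedding(len(vocab), d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, len(vocab))
        self.causal = causal

    def forward(self, tokens):
        attn_mask = None
        if self.causal:  # autoregressive: each position attends only to its left context
            L = tokens.size(1)
            attn_mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        h = self.encoder(self.embed(tokens), mask=attn_mask)
        return self.head(h)

seq = tokenize("MKTAYIAKQR").unsqueeze(0)  # shape (1, L)

# Masked bidirectional-context objective: corrupt a few positions, predict the originals.
mlm = TinyTransformerLM(causal=False)
masked_pos = torch.tensor([3, 7])
corrupted = seq.clone()
corrupted[0, masked_pos] = vocab["<mask>"]
mlm_loss = nn.functional.cross_entropy(mlm(corrupted)[0, masked_pos], seq[0, masked_pos])

# Autoregressive causal objective: predict each token from the tokens before it.
ar = TinyTransformerLM(causal=True)
ar_logits = ar(seq[:, :-1])
ar_loss = nn.functional.cross_entropy(ar_logits.reshape(-1, len(vocab)), seq[:, 1:].reshape(-1))
print(float(mlm_loss), float(ar_loss))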
Figure 3. Overview of the forward-reverse diffusion pipeline for biological generation, showing the forward noising process and the learned reverse denoising process, in both discrete (token-level) and continuous (geometry-level) formulations.
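As a minimal numerical sketch of the forward noising in this pipeline, the snippet below assumes a standard Gaussian (DDPM-style) kernel for continuous coordinates and an absorbing-state (masking) kernel for discrete tokens; the linear schedule and tensor shapes are illustrative only.

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

# Continuous (geometry-level) forward noising: q(x_t | x_0) = N(sqrt(a_bar_t) x_0, (1 - a_bar_t) I).
x0 = torch.randn(8, 3)                          # e.g., 8 atom coordinates in 3D
t = 500
eps = torch.randn_like(x0)
x_t = alpha_bar[t].sqrt() * x0 + (1.0 - alpha_bar[t]).sqrt() * eps
# A denoiser eps_theta(x_t, t) is trained with loss ||eps - eps_theta(x_t, t)||^2, and the
# learned reverse process removes this noise step by step.

# Discrete (token-level) forward noising: each token is independently replaced by a
# <mask> symbol with probability 1 - a_bar_t (absorbing-state discrete diffusion).
MASK = 0
tokens = torch.randint(1, 21, (16,))            # 16 residues from a 20-letter alphabet
keep = torch.rand(16) < alpha_bar[t]
z_t = torch.where(keep, tokens, torch.full_like(tokens, MASK))
# The reverse model predicts the original tokens at masked positions, recovering the
# full sequence over successive denoising steps.
print(x_t.shape, z_t)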
Figure 4. A unified generative-model view integrating classic architectures, language models, and diffusion models. This illustrates the perspective that iterative language modeling can be viewed as tokenized diffusion, where training and inference correspond to iterative denoising, supported by shared backbones and modality-specific (discrete/continuous) output heads.
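A schematic sketch of this unification, with a hypothetical predict_logits stand-in for any masked-token predictor: starting from an all-masked sequence and committing the most confident positions at each step is simultaneously an iterative language-model decoder and a reverse (denoising) sampler for absorbing-state discrete diffusion.

import torch

def iterative_unmask(predict_logits, length, mask_id=0, steps=8):
    """Generate a token sequence by iteratively denoising from an all-<mask> state.

    predict_logits maps a (length,) token tensor to (length, vocab_size) logits.
    Each step commits the most confident still-masked positions, mirroring one
    reverse step of absorbing-state discrete diffusion.
    """
    seq = torch.full((length,), mask_id, dtype=torch.long)
    for step in range(steps):
        masked = seq == mask_id
        if not masked.any():
            break
        logits = predict_logits(seq)
        logits[:, mask_id] = float("-inf")       # never predict the mask symbol itself
        probs = torch.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)
        # Commit roughly an equal share of the remaining masked positions per step,
        # highest-confidence first.
        n_commit = max(1, int(masked.sum().item() / (steps - step)))
        conf = conf.masked_fill(~masked, float("-inf"))
        commit = conf.topk(n_commit).indices
        seq[commit] = pred[commit]
    return seq

# Usage with a stand-in predictor (random logits) purely to show the control flow:
toy = lambda seq: torch.randn(seq.size(0), 21)
print(iterative_unmask(toy, length=16, mask_id=0))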
Figure 5. A multi-modal foundation model for biomolecules, where heterogeneous inputs are embedded and fused into a backbone to support diverse downstream design tasks such as binder/vaccine design, motif scaffolding, structure generation, and inverse folding.
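A minimal sketch of this fusion pattern, assuming just two illustrative modalities (residue tokens and 3D coordinates) projected into a shared embedding space and processed by one Transformer backbone; the module names and sizes are hypothetical, not those of any specific foundation model.

import torch
import torch.nn as nn

class MultiModalBackbone(nn.Module):
    """Embed heterogeneous biomolecular inputs and fuse them in a shared Transformer."""
    def __init__(self, vocab_size=24, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.seq_embed = nn.Embedding(vocab_size, d_model)   # discrete residue tokens
        self.coord_embed = nn.Linear(3, d_model)             # continuous 3D coordinates
        self.type_embed = nn.Embedding(2, d_model)           # marks which modality a token came from
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        # Task-specific heads hang off the shared representation (e.g., inverse folding
        # predicts residue tokens, structure generation predicts coordinates).
        self.token_head = nn.Linear(d_model, vocab_size)
        self.coord_head = nn.Linear(d_model, 3)

    def forward(self, residue_tokens, coords):
        seq = self.seq_embed(residue_tokens) + self.type_embed(torch.zeros_like(residue_tokens))
        geo = self.coord_embed(coords) + self.type_embed(torch.ones(coords.shape[:2], dtype=torch.long))
        fused = self.backbone(torch.cat([seq, geo], dim=1))  # joint attention across modalities
        n = residue_tokens.size(1)
        return self.token_head(fused[:, :n]), self.coord_head(fused[:, n:])

model = MultiModalBackbone()
tokens = torch.randint(0, 24, (1, 16))
coords = torch.randn(1, 16, 3)
token_logits, coord_pred = model(tokens, coords)
print(token_logits.shape, coord_pred.shape)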
Figure 6. The biological agent paradigm that couples hypothesis generation with tool use/creation and laboratory automation, enabling iterative refinement from computational reasoning to experimental validation.
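The loop below is a schematic sketch of this paradigm, with hypothetical propose_hypothesis, run_tool, and run_experiment stand-ins in place of a real planner, computational tool, or laboratory interface; it shows only the control flow from hypothesis to in-silico screening to experimental feedback.

import random

# Hypothetical stand-ins for an LLM planner, computational tools, and a wet-lab interface.
def propose_hypothesis(history):
    """Generate a candidate design conditioned on everything observed so far."""
    return {"candidate": f"variant-{len(history)}", "rationale": "improve predicted binding"}

def run_tool(hypothesis):
    """Score the candidate with an in-silico tool (e.g., a structure or affinity predictor)."""
    return {"predicted_score": random.random()}

def run_experiment(hypothesis):
    """Dispatch the candidate to automated experimental validation and return a measurement."""
    return {"measured_score": random.random()}

history = []
for iteration in range(3):
    hyp = propose_hypothesis(history)
    prediction = run_tool(hyp)
    if prediction["predicted_score"] > 0.5:          # only promising designs go to the lab
        result = run_experiment(hyp)
    else:
        result = {"measured_score": None}
    history.append({**hyp, **prediction, **result})  # feed outcomes back into the next round

for record in history:
    print(record)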
@article{GenAI4bio:Li,
  author  = {Xiner Li and Xingyu Su and Yuchao Lin and Chenyu Wang and Yijia Xiao and Tianyu Liu and Chi Han and Michael Sun and Montgomery Bohde and Anna Hart and Wendi Yu and Masatoshi Uehara and Gabriele Scalia and Xiao Luo and Carl Edwards and Wengong Jin and Jianwen Xie and Ehsan Hajiramezanali and Edward De Brouwer and Qing Sun and Byung-Jun Yoon and Xiaoning Qian and Marinka Zitnik and Heng Ji and Hongyu Zhao and Wei Wang and Shuiwang Ji},
  title   = {Generative Artificial Intelligence for Biology: Toward Unifying Models, Algorithms, and Modalities},
  journal = {ChemRxiv},
  volume  = {2026},
  number  = {0212},
  year    = {2026},
}