A groundbreaking study led by Rosa Fernández at the Institute for Evolutionary Biology (IBE, CSIC-UPF) and Ana Rojas at the Andalusian Center for Developmental Biology (CABD, CSIC-UPO) has unveiled FANTASIA, an artificial intelligence tool capable of predicting the function of any protein directly from its sequence. Using the same principles behind language models like ChatGPT, FANTASIA (Functional ANnoTAtion based on embedding space SImilArity) treats protein sequences as a biological language, deciphering its “grammar” to infer what each protein does without relying on traditional homology-based comparisons.
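To make the idea of annotation by embedding-space similarity concrete, the sketch below transfers a function label from the closest already-annotated protein to an unannotated one. It is a toy illustration under stated assumptions, not FANTASIA's actual code. The three-dimensional vectors, the function labels, the use of cosine similarity, the 0.9 cutoff, and the names `cosine_similarity`, `transfer_annotation` and `REFERENCE` are all hypothetical. In a real pipeline the vectors would come from a protein language model and would have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def transfer_annotation(query, reference, min_similarity=0.9):
    """Return (label, similarity) for the most similar reference embedding.

    The label is None when even the best match falls below the
    (assumed) similarity cutoff, i.e. the protein stays unannotated.
    """
    best_label, best_sim = None, -1.0
    for embedding, label in reference:
        sim = cosine_similarity(query, embedding)
        if sim > best_sim:
            best_label, best_sim = label, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_label, best_sim

# Hypothetical reference set: (embedding, known function) pairs.
REFERENCE = [
    ([0.9, 0.1, 0.0], "kinase activity"),
    ([0.0, 0.8, 0.6], "DNA binding"),
    ([0.1, 0.0, 0.95], "transmembrane transport"),
]

# An unannotated protein whose embedding lies close to the first reference.
label, sim = transfer_annotation([0.85, 0.15, 0.05], REFERENCE)
print(label, round(sim, 3))
```

The design point is that no sequence alignment takes place: two proteins can receive the same annotation because their embeddings are close, even when their sequences have diverged too far for homology searches to link them.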
In a sweeping analysis of nearly 1,000 animal genomes, FANTASIA successfully annotated 24 million previously uncharacterized genes with remarkable accuracy. The open-access tool runs efficiently on standard computers and can process an entire genome in just hours, bringing high-level genomic analysis within reach of research groups worldwide. By revealing the “dark proteome”, the large share of proteins whose functions remain unknown, FANTASIA provides an unprecedented window into the complexity and evolution of life.
Although we know the function of about 80–90% of human proteins, the function of more than half of all proteins in other organisms, such as invertebrates, remains a mystery.
Beyond its evolutionary implications, the tool holds vast potential for biomedical and biotechnological innovation. By identifying novel protein functions, FANTASIA could guide the discovery of new therapeutic targets and enzymes. As researchers describe it, this “ChatGPT of proteins” is set to transform how we explore the molecular foundations of life.
“We know that other research groups around the world are already using FANTASIA and that it works not only in animals, but also in plants, viruses, fungi, and protists. The potential for discovering new genes that could revolutionize biotechnology, medicine, or biodiversity conservation is limitless,” concludes Fernández.
Martínez-Redondo, G. I., Perez-Canales, F. M., Carbonetto, B., Fernández, J. M., Barrios-Núñez, I., Vázquez-Valls, M., Cases, I., Rojas, A. M., & Fernández, R. (2025). FANTASIA leverages language models to decode the functional dark proteome across the animal tree of life. Communications Biology, 8(1), 1227. https://doi.org/10.1038/s42003-025-08651-2



