A groundbreaking AI model called popEVE, developed by researchers at Harvard Medical School and the Centre for Genomic Regulation (CRG), promises to transform the diagnosis of rare genetic diseases. PopEVE uses evolutionary data from hundreds of thousands of species, together with data on genetic variation across the human population, to pinpoint mutations in human proteins that may cause disease, including mutations that have never been seen before. By comparing human protein sequences with those of other species, popEVE identifies which parts of the roughly 20,000 human proteins are essential for life and which can tolerate mutations. This allows the model not only to detect potential disease-causing mutations but also to rank their severity, offering doctors a new tool to prioritize the most dangerous mutations in a patient’s genome.
Over billions of years, evolution on Earth has already run countless experiments, testing which changes a protein can tolerate and which are too damaging to survive.
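The idea of reading tolerance from cross-species comparison can be illustrated with a toy calculation. The sketch below scores each position of a small, made-up protein alignment by its Shannon entropy: positions that never vary across species score zero (likely constrained), while variable positions score higher (likely tolerant). The sequences and function names are hypothetical, and column entropy is a classical, much simpler stand-in chosen for illustration; it is not popEVE's actual model.

```python
import math
from collections import Counter

# Hypothetical aligned protein fragments from four species (made-up data).
alignment = [
    "MKTAW",  # imagined human sequence
    "MKSAW",
    "MKTGW",
    "MRTVW",
]

def column_entropy(column):
    """Shannon entropy (in bits) of the amino acids seen at one position."""
    counts = Counter(column)
    total = len(column)
    # Written as p * log2(1/p) so a fully conserved column gives exactly 0.0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def conservation_profile(sequences):
    """Entropy per alignment column; lower means more conserved."""
    return [column_entropy(col) for col in zip(*sequences)]

profile = conservation_profile(alignment)
for position, entropy in enumerate(profile, start=1):
    label = "conserved" if entropy == 0.0 else "variable"
    print(f"position {position}: entropy {entropy:.2f} bits ({label})")
```

In this toy alignment the first and last positions never change, so a mutation there would be flagged as more suspicious than one at the fourth position, which differs in three of the four species.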
This model is particularly vital for rare diseases, where half of patients never receive a clear diagnosis. Unlike traditional methods that rely on large patient cohorts, popEVE works with individual genetic data, making it faster, simpler, and more accessible, especially in healthcare systems with limited resources. By accurately identifying harmful mutations, even in cases where no previous data exists, popEVE can help doctors make better-informed decisions without needing parental DNA or extensive genetic histories, says Mafalda Dias of the CRG, one of the leaders of the study. In testing, popEVE correctly identified known disease mutations 98% of the time and uncovered new candidate genes linked to developmental disorders.
Furthermore, popEVE addresses the underrepresentation and lack of diversity in genetic databases, where most of the data comes from people of European ancestry. The model asks only whether a mutation has been seen before in humans, whether once in a specific population or a thousand times in over-represented European populations, so that all populations are considered equally, regardless of ancestry.
“No one should get a scary result just because their community isn’t well represented in global databases. popEVE helps fix that imbalance, something the field has been missing for a long time.”
Jonathan Frazer, co-corresponding author (CRG)
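The "seen before or not" treatment of population data described above can be made concrete with a small sketch. It contrasts counting how often a variant appears with simply asking whether it has ever appeared. The variant names, ancestry groups, counts, and function names are all made up for illustration, and this is an assumption about the general idea rather than popEVE's real implementation.

```python
from typing import Dict

# Hypothetical observation counts per ancestry group (made-up numbers).
counts: Dict[str, Dict[str, int]] = {
    "variant_A": {"european": 1000, "african": 0, "east_asian": 0},
    "variant_B": {"european": 0, "african": 1, "east_asian": 0},
    "variant_C": {"european": 0, "african": 0, "east_asian": 0},
}

def frequency_evidence(per_group: Dict[str, int]) -> int:
    """Frequency-style evidence: total number of times the variant was observed."""
    return sum(per_group.values())

def seen_evidence(per_group: Dict[str, int]) -> bool:
    """Presence-style evidence: was the variant observed at least once in anyone?"""
    return any(n > 0 for n in per_group.values())

for name, per_group in counts.items():
    print(name, frequency_evidence(per_group), seen_evidence(per_group))
```

Under the frequency view, variant_A looks a thousand times better supported than variant_B simply because one group is sampled more heavily. Under the presence view the two are treated alike, and only variant_C, never observed in anyone, stands apart.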
In the study, popEVE uncovered 123 mutations that had never been linked to developmental disorders before; 104 of these were observed in just one or two patients.
Together, these results suggest popEVE could make genetic diagnostics both more equitable and more efficient.
Orenbuch, R., Shearer, C.A., Kollasch, A.W. et al. Proteome-wide model for human disease genetics. Nat Genet (2025). https://doi.org/10.1038/s41588-025-02400-1



