UW Nobel winner’s lab releases most powerful protein design tool yet

David Baker’s lab at the University of Washington is announcing two major leaps in the field of AI-powered protein design. The first is a souped-up version of its existing RFdiffusion2 tool that can now design enzymes with performance nearly on par with those found in nature. The second is the release of a new, general-purpose version of its model, named RFdiffusion3, which the researchers are calling their most powerful and versatile protein engineering technology to date.
Last year, Baker received the Nobel Prize in Chemistry for his pioneering work in protein science, which includes a deep-learning model called RFdiffusion. The tool allows scientists to design novel proteins that have never existed. These machine-made proteins hold immense promise, from developing medicines for previously untreatable diseases to solving knotty environmental challenges.
Baker leads the UW’s Institute for Protein Design, which released the first version of the core technology in 2023, followed by RFdiffusion2 earlier this year. The second model was fine-tuned for creating enzymes, the proteins that orchestrate the transformation of molecules and dramatically speed up chemical reactions.
The latest accomplishments are being shared today in publications in the leading scientific journals Nature and Nature Methods, as well as a preprint last month on bioRxiv.
A better model for enzyme construction

In the improved version of RFdiffusion2, the researchers took a more hands-off approach to guiding the technology, giving it a specific enzymatic task to perform but not specifying other features. Or as the team described it in a press release, the tool produces “blueprints for physical nanomachines that must obey the laws of chemistry and physics to function.”
“You basically let the model have all this space to explore and ... you really allow it to search a really wide space and come up with great, great solutions,” said Seth Woodbury, a graduate student in Baker’s lab and an author on both papers publishing today.
In addition to UW scientists, researchers from MIT and Switzerland’s ETH Zurich contributed to the work.
The new approach is remarkable for quickly generating higher-performing enzymes. In a test of the tool, it was able to solve 41 out of 41 difficult enzyme design challenges, compared to only 16 for the previous version.
“When we designed enzymes, they’re always an order of magnitude worse than native enzymes that evolution has taken billions of years to find,” said Rohith Krishna, a postdoctoral fellow and lead developer of RFdiffusion2. “This is one of the first times that we’re not one of the best enzymes ever, but we’re in the ballpark of native enzymes.”
The researchers successfully used the model to create proteins called metallohydrolases, which accelerate difficult reactions using a precisely positioned metal ion and an activated water molecule. The engineered enzymes could have important applications, including the destruction of pollutants.
The promise of rapidly designed catalytic enzymes could unleash wide-ranging applications, Baker said.
“The first problem we really tackled with AI, it was largely therapeutics, making binders to drug targets,” he said. “But now with catalysis, it really opens up sustainability.”
The researchers are also working with the Gates Foundation to figure out lower-cost ways to build what are known as small molecule drugs, which interact with proteins and enzymes inside cells, often by blocking or enhancing their function to affect biological processes.
The most powerful model to date

While RFdiffusion2 is fine-tuned to make enzymes, the Institute for Protein Design researchers were also eager to build a tool with wide-ranging functionality. RFdiffusion3 is that new AI model. It can create proteins that interact with virtually every type of molecule found in cells, including proteins that bind DNA, other proteins and small molecules, in addition to proteins with enzyme-related functions.
“We really are excited about building more and more complex systems, so we didn’t want to have bespoke models for each application. We wanted to be able to combine everything into one foundational model,” said Krishna, a lead developer of RFdiffusion3.
Today the team is publicly releasing the code for the new machine learning tool.
“We’re really excited to see what everyone else builds on it,” Krishna said.
And while the steady stream of model upgrades, breakthroughs and publications in top-notch journals seems to continue unabated from the Institute for Protein Design, there are plenty of behind-the-scenes stumbles, Baker said.
“It all sounds beautiful and simple at the end when it’s done,” he said. “But along the way, there’s always the moments when it seems like it won’t work.”
But the researchers keep at it, and so far at least, they keep finding a path forward. And the institute continues minting new graduates and further training postdocs who go on to launch companies or establish their own academic labs.
“I don’t surf, but I sort of feel like we’re riding a wave and it’s just fun,” Baker said. “I mean, it’s so many, so many problems are getting solved. And yeah, it’s really exhilarating, honestly.”
The Nature paper, titled “Computational design of metallohydrolases,” was authored by Donghyo Kim, Seth Woodbury, Woody Ahern, Doug Tischer, Alex Kang, Emily Joyce, Asim Bera, Nikita Hanikel, Saman Salike, Rohith Krishna, Jason Yim, Samuel Pellock, Anna Lauko, Indrek Kalvet, Donald Hilvert and David Baker.
The Nature Methods paper, titled “Atom-level enzyme active site scaffolding using RFdiffusion2,” was authored by Woody Ahern, Jason Yim, Doug Tischer, Saman Salike, Seth Woodbury, Donghyo Kim, Indrek Kalvet, Yakov Kipnis, Brian Coventry, Han Raut Altae-Tran, Magnus Bauer, Regina Barzilay, Tommi Jaakkola, Rohith Krishna and David Baker.