Using AI to Create Proteins from Beyond Nature

By Mark Miller

Machine learning and other artificial intelligence (AI) tools have already been used in protein research to predict the structures of naturally occurring proteins. Now, biochemists are using AI to move beyond natural templates to build proteins that have never existed before. But how does AI replicate natural processes to help build proteins from scratch, and what are the possible applications of these new proteins?

A Text Model Like ChatGPT

According to the article "Proteins Never Seen in Nature Are Designed Using AI to Address Biomedical and Industrial Problems Unsolved by Evolution" by Michael Eisenstein in Scientific American, language-based generative AI models—like the one used by ChatGPT—can be adapted to generate new protein sequences and structures. In fact, an effective way to understand protein sequences is to think of them as text.

In these applications, AI algorithms are trained on vast amounts of biological information but must also follow chemical and biological rules—or biological "grammar," as Eisenstein calls it. “To generate a fluent sentence or a document, the algorithm needs to learn about relationships between different types of words, but it also needs to learn facts about the world to make a document that’s cohesive and makes sense,” Ali Madani, founder of the protein design company Profluent, said in the article. With this text-based modeling technology in place, AI can help develop new proteins similar to the way ChatGPT produces text based on the language it’s been trained on.
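
To make the text analogy concrete, here is a minimal sketch in Python. It is far simpler than the large language models described above: it only counts which amino acid letter tends to follow which, then samples a new sequence one letter at a time. The training sequences and function names are invented for illustration, and nothing here reflects Profluent's actual method.

```python
import random
from collections import defaultdict

# The 20 standard amino acids, one letter each: the "alphabet" of protein text.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy training sequences, invented for illustration only.
TRAINING_SEQUENCES = ["MKTAYIAKQR", "MKVLAAGIVG", "MKTLLLTLVV"]

# Marker tokens for the beginning and end of a sequence.
START, END = "^", "$"


def train_bigram_model(sequences):
    """Count how often each residue follows another across the training set."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        if not set(seq) <= set(AMINO_ACIDS):
            raise ValueError(f"unexpected character in sequence: {seq}")
        tokens = [START, *seq, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate_sequence(counts, rng, max_length=30):
    """Sample one residue at a time, each conditioned on the previous one."""
    residues = []
    current = START
    while len(residues) < max_length:
        followers = counts[current]
        choices = list(followers)
        weights = [followers[c] for c in choices]
        current = rng.choices(choices, weights=weights, k=1)[0]
        if current == END:
            break
        residues.append(current)
    return "".join(residues)


model = train_bigram_model(TRAINING_SEQUENCES)
new_sequence = generate_sequence(model, random.Random(0))
print(new_sequence)
```

A model this small captures only which letters tend to sit next to each other. The "grammar" Eisenstein refers to involves much longer-range relationships, which is why real systems rely on far larger models and datasets.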

Images and Landscapes

The language-model approach is proving effective, but it’s not the only option. A program called Chroma employs diffusion models, which are typically used in image-generation AI tools and are adept at manipulating multidimensional data.
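
As a rough illustration of how diffusion handles this kind of data, the Python sketch below applies the standard forward noising step used to train diffusion models to a handful of 3D points. A generative model would learn to run this process in reverse, starting from noise and recovering a plausible structure. The coordinates and the `add_noise` helper are hypothetical, and this is not Chroma's implementation.

```python
import math
import random


def add_noise(coords, alpha_bar, rng):
    """Blend each coordinate with Gaussian noise.

    Uses the common closed form x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,
    where eps is drawn from a standard normal distribution.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * value + noise * rng.gauss(0.0, 1.0) for value in point)
        for point in coords
    ]


# Hypothetical backbone positions (arbitrary units), invented for illustration.
backbone = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0), (4.5, 0.5, 1.0)]

rng = random.Random(0)
# alpha_bar = 1.0 keeps the structure intact; 0.0 replaces it with pure noise.
for alpha_bar in (1.0, 0.5, 0.0):
    noisy = add_noise(backbone, alpha_bar, rng)
    print(alpha_bar, [tuple(round(v, 2) for v in point) for point in noisy])
```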

Faruck Morcos, PhD, an associate professor of biological sciences at The University of Texas at Dallas (UT Dallas), is using a variant of this imaging strategy. According to a story published by UT Dallas, he and his team are generating 3D landscapes that allow them to visualize new proteins. “Our new framework is like a road map,” Morcos said. “Rather than simply analyzing existing protein sequences, we look at the evolution of the proteins and construct maps looking both at proteins that already exist as well as generating and plotting out potential sequences.”

"For the applications we are interested in, like sustainability, medicine, food, health, and materials design, we are going to need to go beyond what nature has done."
- Markus Buehler, PhD, McAfee Professor of Engineering, Massachusetts Institute of Technology

Proof Is in the Folding

One of the key challenges in designing and building new proteins is validating that they will behave like natural proteins rather than just being random chains of chemical compounds.

A team of researchers at the University of Toronto is testing its AI-built proteins using OmegaFold, a structure-prediction tool similar to DeepMind's AlphaFold 2. With this system, the researchers were able to confirm that their new sequences folded into functional structures. This validation is critical because folding turns a protein chain into a three-dimensional structure, and that structure determines whether the protein has the correct shape to function. The team then confirmed the viability of its structures by creating physical versions of them in the lab.

Protein Power

Because new proteins can be designed for specific traits, they hold tremendous promise in biomedical, industrial, and environmental applications.

A report from the Massachusetts Institute of Technology (MIT) states that while new proteins can be problematic in biomedical applications because their properties aren’t fully understood, they show great potential because they can be modeled after existing natural proteins and tailored to meet specific requirements.

In the industrial world, new proteins can be used to manufacture materials with specific rigidity and pliability properties to replace petroleum- or ceramic-based materials, but with a much smaller carbon footprint. Food coatings that help keep produce safe to eat and fresher for longer are another possibility.

"For the applications we are interested in, like sustainability, medicine, food, health, and materials design, we are going to need to go beyond what nature has done,“ said Markus Buehler, PhD, McAfee Professor of Engineering at MIT.

Mark Miller is a Thermo Fisher Scientific staff writer.
