Ivan Kairatov is a leading expert in biopharmaceutical innovation with a distinguished career in research and development, specializing in the intersection of artificial intelligence and synthetic biology. As a key contributor to the development of high-throughput protein engineering frameworks, he has pioneered methods that move beyond traditional “trial and error” lab work. His expertise lies in leveraging machine learning to navigate the astronomical complexity of sequence space, transforming how we design enzymes, genome editors, and therapeutic antibodies.
The following discussion explores the shift from massive, random datasets to strategic, small-scale experimental design. We delve into the mechanics of “lab-in-the-loop” systems, the hidden biases of current protein language models, and the technical breakthroughs in DNA assembly that allow researchers to compress years of evolutionary work into a matter of weeks.
Protein engineering involves navigating a search space of $20^{100}$ possible variants, yet practical lab constraints often limit testing to just a few hundred. How do you prioritize which variants to build to avoid the “noise” of non-functional mutations, and what specific metrics define a high-quality training set?
The reality of the lab is that we are constantly fighting the “curse of dimensionality,” where the number of possible protein sequences outstrips the number of atoms in the universe. To avoid drowning in non-functional noise, we pivot away from the traditional approach of testing thousands of random variants, which usually only teaches a model how to fail. A high-quality training set is defined by its density of functional information; specifically, we focus on a compact library of roughly 200 variants that are enriched for known or predicted beneficial mutations. By prioritizing mutations that already show a propensity for function, every data point we collect serves as a meaningful signal for the neural network. This strategic curation ensures that we aren’t just scanning sequence space, but are instead mapping the peaks of the fitness landscape where activity actually resides.
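The curation step described here can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: `model_scores` is a hypothetical mapping from candidate mutations to predicted fitness, and the library is simply the top-ranked slice of it.

```python
# Illustrative sketch (not the authors' code): curate a compact training
# library by keeping only the highest-scoring candidate mutations.
# `model_scores` is a hypothetical {mutation: predicted_fitness} dict.

def curate_library(model_scores, size=200):
    """Return the top-`size` mutations ranked by predicted benefit."""
    ranked = sorted(model_scores, key=model_scores.get, reverse=True)
    return ranked[:size]

# Toy example with made-up, monotonically decreasing scores:
scores = {f"A{i}G": 1.0 / i for i in range(1, 501)}
library = curate_library(scores, size=200)
print(len(library))   # 200
print(library[0])     # "A1G", the top-ranked mutation
```

The point of the sketch is the budget: whatever the scoring model is, only a few hundred designs ever reach the bench, so ranking quality dominates everything downstream.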
Neural networks trained solely on single-mutant data frequently struggle to predict complex multi-mutant performance. Why is it more effective to systematically test pairwise combinations of 15–20 beneficial mutations rather than larger random datasets, and how do these interactions reveal the underlying rules of protein synergy?
Single mutations are the building blocks, but the real magic—and the real challenge—lies in epistasis, or how mutations interact with one another. When we systematically test all pairwise combinations of 15 to 20 beneficial mutations, we create a specialized dataset of 100 to 200 measurements that explicitly capture synergy, antagonism, and additivity. A double mutant might perform 10 times better than the sum of its parts, or it might completely collapse the protein structure, and these specific outcomes are the “rules” the neural network needs to learn. By feeding the model these double-mutant interactions, it gains the mathematical intuition required to extrapolate and predict how five, six, or even seven mutations will behave together. This pairwise approach is far more potent than a random dataset of 10,000 variants because it focuses entirely on the functional “cross-talk” between the most promising residues.
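The arithmetic behind the budget above is just the pairwise combination count: with n beneficial singles there are n(n-1)/2 doubles, i.e. 105 for n=15 and 190 for n=20, which is exactly the 100-to-200-measurement range mentioned. A minimal sketch, with placeholder mutation names:

```python
# Sketch of enumerating the double-mutant library described above:
# every unordered pair of the chosen beneficial single mutations.
from itertools import combinations

def pairwise_library(beneficial_singles):
    """Enumerate every double mutant from a list of single mutations."""
    return [frozenset(pair) for pair in combinations(beneficial_singles, 2)]

singles = [f"M{i}" for i in range(20)]   # 20 placeholder single mutations
doubles = pairwise_library(singles)
print(len(doubles))                      # 190 = 20 * 19 / 2
```

Each of those ~190 measurements carries an explicit epistasis signal (synergy, antagonism, or additivity), which is why the set is so information-dense compared with random variants.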
Many protein language models exhibit biases, such as penalizing proline substitutions, which can lead researchers to overlook critical mutations. How does combining sequence and 3D structural models mitigate these errors, and what steps are necessary to normalize scores so that unconventional but high-performing variants are identified?
Relying on a single model is dangerous because every architecture has a “blind spot,” such as the common bias against proline which many models view as a structural disruptor regardless of its actual functional benefit. In our work with the APEX enzyme, we found that standard models completely missed the A134P mutation, which actually delivers a massive 53-fold improvement in activity. To fix this, we use an ensemble approach that blends sequence-based predictions with 3D structural analysis, effectively cross-referencing different “opinions” on a mutation’s viability. We then implement a normalization process for amino-acid-specific biases, which levels the playing field and allows these unconventional, high-performing candidates to rise to the top of our priority list. By combining these diverse perspectives, we typically identify around 20 beneficial mutations, nearly doubling the 11 discovered by any single model alone.
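One simple way to implement the amino-acid-specific normalization described here is to z-score each model's predictions within groups that share the same substituted residue, so a systematically penalized residue class (like proline) is judged against its own baseline. This is a hedged sketch of that idea, not the authors' exact procedure; it assumes mutation names in the conventional "A134P" form, where the last character is the new residue.

```python
# Sketch: per-residue z-scoring of raw model scores, so substitutions the
# model systematically penalizes (e.g. proline) can still rank highly
# relative to other mutations introducing the same residue.
from collections import defaultdict
from statistics import mean, stdev

def normalize_by_target_residue(scores):
    """scores: {mutation_name: raw_score} -> z-scored within residue groups."""
    groups = defaultdict(list)
    for mut, s in scores.items():
        groups[mut[-1]].append(s)            # group by substituted residue
    normed = {}
    for mut, s in scores.items():
        vals = groups[mut[-1]]
        mu = mean(vals)
        sd = stdev(vals) if len(vals) > 1 else 1.0
        normed[mut] = (s - mu) / (sd or 1.0)
    return normed

raw = {"A134P": -1.0, "G50P": -3.0, "S10A": 2.0, "T20A": 1.0}
print(normalize_by_target_residue(raw))
```

In this toy input, both proline substitutions carry low raw scores, but after normalization "A134P" scores positively within its own group and can compete with the alanine substitutions for a spot on the priority list.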
In a single round of machine learning, it is now possible to predict hyperactive variants with up to seven mutations. What is the step-by-step process for using a compact training set of 200 variants to reach these results, and how does this compressed timeline compare to traditional iterative engineering?
The process is remarkably streamlined: first, we use our ensemble of models to pick the top 15–20 single mutations, then we synthesize and test their pairwise combinations to generate a training set of roughly 200 variants. This data is fed into a fully connected neural network which, having learned the rules of epistasis from the doubles, predicts a small handful of complex multi-mutants—sometimes testing as few as nine final candidates. In the case of dCasRx, this allowed us to identify variants with up to seven mutations that showed nearly 10-fold improvements in trans-splicing efficiency. This “one-and-done” machine learning round replaces the traditional iterative cycle of 5 to 10 rounds of evolution. We are effectively compressing a project that used to take the better part of a year into a window of just a few weeks, moving from initial prediction to hyperactive variant with unprecedented speed.
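The extrapolation step above can be illustrated with a deliberately simplified stand-in for the fully connected network: a linear model with explicit pairwise interaction features, fit only on singles and doubles and then asked about an unseen higher-order combination. Everything here is synthetic (the mutation effects and the "true" fitness function are invented for the demo); the point is only the structure of the workflow, not the authors' model.

```python
# Minimal sketch: learn main effects + pairwise epistasis from single- and
# double-mutant data, then extrapolate to an unseen multi-mutant.
import numpy as np
from itertools import combinations

N = 4  # number of candidate mutations, kept tiny for illustration

def features(variant):
    """Binary presence of each mutation plus every pairwise interaction."""
    x = [1.0 if i in variant else 0.0 for i in range(N)]
    x += [x[i] * x[j] for i, j in combinations(range(N), 2)]
    return x

def true_fitness(variant):
    """Synthetic ground truth: additive effects, one synergy, one clash."""
    f = 0.5 * len(variant)
    if {0, 1} <= set(variant): f += 1.0   # synergistic pair
    if {2, 3} <= set(variant): f -= 0.8   # antagonistic pair
    return f

# Training data: all singles and all doubles (mirroring the ~200-variant set).
train = [(i,) for i in range(N)] + list(combinations(range(N), 2))
X = np.array([features(v) for v in train])
y = np.array([true_fitness(v) for v in train])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

triple = (0, 1, 2)  # a multi-mutant never seen during training
pred = float(np.array(features(triple)) @ w)
print(round(pred, 2), round(true_fitness(triple), 2))
```

Because the synthetic fitness contains no interactions above second order, the pairwise model recovers the triple mutant exactly; real proteins also carry higher-order epistasis, which is precisely where a neural network trained on the same doubles data earns its keep.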
Traditional DNA synthesis and mutagenesis can take months when building variants with many simultaneous mutations. What specific changes to reaction conditions or primer design allow for 40–70% assembly efficiency in complex variants, and how does this rapid synthesis accelerate the transition from prediction to experimental validation?
The technical bottleneck in the lab is often the physical assembly of these complex designs, where traditional methods fail as the number of mutations increases. To overcome this, we developed a method called MULTI-assembly, which uses a computational oligo designer to generate primers that are mathematically optimized for the specific target sequence. By systematically refining the reaction conditions and assembly parameters, we’ve pushed efficiency to the 40–70% range for variants containing up to nine mutations across several kilobases of DNA. This means we no longer have to wait weeks for expensive commercial synthesis or struggle with failed manual mutagenesis. We can now move from a computer-generated design to a physical, testable protein in just a few days, which is the heartbeat of a true “lab-in-the-loop” framework.
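The MULTI-assembly oligo designer itself is not described in detail here, but two of the basic properties any such designer must balance are primer GC content and melting temperature. The helper below is purely illustrative (it is not the authors' tool) and uses the rough Wallace rule, which is only valid for short oligos:

```python
# Hypothetical helper, not the MULTI-assembly designer: score a candidate
# mutagenic primer by GC fraction and a Wallace-rule melting temperature.

def gc_fraction(seq):
    """Fraction of G/C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Wallace rule: Tm ~ 2*(A+T) + 4*(G+C) degrees C, for short oligos."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

primer = "ATGCGTACCTGAGCTAAG"
print(round(gc_fraction(primer), 2), wallace_tm(primer))  # 0.5 54
```

A real designer optimizes far more than this (hairpins, mispriming, junction overlaps across the mutation sites), but keeping every primer inside a narrow Tm window is what lets a single pooled reaction assemble many mutations at once.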
High-performing variants like those found in APEX or dCasRx show that small, strategic datasets can outperform deep mutational scans of 11,000+ variants. How do you determine which function-enhancing mutations to extract for pairwise testing, and what anecdotal evidence suggests this approach is versatile across different protein families?
The secret is focusing on “quality over quantity”: we treat deep mutational scans as a starting pool from which we extract only the genuine high-performers for our pairwise matrix. In the case of APEX, this led us to a variant with a 256-fold improvement over the wild-type, which is 4.8 times better than the already heavily optimized APEX2 enzyme. We’ve seen similar success with an anti-CD122 antibody, where we achieved a 2.7-fold improvement in binding and a 6.5-fold increase in expression using the same workflow. This versatility across enzymes, genome editors, and antibodies proves that the underlying logic—learning from strategic double mutants—is a universal principle of protein architecture. It suggests that whether you are working on a therapeutic or an industrial catalyst, the rules of synergy remain the same.
What is your forecast for AI-guided protein design?
I believe we are entering an era where “rational design” finally lives up to its name, moving away from massive, wasteful screening towards surgical, data-efficient interventions. In the near future, the integration of 3D structural data with real-time experimental feedback will allow us to design highly complex, multi-functional proteins in a single afternoon. We will see AI models that don’t just predict fitness, but also account for manufacturability and stability from the very first step, drastically reducing the failure rate of new drug candidates. Ultimately, the barrier between a computational idea and a functional biological tool will become almost transparent, allowing us to respond to new pathogens or environmental challenges with bespoke molecular solutions in a matter of days.
