Generative AI Reconstructs DNA’s 3D Shape for the First Time

Deep within the nucleus of every human cell, a microscopic miracle unfolds: roughly two meters of DNA, twisted and folded into an impossibly compact structure, directs the business of life. For decades, scientists have worked to understand not only the sequence of DNA (the genetic alphabet of A, T, C, and G) but also the three-dimensional arrangement that gives these letters their function. Now, researchers at Skoltech have enlisted an unexpected ally in this quest: generative artificial intelligence.

In a world-first achievement published in Scientific Reports, the Skoltech team, led by Assistant Professor Kirill Polovnikov, has shown that generative AI—the same type of technology that powers chatbots and image generators—can fill in the blanks in the spatial map of DNA, reconstructing the intricate architecture that underpins gene expression, development, and disease.

Why DNA’s 3D Shape Matters More Than We Realized

At a glance, DNA might seem like a simple code—just four letters repeated billions of times. But in reality, genes don’t operate in isolation. Their position in three-dimensional space determines when, where, and how they are activated. Imagine an orchestra: even if every musician is playing the right notes, without the right arrangement and timing, the result is chaos.

Inside each of our cells are 46 massive DNA molecules—our chromosomes—each coiled, looped, and folded in a unique way. These configurations influence everything from cell identity to genetic disorders. If a gene is tucked away in an inaccessible region of chromatin (the complex of DNA and protein that makes up chromosomes), it may remain silent. Conversely, genes exposed at the wrong time or in the wrong place can trigger disease, including cancer and congenital disorders.

Thus, understanding the spatial arrangement of genes isn’t just an academic pursuit—it’s a medical necessity.

The Challenge: Fragmented Data and the Limits of Microscopy

For years, the go-to method for visualizing the 3D layout of DNA was fluorescence microscopy. Scientists would stain selected gene sequences with fluorescent markers, enabling them to light up under the microscope. But there’s a catch: you can only tag gene sequences that are unique enough to be recognized by a complementary probe. If a sequence contains repetitive elements—say, a string of adenines (A’s)—it becomes impossible to mark it precisely without staining similar sequences elsewhere.

This limitation has plagued researchers. Despite sophisticated imaging tools, the result has always been a jigsaw puzzle with too many missing pieces. Scientists could see some parts of the DNA structure—but not the whole picture. This has slowed progress in understanding how chromatin folds and how that folding changes in disease states.

“We’ve always had to work with incomplete data,” said Polovnikov. “That limits everything—from basic science to medical applications.”

Until now.

A New Solution: Teaching AI to Reconstruct the Genome’s Missing Links

What if, instead of just collecting more data, we could generate the missing pieces based on what we already have? This is where generative AI enters the picture. Best known for crafting realistic images, deepfake videos, and eerily human-like conversations, generative models—particularly diffusion and transformer-based architectures—have recently shown promise far beyond their original domains.

“We realized that if we could know the distances between enough pairs of genes, the rest becomes a mathematical problem,” Polovnikov explained. “The missing distances can, in theory, be inferred. But doing that accurately and reliably? That’s where generative AI becomes indispensable.”

In the Skoltech study, the team trained a generative model to learn from the partially observed distance maps obtained via fluorescence microscopy. Once trained, the model could predict the missing distances between gene pairs, effectively reconstructing a more complete 3D map of the genome. The result wasn’t a vague approximation—it was a precise, mathematically consistent structure that fit the known data.

This is not just an improvement—it’s a breakthrough.
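To make the "mathematical problem" concrete, here is a minimal Python sketch of why a complete set of pairwise distances pins down a 3D shape. It is not the Skoltech team's model: it uses classical multidimensional scaling, a standard textbook technique, and a toy random-walk path invented for illustration. The function names, the 50-point path, and the 30% masking rate are all assumptions. The masked matrix only shows what "incomplete" data looks like; filling in those gaps is the part the study hands to a generative model.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between the rows of coords (n x 3)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def coords_from_distances(dist, dim=3):
    """Classical multidimensional scaling: recover coordinates, up to
    rotation, reflection and translation, from a complete distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centering the squared distances gives the Gram matrix
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]    # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

rng = np.random.default_rng(0)

# Hypothetical toy path: a 3D random walk standing in for 50 genomic loci
true_coords = np.cumsum(rng.normal(size=(50, 3)), axis=0)
full = distance_matrix(true_coords)

# Assumed masking scheme: hide roughly 30% of pairs, symmetrically,
# to mimic distances that microscopy could not measure
mask = np.triu(rng.random(full.shape) < 0.3, k=1)
mask = mask | mask.T
observed = np.where(mask, np.nan, full)

# With the complete matrix, the shape is recovered essentially exactly
recovered = coords_from_distances(full)
max_error = np.abs(distance_matrix(recovered) - full).max()
```

Here `max_error` measures how far the distances of the recovered shape deviate from the true ones; with a complete matrix it is limited only by floating-point precision. The `observed` matrix, with its missing entries, cannot be fed to `coords_from_distances` directly, which is exactly the gap the inpainting step is meant to close.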

From Polymer Physics to Probabilistic Modeling: A Scientific Paradigm Shift

Traditionally, the modeling of DNA structure has belonged to the realm of polymer physics—the same branch of physics used to describe spaghetti noodles, rubber bands, and other long-chain molecules. While effective to a point, these models often rely on assumptions that limit their predictive power.
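As a rough illustration of the kind of assumption such models make, the simplest polymer model (the ideal chain) treats the molecule as a random walk, which predicts that the mean squared distance between two points on the chain grows in proportion to their separation along it. The Python sketch below checks that scaling numerically. The chain counts, step distribution, and function name are assumptions chosen for illustration, not values from the study. The ordinary random walk used here is the special case of fractional Brownian motion, named in the title of the cited paper, with a Hurst exponent of 1/2.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chains, n_steps = 2000, 200  # illustrative sizes, not from the study

# Each chain is a 3D random walk with independent Gaussian steps
steps = rng.normal(size=(n_chains, n_steps, 3))
chains = np.cumsum(steps, axis=1)

def mean_squared_distance(chains, separation):
    """Average squared spatial distance between points that are
    `separation` steps apart along the chain."""
    diff = chains[:, separation:, :] - chains[:, :-separation, :]
    return float((diff ** 2).sum(axis=-1).mean())

msd_10 = mean_squared_distance(chains, 10)
msd_40 = mean_squared_distance(chains, 40)

# For an ideal chain, quadrupling the separation should roughly
# quadruple the mean squared distance
ratio = msd_40 / msd_10
```

For this toy ensemble, `ratio` should come out close to 4. Real chromatin need not follow this simple law, which is one reason fixed physical assumptions can limit predictive power.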

Generative AI takes a data-first approach. It doesn’t rely on fixed rules about how DNA should behave. Instead, it learns from actual measurements and builds probabilistic models that reflect the messy, complex realities of the cell. This is an important philosophical shift: from imposing laws on nature to learning nature’s patterns from within.

“This is an unconventional application of AI,” Polovnikov said. “We’re using it not for image generation, not for creative writing—but to solve real, concrete problems in structural biology.”

It’s a bold crossover of disciplines—AI meets biophysics—and one that is already yielding tangible results.

The Practical Payoff: Better Diagnosis, Better Treatment

So what can we do with these newly complete 3D maps of the genome?

Quite a lot, it turns out.

First, by comparing the DNA architecture in healthy cells to that in diseased cells, researchers can pinpoint structural anomalies that act as biomarkers. These could be used for early diagnosis of genetic disorders, especially those that arise not from mutations in the DNA sequence, but from how the sequence is organized in space.

Second, such insights open up new therapeutic targets. If a disease is caused by a misfolded region of chromatin, drugs or gene-editing tools could be developed to restructure the genome and restore normal function. This is the next frontier of precision medicine—not just editing the genome’s letters, but rearranging its pages.

Third, AI-driven reconstruction could let scientists simulate how specific interventions, such as CRISPR edits or drug treatments, might affect DNA folding before they are tested in the lab. That could save time and resources, and ultimately benefit patients.

Beyond Biology: A New Era for Generative AI

What makes this study especially exciting is what it reveals about the evolving role of AI in science. Until recently, generative AI was largely considered a tool for creativity—fun and flashy, but with limited scientific rigor. That perception is changing fast.

By solving a long-standing problem in chromatin research, the Skoltech team has shown that generative models can be precise, disciplined, and useful in scientific domains previously thought to be beyond their scope.

This could spark a wave of innovation, as other fields—structural chemistry, neuroscience, climate modeling—begin to explore generative AI not as a gimmick, but as a core analytical tool.

A Glimpse into the Future of Genomics

The discovery that AI can reconstruct the hidden 3D landscape of our genome is not just a technical triumph but a conceptual one. It suggests that even with incomplete, noisy, or fragmented data there is a way forward, and that biology, with all its chaotic complexity, remains accessible to human understanding if we bring the right tools to bear.

In many ways, this study marks the beginning of a new chapter in genomics, one where data science and molecular biology are no longer separate disciplines but intertwined paths to the same goal: understanding life in all its dimensions.

And thanks to a bit of help from machines that were never built for biology, that goal just got a little closer.

Reference: Alexander Lobashev et al., "Generative inpainting of incomplete Euclidean distance matrices of trajectories generated by a fractional Brownian motion," Scientific Reports (2025). DOI: 10.1038/s41598-025-97893-5
