
Something written in every biology textbook of the past 70 years may need a rewrite. Not because the science was careless, or the scientists were wrong to try – but because the tools available at the time created a blind spot so fundamental that it quietly shaped a generation of assumptions about how life first arose on this planet. The implications are not minor. They reach back four billion years, to the very first coded instructions life ever used to build itself. And, it turns out, those instructions may be older and stranger than anyone expected.

The question at the heart of this is deceptively simple: in what order did the alphabet of life assemble itself? Every protein in every organism that has ever lived on Earth is built from the same set of 20 amino acids (the molecular building blocks that proteins are made of). Scientists have long believed they understood, broadly, which of those amino acids came first and which came last. New research published in late 2024 has challenged that timeline in ways that matter deeply – both for understanding life on Earth and for knowing where to look for it elsewhere in the cosmos.

The study came out of the University of Arizona, and its findings are generating serious attention in origin-of-life research circles. The conclusions are not just a minor recalibration. They suggest that the entire foundation on which scientists built their model of genetic code evolution may rest, at least in part, on a flawed set of assumptions – assumptions that date back to a legendary laboratory experiment conducted in 1952.

The Genetic Code, and Why Its Origins Matter

The genetic code, as senior study author and University of Arizona professor of ecology and evolutionary biology Joanna Masel describes it, is “this amazing thing in which a string of DNA or RNA containing sequences of four nucleotides is translated into protein sequences using 20 different amino acids.” Every protein your body makes – every enzyme, every structural molecule, every antibody – is the product of that code executing its instructions.
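The mapping Masel describes can be sketched in a few lines of Python. This is only an illustration of the standard genetic code as it exists today, not anything taken from the study: the names `CODON_TABLE` and `translate` are made up for this example, and the short DNA string is invented.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# "*" marks the three stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Four nucleotides read three at a time give 4 * 4 * 4 = 64 codons,
# which map onto only 20 amino acids plus "stop".
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reading starts at the first base and halts at the first stop codon.
    Trailing bases that do not fill a whole codon are ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented example: methionine (M), tryptophan (W), cysteine (C), then stop.
print(translate("ATGTGGTGTTAA"))
```

The table has 64 entries but only 20 distinct amino acids, which is why most amino acids are spelled by several codons. Tryptophan and methionine are the exceptions, each with a single codon.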

The modern genetic code was likely assembled in stages, beginning with “early” amino acids that may have been delivered to Earth by asteroids or formed by simple chemistry, and ending with “late” amino acids requiring biological synthesis. The order in which those amino acids were added was previously inferred from a consensus among some forty different metrics – many of which reflect the abundance of those molecules in the non-living chemistry of ancient Earth.

The last universal common ancestor, known as LUCA, is the hypothesized common ancestral cell population from which all subsequent life – bacteria, archaea, and all complex organisms – ultimately descends. Most studies suggest LUCA existed by around 3.5 billion years ago, and possibly as early as 4.3 billion years ago. A landmark 2024 study in Nature Ecology & Evolution, led by researchers at the University of Bristol, narrowed that estimate, inferring that LUCA lived approximately 4.2 billion years ago. The study also inferred that LUCA possessed a genome of at least 2.5 megabases encoding roughly 2,600 proteins, comparable in scale to modern prokaryotes, and that it was a complex anaerobic acetogen already living as part of an ecosystem.

LUCA did not come from nowhere. The protein-building machinery it relied on had a history, and that history is what the University of Arizona team set out to excavate.

The Flaw at the Foundation

One of the cornerstones of conventional thinking about genetic code evolution rests on the famous Miller-Urey experiment, conducted in 1952 and published in 1953, which attempted to simulate conditions on early Earth. While valuable in demonstrating that non-living matter could give rise to amino acids through simple chemical reactions, the experiment’s implications have been called into question. Crucially, it did not yield any amino acids containing sulfur, despite sulfur being abundant on early Earth. As a result, sulfur-containing amino acids were assumed to have joined the code much later – but their absence from the flask is hardly surprising, given that sulfur was omitted from the experiment’s ingredients in the first place.

The Miller-Urey experiment simulated early Earth’s atmosphere and oceans to test whether organic molecules could form abiotically – that is, through chemistry alone, without living organisms. It documented the production of amino acids and other organic molecules, demonstrating that chemical evolution (that is, the formation of complex chemicals from simple ones) is possible. But as Britannica reports, the experiment’s gas mixture – methane, ammonia, hydrogen, and water – was a representation of what scientists at the time believed the early atmosphere looked like. Later work revealed that the gases Miller used did not exist in large amounts on early Earth.

The current “consensus” order in which amino acids were added to the genetic code is therefore based on potentially biased criteria. More broadly, the abiotic abundance of a molecule in a laboratory flask might not reflect its biotic abundance inside the primitive cells where the genetic code was actually evolving. This distinction turns out to be enormously consequential.

What the University of Arizona Team Did Differently

Sawsan Wehbi, a doctoral student in the Genetics Graduate Interdisciplinary Program at the University of Arizona, found strong evidence that the textbook version of how the universal genetic code evolved needs revision. Wehbi is the first author of a study published in PNAS suggesting the order in which amino acids were recruited is at odds with what is widely considered the “consensus” of genetic code evolution.

The methodological shift that made this possible was a move away from full protein sequences and toward something smaller and more ancient. Unlike previous studies, which used full-length protein sequences, Wehbi and her group focused on protein domains – shorter stretches of amino acids. “If you think about the protein being a car, a domain is like a wheel,” Wehbi explained. “It’s a part that can be used in many different cars, and wheels have been around much longer than cars.”

This distinction matters enormously. Full proteins are relatively recent evolutionary inventions. The domains within them are far older – some surviving virtually unchanged across billions of years of evolution. By analyzing domains rather than whole proteins, the team could peer further back into the past than previous analyses allowed.

To determine when specific amino acids were likely recruited into the genetic code, the researchers used statistical analysis tools to compare the enrichment of each amino acid in protein sequences dating back to LUCA and even farther back. The team identified more than 400 families of sequences dating back to LUCA. More than 100 of them originated even earlier and had already diversified prior to LUCA.
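The idea of "enrichment" can be illustrated with a small Python sketch. This is not the study's actual pipeline, which is not described in detail here. It simply compares how often each amino acid appears in one set of sequences versus another and reports the log2 ratio. The function names, the pseudocount, and the toy sequences are all assumptions made up for illustration.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 amino acids


def frequencies(sequences, pseudocount=1.0):
    """Return the relative frequency of each amino acid across the sequences.

    The pseudocount (an assumption of this sketch) avoids dividing by zero
    when an amino acid is absent from a small sample.
    """
    counts = Counter("".join(sequences))
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log2_enrichment(ancient, recent):
    """Positive values: more common in `ancient`; negative: more common in `recent`."""
    freq_ancient = frequencies(ancient)
    freq_recent = frequencies(recent)
    return {a: log2(freq_ancient[a] / freq_recent[a]) for a in AMINO_ACIDS}


# Toy sequences invented for this example; they are not real protein domains.
toy_ancient = ["MCCHGA", "CMGAVC"]
toy_recent = ["LKELKR", "QKLEEL"]

scores = log2_enrichment(toy_ancient, toy_recent)
for amino_acid in sorted(scores, key=scores.get, reverse=True)[:3]:
    print(amino_acid, round(scores[amino_acid], 2))
```

In this toy data cysteine comes out enriched in the "ancient" set and lysine depleted. A real analysis would need far larger sequence families and a statistical model that accounts for uncertainty in how old each sequence is.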

The Key Findings: A Rewritten Timeline

The study overturned several long-held assumptions about which amino acids are early and which are late. Three findings stand out as particularly significant.

Smaller Is Older – But Not in the Way Previously Thought

The team found that smaller amino acids were added to the code earlier. Metal-binding amino acids, including cysteine and histidine, and sulfur-containing ones, including cysteine and methionine, were added to the genetic code much earlier than previously thought. This is a direct contradiction of what the Miller-Urey-era model predicted. Sulfur-containing amino acids were assumed to be late additions because the Miller-Urey experiment failed to produce them – but as the Arizona team demonstrates, that failure was an artifact of leaving sulfur out of the experiment, not evidence that ancient life lacked sulfur.

Cysteine and methionine are the only sulfur-containing amino acids in the contemporary genetic code, and modern prokaryotes living in hydrogen sulfide-rich environments use them more heavily than species in other environments. The enrichment of these amino acids in ancient LUCA sequences likely reflects an early life environment that was rich in hydrogen sulfide gas.

Early Life Liked Rings

The pre-LUCA protein sequences turned out to contain more amino acids with aromatic ring structures (ring-shaped molecular structures that confer particular chemical properties) – like tryptophan and tyrosine – despite these amino acids being considered late additions to the current genetic code. “This gives hints about other genetic codes that came before ours, and which have since disappeared in the abyss of geologic time,” said Masel.

This is one of the study’s most striking conclusions. Even more ancient protein sequences – those that had already diversified into multiple distinct copies prior to LUCA – have significantly higher frequencies of aromatic amino acids, including tryptophan, tyrosine, phenylalanine, and histidine. If at least some of these sequences predate the current genetic code, their distinct enrichment patterns provide hints about earlier, alternative genetic codes. Masel summarized the discovery with characteristic directness: “Early life seems to have liked rings.”

The Tryptophan Paradox

Tryptophan holds a unique place in biology. It is the largest of all 20 amino acids. Its side chain is an indole group – an aromatic structure of two fused rings – and the biosynthetic pathway to produce it is the most complex and energetically expensive among all amino acids. Tryptophan is widely regarded as the last of the 20 amino acids to have evolved. Its large size and complex chemical structure make its late appearance, in the current model, seem perfectly logical.

Yet the Arizona team found tryptophan showing up prominently in the most ancient protein sequences they analyzed – those predating LUCA itself. Sequences that had already diversified into multiple distinct copies in LUCA would be expected to be enriched in early amino acids and depleted in late ones. Surprisingly, these more ancient sequences showed a different pattern – significantly less depleted for tryptophan and tyrosine, and enriched rather than depleted for phenylalanine. This is compatible with at least some of these sequences predating the current genetic code, and their distinct enrichment patterns provide hints about earlier, alternative genetic codes.

The implication is that tryptophan may not have been a late addition to all genetic codes – only to the particular one that eventually won out and became universal. Before that, older codes that used tryptophan freely may have competed, evolved, and ultimately gone extinct.

A Code That Replaced Other Codes

This brings the research to its most conceptually striking claim: the genetic code every living thing uses today was probably not the first. It likely came after other codes that have since disappeared, and the researchers argue that the current understanding of how the genetic code evolved needs to be revised.

The modern genetic code was likely assembled in stages, hypothesized to begin with early amino acids possibly delivered by extraterrestrial sources and ending with late amino acids requiring biological synthesis. But the pre-LUCA sequences suggest there was an intermediate chapter in this story – older codes, now extinct, that used a different amino acid vocabulary. Some of these earlier codes may have preferentially used aromatic amino acids, including tryptophan, which the modern code later relegated to “late addition” status because they were costly and complex to synthesize.

ScienceDaily’s reporting on the Bristol LUCA study provides useful context here: LUCA represents the root of the tree of life before it splits into bacteria, archaea, and eukarya. The evidence that all modern life descends from LUCA comes from several shared features, including the same amino acids used to build proteins in all cellular organisms, the shared energy currency ATP, and cellular machinery like the ribosome that translates genetic information into proteins. If the Arizona team is right, LUCA itself inherited some of its molecular toolkit from predecessor organisms that ran on a different code entirely.

What This Means for Astrobiology – and the Search for Life

The implications extend beyond Earth. Co-author Dante Lauretta, Regents Professor of Planetary Science and Cosmochemistry at the University of Arizona’s Lunar and Planetary Laboratory, points out that early life’s sulfur-rich nature offers insights for astrobiology, particularly in understanding the potential habitability and biosignatures of extraterrestrial environments.

Lauretta noted that “on worlds like Mars, Enceladus, and Europa, where sulfur compounds are prevalent, this could inform our search for life by highlighting analogous biogeochemical cycles or microbial metabolisms.”

The mention of Enceladus – Saturn’s icy moon – is particularly relevant. NASA’s Jet Propulsion Laboratory reported that data from the Cassini spacecraft revealed organic compounds freshly ejected from Enceladus’ subsurface ocean. Together with confirmed aromatic, nitrogen- and oxygen-bearing compounds, these compounds can form the building blocks to support chemical reactions and processes that could have led to more complex organic chemistry – the kind of chemistry that interests astrobiologists. The high pH of Enceladus’ ocean is interpreted as a consequence of serpentinization of chondritic rock, leading to the generation of hydrogen gas – a geochemical source of energy that could support both abiotic and biological synthesis of organic molecules such as those detected in Enceladus’ plumes.

Research published in Astrobiology has taken this further. The active plume at Enceladus’ south pole makes indirect sampling of its global ocean possible. The partially resolved chemistry of the plume points to conditions seemingly compatible with life, and hydrothermal activity is estimated to be sufficient to sustain both abiotic and biotic amino acid synthesis.

If ancient genetic codes on Earth could use aromatic amino acids like tryptophan – and if those compounds can form abiotically at the water-rock interface of a moon like Enceladus – then the Arizona team’s findings have quietly widened the window of where in the solar system chemistry might shade into biology.

What This Means for You

The University of Arizona research, published in PNAS in December 2024, does not overturn everything known about the origin of life. What it does is reveal a significant methodological flaw in a framework that has guided origin-of-life science for decades – and replace it with direct evolutionary evidence drawn from the most ancient protein structures on Earth.

The core lessons are worth sitting with. What happens in a laboratory flask does not necessarily mirror what happened inside the first cells. The Miller-Urey experiment’s failure to produce sulfur-containing amino acids reflected the design of the experiment, not the chemistry of early life. Sulfur-rich amino acids appear to have been central to the earliest genetic code, not peripheral to it.

Beyond that, the genetic code we all share today appears not to have been the first genetic code. The evidence suggests it competed with, and eventually replaced, older codes that ran on a different amino acid vocabulary – one that may have favored aromatic molecules like tryptophan far earlier than the current model allows. That history is still readable in the deepest, most conserved protein domains in every living organism on Earth today. The instructions life wrote four billion years ago did not disappear. They just got buried under layers of time – and researchers are only now learning how to read them again.

For those who follow science closely, the practical takeaway is this: the origin-of-life field is not settled, and that is a good thing. The sulfur chemistry and aromatic compounds found at hydrothermal vents – whether on Earth’s ocean floor or under the ice of Enceladus – deserve more attention than they have historically received. The universe did not necessarily follow the same recipe our textbooks described. That, in itself, changes where scientists will look next, and what they will be hoping to find.

AI Disclaimer: This article was created with the assistance of AI tools and reviewed by a human editor.
