Non-canonical bases were delivered to Earth by meteorites and formed in prebiotically plausible reactions.
Meteorites may have been an important source of starting materials for the emergence of life. Callahan et al. (2011) analyzed 12 meteorites for complex organics and found both canonical bases and the non-canonical bases purine, 2,6-diaminopurine, and 6,8-diaminopurine. They also performed prebiotically relevant reactions of ammonium cyanide and observed formation of the same canonical and non-canonical bases. The availability of non-canonical bases through meteorite delivery and prebiotic chemistry may have led to their incorporation into early RNA.
All terrestrial organisms depend on nucleic acids (RNA and DNA), which use pyrimidine and purine nucleobases to encode genetic information. Carbon-rich meteorites may have been important sources of organic compounds required for the emergence of life on the early Earth; however, the origin and formation of nucleobases in meteorites has been debated for over 50 y. So far, the few nucleobases reported in meteorites are biologically common and lacked the structural diversity typical of other indigenous meteoritic organics. Here, we investigated the abundance and distribution of nucleobases and nucleobase analogs in formic acid extracts of 12 different meteorites by liquid chromatography-mass spectrometry. The Murchison and Lonewolf Nunataks 94102 meteorites contained a diverse suite of nucleobases, which included three unusual and terrestrially rare nucleobase analogs: purine, 2,6-diaminopurine, and 6,8-diaminopurine. In a parallel experiment, we found an identical suite of nucleobases and nucleobase analogs generated in reactions of ammonium cyanide. Additionally, these nucleobase analogs were not detected above our parts-per-billion detection limits in any of the procedural blanks, control samples, a terrestrial soil sample, and an Antarctic ice sample. Our results demonstrate that the purines detected in meteorites are consistent with products of ammonium cyanide chemistry, which provides a plausible mechanism for their synthesis in the asteroid parent bodies, and strongly supports an extraterrestrial origin. The discovery of new nucleobase analogs in meteorites also expands the prebiotic molecular inventory available for constructing the first genetic molecules.
The non-canonical base 6-aminouracil is reactive to form the necessary glycosidic bond while uracil is not.
Forming the glycosidic bond between the sugar and base of an RNA monomer has proven difficult under prebiotic conditions with the canonical RNA bases. Because uracil is unreactive toward ribose, Kim and Benner (2015) tested various modified uracil bases for formation of the glycosidic bond. They found that 6-aminouracil gave glycosylated products in greater than 50% yield. Moreover, 6-aminouracil can form the same pattern of hydrogen bonds for Watson-Crick base pairing as uracil, with an additional ability to form a base pair on its opposite face. The opposite face can pair with a second non-canonical base. The better ability of 6-aminouracil to link with ribose may have made it more abundant than uracil among early RNA monomers.
Kim HJ,Benner SA
Following the suggestion that nucleoside analogues having their nucleobases joined to ribose via a carbon-carbon bond might easily arise prebiotically, the glycosylation of uracil carrying electron-donating substituents (Me, OH, OCH3, NH2) at its 5 or 6 positions was investigated. Of these, only 6-aminouracil gave glycosylated products in greater than 50% yield under simulated prebiotic conditions. The reaction provided four products, three of which were purified by preparative HPLC. The structure of the isolated compounds was determined by high-resolution mass spectrometry and NMR spectroscopy. The glycosylation products were, as expected, C-nucleosides, with the sugar having either a pyranose or a furanose structure, with the ratio depending on the precise conditions, implying reversible addition. Interestingly, the 6-aminouracil riboside displays two hydrogen bonding patterns, the "acceptor-donor-acceptor" pattern of uridine itself and (upon 180° rotation) the "acceptor-donor-donor" hydrogen bonding pattern. The second, in an artificially expanded genetic information system, is trivially called "V" and pairs with a purine analogue that presents the complementary "donor-acceptor-acceptor" hydrogen bonding pattern, trivially called "J."
The non-canonical base urazole is reactive to form the glycosidic bond and preserves the same base pairing face as uracil.
Kolb et al. (1994) put forward a route to the challenging glycosidic bond that uses a prebiotically plausible non-canonical base, urazole. Urazole mimics the Watson-Crick base pairing pattern of uracil but contains a five-membered ring. When urazole was mixed and heated with ribose in aqueous solution, the β-furanoside (the standard sugar form in RNA) linked to urazole by a glycosidic bond was formed at 9% at equilibrium. Uracil was completely unreactive under the same conditions, showing within this study that a non-canonical base can be better able to form the necessary glycosidic bond while also mimicking the base pairing ability of the canonical base.
Kolb VM,Dworkin JP,Miller SL
Urazole is a five-membered heterocyclic compound which is isosteric with uracil's hydrogen-bonding segment. Urazole reacts spontaneously with ribose (and other aldoses) to give a mixture of four ribosides: alpha and beta pyranosides and furanosides. This reaction occurs in aqueous solution at mild temperatures. Thermodynamic and kinetic parameters for the reaction of urazole with ribose were determined. In contrast, uracil is completely unreactive with ribose under these conditions. Urazole's unusual reactivity is ascribed to the hydrazine portion of the molecule. Urazole can be synthesized from biuret and hydrazine under prebiotic conditions. The prebiotic synthesis of guanazole, which is isosteric in part to diaminopyrimidine and cytosine, is accomplished from dicyandiamide and hydrazine. Kinetic parameters for both prebiotic reactions were measured. Urazole and guanazole are transparent in the UV, which would be a favorable property in the absence of an ozone layer on the early Earth. Urazole makes hydrogen bonds with adenine in DMSO similar to those of uracil, as established by H NMR. All of these properties make urazole an attractive potential precursor to uracil and guanazole a potential precursor to cytosine in the RNA or pre-RNA world.
In a unifying prebiotic chemical synthesis of both purine and pyrimidine nucleosides, only the canonical nucleosides accumulate as products.
It is thought that the prebiotic synthesis of nucleosides from a single “pot” of simple molecules was the pathway to the first RNA. One challenge, however, has been producing purine and pyrimidine nucleosides concurrently. Becker et al. (2019) demonstrated a pyrimidine nucleoside-forming reaction that is compatible with purine synthesis. Although multiple products and byproducts are common in organic synthesis, the final products of these reactions included only canonical bases, without non-canonical bases of similar structure. If this prebiotic chemical route fed into early RNA formation, then the major starting material available for incorporation would have been canonical nucleosides.
Theories about the origin of life require chemical pathways that allow formation of life's key building blocks under prebiotically plausible conditions. Complex molecules like RNA must have originated from small molecules whose reactivity was guided by physico-chemical processes. RNA is constructed from purine and pyrimidine nucleosides, both of which are required for accurate information transfer, and thus Darwinian evolution. Separate pathways to purines and pyrimidines have been reported, but their concurrent syntheses remain a challenge. We report the synthesis of the pyrimidine nucleosides from small molecules and ribose, driven solely by wet-dry cycles. In the presence of phosphate-containing minerals, 5'-mono- and diphosphates also form selectively in one-pot reactions. The pathway is compatible with purine synthesis, allowing the concurrent formation of all Watson-Crick bases.
Non-canonical bases self-assemble by base pairing into polymerization-ready stacks.
Two problems in deriving RNA under prebiotic conditions are the formation of the glycosidic bond between the base and the ribose sugar and the lack of Watson-Crick-type base pairing between free bases in water. Chen et al. (2014) addressed both issues using the prebiotically plausible base 2,4,6-triaminopyrimidine (TAP). They found that TAP conjugates to ribose with a standard glycosidic bond. They further showed that the resulting nucleoside forms base pairs in water with another non-canonical base, cyanuric acid, albeit through an arrangement of six alternating coplanar bases (3 TAP and 3 cyanuric acid) rather than the standard two. These hexameric assemblies stack in solution, so covalent links could potentially connect the stacked units in downstream chemistry and facilitate the polymerization step necessary to form RNA.
The RNA World hypothesis is central to many current theories regarding the origin and early evolution of life. However, the formation of RNA by plausible prebiotic reactions remains problematic. Formidable challenges include glycosidic bond formation between ribose and the canonical nucleobases, as well as the inability of nucleosides to mutually select their pairing partners from a complex mixture of other molecules prior to polymerization. Here we report a one-pot model prebiotic reaction between a pyrimidine nucleobase (2,4,6-triaminopyrimidine, TAP) and ribose, which produces TAP-ribose conjugates in high yield (60-90%). When cyanuric acid (CA), a plausible ancestral nucleobase, is mixed with a crude TAP+ribose reaction mixture, micrometer-length supramolecular, noncovalent assemblies are formed. A major product of the TAP+ribose reaction is a β-ribofuranoside of TAP, which we term TARC. This nucleoside is also shown to efficiently form supramolecular assemblies in water by pairing and stacking with CA. These results provide a proof-of-concept system demonstrating that several challenges associated with the prebiotic emergence of RNA, or pre-RNA polymers, may not be as problematic as widely believed.
Non-canonical bases synthetically added to nucleic acids cause structural, functional, or polymerization problems.
In work that goes beyond theoretical speculation about how alternate nucleobases might behave in a biopolymer, Hirao and Kimoto (2012) synthesized a range of possible bases and tested their function and properties when base-paired and inserted into DNA. They highlighted that proper base pairing involves a balance of complementary shapes, hydrogen bonding, polarity, electrical repulsion, and hydrophobicity, and that modifying any of these affects the ability of a potential base pair to participate in a functional double helix. They summarized their findings in a table of the most compatible base pairs. However, every synthetic base pair has weaknesses in some aspect of nucleic acid chemistry. The most frequent weaknesses are problems with replication or polymerization, although synthetic P-Z base pairs are replicated at 99.8% efficiency. Further studies by Molt et al. (2017) showed that, despite this replication efficiency, P-Z base pairs alter the structure of DNA. It is possible that having more than two types of base pairs in a nucleic acid makes its structure and properties irregular.
Hirao I,Kimoto M
Toward the expansion of the genetic alphabet of DNA, several artificial third base pairs (unnatural base pairs) have been created. Synthetic DNAs containing the unnatural base pairs can be amplified faithfully by PCR, along with the natural A-T and G-C pairs, and transcribed into RNA. The unnatural base pair systems now have high potential to open the door to next generation biotechnology. The creation of unnatural base pairs is a consequence of repeating "proof of concept" experiments. In the process, initially designed base pairs were modified to address their weak points. Some of them were artificially evolved to ones with higher efficiency and selectivity in polymerase reactions, while others were eliminated from the analysis. Here, we describe the process of unnatural base pair development, as well as the tests of their applications.
Non-canonical bases are ubiquitous and functional in modern RNA.
Modern biology has evolved to use non-canonical RNA bases as part of the more than 100 post-transcriptional modifications that have been discovered on RNA. The most common non-canonical RNA base is pseudouridine (Ψ), present at nucleotide frequencies of ~4% in yeast tRNAs. Hamma & Ferré-D'Amaré (2006) expanded upon these observations with a study of Ψ synthases, the enzymes responsible for converting uridine to Ψ. They showed that Ψ synthases are present in all domains of life and share a common core fold. It follows that the core of Ψ synthase functionality existed before the last universal common ancestor (LUCA) and that Ψ has been part of mature RNAs since at least that time. This long-standing reliance on Ψ could indicate that Ψ was incorporated in early life’s RNAs and preserved to modern times through Ψ synthases rather than through the DNA code.
Hamma T,Ferré-D'Amaré AR
Pseudouridine synthases are the enzymes responsible for the most abundant posttranscriptional modification of cellular RNAs. These enzymes catalyze the site-specific isomerization of uridine residues that are already part of an RNA chain, and appear to employ both sequence and structural information to achieve site specificity. Crystallographic analyses have demonstrated that all pseudouridine synthases share a common core fold and active site structure and that this core is modified by peripheral domains, accessory proteins, and guide RNAs to give rise to remarkable substrate versatility.
The genetic code, necessary for all biological protein translation, only contains canonical bases.
The genetic code refers to the ordered combination of nucleic acid bases (letters) into three-letter codons (words), each of which is translated into one amino acid. The genetic code ensures that a gene containing a specific sequence of nucleotides is faithfully translated into a specific functional chain of amino acids. Much work has been done on the origin of the genetic code, i.e., why a certain codon corresponds to a certain amino acid. Crick proposed that the genetic code was a frozen accident: a codon-amino acid match was made by some “accidental” circumstance and then frozen because changing the code later would disrupt existing proteins. Regardless of the extent of circumstance, the general system has been frozen in a form that uses codons of three nucleotides composed of only the four canonical bases. If other bases had been present when the genetic code evolved, they too could have entered the genetic code, yet bases other than A, U, G, and C are absent from codons in RNA.
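The combinatorics behind this point can be sketched in a few lines of Python. The sketch counts the triplet codons available to the four-letter canonical alphabet and, purely hypothetically, to a six-letter alphabet extended with two placeholder letters. The letters X and Y and the function name `count_codons` are assumptions made for illustration and do not come from any cited study.

```python
from itertools import product

CANONICAL_BASES = "AUGC"    # the four canonical RNA bases
HYPOTHETICAL_EXTRA = "XY"   # placeholder letters, assumed for illustration only


def count_codons(alphabet: str, codon_length: int = 3) -> int:
    """Count the distinct codons of the given length over the alphabet."""
    return sum(1 for _ in product(alphabet, repeat=codon_length))


canonical = count_codons(CANONICAL_BASES)
expanded = count_codons(CANONICAL_BASES + HYPOTHETICAL_EXTRA)
print(canonical, expanded)
```

A four-letter alphabet gives 4^3 = 64 triplets, while the hypothetical six-letter alphabet would give 6^3 = 216, so additional bases present while the code was forming would have opened up many further codons.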
Modern RNAs largely do not contain records inherited from the first RNAs, so arguments for other bases are only speculative.
To investigate the lineage of RNAs from early RNA to the present day, Hoeppner et al. (2012) analyzed over 3 million modern RNA annotations for conserved RNA families appearing in all three domains of life. Their comparisons show that 99% of RNA families are restricted to a single domain. Of the few families found in more than one domain, over half bear signatures of horizontal gene transfer rather than vertical heredity from LUCA or earlier. From their finding that RNA families across domains are dissimilar, they concluded that we can reconstruct little of the RNA that was in LUCA, and therefore do not have a strong line back to the earliest RNA. Unless new organisms are found whose RNAs are transcribed using an expanded set of bases, the preserved record of RNA sequences does not extend to the earliest RNAs, and theories of alternate bases must remain purely speculative.
Hoeppner MP,Gardner PP,Poole AM
The RNA world hypothesis, that RNA genomes and catalysts preceded DNA genomes and genetically-encoded protein catalysts, has been central to models for the early evolution of life on Earth. A key part of such models is continuity between the earliest stages in the evolution of life and the RNA repertoires of extant lineages. Some assessments seem consistent with a diverse RNA world, yet direct continuity between modern RNAs and an RNA world has not been demonstrated for the majority of RNA families, and, anecdotally, many RNA functions appear restricted in their distribution. Despite much discussion of the possible antiquity of RNA families, no systematic analyses of RNA family distribution have been performed. To chart the broad evolutionary history of known RNA families, we performed comparative genomic analysis of over 3 million RNA annotations spanning 1446 families from the Rfam 10 database. We report that 99% of known RNA families are restricted to a single domain of life, revealing discrete repertoires for each domain. For the 1% of RNA families/clans present in more than one domain, over half show evidence of horizontal gene transfer (HGT), and the rest show a vertical trace, indicating the presence of a complex protein synthesis machinery in the Last Universal Common Ancestor (LUCA) and consistent with the evolutionary history of the most ancient protein-coding genes. However, with limited interdomain transfer and few RNA families exhibiting demonstrable antiquity as predicted under RNA world continuity, our results indicate that the majority of modern cellular RNA repertoires have primarily evolved in a domain-specific manner.
Non-canonical bases mutate to canonical bases after multiple rounds of replication.
Among tested non-canonical base pairs, P-Z base pairs perform most similarly to canonical base pairs when incorporated into nucleic acids. Using polymerases that have been modified to accept DNA with P-Z base pairs, DNA containing the extra bases can be replicated and, in theory, passed from generation to generation. Reichenbach et al. (2016) investigated the mechanism by which P-Z base pairs mutate. They found that Z mispairs with G in a pH-dependent manner, forming a triple-hydrogen-bonded pair above pH 7.8, and that subsequent duplications convert the P-Z base pair to a G-C base pair. This research suggests that mutations may quickly “weed out” non-canonical bases from nucleic acids.
Reichenbach LF,Sobri AA,Zaccai NR,Agnew C,Burton N,Eperon LP,de Ornellas S,Eperon IC,Brady RL,Burley GA
Relative to naturally occurring Watson-Crick base pairs, the synthetic nucleotide P pairs with Z within DNA duplexes through a unique hydrogen-bond arrangement. The loss of this synthetic genetic information by PCR results in the conversion of P-Z into a G-C base pair. Here, we show structural and spectroscopic evidence that the loss of this synthetic genetic information occurs via G-Z mispairing. Remarkably, the G-Z mispair is both plastic and pH dependent; it forms a double-hydrogen-bonded “slipped” pair at pH 7.8 and a triple-hydrogen-bonded Z-G pair when the pH is above 7.8. This study highlights the need for robust structural and functional methods to elucidate the mechanisms of mutation in the development of next-generation synthetic genetic base pairs.
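As a rough illustration of how such losses compound, the sketch below assumes that each replication round independently retains a non-canonical pair with a fixed probability. The value 0.998 is borrowed from the 99.8% replication figure quoted earlier for P-Z pairs. Treating that figure as a per-round retention probability is an assumption made here for illustration, not a result reported by Reichenbach et al., and the function name `retained_fraction` is likewise hypothetical.

```python
def retained_fraction(per_round_fidelity: float, rounds: int) -> float:
    """Fraction of sites still holding the non-canonical pair after `rounds`
    replication rounds, assuming independent loss in every round."""
    if not 0.0 <= per_round_fidelity <= 1.0:
        raise ValueError("per_round_fidelity must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return per_round_fidelity ** rounds


# Assumed per-round retention of 0.998 (illustrative interpretation only).
for n in (10, 100, 1000):
    print(n, round(retained_fraction(0.998, n), 3))
```

Under this simple model the retained fraction decays geometrically with the number of rounds, so even a high per-round fidelity erodes the non-canonical content over many generations.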
RNA is critical in all extant life, and its evolution occurred before the first cellular organisms. It has been proposed that early RNA may have incorporated a wide selection of nucleobases that helped it fulfill enzymatic or other essential functions. These additional, non-canonical nucleobases may have resembled the base modifications found in RNA today, which include appendages similar to the side chains of amino acids. While these bases are post-transcriptional modifications in extant biology, their utility and inclusion may be a vestige of earlier RNA. On the other hand, the canonical nucleobases in RNA are prominent in the combined synthesis of purines and pyrimidines, although synthesis of non-canonical nucleobases has also been demonstrated. The ancient nature of the genetic code, which was sophisticated even before LUCA and operates only with the modern RNA bases, is perhaps a testament to the integration of only canonical bases from life’s origins.