3.2: Natural and un-natural groups - Biology

It is worth reiterating that while a species can be seen as a natural group, the higher levels of classification may or may not reflect biologically significant information. We can be sure that we are reading the same book, and studying the same organism!

Genera and other higher-level classifications are generally based on a decision to consider one or more traits as more important than others. The assignment of a particular value to a trait can seem arbitrary. Let us consider, for example, the genus Canis, which includes wolves and coyotes, and the genus Vulpes, which includes foxes. The distinction between these two groups is based on smaller size and flatter skulls in Vulpes compared to Canis. Now let us examine the genus Felis, which includes the common house cat, and the genus Panthera, which includes tigers, lions, jaguars and leopards. These two genera are distinguished by cranial features and whether (Panthera) or not (Felis) they have the ability to roar. So what do we make of these distinctions? Are they really sufficient to justify distinct groups, or should Canis and Vulpes (and Felis and Panthera) be merged together? Are the differences between these groups biologically meaningful? The answer is that often the basis for higher order classifications is not biologically meaningful. This common lack of biological significance is underscored by the fact that the higher order classification of an organism can change: a genus can become a family (and vice versa) or a species can be moved from one genus to another. Consider the types of organisms commonly known as bears. There are a number of different types of bear-like organisms, a fact that Linnaeus’s classification scheme acknowledged. Looking at all bear-like organisms we recognize eight types.59 We currently consider four of these, the brown bear (Ursus arctos), the Asiatic black bear (Ursus thibetanus), the American black bear (Ursus americanus), and the polar bear (Ursus maritimus) to be significantly more similar to one another, based on the presence of various traits, than they are to other types of bears. We have therefore placed them in their own genus, Ursus.
We have placed each of the other types of bear-like organisms, the spectacled bear (Tremarctos ornatus), the sloth bear (Melursus ursinus), the sun bear (Helarctos malayanus), and the giant panda (Ailuropoda melanoleuca), in its own separate genus, because scientists consider these species more different from one another than are the members of the genus Ursus. The problem here is: how big do these differences have to be to warrant a new genus?

So where does that leave us? Here the theory of evolution and the cell (continuity of life) theory come together. We work on the assumption that the more closely related (evolutionarily) two species are, the more traits they will share, and that the development of new, biologically significant traits is what distinguishes one group from another. Traits that underlie a rational classification scheme are known as synapomorphies (a technical term); basically these are traits that appeared at one or another branch point of a family tree and serve to define that branch point, such that organisms on one branch are part of a “natural” group, distinct from those on the other branch (lineage). In just the same way that the distortion of space-time provided a reason for why there is a law of gravity, so the ancestral relationships between organisms provide a reason for why organisms can be arranged into a Linnaean hierarchy.

So the remaining question is, how do we determine ancestry when the ancestors lived thousands, millions, or billions of years in the past? Since we cannot travel back in time, we have to deduce relationships from comparative studies of living and fossilized organisms. Here the biologist Willi Hennig played a key role.60 He established rules for using shared, empirically measurable traits to reconstruct ancestral relationships, such that each group should have a single common ancestor. As we will discover later on, gene (DNA) sequence and genomic organization data are among the traits now commonly used in modern studies, although even here there are plenty of situations where ambiguities remain, due to the very long times that separate ancestors and present-day organisms.
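The idea of defining a natural group by shared derived traits can be sketched in a few lines of Python. This is a toy illustration, not Hennig's actual procedure: the organisms, the trait lists, and the function name `synapomorphies` are all assumptions chosen for the example, and the test used here (a trait present in every member of a group and in no organism outside it) ignores trait loss and convergent evolution, which real analyses must handle.

```python
from typing import Dict, Iterable, Set

# Toy character matrix (illustrative only, not taken from the text):
# each organism maps to the set of derived traits it displays.
TRAITS: Dict[str, Set[str]] = {
    "lamprey": set(),
    "trout": {"jaws"},
    "lizard": {"jaws", "four limbs", "amnion"},
    "mouse": {"jaws", "four limbs", "amnion", "hair"},
}


def synapomorphies(group: Iterable[str],
                   traits: Dict[str, Set[str]] = TRAITS) -> Set[str]:
    """Return traits found in every member of `group` and in no other organism.

    A non-empty result is (in this simplified model) evidence that the
    group could be a natural group defined by those traits.
    """
    members = set(group)
    if not members:
        return set()
    # Traits shared by all members of the candidate group.
    shared = set.intersection(*(traits[m] for m in members))
    # Traits that appear anywhere outside the candidate group.
    outside = set().union(*(traits[o] for o in traits if o not in members))
    return shared - outside


if __name__ == "__main__":
    for candidate in ({"lizard", "mouse"}, {"trout", "lizard"},
                      {"trout", "lizard", "mouse"}):
        print(sorted(candidate), "->", sorted(synapomorphies(candidate)))
```

In this toy data set, lizard plus mouse is supported by two traits found nowhere else, whereas trout plus lizard has no defining trait of its own, because the only trait they share (jaws) is also found in the mouse.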

A Phage Display System with Unnatural Amino Acids

1. Introduction

Proteins are essential components of synthetic cellular networks. Synthesis of cellular pathways with the ultimate goal of creating functional artificial cells can benefit greatly from the generation of proteins with desirable properties that can be effectively regulated. Proteins capable of functioning independently of the host organism’s endogenous circuitry require further developments in “parts” design.

The recent advances in reassigning the genetic code to unnatural amino acids (UAAs) [12,14–16] expand the molecular toolbox of protein “parts”. In fact, introduction of UAAs can impart new functions that are difficult or impossible to create with proteins comprised of the natural 20 amino acid building blocks, such as photo-induced switching [17], IR probe active [18] and redox sensitive [19] proteins. Moreover, proteins bearing UAAs may result in improved properties such as hyperstability [20,21] and protease resistance [22]. Such features may be useful for tailoring orthogonal proteins in the context of synthetic biological networks. Here we discuss approaches that focus on protein engineering and that can potentially expand synthetic biology methods by UAA incorporation for the creation of artificial proteins.

Biology Skills You’ll Learn

  • Explore many of the broad concepts that underpin our understanding of the natural world.
  • Gain experience implementing the scientific method, including hypothesis testing, experimental design, accessing and understanding the scientific literature and analyzing data.
  • Acquire the knowledge and practical skills needed to be highly competitive for exceptional career opportunities following graduation or continue on to a graduate degree.

Templated synthesis of non-natural nucleic acids

The phosphoramidite approach to solid-phase DNA synthesis [38, 39] developed in the early 1980s led to the mainstream use of synthetic DNA in a myriad of applications. Similarly, improvements in the synthesis of XNAs can be expected to increase their usage and unlock new applications in the coming decades. XNA synthesis can be broadly divided into enzymatic and non-enzymatic approaches. Some widely used XNAs (e.g., 2′OMe, LNA, PS, 2′MOE) can be chemically synthesized, although yields of even the most synthetically accessible XNAs are limited to around 150 bp [40]. For other XNA chemistries, no reliable solid-phase synthetic route is known [41]. Furthermore, solid-phase synthesis depends on explicit knowledge of the sequence being synthesized. In contrast, templated synthesis enables general information transfer and is essential for the evolution of functional oligonucleotides.

Non-enzymatic templated synthesis

Non-enzymatic templated synthesis traditionally uses activated nucleotides or short oligonucleotide building blocks that self-organize on a template via base-pairing interactions and react to polymerize. Such template copying, pioneered in the 1970s [42,43,44], has been extended to several XNA chemistries [45,46,47,48,49,50,51,52,53], often in the context of prebiotic nucleic acid replication. For example, systems for templated replication of PNA pentamers [54, 55] using reductive amination are efficient enough to permit model selection experiments [56]. Similar strategies have also been exploited to assemble nucleic acids via copper-catalyzed click ligation of oligonucleotides [57, 58] and by phosphoramidate ligation [59], and the resulting backbones have shown some biocompatibility [60]. In an interesting extension of this work, nucleic acid templates have also been used to organize chemical entities by proximity for programmed reactions [61,62,63]. In particular, the Liu group has extended such templated synthesis approaches to chemically ligate diverse non-nucleic acid entities in precise sequences defined by a DNA template [64]. Such templated chemical syntheses allow increasingly divergent chemistries to be synthesized and replicated without constraining polymer diversity by the ability of its constituent monomers to serve as substrates for polymerases or other nucleic acid-modifying enzymes.

Enzymatic ligation and modification of XNAs

Several DNA ligases have been shown to accept non-natural substrates and have been used alone or in conjunction with XNA polymerases to synthesize XNA oligonucleotides. Much like the aforementioned chemical ligation strategies, enzymatic ligation of pre-organized oligomers on a template allows for positioning of modified nucleotides, including multiple different modifications, at defined loci. It has recently been shown that several commercially available ligases can catalyze ligation of XNA substrates including 2′OMe, HNA, LNA, TNA, and FANA [41, 65]. Moreover, rational design and molecular modeling approaches have now resulted in the first XNA-templated XNA ligase [66]. The Liu and Hili groups have also worked extensively with ligase-mediated synthesis of modified DNA from diversely functionalized DNA 3- or 5-mers [67, 68], allowing a great variety of modifications, including hydrophobic, aliphatic, aromatic, acid, and basic moieties, to be incorporated using these short “quasicodons.” In agreement with results obtained with SOMAmer (Slow Off-rate Modified Aptamer) selections [22], they find that functionalization with nonpolar moieties appears to promote faster selection convergence and stronger aptamer binding [21]. Incorporating chemical diversity beyond that found in proteins (e.g., halogenated residues), the group isolated high affinity aptamers against PCSK9 and interleukin-6 [69]. The above strategies encapsulate the paradigm of leveraging the unique properties of nucleic acids (i.e., encoded synthesis, evolvability) coupled with an expanded set of chemical substituents to create functional molecules with therapeutic potential.

Polymerase synthesis of XNAs

Enzymatic polymerization of nucleic acids is orders of magnitude faster and more accurate than chemical copying or synthesis and is now being pursued for next-generation DNA oligonucleotide synthesis [70, 71], with a view to superseding phosphoramidite technology. Similarly, enzymatic XNA synthesis may facilitate production of XNA oligonucleotides. XNAs with chemically conservative modifications are accepted as substrates by natural polymerases, while engineering of polymerases to accept a broader range of XNA substrates has also met with success [72, 73]. Indeed, some polymerases have proven highly adaptable to new substrates. Thermophilic B-family polymerases, especially those from the Pyrococcus and Thermococcus genera, have proven especially amenable to engineering for XNA substrates [33, 74,75,76,77,78,79,80,81]. This functional plasticity may be aided by high thermostability, which promotes greater tolerance for mutations that would otherwise be excessively destabilizing. However, A-family [82,83,84,85,86,87,88,89,90] and mesophilic polymerases such as Phi29 [91, 92] as well as RNA polymerases [93,94,95,96,97] have also been engineered to accept XNA substrates with success (Table 1). Nevertheless, the efficiency, fidelity, and kinetics of engineered polymerases on XNA substrates are usually compromised relative to the native enzymes and natural substrates. Often, “forcing” conditions (e.g., superstoichiometric polymerase concentrations or the presence of Mn2+, which in turn reduces fidelity) are needed to boost synthetic yields [106].

Capitalizing on the promiscuity of natural polymerases

It is perhaps remarkable that natural polymerases can incorporate modified nucleotides at all, given their need for stringent substrate specificity and accuracy. However, at some positions in the nucleotide, modifications are readily tolerated. For example, modifications at the C5 position of pyrimidines and C7 position of N7-deaza-purines project into the duplex’s major groove and do not interfere with Watson-Crick base pairing. Therefore, even bulky adducts at these positions are generally well tolerated by polymerases [22, 107,108,109,110,111,112]. Enzymatic synthesis of nucleotides with reactive groups (e.g., for copper-catalyzed click chemistry) at these positions can allow elaborate post-synthetic modifications [113,114,115,116,117].

The plasticity of polymerases has also allowed researchers to carry out wholesale replacement of natural bases with close chemical analogues [118, 119]. In some cases, synthesis is efficient enough to produce modified PCR products of up to 1.5 kb [119]. Moreover, entirely new unnatural base pairs (UBPs) have been developed based on novel hydrogen bonding patterns [105, 120], hydrophobicity [121, 122], and shape complementarity [123,124,125]. These UBPs can be replicated by engineered and natural polymerases, demonstrating successful expansion of the genetic code, and recently achieving information encoding in an unprecedented eight-letter genetic alphabet [120].
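To put the eight-letter alphabet in perspective, the short sketch below compares the theoretical information capacity of four-letter and eight-letter alphabets. It is a back-of-the-envelope calculation under the assumption that every letter can be used freely and independently at every position; the function names are illustrative and do not come from the cited work.

```python
import math


def bits_per_position(alphabet_size: int) -> float:
    """Maximum information (in bits) carried by one position of a sequence."""
    return math.log2(alphabet_size)


def distinct_sequences(alphabet_size: int, length: int) -> int:
    """Number of different sequences of a given length over the alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    for size in (4, 8):
        print(f"{size}-letter alphabet: "
              f"{bits_per_position(size):.0f} bits per position, "
              f"{distinct_sequences(size, 10)} sequences of length 10")
```

Under these assumptions, each position of an eight-letter sequence carries three bits rather than two, so the number of possible sequences of a given length grows as 8^L rather than 4^L.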

XNA synthesis in vivo

The synthesis and replication of XNAs in vivo represents an important frontier in XNA research towards aims such as stable encoding of unnatural amino acids and genetic orthogonality for “firewalling” synthetic biology. To this end, complete replacement of cytidine and/or thymidine with 5-substituted pyrimidine analogues in bacterial genomes has now been achieved by careful metabolic engineering and evolution [126,127,128,129].

Remarkably, Romesberg and colleagues have been able to engineer E. coli strains with the capacity to replicate and maintain their unnatural NaM–TPT3 base pair in both plasmid [130] and genomic [131] contexts. To achieve this, they engineered a nucleoside triphosphate transporter [132], invoked Cas9 to degrade sequences having lost the UBP, and made several tweaks to the E. coli DNA synthesis and repair machinery. They further demonstrated the compatibility of the UBP with translation machinery to site-specifically encode unnatural amino acids [133, 134]. Further semi-synthetic organisms with substituted genomes or the ability to propagate UBPs will likely follow these early successes [135, 136].

Directed evolution for XNA polymerase engineering

While base modifications and even completely new base pairs are tolerated by natural polymerases to varying extents, modifications to the sugar and backbone generally prove more challenging, especially at full substitution. To improve activity with such unnatural substrates, polymerase engineering approaches, ranging from rational engineering driven by structural insights or computational analysis to screening and directed evolution, are often employed. Among these, emulsion-based methods and phage display have been especially useful [137, 138].

A range of directed evolution strategies have been employed in the field. Compartmentalized self-replication (CSR) [139,140,141] challenges polymerases to PCR amplify their own gene inside an emulsion compartment. The exponential enrichment of highly active clones, as opposed to simple partitioning of active variants, makes CSR a powerful method. However, it places stringent demands on polymerase performance, which are somewhat relieved in short-patch CSR (spCSR) [85] where amplification is confined to only a short section of the polymerase gene. Nested CSR allows selection of sequences or features not present in the polymerase gene itself (e.g., novel base pairs, reverse transcription of XNA templates) by using out-nesting primers with these features during the CSR amplification step [79, 86]. A further extension called compartmentalized partnered replication (CPR) can be used to select for enzymatic activities other than nucleic acid polymerization by constraining the selectable PCR reaction by the fitness of a partner gene [86, 142]. Recently, CSR has also been adapted for isothermal amplification reactions (iCSR), both at high temperatures [143] and at lower temperatures for evolution of non-thermostable enzymes [91]. Compartmentalized self-tagging (CST) [144] allows selection for much more difficult substrates or conditions; selection is based on a primer extension templated by the encoding plasmid. The biotinylated primer along with any plasmids that have been “captured” by sufficient primer extension are partitioned on streptavidin beads. CST has yielded polymerases for a range of XNAs [78], most recently for dxNA [145]. In contrast to bulk emulsification methods, microfluidic devices have been used to generate highly homogeneous emulsion droplets, enhancing the ability to accurately distinguish small incremental improvements important in stepwise selections [146, 147].
Another directed evolution strategy is phage display, whereby polymerases displayed on phage particles are challenged to extend a primer with separable (e.g., biotinylated) nucleotides [80, 89, 148]. In a notable recent use of phage display for polymerase evolution, Chen et al. identified several Taq variants capable of synthesizing and reverse transcribing 2′OMe RNA [88], an XNA of particular interest for therapeutics due to its high biostability and nuclease resistance.

Rational engineering of XNA polymerases: structural insights

Polymerases make many key contacts with the incoming nucleotide triphosphate and primer strand during the catalytic cycle [149]. Structures of polymerases in complex with modified substrates may therefore aid in both rational engineering and retrospective understanding of XNA polymerases. Recently, Kropp et al. solved ternary structures of KlenTaq polymerase with a C5-modified cytidine at each of six successive positions [150] in the catalysis cycle, recapitulating the movement of the modified substrate through the polymerase’s active site and revealing a surprising level of flexibility in both the modified nucleotide and the interacting polymerase residues to accommodate the modification.

Further structural insight into XNA polymerization has been reported by Singh et al. [151] with structures of an engineered KlenTaq in a binary pre-incorporation and a ternary post-incorporation complex with the unnatural base pair dZ:dP. Although none of the mutated amino acids are in direct contact with primer, template, or incoming dZTP, the structure suggests that increased flexibility, particularly in the thumb subdomain, and an increased angle of closure in the finger subdomain facilitate primer elongation with the unnatural base pair.

Chim et al. recently reported ternary structures of a hyperthermophilic B-family polymerase, an engineered KOD mutant, polymerizing TNA [152]. Given the prevalent use of B-family hyperthermophiles in polymerase engineering, the structures of the apo, binary, and ternary polymerase complexes will prove useful in developing hypotheses for further work in the field. Subsequently published ternary structures of KOD and 9°N with DNA primer and template and an incoming dNTP [149, 153] suggest possible explanations for the predisposition of hyperthermophilic archaeal B-family polymerases to engineering: an unusually wide and positively charged channel between the finger and thumb subdomains may allow space for substrate modifications while maintaining strong binding to the template and high processivity. Additionally, whereas the primer template duplex adopts an A-form in the active site of A-family polymerases, it adopts a B-form in KOD, possibly contributing to accommodation of nucleotide modifications in the B-form duplex’s wider major groove.

Rational engineering of XNA polymerases: translation of mutations across substrates and polymerases

There are numerous examples of transferable mutations across XNA substrates and polymerases. A recent report [154] details engineering of a previously described Taq mutant to better incorporate 2′ modified XNAs. From kinetic data on incorporation of different 2′ modified nucleotides, relevant mutations from the literature were chosen based on the hypothesis that the rate-limiting step may be recognition of the modified primer strand. Several mutants showed enhanced synthesis of 2′F and 2′OMe and/or reverse transcription of 2′OMe RNA. In another instance, diversification at eight “specificity determining residues” selected by computational analysis, evolutionary conservation, and literature precedent yielded improved polymerases for RNA and TNA [75]. Furthermore, testing the top mutation sets in the context of different homologous B-family polymerases evidenced their general utility as well as revealing the structural context in which they functioned best. Similarly, polymerase synthesis of increasingly diverse XNAs was achieved with a TNA polymerase by combining the TNA sugar with base modifications known to be polymerase-compatible [31, 32]. While it has been generally believed that mutation of conserved residues is highly deleterious, it is becoming increasingly apparent through this and other work that such mutations can be key in allowing activity with unnatural substrates [75, 155].

Liu et al. recently reported engineering of a polymerase to synthesize a newly described XNA, tPhoNA, with both sugar and backbone modifications [33]. Impressively, the polymerase engineering strategy consisted solely of successive introduction and evaluation of mutations at positions known to affect synthesis for other XNA substrates. Their success is a testament to the accumulating knowledge base in the field of polymerase engineering and the extraordinary ability of a small number of specific key mutations to confer an expanded substrate spectrum.

RNA polymerases have also been engineered to synthesize XNAs. It has been shown that T7 RNAP can transcribe content-expanded RNA from DNA containing the Ds-Pa UBP as well as further modified Pa bases [103]. Through a small screen of previously identified mutations, Kimoto et al. identified a T7 RNAP mutant that incorporates 2′-F uracil and cytidine triphosphates and shows improved incorporation of Pa nucleotides and their analogues in difficult sequence contexts [93]. Surprisingly, several bulky modifications of the Pa nucleotide designed to expand chemical diversity for aptamer and ribozyme selections appear to be even better substrates than the original Pa triphosphates. Similarly, a Tgo polymerase mutant previously evolved to synthesize HNA was recently shown to incorporate HNA nucleotides with various aromatic modifications on the uridine base [35]. Here too, some of the bulkier base modifications demonstrated superior incorporation.

Reverse transcription of XNAs

Identification of enzymes capable of reverse transcribing XNAs demonstrates the potential of these divergent chemistries as genetic materials and is crucial for in vitro selections [156]. Some XNAs can be reverse transcribed by natural polymerases (e.g., TNA and FANA by Bst polymerase [157,158,159]). Alternately, engineering approaches must be taken [79, 88]. A small number of mutations in Tgo polymerase yielded a reverse transcriptase with a generally expanded substrate spectrum that can reverse transcribe several XNAs to DNA [29, 78].

Analysis of XNAs and XNA polymerases

One difficulty in working with XNAs is that they are often incompatible with traditional methods of nucleic acid manipulation or analysis such as restriction enzymes and sequencing technologies. Thus, an important parallel effort to the development of XNAs and XNA polymerases is the development of tools to analyze them. For example, fidelity has not been evaluated in detail for many of the described XNA polymerases. A recent method for assaying fidelity found that even relatively small and natural RNA modifications can significantly increase polymerase error rates [160]. Deep sequencing has also been used to investigate error profiles and polymerase read-through of templates with backbone modifications [161], showing that some commonly used isosteric modifications, such as phosphorothioates, caused significant copying errors. These fidelity data foreshadow an important limitation in enzymatic XNA synthesis that will likely need to be addressed as the field moves forward. On the other hand, the promiscuous behavior of polymerases has been leveraged to read and record the presence of epigenetic DNA and RNA modifications via misincorporation “signatures” at the modifications [162,163,164,165,166,167,168,169,170]. In addition, the first instance of direct sequencing of an XNA was recently reported: FANA was sequenced using nanopore technology, albeit with relatively short read lengths [171].

An XNA particularly limited by a lack of tools is L-DNA. Such “mirror image” DNA provides complete orthogonality in macromolecule-scale features and interactions, while maintaining identical properties to DNA at the chemical level. Thus, enzymatic synthesis of L-DNA requires mirror image polymerases made of D-amino acids. Zhu and colleagues first made the D-form of the smallest known polymerase, African Swine Fever Virus polymerase X, and used it to synthesize L-DNA, showing that the polymerase was strictly enantio-specific with no crossover inhibition from D-nucleotide triphosphates [102]. The group subsequently reported synthesis of a mutant of the larger thermostable Dpo4 polymerase with D-amino acids, which enabled PCR amplification of an L-DNA product [99, 100]. Klussmann and colleagues had also reported synthesis of a D-Dpo4 mutant [101], which was used to assemble gene-length L-DNA sequences. A mirror image ligase has also been reported [172]. Mirror image nucleic acids are of immense interest for therapeutics and are being actively pursued in research and clinical development [173]. However, their usage remains limited by the arduous process of synthesizing D-polymerases and the incompatibility of L-DNA with traditional nucleic acid manipulation tools.

Chemistry in living systems

Dissecting complex cellular processes requires the ability to track biomolecules as they function within their native habitat. Although genetically encoded tags such as GFP are widely used to monitor discrete proteins, they can cause significant perturbations to a protein's structure and have no direct extension to other classes of biomolecules such as glycans, lipids, nucleic acids and secondary metabolites. In recent years, an alternative tool for tagging biomolecules has emerged from the chemical biology community—the bioorthogonal chemical reporter. In a prototypical experiment, a unique chemical motif, often as small as a single functional group, is incorporated into the target biomolecule using the cell's own biosynthetic machinery. The chemical reporter is then covalently modified in a highly selective fashion with an exogenously delivered probe. This review highlights the development of bioorthogonal chemical reporters and reactions and their application in living systems.

OLogy Card of the Day: Did You Know?

Koalas are not bears. They're marsupials and are more closely related to kangaroos.

Some meteorites are small pieces of the moon.

If melted, the ice sheets covering Antarctica would raise global sea level by almost 70 meters (230 feet).

Bats are the only mammals that can truly fly.

Fireflies aren't flies at all. They're beetles!

Antarctica is a continent surrounded by ocean. The Arctic is the opposite, an expanse of ocean surrounded by continents.

"Shooting stars" are actually meteors.

Many sauropods grew new teeth as often as once a month, as old ones wore out.

Some meteorites are as old as the solar system.

Half a million neurons form every minute during the first five months in the womb.

Identical twins have the exact same genes, but their fingerprints are unique.

What is Artificial Selection

Artificial selection is the selective breeding of animals and plants to produce offspring with desirable, heritable characters. It is a human-directed process of selecting desired characters, and it is mainly used in livestock and crop improvement. Farmers used selective breeding to maintain desired heritable characters in both animals and plants long before Darwin's work and the discovery of genetics. Beneficial characters such as higher milk production and accelerated lean muscle growth in cattle, exotic pets such as the Savannah cat, and small dogs such as the Chihuahua are products of artificial breeding. A Belgian cow, for example, is maintained by selective breeding for its accelerated lean muscle growth.

Figure 3: A Belgian cow

Furthermore, artificial selection has produced an enormous diversity of plants. Corn, wheat, and soybean strains are developed in agriculture by artificial selection of beneficial traits. Broccoli, Brussels sprouts, cabbage, cauliflower, collards, and kale are produced by the careful selective breeding of wild mustard. Roses and orchids are also cultivated by selective breeding. Artificial selection can also produce various colors in carrot roots.

Figure 4: Carrots with multiple colored roots

2. Kin Selection and Inclusive Fitness

The basic idea of kin selection is simple. Imagine a gene which causes its bearer to behave altruistically towards other organisms, e.g. by sharing food with them. Organisms without the gene are selfish: they keep all their food for themselves, and sometimes get handouts from the altruists. Clearly the altruists will be at a fitness disadvantage, so we should expect the altruistic gene to be eliminated from the population. However, suppose that altruists are discriminating in who they share food with. They do not share with just anybody, but only with their relatives. This immediately changes things. For relatives are genetically similar: they share genes with one another. So when an organism carrying the altruistic gene shares its food, there is a certain probability that the recipients of the food will also carry copies of that gene. (How probable depends on how closely related they are.) This means that the altruistic gene can in principle spread by natural selection. The gene causes an organism to behave in a way which reduces its own fitness but boosts the fitness of its relatives, who have a greater than average chance of carrying the gene themselves. So the overall effect of the behaviour may be to increase the number of copies of the altruistic gene found in the next generation, and thus the incidence of the altruistic behaviour itself.

Though this argument was hinted at by Haldane in the 1930s, and to a lesser extent by Darwin in his discussion of sterile insect castes in The Origin of Species, it was first made explicit by William Hamilton (1964) in a pair of seminal papers. Hamilton demonstrated rigorously that an altruistic gene will be favoured by natural selection when a certain condition, known as Hamilton's rule, is satisfied. In its simplest version, the rule states that b > c/r, where c is the cost incurred by the altruist (the donor), b is the benefit received by the recipients of the altruism, and r is the coefficient of relationship between donor and recipient. The costs and benefits are measured in terms of reproductive fitness. The coefficient of relationship depends on the genealogical relation between donor and recipient: it is defined as the probability that donor and recipient share genes at a given locus that are "identical by descent". (Two genes are identical by descent if they are copies of a single gene in a shared ancestor.) In a sexually reproducing diploid species, the value of r for full siblings is ½, for parents and offspring ½, for grandparents and grandoffspring ¼, for full cousins 1/8, and so on. The higher the value of r, the greater the probability that the recipient of the altruistic behaviour will also possess the gene for altruism. So what Hamilton's rule tells us is that a gene for altruism can spread by natural selection, so long as the cost incurred by the altruist is offset by a sufficient amount of benefit to sufficiently closely related relatives. The proof of Hamilton's rule relies on certain non-trivial assumptions; see Frank 1998, Grafen 1985, 2006, Queller 1992a, 1992b, Boyd and McIlreath 2006 and Birch forthcoming for details.
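As a minimal sketch of how the simplest version of Hamilton's rule can be applied, the following Python snippet checks the inequality b > c/r for the coefficients of relationship quoted above. The function names and the example cost of one fitness unit are illustrative assumptions, not part of Hamilton's own formulation.

```python
from fractions import Fraction

# Coefficients of relationship quoted in the text
# (sexually reproducing diploid species).
RELATEDNESS = {
    "full sibling": Fraction(1, 2),
    "parent/offspring": Fraction(1, 2),
    "grandparent/grandoffspring": Fraction(1, 4),
    "full cousin": Fraction(1, 8),
}


def altruism_favoured(b, c, r):
    """Simplest form of Hamilton's rule: b > c / r, i.e. r * b > c."""
    return r * b > c


def min_benefit(c, r):
    """Benefit that must be exceeded for altruism to be favoured."""
    return Fraction(c) / r


if __name__ == "__main__":
    cost = 1  # hypothetical cost to the donor, in fitness units
    for kin, r in RELATEDNESS.items():
        print(f"{kin}: benefit must exceed {min_benefit(cost, r)}")
```

With a cost of one unit, helping a full sibling pays only if the benefit exceeds two units, while helping a full cousin requires a benefit above eight units.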

Though Hamilton himself did not use the term, his idea quickly became known as "kin selection", for obvious reasons. Kin selection theory predicts that animals are more likely to behave altruistically towards their relatives than towards unrelated members of their species. Moreover, it predicts that the degree of altruism will be greater, the closer the relationship. In the years since Hamilton's theory was devised, these predictions have been amply confirmed by empirical work. For example, in various bird species, it has been found that "helper" birds are much more likely to help relatives raise their young than they are to help unrelated breeding pairs. Similarly, studies of Japanese macaques have shown that altruistic actions, such as defending others from attack, tend to be preferentially directed towards close kin. In most social insect species, a peculiarity of the genetic system known as "haplodiploidy" means that females on average share more genes with their sisters than with their own offspring. So a female may well be able to get more genes into the next generation by helping the queen reproduce, hence increasing the number of sisters she will have, rather than by having offspring of her own. Kin selection theory therefore provides a neat explanation of how sterility in the social insects may have evolved by Darwinian means. (Note, however, that the precise significance of haplodiploidy for the evolution of worker sterility is a controversial question; see Maynard Smith and Szathmary 1995 ch.16, Gardner, Alpedrinha and West 2012.)
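The haplodiploidy claim can be illustrated with the standard textbook calculation, sketched below in Python. It assumes a singly mated queen, haploid fathers (so every daughter receives the same paternal genome) and ordinary meiosis in the mother; the function name is hypothetical and the sketch says nothing about the contested question of how much this asymmetry actually matters.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def sister_relatedness(paternal_sharing, maternal_sharing):
    """Relatedness between full sisters.

    Half of a female's genes come from each parent; each half is
    weighted by the probability that a sister inherited the same copy.
    """
    return HALF * paternal_sharing + HALF * maternal_sharing


# Diploid species: each parental half is shared with probability 1/2.
diploid_sisters = sister_relatedness(HALF, HALF)

# Haplodiploid species: the haploid father passes on his whole genome,
# so the paternal half is always shared between full sisters.
haplodiploid_sisters = sister_relatedness(Fraction(1), HALF)

# A mother passes half of her genes to each offspring.
mother_offspring = HALF

print(diploid_sisters, haplodiploid_sisters, mother_offspring)
```

Under these assumptions a female is related to her full sisters by 3/4 but to her own offspring by only 1/2, which is the asymmetry the paragraph above refers to.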

Kin selection theory is often presented as a triumph of the "gene's-eye view of evolution", which sees organic evolution as the result of competition among genes for increased representation in the gene-pool, and individual organisms as mere "vehicles" that genes have constructed to aid their propagation (Dawkins 1976, 1982). The gene's-eye view is certainly the easiest way of understanding kin selection, and was employed by Hamilton himself in his 1964 papers. Altruism seems anomalous from the individual organism's point of view, but from the gene's point of view it makes good sense. A gene wants to maximize the number of copies of itself that are found in the next generation; one way of doing that is to cause its host organism to behave altruistically towards other bearers of the gene, so long as the costs and benefits satisfy the Hamilton inequality. But interestingly, Hamilton showed that kin selection can also be understood from the organism's point of view. Though an altruistic behaviour which spreads by kin selection reduces the organism's personal fitness (by definition), it increases what Hamilton called the organism's inclusive fitness. An organism's inclusive fitness is defined as its personal fitness, plus the sum of its weighted effects on the fitness of every other organism in the population, the weights determined by the coefficient of relationship r. Given this definition, natural selection will act to maximise the inclusive fitness of individuals in the population (Grafen 2006). Instead of thinking in terms of selfish genes trying to maximize their future representation in the gene-pool, we can think in terms of organisms trying to maximize their inclusive fitness. Most people find the "gene's-eye" approach to kin selection heuristically simpler than the inclusive fitness approach, but mathematically they are in fact equivalent (Michod 1982, Frank 1998, Boyd and McIlreath 2006, Grafen 2006).
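The definition of inclusive fitness given above can be turned into a short worked example. The Python sketch below follows that definition directly (personal fitness plus relatedness-weighted effects on others); the baseline fitness of 10 and the cost and benefit values are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction


def inclusive_fitness(personal_fitness, weighted_effects):
    """Personal fitness plus the sum of r-weighted effects on others.

    weighted_effects is a list of (r, effect) pairs, one per recipient.
    """
    return personal_fitness + sum(r * effect for r, effect in weighted_effects)


# Hypothetical numbers: baseline fitness 10, cost c = 1 to the donor,
# benefit b = 3 to one full sibling (r = 1/2).
baseline = Fraction(10)
c, b, r = Fraction(1), Fraction(3), Fraction(1, 2)

selfish = inclusive_fitness(baseline, [])
altruist = inclusive_fitness(baseline - c, [(r, b)])

print(selfish, altruist)
```

In this example the altruist's personal fitness falls from 10 to 9, yet its inclusive fitness rises to 21/2, and it does so exactly because r × b exceeds c, which is Hamilton's inequality seen from the organism's point of view.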

Contrary to what is sometimes thought, kin selection does not require that animals must have the ability to discriminate relatives from non-relatives, less still to calculate coefficients of relationship. Many animals can in fact recognize their kin, often by smell, but kin selection can operate in the absence of such an ability. Hamilton's inequality can be satisfied so long as an animal behaves altruistically towards other animals that are in fact its relatives. The animal might achieve this by having the ability to tell relatives from non-relatives, but this is not the only possibility. An alternative is to use some proximal indicator of kinship. For example, if an animal behaves altruistically towards those in its immediate vicinity, then the recipients of the altruism are likely to be relatives, given that relatives tend to live near each other. No ability to recognize kin is presupposed. Cuckoos exploit precisely this fact, free-riding on the innate tendency of birds to care for the young in their nests.

Another popular misconception is that kin selection theory is committed to "genetic determinism", the idea that genes rigidly determine or control behaviour. Though some sociobiologists have made incautious remarks to this effect, evolutionary theories of behaviour, including kin selection, are not committed to it. So long as the behaviours in question have a genetical component, i.e. are influenced to some extent by one or more genetic factors, then the theories can apply. When Hamilton (1964) talks about a gene which "causes" altruism, this is really shorthand for a gene which increases the probability that its bearer will behave altruistically, to some degree. This is much weaker than saying that the behaviour is genetically "determined", and is quite compatible with the existence of strong environmental influences on the behaviour's expression. Kin selection theory does not deny the truism that all traits are affected by both genes and environment. Nor does it deny that many interesting animal behaviours are transmitted through non-genetical means, such as imitation and social learning (Avital and Jablonka 2000).

The importance of kinship for the evolution of altruism is very widely accepted today, on both theoretical and empirical grounds. However, kinship is really only a way of ensuring that altruists and recipients both carry copies of the altruistic gene, which is the fundamental requirement. If altruism is to evolve, it must be the case that the recipients of altruistic actions have a greater than average probability of being altruists themselves. Kin-directed altruism is the most obvious way of satisfying this condition, but there are other possibilities too (Hamilton 1975, Sober and Wilson 1998, Bowles and Gintis 2011, Gardner and West 2011). For example, if the gene that causes altruism also causes animals to favour a particular feeding ground (for whatever reason), then the required correlation between donor and recipient may be generated. It is this correlation, however brought about, that is necessary for altruism to evolve. This point was noted by Hamilton himself in the 1970s: he stressed that the coefficient of relationship of his 1964 papers should really be replaced with a more general correlation coefficient, which reflects the probability that altruist and recipient share genes, whether because of kinship or not (Hamilton 1970, 1972, 1975). This point is theoretically important, and has not always been recognized; but in practice, kinship remains the most important source of statistical associations between altruists and recipients (Maynard Smith 1998, Okasha 2002, West et al. 2007).

2.1 A Simple Illustration: the Prisoner's dilemma

The fact that correlation between donor and recipient is the key to the evolution of altruism can be illustrated via a simple "one-shot" Prisoner's dilemma game. Consider a large population of organisms who engage in a social interaction in pairs; the interaction affects their biological fitness. Organisms are of two types: selfish (S) and altruistic (A). The latter engage in pro-social behaviour, thus benefiting their partner but at a cost to themselves; the former do not. So in a mixed (S,A) pair, the selfish organism does better: he benefits from his partner's altruism without incurring any cost. However, (A,A) pairs do better than (S,S) pairs, for the former work as a co-operative unit, while the latter do not. The interaction thus has the form of a one-shot Prisoner's dilemma, familiar from game theory. Illustrative payoff values to each "player", i.e., each partner in the interaction, measured in units of biological fitness, are shown in the matrix below.

                         Player 2
                         Altruist    Selfish
Player 1    Altruist     11, 11      0, 20
            Selfish      20, 0       5, 5

Payoffs for (Player 1, Player 2) in units of reproductive fitness

The question we are interested in is: which type will be favoured by selection? To make the analysis tractable, we make two simplifying assumptions: that reproduction is asexual, and that type is perfectly inherited, i.e., selfish (altruistic) organisms give rise to selfish (altruistic) offspring. Modulo these assumptions, the evolutionary dynamics can be determined very easily, simply by seeing whether the S or the A type has higher fitness in the overall population. The fitness of the S type, W(S), is the weighted average of the payoff to an S when partnered with an S and the payoff to an S when partnered with an A, where the weights are determined by the probability of having the partner in question. Therefore,

W(S) = 5 × P(S partner/S) + 20 × P(A partner/S)

(The conditional probabilities in the above expression should be read as the probability of having a selfish (altruistic) partner, given that one is selfish oneself.)

Similarly, the fitness of the A type is:

W(A) = 0 × P(S partner/A) + 11 × P(A partner/A)

From these expressions for the fitnesses of the two types of organism, we can immediately deduce that the altruistic type will only be favoured by selection if there is a statistical correlation between partners, i.e., if altruists have greater than random chance of being paired with other altruists, and similarly for selfish types. For suppose there is no such correlation, as would be the case if the pairs were formed by random sampling from the population. Then, the probability of having a selfish partner would be the same for both S and A types, i.e., P(S partner/S) = P(S partner/A). Similarly, P(A partner/S) = P(A partner/A). From these probabilistic equalities, it follows immediately that W(S) is greater than W(A), as can be seen from the expressions for W(S) and W(A) above; so the selfish type will be favoured by natural selection, and will increase in frequency every generation until all the altruists are eliminated from the population. Therefore, in the absence of correlation between partners, selfishness must win out (cf. Skyrms 1996). This confirms the point noted in section 2, namely that altruism can only evolve if there is a statistical tendency for the beneficiaries of altruistic actions to be altruists themselves.

If the correlation between partners is sufficiently strong, in this simple model, then it is possible for the condition W(A) > W(S) to be satisfied, and thus for altruism to evolve. The easiest way to see this is to suppose that the correlation is perfect, i.e., selfish types are always paired with other selfish types, and ditto for altruists, so P(S partner/S) = P(A partner/A) = 1. This assumption implies that W(A)=11 and W(S)=5, so altruism evolves. With intermediate degrees of correlation, it is also possible for the condition W(A) > W(S) to be satisfied, given the particular choice of payoff values in the model above.
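The effect of partner correlation can be explored numerically with the payoff values of the matrix above. The Python sketch below makes one simplifying assumption that is not in the original model: both types are given the same probability, here called p_same, of being paired with a partner of their own type. The names PAYOFF, fitness and p_same are illustrative.

```python
from fractions import Fraction

# Payoff to the focal player, indexed as PAYOFF[own type][partner type],
# taken from the payoff matrix in the text.
PAYOFF = {
    "A": {"A": 11, "S": 0},
    "S": {"A": 20, "S": 5},
}


def fitness(own, p_same):
    """Expected payoff when a same-type partner has probability p_same.

    Assumption: p_same is identical for altruists and selfish types.
    """
    other = "S" if own == "A" else "A"
    return PAYOFF[own][own] * p_same + PAYOFF[own][other] * (1 - p_same)


if __name__ == "__main__":
    for p_same in (Fraction(1, 2), Fraction(4, 5), Fraction(1)):
        w_a, w_s = fitness("A", p_same), fitness("S", p_same)
        winner = "altruists" if w_a > w_s else "selfish types"
        print(f"p_same={p_same}: W(A)={w_a}, W(S)={w_s}, favours {winner}")
```

Under this assumption W(A) = 11 × p_same and W(S) = 20 − 15 × p_same, so altruism is favoured only when p_same exceeds 10/13 (roughly 0.77): perfect correlation is sufficient but not necessary, and weak correlation is not enough.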

This simple model also highlights the point made previously, that donor-recipient correlation, rather than genetic relatedness, is the key to the evolution of altruism. What is needed for altruism to evolve, in the model above, is for the probability of having a partner of the same type as oneself to be sufficiently larger than the probability of having a partner of opposite type; this ensures that the recipients of altruism have a greater than random chance of being fellow altruists, i.e., donor-recipient correlation. Whether this correlation arises because partners tend to be relatives, or because altruists are able to seek out other altruists and choose them as partners, or for some other reason, makes no difference to the evolutionary dynamics, at least in this simple example.

Graduate School of Pharmaceutical Science, Tokushima University

2018 Volume 66 Issue 2 Pages 132-138

  • Published: February 01, 2018 Received: August 29, 2017 Released on J-STAGE: February 01, 2018 Accepted: - Advance online publication: - Revised: -

In this review, we summarize research efforts to develop unnatural base pairs beyond the standard Watson–Crick (WC) base pairs for synthetic biology. Prior to introducing our research results, we present investigations by four outstanding groups in the field. Their research results demonstrate the importance of shape complementarity and stacking ability as well as hydrogen-bonding (H-bonding) patterns for unnatural base pairs. On the basis of this research background, we developed unnatural base pairs consisting of imidazo[5′,4′:4,5]pyrido[2,3-d]pyrimidines and 1,8-naphthyridines, i.e., Im : Na pairs. Since Im bases are recognized as ring-expanded purines and Na bases are recognized as ring-expanded pyrimidines, Im : Na pairs are expected to satisfy the criteria of shape complementarity and enhanced stacking ability. In addition, these pairs have four non-canonical H-bonds. Because of these preferable properties, ImN N : NaO O , one of the Im : Na pairs, is recognized as a complementary base pair not only in single nucleotide insertion, but also in PCR.

Recently, work by the Human Genome Project-Write, which focuses on synthesizing human genomes, has started. 1) Rewriting entire human genomes will deepen our understanding of the genetic code and have an impact on human health. In this manner, synthetic biology is a bottom-up-type research field that deals with the preparation of materials that comprise life systems. As only two base pairs have been selected during the evolution of life, i.e., adenine (A) : thymine (T) and guanine (G) : cytosine (C) pairs, these represent ideal genetic polymers. The specific formation of hydrogen bonds (H-bonds) in the A : T pair (two H-bonds) and G : C pair (three H-bonds) is the most fundamental rule of genetic information. In 1962, with surprising foresight, Rich proposed the possibility of an extra artificial base pair, i.e., isoguanine (isoG, 6-amino-2-oxopurine) and isocytosine (isoC, 2-amino-4-oxopyrimidine), representing fifth and sixth DNA nucleobases. 2) The artificially designed isoG : isoC pair has three H-bonds with the specific proton donor (D) and proton acceptor (A) geometry [DDA : AAD], which is different from those in the A : T pair ([DA : AD]) and the G : C pair ([ADD : DAA]) (Fig. 1). If an extra base pair can function selectively in replication, transcription, and translation alongside natural Watson–Crick (WC) base pairs, it could potentially allow expansion of the genetic code. Thus, the creation of unnatural base pairs is a challenging and ideal research theme in synthetic biology. Herein, research into the development of unnatural base pairs and their applications is described.
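The donor/acceptor notation used above lends itself to a simple consistency check. The following Python sketch treats each pattern as a string of D and A symbols and assumes, as an illustrative simplification, that the two strings of a pair are read position by position so that every donor faces an acceptor. The function names are hypothetical, and the check ignores shape, stacking and tautomerism.

```python
# H-bonding patterns quoted in the text, written as (purine-side, pyrimidine-side).
PATTERNS = {
    "A:T": ("DA", "AD"),
    "G:C": ("ADD", "DAA"),
    "isoG:isoC": ("DDA", "AAD"),
}


def complement(pattern):
    """Swap every donor (D) for an acceptor (A) and vice versa."""
    swap = {"D": "A", "A": "D"}
    return "".join(swap[site] for site in pattern)


def is_complementary(left, right):
    """True if every donor on one base faces an acceptor on the other."""
    return len(left) == len(right) and complement(left) == right


if __name__ == "__main__":
    for name, (left, right) in PATTERNS.items():
        print(name, is_complementary(left, right))

    # Cross-pairing between the natural and the artificial three-bond pairs.
    iso_g, iso_c = PATTERNS["isoG:isoC"]
    g, c = PATTERNS["G:C"]
    print("isoG with C:", is_complementary(iso_g, c))
    print("G with isoC:", is_complementary(g, iso_c))
```

Each listed pair is internally complementary, whereas isoG does not match C and G does not match isoC, which is the sense in which a different H-bonding geometry keeps the extra pair distinct from the G : C pair.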

Prior to presenting our unnatural base pair studies, the work of four famous and pioneering groups focusing on unnatural base pairs is introduced.

2.1. Unnatural Base Pairs with Non-standard H-Bonding Geometries: Benner’s Group

In 1989, Benner and colleagues synthesized isoG and isoC nucleosides and their triphosphates with the goal of expanding the genetic alphabet 3) (Fig. 1). The isoG : isoC pair was recognized as a complementary base pair by polymerases in both in vitro replication and transcription systems. 4) They also designed other unnatural base pairs with different H-bonding patterns, such as the X : κ pair. 5) Additionally, they succeeded in incorporating the unnatural amino acid 3-iodotyrosine into a peptide by using a pair of 54-mer mRNA comprising isoC and tRNA with an isoGUC anticodon in in vitro translation systems. 6) These were the first studies to succeed in artificially rebuilding the central dogma using unnatural base pairs, indicating that the alteration of H-bonding geometries in base pairs is a promising strategy for creating new unnatural base pairs. However, the selectivity of the isoG : isoC pair in enzymatic replication was unsatisfactory. This is because isoG has a problem with tautomerism, in that the enol form of isoG has a [DAD] H-bonding pattern that is complementary to that of T. 4) In 2005, to address this drawback of isoG, they replaced natural T with 2-thioT (T^s). 7) Because of the bulkiness and H-bonding properties (weak proton acceptability) of the thione, T^s is less likely to mispair with the tautomer of isoG than natural T. Fidelity per doubling of the isoG : isoC pair along with the A : T^s pair in the PCR was improved to around 98%, whereas that with natural A : T was 93%. 7) However, when using an unnatural base pair with 98% replication fidelity, the retention of the unnatural base pair in its amplified DNA fragment after a 20-cycle PCR decreases to 67% (i.e., 0.98^20 ≈ 0.67). Because the error rate for natural WC pairing in replication is ca. 10^−6 errors/bp, highly exclusive selectivity of unnatural base pairs is required.
Thus, they also created another unnatural base pair comprising 2-aminoimidazo[1,2-a]-1,3,5-triazin-4(8H)-one (P) and 6-amino-5-nitro-2(1H)-pyridone (Z). 8) The P : Z pair, which has [AAD : DDA] H-bonding geometry, exhibits up to 99.8% fidelity per doubling without using the A : T^s pair because, unlike isoG, Z does not tautomerize. 9) Recently, they applied the six-letter genetic system with the P : Z pair to the cell-systematic evolution of ligands by exponential enrichment (SELEX) system and succeeded in obtaining highly active aptamers against HepG2 liver cancer cells. 10)
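The retention estimate quoted above follows from compounding the per-doubling fidelity over the PCR cycles. A minimal Python sketch of that arithmetic is given below; it assumes, as the calculation in the text does, that each cycle counts as one doubling, and the function names are illustrative.

```python
def retention(fidelity_per_doubling, cycles):
    """Fraction of amplicons still carrying the unnatural pair after PCR.

    Assumes every cycle is one doubling with the same fidelity.
    """
    return fidelity_per_doubling ** cycles


def fidelity_from_retention(retained_fraction, cycles):
    """Invert the relation: per-doubling fidelity implied by a retention."""
    return retained_fraction ** (1.0 / cycles)


if __name__ == "__main__":
    for fidelity in (0.93, 0.98, 0.998):
        kept = retention(fidelity, 20)
        print(f"fidelity {fidelity:.3f} -> retention after 20 cycles {kept:.2f}")
```

For a fidelity of 0.98 this reproduces the retention of about 0.67 after 20 cycles, and it shows how strongly even small gains in per-doubling fidelity improve retention over many cycles.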

2.2. Non-hydrogen-Bonded Unnatural Base Pairs: Kool’s Group

During the same decade as Benner’s pioneering work, Kool et al. explored the possibility of non-H-bonded unnatural base pairs. In 1998, they created an unnatural base pair comprising 4-methylbenzimidazole (Z) 11) and 2,4-difluorotoluene (F) as steric isosteres of the natural A : T pair 12,13) (Fig. 2A). In an in vitro replication system, Z and F could substitute equally well for natural A and T, but not for G and C, demonstrating the importance of shape complementarity and stacking interactions in addition to H-bonding in base pairing. Additionally, they designed a modified Z base, 9-methyl-1H-imidazo[4,5-b]pyridine (Q), that has a proton acceptor corresponding to the N3 atom. 14) Because the incorporation efficiency of Q by Klenow fragment (KF) DNA polymerase is superior to that of Z, the importance of proton acceptors in the minor groove for unnatural base pair design is also demonstrated.

To further evaluate the importance of shape complementarity in base pairing, they also created size-expanded (benzo-fused) WC-like base pairs, such as xA : T and A : xT pairs (termed xDNA), that have the same H-bonding geometry as natural WC base pairs but with their pairing edges shifted outward by 2.4 Å (i.e., the width of benzene) 15,16) (Fig. 2B). KF polymerase incorporated natural nucleoside triphosphates (dNTPs) opposite xDNA bases in a DNA template with an efficiency ca. 1000-fold lower than that of natural pairs, 16) and endogenous Escherichia coli (E. coli) enzymes accurately transcribed xDNA to encode the bacterial phenotype. 17)

2.3. Creation of a Semi-synthetic Organism with an Unnatural Base Pair: Romesberg’s Group

Romesberg and colleagues have also developed various kinds of non-H-bonded unnatural base pairs. In 1999, they reported the self-complementary 7-propynylisocarbostyril (PICS) : PICS pair 18) (Fig. 3). When PICS : PICS base pairs are incorporated into DNA, the resulting duplex shows high thermal stability, and KF polymerase recognizes the PICS : PICS pair as a complementary base pair. However, further replication reactions after PICS : PICS base pairing are terminated because PICS bases overlap with each other, indicating structural change in the DNA duplex. Consequently, they explored more than 100 kinds of unnatural base pairs 19–25) and succeeded in developing 5SICS : MMO2 and 5SICS : NaM pairs, which are replicable unnatural base pairs in the PCR. 26–28) In 2014, they reported the creation of a semi-synthetic organism containing the 5SICS : NaM base pair. In this work, an exogenously expressed nucleoside triphosphate transporter imported d5SICS and dNaM triphosphates efficiently into E. coli, and an endogenous replication system used them in the genetic codes. 29) This report had a great impact on synthetic biology, and some researchers consider the created organism to be “alien.”

2.4. Unnatural Base Pair as a Powerful Tool for Creating Highly Functional Nucleic Acids: Hirao’s Group

Hirao et al. have also focused on the creation of unnatural base pairs that function in replication, transcription, and translation in the same way as natural WC base pairs. Their unnatural base pairs were developed by exploiting the concept of steric hindrance. In their 2-amino-6-(2-thienyl)purine (s) : 2-oxo-1H-pyridine (y) pair, the purine-like s has a bulky substituent at the major groove side 30,31) (Fig. 4). Thus, the s : y pair is selectively recognized as a complementary base pair by KF polymerase in in vitro replication systems. Furthermore, in 2002 they succeeded in synthesizing the Ras protein modified with 3-iodotyrosine from a DNA template containing the s base by combining T7 polymerase transcription and E. coli in vitro translation systems. 32) They also developed the Ds : Pa base pair, in which H-bonding atoms and substituents located at the base-pairing side are excluded. 33) The replication selectivity of the Ds : Pa pair is superior to that of the s : y pair, and the Ds : Pa pair can be amplified in the PCR with over 99% fidelity per doubling using γ-amino triphosphates of Ds and A. 33) Concerning selectivity in replication, their unnatural base pairs exhibit the best performance among the reported unnatural base pairs. The low misincorporation rate of the recently developed Ds : diol1-Px pair (5×10^−5 errors/bp) is close to the mispairing error rate of natural WC pairs (2×10^−5 errors/bp). 34,35) By making use of this superior property of the Ds : diol1-Px pair, they succeeded in obtaining a DNA aptamer containing the Ds base against a human protein target, vascular endothelial cell growth factor-165 (VEGF-165). 36) Because the affinities of aptamers that have Ds bases are >100-fold improved over those of aptamers containing only natural bases, the potential of genetic alphabet expansion as a powerful tool for creating highly functional nucleic acids is demonstrated.

In contrast to the research described above, we began our unnatural base pair studies to address a simple question: why did WC base pairs come to contain two or three H-bonds during the evolution of life? To answer this, we have explored four H-bonding base pairs. 37–41) As purine-type nucleobases, a series of imidazo[5′,4′:4,5]pyrido[2,3-d]pyrimidines (Im) were designed, 37) while 1,8-naphthyridines (Na) were designed as their complementary pyrimidine nucleobases. 38) For the first generation of our four-H-bonding unnatural base pairs, two Im : Na pairs, i.e., ImN O : NaO N and ImO N : NaN O , which have alternate H-bonding geometries, were developed. As can be seen in Fig. 5a, these pairs have four non-canonical H-bonds and expanded aromatic surfaces, and they satisfy the shape complementarity criterion like WC base pairs. Because of the contributions of these effects, DNA duplexes containing these pair(s) are significantly thermally stabilized (ca. +8°C/pair). 38) In addition, both pairs are recognized by KF polymerase as complementary in single nucleotide insertion. However, the kinetic parameters determined for their 5′-triphosphates revealed that the efficiencies of incorporation for ImN O : NaO N and ImO N : NaN O pairs are 1–2 orders of magnitude lower than those of natural A : T and G : C pairs. Furthermore, misincorporation of natural dNTP, for example, that of 2′-deoxyadenosine 5′-triphosphate (dATP) against NaN O in the template, was clearly observed at the same efficiency as that of ImO N TP against NaN O in the template, owing to the possible formation of an A : NaN O pair with two H-bonds 42,43) (Fig. 5b).

To improve efficiency and selectivity, a new Im : Na pair, i.e., ImN N : NaO O , has been envisioned 39,44) (Fig. 5c). This pair has a [DAAD : ADDA] H-bonding geometry, and thus is expected to avoid the misincorporation of natural A and G (Fig. 5d). The chemistry and enzymatic behavior of the ImN N : NaO O pair is described below.

3.1. Synthesis of the Nucleoside Units for the ImN N : NaO O Pair

The most straightforward synthesis of ImN N nucleoside 1 is thought to be through intramolecular cyclization of the 5-pyrimidinylimidazole nucleoside, which can be prepared via Stille coupling between the 5-iodoimidazole nucleoside 2 and (tributylstannyl)pyrimidine 3 (Chart 1). When a mixture of 2 prepared from 2′-deoxyinosine and 3 prepared from 2,4-dichloropyrimidine is heated in N,N-dimethylformamide (DMF) in the presence of tris(dibenzylideneacetone)dipalladium(0)-chloroform adduct (dba3Pd2·CHCl3), a mixture of coupling product 4 and a spontaneously cyclized tricyclic product 5 is obtained. Subsequent treatment of the mixture under basic conditions converges the mixture to the tricyclic product 5. Finally, treatment of 5 with a mixture of 1,4-dioxane and NH4OH gives the desired ImN N nucleoside 1 in good yield. 37) The resulting 1 is then converted into the corresponding phosphoramidite unit and 5′-triphosphate under the appropriate conditions. 44,45)

Reagents and conditions: (a) dba3Pd2·CHCl3, DMF, 100°C; (b) Na2CO3, aq. EtOH, 80°C; (c) NH4OH/1,4-dioxane, 100°C.

For the synthesis of NaO O nucleoside 6, which is an unusual C-nucleoside, the palladium-catalyzed Heck reaction was envisioned. As illustrated in Chart 2, 3-iodo-1,8-naphthyridine derivative 7, prepared from 2-amino-7-hydroxy-1,8-naphthyridine, and glycal 8 are first prepared. Then, Heck coupling of 7 with 8 in the presence of palladium acetate and triphenylarsine followed by deprotection and stereoselective reduction affords 1,8-naphthyridine C-nucleoside 9. After protection of the hydroxyl groups with silyl groups to give 10, the substituent at the 2-position is converted into an acetoxy group via 11. Finally, treatment of the resulting 12 with methanolic ammonia at 80°C in a sealed tube gives the desired NaO O nucleoside 6 in good yield. In a similar manner as for 1, 6 is converted into the corresponding phosphoramidite unit and 5′-triphosphate for enzymatic evaluation. 39,45)

Reagents and conditions: (a) Pd(OAc)2, AsPh3, Bu3N, DMF, 60°C; (b) TBAF, THF; (c) NaBH(OAc)3, AcOH, CH3CN; (d) TIPSCl, imidazole, DMF, 55°C; (e) NH3/MeOH, 80°C; (f) NaNO2, AcOH; (g) NH3/MeOH, 80°C.

3.2. Investigation of Single Nucleotide Insertion with the ImN N : NaO O Pair

To investigate the efficiency and selectivity of the newly designed ImN N : NaO O pair in in vitro replication systems, we examined single nucleotide insertion using KF polymerase. The kinetic parameters, i.e., the Michaelis constant (Km), the maximum rate of the enzyme reaction (Vmax), and the incorporation efficiency (Vmax/Km), were determined for the ImN N : NaO O pair and compared with those of the two previous Im : Na pairs 45) (Fig. 6). As discussed above, the values of Vmax/Km for the ImN O : NaO N pair are 1–2 orders of magnitude lower than those of the natural A : T pair (6.0×10^7–9.0×10^7 % min^−1 M^−1), as presented in the first row of Fig. 6. This result is thought to be due to the fact that the NaO N base lacks a proton acceptor corresponding to the O2 atom of the natural pyrimidine base. For the ImO N : NaN O pair, the Vmax/Km values are better than those for the ImN O : NaO N pair (second row in Fig. 6). However, as well as the desired ImO N , undesired A is incorporated against NaN O in the template with a comparable Vmax/Km value (Fig. 5b).

a) Incorporation of dYTP against a series of Im bases in the template. b) Incorporation of dYTP against a series of Na bases in the template. a n.d.=not determined.

Concerning incorporation efficiencies, the Vmax/Km values for the ImN N : NaO O pair are superior to those for the ImN O : NaO N and ImO N : NaN O pairs because the ImN N and NaO O bases have proton acceptors at positions corresponding to the N3 of a purine and the O2 of a pyrimidine, respectively. In addition, the ImN N : NaO O pair has higher thermal stability than the two previous Im : Na pairs owing to the [DAAD : ADDA] H-bonding pattern. 39) The preferable base-pairing properties of the ImN N : NaO O pair lead to it having the highest incorporation efficiency among the three Im : Na pairs. With respect to specificity, misincorporations of natural A and/or G against ImN N in the template are controlled by the [DAAD : ADDA] H-bonding pattern of the ImN N : NaO O pair. The efficiency of ImN N TP incorporation against NaO O is at least ten-times higher than those of natural dATP and 2′-deoxyguanosine 5′-triphosphate (dGTP) incorporations. Thus, as expected from Fig. 5d, formation of both A : NaO O and G : NaO O should be negligible owing to the NH proton repulsion between the 6-amino group of A and N8 of NaO O , and that between N1 of G and N1 of NaO O , respectively.
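The comparisons above rest on the incorporation efficiency Vmax/Km and on ratios of such efficiencies. The Python sketch below illustrates these quantities with the standard Michaelis–Menten rate law; all numerical values are hypothetical placeholders in arbitrary units, not data from Fig. 6, and the function names are illustrative.

```python
def mm_rate(vmax, km, substrate):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


def efficiency(vmax, km):
    """Incorporation efficiency Vmax/Km (slope of v versus [S] at low [S])."""
    return vmax / km


def selectivity(eff_cognate, eff_mispair):
    """How many times more efficiently the cognate triphosphate is used."""
    return eff_cognate / eff_mispair


if __name__ == "__main__":
    # Hypothetical kinetic parameters (arbitrary units), for illustration only.
    cognate = {"vmax": 10.0, "km": 2.0}
    mispair = {"vmax": 2.0, "km": 8.0}

    eff_c = efficiency(**cognate)
    eff_m = efficiency(**mispair)
    print("cognate efficiency:", eff_c)
    print("mispair efficiency:", eff_m)
    print("selectivity:", selectivity(eff_c, eff_m))
```

A selectivity of ten or more in this sense corresponds to the statement that the unnatural triphosphate is incorporated at least ten times more efficiently than a competing natural dNTP.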

3.3. PCR Amplification with ImN N : NaO O Pair

To apply the newly developed ImN N : NaO O pair to synthetic biology research like that reported by the four aforementioned groups, this pair should be viable in PCR amplification. Thus, according to the method reported by Hirao et al., 34) PCR involving the ImN N : NaO O pair was examined under various dNTP conditions (Fig. 7a).

(a) Schematics of the template and primers, and the resulting amplicon. Gel electrophoresis of PCR products obtained using Taq DNA polymerase (b), Deep Vent exo − DNA polymerase (c), Deep Vent exo + DNA polymerase (d), and Pfx 50 DNA polymerase (e) under different dNTP conditions.

First, when Taq DNA polymerase, a standard thermophilic DNA polymerase for routine PCR, is used, a 75 base-pair amplicon is successfully obtained in the presence of ImN N TP and NaO O TP along with all four natural dNTPs (Fig. 7b, lane 4). However, similar PCR products are observed under conditions lacking ImN N TP (lane 3), indicating that inaccurate amplification occurs, presumably owing to misincorporation of natural A and/or G against NaO O in the resulting DNA fragment. We therefore screened thermophilic DNA polymerases for suitability, and typical results are shown in Figs. 7c–e. Exonuclease-deficient Deep Vent (Deep Vent exo−) DNA polymerase gives the full-length amplicon in both the presence and absence of NaO O TP (Fig. 7c). Conversely, the same polymerase with 3′→5′ exonuclease activity (Deep Vent exo+) preferentially affords the PCR product in the presence of all 5′-triphosphates (Fig. 7d), suggesting that the proofreading activity identifies mismatched base pairs with natural nucleobases and corrects them to the ImN N : NaO O pair. It has been reported that the proofreading activity of DNA polymerases improves the accuracy of incorporating unnatural base pair analogs, 32,33) and the benefits of this activity are apparent in our case.

To further evaluate the fidelity of the ImN N : NaO O pair in PCR amplification, we sequenced the resulting PCR product according to methods reported by the groups of Benner 26) and Hirao. 34) The lowest total mutation rate of the ImN N : NaO O pair is observed with Pfx 50 DNA polymerase and is estimated to be ca. 6% after 15 PCR cycles (fidelity ≈0.995 per doubling); the gel electrophoresis analysis of the PCR products is shown in Fig. 7e. Although the replication fidelity of the ImN N : NaO O pair is slightly inferior to those of other unnatural base-pair analogs, 8,9,26,28,34,46,47) these results strongly indicate that the ImN N : NaO O pair acts as a base pair orthogonal to WC base pairs during PCR amplification.
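The relation between the total mutation rate and the per-doubling fidelity quoted above can be checked with a short calculation. The sketch below assumes the simplest model: the unnatural pair is retained with the same probability f at every doubling, and each PCR cycle counts as one doubling, so the retained fraction after n doublings is f^n. The function names are illustrative and are not taken from the cited methods.

```python
def per_doubling_fidelity(total_mutation_rate: float, doublings: int) -> float:
    """Return f such that f ** doublings == 1 - total_mutation_rate.

    Assumption: retention of the unnatural pair is independent and
    identical at every doubling (a simplification, not the cited protocol).
    """
    if not 0.0 <= total_mutation_rate < 1.0:
        raise ValueError("total_mutation_rate must be in [0, 1)")
    if doublings <= 0:
        raise ValueError("doublings must be a positive integer")
    return (1.0 - total_mutation_rate) ** (1.0 / doublings)


def accumulated_mutation_rate(fidelity: float, doublings: int) -> float:
    """Return the fraction of products that lost the pair after the given doublings."""
    if not 0.0 < fidelity <= 1.0:
        raise ValueError("fidelity must be in (0, 1]")
    if doublings < 0:
        raise ValueError("doublings must be non-negative")
    return 1.0 - fidelity ** doublings


if __name__ == "__main__":
    # Values from the text: ca. 6% total mutation after 15 PCR cycles.
    f = per_doubling_fidelity(0.06, 15)
    print(f"per-doubling fidelity: {f:.4f}")
    # Reverse check starting from the quoted fidelity of about 0.995.
    print(f"total mutation at f = 0.995: {accumulated_mutation_rate(0.995, 15):.3f}")
```

Under these assumptions, a 6% total mutation rate over 15 doublings corresponds to a per-doubling fidelity of about 0.996, in line with the ≈0.995 given in the text. The small difference depends on rounding and on how many effective doublings 15 cycles represent, which is not specified here.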
