Synthetic DNA and the necessity of biological mechanisms

Scientists have created synthetic DNA with four extra “letters”:

A couple billion years ago, four molecules danced into the elegant double-helix structure of DNA, which provides the codes for life on our planet. But were these four players really fundamental to the appearance of life — or could others have also given rise to our genetic code?

A new study, published today (Feb. 20) in the journal Science, supports the latter proposition: Scientists have recently molded a new kind of DNA into its elegant double-helix structure and found it had properties that could support life.

But if natural DNA is a short story, this synthetic DNA is a Tolstoy novel.

The researchers crafted the synthetic DNA using four additional molecules, so that the resulting product had a code made up of eight letters rather than four. With the increase in letters, this DNA had a much greater capacity to store information. Scientists called the new DNA “hachimoji” (meaning “eight letters” in Japanese), expanding on previous work from different groups that had created similar DNA using six letters.

The team was able to confirm that the synthetic DNA could be transcribed into RNA, and could form the double helix structure.  However, they stopped short of confirming that it could replicate itself:

Still, in order for the Hachimoji DNA to support life, there’s a fifth requirement, Benner said. That is, it needs to be self-sustaining or have the ability to survive on its own. However, the researchers stopped short of investigating this step, in order to prevent the molecule from becoming a biohazard that could one day work its way into the genomes of organisms on Earth.

The article takes Hachimoji DNA as evidence that extraterrestrial DNA could be made of different components than the ones found on Earth.  However, since they stopped short of replication, I’m not sure we have that evidence yet.

One question that often comes up in biology is, how necessary or arbitrary is a particular solution in evolution?  In other words, were there other solutions that life could have taken to solve a particular problem?  This is always a difficult question, because we don’t know whether those alternate solutions arose at some point in the past, but were subsequently selected against, or never arose because the right mutation just never happened.

Biological traits arise because of mutations.  Mutations can be beneficial, in which case they’ll usually be selected for, or they can be detrimental, leading to them being selected against.  Or they can be neutral, in which case whether they propagate may come down to the random fluctuations of genetic drift.
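
To make the three cases concrete, here is a minimal sketch of a haploid Wright-Fisher simulation. This is a standard textbook model, not anything taken from the article. The population size, selection coefficients, trial count, and function names are all assumptions chosen for illustration. It estimates how often a single new mutation ends up taking over the whole population when it is beneficial, neutral, or detrimental.

```python
import random


def wright_fisher(pop_size, start_count, s, rng, max_generations=100_000):
    """Run one haploid Wright-Fisher population until the mutant allele
    is lost (returns False) or fixed (returns True).

    s is the mutant's selection coefficient (hypothetical values):
    s > 0 beneficial, s < 0 detrimental, s == 0 neutral (pure drift).
    """
    count = start_count
    for _ in range(max_generations):
        if count == 0:
            return False
        if count == pop_size:
            return True
        p = count / pop_size
        # Selection shifts the expected mutant frequency in the next generation.
        p_next = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift: each offspring independently inherits the mutant with prob p_next.
        count = sum(rng.random() < p_next for _ in range(pop_size))
    return count == pop_size


def fixation_fraction(s, pop_size=100, trials=2000, seed=0):
    """Fraction of trials in which a single new mutant copy reaches fixation."""
    rng = random.Random(seed)
    fixed = sum(wright_fisher(pop_size, 1, s, rng) for _ in range(trials))
    return fixed / trials


if __name__ == "__main__":
    for label, s in (("beneficial", 0.1), ("neutral", 0.0), ("detrimental", -0.1)):
        print(label, fixation_fraction(s))
```

Under these assumed settings, the beneficial mutation should fix far more often than the neutral one, while the neutral one should fix in roughly 1 run in 100 (one over the population size), purely through drift.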

This is complicated by the fact that phenotypic (observed) traits typically arise from the complex interactions of proteins produced by individual genes.  So a beneficial trait might be paired with a detrimental one, with whether the combination propagates depending on how the mix of benefit and detriment works out.

An aspect of any trait or mechanism is how much energy it needs.  The trait or mechanism might be neutral or perhaps even mildly beneficial, but if it’s costly in terms of energy, it’s likely to end up falling on the detrimental side of the ledger.  Although if it’s very beneficial, then even being costly in terms of energy might not matter.  (The brain is a prime example of this latter case, an energy hungry organ that nonetheless earns its keep.)

Energy is what I’m not sure about with these additional letters.  How much chemical energy do they require to be incorporated into the DNA structure?  The answer might not matter for any artificial applications we come up with, such as DNA storage, but if they require more energy to form, that might be why we don’t see them in nature.

Along those lines, I’d be very interested if anyone has seen information on this aspect of the development.  Or, as usual, if I’m missing anything here.

21 thoughts on “Synthetic DNA and the necessity of biological mechanisms”

  1. Good point. There probably is a trade off on the complexity of a system and its expressive power.

    One thing: an eight-symbol alphabet just means more storage given the same number of symbols. The same amount of information can be stored in any alphabet, but sequences may be longer or shorter. Given that human DNA has a lot of junk in it, I’m not sure what extra information storage really buys.
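
The alphabet-size point can be put in numbers with a small sketch. It assumes an idealized code in which every letter is equally usable, so each letter carries log2(alphabet size) bits; the function names are made up for illustration.

```python
import math


def bits_per_symbol(alphabet_size):
    """Information carried by one letter of an idealized alphabet."""
    return math.log2(alphabet_size)


def symbols_needed(total_bits, alphabet_size):
    """Shortest sequence length that can distinguish 2**total_bits messages."""
    if alphabet_size < 2:
        raise ValueError("alphabet needs at least two letters")
    messages = 2 ** total_bits
    length, capacity = 0, 1
    while capacity < messages:
        capacity *= alphabet_size  # one more letter multiplies the possibilities
        length += 1
    return length


if __name__ == "__main__":
    for size in (2, 4, 6, 8):
        print(size, bits_per_symbol(size), symbols_needed(24, size))
```

In this idealized picture a four-letter alphabet carries 2 bits per letter and an eight-letter alphabet carries 3, so the same 24 bits take 12 letters in the first case and 8 in the second. The information is the same; only the sequence length changes.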

    1. Ah, well there is that too. But I was wondering more specifically about the energy to incorporate the new molecules that make up the new letters. My suspicion is that the natural ones probably take a less energetic chemical reaction; in other words, they require less of an organism’s hard-earned energy to do their thing.

      It does seem like the additional letters allow for higher information density. If someone is planning to store information in synthetic DNA, that could be useful. (Although it’s not clear to me how useful DNA storage is as an actual technology.) But to your point, I’m not sure it’s at all useful for living organisms.


      1. Just to be specific, we’re talking about: cytosine [C], guanine [G], adenine [A] or thymine [T] (and uracil [U], which replaces thymine in RNA). AIUI, these “letters” occur in three-letter groups that are the actual protein-encoding symbols.

        I have no idea how those “letter” molecules are formed. Are they taken from food or synthesized by the body? And I certainly don’t know about energy requirements in either synthesizing or using one molecule over another.

        I have heard that one of them (guanine?) doesn’t seem to have a natural path to synthesis in pre-biotic Earth, so there’s some mystery associated with how RNA got going in the first place. (Panspermia theories are one way around that.)

        Physical systems, including biological ones, are all about free energy physics, so I’m sure you’re right about the cost of additional chemicals, both making and using them.


        1. Those were the ones I meant. I think they are synthesized from molecules that ultimately come from food.

          I hadn’t heard that about guanine. I wonder if it has something to do with the enzymes needed to synthesize it? In principle, it seems like anything that can be synthesized in biology can happen abiotically, although some reactions need either a catalyst or a lot of energy, either of which might have been available on the early Earth.

          I’m personally not much of a fan of Panspermia. I think it just pushes the problem back. At some point abiogenesis had to happen. To me, moving it to space just makes it even harder to account for.


          1. Assuming you meant “either of which might NOT have been available,” yes, right, that’s how I understood it. Given the organic soup believed to have been present, plus energy (lightning, electrical discharge), these molecules are created. But one of them was much less likely.

            I completely agree about Panspermia. Never found much merit in the idea.

    2. This seems like a good place for a brief lesson in molecular biology, so really quick:

      Proteins are made of chains of amino acids. All the proteins we know about are made from about 20 amino acids, give or take 1 or 2. I can see aliens using different amino acids, but it seems likely to be a number in that range.

      Assuming a double-helix model, a digital code for storing the information would be in multiples of 2, since the letters come in pairs. With just two letters (molecules) in your code, you would need a codon of 5 letters to specify one amino acid. A codon of 4 could only specify 16 amino acids, which is not enough. A codon of 5 could specify 32, so you would have some redundancy.

      With 4 DNA letters (molecules) in your code, a codon of 2 letters can only specify 16 amino acids (not enough), and a codon of 3 letters can specify 64 amino acids. Lots of redundancy, which is what we have.

      With 6 molecules in your code, a codon of 2 letters would code for 36 amino acids, which is more than enough, but not as much redundancy as 64. A codon of 3 letters would code for 216, i.e., likely overkill. Thus, there would be no benefit of going to 8.
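
The arithmetic in the last three paragraphs is just letters raised to the power of the codon length. A short sketch, assuming a target of 20 amino acids and ignoring stop signals (the function names are invented for illustration):

```python
def codon_capacity(letters, codon_length):
    """Number of distinct codons of a given length over an alphabet."""
    return letters ** codon_length


def min_codon_length(letters, amino_acids=20):
    """Shortest codon that can give every amino acid its own codon."""
    if letters < 2:
        raise ValueError("alphabet needs at least two letters")
    length = 1
    while codon_capacity(letters, length) < amino_acids:
        length += 1
    return length


if __name__ == "__main__":
    for letters in (2, 4, 6, 8):
        n = min_codon_length(letters)
        print(f"{letters} letters: codon of {n}, {codon_capacity(letters, n)} codons")
```

This reproduces the figures above: 2 letters need a codon of 5 (32 codons), 4 letters need 3 (64), and 6 letters need 2 (36). For 8 letters a codon of 2 already gives 64.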

      So what are the benefits of using 6 letters in your code with a codon of 2? Your coding medium uses 33% less material. What are the detriments? You need 2 more metabolic pathways, which will require some number of extra proteins, which are additional points of potential failure. You also have less redundancy in the code, which means any given mutation is more likely to be lethal.

      There are probably better arguments, but I would guess that 4 letters in the code is a general sweet spot between too much complexity with more than 4 and insufficient robustness (redundancy) with fewer than 4.

      So I think the main point of the article/research is that a four letter code might easily use different letters. I’m going to guess that the energy requirement is insignificant, as each of the new letters is not likely to be too exotic if it can fit into the helix and is long-term stable. All you need is a decent enzymatic pathway, and those are a dime a dozen. I’m not saying such energy problems are out of the question. Just saying that they probably would not have done the experiments if the things were that hard to make.


      1. Thanks for the lesson. If I recall correctly, the extra redundancy also allows for multiple codons to specify the same amino acid, with variations near each other specifying the same one, which seems like a hedge against both replication and transcription errors.
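
The hedging effect of redundancy can be checked against the standard genetic code. The sketch below builds the usual 64-codon table (bases in TCAG order) and counts what fraction of single-letter substitutions at each codon position leave the encoded amino acid unchanged. Counting a stop-to-stop change as "unchanged" is a simplifying assumption, and the function name is invented for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...; "*" is stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction(position):
    """Fraction of single-base substitutions at one codon position (0, 1, 2)
    that leave the encoded amino acid (or stop signal) unchanged."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue  # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


if __name__ == "__main__":
    for position in range(3):
        print(f"codon position {position + 1}: {synonymous_fraction(position):.2f}")
```

The third position should come out far more tolerant of substitutions than the first two, which is the "variations near each other specify the same amino acid" pattern.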

        Unless you know the chemistry well, I think you might be hasty in dismissing the energy necessary for forming the chemical bonds. The microbiology and astrobiology I’ve read note it as an issue. A fairly modest difference, when compounded across innumerable replications over billions of years, can make a major difference.
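
As a rough illustration of how a modest difference compounds, here is a sketch of the textbook deterministic selection recursion for two competing variants. It assumes an energy saving translates directly into a relative fitness advantage s for the cheaper variant, which is a large simplification; the values of s, the start and target frequencies, and the function name are all hypothetical.

```python
def generations_to_dominate(s, start=0.01, target=0.99):
    """Generations for a variant with relative fitness advantage s > 0
    to rise from frequency `start` to `target`, ignoring drift."""
    if s <= 0:
        raise ValueError("s must be positive for the variant to spread")
    p = start
    generations = 0
    while p < target:
        # Standard haploid selection update: the fitter variant's share grows.
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations


if __name__ == "__main__":
    for s in (0.01, 0.001):
        print(f"advantage {s}: {generations_to_dominate(s)} generations")
```

Even a 0.1% advantage takes the cheaper variant from rare to dominant in on the order of ten thousand generations in this toy model, which is brief compared with billions of years of microbial replication.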

        Why wouldn’t a code with 8 letters, able to code for 64 amino acids with a codon of 2, be an advantage? (Assuming the metabolic pathways issue doesn’t negate it.)


        1. Re: advantage of 8 letter code. Clearly you have efficiency. But I see two difficulties. First would be the complexity of evolving an 8 letter code without locking in a simpler version (local maximum, like Paul pointed out). But another might be error correction. With a 3 letter codon, a point mutation knocks out 33% of the information, but for a 2 letter codon, said mutation knocks out 50% of the info.



      2. “Replication of DNA and synthesis of proteins are studied from the view-point of quantum database search. Identification of a base-pairing with a quantum query gives a natural (and first ever!) explanation of why living organisms have 4 nucleotide bases and 20 amino acids. It is amazing that these numbers arise as solutions to an optimisation problem.”

        Click to access 0002037.pdf


  2. A computer has a two-symbol alphabet: 0 and 1. A neural system also has a two-symbol alphabet: excitation and inhibition. As you say, it’s amazing the complexity that can be generated out of simplicity.

    1. What’s interesting about neural systems is that, although the action potential is a binary event (it either fires or it doesn’t), whether it fires in a particular neuron depends on the strength and frequency of the excitations and inhibitions coming in. John Dowling describes it as a translation between amplitude modulation (the neural inputs) and frequency modulation (the rate of action potential firings).
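
The amplitude-to-frequency idea can be sketched with a toy leaky integrate-and-fire neuron. This is a generic textbook simplification, not Dowling's model; the time constant, threshold, input values, and function name are all assumptions picked for illustration. The input stands for net drive (excitation minus inhibition), and the output is a count of all-or-nothing spikes.

```python
def spike_count(net_input, steps=1000, dt=1.0, tau=20.0, threshold=1.0):
    """Count spikes from a toy leaky integrate-and-fire neuron.

    net_input: constant net drive (excitation minus inhibition), arbitrary units.
    The membrane value leaks toward zero with time constant tau and is
    reset to zero after each spike.
    """
    v = 0.0
    spikes = 0
    for _ in range(steps):
        # Euler step of dv/dt = net_input - v / tau
        v += dt * (net_input - v / tau)
        if v >= threshold:
            spikes += 1  # the spike itself is binary: it fires or it doesn't
            v = 0.0
    return spikes


if __name__ == "__main__":
    for drive in (0.04, 0.1, 0.2):
        print(f"net input {drive}: {spike_count(drive)} spikes")
```

With these assumed parameters, the weakest input never reaches threshold, and stronger inputs produce more spikes in the same time window: input amplitude is turned into firing rate.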

        1. I know lateral inhibition is used in the retina in recognizing edges, and I think I recall reading somewhere that it’s in other sensory areas, but I’m not aware of it being a major force in the CNS.


          1. Lateral inhibition is the neural mechanism which creates sensations (boundary lines, colors, sounds, smells, tastes), everything the CNS has to work with. George Wald and Keffer Hartline shared the Nobel Prize in Physiology or Medicine in 1967 (together with Ragnar Granit) for their discoveries of its role in the creation of color in the human retina and boundary lines in the eccentric cells of the compound eye of the horseshoe crab. I think it is fundamental to understanding how consciousness arises in the human brain, and I discuss it in a couple of essays on my website.

  3. As a generalization, I think it makes sense that alien DNA might not be put together with the same “letters” as our own. But if someone’s trying to say specifically which “other letters” we’ll find in alien DNA, I wouldn’t take that too seriously. We can’t predict what alien DNA might be like until we actually find some.

    1. I don’t think there’s any assertion that these specific letters would be in alien DNA, just that it shows that alternate letters are possible. The question is, are there factors, such as the amount of chemical energy required, that had an effect on the letters our DNA uses? If so, then alien DNA might use the same letters.

      But I agree we’re not in any position to know at this point.

  4. Evolution finds local optima, which frames my default hypothesis about other DNA “letters”: they might not work well in an environment in which so much else is encoded in the prevailing letters. In which case, we don’t need to assume that other letters have any inherent disadvantage. Alien DNA might have a different starting point – which then would lead to everything J.S. Pailly says above.

Your thoughts?
