COMPLEXITY EXPLAINED: 9. How Did Complex Molecules Like Proteins and DNA Emerge Spontaneously?
Note: For previous parts of Dr. Wadhawan’s series on complexity, check out the ‘Related Posts’ found at the bottom of this article.
How could the blind forces of Nature create large and highly information-laden molecules like DNA and proteins just by random processes? DNA carries information for the synthesis of proteins, but it requires the prior availability of certain protein molecules for performing its genetic duties. Such proteins help the double-helix DNA molecule to uncoil itself and split into two strands for replication purposes. Therefore, DNA and certain proteins must have emerged independently, by some efficient (and therefore reasonably likely) chemical processes. But how? The answer has to do with the chemical evolution of autocatalytic sets of molecules, which could consume energy-rich molecules and other precursors (‘food’) to ‘reproduce’. These molecules were the predecessors of proteins and DNA etc., and thence of life.
Catalysis is a process that facilitates or speeds up a chemical reaction. Often, a chemical process may involve two or more intermediate reactions. A catalyst is a molecule that speeds up the production of an end product of the chemical process by participating in the intermediate reactions, but separates at the end of the chain of reactions, thus becoming available all over again for further catalysis. Often, a chemical reaction may almost never occur if no catalyst is present. Enzymes are examples of proteins that assist (i.e. catalyze) chemical reactions in biological systems.
Photosynthesis carried out by green plants in the presence of sunlight is another familiar example of catalysis. Chlorophyll is the catalyst here. Through a number of intermediate reactions, the net reaction is as follows:
6CO2 + 6H2O + light energy → C6H12O6 (glucose) + 6O2
Photons from the Sun make this reaction possible, and their energy gets stored in the form of chemical energy, resulting in an increase in the degree of complexity, or information content. Living organisms consume the energy stored in glucose, and some of it gets converted into more complex or more information-rich forms. Of course, not all photons from the Sun falling on our ecosphere get utilized like this. Most of them just dissipate their energy, with a corresponding increase of entropy.
A polymer is a very long molecule, made up by covalent bonding among a large number of repeat units (monomers). There can be variations, either in that the bonding is not covalent everywhere, or in that not all subunits are identical. Examples of polymers and polymer solutions include plastics such as polystyrene and polyethylene, glues, fibres, resins, proteins, and polysaccharides like starch.
A homopolymer consists of a single type of repeat unit. A copolymer has more than one type of repeat unit. A random copolymer has a random arrangement of two or more types of repeat units.
Sequenced copolymers differ from random copolymers in that, although the sequence of different subunits is not periodic, it is not completely random either. Biopolymers like DNA and proteins are examples of this. Their very specific sequence of subunits, ordained by Nature (through the processes of evolution), results in particular properties. Proteins in humans are sequenced copolymers built from ~20 different amino acids. The sequence of amino acids in a protein enables it to fold and self-organize into a very specific 3-dimensional configuration.
How do short polymers form spontaneously in Nature? Recall the lock-and-key mechanism outlined in Section 8.4 (Part 8). Suppose a monomer has a shape and charge distribution such that another monomer can fit snugly into some part of it. There are random collisions among the monomers in a fluid medium, and usually they do not stick together, and simply bounce off after a collision. But once in a while the collision may be such that the two monomers have just the right orientation for a lock-and-key fitting. Then the chances of the two sticking together and forming a stable dimer are much larger. Dimers can lead to trimers, and so on, resulting in a polymer. Naturally, this can be a rather unlikely and therefore very slow process, and only short polymers can possibly form spontaneously in reasonable time.
9.3 Cell Biology
All tissues in animals and plants are made up of cells, and all cells come from other cells. A cell may be either a prokaryote or a eukaryote. The former is an organism that has neither a distinct nucleus, nor other specialized subunits or organelles. Examples include bacteria and blue-green algae. Unicellular organisms like yeast are eukaryotes. Eukaryotic cells are separated from the environment by a semi-permeable membrane. Inside the membrane there is a nucleus and the cytoplasm surrounding it. Multicellular organisms are all made up of eukaryote-type cells. In them the cells are highly specialized, and perform the function of the organ to which they belong.
The nucleus contains nucleic acids, among other things. With the exception of viruses, two types of nucleic acids are found in all cells: RNA (ribonucleic acid) and DNA (deoxyribonucleic acid). Viruses have either RNA or DNA, but not both (but then viruses are not cells). Apart from the nucleus, a eukaryotic cell has mitochondria, ribosomes, and vacuoles. Plant cells also have chloroplasts. Mitochondria extract usable energy from food. Ribosomes make proteins. Vacuoles are used for storage of water or food. Chloroplasts use sunlight to create food by photosynthesis.
DNA is a long molecule that has the genetic information encoded in it as a sequence of four different nucleotides, distinguished by the bases they carry: adenine (A), thymine (T), guanine (G), and cytosine (C). There is a double backbone of phosphate and sugar molecules, each carrying a sequence of the ‘bases’ A, T, G, C. This backbone is coiled into a double helix (like a twisted ladder). In this double-helix structure, base molecule A bonds almost always to base molecule T (via weak hydrogen bonds), and G bonds to C. The sequence of base pairs defines the primary structure of DNA.
DNA contains the codes for manufacturing various proteins. Production of a protein in the cell nucleus involves transcription of a stretch of DNA (this stretch is called a gene) into a portable form, namely the messenger RNA (or mRNA). This messenger then travels to the cytoplasm of the cell, where the information is conveyed to the ribosome. This is where the encoded instructions are used for the synthesis of the protein. The code is read, and the corresponding amino acid is brought into the ribosome. Each amino acid comes connected to a specific transfer RNA (tRNA) molecule; i.e. each tRNA carries a specific amino acid. There is a three-letter recognition site on the tRNA that is complementary to, and pairs with, the three-letter code sequence for that amino acid on the mRNA.
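The two steps described above (transcription of a gene into mRNA, then codon-by-codon translation at the ribosome) can be illustrated with a minimal Python sketch. The gene sequence is a made-up example, the function names are hypothetical, and the codon table contains only five entries of the standard genetic code, so this is an illustration rather than a usable translator. It also assumes the sequence given is the coding strand, so that transcription amounts to replacing T with U.

```python
# A deliberately tiny excerpt of the standard genetic code (5 of 64 codons).
CODON_TABLE = {
    "AUG": "Met",   # also serves as the start signal
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "STOP",  # one of the three stop codons
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list:
    """Read the mRNA three letters at a time until a stop codon appears."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide

gene = "ATGTTTGGCGCTTAA"        # hypothetical 15-letter gene
mrna = transcribe(gene)
print(mrna)                     # AUGUUUGGCGCUUAA
print(translate(mrna))          # ['Met', 'Phe', 'Gly', 'Ala']
```

In a real cell the lookup in CODON_TABLE is performed physically: each tRNA pairs its three-letter recognition site with the matching codon and delivers the amino acid it carries.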
The one-way flow of information from DNA to RNA to protein is the basis of all life on Earth. This is the central dogma of molecular biology.
Three letters (out of the four bases, which on an mRNA are A, U, C, and G, with uracil (U) taking the place of thymine (T)) are needed to code for any particular amino acid. The term codon is used for three consecutive letters on an mRNA. The possible number of codons is 4 × 4 × 4 = 64, whereas only 20 amino acids are specified by these codons. Most amino acids can therefore be coded by more than one codon.
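The count of 64 follows from simple combinatorics: each of the three positions in a codon can hold any one of four bases. A short Python sketch makes this concrete by listing every codon. The alphabet used here is that of mRNA, where U stands in for the T of DNA.

```python
from itertools import product

BASES = "ACGU"  # mRNA alphabet; U replaces the T found in DNA

# Every ordered triple of bases is one possible codon.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]

print(len(codons))   # 64
print(codons[:4])    # ['AAA', 'AAC', 'AAG', 'AAU']
```

With 64 codons and only 20 amino acids to specify, the code is necessarily redundant.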
There are ~60-100 trillion cells in the human body. In this multicellular organism (as also in any other multicellular organism), almost every cell (red blood ‘cells’ are an exception) has the same DNA, with exactly the same order of the nucleotide bases. The nucleus contains 95% of the DNA, and is the control centre of the cell. The DNA inside the nucleus is complexed with proteins to form a structure called chromatin.
The fertilized mother cell (the zygote) divides (self-replicates) into two cells. Each of these again divides into two cells, and so on. Before this cell division (mitosis) begins, the chromatin condenses into elongated structures called chromosomes. A gene is a functional unit on a chromosome, which directs the synthesis of a particular protein. As stated above, the gene is transcribed into mRNA, which is then translated into the protein. Humans have 23 pairs of chromosomes. Each pair has two non-identical copies of chromosomes, derived one from each parent.
During cell division, the double-stranded DNA splits into the two component strands, each of which acts as a replication template for the construction of the complementary strand. ‘Complementary strand’ means that for every A on the original template there is a T on the new strand; similarly, there is a C for every G, an A for every T, and a G for every C. At every stage, the two daughter cells are of identical genetic composition (they have identical genomes). In each of the 60 trillion cells in the human body, the genome consists of around three billion nucleotides.
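The pairing rule is mechanical enough to be written as a few lines of Python. The sketch below is only an illustration: the template sequence is made up, the function name is hypothetical, and the code ignores the fact that the two strands of real DNA run in opposite directions.

```python
# Pairing rules as stated in the text: A pairs with T, and G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(template: str) -> str:
    """Return the base-by-base complement of a DNA template strand."""
    return "".join(PAIRING[base] for base in template.upper())

template = "ATGCCGTA"                    # hypothetical example sequence
new_strand = complementary_strand(template)
print(new_strand)                        # TACGGCAT
print(complementary_strand(new_strand))  # ATGCCGTA
```

Taking the complement twice returns the original sequence, which is why each separated strand carries enough information to rebuild the full double helix.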
9.4 Autocatalytic Sets of Molecules
Life depends on molecules of DNA, RNA, proteins, polysaccharides, etc. How did such large molecules get synthesized ‘spontaneously’ from their building blocks, namely nucleotides, amino acids, sugars, etc.? DNA and RNA have the crucial self-replication property. If we can explain their appearance on Earth, then self-replication and Darwinian natural selection can account for the emergence of simple life forms, as also their evolution into more and more complex life forms. Invoking random chance processes for the creation of large molecules like DNA, which are bearers of genetic information, is not a tenable idea because of the minuscule probability, and the correspondingly long time required for this to happen. In any case, there is no evidence that the origin of life on Earth can be equated with the appearance of DNA.
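To get a feel for how minuscule such probabilities are, one can count the number of distinct sequences of a given length. The Python sketch below does this on a logarithmic scale. The chain lengths (100 amino acids, 1000 nucleotides) are arbitrary illustrative choices, and treating every sequence as equally likely is a simplifying assumption, not a claim about real chemistry.

```python
import math

def log10_sequence_count(alphabet_size: int, length: int) -> float:
    """log10 of the number of distinct chains of the given length."""
    return length * math.log10(alphabet_size)

# A protein-like chain: 20 kinds of amino acid, 100 positions.
print(round(log10_sequence_count(20, 100)))   # 130

# A DNA-like chain: 4 kinds of nucleotide, 1000 positions.
print(round(log10_sequence_count(4, 1000)))   # 602
```

So a random draw would hit one specific 100-unit protein sequence with a probability of roughly 1 in 10^130 under these assumptions, which is why purely random assembly is not a tenable explanation.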
The answer came through the idea of autocatalysis. Autocatalytic sets of molecules are those which can catalyse the synthesis of themselves. Autocatalysis requires that a given ‘factor’ (say A) should be able to convert a substrate or precursor B into a new factor of the same type: A + B → 2A + C. Melvin Calvin (1969) introduced the idea of autocatalysis as a mechanism for molecular selection, with implications for how life emerged on Earth.
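The reaction A + B → 2A + C can be explored numerically. The following Python sketch assumes a simple mass-action rate law (reaction rate proportional to the product of the concentrations of A and B) and uses arbitrary, made-up values for the rate constant and starting concentrations; none of these numbers come from the sources cited here.

```python
def simulate_autocatalysis(a0, b0, k, dt, steps):
    """Euler integration of A + B -> 2A + C, assuming rate = k * [A] * [B]."""
    a, b, c = a0, b0, 0.0
    history = [(a, b, c)]
    for _ in range(steps):
        converted = k * a * b * dt   # amount of B consumed in this time step
        a += converted               # each reaction event yields one extra A
        b -= converted
        c += converted
        history.append((a, b, c))
    return history

# Hypothetical starting point: a trace of A in a large pool of precursor B.
history = simulate_autocatalysis(a0=0.001, b0=1.0, k=1.0, dt=0.01, steps=2000)
for step in (0, 500, 1000, 2000):
    a, b, c = history[step]
    print(f"t={step * 0.01:5.1f}  A={a:.4f}  B={b:.4f}  C={c:.4f}")
```

Under these assumptions the amount of A grows slowly at first, then ever faster as more A becomes available to catalyse its own production, and finally levels off once the precursor B is used up.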
There was little or no molecular oxygen (O2) in the original atmosphere around the Earth. A variety of local energy sources were, of course, present (undersea hydrothermal vents; ultraviolet radiation; volcanic energy; radioactive nuclei; lightning; meteoric impacts). Under these conditions, amino acids, nucleotides, and other building blocks of the future living organisms got synthesized in the seas, and in the rock structures, and in the atmosphere around the Earth. Several energy-rich molecules like H2S, FeS, H2, phosphate esters, HCN, pyrophosphates, and thioesters, were also produced.
Thus, in this so-called primordial soup, namely a fluid in contact with rocks of various types, there existed small molecules of amino acids, sugars etc. Given enough time, some of them must have undergone random polymerization reactions of various types, producing short polymers. It is entirely possible that at least some of these end-products, with some side chains and branches hanging around, acted as catalysts for facilitating the production of other molecules which may also be catalysts for another chemical reaction. Thus: A facilitates the production of B, and B does the same job for C, and so on. Given enough time, and a large enough pool containing all sorts of molecules, it is quite probable that, at some stage a molecule, say Z, will get formed (aided by catalytic reactions of various types), which would be a catalyst for the formation of the catalyst molecule A we started with.
Once such a loop closes on itself, it would head towards what we now call self-organized criticality (and order). There will be more production of A, which will lead to more production of B, and so on. The plausibility advantage of this scenario visualised by Stuart Kauffman is that there is no need to wait for random reactions for the spontaneous formation of large molecules. And once a threshold has been crossed, the system is likely to inch towards the edge of chaos, and acquire robustness against destabilizing agencies.
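The idea that a chain of catalysts can close on itself is easy to illustrate with a toy model. The Python sketch below is not Kauffman's actual model: it simply treats molecular species as nodes, draws an arrow from X to Y when X catalyses the production of Y, and checks whether any closed loop exists. The function names, the species count, and the catalysis probabilities are all hypothetical.

```python
import random

def random_catalysis_graph(n_species, p, seed=0):
    """Each species catalyses each other species with probability p."""
    rng = random.Random(seed)
    return {
        i: [j for j in range(n_species) if j != i and rng.random() < p]
        for i in range(n_species)
    }

def has_closed_loop(graph):
    """Depth-first search for a directed cycle (A -> B -> ... -> A)."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node):
        state[node] = ACTIVE
        for product in graph[node]:
            if state[product] == ACTIVE:      # reached a node on the current path
                return True
            if state[product] == UNSEEN and visit(product):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNSEEN and visit(node) for node in graph)

# A facilitates B, and B facilitates C: no loop yet.
print(has_closed_loop({"A": ["B"], "B": ["C"], "C": []}))     # False
# Now C also facilitates A: the loop closes.
print(has_closed_loop({"A": ["B"], "B": ["C"], "C": ["A"]}))  # True

# How often do random pools of 20 species contain a loop?
for p in (0.01, 0.05, 0.2):
    hits = sum(has_closed_loop(random_catalysis_graph(20, p, seed=s))
               for s in range(200))
    print(f"p={p}: {hits}/200 random pools contain a closed loop")
```

The point of the last few lines is qualitative: as the chance of any one molecule catalysing another rises, closed loops stop being rare accidents and become the norm.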
Kauffman argued that this order, emerging out of molecular chaos, was akin to life: The system could consume (metabolize) raw materials, and grow into more and more complex molecules. It progressed into a situation where the forebears of DNA started appearing, with potential for replication. An era of chemical Darwinism or molecular Darwinism followed next, in which autocatalytic systems of molecules competed with one another for the limited supply of precursors and energy-rich molecules. These sets of autocatalytic molecules had at least three of the features of what constitutes life: They ‘ate’ the energy-rich molecules; they reproduced themselves; and they competed with other autocatalytic sets of molecules for survival.
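The competition described above can be sketched numerically. The Python code below is a toy model under stated assumptions: two autocatalytic sets grow by consuming a shared ‘food’ supply that is replenished at a constant rate, both break down at the same rate, and they differ only in how efficiently they convert food into copies of themselves. All parameter values and names are invented for illustration.

```python
def compete(k1, k2, decay, inflow, dt, steps):
    """Two autocatalytic sets (x1, x2) feeding on one shared food pool."""
    x1, x2, food = 0.01, 0.01, 1.0          # hypothetical starting amounts
    for _ in range(steps):
        growth1 = k1 * x1 * food            # food converted into more of set 1
        growth2 = k2 * x2 * food            # food converted into more of set 2
        x1 += (growth1 - decay * x1) * dt
        x2 += (growth2 - decay * x2) * dt
        food += (inflow - growth1 - growth2) * dt
    return x1, x2, food

# Set 1 is the more efficient 'eater' (k1 > k2); everything else is equal.
x1, x2, food = compete(k1=1.0, k2=0.8, decay=0.1, inflow=0.1,
                       dt=0.01, steps=50000)
print(f"efficient set: {x1:.4f}   less efficient set: {x2:.6f}   food: {food:.4f}")
```

With these made-up numbers the more efficient set ends up dominating, while the less efficient one dwindles towards zero. This is the molecular-level analogue of natural selection that the text describes.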
The probability is next to nil that highly complex molecules like RNA, DNA and proteins got created spontaneously through purely random or chance processes. However, the nearly-impossible became possible, i.e. the unlikely set of events became likely, through the mechanism of autocatalysis. As John Avery has pointed out in his book Information Theory and Evolution (2003), ‘A notable feature of autocatalysis (apart from providing a credible mechanism for the origin of life) is that it has the seeds of natural selection at the molecular level: The precursor molecules and the energy-rich molecules are ‘food’. And the alternative autocatalytic systems compete for this supply of food. The efficient ones have a better chance of dominating and winning (through faster reproduction). Supply of free energy, of course, was/is the prerequisite for all this to become possible.’
Once a set of autocatalytic reactions had established itself, it went on incrementally evolving into still more complex sets of molecules. Chance events and/or new external conditions resulted in the emergence of a slightly more complex version of, say, one of the molecules in the autocatalytic set. A further round of chemical Darwinism and the evolution of a new autocatalytic set of molecules followed. And so on, till molecules as complex as RNA, DNA and proteins emerged on the scene, which have life-sustaining and life-propagating properties.
This explanation is an important milestone in our quest for understanding in a rational manner the origin, or origins, of life on Earth. But what is life? I shall address this question in the next article in this series.
“The more we learn about the unbelievably complex, immensely varied, and yet simultaneously simple origin and development of life on earth, the more it looks like a miracle, and one that is still unfolding. The miracle of evolution.”