As argued in Part 4 of this series of articles, the Shannon formula for information and the Boltzmann-Gibbs formula for entropy provide equivalentdescriptions of the same combinatoric problem. One can formally equate these two formulations and work out the energy required for one bit-flip of information-processing. This equivalence between negative entropy and information is very important in the context of understanding complexity. But the fact remains that the communication of information between two individuals is not independent of their prior states of knowledge and the nature of the language they use for communication of information. Thus, information has a subjective, particularly an anthropocentric, aspect also, as will become clear as we discuss here some of the ways of describing and quantifying complexity.
The Nobel Laureate Murray Gell-Mann is best known to the physics community for his seminal contributions to particle physics. He was awarded the Nobel Prize in 1969 for work which led to his discovery of quarks (the basic building blocks of all atoms, along with leptons). What is less well known is that he also wrote one of the most insightful and pioneering books on complexity. Published in 1994, the book has the interesting title: The Quark and the Jaguar: Adventures in the Simple and the Complex. Quarks have no individuality. They are governed by the laws of quantum mechanics, like everything else in the universe. And yet the same laws of quantum mechanics result in the evolution of complexity from quarks to jaguars and other creatures, which do have individuality. How this happens was the subject matter of Gell-Mann’s book. The title of the book came from a poem by the Chinese-American poet Arthur Sze:
The world of the quark has everything to do with a jaguar circling in the night.
Flowers are beautiful examples of what purposeless and unpredictable evolution of complexity can result in. I like to photograph flowers, and have been embellishing this series of articles with some of the pictures I have taken. Next time you see a flower, think complexity! A flower is a work of art, except that there is no artist involved anywhere in the creation of the art.
I present here a brief survey of some of the ways of identifying and defining complexity, drawing freely from the book by Gell-Mann.
Computational complexity, as we have seen in Part 4, is a measure of the shortest amount of computation needed, and therefore the shortest amount of computer time needed, for the solution of a problem. Although this time will naturally depend on the type of computer used and the efficiency of the computer program, one can still work with an asymptotic limit: For a given computer and a given computer program, how does the minimum necessary solution time increase as the size of the problem increases? The degree of complexity of a system depends on the level of detail at which we agree to, or are able to, describe it. Computational complexity is a measure of the effort required to obtain an agreed amount of details of information.
LENGTH OF DESCRIPTION
Complexity is not always about solving problems on a computer. Quite often we need to describe a complex system to an agreed level of detail. The notion of coarse graining is familiar in science and elsewhere. In principle, there is no end to the level of detail at which we can describe the various components of a system and the interactions among them. Practical considerations, as also matters of relevance, demand that we stop at some level of detail, and ignore (or average out, or coarse-grain) the details below that level. Naturally, the degree of complexity of a system, and therefore the length of the description for it, will depend on the level of detail at which we describe it.
Consider a system of N individuals or ‘agents’ comprising a complex system. The length of description of such a system will depend on the degree of interaction or communication among the agents. If none of the agents interacts with any other, then the length of description is the least. And it is about the same as (i.e., the least) when all possible pairs of agents interact with each other. If the nature of these pairwise interactions is more complicated (e.g. if only certain pairs interact), then the length of description, and therefore the degree of complexity, is higher.
To the extent that the length of description of a system is a measure of the degree of complexity, the complexity depends on the context of description, and on our previous knowledge about the systematic or regular features present in the system. It depends on who is describing the system to whom, and on the kind of language they both understand. The length of description, and the degree of complexity, can collapse in the light of new insights, or new laws discovered. Thus, degree of complexity has a strong anthropocentric underpinning.
It is conceivable that words or other descriptors are used inefficiently for describing a complex system. Therefore, Gell-Mann defines what he calls ‘crude complexity’ as follows: It is ‘the length of the shortest message that will describe a system, at a given level of coarse graining, to someone at a distance, employing language, knowledge, and understanding that both parties share (and know they share) beforehand.’ Subjectivity or arbitrariness is inherent in the definition of crude complexity, arising from factors like coarse-graining and the language of description.
ALGORITHMIC INFORMATION CONTENT
The minimum number of bits (i.e. strings of zeroes and ones) needed to store the algorithm needed for computing the data or information for describing the structure and function of a system (and then stopping the computation), i.e. for ‘explaining’ or ‘summarizing’ the system, is called the algorithmic information content (AIC) of the system. The larger this number is, the higher is the degree of complexity of that system. Thus AIC is a measure of how hard it is to represent a text or a bit-stream using a computer program. It is the length (in bits) of the shortest computer program that can produce that text or bit-stream as output.
If a set of data or a number (a digit stream) is such that the number of bits required to store and implement the algorithm needed for explaining or generating it is no shorter that the data or the number itself, the digit stream has no redundancy, and is said to be incompressible. Such digit streams are called irreducible or algorithmically random. Gregory Chaitin (2006) has shown that certain facts (digit streams) are not just computationally irreducible; they are logically irreducible as well: The proof of their ‘truth’ must be in the form of additional axioms, without any reasoning.
It can be shown mathematically that most digital streams or bit strings of a given length are incompressible. A peculiar feature of AIC is that it is generally not computable. Even though most bit strings are algorithmically random, it is not possible to know which ones are. In general, it is not possible to be sure that the AIC of a given bit string is not lower than what we think it is. This follows from the work of Gödel, and more recently of Chaitin, as discussed earlier in Part 4. There may always be a theorem we do not know about, or an as yet undiscovered algorithm, or any such insight that may result in a collapse of the AIC. There is no way of knowing all possible theorems that may result in a further compression of the AIC.
I am reminded of a statement by Bertrand Russell:
The whole problem with the world is that fools and fanatics are always so certain of themselves, but wiser people so full of doubts.
And Kahlil Gibran said this in Sand and Foam:
I am ignorant of absolute truth. But I am humble before my ignorance and therein lies my honour and my reward.
Algorithmic probability (AP) is the probability that a random program of a given length fed into a computer will give a desired output, say the first million digits of π. Following Bennett and Chaitin’s pioneering work done in the 1970s, let us assume that the random program has been produced by a monkey. The AP in this case is the same as the probability that the monkey would type out the same bit string, i.e. the same computer program, as, say, a Java program suitable for generating the first million digits of π. The probability that the monkey would press the first key on the keyboard correctly is 0.5. The probability that the first two keys would be pressed correctly is (0.5)2 or 0.25. And so on. Thus the probability gets smaller and smaller as the number of correctly sequenced bits increases. The longer the program, the less likely it is that the monkey will crank it out correctly. This means that the AP is the highest for the shortest programs, and vice versa.
This is reminiscent of the proverbial Ockham’s razor. The philosopher Ockham advocated the use of simplest possible explanations for natural phenomena: ‘Plurality should not be posited without necessity‘. Ockham’s razor cuts away complicated and long explanations. Ockham declared that simple explanations are the most plausible. In the present context, suppose we are having a bit string representing a set of data, and we want to understand the mechanism responsible for the creation of that set of data. In other words, we want to discover the computer program, among many we could generate randomly, which produced that set of data. According to Ockham’s philosophy, the shortest such program is the most plausible guess, because it has the highest AP.
According to Gell-Mann, AIC does not fully match our intuitive idea of complexity.
This is because it attains infinite value for totally random bit strings. The crux of a highly complex system is in its non-random aspects. Gell-Mann defines effective complexity as roughly the length of a concise description of the regularities (as contrasted to the degree of randomness) of that system or bit string. By contrast, AIC refers to the length of the concise description of the whole system or string, rather than referring to the regularities alone. It is indeed difficult to give a universally agreed definition of degree of complexity!
COMPLEX ADAPTIVE SYSTEMS AND EFFECTIVE COMPLEXITY
We humans are examples of what Gell-Mann calls complex adaptive systems (CASs). CASs are systems that learn or evolve by making use of acquired information. Here are some examples: a child learning the mother tongue; a strain of bacteria developing resistance to an antibiotic; an artist implementing a creative idea; and the scientific community testing new theories. Learning by a CAS requires, among other things, the evolution of an ability to distinguish between the random and the regular.
Effective complexity, introduced above, is related to the description of the regularities of a complex system by a CAS that is observing it. A CAS separates regularities from randomness. Therefore a CAS provides the possibility of defining complexity in terms of the length of the schema used by it for describing and predicting an incoming data stream. Gell-Mann defines the effective complexity of a system, relative to a CAS that is observing it, as the length of the schema used to describe its regularities.
Suppose a bit stream is totally random (as perceived by a CAS). Then its AIC is infinite. But its effective complexity is zero, because the CAS observing the bit stream will not find any regularity in it, and therefore the length of the schema describing the regularities will be zero. At the other extreme, if the bit stream is totally regular, the AIC is very small (nearly zero), and so is the effective degree of complexity.
For intermediate situations, the effective complexity will be substantial (i.e., different from zero). Thus, for effective complexity to be substantial, the AIC must not be too high or too low. That is, the system should be neither too orderly nor too disorderly. For such situations, the AIC is substantial, but not maximal (for a given length of bit stream), and it has two contributions: the apparently regular portion (corresponding to the effective complexity), and the apparently random or stochastic portion. This critical balance between order and chaos, occurring at or near the so-called edge of chaos is the crux of what complexity is all about.
Laws of physics impose a trade-off between information and the effort required to get or generate that information. Charles Bennett (1987, 1988) made this the basis for defining ‘logical depth’ as a measure of complexity. The shortest program that can generate a given bit string deciphers the most likely or plausible mechanism responsible for generating that data set or bit string. In fact, there can be one or more programs almost as short as the shortest one. Bennett took all such programs, and looked at the computational complexity or the effort required for generating each such program. Logical depth was defined as the minimum necessary effort or cost required for creating and running a program that can produce a given bit string. It is a measure of how laborious it is to go from the brief program, or a ‘highly compressed schema’, to a full description of the system itself, or of its regularities.
A concept reciprocal to depth is that of crypticity, which is a measure of how laborious it is, starting from a system, to compress its description (or a description of its regularities) into a program or schema (Gell-Mann 1994).
Bit strings that are obviously simple are logically shallow. And so are bit strings that are totally random. Bit strings that include structure and regularity, in addition to some randomness, are logically deep; they take a long time to compute.
Bennett’s notion of logical depth referred to bit strings, computer programs, and logical operations etc. Lloyd and Pagels (1988) introduced the notion of thermodynamic depth as the ‘physical’ analogue of logical depth, by bringing in quantities like energy and entropy. Instead of looking for the most plausible explanation for how a given bit string was produced, they looked directly at the way a given physical system was produced. Similarly, instead of invoking computational complexity, they attempted to identify the physical resources needed to produce a given complex system. The resource they considered was entropy.
Entropy is a measure of ignorance, or the unknown random bits of information. And negative entropy or negentropy connotes known information; it is a measure of how far the system is from its maximum possible entropy. Entropy signifies random or useless bits, and negentropy signifies useful, ordered bits. The thermodynamic depth of a system was defined as the number of useful bits that went into assembling the system.
Simple ordered systems like crystals are thermodynamically shallow, as are random, totally disordered systems like glasses. Living systems, by contrast, are thermodynamically deep.
BIOLOGICAL DEGREE OF COMPLEXITY
Biological systems are not only complex, they also have remarkable adaptability features. A variety of criteria can be used for defining measures of biological complexity (Chaisson 2001). Her are some examples:
- Non-junk genome size.
- Position on the evolutionary bush.
- Number of cell types.
- Physical size of the organism.
There are dozens of other ways of classifying and describing complexity, but this much should suffice.
DOES COMPLEXITY ALWAYS INCREASE?
The answer is No. There is no law of science which says that cosmic complexity in general, and terrestrial complexity in particular, must always increase. In fact, the contents of this article, and those of Part 4, should make it amply clear that we cannot even define the degree of complexity of a system in terms independent of the perspective and background of the observer or the experimenter. The problem is further compounded by the fact that we do not yet have a good understanding of the information processing that goes on in the brains of experimenters or people communicating information to one another. Nevertheless, it is an empirical fact that the overall complexity of our ecosphere has been increasing. This is clear from the estimates made by Eric Chaisson of the free- energy rate density Φ (cf. Part 3). In fact, even the cosmic complexity is increasing all the time, and it is increasing exponentially fast. But we must make a distinction between empirical data and hard-core well-established scientific truths.
The nonexistence of a law regarding the increase of complexity with time is dramatically highlighted by the ‘fact’ that, if there is going to be a Big Crunch in the future, just as there was a Big Bang in the past, the degree of complexity of the cosmos would collapse to zero. In any case, long before the moment of the Big Crunch, something else will happen that will reduce the terrestrial complexity by a huge amount, namely the snuffing out of all life on Earth. Our Sun is not exactly a young star. When it burns out all its nuclear fuel, no life form would survive on Earth.
Question: What is complexity?
Answer: Complexity is something we can associate with a ‘complex system’.
Question: What is a complex system?
Answer: A complex system usually consists of a large number of simple members, elements or agents, which interact with one another and with the environment, and which have the potential to generate qualitatively new collective behaviour, the manifestations of this largely unpredictable behaviour being the spontaneous creation of new spatial, temporal, or functional structures which are usually characterized by a coexistence of order and disorder.
Question: What is degree of complexity?
Answer: If we can unambiguously specify the meaning of ‘information’ in a given context, then the degree of complexity of a system may be defined as the amount of information needed to describe the function and ‘structure’ of the system; the term ‘structure’ may have to include what goes on in the brains of one or more ‘live’ creatures.
But then we should be clear about what is a ‘live’ creature. And so on. I quote Bertrand Russell:
‘But,’ you might say, ‘none of this shakes my belief that 2 and 2 are 4.’ You are quite right, except in marginal cases — and it is only in marginal cases that you are doubtful whether a certain animal is a dog or a certain length is less than a meter. Two must be two of something, and the proposition ‘2 and 2 are 4’ is useless unless it can be applied. Two dogs and two dogs are certainly four dogs, but cases arise in which you are doubtful whether two of them are dogs. ‘Well, at any rate there are four animals,’ you may say. But there are microorganisms concerning which it is doubtful whether they are animals or plants. ‘Well, then living organisms,’ you say. But there are things of which it is doubtful whether they are living organisms or not. You will be driven into saying: ‘Two entities and two entities are four entities.’ When you have told me what you mean by ‘entity,’ we will resume the argument.