Monday, 19 January 2015

Bayesian and Frequentist probability

John O. Campbell

The new book Darwin does Physics is now published! Here is an excerpt.

Edwin Jaynes, a great champion of the Bayesian revival, spent much of his long career battling those supporting the Frequentist interpretation of probability theory. Jaynes was convinced that probabilities describe a state of knowledge or the degree of certainty we may claim given the data we have (1).
We believe that to achieve a rational picture of the world it is necessary to set up another clear division of labour within theoretical physics; it is the job of the laws of physics to describe physical causation at the level of ontology, and the job of probability theory to describe human inferences at the level of epistemology.
On the other hand, Frequentists understand probabilities simply as frequencies: the proportion of times that a given outcome occurs. They consider frequencies to be properties of nature, with no strong link to inference.



Figure 1: Edwin Jaynes contributed more to the development of Bayesian inference than anyone since Laplace.

Jaynes was a physicist, a field of study where the Bayesian approach makes a great deal of sense. We might say that, sampling the ordinary matter of the universe at random, the probability that we pick hydrogen is about .75, since hydrogen makes up roughly three quarters of that matter by mass. While this probability might seem merely a property of nature, it is actually calculated, or inferred, from a physical model which describes the creation of atoms and which is supported by many spectral measurements.

Physics has developed a detailed model of how atoms were formed, which runs roughly as follows:
  1. First hydrogen was created, and then some hydrogen nuclei fused to become helium.
  2. The model predicts this should have happened in a ratio of about 1 part helium to 3 parts hydrogen by mass.
  3. Almost all atoms more complex than helium were created in stars.
  4. As stars are relatively sparse in the universe, only a small fraction of atoms more complex than helium has been created, so the relative abundance remains at nearly 75% hydrogen and 25% helium by mass.
When we measure the relative abundance of atoms out to the extent of the visible universe, the data support many aspects of this physical model. Thus .75 is a number we have inferred using a model that is supported by the data.
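
As a rough illustration of how that number falls out of the model, here is a short sketch (my own, not the book's calculation), taking the predicted 1:3 helium-to-hydrogen ratio as a mass ratio:

    # A short sketch (my own illustration, not the book's calculation): taking
    # the model's predicted 1:3 helium-to-hydrogen ratio as a mass ratio and
    # recovering the .75 figure quoted above.
    helium_mass = 1.0      # relative mass locked up in helium
    hydrogen_mass = 3.0    # relative mass remaining as hydrogen

    hydrogen_mass_fraction = hydrogen_mass / (hydrogen_mass + helium_mass)
    print(hydrogen_mass_fraction)  # 0.75

    # Counting atoms rather than mass gives a higher figure, since a helium
    # atom weighs roughly four times as much as a hydrogen atom.
    hydrogen_atoms = hydrogen_mass / 1.0
    helium_atoms = helium_mass / 4.0
    print(hydrogen_atoms / (hydrogen_atoms + helium_atoms))  # ~0.92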

While it is true that a physical process is responsible for roughly three quarters of the ordinary matter in the universe being hydrogen, the model which infers this number was developed by physicists; it is a model created by humans, and it is here that the inference takes place.

Frequentist probability theory, on the other hand, was developed largely by biologists. Their primary objective was to develop mathematical tools capable of modelling the details of Darwinian evolution. The aim of this project was to model the shift in the proportion of physical characteristics, or alleles, within a population between generations. This proportion or frequency of a given characteristic can be found merely by counting; it does not require a sophisticated human model. Any scientific model designed to infer these probabilities would have to include the nebulous quality of ‘fitness’. Worse, and somewhat of an embarrassment, any equation using fitness to calculate the change in frequencies between generations is clearly in the form of a Bayesian update.
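
To make that point concrete, here is a minimal sketch (my own illustration with invented numbers, not taken from the text) comparing a one-generation selection update with Bayes' rule; the two calculations are the same arithmetic, with current allele frequencies playing the role of the prior and relative fitness playing the role of the likelihood.

    # A minimal sketch (my own, with invented numbers): the discrete-generation
    # selection equation  p'(i) = p(i) * w(i) / mean(w)  has exactly the shape
    # of Bayes' rule, posterior = prior * likelihood / evidence.

    def selection_update(freqs, fitness):
        """One generation of selection: new frequency is proportional to
        old frequency times fitness, normalized by the mean fitness."""
        mean_fitness = sum(p * w for p, w in zip(freqs, fitness))
        return [p * w / mean_fitness for p, w in zip(freqs, fitness)]

    def bayes_update(prior, likelihood):
        """Bayes' rule: posterior is proportional to prior times likelihood,
        normalized by the evidence."""
        evidence = sum(p * l for p, l in zip(prior, likelihood))
        return [p * l / evidence for p, l in zip(prior, likelihood)]

    freqs = [0.6, 0.4]      # current frequencies of two alleles, A and a
    fitness = [1.0, 1.2]    # their relative reproductive success

    print(selection_update(freqs, fitness))   # [0.555..., 0.444...]
    print(bayes_update(freqs, fitness))       # identical output

The identity of the two outputs is precisely the embarrassment mentioned above: relabel fitness as likelihood and the selection equation reads as a Bayesian update.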

However, biologists became increasingly aware of one thing: the model responsible for inferring the proportion of characteristics which would be found in the next generation was not the scientific model; it was a biological model written in DNA. This model takes the form of the genome, and it too performs inference: it infers those characteristics which will contribute to the reproductive success of the next generation. The model is supported by evidence in the form of the reproductive success experienced by the current generation.

Biological science developed a model of this process composed of Darwinian theory and the many details of microbiology. This scientific model also performs inference; competing hypotheses may be formed and decisions reached on the basis of evidence and experimental data. Nevertheless, it is clear that the scientific model is essentially a secondary model, developed to describe the primary model composed of genomes, phenotypes and their experience in the environment.

While the secondary model fits nicely within the Bayesian interpretation, the primary model presents some challenges. Natural selection, the inferential system operating within the biological world, is clearly not a form of human reasoning. It was busy inferring new life forms long before humans existed, and in fact humans themselves were inferred by this process. In this sense the probabilities involved in the primary model are properties of nature and may fit more comfortably within the Frequentist interpretation.

Clearly some compromises must be made to reconcile the Bayesian and Frequentist interpretations of probability. The one I suggest is that probabilities be considered components of inferential systems: not just tools of human inference, but of all inferential systems found in nature. Bayesians have always understood that probabilities are a component of inference, but they insist that these are restricted to human inferences. Frequentists recognize the role of probabilities in non-human spheres but deny them a role in inference. Perhaps the strengths of both interpretations may be retained by viewing probabilities as a component of all inferential systems, whether human or non-human.

Although Jaynes never deviated from his anthropocentric interpretation of probability, he did consider some instances where inferences are performed in the non-human realm. One such case was intelligent robots. The unifying theme of his great text on probability is the design of a robot capable of performing human-like inferences, making it clear that he considered it possible for inference to be performed by non-humans (2).

He also made a brief foray into considering the inferential ability of natural selection. Jaynes starts by noting that the scientific understanding of our rich mental life had not yet discovered any general organizing principles. He then suggests that Bayesian inference might fill this role and that natural selection, in its evolution of brains, would surely have hit upon Bayesian inference as the correct mechanism for gaining knowledge of the world.
Workers note the seeming absence of any organizing principle; we wonder whether the principles of Bayesian inference might serve as a start. We would expect Natural Selection to produce such a result; after all, any reasoning format whose results conflict with Bayesian inference will place a creature at a decided survival disadvantage. Indeed, as we noted long ago (Jaynes, 1957b), to deny that we reason in a Bayesian way is to assert that we reason in a deliberately inconsistent way; we find this very hard to believe.
He opens the door a crack and considers ‘creatures’ rather than only humans. Jaynes also comes tantalizingly close to recognizing natural selection as an inferential system. He is on record as claiming that Bayesian inference is the only mechanism capable of finding good solutions. Surely, if pressed as to the process used by natural selection to discover the optimal mathematics of reasoning, he would have identified it as a form of Bayesian inference.

In considering the challenges of designing an inferential robot he is drawn to examine the mental processes of sensation and perception. He concludes that the general organizing principle of these mental functions is Bayesian inference.
The general conclusion from all these experiments is less surprising to our relativist generation than it was to the absolutist generation which made the discoveries. Seeing is not a direct apprehension of reality, as we often like to pretend. Quite the contrary: Seeing is Inference from Incomplete Information, no different in nature from the inference that we are studying here.

The information that reaches us through our eyes is grossly inadequate to determine what is “really there” before us. The failures of perception revealed by the experiments of Ames and Cantrell are not mechanical failures in the lens, retina, or optic nerve; they are the reactions of the subsequent inference process in the brain when it receives new data that are inconsistent with its prior information.
It did not escape Jaynes that the brain is a biological adaptation which came into existence long before humans and that animal brains presumably also invoke the principle of inference:
Presumably, a dozen other examples of human and animal perception would be found to obey a Bayesian reasoning format as its “high level” organizing principle, for the same reason.
Here he explicitly recognizes other animals’ ability to perform inference, but nowhere in his writing does he concede a non-anthropocentric definition of inference.

Jaynes’ view of the mental processes taking place within brains is well in accord with current understanding: they are examples of Bayesian inference. For example, vision is explained as inferring an accurate mental model of the world from scant sensory evidence. Many researchers describe the process of evolving such accurate mental models as involving a Darwinian process (3).
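
As a toy illustration of “seeing is inference from incomplete information” (my own sketch with invented numbers, not an example from Jaynes or from the cited paper), the same ambiguous sensory signal yields different percepts depending on the prior information the observer brings to it:

    # A toy sketch (my own, with invented numbers) of perception as Bayesian
    # inference: two observers receive the same ambiguous, low-light signal
    # but perceive different things because they hold different priors.

    def posterior(prior, likelihood):
        """Bayes' rule over a dictionary of hypotheses about the scene."""
        evidence = sum(prior[h] * likelihood[h] for h in prior)
        return {h: prior[h] * likelihood[h] / evidence for h in prior}

    # How well the blurry image fits each hypothesis about what is really
    # there; this sensory evidence is identical for both observers.
    likelihood = {"cat": 0.4, "crumpled bag": 0.5}

    # Prior information differs: one observer owns a cat, the other does not.
    owns_a_cat = {"cat": 0.7, "crumpled bag": 0.3}
    no_pets = {"cat": 0.1, "crumpled bag": 0.9}

    print(posterior(owns_a_cat, likelihood))   # {'cat': ~0.65, ...} sees a cat
    print(posterior(no_pets, likelihood))      # {'cat': ~0.08, ...} sees a bag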

This view of mental processes, such as vision, as Darwinian processes following the principle of Bayesian inference leads us to add mental processes to our list of inferential systems found in nature. Brains accumulate knowledge in order to model the world around them more accurately. Neuroscience, the scientific study of these mental processes and behaviours, is also an inferential system involving models and supporting evidence. Again the science clearly operates at a secondary level, as its subject is composed of a more fundamental inferential system.

This view suggests an astonishing unification of our understanding of nature, as many of the subjects studied by science are found to operate according to the same general inferential principles as science itself.

If we consider science to be but an example of the more general process of cultural evolution, we are led to a four-tier nested hierarchy of inferential systems: cultural evolution entails mental evolution, which entails biological evolution, which entails physical structures. Placing ‘physical structures’ at the bottom of this hierarchy is troublesome. Are we confident that this is really the bottom? Why not consider instead ‘physical evolution’ as an inferential system at the bottom of the hierarchy? This would have the benefit of leaving our hierarchy open-ended.

We have no clear evidence that currently known physical structures form the true bottom of this hierarchy, and some, admittedly inconclusive and indirect, evidence that they do not. Uncomfortable as we are with ignorance, we tend to claim that the boundary between our knowledge and our ignorance accurately maps nature; that nature extends only as far as our knowledge. This must surely be hubris. I earlier presented some arguments which might lead us to consider ‘physical structures’ themselves as inferential systems and will continue to develop this notion in later sections.



Bibliography


1. Jaynes, Edwin T. Clearing up the mysteries - the original goal. In: Skilling, John (ed.), Maximum Entropy and Bayesian Methods, 1989. http://bayes.wustl.edu/etj/articles/cmystery.pdf

2. —. Probability Theory: The Logic of Science. Cambridge University Press, 2003.

3. Fernando, Chrisantha, Eörs Szathmáry and Phil Husbands. Selectionist and evolutionary approaches to brain function: a critical appraisal. Frontiers in Computational Neuroscience, 2012. http://www.sussex.ac.uk/Users/philh/pubs/fncom-Fernandoetal2012.pdf