by Rolf Degen:
Perceptions are 90 percent expectations
The sensory centers of our brain are largely activated by anticipatory computations
Have you ever noticed how the regular drip of a faucet fades out completely after a while when you are resting comfortably? Yet the sudden absence of this acoustic stimulus unmistakably pops into your mind. This everyday experience bears witness to a foresighted mechanism that, according to new research, governs the workings of perception: "Predictive Coding".
How does the world get into our head? The question seems rather unproductive at first glance, as we apparently need only open our eyes to obtain a picture of reality. According to common sense, perception functions a bit like a submarine commander who peers through a periscope and scans the horizon for suspicious activity. Another popular metaphor is the camera that films the course of events and forwards the input to a monitor in our head, in front of which an imaginary "homunculus" sits, satisfying its curiosity. In reality, however, all comparisons with optical devices do our sensory apparatus a disservice. When we consciously perceive, we are more like viewers in a cinema who catch sight of a well-styled work of art, assembled behind our backs by script, direction, editing, censorship and other invisible helpers.
Between the first draft on the retina (or in the other sense organs) and conscious perception sits a hidden recognition service that constantly sets up and discards hypotheses, guesses at missing pieces, papers over inconsistencies and applies complicated mathematical operations that would have overwhelmed us at school. The work of this "ratiomorphic system" remains hidden from our consciousness, which only marvels at the finished results and takes them for granted.
The most contentious issue that has long plagued cognitive psychology is this: Is what we perceive merely a passive and mindless registration of the stimulus information impinging on our sense organs? Or is the picture in front of us decisively shaped, from the outset, by stored experiences, expectations and knowledge? In the first case, that of "direct perception", information processing is "bottom up" or "data driven": it proceeds from below, from the smallest sensory bits of information, up to the higher cognitive centers. Our intellect can then only interpret the final image, like an intelligence officer poring over a satellite photo.
In the case of "indirect perception", by contrast, expectations and previous experiences serve as the foundation that controls the act of perception. The process here is "top down" or "conceptually driven": knowledge constantly flows into perception and generates hypotheses about the expected stimulus material. With every modification of the hypothesis, the incoming sensory data acquire a new structure.
According to the current paradigm, perception of the outside world is not a process in which the "receiver" is passively fed sensory impressions. Rather, the organism at any moment produces a "concurrent world model" that includes hypotheses about the expected stimuli. These expected values are stored in long-term memory as a comprehensive simulation of external reality. During an ongoing act of perception, the retrieved hypotheses are checked against the incoming sensory data; perception is therefore an interactive process that takes shape through the gradual testing and refinement of predictions.
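This gradual testing and refinement of predictions can be sketched in a few lines of code. The sketch below is a purely illustrative toy, not a model from the research described here: the single numeric "stimulus", the function names and the learning rate are all invented for the example.

```python
# Toy sketch of perception as iterative hypothesis testing:
# a prediction is repeatedly compared with sensory data and
# corrected by a fraction of the prediction error.
# (Illustrative only; names and parameters are assumptions.)

def refine_hypothesis(sensory_input, initial_guess, learning_rate=0.3, steps=10):
    """Repeatedly check a predicted value against sensory data and
    nudge the prediction toward the data by a fraction of the error."""
    hypothesis = initial_guess
    for _ in range(steps):
        error = sensory_input - hypothesis   # mismatch between expectation and data
        hypothesis += learning_rate * error  # refine the internal model
    return hypothesis

# With each step the hypothesis converges toward the actual stimulus value.
print(refine_hypothesis(sensory_input=10.0, initial_guess=0.0))
```

The point of the toy is only that perception, on this view, is a loop: predict, measure the mismatch, update, repeat.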
This new perspective turns the whole picture around: our expectations control what we perceive; memory and perception are inextricably linked. The world outside answers questions that our brain poses. The best evidence comes primarily from studies of the architecture and activity patterns of the brain. The different areas of the brain are never connected to each other in only one direction; there are always feedback connections leading from the higher centers back to the lower ones. More importantly still: in the sensory system these feedback connections even make up the majority.
A few years ago, a seminal meta-analysis evaluated numerous studies that examined the primary visual cortex - the drop-in center for optical input - using functional magnetic resonance imaging. The inspection showed that this area is far busier processing feedback signals from higher-level brain regions than analyzing information from the eyes. In other words, the activity of the primary visual cortex is surprisingly independent of external stimuli: more than 90 percent of the impulses that arrive there originate not from the visual pathway but from "higher" areas of the cerebral cortex.
Two decades earlier, researchers had already begun to wonder what these feedback loops in the brain are all about. These considerations led to the theory of the so-called "non-classical" effects of neurons in the visual cortex. Until then, it had been believed that the nerve cells in this area are all responsible for representing visual information. The functioning of some neurons, however, can be explained more elegantly by assuming that they compare incoming signals with expectations. The researchers therefore called these neurons "error finders" and described their activity as "predictive coding".
The concept traces back to telephone technology and data processing. Instead of transmitting an entire signal, it is often sufficient to transmit only the deviation from the previous signal. When copying an image file, it makes no sense to indicate the color of each pixel separately; only when the color changes from one point to the next does this information need to be transmitted. By coding only deviations from the expected, this method (which led to mp3 and the demise of the music industry) reduces transmission overhead and increases processing speed. The theory of the hypothesis-testing brain assumes that a similar principle governs most brain functions.
The fact that knowledge determines vision can easily be demonstrated with so-called "degraded images" (see the pictures below). At first glance, most people have great difficulty recognizing anything meaningful in them. Once they have read the annotations, identification runs smoothly. Even more: with some, if not all, of the pictures it now becomes nearly impossible to return one's mind to the naïve state and forget the identities of the images.
The interplay between base and superstructure in vision has also become increasingly plausible since computer science took up the problem of perception. Anyone developing artificial intelligence inevitably faces the question of the architecture of perception. The advantage of a program lies in the fact that it forces researchers to formulate all their assumptions explicitly. And here there is a clear trend: all programs developed for machine perception rely on some form of "cognitive mediation". The computer is always fed with assumptions about the nature of the expected segment of the outside world, and this knowledge paves the way for processing the visual patterns captured by the camera.
Ask yourself how an artificial intelligence uninstructed by assumptions and expectations could navigate the physical world. The problem is that such a program could not know which visual data are important in the current situation. The software would be constantly on the verge of a crash whenever even a tiny bit of information was missing. It could not fill in gaps through context or foreknowledge - just as humans can in the case of degraded images. We have no idea how often our perception completes degraded images in everyday life. It is even thought that most visual stimuli are underdetermined, meaning they are not by themselves sufficient to uniquely identify an object. For a computer that has to work its way up from the bottom of the sensory facts without "higher" support, degraded images would remain eternally meaningless doodles.
Literature: Jakob Hohwy, The Predictive Mind