In Preparation
In an ephemeral moment I read an exchange between some people arguing about Systems Thinking / Complexity Science who were introducing their own interpretations of Ashby or Bateson to make their case. As with so much on LinkedIn, I can no longer locate the source.
What worried me about this exchange was a) I disagreed with the sentiments being expressed and b) after reading the works of the principals being quoted - Ashby, Bateson etc - I had drawn rather different conclusions.
It was therefore worth going back to the original works to re-read them, to understand the context in which their assertions were made, and to try to deduce their thinking at the time. STPrism is an ideal tool for this task: it takes you straight to the ‘paragraph’ and context of what became the well-used and often-quoted aphorisms.
There is, of course, no issue with the work of the principals being modified or developed by others - after all, we don’t want the subject or the threads introduced by the pioneers to ossify. The issue is false attribution - the derivative idea should be referenced as (James on Ashby) and not (Ashby). To do otherwise is to mislead.
Sources and Their Context
Ross Ashby, Gregory Bateson, Claude Shannon
Ross Ashby
What was the context of Ross Ashby writing that a cyberneticist is a man who knows what might have happened but did not?
Ross Ashby’s statement that “the Cyberneticist observes what might have happened but did not” appears in a collection of his aphorisms titled “Ashby Says,” specifically under the section “On Science”. This remark encapsulates the core of his cybernetic methodology, which can be understood through the following contexts:
1. The Study of “All Possible Machines”
For Ashby, cybernetics is not the study of a single physical object, but the study of the domain of “all possible machines”. He argues that a cyberneticist treats a particular machine not by asking what individual act it is performing “here and now,” but by considering all the possible behaviors it is capable of producing. In this framework, the actual behavior is only a small subset of the potential behavior, and understanding the “might have been” is necessary to understand the “is”.
2. Information as Selection
The statement is deeply tied to Ashby’s interpretation of information theory and intelligence.
- Selection as Creation: Ashby viewed “appropriate selection”—whether in design, regulation, or intelligence—as the act of narrowing a field of uncertainty.
- The Power of Constraints: He defined organization and law as constraints that reduce the variety of what can be imagined to what is actually observed.
- The Cybernetic Lens: A scientist in the classical sense might only record the observed fact; the cyberneticist, however, focuses on the process of restriction—the reasons why a system conforms to a particular path rather than the thousands of other possibilities it could have taken.
3. The “Observer’s Digest”
Ashby maintained that a “system” is not a physical thing but an “observer’s digest”—a list of variables selected by the observer.
- Product Space: He introduced the concept of a “product space,” which contains all the possible states a system could occupy.
- Relational Reality: While the real world provides the subset of “what is,” the product space represents the uncertainty of the observer (what “might be”).
- Scientific Insight: By knowing the total field of possibilities, the cyberneticist gains insight into the functional constraints (like the physical shape of parts or laws of nature) that prohibit direct routes and force specific behaviors.
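The relationship between product space, constraint and variety can be made concrete with a small sketch. The three variables, their values and the "law" linking switch and lamp are all invented here for illustration and are not taken from Ashby. Variety is counted in bits as the base-2 logarithm of the number of distinguishable states, which is one of the measures Ashby himself used.

```python
from itertools import product
from math import log2

# Hypothetical observer's digest: the variables the observer chose to list.
# Names and values are illustrative assumptions, not Ashby's own example.
variables = {
    "switch": ["off", "on"],
    "lamp": ["dark", "lit"],
    "door": ["closed", "open"],
}


def product_space(variables):
    """Every combination of values the observer can conceive of."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


def constraint(state):
    """Assumed law of this toy world: the lamp is lit exactly when the switch is on."""
    return (state["switch"] == "on") == (state["lamp"] == "lit")


possible = product_space(variables)                      # what might happen
observed = [s for s in possible if constraint(s)]        # what can actually happen
excluded = [s for s in possible if not constraint(s)]    # what might have happened but did not

variety_possible = log2(len(possible))
variety_observed = log2(len(observed))

print(f"conceivable states: {len(possible)} ({variety_possible} bits)")
print(f"permitted states:   {len(observed)} ({variety_observed} bits)")
print(f"variety removed by the constraint: {variety_possible - variety_observed} bits")
```

The `excluded` list is the part a purely descriptive record would never contain: it exists only because the full product space was written down first.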
Summary of Context
The quote reflects Ashby’s belief that to understand any complex system—be it a brain, a society, or a machine—one must look beyond the immediate event to the wider set of the possible or the conceivable. The cyberneticist is the “man who knows what might have happened” because they have mapped the entire field of possibilities and can therefore explain why the system was constrained to the single path it eventually took.
Gregory Bateson
What is the context around Bateson's 'difference that makes a difference'?
Gregory Bateson’s phrase “a difference that makes a difference” serves as his foundational definition for a “bit” or an elementary unit of information. The context surrounding this definition bridges cybernetics, epistemology, and biology, and relies on several key theoretical distinctions:
1. The Divide Between Pleroma and Creatura
Bateson situates this concept at the boundary between two realms of explanation, borrowing terms from Carl Jung. The Pleroma is the non-living physical world of Newtonian mechanics, governed by forces, impacts, and energy exchanges, where there are no true descriptions or distinctions. The Creatura is the world of living systems, mental processes, organization, and communication. In the Creatura, effects are not brought about by brute physical force, but precisely by differences.
2. Infinite Latent Differences vs. Selected Information
To explain how information emerges from the physical world, Bateson adapts an idea from Immanuel Kant using the example of a piece of chalk. He argues that there are an infinite number of potential differences in and around a piece of chalk, such as the difference between its molecules and the sun, or its color and the blackboard. Our sensory receptors filter out almost all of this infinitude. An effective difference, a difference that makes a difference, is one that has been actively selected by an organism from this infinite pool of latent differences to trigger a response.
3. The Abstract, Non-Localizable Nature of Difference
Bateson stresses that a difference is not a material “thing” or a physical event, but an abstract, dimensionless relationship. For example, the difference between the color of a piece of paper and a wooden lectern cannot be located in the paper, in the wood, or in the space and time between them. Because it has no physical location and zero dimensions, a difference is essentially an idea.
4. Information vs. Energy (The Causality of Zero)
This definition requires a radical shift away from energy-based physical analogies. In physical dynamics, an event like kicking a stone transfers energy that causes the stone to move. In communicational systems, however, if you kick a dog, the dog’s response is energized by its own metabolism (collateral energy), not by the energy of the kick. Because an organism’s neural pathways are already supplied with energy and merely wait to be triggered, the “difference” itself carries no energy. This leads to the profound realization that “zero” (the complete absence of an event) can be a difference that makes a difference. An unwritten letter, an unfiled tax return, or a tick failing to smell sweat can all trigger active responses because they contrast with a latent expectation, proving that psychological causality is based on difference rather than energy transfer.
5. Connecting the Map to the Territory
Ultimately, “a difference that makes a difference” defines what manages to cross the boundary from the physical territory into the mental map. The physical territory itself cannot get onto the map; only news of difference (such as a change in altitude or boundary) is transmitted and transformed along neural circuits. Therefore, all human and animal perception, knowledge, and mental processes are built exclusively on these circulating transforms of difference.
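The point about “zero” lends itself to a toy sketch. The function and the dictionaries below are hypothetical and are a loose analogy only, not Bateson's formalism: the receiver holds a latent expectation, and what reaches it is not the event itself but the contrast between expectation and observation, so an event that never occurred still registers.

```python
def news_of_difference(expected, observed):
    """Return only the items where observation departs from expectation.

    Each entry maps a key to the pair (expected value, observed value).
    A missing observation appears as None, so absence counts as a difference.
    """
    differences = {}
    for key in expected.keys() | observed.keys():
        e = expected.get(key)
        o = observed.get(key)
        if e != o:
            differences[key] = (e, o)
    return differences


# Hypothetical expectations held by the receiver.
expected = {"letter": "arrives", "tax_return": "filed"}
# What actually happened: the letter was never written.
observed = {"tax_return": "filed"}

print(news_of_difference(expected, observed))
```

Nothing was sent about the letter, and no energy was transferred, yet the result is non-empty. The filed tax return, which matches expectation, produces no news at all.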
Claude Shannon
What was the context of Claude Shannon writing “The Bandwagon”?
Claude Shannon wrote his 1956 editorial, “The Bandwagon,” in response to the massive and rapid popularity his information theory achieved in the years following its introduction.
By the 1950s, information theory had become a highly fashionable scientific trend, receiving extraordinary publicity in both the popular and the scientific press. Shannon observed that researchers from entirely different disciplines (including biology, psychology, linguistics, physics, economics, and the social sciences) were indiscriminately adopting the theory’s concepts and buzzwords (such as “information,” “entropy,” and “redundancy”) to address problems in their own fields.
This over-enthusiastic application alarmed Shannon, who felt the theory was being “sold, if not oversold” and “ballooned to an importance beyond its actual accomplishments”. He warned that this widespread popularity carried an “element of danger,” as his theory was a specific mathematical tool for communication engineers, not a universal panacea for every scientific domain. He worried that the “somewhat artificial prosperity” of the field would collapse overnight once people realized that merely borrowing exciting terminology did not automatically solve complex scientific problems.
Consequently, Shannon wrote “The Bandwagon” to inject a note of moderation into the scientific community. He used the editorial to emphasize two main points:
- The need for rigorous application: He reminded readers that information theory is fundamentally a strict, deductive branch of mathematics. Applying its concepts to new domains is not a trivial matter of translating words, but rather requires the “slow tedious process of hypothesis and experimental verification”.
- A call for higher research standards: Within the communication field itself, Shannon urged his colleagues to raise their critical thresholds, prioritize first-rate research over mere exposition, and focus on maintaining a thoroughly scientific attitude to achieve real progress rather than rushing to publish half-finished ideas.
