Deterministic World

Part 1. Physics

5. Information Theory

(note: there is an established field of study called Information Theory that owes much to Claude Shannon's works. What I refer to in this book as information theory is only tangentially related to this field. Perhaps I have chosen an unfortunate name but it is already stuck in my mind.)

The widely used and taught physics theories describe the world as a collection of interacting physical objects - a space-based representation. To understand Newton's three laws of motion, we must imagine the existence of a physical space in which objects move. To understand Maxwell's equations, we must imagine a space in which electromagnetic waves can exist and propagate. To understand Einstein's general relativity, we must imagine a space in which planets and rockets warp their surrounding space, leading to gravitational effects on other objects in that space. So the existence of both space and objects within that space seems natural and straightforward, a logical starting point for theories - even 'modern' ones like string theory or quantum electrodynamics. None of these theories has the power to explain what space (or time) is, nor why objects should exist in that space. Nor would I claim to have an answer to this - such an answer may really be the theory of everything (in contrast to such grandiosely named attempts in present-day physics, which cannot explain why space or time is observed). In this chapter I attempt to form a physics theory with minimal reliance on space or objects, focusing instead on information, something I believe is a more fundamental truth. If I had the task of writing computer code to simulate the universe, what would this code look like? How would I store the state of the universe in computer memory? The universe as we see it looks 3-dimensional and evolving in time, but we are ourselves inside the universe (subject to its illusions) and thus cannot naively assume that this space and time are what the 'universe code' looks like. What seems to us a 3D space may be any sort of arbitrary underlying representation that makes universe evolution in time an inevitability.
But as far as I can tell, it is not possible to describe something without using information, and no matter how that something is stored in 'universe code' it must keep its own share of information. Perhaps the information will look different from different representations (for instance, our human representation vs a 'universe creator' representation), but its underlying purpose will remain constant. So instead of describing how objects move in space - a 'high-level' physics theory - I seek to describe how information evolves over time. The notion of information is broad: it might well be objects moving around (position/velocity/temperature), electric field strength and patterns, the presence or absence of an object, and even 'human' concepts like what someone believes. I believe there are laws that information follows, and the power of understanding these laws is that they become broadly applicable to any system where information is present (which is to say, any physical system) - whether it is a system of particles in space, or a system of people in society, or a system of electrical elements in a circuit.

Let's start with a few practical examples showing just how broadly the concept of information can be applied:

From the above two examples, I postulate two properties of information. First, information naturally spreads. Spreads? As much as I've tried to avoid the notion of space, it seems I cannot proceed without it. What I want to say is that information spreads outward in *space*, but perhaps information theory can provide some hints as to what space is in the first place. Objects that are close/nearby in space are those between which information exchange (via photon reflection, for instance) happens most quickly. Military rangefinders send out a pulse of microwave photons and time how long it takes to receive a reflection from some object of interest, and based on this time display a distance to the object. Time and distance are directly proportional, with the speed of light as the conversion factor. Additionally, objects that are nearby take up more of each other's solid angle, making information exchange more likely and stronger. Putting my hand near a lightbulb, I can feel the warmth from the bulb, but putting my hand far away this warmth is diminished because of a smaller solid angle (how much of a surrounding sphere's surface area is taken up by my hand; all of the visible sky is approximately 50% solid angle, the moon is a small fraction of the sky, and holding up a penny in front of my eye to cover the moon is an exercise in matching solid angles). Such change in solid angle is also symmetric - if I move towards an object, the object also finds itself closer to me by the same amount. Requiring information exchange to be symmetric in this way places a constraint on space: relativity, or that information exchange as seen from one object is seen in an equivalent way from another object.
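The solid-angle argument can be made quantitative with a short sketch. This assumes the hand is approximated as a flat disc seen face-on (the sizes below are arbitrary), using the spherical-cap formula for solid angle:

```python
import math

def solid_angle_fraction(radius, distance):
    """Fraction of the full sphere (4*pi steradians) taken up by a
    disc of the given radius, seen face-on from the given distance."""
    half_angle = math.atan2(radius, distance)       # angular radius of the disc
    cap = 2 * math.pi * (1 - math.cos(half_angle))  # spherical-cap solid angle
    return cap / (4 * math.pi)

# A hand (~5 cm radius) near and far from a bulb: the 'information
# exchange' (felt warmth) shrinks with the solid angle.
print(solid_angle_fraction(0.05, 0.10))  # hand 10 cm from the bulb
print(solid_angle_fraction(0.05, 1.00))  # hand 1 m away: a far smaller share

# A disc at zero distance covers exactly half the surrounding sphere,
# matching the ~50% solid angle of the visible sky.
print(solid_angle_fraction(1.0, 0.0))
```

In the far-field limit this fraction falls off as r²/(4d²), recovering the familiar inverse-square weakening of the exchange.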

By further postulating that information transfer must always occur with a constant rate/probability we find another constraint on space: namely that it is not possible to move away from all objects, if I move away from one place I must move closer to another. If I were to move away from all objects, that would reduce my information transfer in an absolute sense, which is not allowed (by the postulate). This makes our three dimensions 'rigid', though it does not justify why there should be exactly three dimensions.

"if there were no limit to the dimensionality of space, then we would expect a set of n particles to have n(n-1)/2 independent pairwise spatial relations, so to explicitly specify all the distances between particles would require n-1 numbers for each particle, representing the distances to each of the other particles. For a large number of particles (to say nothing of a potentially infinite number) this would be impractical. Fortunately the spatial relations between the objects of our experience are not mutually independent. The nth particle essentially adds only three (rather than n-1) degrees of freedom to the relational configuration. In physical terms this restriction can be clearly seen from the fact that the maximum number of mutually equidistant particles in D-dimensional space is D+1. Experience teaches us that in our physical space we can arrange four, but not five or more, particles such that they are all mutually equidistant, so we conclude that our space has three dimensions." [Ch 1.1-1.2 of Reflections on Relativity]

At this point note that from all our experience we find that information contained in a certain volume of space can be learned by an observer that controls all the boundaries of that space: the surface area of an imaginary 'bubble' around the space. Containers like water bottles function by providing an impermeable boundary completely surrounding the space that is meant to contain water; if the boundary does not completely surround the space we will observe leaks (of water/information). This is also a statement on the nature of space: that it is even possible to surround an object and fully control information transfer. This applies to human life as well: we are defined by our environment and the people we interact with, the things that 'surround' us.

The second example already suggests a notion of a conservation of information. I also cannot at this point avoid introducing the notion of objects, or systems. So even though I cannot justify the existence of either space or objects, I find it necessary to use these concepts to make information theory applicable to reality. By conservation of information I mean that specific *objects* conserve their share of information. It is possible to simplify this further: all objects always store a specific amount of information, no more, no less, and this amount is in a sense the definition of the object. So an atom might have a position, velocity, quantum numbers/spins, and a few other characteristics, but it is a finite and unchanging amount of information, though the content of the information may of course change readily. We see this clearly in mathematical models (the same ones as used to simulate real experiments and calculate physically relevant quantities - that is to say, models which accurately describe the physical world and thus must provide some insight on the 'universe code'). We learn in high school of the importance of 'square matrices' in finding an exact solution to a set of equations; the square matrix is special because it has as many equations (rows) as it does parameters (columns). And to solve for the N unknowns, we must have N equations in N variables (or 'boundary conditions').

Throughout my student career I believed that such mathematical models give some new information, allowing us to know the previously 'unknown'. But information theory applies here too: we see that to find out the N unknown values, we need to supply N known values, the y in A*x=y. The amount of information stored by this mathematical construct is constant, by supplying the y we already fully define the x. In a sense this is like looking at a cube from one side or another, the cube is the same but our perception of it changes. The underlying information is the same in either y or x, and the system of equations has transformed the information from one interpretation to another. In essence, physically useful mathematical models require an input of as much information as they can be expected to output. If we make a thermodynamic model of the earth as a huge sphere with a single temperature parameter, we cannot predict the occurrence of life. If we wanted to predict the occurrence of life, we would need to have enormously complex parameters, and for all of those parameters we would need to have correct values of initial conditions for our model to have any validity. But in the process of determining and measuring all the parameters and their values, we would already have all the information needed to know the answer. So it seems even our most cherished physics theories don't do much more than regurgitate information we have supplied but in a different format.
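The cube analogy can be sketched numerically (the matrix below is arbitrary): solving a square non-singular system merely re-expresses y as x, and multiplying back recovers y, so the round trip neither creates nor destroys information.

```python
import numpy as np

# An arbitrary square, non-singular system A*x = y: N knowns in, N unknowns out.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
y = np.array([3.0, 5.0, 3.0])

x = np.linalg.solve(A, y)  # view the 'cube' from the x side
y_back = A @ x             # turn it back to the y side

print(x)                       # the same information, re-expressed
print(np.allclose(y_back, y))  # True: nothing gained, nothing lost
```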

And how could it be otherwise? The degenerate matrices and insoluble (over-constrained or under-constrained) systems of equations, dreaded by the students that have to try and solve them, are inelegant and not physically useful precisely because they do not conserve the amount of information. An over-constrained system gives out less information than was put in, meaning it does not make effective use of all available information, wasting computational effort and giving a subpar answer, perhaps arbitrarily discarding some excess information. An under-constrained system gives out more information than was put in, meaning some of the output information must have been made up/arbitrary, rather than physically relevant. Today's physics theories strike the balance between these two polarities, making the most of a minimal useful set of information: "Everything should be made as simple as possible, but not simpler". But all I've suggested is that good theories conserve information; do we have reason to suppose that real-world objects also conserve information? There is an argument to be made that good theories are ones that effectively describe the real world, so the above statement carries significant physical meaning. Out of curiosity, we may look at scientific experiments, where we find a notion of effective vs ineffective experiments. By no coincidence, the effective experiments maintain 'control' conditions as constant as possible, while varying 'independent' parameters and measuring a corresponding number of 'dependent' ones - carrying out measurements systematically in matrix form if multiple independent parameters exist. The ineffective experiments suffering from under-constraint attempt to manipulate multiple independent variables at once, allow 'control' parameters to fluctuate, and carry out limited measurements with overconfident extrapolations.
The outcomes of such experiments suffer from arbitrary fluctuations, limited repeatability, and poor understanding of what's going on even if the results are deceptively 'clear'. There are also the over-constrained ineffective experiments which use 'overkill' methods to attain limited results: making nanometer scale images of macroscopic areas, taking up hundreds of gigabytes of storage, and ending up with a single number as an experimental result. While the result may be useful, it should not be overlooked that a complex process of reducing lots of information into a less informative answer may well lead to an inaccurate or error-prone answer, making the extra human and computational effort of questionable benefit. Effective experiments - real world phenomena - strongly suggest that information amount is conserved in real objects.
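This contrast can be illustrated with a least-squares sketch (random data; the sizes and seed are arbitrary): the over-constrained system leaves a residual (input information discarded), while the under-constrained one admits infinitely many answers (output information made up).

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-constrained: 5 equations, 2 unknowns -- some input cannot be used.
A_over = rng.normal(size=(5, 2))
y_over = rng.normal(size=5)
x_over, residual, rank, _ = np.linalg.lstsq(A_over, y_over, rcond=None)
print(residual)  # nonzero: information was discarded to force an answer

# Under-constrained: 2 equations, 5 unknowns -- answers are partly arbitrary.
A_under = rng.normal(size=(2, 5))
y_under = rng.normal(size=2)
x_min = np.linalg.lstsq(A_under, y_under, rcond=None)[0]
_, _, Vt = np.linalg.svd(A_under)
x_alt = x_min + 10.0 * Vt[-1]  # shift along a null-space direction
print(np.allclose(A_under @ x_alt, y_under))  # True: another 'valid' answer
```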

If we accept that information amount is conserved in our surrounding universe, we are led to a general conservation of information law, one that I think is fundamental to the 'universe code'. Because if information amount is conserved, then if I seek to get 'information out' from an object, I must put the equivalent amount of 'information in' to maintain the constant overall amount. Namely, unidirectional information flow is prohibited. Any interactions involving information must be bidirectional exchanges. This notion will be important enough to the rest of this book that it will be termed a law: *the law of information exchange*.

The law of information exchange: only as much information can be gathered from an object as is supplied to it during an interaction. There is never unidirectional transfer of information.

This is very counterintuitive, because in daily life we seem to routinely see 'transfer' of information from one object to another: I read a thermometer to find out the temperature outside, where is the bidirectional exchange? I get information about the temperature, and the thermometer (as well as the temperature outside) stay just as they were. This illusion of information transfer arises because we rarely interact with the elementary physical objects that actually exchange information; as humans we work on a 'high level' of abstraction, making powerful inferences that seem to give us more information than allowed by bidirectional exchange. This will be explored later with the concept of 'infinite couplings'.

To make better sense of information, it will be instructive to see examples in the real world. Underlying fundamental physical-level information (to which the law of information exchange applies) manifests itself at all scales, including our everyday interactions with the world, in fact enabling macroscopic 'information technology' and the notion of information as human-to-human communication. How can we make use of information in language and science? We might define this 'human information' as:

Statements that are
  1. Descriptive in nature rather than purely relational
  2. Specified with an exact and particular scope
  3. Readily verifiable by experiment at any time, establishing their validity

For instance, "earth is flat" is not information unless better specified in scope - "the ground I see around me is flat" is better and indeed more informative/readily verifiable. Now we know that the planet earth is not flat from experiments, so "planet earth is flat" is not information either. In the early years when no experimental proof was available, this was *not* information, nor "planet earth is round", since that was not known. Rather both were 'suppositions'. Here we see an important point - that limiting the scope of information from a general to a specific case requires knowledge of *all* cases, which is generally infeasible and contaminates information with suppositions. This is why I have to claim that all information be (at least in principle) experimentally verifiable at any time, and thus except for special cases of definitional consistency, "all" has to be replaced by "all observed". For instance, "objects are made of atoms" or even "some objects are made of atoms" is *not* information, but "all observed objects are made of atoms" is information. More subtly, this reminds us that anything that we describe must be capable of being observed by us (interacting/exchanging information with us conscious beings - the machines/tools we use to help observations don't have to interact the same way with other beings), but this does not mean that the universe or laws of physics are constrained in the same way.

The point of relational statements is a challenging one. Let's take for example the "ground around me is flat" statement. If I say "ground around me looks the same as the ground around you" that is a relational statement, since the actual nature of how the ground looks is not established. This is the realm of mathematics, where relations take precedence over actual meanings of particular values. But in a bigger sense we may argue that all information is relational, because there can be no information that is not in reference to an observer. Here the concept of looking at the earlier 'thought experiment' universe with two objects is *not* valid, because by looking into this universe we have brought our own (and our universe's) biases. To get a true view of physics in such a simulated universe, we must imagine ourselves *as* one of the objects in the universe (ie big in relation to my size, fast in relation to my sense of time), but none of our theories are nearly complex enough to allow this. Similarly, from a pure information theory view, to have a true model of our scientific experiments the model should be complex enough to simulate us (the human experimenters) actually performing the experiments in the scope of the rest of the universe. Yet we get by with extremely simple, even rudimentary, models - the reason for this is underlying symmetries, as will be further explored under symmetry theory.

But all of the above only means that good theories conserve information, not that it is conserved in the world. For instance, is historical data information? It may be descriptive and specific, but cannot be verified except by *supposing* that the past and only the past affects the present. But this leads to a rather bleak world in which everything that ever will exist already exists, it being just a matter of mathematical manipulation which fragment is called the "present". For instance, "the speed of light is c" is information, but "the speed of light is *always* c" is not, and cannot be (and is actually considered false in some cosmological theories). Similarly, "the earth was round 20 years ago" is not information, for it is *not* directly verifiable at present without using suppositions. In fact the concept of time seems to break down, which suggests a reason why classical theorems in physics (which have an absolute time parameter) tend to be very limited in their applicability to real world problems, requiring the expertise of a physicist to specify where and for how long the theory can be applied (for instance a cannonball follows ballistic paths in air, but later can be moved around by people, melted, or left in the ground; none of which is described by Newtonian laws).

To use the theory in a practical sense requires that we deal with the notion of time dependence of information, since any real experiment will not be instantaneous. Strictly, the theory can make no claim as to the validity of historical data, as it is impossible to make an experiment to test the past. We rely on remnants of past events and experiments to not change over time so that conclusions drawn from them remain valid at present. There is no way to distinguish any change in time that affects every parameter of an experiment (such as every atom+space expanding, or the speed of light increasing). We must make a few axiomatic claims:

This is effectively saying that there exists a basis of information completely true through time (the laws of physics) and that any experimentally measured changes in information are due to interactions between objects that exist and *not* due to changes in laws of physics. So, there exist certain guidelines for exchange of information (laws of physics) that are constant in time. This is a "grounding" of our presence in time and allows new types of interactions to be discovered - for instance if the elemental composition of a sample changes over time, we might look into radioactive interactions, instead of claiming that locally (or globally) the laws of physics have changed, although either one is possible. In this sense, using the laws of physics as something "constant in time" is rather arbitrary since they have no material existence, but it seems to have served science and humanity well.

From the above hypotheses that
  1. Objects in existence store information in time
  2. Objects in existence can change their own information only by interaction with other objects in existence
  3. Information retains its validity over time
we come to the _Law of Conservation of Information_:
An existing object or group of objects that does not interact with other existing objects neither gains nor loses the information (the result of experiments performed on the objects) contained therein.
The experimental measurement of any isolated system can be carried out at any time and an equivalent result will be obtained.
Applying the law of conservation of information to groups of objects yields the more familiar statement: since any interacting objects constitute a group of objects, if we assume the unobservability hypothesis we can say that in exchanges or in a static state, information is neither gained nor lost.

Thus, the amount of information in an unobserved system is constant. We note that there is no strong support for this - while it is arguably applicable even in an infinite universe, we have no reason to believe that information might not be transferred, ie through black holes or at the edge of the universe. In any case I would claim that locally (on our planet, or even in the solar system) experiments suggest the hypothesis is correct. It is impossible to test, since the best we can do is approximate a truly unobserved system. What we do see in experiments is that parameters of a system that are unobserved remain constant. With the unobservability hypothesis I claim:

  1. All unobserved parameters remain constant
  2. The world (universe) is unobserved (in the classical sense)
The creator could have a way of observation that does not affect the underlying information, but this violates information transfer laws for existing objects as outlined earlier.
I already claimed the bidirectionality of information transfer, but this conclusion was far from intuitive, and one I came to only after a few more axiomatic claims.
- Storage size hypothesis: Any object in existence can store only a specific amount of information, and it must store exactly as much information to continue its existence as the same object
This implies the *equivalence of similar objects*, that is, any atom is equivalent in its characteristics (amount/type of information stored) and differs only in the specific values of information stored (ie location, velocity). Then by the conservation of information, any group of objects can store only as much information as the sum of its underlying entities with no "group effects" (ie a linear summation), and therefore also any group is in essence separable. Thus, groups can only be a convenient construct and cannot be said to have real existence. Furthermore, by the conservation + storage size/equivalence hypotheses:
_Law of Information Exchange_
Any existing objects that interact must have (individually, by storage size hyp.) the same amount/type of information before and after interaction. Therefore, only as much information can be learned about a system as is supplied to it in the course of an experiment.

This means *all* experiments that determine the information in a system also perturb the system, and furthermore, the more information it is desired to collect, the more the system must be perturbed. Or, *any* experiment in which the system is observed also gives the system information about the observer. It is impossible to have *unidirectional transfer of information*. Here we come to an important practical effect: emission of light. Since light carries information, it must also return information about the receiver to the sender. Thus light must always be incident on a physical object. This logically justifies quantum experiments in which, for instance, the presence of a detector in a light beam's path may be determined without activating the detector. Specifically, if it were possible to measure the extent of reflection of the photon, then a second detector could be placed in the same location as the laser source, or the laser itself could be carefully designed to measure the amount of optical energy it contains. We might consider making a map of the matter density of the universe by performing an experiment in which a light source is scanned across the sky. Then when photon production is seen to fall/rise, it will indicate whether there is matter present to receive the photon. [f1] That is, the photon can be used to tell whether there is an observer on the other side or if that path is unobserved; it cannot differentiate between observers, which imposes the classical speed-of-light limit on traditional (photon exchange) communication. Similarly, all interactions between objects (gravity, electric, magnetic) must only exist between objects in existence and cannot extend outside the observable system. This also includes things like quantized electron wells for emission/reception of light (ie probing electron potential in a sample by light transmission frequencies). From here we have a hint of what must be done:

Any interacting system is experimented upon - the challenge is finding out what the experiment is. For instance a spinning fan is seen to slow down+stop in air, an irreversible loss of "energy". But from an information theory view, this was an experiment in which
  1. The air in the room was used to get information about the fan (speed/direction/shape)
  2. The fan got information about air in the room (speed/density/viscosity)

In the course of the experiment, the information must have been conserved but has been transferred, and the experiment would not repeat by itself. By setting up the experiment in an appropriate way, we can cause information of interest to travel to us, while normally we do not care what information is exchanged back to the sample. In this theory we must consider the latter as well. We can deduce that the information exchange for all common properties is space-dependent, and seems to be decently characterized by exchange between two bodies. The law of conservation of information requires that any information exchange must be *symmetric* with respect to the exchanged quantities. For instance in a general system of N particles, each of which may have a state of 1 or 0, we can say that the total number of 1s and 0s is constant (conserved). We have not defined any mechanism or mode of interaction, so this system is completely static (this represents all classical physics approaches, where an "external observer" describes all states and applies all laws by iteration in an external time-based universe). But if we were to define a dynamic system that conserves the information (experimental measurement of 1s and 0s), any interactions between the particles would have to be redistributions in nature. If we defined a system such that interactions are a physical inevitability, we would have a real world physical object/experiment. So,

_Quantization law_
Any quantized information can only be re-distributed and cannot be altered
Then we see that n-body exchanges are possible, ie 001100 -> 101000. But experimentally, we see that the particles participating in any exchange (that is characterized classically) must be spatially dependent only. For instance a sheet of paper can be used to block out the sun, because the paper atoms will preferentially interact with the eye, excluding the sun and the rest of the universe. Experimentally, it has been convenient to assume that most interactions occur as a sequence of 2-body exchanges rather than the general n-body redistribution.
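A toy simulation of the quantization law (sizes and seed arbitrary): interactions only redistribute quantized bits between sites, so the experimental measurement of the system, the count of 1s and 0s, is conserved throughout.

```python
import random

random.seed(1)

# A toy 'universe' of N sites, each holding one quantized bit.
state = [random.randint(0, 1) for _ in range(20)]
ones_before = sum(state)

# Interactions are 2-body exchanges: information is re-distributed
# between a random pair of sites, never created or destroyed.
for _ in range(1000):
    i, j = random.sample(range(len(state)), 2)
    state[i], state[j] = state[j], state[i]

print(ones_before, sum(state))  # the counts match: information conserved
```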

Information theory: Things can be explained in terms of information (result of all possible experiments {all mutually exclusive/orthogonal experiments, ie each measuring an independent quantity} on a system = system)

Symmetry theory: Systems are defined by their symmetric and asymmetric components (these determine properties of classes of systems without the specific details that make systems unique) - this leads to the possibility of creating simple frameworks and classical physics theories (system = symmetry + asymmetry)

Thus the number of symmetry/asymmetry relations = information about a system. Some thoughts:
  1. Physics theories (frameworks) can only be based on thought + real experiments and because they are symmetric they can only re-cast information given as input -> thus theories have *no* predictive power
  2. Frameworks (system definitions) follow the uncertainty principle - more precise/accurate ones are also less broadly applicable
  3. Causality is a crucial element that must emerge from a combination of symmetry+asymmetry (time+symmetry -> causality, and asymmetry in process -> time)
  4. Frameworks are ubiquitous -> they are a consequence of how nature works and a reflection of underlying symmetry *and* asymmetry (that is context-free)

As described previously, systems are composed of symmetric and symmetry-breaking components. The symmetric components are degenerate in a sense, due to the symmetry itself, thus applying equally throughout the system. In conventional frameworks, the symmetric component corresponds to the "rules" or "laws" governing the system. Typically these are framed as equations, where the equal sign is a representation of the equivalence symmetry (which makes equations so useful, as described earlier). For instance, conservation of mass or F=ma are symmetries of the system. The symmetry-breaking components collapse the symmetric system (of equations or a physical one) to a certain possibility of initial conditions/final outcomes. In F=ma, the distinction between F, m, a is a symmetry-breaking component. These components determine what results an experiment will obtain (referred to as "information" earlier) whereas the symmetric components determine what experiments are meaningful to carry out, or the architecture of the system (of the "storage medium" of the universe if there were one). We note also that the symmetric aspects of the system are the underlying reason that models and computer simulations of reality can be used with certainty to make predictions about the world - even "macro-scale" or human-sized models which have no notion of (ie) atoms or the universe are completely valid. This in turn is enabled by a complex cancellation of underlying elementary symmetries such that only main ones are presented (mostly). Thus the symmetries of Newtonian mechanics are not those of quantum dynamics, which in turn are likely not those of nature.

Why symmetry? For a "universe computer" to make a universe without constraint requires an initial 'blank canvas' or 'nothingness'. Then the only way to define this nothingness is not by any definite properties but by relations, by what it allows – it allows everything but at a symmetric cost, so as to always conserve its own nothingness. The presence of these symmetry relations is the elementary structure of physics. This is an interesting explanation based on [], though I disagree that the information is 'infinitely malleable' as that provides no explanation for the reality of the world we experience - asymmetry is necessary.

The common statement in logic is A->B, B->C, therefore A->C. From an information theory point of view, however, such a statement is inaccurate since it does not conserve information. Namely, the B is lost! The correct way to write the conclusion would be:

A->B, B->C, therefore A->B->C

This way we do not lose track of (ie) the experiment that led to a particular result ("measurement") as is done in scientific papers by referencing sources. However we must go beyond this construct and recognize that the nature of the experiment itself may affect the outcome, even when what is nominally measured is the same. This easily de-mystifies some results of quantum mechanics such as the uncertainty principle - all that is claimed is that the experimental outcome will depend on how the measurement is carried out and on the particle/system history. It is surprising only because, by luck, physics on the human scale mostly cancels this out by various symmetries, or a scale mismatch (observing big objects using tiny photons). However this still occurs on the human scale, such as poll results affected by how questions are worded, machines failing in some cases but not in others, and measurement of distance/speed dependent on what device is used for the measurement. All these examples constitute the B aspect and must not be ignored in a complete information theory.
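The conservation rule above can be put in concrete terms with a small sketch (my own illustration, not from the text): a conclusion is stored as the full chain of steps, and composing two chains keeps every intermediate step rather than collapsing A->B->C to A->C.

```python
# Minimal sketch: an inference is a chain of steps; composition preserves
# the intermediate step B instead of discarding it.

def compose(chain1, chain2):
    """Join two inference chains, keeping every intermediate step."""
    assert chain1[-1] == chain2[0], "chains must share an endpoint"
    return chain1 + chain2[1:]  # keep the shared step (B) exactly once

ab = ["A", "B"]
bc = ["B", "C"]
print(compose(ab, bc))  # the full record: ['A', 'B', 'C']
```

The collapsed form would be just ["A", "C"], which loses the record of B - the experiment or intermediate measurement that linked the two.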

Next, we face the tough question of how to define "measurement". In a very real sense, there is no true absolute measurement - all values we use must be in relation to something else. Thus for an information theory we may say not that "the speed of light is c" but that "the speed of light is n oscillations of atom x measured by experiment y". [f2] Then information theory must be a relative theory.

If information is observed but does not affect anything, there is no proof it was observed in the first place. If it does affect something, that means it will be passed on to other experiments and that a physical law can be formulated to explain how it affected what it did. Thus under any meaningful definition, information *must* be exchanged, never unidirectionally transferred - even by a creator outside the universe, unless the creator just watches and doesn't interfere (if it interferes, we can formulate a physical law/learn its intentions, and thus an *exchange* will have taken place).

It is claimed that, for a useful definition of information, there is no possibility of a unidirectional transfer or observation; by its very essence information must be exchanged. We start with the concept of an external creator and then generalize to real experiments. Suppose a creator has made a model system; nothing stops it from observing the system's properties without the system being aware of the observation. This constitutes a "copying" of information. Now, if the creator/observer makes no use of this observed information whatsoever, there is no way to prove objectively that the observation was carried out - ie there are no effects! It is thus quite meaningless to have this kind of observation, and physically impossible to justify a mechanism that would allow it. The other situation is that the observation is used by the creator to somehow intervene in the model universe. Then the inhabitants of that universe will observe these interventions as physical laws, unaware of the creator's role and unable to find it in any way. Thus, inadvertently, information about the creator (and its universe) is transmitted through these interventions as physical laws or perhaps their violations. A final case is if the creator's observations have no effect on the model universe but have an effect outside the model, in the creator's universe. In this case we must look at the entire life of the model universe and consider the information used - we will find that the information given by the creator to construct/evolve the universe is exactly that which is transparently observed by the creator, and thus again information has been exchanged, through the initial conditions and "eternal" laws of the model universe. Now, instead of a model universe, consider a real experiment carried out in a lab. The classical view is that of the experimenter as creator, choosing, controlling, and observing the experiment.
By the same logic as above, though, we see that such a view is incomplete. For in the very act of picking a system/control variables/adjusting parameters between trials/carrying out multiple trials, the experimenter provides an information input to the experiment. Questionable quality/dubious experiments attempt to get more information out than was put in (ie certain aspects not controlled, chemicals not pure enough) leading to unrepeatable results or false trends. Over-designed experiments use "overkill" tools to get out less information than they could have.

We might also consider another view on this. The first claim is easy to believe: that the experimenter does not observe a sample but rather interacts with it. So in a microscopy measurement I carry out, I "tell the sample" which spot is being tested, what parameters are applied, what the probe method is. And the sample tells me what properties it has. The aspect more difficult to believe is the idea of who is controlling the experiment, or "in the driver's seat". The obvious view is that the experimenter is in charge, starting the experiments, collecting and publishing the data. But in a very real sense, it is the experiment (sample) that is in charge, for the data it provides is what the experimenter will publish, and will determine what the experimenter or other experimenters will do next. This shows how the notion of "in charge" or "controlling" is one that we have defined for our convenience (which has real explanatory power due to symmetries allowing exclusion of the middle segments of large cause->effect chains, a lossy compression algorithm that allows us to interpret/understand the complex world around us) but does not correspond to the reality of countless interrelated influences.

Consider a transformation where the output is the average of the two inputs. This transformation results in an information degeneracy, since the inputs cannot be determined from the output. I ask whether such processes can occur in nature, since they would satisfy the information exchange criterion while providing an analog for the observed 2nd law of thermodynamics - ie the spontaneous increase in entropy (decrease in possibilities for energy extraction).
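The degeneracy can be shown directly (a trivial sketch of my own): the averaging map is many-to-one, so the output alone cannot identify the inputs.

```python
# Hedged sketch: the averaging map (a, b) -> (a+b)/2 is many-to-one,
# so the inputs cannot be recovered from the output alone.

def average(a, b):
    return (a + b) / 2

# Two different input pairs, one output: the inverse image is a whole line.
print(average(2.0, 4.0))   # 3.0
print(average(0.0, 6.0))   # 3.0
# Knowing only the output 3.0 leaves infinitely many candidate inputs -
# an "information degeneracy" in the sense of the text.
```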
A fast gas molecule is introduced into a ring. We expect that collisions with other gas molecules will spread out the momentum such that eventually the molecules travel with uniform velocity:
o-->o		    o					 o->o
				 o					/	 \
o	 o		o	 o		o	 o		o	 o
							 o		\	 /
 o  o		 o  o		 o oo		 o<-o
Normally, elastic collisions would not exhibit this behavior. To observe this, there must be some "information dilution" or degeneracy mechanism that is not a 1 to 1 mapping. For instance, a collision of 1 molecule with 2, or transferring the collision energy into both linear motion and vibration (heat) of the target molecule. A simpler case might be the physical nature of the collision, head-on vs grazing incidence, which splits collision energy among multiple directions of motion/dimensions.
o->o   vs o->	or 	  o
			 o	   o->

We see that head-on collisions (which do not "dilute" information) are less common than grazing ones, and that grazing ones tend to exchange only some fraction of information rather than 100%. That means a particle's initial state (information) gets exponentially diluted, until it carries the same amount of initial information as the other particles, whereupon exchanges no longer transfer net information. A mathematical analysis with a geometric treatment to find the probability (cross-section) of various grazing angles and the resulting information exchange in 3D leads to the Boltzmann distribution (in 1D information is fully conserved (superconductor), and in 2D a non-Boltzmann but similar-looking distribution is observed (thin-film phonons/surface plasmons)).
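The exponential dilution can be sketched numerically. The toy model below is my own construction, not the geometric cross-section analysis mentioned above: each interaction exchanges a fixed fraction of the velocity difference between a random pair, conserving the total while eroding any one particle's "individuality".

```python
# Toy sketch of "information dilution": each pairwise interaction exchanges
# a fraction f of the velocity difference, so a single fast particle's
# excess decays roughly exponentially with collision count.
import random

random.seed(0)
v = [0.0] * 100
v[0] = 1.0            # one "fast" molecule carries all the initial information
f = 0.3               # fraction of the difference exchanged per collision

for _ in range(10000):
    i, j = random.sample(range(len(v)), 2)
    d = f * (v[i] - v[j])
    v[i] -= d          # partial exchange: the momentum sum is conserved...
    v[j] += d

total = sum(v)                 # ...so the total stays at 1.0
spread = max(v) - min(v)       # "individuality" decays toward zero
print(round(total, 6), spread < 1e-3)
```

After many collisions every particle carries an equivalent share of the original information, which is the equilibrium state described in the text.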

A steady-state analogy of the above thought experiment is the transmission of force in a solid material. For instance applying a point load on a surface results in a spread-out surface load on the other side:
   \/	 Point force down
^^^^^^^^ Surface force up
The load, having been spread out in space, can only be concentrated again by having (ie) a sharp object that will re-apply a point load.
I argue that partial information exchange leads to the concept of entropy and the observed macroscopic behavior of systems. Consider a molecule entering a tube filled with similar molecules (a):
		(a)				(b)					(c)			
	 [  o  o  ]		[   o  oo]			[o       ] o->	
o--->[o   o  o]		[o oo    ] o--->	[ o  o   ]  o->	
	 [ o o    ]		[     o  ]			[   oo   ] o->

Complete information exchange will result in a single molecule exiting the tube (b). Partial information exchange will result in diffuse force as in (c). This is the case due to the nature of information *exchange* - when the faster molecule interacts with a slower molecule, the outcome is two molecules that are closer to each other in (the initial molecule's) speed. In theory there might be some type of exchange in which the faster molecule gains more speed at the expense of a slower one, but this is countered by the effect of relativity - the particles interact in their own relative frame, in which both particles are on an equal footing - that is, there is no notion of a faster/slower molecule and thus no "preferred" information flow direction. We see that after a grazing collision, the x-velocities of the molecules are partially exchanged, while the y-velocities (which started equal) remain equal.

o--->   o				o/ v=(0.6, 0.1)			o		o--->
v=(1,0) v=(0,0)			o\ v=(0.6, -0.1)		v=(0,0) v=(1,0)
From this, I argue that the "original" information content of the molecule is exponentially degraded with each interaction, and when this occurs similarly for all the other molecules, the "individuality" of particles exponentially decays to a "group" common state. We call such systems equilibrium systems.
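The head-on vs grazing distinction follows from standard collision mechanics, sketched below under the usual equal-mass elastic assumption (the contact angle is a free parameter of my illustration): only the velocity component along the line of centers is exchanged, so a grazing hit transfers only part of the x-velocity, as in the figure above.

```python
# Hedged sketch: equal-mass elastic collision in 2D. Only the velocity
# component along the line of centers is exchanged; the tangential
# components are kept. A grazing hit therefore exchanges only a fraction
# of the incoming x-velocity.
import math

def collide(v1, v2, angle):
    """angle: direction of the line of centers at contact (radians)."""
    n = (math.cos(angle), math.sin(angle))          # unit normal
    p1 = v1[0]*n[0] + v1[1]*n[1]                    # normal components
    p2 = v2[0]*n[0] + v2[1]*n[1]
    # exchange the normal components, keep the tangential ones
    w1 = (v1[0] + (p2 - p1)*n[0], v1[1] + (p2 - p1)*n[1])
    w2 = (v2[0] + (p1 - p2)*n[0], v2[1] + (p1 - p2)*n[1])
    return w1, w2

head_on = collide((1.0, 0.0), (0.0, 0.0), 0.0)        # full exchange
grazing = collide((1.0, 0.0), (0.0, 0.0), math.pi/3)  # partial exchange
print(head_on)   # ((0.0, 0.0), (1.0, 0.0))
print(grazing)
```

The head-on case swaps the velocities completely (no dilution); the grazing case leaves each molecule with part of the original x-velocity and gives both equal-and-opposite y-velocities, exactly the partial exchange the text describes.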

We may get a sense of the true information content of a system by preventing partial-exchange interactions, or minimizing their extent. This might be done by taking a "snapshot" of a system, or by making use of "information input" that is faster than the "information degradation rate" (which is based on the frequency of interactions). A case of the latter is turbulent flow - the molecules experience irregularities in a wall surface and transmit them but because of the flow speed they are unable to reach an equilibrium state and behave with the complexity expected of a multitude of individual molecules, requiring particle-tracking simulations to understand. Conversely, when system disturbances are slow enough such that information quickly degrades, we see no effect of individual particles. In fact, we see the behavior of one "macroscopic molecule" that behaves in essence like its microscopic counterpart and carries the same information content. Thus we can describe laminar flow with only a few quantities like velocity and location, whereas turbulent flow requires such a collection of quantities for each individual unit in the flow, since they no longer can be said to degrade to an equivalent singular information content. What we observe in a wire is the behavior of a "huge" electron with voltage V, what we see in a water pipe is the behavior of a "huge" water molecule under pressure and other constraints.

Earlier I posed a question whether "larger" systems have more or less information content than "smaller" systems. By larger systems I mean those with an overarching order, such as a radio vs a chunk of material resulting from melting the radio. The answer is not obvious - the radio is in one sense the more complex system, being able to interact with radio waves in a desirable way, but the molten chunk is also complex - having more irregularities in atomic arrangement and such microscopic information. To resolve this consider an analogy with information systems - where a Morse code transmission may be a "large" system while a broadband cable transmission may be a "small" system. Which one "carries more data" is a purely human interpretation - without advanced electronics the Morse code is more info-dense while the cable system is 'white noise', but with advanced electronics clearly more data is being sent through the cable. Consider a binary signal comparison of the two:

Morse	000000....	11111...	000000...
Cable	010011....	10111...	101110...

To the universe, either combination is *equally unlikely*. This is like the fallacy in a roulette game that red or black is 'due'. We are forced to conclude that the information content of both data streams is the same in a universal sense. Yet we have an intuitive problem with this - obviously the long strings of 0s and 1s in the Morse case can be written more simply by compacting them? Thus for 100 0s in a row we could just write, well "100 0s", and 'compact' the transmission. Then, for completeness, we must find a way to convert the number 100 to binary, and also find a way to transmit the idea of the pattern itself to the recipient so that the recipient can recover the original data. This is in essence how zip files work - requiring a special program to open them. To be self-consistent, we *must* reason that the information content of 100 0s and the number '100' along with the pattern to generate 0s in such quantity have an *equivalent* information content. This is a way to find the information content of a pattern, which is identically a mapping, or a *function*. I would claim that to the universe neither the Morse code nor the cable code is preferred - and for this I refer to the *experimental* successes of the concepts of entropy and the ensemble. In effect experiments on gas atoms (for instance) show that they are equally likely to be found in any arrangement - and such a theory leads to testable+accurate predictions. Our intuitive dissonance with the idea that both binary streams above have the same information content stems from how evolution has programmed us to search for "simple" patterns (like repeating 0s) for the universe has no concept of "simple". 
It is our luck that many laws have a simple formulation in terms of our mathematical language, but we have to be cautious in claiming whether a "pattern" exists - it may be "evident" that the repeating 0s can be compacted, but if we were looking for other patterns (like Fibonacci numbers mod x in base 2) we might have seen that the cable code is also described precisely by a pattern. In that case we would concede that *that* pattern (along with initial parameters/length) has the same information content as the original cable code. In other words, the information content of the underlying message cannot change with any given algorithm, and therefore there must be *no absolutely compacting* algorithm (one that will only compact input data). This can also be reasoned by trying to repeatedly feed such an algorithm with its own output - eventually collapsing any data to a number (of repetitions to a null result) - conversely, if this were possible, then the data would have the information content of a number + algorithm. Any algorithm that compacts some data must make other data bigger. But in all cases algorithm+output has the same information content as its input. Finally, returning to our physical system of the radio, we realize that what we consider 'orderly' or 'useful' or 'large' is a purely human construct - the molten chunk may indeed respond in useful and complex ways to inputs that are unknown to us, just as cable-receiving electronics were unknown to previous generations. We must conclude that both systems have the same information content - though widely different in their human applicability - which gives further reason to believe the earlier assumption of information conservation (this extends to a per-particle information model by repeatedly splitting the system and finding the result unchanged - thus the info of a particle is also conserved).
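The "no absolutely compacting algorithm" point can be illustrated with run-length encoding (my own example, standing in for any compression scheme): it shortens the Morse-like stream dramatically but does essentially nothing for the cable-like one, whose run count is close to its bit count.

```python
# Sketch of the pigeonhole argument in the text: run-length encoding
# shortens the "Morse-like" stream but barely touches the "cable-like"
# one; no lossless scheme can shrink every input.

def rle(bits):
    """Encode a bit string as a list of [bit, run_length] pairs."""
    out = []
    for b in bits:
        if out and out[-1][0] == b:
            out[-1][1] += 1
        else:
            out.append([b, 1])
    return out

morse = "0" * 100 + "1" * 50 + "0" * 100
cable = "0100111011101011"
print(len(rle(morse)), "runs for", len(morse), "bits")   # 3 runs for 250 bits
print(len(rle(cable)), "runs for", len(cable), "bits")   # 10 runs for 16 bits
```

The pattern description ("100 zeros") plus the decoding rule together carry the same information as the raw stream, which is the equivalence the text argues for.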

So far I might say:

Every day we are exposed to the effects of causality. Typically cause and effect occur on the same distance/time scales and are easy to tell apart, but sometimes a mysterious 'random' occurrence takes place, a glimpse of chaos underlying our imagined orderly world - turbulence, lightning, fracture - I call these 'upconversion processes' since they allow microscopic effects to have macroscopic consequences. Similarly there are 'downconversion' processes in which the microscopic particles are carried along by the overwhelming momentum of the larger system - global climate change, pressure drop causing turbulence (via external pump), loading of material causing brittle fracture. As considered above, 'large' and 'small' are human constructs that make things easier to explain, but in a universal sense these processes should not be viewed as mysterious - they are simply a manifestation of the ever-present information exchange occurring across all 'scales'. Nonetheless these processes have clear power (from human-operated machinery (controlling hand-operated levers applies huge forces) to computers (tiny transistors having a macroscopic impact) to weather (a big system has impact on our tiny brain cells)) which comes not just from simple exchange of information, but from the ability to *control* the exchange of information. Consider a simple upconversion process - a hydraulic valve. When shut, it disables/interrupts information exchange between two reservoirs. When open (which can take little/no work) the information exchange is allowed, leading to fluid motion and exerted forces. Consider a simple downconversion process - climate change, which as it changes limits and constrains the actions of smaller beings like ourselves, affecting what we can and cannot do without doing any nominal work.
Such conversion processes are perhaps more appropriately termed 'control' processes (later I consider the point that all processes are control processes to some extent) since they control what the underlying system can and cannot exchange information with. Although it seems strange to refer to brittle fracture as a 'control' process, this is what happens in a sense - the system has been set up externally (load) such that a particular defect or crack has control over whether the rest of the structure shatters or stays together - similar to the control of electrons by a semiconductor or the aforementioned hydraulic valve. The control processes do not necessarily require an expenditure of power, and could in fact be used to generate power (regenerative braking). However we may still find that these processes result in an overall increase in entropy (total information uniformity). [f3] At this point we come to the more genuinely mysterious, entropy-reducing processes. I cannot offer convincing proof that these exist, however I would argue an obvious example is the gravitational accumulation of matter (starting as diffuse gas/dust) into orderly planet systems, stars, and galaxies, a clear opposite of the classic thermodynamics problem where a gas starts localized on one side of a partition and spreads to cover all space available to it. I would claim that the action of gravity is an entropy-reducing process. Might there be others? The most unlikely emergence and evolution of life, and its persistence, suggests that other such processes act in (perhaps) chemical bond/structure formation, or even in nuclear reactions. This remains an interesting issue for future consideration.

Entropy could be seen as "energy purity", and at this point I seek to elaborate on the concept using the quantum mechanical bath-coupled oscillator. Briefly, entropy arises naturally from the QM tendency of energy to seek all possible states, and occurs when the path of "desired" (human-useful) energy propagation is not very well defined by appropriate conductors and isolators. Interpreted in this way, a number of ambiguities are removed in terms of "usefulness" of energy, or "disorder", and the process is more general. Energy can shift between forms (oscillators) given an appropriate QM coupling between these forms/states, and will stay in one state in the absence of such coupling. So, a gasoline tank does not spontaneously heat up, nor does a moving piston. By introducing elements to the path which are a potential energy coupling (anything interacting between stationary and moving parts) such as bearings or rollers, or insulators or length along a real (resistive) wire, we enable some of the energy (and, in time, all of the energy) to couple into all available states, of which a thermal bath is a great sink because of so many degrees of freedom. [f4] Yet other elements, like the combustion chamber in an engine and electric motors, specifically are designed to convert between types of energy. Since we can convert energy types ideally with 100% efficiency (KE-PE device - pendulum/orbiting planets) we see that the fact that all states are sought does not mean there will be an even energy split among them (on average/ensemble, though, this is true: see equipartition theorem). Then, machines and transmission elements can exist that function with 100% efficiency.

A question occurred to me when using a belt sander to shape an aluminum part: I am holding the metal piece against the belt and it takes minimal effort, yet the relativity principle says this is no different from moving the piece against a stationary belt, which would take a lot of effort. How does "the universe" know that when I use the belt sander I don't have to put in effort - and where does the effort come from? Earlier I mentioned that absolute energy is questionable while energy difference is definite. And there is one energy difference here: between the metal and the sanding belt, which move at different speeds. Putting the two in rubbing contact against each other will tend to reduce rather than increase the energy difference. But in practice this energy difference persists over time - why? It must be counterbalanced (like a loop, see below) by another energy difference on the belt or metal. If I move my arm relative to my body, I create an energy difference at the point of unequal motion - in the muscles and joints. If I stay stationary, I close the force loop but not the energy difference loop. With the belt sander powered on, the motor closes the energy loop - it contains the fastest component in the sander - the rotating electric field - which the rotor tries to attain, by the same mechanism as placing the metal against the sandpaper will tend to equalize their speeds. From this view it is easy to see that bearings and belts, which exhibit relative motion, all will affect the energy difference balance one way (or another). My body does not exhibit internal relative motion, so I feel no effort. The fast motion of the e-field in the motor is in turn balanced by even faster motion of some generator in the power grid, and so on to some complicated questions (where does it all come from?).
So in using the belt sander, I am saying "I think the grid's e-speed should be zero", which propagates as a slight phase shift onto the grid, and eventually onto the generator which then slows down accordingly (and so on to the "ultimate power source"), which is quite impressive in that I am tapping into a huge energy-difference loop and readily affecting it. The process looks fractal in nature, and we will continue to see such fractal shapes in further exploration:

.---\										/---.
.---\---.---\						/---.---/---.
.---/---.---\===.===\		/===.===/---.---\---.
.---/					[-]					\---.
.---\					[+]					/---.
.---\---.---/===.===/		\===.===\---.---/---.
.---/---.---/						\---.---\---.
.---/										\---.
atoms	gene-	grid	motor	sand	metal	atoms
(heat)	rator	wires			belt	pieces	(heat)

.---\										/---.
.---\---.---\						/---.---/---.
.---/---.---\===.===\		/===.===/---.---\---.
.---/					[-]					\---.
.---\					[+]					/---.
.---\---.---/===.===/		\===.===\---.---/---.
.---/---.---/						\---.---\---.
.---/										\---.
atoms	cells	muscle	metal	sand	metal	atoms
(heat)	(body)	action			belt	pieces	(heat)

While traveling abroad, walking to a metro station to get to the airport, I thought about how the transportation network is set up like a fractal. One of the measures for "optimal" fractal systems is equal time spent along each path segment: for a binary list search this implies a fully balanced/symmetric tree, so for each member the search time is constant and 1 unit is spent on each successive level. Perhaps something similar can apply to transportation. So I would spend the same time (say) walking/on bus/subway/airplane - which is not really true in reality, but all within an order of magnitude. I spend most time on the airplane, which actually makes this faster - the fastest limit is a plane door-to-door. Thus we see such a fractal arrangement is not the fastest time for me but rather the fastest time given all other users, perhaps shortest distance traveled or shortest distance of infrastructure - but practically, equal *loading* of infrastructure - an impedance match of inflow and outflow capabilities throughout. In such a system, how could I distinguish where on the fractal I am, for purposes of measuring and optimization? Ideally there would be no absolute fractal level, but still we can distinguish between straight segments (edges) and junctions (vertices/points), and this (along with time) gives a metric for the length of each fractalline segment on the journey. This distinction is made by the notion of convergences: on a segment, all nearby pieces evolve together in a coherent fashion, whereas at a junction, many pieces that were not previously nearby (in their evolution) are joined and become nearby, or vice versa the nearby pieces are split up into not-nearby pieces that no longer interact.
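The balanced-tree claim can be checked directly with a small sketch (my own, using binary search on a sorted list as the stand-in for a balanced tree): every member is reached in about log2(n) steps, one unit per level.

```python
# Sketch of "equal time per level": in a binary search over n sorted items,
# every lookup crosses at most about log2(n) levels.
import math

def search_steps(sorted_items, target):
    """Return the number of halving steps needed to locate target."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

items = list(range(1024))
counts = [search_steps(items, t) for t in items]
# every member is reached in at most log2(1024) + 1 = 11 levels
print(max(counts) <= math.log2(len(items)) + 1)   # True
```

Doubling the list adds only one more level, which is the "equal unit per successive level" property the text attributes to optimal fractal systems.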
As above, drawn out this makes tree-like (with soil roots and air roots) diagrams for most engineering/power processes, considering both the microscopic sources of power and microscopic reservoirs of heat but a single path of evolution through a dissipation pathway like a motor. At the same time this distinction is not really satisfactory - since in reality the particle trajectories do not ever 'converge' onto a single line - that would be symmetry-breaking, so we can only claim a fuzzy "nearby" area - but what constitutes nearby is not very clear. This is to say, when can we conclude that given particles are on a straight segment? Perhaps only by saying they do not interact with outside systems by joining or separating - namely thermodynamically isolated. All the particles in an isolated/inertial system will evolve coherently, and at fractal vertices/points will evolve by interacting with other systems. For example, a bullet fired at an object: inertial when flying, then creating a cascade of inertial/collision interactions (and originating from an "inverse cascade" from the brain neural net to the striking pin of the gun).

The compilation of this text in itself was only possible due to fractal structures. I had filled up a number of notebooks with ideas, meaning to eventually write them down in a coherent way, but somehow not being able to start until I realized that to arrange everything I should go through all the notes and make a list of topics, then group them by theme, then go "in the weeds" and start working on the text. Without this fractal approach, I literally could not have written this book - it is too much information to handle without ordering by groups and subgroups. Doing such ordering is a physical requirement. [f5] There is some deep principle here about how information works and can be handled. It seems that fractal-like structures thus abound in the surrounding world (trees, centralized networks, concentration of control, energy flows). The example of turbulence comes to mind: similar patterns over many length scales, recursive structures. And what is the point of turbulence? To increase entropy, to distribute the atoms in an optimal energy-dissipating way given the external boundary conditions. The fractal recursive structure is a hallmark of such entropy optimization. I argue, life on earth is also a complex solid-state (vs gas in clouds) form of entropy optimization, which is why we see fractal structures spontaneously form - as they are most effective at this optimization and are a stable limiting case for evolution. Consider consciousness itself as it functions within the brain: information is collected from multiple points, centralized, and sent back to multiple points, making a fractal-like structure of the instantaneous qualia. 
Consider the communication networks we've established (internet, TV, business inventory/sales, postal services, transportation and tourism), which all serve to enable life's needs and thus increase entropy in an optimal way in their own sense, which end up as fractals because such a shape is most effective/useful for us to choose to construct. I tried thinking of a way to create an internet service that is fully distributed and decentralized - where each user communicates with others nearby. It works well enough for small localized groups, but scaling this up to any modern usable scale becomes quickly impractical but a solution awaits: make some nodes 'central', those can carry/tunnel more information between distant points, but then information about who should contact which central nodes needs to be distributed, and we quickly form a fractal, centralized network - just like what the existing ISPs and cell phone companies have already established. It is physically impossible to satisfy the needs of large-scale entropy optimization/communication/information exchange (which are all one and the same) without a fractal structure to control the flows. At small scales, a distributed and fair/decentralized solution works, just like laminar flow in pipes, but at a certain turning point centralization and emergence of complex, interrelated, recursive structures takes place - turbulence, life, intelligence, multi-variable optimization, evolution and memory and prediction, stable and unstable systems. Business (mines->processing->manufacturing->warehouses->stores->individuals), government (president->lower ranks->administrators->affected individuals->all individuals), armed forces (general->lower ranks->lowest ranks->affected individuals), transport (walk->bus->metro->airplane->car->walk), internet (PC->router->big router->router->PC), research itself (knowledge sources->relevant knowledge->applications) are some fractals in man-made living systems. 
In nature: lightning, trees/plants, clouds/storms, brooks/streams/rivers, ocean flows, animals (resources->animal->wastes), solar energy flow, can all be described in fractal terms. Are there structures which are not fractal? Invariably they are either inanimate (not participating in information dissipation), or at small scales, or inefficient and susceptible to takeover by better fractals: family is a loose network though there is a tendency to have an 'oligarch' in charge (centralized); school classrooms/coworkers/peer groups tend to be loose, though still a social 'pecking order' emerges; the road networks in old cities tend to be more egalitarian/distributed but also slow, to the point that some are converted into bigger/faster roads to enable more effective travel, resulting in a centralized structure; neighborhoods and individual houses tend to be more on an equal/non-fractal footing, though still things like homeowners' associations emerge and the government extends its fractal arm to provide some oversight/control. Fractal structures are unstable in the absence of entropy flows and automatically emerge in the presence of entropy flows; fractals are life, maybe even consciousness itself.
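The scaling argument behind the decentralized-internet example can be made quantitative with a simple count (my own illustration): a flat network where everyone links to everyone grows quadratically, while a hub-and-spoke layout grows only linearly, which is why centralization wins at scale.

```python
# Sketch of why a fully decentralized network stops scaling: direct
# pairwise links grow as n*(n-1)/2, while a hub-and-spoke (fractal)
# layout needs only about n links.

def mesh_links(n):
    return n * (n - 1) // 2   # every user wired to every other user

def hub_links(n):
    return n                  # every user wired to one central node

for n in (10, 1000, 100000):
    print(n, mesh_links(n), hub_links(n))
```

At 10 users the flat mesh is manageable (45 links); at 100000 it needs billions, while the centralized layout still needs only one link per user - the "turning point" toward recursive, centralized structure described above.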

Consider me pushing on a wall.
 o  ||
/ \ ||

There is a force loop established: my forward push is balanced by the backward push my feet exert on the ground. Half of the loop is in my body - between hand and feet - and half is in the wall and floor. The rest of the earth doesn't need to know about this loop - it is not radiated. The loop ensures conservation of force and momentum; I believe a virtual loop can be drawn for any conserved quantity, which may be elegant because loops can be seen as particles (string theory vs loop theory?). But how does this loop get created? It is not instantaneous, so what causes that specific loop shape to take form? What is the situation in the moments before the loop is closed - can I do things beyond my typical F=ma abilities? Consider a tall ledge, similar to the wall above:

 o  ||
/ \ ||
TTT ||
TTT ||
TTT ||
TTT ||
TTT ||

The force loop now takes a long time to propagate all the way around the structure. Am I free to move the wall as I please until the loop closes? This has close parallels to transmission line theory and could be calculated in those terms. When touching the wall to establish the loop, some energy will be lost as 'radiated' vibrations - how do the mechanics of this work? Other examples of loops: electric power cords/lines, magnetic flux lines, e-field lines, using a power tool, static shock, buildings and structures (using gravity as one half of the loop), pressurized devices (more of a surface than a loop, but still works), metal presses, and hammers (using inertia as half of the loop). Based on the fields used in loops, I would claim that loop closure can be achieved through electric+magnetic or gravitational+inertial fields - in short, e- and g-fields. An ice skater pushing off a wall pushes against his inertia, causing loop closure to the wall/earth inertia through the g-field. This 'loop line' will go through the air, like the field lines of a magnet. [f6] Stronger arm-wall forces can be achieved with a higher skater mass (greater radiating efficiency of force loop lines into the air) or by rigidly standing on the floor (directly coupling the force loop through the solid floor). Energy differences can also form loops, where like magnets objects will tend to 'align' to dissipate energy in the same way. Thus the energy created by the skater's arm muscle is balanced by g-field closure on the skater's body (and wall).
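The transmission-line estimate suggested above can be sketched numerically. In this toy model (my own, not from the text) the push is treated as a longitudinal stress wave travelling at the material's speed of sound, so the loop-closure time is just path length over wave speed; the sound speeds are approximate textbook values and the 50 m path is an arbitrary example for the tall-ledge sketch.

```python
# Approximate longitudinal sound speeds, m/s (textbook ballpark values)
SOUND_SPEED = {
    "steel":    5900.0,
    "concrete": 3700.0,
    "wood":     3800.0,
}

def loop_closure_time(path_length_m, material):
    """Time for a stress wave to traverse the force-loop path once,
    i.e. how long before the pusher 'feels' the closed loop."""
    return path_length_m / SOUND_SPEED[material]

# Example: a 50 m round-trip path through a concrete structure
t = loop_closure_time(50.0, "concrete")
print(f"{t * 1e3:.1f} ms")  # → "13.5 ms"
```

So for a building-scale structure the window before the loop closes is on the order of milliseconds - short, but nonzero, which is what makes the "what can I do before closure?" question physically meaningful.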

Everything in the universe must be interconnected. All events must affect all other events, making a coherent intertwined whole, like threads fitting together to make a piece of fabric - any loose threads fall out and make no contribution. If some event did not affect other events, it would be possible to draw an 'isolation boundary' and split the universe into separately evolving parts, which I assume is disallowed under the concept of a single universe. Even testing for the presence of such a boundary already breaks it: once I know that some part of the universe hasn't interacted with another, that part has now interacted with my measurement, and my measurement has interacted with the rest of the boundary; so as long as I am in the universe and observe the measurement, the boundary gets coupled to my actions and is leaked/breached. Every action will affect every other action, future and past (though regarding the past, we like to think in terms of causation, so we say only future actions will be affected by present actions, while recognizing that present actions have been determined by past actions; without assuming a time arrow, we could say equivalently that every action spreads out into future and past influences, and across all of space).

[f1] I no longer agree with this and see it as a refutation of the photon model. Instead light is emitted locally and exists as field oscillations; the information provided to an emitter is not about the receiver but about its local field surroundings.

[f2] the SI system achieves this, here it is important to see that the resulting units are a matter of convenience for communication and not absolute physical truths. The units themselves are an emergent feature of underlying universal symmetries which we have deemed particularly useful.

[f3] precluding a controller like Maxwell's demon

[f4] an "efficient" process will make it very easy for energy to couple into human-useful reactions and hard to couple into the thermal bath (mostly useless). The lack of degrees of freedom eliminating thermal bath coupling is why superconductors have to be kept cold.

[f5] the brain is intrinsically good with hierarchical structures/trees, and interpreting info in this (fractal) way helps us pick a course of action (this is why computer file systems, of all possible ways to implement and display them, employ the notion of folders and subfolders). Also seen in science to understand complex phenomena (ie split up into parts, most impactful on top level, then more and more specific on lower levels).

[f6] though it's not detectable in air (I don't think?) so this point may be dubious. Maybe air flow around objects is an indication of force? This doesn't work for the wall-pushing example
