Entropy transformers

What is the relationship between information, causation, and entropy?

The other day, I was reading a post from Corey S. Powell on how we are all ripples of information. I found it interesting because it resonated with my own understanding of information (i.e. it flattered my biases). We both seem to see information as something active rather than passive. In my case I see it as fundamentally related to causation itself, more specifically as a snapshot of causal processing. Powell notes that Seth Lloyd has an excellent book on this topic, so I looked it up.

Lloyd’s 2006 book is called Programming the Universe, which by itself gives you an idea of his views. He sees the entire universe as a giant computer, specifically a quantum computer, and much of the book is about making a case for it. It’s similar to the “it from qubit” stance David Chalmers explores in his book Reality+. (I did a series of posts on Chalmers’ book a while back.)

One of the problems with saying the universe is a computer is that it invites an endless metaphysical debate, along with narrow conceptions of “computer” that lead people to ask things like what kind of hardware the universe might be running on. I’ve come to think a better strategy is to talk about the nature of computation itself. Then we can compare and contrast that nature with the universe’s overall nature, at least to the extent we understand it.

Along those lines, Chalmers argues that computers are causation machines. I think it helps to clarify that we’re talking about logical processing, which is broader than just calculation. I see logical processing as distilled causation, specifically a high degree of causal differentiation (information) at the lowest energy levels currently achievable, in other words, a high information to energy ratio.

The energy point is important, because high causal differentiation tends to be expensive in terms of energy. (Data centers are becoming a major source of energy consumption in the developed world, and although the brain is far more efficient, it’s still the most expensive organ in the body, at least for humans.)

Which is why computational systems always have input/output interfaces that reduce the energy levels of incoming effects from the environment to the levels of their internal processing, and amplify the energy of outgoing effects. (Think keyboards and screens for traditional PCs, or sense organs and muscles for nervous systems.)

Of course, there’s no bright line, no sharp threshold in the information / energy ratio where a system is suddenly doing computation. As a recent Quanta piece pointed out, computation is everywhere. But for most things, like stars, the magnitude of their energy level plays a much larger role in the causal effects on the environment than their differentiation.

However, people like Lloyd or Chalmers would likely point out that the energy magnitude is itself a number, a piece of information, one that has computational effects on other systems. In a simulation of that system, the simulation wouldn’t have the same causal effects on other physical systems as the original, but it would within the environment of the simulation. (Simulated wetness isn’t wet, except for entities in the simulation.)

Anyway, the thing that really caught my eye with Lloyd was his description of entropy. I’ve covered my struggles before with the customary description of entropy as the amount of disorder in a system. Disorder according to whom? As usually described, it leaves the question of how much entropy a particular system has as observer dependent, which seems problematic for a fundamental physics concept. My reconciliation of this is to think of entropy as disorder for transformation, or in engineering terms: for work.

Another struggle has been the relationship between entropy and information. I’ve long wanted to say that entropy and information are closely related, if not the same thing. That seems like the lesson from Claude Shannon’s theory of information, which uses an equation similar to Ludwig Boltzmann’s for entropy. Entropy is a measure of the complexity in a system, and higher values result in a system’s energy gradients being fragmented, making much of the energy in the system unavailable for transformation (work), at least without adding additional energy into the system.
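To make the parallel concrete, here's a rough sketch in Python (the probabilities and microstate count are made-up illustrative numbers, not anything from Shannon or Lloyd):

```python
import math

# Shannon: H = -sum(p_i * log2(p_i)), measured in bits.
def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))    # a fair coin: 1.0 bit
print(shannon_entropy([0.99, 0.01]))  # a heavily biased coin: ~0.08 bits

# Boltzmann: S = k_B * ln(W), the same log-of-possibilities shape,
# where W is the number of microstates compatible with the macrostate.
k_B = 1.380649e-23          # J/K
W = 1e20                    # made-up microstate count
print(k_B * math.log(W))    # entropy in joules per kelvin
```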

However, people like Sean Carroll often argue that a high entropy state is one of low information. Although Carroll does frequently note that there are several conceptions of “information” out there. His response makes sense for what is often called “semantic information”, that is, information whose meaning is known and useful to some kind of agent. The equivalence seems stronger for “physical information”, the broader concept of information as generally used in physics (and the one that causes hand wringing over the possibility of black holes losing it).

Lloyd seems to be on the same page. He sees entropy as information, although he stipulates that it’s hidden information, or unavailable information (similar to how energy is present but unavailable). But this again seems to result in entropy being observer dependent. If the information is available to you but not me, does that mean the system has higher entropy for me than it does for you? If so, then computers are high entropy systems since none of us have access to most of the current information in the device you’re using right now.

My reconciliation here is to include the observer as part of the accounting. So if a system is in a highly complex state, one you understand but I don’t, then the entropy for the you + system combo under consideration is lower than the entropy for the me + system combo. In other words, your knowledge, the correlations between you and the system, makes the combined you + system more ordered for transformation than the me + system combo. At least that’s my current conclusion.
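One way to sketch that accounting is with conditional entropy: the uncertainty that remains about the system once the observer's knowledge is included. A toy example, with an invented joint distribution:

```python
import math
from collections import defaultdict

def H(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Invented joint distribution over (system state S, observer's knowledge K).
# K is strongly (but not perfectly) correlated with S.
joint = {
    ("s1", "k1"): 0.45, ("s2", "k2"): 0.45,
    ("s1", "k2"): 0.05, ("s2", "k1"): 0.05,
}

p_s, p_k = defaultdict(float), defaultdict(float)
for (s, k), p in joint.items():
    p_s[s] += p
    p_k[k] += p

H_S = H(p_s)                      # uncertainty about the system alone
H_S_given_K = H(joint) - H(p_k)   # chain rule: H(S|K) = H(S,K) - H(K)

print(H_S)          # 1.0 bit: considered in isolation, maximally uncertain
print(H_S_given_K)  # ~0.47 bits: much less stays "hidden" once the
                    # knowledgeable observer is part of the accounting
```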

But that means for any particular system considered in isolation, the level of entropy is basically the amount of complexity, of physical information it contains. That implies that the ratio I was talking about above, of information to energy, is also of entropy to energy. And another way to refer to these computational systems, in addition to information processing systems, is as entropy processing systems, or entropy transformers.

This might seem powerfully counterintuitive because we’re taught to think of entropy as bad. Computational systems seem to be about harnessing their entropy, their complexity, and making use of it. And we have to remember that these aren’t closed systems. As noted above, they’re systems that require a lot of inbound energy. It’s that supply of energy that enables transformation of their highly entropic states. (It’s worth noting that these systems also produce a lot of additional entropy that requires energy to be removed, such as waste heat or metabolic waste.)

So computers are causation machines and entropy transformers. Which kind of sounds like the universe, but maybe in a very concentrated form. Viewing it this way keeps us more aware of the causal relations not yet captured by current conventional computers. And the energy requirements remind us that computation may be everywhere, but the useful versions only seem to come about from extensive evolution or engineering. As Chalmers notes in his book, highly computational systems don’t come cheap.

What do you think? Are there differences between physical information and entropy that I’m overlooking? And how would you characterize the nature of computation? Does a star, rock, or hurricane compute in any meaningful sense? What about a unicellular organism?


49 thoughts on “Entropy transformers”

  1. Heh… This is a HUGE subject and at present I don’t have the time (or energy) to attempt a properly coherent response. So here are some immediate reactions which may or may not be of interest.

    I am reminded of Prigogine’s remark that we are counter-currents (or if you prefer, backwash) in a stream of energy running down the entropic slope.

    Physically, I see causation as exchange of physical information (defined as that which makes physical configurations distinct). All else is causal story-telling, in the language of the appropriate emergent (reductive) level.

    The simple notion of entropy as disorder is clearly wrong. Ice is less disordered than liquid water yet it is “downslope” on the entropic gradient. Entropy is better understood as a measure of the number of possible microscopic arrangements (microstates) of a system that result in the same macroscopic state. Though this invites the question of how the “sameness” of macroscopic states is to be defined. Which leads us to the notion of emergence – on emergent (higher reductive) levels macroscopic states are definitional (e.g. pressure).
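    To put a number on "how many microscopic arrangements give the same macroscopic state", here's a toy two-level model (nothing to do with ice or water specifically, just the counting idea):

```python
import math

# N two-state particles. Take the "macrostate" to be how many are excited;
# the microstate count W is a binomial coefficient, and S = k_B * ln(W).
k_B = 1.380649e-23  # J/K
N = 100

for n_excited in (0, 10, 50):
    W = math.comb(N, n_excited)
    print(n_excited, W, k_B * math.log(W))

# The fully ordered macrostate (0 excited) has exactly one microstate and
# zero entropy; the evenly mixed one (50 of 100) has by far the most
# microstates and hence the highest entropy.
```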

    Following Davidson, I see emergence as a lack of type-type translatability between emergent/reductive levels, which is consistent with strict causality, which works on a token-token basis. My go-to example: spaceships in Game of Life — an emergent (definitional) type, which cannot be characterised in the lower level language of the game’s mechanics (because GoL is Turing complete, AFAIK, as are other known computational examples of emergence).
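    For readers who haven't met the example, here is a minimal sketch of a Game of Life glider: the rule below only mentions cells and neighbour counts, yet the pattern reappears shifted diagonally by one cell every four steps (the coordinates are arbitrary):

```python
from collections import Counter

def step(live):
    """One Game of Life step on a set of live (x, y) cells (B3/S23 rule)."""
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

g = glider
for _ in range(4):
    g = step(g)

# After four low-level updates, the "spaceship" is the same shape, one cell
# over: a regularity stated at the level of patterns, not of the rule.
print(g == {(x + 1, y + 1) for (x, y) in glider})  # True
```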

    Is the universe a computer (as in whatever performs computation)? I agree with Penrose that this is an open question, while disagreeing with his attempt to use the halting problem to prove that minds must be non-algorithmic.

    Computers as causation machines? All our creations are causation machines. OTOH, considering computation as such, what about the fact that mixing in non-causal randomness can speed computation, though not increase its scope?

    Liked by 2 people

    1. Hey, you did well off the cuff. (And off the cuff is fine. I actually prefer it. When people are too serious about the composition of their remarks, they tend to expect it from others. I prefer casual friendly conversation.)

      I like what I’m hearing about that Prigogine remark. I often have the same thought.

      Causality as an exchange of physical information is close to my view, but it depends on whether we see information as separate from action or intertwined with it. Of course, we engineer our systems with them as separate, but it’s not a distinction biology seems to care much about.

      The microscopic-states-to-macro-states definition is one I often hear. And it’s compatible with Lloyd’s hidden or unavailable information view. But as you note, the question is what counts as “macroscopic”. It seems relative to an agent’s ability to model its environment, in other words, observer dependent. Although as I noted in the post, if we include the agent / observer as part of the system in question, then it can work. But it seems like there would be no entropy for Laplace’s demon.

      Yeah, can’t say I’m a fan of Penrose’s work in the area of minds. Even if an objective collapse model of QM turns out to be accurate, for it to be something that can’t be incorporated into future computers, it would need to essentially be un-computable, that is, forever beyond understanding according to principles. We can never rule something like that out, but it seems more productive to always assume there are discoverable principles.

      How can non-causal randomness speed computation? Is this a reference to quantum computing? If so, I think the special sauce there is parallel processing, albeit of a variety that needs to promote the right answer at the end throughout the parallel paths.

      Liked by 1 person

      1. Well, it’s a subject I keep returning to every few years, so pulling out a bunch of familiar slogans wasn’t exactly taxing. 🙂

        I prefer not to bring observers into the picture. It makes subjectivist misunderstandings too likely and I think there is a better way. Distinct reductive/emergent levels have their own natural primitives — what philosophers refer to when they speak of “carving nature at its joints” –which are independent of observers. E.g. Boyle’s law relating pressure, temperature and volume of gas is objectively a simple, good approximation of gas properties at our macro-scale. Any configuration with the same values of these three quantities is at our scales effectively the same macro-scale system having a large number of micro-states. This is a fact, independent of any observers.

        So the sameness criterion arises naturally at different emergent/reductive levels, even though it may not be possible to characterise it in the terms natural to the underlying level of micro-states (except as a humongous, possibly infinite list of special cases), which is a hallmark of emergence.

        BTW, I might have given the wrong impression of being dismissive about “causal story-telling” as opposed to exchange of physical information. However, a causal story is the natural and appropriate account of causality on a given level, giving rise to Weinberg-style effective theories.

        The rather deeper question, which keeps bringing me back to the subject, is why there are distinct reductive/emergent levels permitting their own causal story-telling. It is far from obvious (to me at least) why there should be any. Stuart Kauffman claims that such spontaneous higher level order is inevitable, but seductive as the argument is, I am unsure that his exploration of network effects generalises to physics.

        Re randomness and computation… No, I didn’t mean anything exotic. I am referring to the simple (and surely well known) fact that e.g. the expected performance of randomised quicksort (with a random pivot) exceeds that of deterministic quicksort (with a fixed pivot rule) on the latter’s worst-case inputs. Yes, I know, one can use pseudo-randomness as a stand-in for true, ontic randomness, but it is a stand-in, qualified by being pragmatically indistinguishable from the real thing.
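        A minimal sketch of that effect, using the textbook quicksort case (the pivot rules and input below are just illustrative): a fixed first-element pivot degrades to roughly n²/2 comparisons on already-sorted input, while a random pivot typically stays near n log n.

```python
import random

def quicksort_comparisons(data, randomized):
    """Count comparisons made by an in-place quicksort with either a fixed
    (first-element) pivot or a uniformly random pivot."""
    a = list(data)
    comparisons = 0
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        p = random.randint(lo, hi) if randomized else lo
        a[lo], a[p] = a[p], a[lo]          # move chosen pivot to the front
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):    # Lomuto-style partition
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons

already_sorted = list(range(2000))
print(quicksort_comparisons(already_sorted, randomized=False))  # ~2,000,000
print(quicksort_comparisons(already_sorted, randomized=True))   # typically a few
                                                                # tens of thousands
```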

        Liked by 1 person

        1. I’m far less sure that emergence, in the way you describe, is strictly objective. To me, it seems more about the need for our primate brains to switch models at certain scales to make accurate predictions. Certainly other brains that exist at similar scales will have similar points they have to shift, but I’m skeptical these boundaries are objective.

          I will note that one benefit of my view is I don’t wonder about why those levels break where they do. They break where they do due to the limitations of our cognition.

          Of course, I don’t have access to a mind that works at different scales to prove my point, so I could be all wet.

          Ah, I see what you mean now on the randomness point. Thanks!

          Liked by 1 person

          1. Does it really not strike you as surprising that at our scales the everyday behaviour of matter simplifies to a small number of equations so simple that, despite our primitive brains, they can be taught to children? Is that startling simplicity itself not an objective fact? Whatever more complex regularities we might or might not be missing, the simplicity of e.g. Boyle’s law is a fact. We have built a lot of technology on that trivial relationship — from steam engines to heat pumps. And they work, observers or no observers.

            Liked by 1 person

          2. I’d say whether a model is predictive is an objective fact. But it seems like there are always multiple predictive models that work. I’m not sure there’s any fact of the matter which among those models is the one true one. Although one of them might be the easiest for us to understand and work with at particular scales.

            Note that this is a separate issue from parsimony. We can have multiple parsimonious models, like the Heisenberg vs Schrödinger pictures in our other discussions.

            Liked by 1 person

          3. Yes, but *why* should such a simple model be predictive? The whole of highly successful 19C physics (deemed by experts to be virtually complete, except for a couple of minor mysteries) rested on a few simple models which just happened to occur on the same macro-scale? Some coincidence!

            We’ll obviously have to disagree on that one.

            Liked by 1 person

  2. @selfawarepatterns.com
    Have you seen/read @johncarlosbaez's "What is entropy"?
    https://mastodon.social/@johncarlosbaez@mathstodon.xyz/112830082093414400

    His approach may have some answers. I am still working through it!

    Liked by 2 people

      1. @selfawarepatterns.com – you'll see on page one that I steer clear of saying entropy is disorder, which is not a good explanation.

        Liked by 2 people

        1. @johncarlosbaez @selfawarepatterns.com 1/3) Nice read! I’ll try to dissect the statement “computers are causation machines and entropy transformers” in my comment, but first:
          1) Can we liken a computer to a neural network (NN)? Probably, because NNs are universal approximators (Cybenko, 1989). Here, I’m taking the NN to be a multilayer perceptron.

          Liked by 2 people

          1. @johncarlosbaez @selfawarepatterns.com 3/3) Finally,
            3) Are NNs entropy transformers? Probably, because of the Information Bottleneck principle (Tishby, 1999), where the Mutual Information (MI) decreases along the network: MI(Y,X) > MI(Y,h) > MI(Y,Ypred). MI is defined as a Kullback-Leibler divergence, and so is the cross-entropy loss. In fact, maximizing MI(Y,Ypred) is the learning goal of a NN (i.e., maximum entropy classification).
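            (A small numeric check, with made-up distributions, of the cross-entropy/KL link this relies on: the loss splits into the true distribution's entropy plus a KL term, so minimizing the loss means shrinking the KL divergence.)

```python
import math

def entropy(p):
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]   # "true" label distribution (made up)
q = [0.5, 0.3, 0.2]   # a model's predicted distribution (made up)

# H(p, q) = H(p) + D(p || q), so minimizing cross-entropy over q
# is the same as driving the KL term toward zero.
print(cross_entropy(p, q))
print(entropy(p) + kl(p, q))  # same number
```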

            Any blind spots in this reasoning?

            Liked by 2 people

          2. Most of this sounds right to me.

            Although on 2), I wonder if we’re using “causation machine” in the same manner. By it, I think Chalmers (and I) just mean that its principal business is causal, specifically causal differentiation. So for us, nerve nets would definitely qualify.

            OTOH if you’re going for something like repeatability, then I can see the argument against that. (Although if the nerve net is a software one in a digital system, then, assuming the same data and no hardware malfunction, we should get the same result.)

            Liked by 1 person

  3. “And how would you characterize the nature of computation?”

    As you may or may not know, my current intellectual project, understanding consciousness, began with the assumption that consciousness is some form of information processing. So my project includes figuring out exactly what that means. This project has led me to my current understanding of causation, information, and the role of entropy, which I will sketch here, because you asked.

    Every physical interaction can be diagrammed as Input—>[mechanism]—>Output. Note: the choice of which things are “Input” and which is the “mechanism” is arbitrary but chosen to be useful. Also note: “Output” could include changes to the Input and/or mechanism. Also note: “mechanism” can be an entire system which includes sub-processes (so, sub-mechanisms).

    Causation: we can say the mechanism *causes* the Output when presented with the Input (with a probability greater than chance).

    Information: because every physical interaction follows rules (the laws of physics), the Output is correlated with the Input and the mechanism. This correlation is also called Mutual Information. (In quantum mechanics, it’s called entanglement.)

    Thus, every physical process causes (mutual) information.

    Every physical process *processes* (mutual) information. Consider a simple (toy) process: A—>[B]—>C. C carries mutual information with respect to A (and also B). Now consider a subsequent process: C—>[D]—>E. E carries mutual information with respect to C (and D) and by extension (via C) it carries mutual information with respect to A (and B). So every physical system carries some non-zero amount of mutual information with respect to *every* participant in the physical processes before it. The amount of mutual information (the degree of correlation) depends on the observer and the observer’s knowledge of the systems involved.
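    (To see the "mutual information by extension" point with actual numbers, here's a toy version of the chain, collapsing each mechanism into a noisy copy with an invented 10% error rate. The estimated mutual information shrinks along the chain but never quite vanishes:)

```python
import random
from math import log2
from collections import Counter

def mutual_information(pairs):
    """Estimate I(X;Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

random.seed(0)
noisy_copy = lambda bit, p: bit ^ (random.random() < p)

# Toy chain A -> C -> E: each arrow is a copy that flips the bit 10% of the time.
A = [random.randint(0, 1) for _ in range(100_000)]
C = [noisy_copy(a, 0.1) for a in A]
E = [noisy_copy(c, 0.1) for c in C]

print(mutual_information(list(zip(A, C))))  # ~0.53 bits
print(mutual_information(list(zip(A, E))))  # ~0.32 bits: E still carries
                                            # information about A, via C
```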

    Every information process can be expressed as a combination of one or more of COPY, NOT, AND, or OR. (Strictly speaking, NOT and AND are sufficient, thus the NAND gate.)
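    (A quick sketch of that sufficiency claim spelled out as code; nothing here beyond textbook Boolean logic:)

```python
def NAND(a, b):
    return not (a and b)

def NOT(a):       # NOT built from NAND
    return NAND(a, a)

def AND(a, b):    # AND built from NAND
    return NOT(NAND(a, b))

def OR(a, b):     # OR via De Morgan, still only NANDs underneath
    return NAND(NOT(a), NOT(b))

def COPY(a):      # a trivial "wire": double negation
    return NOT(NOT(a))

# Check every constructed gate against its truth table.
for a in (False, True):
    assert COPY(a) == a and NOT(a) == (not a)
    for b in (False, True):
        assert AND(a, b) == (a and b)
        assert OR(a, b) == (a or b)
print("all truth tables match")
```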

    Every physical process performs COPY, NOT, AND, or OR with respect to an observer that *cares* about at least one of the inputs. If the observer cares about A (say, a food source), then A—>[B]—>C constitutes a COPY of the mutual information in A to C. If the observer instead cares about B, the process can be considered a COPY of the mutual information from B to C. Which operation(s) is/are relevant is determined by the observer’s *response* to C.

    Entropy: I’m not gonna say much about entropy here other than pointing to its role in the second law of thermodynamics, which can drive physical processes and ultimately forms the basis of systems that *care* about things (have goals). (Can discuss on request).

    So I think all of this is compatible with your understanding, but just looks at things from a different angle. This angle is useful for me because it relies less on the Maths, so to speak.

    *

    Liked by 2 people

    1. Do you still find consciousness as interesting a problem as you once did? I ask because my own interest has waned in recent years. Some of it is from not much coming out about it recently. But also because it seems like it’s been a while since I’ve heard any new arguments, at least arguments that can’t be quickly dismissed with the acid test of whether someone is trusting introspective impressions too much.

      “with respect to an observer that *cares* about at least one of the inputs.”

      Is this new? I don’t recall it in your previous descriptions. It seems like whether something has a one-to-one cause-effect relation (COPY), an inverse relation to its cause (NOT), requires multiple causes (AND) or can be caused by one of multiple causes (OR) has some observer independence. What’s an example of something that would be different for different observers?

      Liked by 1 person

      1. Consciousness is still my main project, but these considerations have taken me into further developments, such as the ontology of information (described above) and morality (all about the goals).

        I don’t think the angle on “caring” is so much new as reworded. Something cares if it has one or more goals.

        So for the example you asked for, consider a set of processes

        1. (A,B,C)—>[D]—>E
        2. (F,B,C)—>[D]—>W

        Suppose observer 1 only cares about C. Both processes 1 and 2 then look like COPY C and so it can respond to C by responding to E or W.

        Suppose observer 2 cares about A and B happening together. Process 1 looks like (A AND B), but process 2 does not.

        The actual mutual information is observer independent, but observers can focus on sub-parts of the total information. Actually, there is a whole new field of inquiry in this area. Google “Partial Information Decomposition PID”.

        Does that make sense?

        *

        Liked by 1 person

        1. Hmmm. So if I cared about C, wouldn’t it matter what the actual nature of D is? If D is an AND gate, then whether I see C’s effect on E will depend on whether A and B are present. It doesn’t seem like what I cared about would factor into it. Or I would discover that I have to care about A and B, or F and B. And if D is an OR gate, it seems like I’d have to be alert to the possibility that E doesn’t always signal the presence of C. Again, I’d be compelled to care about A, F, and B.

          I can see a case that what counts as output and what is waste heat depends on what I care about. Although I would think that was recognized in whatever downstream processes after E or W are put in place.

          Liked by 1 person

          1. Objectively, by (my) definition, (A,B,C)—>[D]—>E is an AND gate. E only happens if A AND B AND C. But the pertinent logic for the observer may be “If A and B and C then C” is true, and so “If E, then C” is true. If I only care about C, then I can respond to E and I’m good. D is multirealizable as long as A,B,C->[]->E. And finding the correct response to E so that you’re responding to C is a matter of selection, i.e., finding the response that works.

            D works as an OR gate if you only care that one of A,B,or C is true, but you don’t care which one. “If E, then A or B or C” is a true statement. It doesn’t matter if “If E, then A and B and C” is also true. If you care about A or B or C, and attach the appropriate response to E, you win. It may not be the most efficient way. If that’s the only thing you respond to you may miss some A’s that don’t come with B’s and C’s, but you work with what you got.

            *

            Liked by 1 person

          2. But if D is an AND gate, then you can have C without E. Yes, E will always indicate C, but it won’t indicate all C. If C is what we care about, then we have to care about the others.

            If D is an OR gate, the main thing is that the output is no longer a reliable indication of C.

            Whether it’s objectively an AND or OR gate seems to make a difference independent of an observer’s interest (or downstream process’ response). Unless of course I’m completely confused (quite possible).

            Liked by 1 person

          3. [gonna go out on a limb here and say …]

            You’re thinking in terms of Shannon information (yes/no, bits) and trying to engineer perfect efficiency.

            Mutual information is about probability, and engineering good enough. D may be a process out in the environment, and E may be the only thing you can access. E may have only a 30% correlation with C, but that may be good enough. If you find food every third time, say, nectar in every third flower, that can be a win. It may not be reliable, but it may be reliable enough.

            And I won’t die on this hill, but I think there is no such thing as an objectively AND or OR gate. Here:

            (A,B,C) —> [D] —> E

            E —> F (if you detect E, then do F)

            E -/-> G (if you detect E, then inhibit G)

            Is D an AND or an OR?

            *

            [“If ~A OR ~B OR ~C then do G”]

            Liked by 1 person

          4. No need to die on any hills. It’s all friendly discussion here. (Or at least that’s what I try for.)

            I wasn’t thinking probabilistically, but that was under the assumption we were dealing with suitably simplifying assumptions. Certainly reality can be much messier. A Lloyd or a Chalmers might argue that the messiness is a composition of the primitive elements, which eventually reduce to the operations you discuss. (I actually thought that was your position, but your take here seems more instrumentalist.)

            On your example, it doesn’t seem like there’s enough info here to indicate what D is, only that it comes before E. But which it is seems to determine whether E or later happens.

            Liked by 1 person

        2. ”it doesn’t seem like there’s enough info here to indicate what D is, only that it comes before E. But which it is seems to determine whether E or later happens.”

          Welcome to my world! This gets into a whole ‘nother layer of metaphysics. This is Kant/noumenon stuff. You can’t know what D is, only what it does, i.e., the pattern of causations (plural) that it’s involved in. If you can identify a useful subset of all the patterns, you can give it an identity/name, “D”. But you can never be sure this “D” is exactly the same thing as that “D”. You might be able to identify sub-D’s, but that just puts you in the same position, but down a level.

          Oh, and trying to talk about mutual information without probability is not only making it as simple as possible, it’s making it too simple.

          *

          Liked by 1 person

          1. I actually can’t see enough to say what D does. Maybe if there were examples of particular cases.

            And as a structuralist, when I use the word “is” like this, I usually mean its profile of structure and relations, including causal ones, that is, what it does. But I’m not a Kantian or neo-Kantian. I can’t rule out that there are unknowable intrinsic properties to things. But it does seem like I can rule out their relevance.

            Liked by 1 person

      2. FWIW, my acute project is to find unitrackers and semantic pointers in brain anatomy. To this end I’m very excited about finding Max Bennett’s book “A Brief History of Intelligence”, but more significantly, his paper “An Attempt at a Unified Theory of the Neocortical Microcircuit in Sensory Cortex” (https://www.frontiersin.org/journals/neural-circuits/articles/10.3389/fncir.2020.00040/full). I’m pretty sure unitrackers = minicolumns, but I currently suspect semantic pointers may be the set of layer 5 neurons associated with a macrocolumn, or something like that.

        *

        [stay tuned]

        Liked by 2 people

  4. Entropy applies to a given macro-state, at least for Boltzmann’s definition. And how you cut up the world is slightly observer-dependent. But only slightly, because different observers are likely to cut (or group) the world into very similar macro-states, at least in a great number of cases. I don’t have a good physical understanding of why the world is full of emergent objects and properties, but it is, and even organisms with very different structures will recognize them. Octopi, I would bet, have a percept/concept for “waves” (surface waves in the ocean).

    Isn’t a qubit just a superposable, entangle-able bit? There are lots of “its” that don’t come from “qubits” like this. An emission-absorption event of electromagnetic energy takes multiple continuous variables to describe: times and locations. I prefer to turn that slogan upside down: qubit from it. By preparing quantum states in particular ways, we can make useful qubits. While discarding lots of other physical information.

    As you point out, computing clusters generate lots of waste heat (i.e. they increase entropy). As long as computations are used to create and store useful information – calculation results, etc. – this is inevitable. Every physics nerd seems to know about Landauer’s principle that erasing a memory (to clear it, e.g.) increases entropy. But what impresses me even more is Mlodinow and Brun‘s argument about the “generality” of memory – or as I would call it, reliability. A memory cannot be infinitely fragile and still be useful – that is, it cannot depend on every last microscopic detail being exactly in the right place. If my brain “remembers” that “Mike has a new post”, this sentence “Mike has a new post” had better not correspond to a truth only in the event that every word in your post, and all the individual neural firings in my brain, go just so – even if, on this particular day, they do go just so. That would be dumb luck, not knowledge.
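    (For scale, the Landauer bound mentioned above works out to a startlingly small number at room temperature; this is simple arithmetic, not tied to any real device:)

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # roughly room temperature, K

# Landauer's principle: erasing one bit dissipates at least k_B * T * ln 2.
per_bit = k_B * T * math.log(2)
print(per_bit)              # ~2.9e-21 joules per bit

# Erasing a gigabyte at that theoretical minimum:
print(per_bit * 8e9)        # ~2.3e-11 joules; real hardware dissipates
                            # many orders of magnitude more than this
```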

    Liked by 2 people

    1. On everyone agreeing about emergent macroscopic entities, it’s not hard to imagine constructing robots that can operate on different scales, nanobots for instance. Or systems which operate on planetary scales. The models such a system uses might be very different from the ones we use.

      On qubits and continuous systems, I suspect we’d get different responses from Chalmers and Lloyd. Chalmers would probably talk about the prospect of every system being ultimately reducible to qubits, even if it’s not currently understood how. Lloyd would talk about how any system can be mapped to qubits. He rejects the distinction between digital and analog quantum computing. At least in 2006. I’ve seen more recent discussions of analog quantum computing as a distinct concept, so I wonder if he’d still hold that view.

      Liked by 2 people

    1. Thanks. The Boltzmann brains part of that post was interesting, although the cognitively unstable argument against them has always struck me as a bit too easy. That said, it’s really just a particular form of solipsism, and while we can’t rule it out, it doesn’t seem like a productive assumption that ever gets us anywhere.

      I like his alternate description of the easy and hard problems, between why conscious minds behave as they do vs why they experience. Although I still think the distinction is misguided. It seems to assume that experience is unrelated to behavior, which I think is another unproductive assumption.

      Liked by 1 person

      1. Yeah, I can’t believe anyone would take the Boltzmann brain thing seriously. I think we can rule it out. I’m ruling it out.

        I didn’t catch that, in that distinction, he uses ‘brain’ for the mind’s ‘behavior’. I think that’s a strange term for talking about the hard problem because it doesn’t draw a sharper line between brain and phenomenal experience, though from what he says overall I don’t think he’s making any grand statement from that description.

        Which distinction do you think is misguided?

        Liked by 1 person

        1. I find Boltzmann brains (along with Boltzmann galaxies or Boltzmann observable universes) an interesting thought experiment. But once the possibility is noted, it doesn’t seem like there’s much to do with it. As the other post noted, we have no choice but to go on as though it isn’t true.

          On the distinction, I think he’s using “behave” and “experience” as an alternate way of describing the access consciousness / phenomenal consciousness divide. It’s the old distinction between the “easy problems” and “hard problem”, which as you know I see as a wrong turn.

          Liked by 2 people

  5. Great post! I too have heard Sean Carroll speak of complexity and information in a way that is incongruous with the physical, entropy-inspired definition. What’s interesting is that in this interview (https://www.youtube.com/watch?v=FEnKLcSiHy4), Carroll says that computational theorists often think of information and complexity as being linked to algorithmic definitions like Kolmogorov complexity (and therefore the amount of information is directly correlated with entropy), but that physicists think of the two as inversely correlated (@21:45 in the interview), which I thought was super weird. I would’ve thought that physicists would be the first to conceptualize macro-information as basically being entropy!

    Liked by 3 people

    1. Thanks! And that looks like an interesting interview. I’m going to have to watch the rest when I get a chance.

      In navigating to the point in the video you noted, I landed on the 20 minute mark, and heard his home experiment of taking a picture of a glass with coffee at the bottom and cream on top, and then again after it’s been slightly mixed. He noted that the resulting image after the mixing will be larger, because it can’t be compressed as much. In other words, it now takes more information to describe the coffee / cream combo. (Not sure this would be true after the coffee was thoroughly mixed.)
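      (Carroll's picture experiment is easy to mimic with a one-dimensional stand-in, no camera needed; the byte values and swap count below are arbitrary:)

```python
import random
import zlib

random.seed(1)

# A crude stand-in for the photo: half "coffee" bytes, half "cream" bytes.
separated = bytes([0] * 50_000 + [255] * 50_000)

# "Slightly mixed": randomly swap a fraction of the byte positions.
mixed = bytearray(separated)
for _ in range(20_000):
    i, j = random.randrange(len(mixed)), random.randrange(len(mixed))
    mixed[i], mixed[j] = mixed[j], mixed[i]

print(len(zlib.compress(separated)))     # tiny: the pattern is easy to describe
print(len(zlib.compress(bytes(mixed))))  # much larger: more information is
                                         # needed to say where everything is
```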

      But he also notes the different conceptions of information. I think he’s talking in terms of knowledge, or information about the system, which he notes is different from Shannon information. Although it seems like the type of information he is discussing, the one he said is the physicist version, isn’t the one physicists are concerned about accounting for in black holes. It seems like that version has to be the one that causes his image to be larger.

      But the snippet left me wondering if “complexity” might be a more neutral phrase for this concept. I do need to watch the rest though!

      Liked by 1 person

  6. Ooof, there’s a lot to talk about here.

    I don’t think it’s helpful to talk about the universe as a computer personally, because I think that reality ought to be understood in terms of the organic rather than the artificial. We like to talk about things in terms of information because it’s quantifiable, and when we quantify something we imagine we understand it more than we really do, because it borrows the credibility of maths. But we’d be much better served to think in terms of Aristotelian forms instead – I think this is what information is really talking about, but in a pseudo-codified way. The point is that we’re looking at structure across a whole, rather than mere parts. By thinking of reality in terms of forms rather than information, I think we avoid the error of thinking reality truly consists of a set of 0s and 1s.

    Are entropy and information convertible? Yes. Maxwell’s demon demonstrates how gathering information can even allow you to reverse entropy. And quite incredibly scientists have managed to construct molecular structures that actually do this, creating a record of molecular motion and by this separating hot from cold molecules. The catch is that when the data is deleted it must be released again as entropy. This is explained nicely in Paul Davies’ book ‘The Demon in the Machine’.

    Are you familiar with the Free Energy Principle? It takes entropy and information as convertible, and manages to explain self-organizing entities, such as living beings. It means that living beings are engaged in trying to minimize the free energy/information/surprisal they receive, attempting to make their world ordered and to “resist” entropy, at least within themselves.

    Personally, I think the idea that entropy is observer relative makes perfect sense. Even looking at the earlier understandings of it strictly in terms of heat, that’s a relative notion because heat is just the average motion of a group of particles, and averaging is a calculation which reduces information; heat doesn’t exist at the molecular level.

    Liked by 2 people

    1. Hey, generating a lot to talk about is part of the goal. Thanks for jumping in!

      I can’t claim too much knowledge of Aristotle’s version of forms. But on understanding, it seems like a lot depends on what we mean by “understand”. Do we want a predictive model of the phenomena? If so, then the structure and relations, the math, seem hard to avoid. I know a lot of philosophers talk about what something is in a more intrinsic manner, but I’m not sure what that means, or if it would be anything we could ever really know.

      Maxwell’s demon never had much of a hold on my imagination. Aside from how it could come to know what it does, it always seemed like its own physics would be the issue. The thought experiment does do a good job at illustrating the statistical nature of the second law.

      I guess my issue with a concept being observer dependent, is what does it mean when there is no observer? I often hear that the early universe was very low entropy, and the heat death of the universe will be high entropy. How can we say that if there are no observers present on either end? Or are we making these statements relative to our own distant and indirect observer perspective?

      Liked by 2 people

  7. We do want a predictive model, and the structure and relations are indeed unavoidable for this (I’m largely an Ontic Structural Realist myself, so I actually think this is kind of the whole story). But I think trying to understand it as “information” runs the risk of making our understanding too one dimensional, and mistaking our provisional models for the much more subtle and nuanced and endlessly interpretable reality.

    The thing is that they’ve actually made Maxwell’s demon. It’s not just a thought experiment any more. It also seems that such “demons” play a crucial role within living cells.

    Re the universe’s entropy at its beginning/end, I think we’d have to look more closely at the meaning of entropy as used in the calculations, and if we did that we would see the theoretical observer baked into the calculations. For example, if it’s looking at entropy as defined in relation to heat, then our calculations are taking a limited, macroscopic perspective.

    Liked by 3 people

    1. I’m an ontic structural realist myself. But with that in mind, what about the information view do you see being left out? You say “one dimensional”, but it seems like information has no problem incorporating multiple dimensions. We do have to be careful that we’re working with physical information, essentially the structures and relations we agree are there, and from which we build the more constrained version used in our descriptions.

      On Maxwell’s demon, now that you mention it, I guess the permeable cell and organelle compartment membranes do function as a sort of Maxwell’s demon (a Markov blanket in Free Energy Principle talk).

      I think you’re right about the observer being tangled up with the concept of entropy, and it goes back to its origins as an engineering concern with steam engines. I’m just not wild about that, for the same reasons it bothers me in quantum mechanics. Maybe I’m all wet, but if it’s observer dependent, it doesn’t feel like a complete physical framework yet.

      Liked by 2 people

      1. I think my issue is that we tend to think of information as a string of either words or 1s and 0s. That’s what I mean by “one dimensional”. We think we can, in principle, have a comprehensive grasp on such information, and I want to emphasise that we cannot, even in principle, reduce reality to a single string of information. It’s also just that I think our metaphysical image of the world should be more organic rather than artificial.

        It’s probably my own philosophical biases, but I very much like the explicitly perspective dependent aspect of it, and the fact that it doesn’t impede the physical law working.

        Liked by 2 people

  8. Hi Mike,

    A few responses to your notes on entropy here, and how the concept relates to information and complexity. I think it may be important to distinguish between a “classical” definition of entropy and then the more abstract versions of the concept that arise trying to apply it to modern physics, especially quantum phenomena.

    You wrote, “As usually described, it leaves the question of how much entropy a particular system has as observer dependent, which seems problematic for a fundamental physics concept.” I would argue that in the classical sense your view of this is incorrect. The entropy of a fluid or system is well-defined and is not observer dependent, particularly when thinking of the performance of heat engines, for which the concept was largely developed as you know.

    Information comes into play here in the sense of passive information–not the active type you were writing about above. The amount of information (the number of physical parameters one would need to know) required to describe a high entropy state of a system is considerably greater than the amount required to describe a low entropy state of a system. I would argue this isn’t a question of what any given observer may happen to know about the state of the system, which would make entropy observer dependent, but a measure of what any observer can know. Classically, if all I can do is measure properties like mass, temperature, pressure, velocity, etc., then no observer is privileged, and higher entropy states are simply those for which these macroscopic parameters could correspond to a higher number of possible micro states than a lower entropy state of the same system would.

    Such systems are disordered because they are inherently more random in a sense, with fewer relationships between the particles that comprise them. There are fewer constraints and more degrees of freedom than in a more ordered state, such as a crystal, the structure of which requires the maintenance of certain relationships between the particles. They can’t just be anywhere, going at any given velocity. So the crystal is more ordered.

    Application of the word complexity is confusing because some authors note that disordered, high-entropy states are complex, while the field of complexity is often about low-entropy, living systems. These are two very different definitions of complexity. In the former, high-entropy states are “complex” in the sense that they are very difficult to define and a great deal more information (quantity of physical parameters that would need to be known) is required to describe them. They’re convoluted, messy… “complex.” But from the perspective of the other view of complexity, which is the study of highly-organized systems, they are not very interesting–like a living cell is, for instance–and so in this view high entropy systems are “less complex.”

    You wrote, “He sees entropy as information, although he stipulates that it’s hidden information, or unavailable information (similar to how energy is present but unavailable). But this again seems to result in entropy being observer dependent.” I think I’ve addressed this above. There’s less hidden information in an ice cube than a glass of water because the amount of information that would be required to fully define the micro-state of the ice cube is much less than the amount that would be required to fully define the micro-state of the glass of water. This isn’t about what any observer actually knows or doesn’t know but about what would need to be known to provide a comprehensive description of the physical state of the system.

    You wrote, “If so, then computers are high entropy systems since none of us have access to most of the current information in the device you’re using right now.” I think this is a misunderstanding of the basic concepts of entropy, at least classically. The atoms in a computer are very specifically located and organized and it would require knowledge of far fewer physical parameters to fully describe the state of the system than if the computer was shaken apart into dust at the same temperature, and then those same atoms were placed in a jar.

    You wrote, “Computational systems seem to be about harnessing their entropy, their complexity, and making use of it. And we have to remember that these aren’t closed systems. As noted above, they’re systems that require a lot of inbound energy. It’s that supply of energy that enables transformation of their highly entropic states.” I think there are some misunderstandings here as well. We could think of a computer as a racetrack for electrons, let’s say. Higher-voltage (lower-entropy) electrons are released at the “top of the hill” and flow downhill through the race track, emerging as lower voltage (higher entropy) electrons. The energy extracted from the electron causes switches to change state, lights to wink on and off, etc. But there is, from a big picture perspective, very little change to the state of the computer. In other words, the number of physical parameters required to fully describe the state of the computer before and after a program was run would be virtually the same. Bits of memory may be radically different, but no additional parameters would be required to describe them. We could even unplug a computer for 10 years and set it on a shelf and very little change would occur. So I would suggest it’s not really the computer’s entropy state that is changing when it is “running.” The entropy change is in the electrical energy flowing through it, which in the case of the computer has relatively little impact upon it.

    Contrast this to a living system, in which the flow of energy through the system is continually utilized to maintain or sustain the low entropy condition of the organism itself. If we shut off that energy source, not only does the computing stop, the organism itself spontaneously transitions to higher entropy states… A computer does this, perhaps, over a far longer time scale–(even on a shelf after thousands of years it will degrade)–but in essence there is no functional or causal coupling between the energy flowing through it and its entropy level. In short, the computer is a “static” device from an entropy perspective–it is not transforming at all in real time in the classical sense of entropy. But organisms are.

    You wrote, “So computers are causation machines and entropy transformers.” A computer is an entropy transformer in the same way that a wind turbine is. The energy passing through it, due to its structure, increases in entropy to do work. But it is not an actively transforming entity like an organism.

    So in the end, I’m not sure I would say physical information and entropy are identical. It would seem to me that entropy might, in the final accounting, still be a quality or characteristic of physical information, and not the physical information itself.

    Michael

    Liked by 2 people

    1. Hi Michael,

      Good hearing from you!  I’ve missed our discussions.

      I have to admit that my knowledge of classical entropy is scant. Most of what I’ve picked up came from approaching it through information theory and statistical mechanics. So I’ll defer to your expertise on that definition being non-observer dependent. Maybe it’s just the popular descriptions that are to blame.

      I do think a lot of what you describe matches my take of disorder for transformation (or work, in engineering terms). It’s that added “for” clause that turns it into something more objective, at least for me.

      One question I would have is for your point about what an observer can know about the system.  Aside from a black hole, wouldn’t that depend on how much energy we were willing to bring into the system, how much we’re willing to work at dissecting it?  Or do you mean what we can know passively?  Am I missing something more fundamental here?

      On complexity in particular, I actually listened to a podcast sometime after writing this post which discussed complex systems with Sean Carroll.  Carroll made a point similar to yours, that there can be high complexity systems with lower entropy.  Which shook my confidence in that association.  Although I remain confused on where the disconnect is between them.  It still seems like an organized high complexity system has more entropy than an organized low complexity one.  

      On computers, I would say you’re thinking of the hardware as something separate from the information it contains.  (Which modern devices make easy to do.)  But a blank hard drive or SSD seems like it’s in a different state from one with a lot of information on it.  (I’m not sure which one has more entropy.  “Blank” could mean uniform, but not necessarily.)  We have to remember that information is physical, and so whatever stores it is in a different state from before it stores it, and processing that information changes its physical state, changes that can’t happen spontaneously within the system, but require energy from outside.  It also seems like the portion of a computer, aside from an internal battery, most susceptible to degrading over time.

      But definitely the fact that we can store it for any length of time on a shelf makes it very different from a living system, far less dynamic.  Although in the future we might build machines closer to the thermodynamic edge, which could have vulnerabilities similar to living systems.

      I’m open to entropy being different from physical information.  Your point about it being a characteristic of that information is interesting.  But I wonder if such a characteristic would itself not be information.

      Liked by 2 people

      1. Hi Mike,

        I’ve missed our conversations as well. I’ve not really been online much but always find the subject of entropy interesting… 🙂

        Regarding your paragraph about an observer, that begins with, “One question I would have is for your point about what an observer can know about the system”, I would say that from the classical perspective there are macroscopic “state properties” that in the classical theory essentially define the limits of what can be known about a system. Keeping in mind this began with thermodynamics, a relevant example would be temperature. If I have a working fluid like steam generating power in a steam turbine, I can measure the temperature of the fluid prior to its entrance to the turbine and that will define a property of that state condition. Any given molecule could have a different “temperature” however. But even though all the molecules may have different velocities or states of vibration, the temperature is basically identical anywhere I measure it macroscopically. And knowing things like pressure, temperature, mass flow, etc., (and entropy) is sufficient for me to accurately predict the work that could be produced in a steam turbine. In fact, in classical thermo, I only need to know two properties of the steam to fully define its state. So if I know temperature and pressure, I also know the entropy and the enthalpy of that steam. I also can then predict the maximum possible work a “perfect” steam turbine could produce between two pressures (a high pressure inlet and a low pressure outlet), which would be the amount of work that would result from holding the entropy of the steam constant before and after the turbine. Knowing my outlet pressure and the entropy is again sufficient to fully define the state properties of the steam at the exhaust. Steam turbine efficiency then is the ratio of actual work produced to this theoretical “isentropic” work. In the real world, the entropy of the steam will increase and less useful work than the theoretical limit will be accomplished. But long and short, in the classical sense, the notion is that a given state will have “fixed” macro properties like temperature or pressure or entropy or enthalpy, but for any macro state there will be some quantity of micro states defined by the actual position and velocity of all the individual particles. There is really no discussion about how we might try and figure that state out. My theoretical understanding is perhaps lacking here–but the long and short is that we can’t, perhaps because any measurement changes the state. Also, higher entropy states have more possible micro states that yield the same observables.
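        (The isentropic-efficiency calculation described above is easy to sketch with a property library. This assumes the third-party CoolProp package; the inlet and outlet conditions, and the 85% efficiency, are just example figures:)

```python
from CoolProp.CoolProp import PropsSI  # assumes CoolProp is installed

# Example inlet/outlet conditions in SI units (illustrative only).
T_in, P_in, P_out = 773.15, 10e6, 10e3   # 500 C and 10 MPa in; 10 kPa out

# Two properties fix the state: get inlet enthalpy and entropy.
h_in = PropsSI('H', 'T', T_in, 'P', P_in, 'Water')
s_in = PropsSI('S', 'T', T_in, 'P', P_in, 'Water')

# Ideal expansion holds entropy constant down to the outlet pressure.
h_out_ideal = PropsSI('H', 'P', P_out, 'S', s_in, 'Water')
w_ideal = h_in - h_out_ideal          # maximum specific work, J/kg

# A real turbine delivers less; isentropic efficiency is the ratio.
w_actual = 0.85 * w_ideal             # 85% is just an assumed figure
print(w_ideal / 1e3, w_actual / 1e3)  # kJ per kg of steam
```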

        On this, “It still seems like an organized high complexity system has more entropy than an organized low complexity one”, it really comes down to what you mean by complexity. A macro state that has many more micro states than another has a higher entropy. Some would call this “complex” but to paint a picture, these states cannot do any useful work in the thermodynamic sense. High entropy equates to low capacity to do work, but a great many indistinguishable micro states. Obvious and extreme example: heat death of the universe. More practically, let’s say I have a finite quantity of mass at 1,000 deg F and another finite quantity at a lower temperature of say 50 deg F. A fluid like steam heated to 1,000 deg F with the former can do some useful work as it is cooled to 50 deg F, if I interpose a steam turbine between the source and the sink. But if I just take my 1,000 deg F source and my 50 deg F sink and I mix them together, so I have just one equilibrium mixture at 525 deg F, I can’t do anything at all with it–at least not until I create some relationship to another lower temperature system. But the state in which everything is mixed together in equilibrium has vastly more possible micro states and thus a higher entropy than the system comprised of two distinct regimes at different temperatures. One reason is that if I keep the two regimes separated, by a valve say, then the molecules are constrained in the sense they can never be found on the other “side” of the system. I have to be careful because we can “draw the box” in different ways. I can just talk about the entropy of steam–which is lower for high temperature steam and higher for low temperature steam. Or I can talk about a system consisting of two steam chambers connected by a valve: before I open the valve there is hot steam on one side and cool steam on the other. That has a lower entropy than after I open the valve and let them mix… Make sense?
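        (And the mixing example can be put in numbers with the simplest possible model, treating both lumps as the same material with constant heat capacity rather than real steam; the masses and heat capacity are placeholders:)

```python
import math

def f_to_k(t_f):
    return (t_f - 32.0) * 5.0 / 9.0 + 273.15

# Equal masses at 1,000 deg F and 50 deg F, mixed to ~525 deg F.
T_hot, T_cold = f_to_k(1000.0), f_to_k(50.0)
T_mix = (T_hot + T_cold) / 2.0
print(round((T_mix - 273.15) * 9 / 5 + 32))   # ~525 deg F

m, c = 1.0, 4186.0   # 1 kg each, water-like heat capacity in J/(kg K)
dS_hot = m * c * math.log(T_mix / T_hot)      # negative: the hot lump cools
dS_cold = m * c * math.log(T_mix / T_cold)    # positive: the cold lump warms
print(dS_hot + dS_cold)   # > 0: the mixed, equilibrium state has higher
                          # entropy, and no work can be extracted from it alone
```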

        When we talk about “complex” systems we may be talking about something like an organism. When it is alive, molecules are in highly ordered states and cannot just be anywhere… But if we kill a cell and shake it up until it is shattered into bits, then molecules could be just about anywhere. So the former, living cell, has a higher complexity, and a lower entropy, and vastly fewer possible micro states, than the pile of dust.

        When you talk about “organized high complexity” and “organized low complexity” systems I’m not sure what you mean. Entropy is generally used, classically, to describe the quality or potential to transform–as you rightly note–of a system. The potential to transform generally comes from the fact that, in nature, systems move from low entropy states to high entropy states. This is exactly equivalent to saying thermal energy flows from high temp states to lower temperature ones, that balls roll down hill, electrons flow from higher voltage states to lower voltage ones, etc. All those simple dynamics we understand readily, are text book examples of systems evolving from low entropy to high entropy states. So it’s actually the lowest entropy states or systems that have the greatest capacity to “do something.” The byproduct of their “doing something” generally increases entropy “somewhere.” When a steam turbine generates electricity, the turbine’s entropy really doesn’t change–the steam’s does. Nothing about the steam turbine itself has materially changed.

        That brings us to the computer analogy. Let's take your hard drive. You wrote, "a blank hard drive or SSD seems like it's in a different state from one with a lot of information on it." First of all, I absolutely agree: it's in a different "state." But since, classically, entropy is about how many micro states a macro system could occupy, I would say that changing the contents of the hard drive is akin to toggling between micro states, and therefore not a change in the entropy of the drive itself, in physical terms.

        Generally speaking, if I understand the hardware correctly, the atoms of the hard drive haven't really gone anywhere before and after the various bits on the drive are flipped. They may have changed energy states to a certain degree, and if more energy was stored on the disc than before, that would be a small change. But since the configuration of atoms in the hard drive is to a very large degree unchanged, thermodynamically it doesn't seem like the entropy would have changed all that much. The macro state is virtually identical. This is why I say the computer is more like the steam turbine than the steam. In terms of physical atoms, the computer is largely unchanged by its operations.

        I don't think flipping bits around in memory would equate to a significant thermodynamic state change. That's not to say it doesn't change at all. I just don't see the changes as significant: it's not like the hard drive changed shape, or topography, or that atoms located in a certain location are now free to appear somewhere else entirely. So, ultimately, what I'm saying is that reading or writing to / from a drive doesn't alter the number of possible micro states the drive could be in…
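
        To put a rough scale on that intuition, here is an illustrative estimate only: the drive size and the thermal comparison are made-up round numbers, and the per-bit figure is the k_B ln 2 that Landauer's principle associates with setting or erasing one bit.

        ```python
        import math

        k_B = 1.380649e-23   # J/K, Boltzmann constant

        # Entropy associated with the information-bearing bits themselves is at most
        # about N * k_B * ln(2), i.e. the case where every bit pattern is equally likely.
        bits = 8e12                                  # ~1 TB worth of bits, illustrative
        dS_bits = bits * k_B * math.log(2)           # ~ 7.7e-11 J/K

        # Compare with an ordinary thermal entropy change in the same hardware:
        # warming ~100 g of aluminum by 1 K near room temperature, dS ~ m * c * ln(T2/T1).
        m, c = 0.1, 900.0                            # kg and J/(kg*K), illustrative
        dS_thermal = m * c * math.log(298.0 / 297.0) # ~ 0.3 J/K

        print(dS_bits, dS_thermal, dS_thermal / dS_bits)  # thermal term larger by ~ 4e9
        ```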

        I understand that intuitively, there is a vast difference in the information content of the drive and that, on this basis, when one has the mindset to equate various quantities of information with entropy, one must conclude a big change has occurred. Now we’re getting outside of classical entropy I think, and so we get to some interesting questions. Let’s say I have a novel on the one hand, physically in the form of a paperback, and an equivalent quantity of paper coated with an equivalent quantity of ink on the other. Neither is going to spontaneously transform, and the two are equivalent in terms of their ability to produce real work. Conceivably, if we burned either one, the heat released would be identical and the heat engine using that energy could do the same amount of work. But in information terms, I 100% agree they are vastly different. So where in the rabbit hole does information become a physically powerful entity? Where is it that information actually “moves” physical systems?

        I would be interested in your understanding of this… My understanding gets really fuzzy here. Perhaps the states of quantum systems are akin to the words of the novel, in the sense that they are discrete building blocks, and they compel the evolution of physical processes? Like a living language? These quantum states are very specific, perhaps, and so they are like “words” but also they are not just symbols. They simultaneously and effectively are, or embody, a physical energetic potential that is one-to-one correlated to the symbol (the state) itself? I’m completely spit-balling here on how what we think of as “information” could eventually / ultimately run a tractor…

        What are your thoughts here from your reading? Where does information cease to become “virtual” and become “tangible” or “motive” in its nature, as instantiated in the physical universe?

        Thanks,

        Michael

        1. Thanks Michael.  You’re always welcome anytime you feel like jumping in.

          If I understand your explanation on what can be known for a steam engine system, it sounds like this is a practical engineering sort of limitation.  That makes sense.

          That also seems like the point I'm seeing for the macro / micro state relation.  For hard-headed engineering considerations, that's fine.  But for a broader theoretical picture, who defines the distinction between micro and macro?  It seems contingent on the scales we evolved to operate over.  So for the heat death of the universe, what counts as micro and macro?  Maybe we can just use our scales to talk about it, but it doesn't seem like a natural boundary.

          On organized high complexity vs organized low complexity, consider a two chamber system, with each chamber at a different temperature.  If we open a door between them, there will be transformation as they equalize.  But the initial state seems organized and low entropy.  Now consider the biological cell you used in your example.  Even if everything is working right for the cell, wouldn't it be in a higher entropy state than the initial one of the simple two chamber system?  The cell needs constant energy intake to maintain its homeostasis and overall mechanisms.  The chamber system won't need that for its equalizing transformation.

          On the two books burning, this again gets to the pragmatic vs theoretical aspects.  If you burn both books, there will be micro-differences between what happens.  Nothing we could detect, mainly because the differences would be swamped by other stochastic variations from the environment.  But if information is conserved, then the configuration of the ink from each will lead to microscopic differences in the burning processes.

          I think when it comes to information, the thing to remember is that it's always physical.  We may not always talk about it as being physical, but it is.  In that sense, you could think of the words in a novel as a series of something like logic gates.  You shine light at the book, which reflects off the ink configurations, which strikes your retina and leads to state changes in your brain.  So the light plus ink configurations are causal not by themselves but, like the inputs to an AND gate, only when combined in the right way do they produce consequences.
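
          A toy sketch of that conjunction point, with hypothetical names of my own choosing, nothing standard:

          ```python
          # Toy version of the "light plus ink" point: neither input produces the
          # effect alone; the consequence only follows from their conjunction.
          def retinal_event(light_present: bool, ink_pattern_present: bool) -> bool:
              # behaves like an AND gate over the two causal contributions
              return light_present and ink_pattern_present

          print(retinal_event(True, False))   # light but a blank page: no pattern received
          print(retinal_event(False, True))   # ink but no light: nothing seen
          print(retinal_event(True, True))    # both: the downstream state change can happen
          ```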

          It’s interesting to note that things like books, computers, and other information bearing things are highly optimized to store causally differentiated patterns at the lowest energy levels that can be managed.  Due to that mechanism, they always need enabling systems, or amplification systems, to have causal effects in the world. It’s worth noting that brains don’t make the separation between action and information we do in books and computers.  Although the activity in brains still happens at very low energy levels.  It depends on its own amplification systems (muscles) to have effects in the world.

          Of course, if you’re a dualist, then maybe information in the mind isn’t physical, so the picture may not be as consistent.  But everywhere else, it seems consistently physical.

          Hopefully somewhere in this I’m getting at your question?  Let me know if not.  You’re always interested in entropy.  I’m always interested in discussing information!

          1. Hi Mike,

            We’re in interesting territory here…

            You wrote, "But for a broader theoretical picture, who defines the distinction between micro and macro?  It seems contingent on the scales we evolved to operate over.  So for the heat death of the universe, what counts as micro and macro?"  On this one, I wouldn't disagree that macro / micro are contingent on scales. At the same time, it's pretty clear that the air temperature outside is a macro property comprised of countless individual molecules moving around, and that the same air temperature can occur for countless states of those molecules. This phenomenon may not be as scale-dependent as it seems, though. There may be properties of galaxies or galaxy clusters that are the same everywhere, and that could be derived from countless configurations of the constituent stars, for instance?

            But either way, I probably shouldn't have brought up the heat death of the universe, as I think it's unclear whether entropy applies to the universe as a whole. I brought it up only because it's instructive to note that in a system without any gradients in the quality of its energy, no transformation can meaningfully occur. That said, presumably when the universe is close to equilibrium, a macro property would be something like whatever temperature is measured, and a micro property would be a comprehensive set of parameters defining the state of all its particles.

            On this one, "Even if everything is working right for the cell, wouldn't it be in a higher entropy state than the initial one of the simple two chamber system?"  I think it's hard to answer. If there are 100 trillion atoms in a cell, and the two-chamber-and-valve ensemble has 100,000 trillion atoms divided into a hot side and a cold side, it seems intuitive that the number of microstates the two-chamber system could occupy would be vastly greater than the number of microstates a functional cell could occupy. On that basis, the two-chamber system has the higher entropy, not the cell, I'd say. Even if entropy is expressed on a "per-mole of particles" basis or something, the living cell would likely have far fewer possible states it could occupy than a two chamber system with the same number of particles. There are just far, far fewer constraints in the latter.

            On the book-burning example, and information in general, I have no issue with information being physical in the context of this conversation. I just don't necessarily think we define what information "is" the same way. The book without words, the one that is just a stack of papers with ink smeared on it, has a higher information content from an entropy perspective than a novel with the same quantity of paper and ink, because it takes more bits of information to define the disorganized thing's state. But that is information about the system, not information in the sense of physical parameters that impact the world in any way whatsoever. Entropy is about the quantity of information needed to describe the system, and the greater that is, the less that system can do (thermodynamically / classically).
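
            One crude way to see the "more bits to define the disorganized thing" point is with a general-purpose compressor; this is only a loose proxy for description length, not a measurement of thermodynamic entropy:

            ```python
            import zlib, random

            random.seed(0)

            # A highly structured "novel": the same sentence repeated many times.
            ordered = ("All happy families are alike. " * 2000).encode()

            # A "smeared ink" counterpart: the same number of bytes, but random.
            disordered = bytes(random.randrange(256) for _ in range(len(ordered)))

            # The structured text compresses to a tiny fraction of its size; the
            # random bytes barely compress at all, i.e. they need roughly all
            # their bits to be specified.
            print(len(ordered), len(zlib.compress(ordered)))
            print(len(disordered), len(zlib.compress(disordered)))
            ```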

            But similarly, von Neumann entropy is about the information that is unavailable to any viewer of a quantum system from a given vantage point. That information is unavailable when the quantum ensemble under review is in a mixed state, and the properties of the wave function that would contain all the information about its condition are "lost" through entanglement with the environment. So there is no measurement that could be made on the quantum system in front of you that would ever provide information on the entire wave function. Here, entropy is higher when either quantum events have more possible outcomes, or when an increasing quantity of information contained in the wave function is inaccessible due to entanglement. In both cases, entropy is higher when the amount of missing information that would be required to fully describe the state, or to describe the possible transformations of that state, is greater.
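
            For reference, the standard formula here is S(ρ) = −Tr(ρ ln ρ). A small numerical sketch for a single qubit (the states are just the simplest illustrations, only meant to show the pure-versus-mixed contrast):

            ```python
            import numpy as np

            def von_neumann_entropy(rho):
                """S(rho) = -Tr(rho ln rho), computed from the eigenvalues of rho."""
                eigvals = np.linalg.eigvalsh(rho)
                eigvals = eigvals[eigvals > 1e-12]   # treat 0 * ln(0) as 0
                return float(-np.sum(eigvals * np.log(eigvals)))

            # A pure qubit state |+> = (|0> + |1>)/sqrt(2): everything about it is,
            # in principle, available, and its von Neumann entropy is zero.
            plus = np.array([1.0, 1.0]) / np.sqrt(2)
            rho_pure = np.outer(plus, plus.conj())

            # A maximally mixed qubit, e.g. one half of an entangled pair after the
            # other half is traced out (entangled with the "environment").
            rho_mixed = np.eye(2) / 2

            print(von_neumann_entropy(rho_pure))    # ~ 0.0
            print(von_neumann_entropy(rho_mixed))   # ~ 0.693 = ln 2
            ```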

            I agree there are microscopic differences between burning a novel and a random stack of paper of similar size, but there are probably no two identical macroscopic objects in the entire universe so I’m not sure what is significant about that.

            It’s still difficult for me to see computers as entropy transformers for the reasons noted in my earlier comment. It comes back to this for me at the moment: So, ultimately, what I’m saying is that reading or writing to / from a drive doesn’t alter the number of possible micro states the drive could be in…

            Michael

          2. Hi Michael,
            Good point on the chambers. I probably should have picked something on a similar scale to the cell, like maybe a modern microtransistor, although those aren’t engineered to spontaneously transform.

            The heat death of the universe is often cited as a maximal entropic state, so it makes sense to bring it up. I can see it in terms of transformation no longer being possible. Although whether the universe is ultimately an open or closed system I think remains unknown.

            Sounds like we agree that a high entropy system is one that takes a lot of information to describe. I kind of pushed the boundaries a bit by saying it just is a high quantity of information, to see if anyone could distinguish them. Along those lines, you’ve given me things to think about. Thanks!

            Another case I'm not sure about is black holes, which are also said to be maximal entropy systems. Although that gets into the information loss paradox, so I probably shouldn't expect consensus answers there.

            Definitely for von Neumann entropy a mixed state is higher entropy than a pure one. Decoherence is an entropic process. In fact, its irreversibility, unlike an ontic collapse, is statistical in nature, a second law process, or at least that’s how I think of it.

            The main reason I noted the difference in book burning was information conservation, something that seemed pertinent to the questions you were asking.

            For the drive, I think what I'd point out is that a drive where we've run software to blank out the content (like DBAN) basically has a reduced number of microstates. If we subsequently begin saving data to it again, the number of microstates increases, or that's the way it seems to me. This isn't anything that happens spontaneously. It takes energy flowing into the system, and waste heat removal, for it to happen in a way that's productive.

            Is the portion of the system undergoing transformation anywhere near what happens in a living system? No, not at all. It's fair to say it's infinitesimal in comparison. Although as I noted above, that could change with future technologies.

  9. Hi Mike,

    Agreed on most of your notes here, Mike. I wanted to provide some thoughts on the drive example. I think there are two ways to think about this.

    First, one could think about the contents of the drive as information in the Shannon sense, which I'm no expert on, but for which I think it would be obvious that the entropy of a blank drive would be different from that of one with non-random files saved on it.

    But as for the drive itself, I think of the microstates as being all possible combinations of 1’s and 0’s for the memory. Thought of like this, it doesn’t matter what is on or not on the drive–whether any non-random files are saved to it or it is blank, or all 1’s, or any combination of 1’s and 0’s whatsoever. The micro state that it is in doesn’t change the macro state–those micro states would just all be possible positions of the “particles”–none of which are distinguishable from one another by macroscopic properties. “Physically” speaking, the change in micro states associated with reading / writing to the memory would be analogous to the molecules of gas in a fixed volume being in different locations within the volume. Subject to various “real world” assumptions, the temperature measured, for instance, wouldn’t change.
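
    In Boltzmann's terms, treating the N bits themselves as the "particles" (a toy framing rather than a real device model), the count comes out the same no matter what pattern is written:

    ```python
    import math

    k_B = 1.380649e-23   # J/K, Boltzmann constant

    def drive_entropy_boltzmann(n_bits):
        # S = k_B * ln(Omega), with Omega = 2^N possible bit patterns.
        # This depends only on the number of bits, not on which pattern
        # (blank, all 1's, or a saved file) the drive happens to hold.
        return k_B * n_bits * math.log(2)

    print(drive_entropy_boltzmann(8e12))   # same value whether "blank" or "full"
    ```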

    So I think, in a sense, we're just saying different things here. Even though Shannon entropy is analogous to thermodynamic entropy, I'm not thinking they are physically identical. But both could be causal, in a sense… with intriguing distinctions. Let me back up: in thermodynamics I'm not sure I'd say that entropy is "causal," at least not in the way that electric charge or fluid pressure is, for instance. To try to explain what I mean: we don't need entropy for physical events to occur. A ball would in theory roll downhill in a gravitational field with or without entropy. Likewise, it won't roll up the hill with or without entropy. Entropy is just a quality of things, like, say, a level of brightness or a degree of saturation in a color, and by measuring this property we can make various predictions about how physical systems behave. But it doesn't make sense to me to say that entropy "causes" a ball to roll down a hill.

    That said, entropy seems to suggest the existence of a container or structure in which the causality of other physical processes can occur. That structure is time, perhaps. We know that entropy in physical processes changes in only one direction, and in a sense this is time, or irreversibility. Does time "cause" a ball to roll down a hill? I mean, it doesn't seem like it does. But it wouldn't even make sense to discuss a ball rolling downhill without time. If there weren't time, or that "uni-directional" as well as qualitative (low to high entropy) sequencing of physical processes, then we wouldn't have causality at all and many more physical processes would theoretically be possible… I think causality as we understand it would probably break down.

    Information in the Shannon sense is almost the inside-out of that type of entropy, in that it only seems causal within a context. It doesn't provide the context. For physical processes, entropy seems to be characteristic of a context in which those processes occur, like I tried to say above. But the information associated with signaling becomes "something more" than just physical bursts of energy only within a context of meaning, such as a language or a code. Without this, such information would not be very causal. Entropy here relates to how easily a given message can be understood, I think, which occurs within the context of some sort of language or code. "Causation" in this context results from the receiver of a signal taking (or not taking) an action based upon the information received. It seems that in this scenario "cause" moves from the motive origin of a signal, follows the signal through whatever medium it travels, and then continues in the form of the recipient's response.

    I’m inclined to think these are two different orders of phenomena. The computer would be causal here in the sense that it can initiate and receive signals and depending on our definitions “respond” to them. There’s “something extra” layered upon the base physical processes that define the signal, and this is more than analogous to living systems: it’s one thing I think that makes living systems unique. They are coded orchestrations of energy in flux. You have to have the unique low entropy thermodynamics and the unique low entropy signaling together in a sense, though they are not identical phenomena for me. There are thermodynamics there for sure, but also a very rich, multi-layered system of signaling with “meaning” in the sense that the signaling drives responses related to maintaining the organism as a whole.

    Michael

  10. The problem I have with applying Shannon’s entropy to things like computers and the universe is that it relates to messages where there is a sender and a receiver who needs to decode the message.

    “[Shannon] captured it in a formula that calculates the minimum number of bits — a threshold later called the Shannon entropy — required to communicate a message. He also showed that if a sender uses fewer bits than the minimum, the message will inevitably get distorted.”

    https://www.quantamagazine.org/how-claude-shannons-concept-of-entropy-quantifies-information-20220906/

    Who or what are the senders and receivers of the universe? What is the message of a computer?

    1. My initial take is to wonder what is necessary for something to be a sender or receiver. For example, our computers send and receive all kinds of information, most of which we're not privy to. And we can see DNA and RNA as messages by which cellular machinery like ribosomes and proteins communicate with each other. Are there any systems we'd say are too simple to be either a sender or receiver? If so, what's our criterion for the cutoff?

      1. Senders/receivers wouldn’t apply to the universe in total unless we bring God (sender) and maybe consciousness (receiver) into the picture, which I won’t. I don’t know that rocks are either senders, receivers, or messages in any meaningful sense.

        Communications, of course, are key parts of computing and living systems. I don’t think computing and living systems are especially comparable in how they communicate. Computers use electronic bits which would be what Shannon is talking about. I don’t know exactly where the “bits” are in living systems or how Shannon’s theory would apply except as metaphor.

        So, I would pretty much exclude everything except “bit” related technology.

        If the brain is a receiver and reality is the message (from some unknown sender), it might be doubtful we are receiving the minimum number of bits to decode the message without distortion.

        1. Right, this gets into questions like what, if anything, caused the universe, and what, if anything, does the universe cause? We don't even know if those are relevant questions, since we're asking them as systems embedded inside the universe.

          A rock is affected by whatever causal effects act on it. So solar radiation could be seen as communication between the sun and a rock, resulting in the rock heating up. I'm not saying that's a particularly productive interpretation, just that it is a possible one.

          If you compare Shannon's formula to Gibbs' entropy formula, they're basically the same, except that Shannon uses base 2 for his logarithm while Gibbs uses the natural logarithm, base e. (This was recognized by John von Neumann, who reportedly urged Shannon to just call his concept "entropy".) As I understand it, for classical analog systems, there's always an amount of coarse graining involved in analyzing them, so the actual units end up being arbitrary, or more accurately, scaled for whatever the purpose might be. Lloyd, in his book, asserts that the continuous / discrete distinction disappears in quantum computing, but I don't remember the ins and outs at this point.
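
          For reference, the two formulas in their standard textbook forms, the differences being the base of the logarithm and Boltzmann's constant acting as a unit-fixing prefactor:

          ```latex
          % Shannon entropy of a source with symbol probabilities p_i (in bits):
          H = -\sum_i p_i \log_2 p_i
          % Gibbs entropy of a system with microstate probabilities p_i (in J/K):
          S = -k_B \sum_i p_i \ln p_i
          ```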

          Right, if God is the sender, reality the message, and we the receiver, that seems to put us in a Berkeley idealist type framework. I agree that if that's the situation, the message / receiver dynamics could have been set up better.
