I’ve long struggled with the concept of entropy. Part of the reason is the way it’s often described in popular science accounts, which typically seem subjective and value-laden. The most common way of describing it is as the amount of disorder in a system.
But disorder according to whom? A room that appears messy and disordered to an outsider, like my office, might be the owner’s idea of pragmatic organization. Of course, the owner typically knows the causal history of the items in the room, and so might know where everything is, but to an outsider, it’s a mess with lots of uncertainty within it. (Yeah, I know I’m full of baloney on knowing where everything is. But it’s not like knowledge of where things are is any better after someone comes in and straightens everything into nice neat stacks.)
That relates to another typical description of entropy, the amount of uncertainty in a system, or perhaps the amount of hidden information. But again, that seems subjective. If you know more about the configuration of a system than I do, does that mean it has a lower entropy for you than for me? This seems wrong for what’s supposed to be an objective property of a physical system.
In my own mind, I typically think of entropy as the extent to which a system has approached its final causal state, its resting ground state, which fits with Rudolf Clausius’ decision to use the Greek word for transformation to describe it. But that’s admittedly somewhat tautological, describing entropy in terms of its effects rather than what it is. And it doesn’t provide any insight on how to quantify it.
Of course, scientists like Clausius, Ludwig Boltzmann, Josiah Willard Gibbs, John von Neumann, and Claude Shannon have already done the heavy lifting here. The equations that have been hammered out describe entropy in terms of the number of states a system can be in, more precisely, as proportional to the logarithm of that number.
Shannon’s inclusion here is interesting, because his contribution comes from the direction of information theory. It turns out that the measure of a system’s entropy is the same as the measure of how much uncertainty can exist in it, in other words, how much information it would take to specify its exact state. So a possible way to describe entropy is as the quantity of a system’s information.
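To make that connection concrete, here’s a toy sketch in Python, using nothing beyond the textbook Shannon formula (the particular numbers are just an illustration I made up): for a system with W equally likely states, the Shannon entropy works out to log2(W) bits, exactly the amount of information needed to pin down which state it’s in.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over the nonzero probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

W = 16                              # a toy system with 16 possible states
uniform = [1 / W] * W               # maximum uncertainty: all states equally likely
print(shannon_entropy(uniform))     # 4.0 bits == log2(16)

# If the distribution is peaked, the remaining uncertainty -- the information
# needed to specify which state the system is in -- drops:
peaked = [0.9] + [0.1 / (W - 1)] * (W - 1)
print(round(shannon_entropy(peaked), 2))  # ~0.86 bits
```

The second print just shows how the number tracks uncertainty: a peaked distribution takes less information to describe than a uniform one over the same states.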
Of course, this is in terms of Shannon information. Many object to the word “information” here, because they don’t perceive it as necessarily semantic information, that is, meaningful information. A common view is that information is data plus meaning, and Shannon information seems like just raw data, physical patterns, at best.
But this raises the question of what we mean by “meaning”. (Yes, I’m pondering the meaning of meaning. Sorry.) What gives data meaning? A lot has been written about this. My own take is that meaning comes from the causal history that resulted in the physical patterns we call data, and the causal effects those patterns might have in the future. It’s the relationship between the data’s past and future light cones. Put another way, meaning comes from the data’s relationships with its environment.
In this view, data always has meaning. We often just don’t know what it is. For example, tree rings have semantic meaning for anyone who understands their causal history, in how they are produced through the tree’s development and growth. But without that knowledge, tree rings are just a peculiar pattern.
That means data is always information, although it may not be semantic or useful information for me or you. Put another way, semantic information is a subset or a type of physical information, and whether or not a piece of information is semantic depends on our knowledge of its causal history and potentialities. In other words, the distinction is relative to the infomee.
Last year I tentatively defined “information” as causation and asked if anyone could find distinctions between these concepts. I think I might have just found one, because a high entropy system has a lot of physical information, but little causality left in it, at least without outside energy. That seems to imply that information is a result of causation rather than causation itself. But in a relatively low entropy system, information can still have causal efficacy. So maybe a better way to describe it is as a snapshot of causal processing. That also has the nice benefit of keeping a meaningful distinction between information and information processing.
Since the universe is always producing entropy, this has the interesting implication that the universe is always producing information. The universe is an information-producing system. That probably sounds more profound than it is, since all we’re really saying here is that the universe is always increasing in complexity. (I do wonder what this means for the state of the universe after heat death, which doesn’t seem particularly complex to me, but is usually considered a high entropy state.)
This relation is also making me take another look at brain entropy theories. The idea that a brain is a high entropy system seems very counter-intuitive. But I suspect it seems that way because of the value-laden way we think about entropy. It’s similar to the fact that when someone starts learning accounting, they often have to unlearn the idea that credits are always good and debits bad. Or the idea that the bacteria in our bodies are always bad, when it turns out they play an integral role in digestion and other processes. In that sense, the idea that the brain uses entropy might fit right in.
Anil Seth, in a recent interview with Sean Carroll, talked about transfer entropy, and how it’s been shown to be equivalent to Granger causality (at least for Gaussian variables). It reminded me of the study that found information hubs in the brain (with the idea that they’re the hubs of the global workspace) by using a technique called NDTE (normalized directed transfer entropy). Talking in terms of entropy rather than information seems like a more cautious way to discuss it, avoiding the definitional morass between physical and semantic information.
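For anyone curious what transfer entropy actually measures, here’s a bare-bones sketch for discrete signals, with a history length of one and simple plug-in probability estimates. To be clear, this is my own toy illustration of the general quantity, not the NDTE technique from that study or anything from Seth’s work.

```python
from collections import Counter
from math import log2
import random

def transfer_entropy(x, y):
    """TE(X -> Y) = sum p(y1, y0, x0) * log2[ p(y1 | y0, x0) / p(y1 | y0) ],
    where y1 is y's next value and y0, x0 are the current values of y and x."""
    triples = list(zip(y[1:], y[:-1], x[:-1]))    # (y_next, y_now, x_now)
    n = len(triples)
    c_yyx = Counter(triples)                      # joint counts
    c_yx = Counter((y0, x0) for _, y0, x0 in triples)
    c_yy = Counter((y1, y0) for y1, y0, _ in triples)
    c_y = Counter(y0 for _, y0, _ in triples)

    te = 0.0
    for (y1, y0, x0), c in c_yyx.items():
        p_joint = c / n
        p_cond_full = c / c_yx[(y0, x0)]          # p(y1 | y0, x0)
        p_cond_self = c_yy[(y1, y0)] / c_y[y0]    # p(y1 | y0)
        te += p_joint * log2(p_cond_full / p_cond_self)
    return te

# Toy check: y copies x with a one-step lag, so x's past fully determines
# y's next value, while y's past adds nothing about x beyond x's own past.
random.seed(0)
x = [random.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]
print(round(transfer_entropy(x, y), 3))   # ~1 bit per step
print(round(transfer_entropy(y, x), 3))   # near 0
```

The asymmetry between the two printed numbers is the directedness that makes transfer entropy attractive as a causality-flavored measure.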
All of which is to say, entropy, information, and causality are distinct concepts, but they seem intimately tangled together into a tight conceptual hairball. At least, that’s the way it seems to me today.
What do you think? Do the relations I’m describing here make sense? If not, where is the logic going wrong, or what am I missing?