A Brief History of Entropy: Chapter 1

Manish Kausik H
Towards Data Science
12 min read · Aug 15, 2020


Though the idea of entropy was introduced during the study of systems from a thermodynamic perspective, it has inspired innovations in multiple unrelated fields, prominent among them being the design of computer algorithms for artificial intelligence and the design of communication systems. Join me in this series of articles where I take you on a journey alongside our friend, entropy, as he makes his way from thermodynamics to the classification algorithms used in deep learning.

Chapter 1: Thermodynamics

In the early 1800s, a French engineer, Sadi Carnot, was investigating the work done by heat engines. After some detailed analysis, he concluded that any heat engine needs not only a hot body but also a second body at a lower temperature to operate. In a thought experiment, he designed a heat engine that worked with the maximum possible efficiency (the amount of work that you receive from the engine for every unit of energy that you spend on it) with which any engine can operate, which famously came to be known as Carnot’s heat engine. The details of Carnot’s engine can be found in any standard book on thermodynamics. What concerns us most is that, during the analysis of his engine, Carnot realized that the following equation holds:

Equation 1: $\dfrac{Q_{hot}}{T_{hot}} = \dfrac{Q_{cold}}{T_{cold}}$
A Heat engine. Image Courtesy: Wikipedia, https://en.wikipedia.org/wiki/Carnot_cycle
Carnot’s cycle. Image Courtesy: Wikipedia, https://en.wikipedia.org/wiki/Carnot_cycle

That is, the ratio of the heat taken from the hot reservoir to the temperature of the hot reservoir is equal to the ratio of the heat rejected to the cold reservoir to the temperature of the cold reservoir. Another important property of his engine is that it operates through a reversible process. This property emerges from the assumption that there are no dissipative forces in the process and from the smart design of the Carnot cycle, thanks to which every part of the cycle can be treated as a quasi-static process (an always-equilibrium process) without any approximation. So this theoretical engine had no scope to lose energy anywhere, making it easily reversible… yet it could never attain perfect efficiency.
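To see why, combine equation 1 with energy conservation (a standard derivation, spelled out here for completeness rather than taken from the original article). The work extracted per cycle is $W = Q_{hot} - Q_{cold}$, so the efficiency is

$$\eta = \frac{W}{Q_{hot}} = 1 - \frac{Q_{cold}}{Q_{hot}} = 1 - \frac{T_{cold}}{T_{hot}},$$

which is strictly less than 1 for any cold reservoir at a temperature above absolute zero.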

Carnot’s analysis of a reversible engine subsequently also led to the conclusion that only a reversible engine can have this maximum efficiency. But his analysis applied only to reversible processes, something that can rarely be found in nature. Thus equation 1 does not hold for most real-life processes, as most processes in nature do not reverse on their own. Objects always fall, and never go up on their own… heat always flows from a hot body to a cold body, not the other way around… nature is bountiful with such examples. What is it about irreversible processes that makes nature always prefer them to reversible ones?

What do we do when we can’t explain a process? We make it a constitutive law, a law that portrays an observation as a rule and says that this rule cannot be violated. That’s the second law of thermodynamics, at least as it was formulated by Kelvin-Planck and Clausius (independently). Overall, the law stated that heat cannot flow from a cold body to a hot body without an external agent ensuring that it does. This was a plain observation in nature, and no one knew why it happened so. Here I’d like to stress the role of the external agent. Unless an external agent gets involved in the process, heat cannot flow from a cold body to a hot body… which indicates that all the observations mentioned before occur only when the system is not tampered with, or, in formal terms, is a closed system.

One way of interpreting the statement that I’ve just made is to assume that the external agent influences the system in the form of work. This gives us Clausius’ statement of the second law. But, hush-hush, before we proceed further in this discussion, let’s see how entropy as an idea was born against such a background…

Birth of entropy

Thermodynamics was (at least then) the study of the state variables of a system. Carnot’s equation (1) made Clausius draw parallels with the law of energy conservation, and he postulated that perhaps this quantity Q/T is conserved in a reversible process. He went ahead and called this quantity entropy (popularly denoted by S) and claimed that it was a state variable too. Cool! So, entropy is conserved in a reversible process. But what about irreversible ones?

Let us consider one irreversible process, the conduction of heat from a hot body to a cold body. The entropy lost by the hot body will be (heat lost by it / temperature of the hot body), while the entropy gained by the cold body will be (heat gained by it / temperature of the cold body). Keeping in mind that the heat lost by the hot body is the same as the heat gained by the cold body, the total change in entropy of the system made of these two bodies will be:

Equation 2: $\Delta S = \dfrac{Q}{T_{cold}} - \dfrac{Q}{T_{hot}} > 0$ (since $T_{hot} > T_{cold}$)

Thus entropy increases in this irreversible process. Pardon me, but here I must extend this claim: entropy increases in any irreversible process.
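To make this concrete, here is a minimal Python sketch of equation 2; the numbers are illustrative values of my own choosing, not taken from the article:

```python
# Entropy change when heat Q is conducted from a hot body to a cold body.
# All values below are illustrative assumptions.

Q = 100.0        # heat transferred, in joules
T_hot = 400.0    # temperature of the hot body, in kelvin
T_cold = 300.0   # temperature of the cold body, in kelvin

dS_hot = -Q / T_hot     # the hot body loses entropy
dS_cold = Q / T_cold    # the cold body gains the same heat, at a lower temperature

dS_total = dS_hot + dS_cold
print(f"Total entropy change: {dS_total:.4f} J/K")   # positive (~0.0833 J/K)
```

Because the same heat Q is divided by a smaller temperature on the receiving side, the total is positive whenever the hot body is hotter than the cold one.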

Okay, so we defined the entropy change in a reversible process as Q/T. And for reversible processes, entropy is conserved. But clearly, entropy is not conserved in irreversible processes.

Our analysis, in view of the stated second law, lets us draw the following two conclusions:

  • The entropy of a closed system always increases with time. A closed system can have only irreversible processes happening in it, as there is no external influence acting on it.
  • Nothing can be said about the entropy of open systems.

Okay, so till now we have just listed some observations as rules that we choose not to violate. Can we justify why such observations hold? The answer to this question lies in the kinetic theory of matter and the statistical analysis of gases.

Probabilistic interpretation of Entropy

Till now, no one knew the nature of this newly proposed state variable, entropy… What is it? Can we somehow relate it to some property that we can observe in real life? Before we formally draw some parallels between probability and entropy, let us build an intuitive feel for them.

Suppose a gas contained in a vessel of volume V2 currently occupies a volume V1. There is a partition that holds it back, and when the partition is removed, the gas expands to fill the volume V2. We ensure that this process is isothermal by giving it some heat Q, as the gas will tend to cool down during expansion. Consequently, the entropy change associated with this process is:

Equation 3: $\Delta S = Nk \ln\dfrac{V_2}{V_1}$

where N is the number of molecules of the gas involved in the expansion and k is Boltzmann’s constant.

Using this example, we can explain the increase in entropy based on probability. Before expansion (just after the partition is lifted), the probability that a gas molecule is present in the volume V1 is V1/V2. The probability that 2 molecules will be present in the volume V1 is (V1/V2)². Thus the probability that N gas molecules are present in the volume V1 is (V1/V2)^N. After expansion, the probability of the gas molecules occupying the volume V2 is 1. If we denote the probability that N gas molecules are present in volume V1 as w1 and the probability that N molecules are present in volume V2 as w2, then their ratio is:

Equation 4: $\dfrac{w_2}{w_1} = \left(\dfrac{V_2}{V_1}\right)^N$

Comparing equations 3 and 4,

Equation 5: $\Delta S = k \ln\dfrac{w_2}{w_1}$
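As a quick numerical check that the thermodynamic expression (equation 3) and the probabilistic one (equation 5) agree, here is a small Python sketch; the values of N, V1, and V2 are arbitrary assumptions, and N is kept small so that (V1/V2)^N does not underflow:

```python
import math

k = 1.380649e-23    # Boltzmann's constant, J/K
N = 100             # number of molecules (illustrative; kept small to avoid underflow)
V1, V2 = 1.0, 2.0   # initial and final volumes, arbitrary units

# Equation 3: entropy change of the isothermal expansion
dS_thermo = N * k * math.log(V2 / V1)

# Equations 4 and 5: entropy change from the probability ratio w2/w1
w1 = (V1 / V2) ** N   # probability that all N molecules sit in V1
w2 = 1.0              # probability that all N molecules sit in V2
dS_prob = k * math.log(w2 / w1)

print(dS_thermo, dS_prob)   # the two values coincide
```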

Boltzmann’s formula

In 1872, Ludwig Boltzmann published a formula in which he expressed the entropy of a system in a certain state as proportional to the logarithm of the probability of that state. The proportionality constant was later called the Boltzmann constant.

Equation 6: $S = k \ln w$

We can quickly verify that equation 5 can be deduced from equation 6.
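Indeed, writing equation 6 for the two states of the expanding gas and subtracting gives exactly equation 5:

$$\Delta S = S_2 - S_1 = k\ln w_2 - k\ln w_1 = k\ln\dfrac{w_2}{w_1}$$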

This new definition of entropy makes the additive property of entropy obvious. Suppose a system is made of two subsystems, one in state 1 (with probability w1) and the other in state 2 (with probability w2). The probability that the whole system is in this combined state is w1·w2. Thus the entropy of this system would be

Equation 7: $S = k\ln(w_1 w_2) = k\ln w_1 + k\ln w_2 = S_1 + S_2$

Thus the entropy of the system is the sum of entropies of its subsystems.

Entropy as a measure of disorder

Given a system in a state, Boltzmann’s formula links its entropy with how probable it is that the system is in that state. The higher the probability, the higher the entropy. So one question naturally arises… Which states are more probable than others?

Before we get into the above question, there is an even more fundamental question that we must ask ourselves. What is it that I, a human, perceive as a state of a system? What distinguishes one state from another? Let’s make this clear with an example:

Suppose we have three balls, each of which can take one of two colors: red or blue. So there are the following 8 configurations that this 3-ball system could be in:

Image by Author

None of these 8 states can be decomposed further into smaller states. A popular term for these fundamental states is ‘Microstates’. Now the question is, what constitutes a state for you? You could say that all those microstates with at least 2 blue balls form one state, and the rest form another. Yes, that is a valid definition of a state. Meanwhile, another person could say that for him, all those microstates with at least 1 blue ball form one state, the rest being another. Do you see that what one perceives as a state is up to one’s purpose and decision? By the way, these ‘made up’ states have to be combinations of microstates and are popularly called ‘Macrostates’.

Despite the absolute freedom to build states out of whichever microstates one likes, we humans (at least most of us) have a keen eye for observing patterns in these microstates and designating a special category of states called “Order”. For example, I deem all red and all blue balls (S1 and S8) as “Order”, as observing these states is oddly satisfying for me, and I believe many of my fellow humans would gladly agree. But what mostly happens is that the subset we humans deem as order is very small compared to what we deem as ‘disorder’. Thus, in this binary classification of states, the system is more likely to be in a ‘disorder’ state than in an ‘order’ state, as the sketch below illustrates.
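Here is a minimal Python sketch of this counting argument. The choice of “Order” as the macrostate with all balls the same color (S1 and S8) is the one made above, and every microstate is assumed equally likely:

```python
from itertools import product

# Enumerate all microstates of 3 balls, each red ('R') or blue ('B')
microstates = list(product("RB", repeat=3))   # 2**3 = 8 microstates

# Macrostate "Order": all balls share the same color (S1 and S8)
order = [m for m in microstates if len(set(m)) == 1]
disorder = [m for m in microstates if len(set(m)) > 1]

# With every microstate equally likely, the macrostate probabilities are:
print("P(Order)    =", len(order) / len(microstates))      # 2/8 = 0.25
print("P(disorder) =", len(disorder) / len(microstates))   # 6/8 = 0.75
```

With more balls the imbalance only grows: for 100 balls, only 2 of the 2¹⁰⁰ microstates count as “Order” under this definition.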

With higher probability comes higher entropy, thus with disorder comes higher entropy.

Revisiting the Second Law

When entropy was introduced to interpret the second law, we concluded that, for a closed system, entropy increases with time. This is another statement of the second law. But the probabilistic nature of entropy tells us the extent to which the second law is true. The second law of thermodynamics is probabilistic, not deterministic like Newton’s laws. It is a law that is certain to be followed only if we give the system considerable time to follow it. It is permissible for the system to reduce its entropy for a short time if, in the long run, it increases its entropy (for those who have read Dan Brown’s Origin, this statement carries a lot of vibes with it :) ). How long is the long run? We can’t say… It can be as short as a few nanoseconds for gas molecules or years for certain other systems. But what we can guarantee is that, given a lot of time, entropy will increase (this is what mathematicians call convergence in probability).

Maxwell’s Demon

In 1867, James Clerk Maxwell (yes, the same guy who proposed the four equations of electromagnetism) suggested a thought experiment that proposed a possible way in which the second law of thermodynamics could be violated. His experiment was as follows:

Maxwell’s demon. Image Courtesy: Wikipedia, https://en.wikipedia.org/wiki/Maxwell%27s_demon

Consider a vessel containing gas at some temperature. The vessel is partitioned in the middle by a wall that has only one door, through which only one molecule of the gas can pass at any given time. Maxwell proposed letting an intelligent being (the demon) stand as the guardian of the door: among all molecules approaching the door from the left partition, it allows only those faster than some threshold velocity to move to the right partition, and at the same time, among all molecules approaching the door from the right partition, it allows only those slower than that threshold to move to the left partition.

Now you need to understand the consequences of the existence of such a being. But before that, is such a being even capable of existing? Yes, obviously; we decision-making humans are a good example of such beings. Let’s see what the demon has essentially done in our vessel. The demon has acted as a filter and has segregated faster molecules from slower ones. So, essentially, it has created order. Also, in the process of segregation, the faster molecules have accumulated on one side of the vessel while the slower ones have accumulated on the other. The average kinetic energy of the molecules on the right side is higher than that of the molecules on the left side, directly implying that the temperature on the right is higher than the temperature on the left. So what has this demon accomplished?

  • It has created order and hence has decreased entropy
  • It has created a temperature gradient, thus enabling work to be extracted from the system

But at what cost has the demon accomplished this feat?

That question is the crux of Maxwell’s thought experiment. It’s easy for us to consider the demon as not being a part of the system, argue that the vessel is an open system, that external interference is therefore allowed, and hence that it’s perfectly fine if entropy decreases. But this argument fails if we force ourselves to consider the demon to be a part of our system, which is completely justified, as we are at complete freedom to choose what constitutes our system.

Another naive argument that arises from the probabilistic nature of the second law is that it allows for a decrease in entropy, at least for a short period. But one can easily conclude that the entropy of this system can never increase, as the demon is continuously striving to segregate the molecules, and will continue doing so for eternity.

If the system made of the vessel and the demon is open, where is the leak? How is the system gaining energy to create order within itself? But if the system is closed… and if we were to take the second law of thermodynamics as sacrosanct… What the hell is happening here? Here is one relevant question:

Don’t you think the demon will need some method to acquire information about the velocity of the molecules? Isn’t it possible that such a method would consume energy and hence be the leak in this “Closed System”?

That was the argument raised by Leó Szilárd in 1929. He proposed that the demon needs some means to measure molecular speeds, and that this method of acquiring information spends energy. By spending energy, the demon can only increase his own entropy, since the entropy of the gas in the vessel is clearly decreasing. Szilárd argued that this increase in the entropy of the demon should be more than the decrease in entropy caused by the segregation process in the vessel. Quite a plausible explanation! Except for the question… What does it mean to increase the entropy of the demon? Which part of the demon gets more disordered than before? What does it mean for me to be more disordered after I have observed a reading from an ammeter during an experiment?

One question has led to several others. So let me end the trend… What does it mean to observe data?

Find out how to exorcise Maxwell’s demon in Chapter 2!
