A Shallow Dive Into Bitcoin’s Blockchain Part 1 - Consensus

Andreas Lymbouras
Towards Data Science
11 min read · Jul 8, 2019


From natural objects to coins, to paper money backed by gold, to just paper money, to digital money. Whatever the format, money serves as a means of exchange, a method of payment, a standard of value and a store of wealth.

Today, most transactions are becoming purely digital. For example, this is what happens when you buy something from a merchant using your card:

1. You authorize the payment by entering your PIN into the merchant’s terminal

2. Transaction and card data are sent to the merchant’s bank

3. Merchant’s bank (Acquirer in the figure below) forwards the details to your card’s organization (Visa, Mastercard, etc.)

4. Then the card organization contacts your bank (issuing bank in the figure below) to make sure the card is valid and has money in it

5. Once the merchant receives the OK message back, they can give you the goods/services

Acquirer is Merchant’s Bank. Issuing Bank is the Customer’s bank. Source

6. At the end of the day, the merchant’s bank sends all the valid transaction data to the card organization (Visa, for example) for settlement

7. The card organization forwards the merchant’s transactions to the customers’ banks (issuing banks)

8. The customers’ banks transfer the funds to the card organization

9. The card organization pays the merchant’s bank, less an interchange fee

10. The merchant’s bank deposits the funds into the merchant’s account, less a fee

Acquirer is Merchant’s Bank. Issuing Bank is the Customer’s bank. Source

Card organizations act as the middleman for financial institutions. They facilitate transactions among people’s bank accounts. They sit on top of the banks, giving you secure access to your funds with just a card.

This article, however, analyses how we can securely transact amongst ourselves.

Let’s try to build a system without having to depend on any organization.

The first approach would be to have a public place where we save all the transactions happening among us. A database ledger, for example, on a public server somewhere. Whenever a transaction happens, we would save it in that database. A digital wallet would automatically send the details of a transaction to that server.

But you can quickly see problems arising with this approach. Who will be in charge of such a database server? Whoever has access to the server would have the power to add, edit or delete transactions at will.

To remove that bit of trust we can have every participant of the network keep a copy of that database ledger on their machine.

But here is another question. How can participants keep this ledger in sync? How can you get everyone to agree on the same ledger?

To address this challenge, central authorities are replaced by a consensus algorithm called Proof of Work (PoW).

Sidenote:
The common term for PoW is “mining”. The purpose of mining is not to find new coins, as the term misleadingly suggests, but rather to keep the blockchain secure.

The basic idea is to trust the ledger with the most computational work done on it.

Remember, from a participant’s point of view, no one can be trusted!

On the bitcoin network, miners put transactions together in a block and broadcast it to the rest of the participants. Not all participants need to create blocks (be a miner) though. Participants can also just listen and verify the validity of new transactions, in particular the transactions that they are interested in.

A wallet app on a mobile phone, for example, can also participate without supporting everything that a full mining participant does (verifying and broadcasting all newly created transactions and blocks, creating new blocks, etc.). That would require lots of computational power and gigabytes of space.

Alice, running a full mining node, just found a new block and broadcast it to the rest of the network. Bob, being a participant in the network, listening for new blocks of transactions from the miners, receives the newly created block. Source

Let’s say you are a participant on this network. How can you trust that miners do their job right? In other words, how can you be sure that the blocks of transactions you receive from them are valid and that everyone else records the same blocks?

The way you trust the miners is by validating that they put some computational effort (PoW) into creating those blocks. This mining process will be discussed later in more detail.

Bad actors can put in some effort as well and send you blocks, right?

Correct! But here is the caveat:

Because it takes time and effort to create the PoW for those blocks, bad actors can’t easily trick the whole network.

Assume that you receive a block with a bad miner’s transaction in it. After a little while, you hear about another block from a good miner, but with a conflicting transaction. From your point of view, you don’t really know which of the two blocks to trust.

The key point is to keep both blocks, creating two separate blockchains (a fork event), and to keep listening for new blocks along with their PoW.

For more details see this animation’s caption:

In this example, Alice is trying to cheat the network by trying to convince it that her version of the chain is the valid one (and thus the transactions in it). Bob, who is a participant in the network, is trying to understand which blockchain is the correct one. Source

A fork is not necessarily a bad thing. It happens every day in the network, occurring naturally as a result of transmission delays across the global network, e.g. when two different miners find and transmit a block at about the same time. (Each miner may include different transactions, or the same transactions in a different order, when creating a block, so two miners can each find a valid nonce for different blocks at the same time.) But as long as all nodes wait and select the longest chain, that is, the one with the most work done on it, the network eventually converges to a consistent state (consensus through the network).

Visualization of a blockchain fork event: two blocks propagate at the same time. Some nodes receive block “triangle” first and some receive block “upside-down triangle” first. Source
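To make the selection rule concrete, here is a minimal sketch of how a node might pick between two competing forks. The `Block` class and its `work` field are hypothetical names invented for this illustration, not Bitcoin's actual data structures; every block is assumed to carry the same amount of work, so "most work" and "longest" coincide.

```python
from dataclasses import dataclass

@dataclass
class Block:
    name: str
    work: int  # hypothetical measure of the effort spent on this block

def total_work(chain):
    """Sum the work of every block in a chain."""
    return sum(block.work for block in chain)

def choose_chain(candidates):
    """Pick the candidate chain with the most accumulated work."""
    return max(candidates, key=total_work)

# Two forks that share the same history but end in different blocks.
shared = [Block("genesis", 1), Block("b1", 1)]
fork_a = shared + [Block("triangle", 1)]
fork_b = shared + [Block("upside-down triangle", 1), Block("b3", 1)]

best = choose_chain([fork_a, fork_b])
print([block.name for block in best])
```

Here `fork_b` wins because one more block has been built on top of it. A node that saw "triangle" first would switch over once it hears about the longer fork.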

Therefore, if bad miners’ blocks do not eventually convince the rest of the network, all their effort is wasted, along with the resources they used to produce those blocks. And that’s one of the factors that discourage bad actors from cheating.

What kind of computational work do miners need to do in order to prove that they put some effort into the creation of a block?

To understand this let’s explain cryptographic hash functions first.

The first thing to note is that everything in the digital world boils down to bits (0s and 1s).

A cryptographic hash function is a mathematical function that takes an input of any size (the input message: a text, an image, a number) and outputs a value of fixed length, 256 bits in the case of SHA-256. This output is called the hash value, digest or simply the hash of the message.

Cryptographic hash functions have these properties:

  1. It is infeasible to modify a message without changing the hash value. If you change the input message even slightly, by as little as one bit, the resulting hash value changes completely!
  2. It is infeasible to find two different messages with the same hash value. The same input message (set of bits) will always give the same hash value (256 bits), and in practice two different messages (sets of bits) will give different hash values.
  3. It is infeasible to generate a message from its hash value. Having only the 256-bit hash value, you cannot go backwards and find the message used as an input.

Hashing comparison. Note that for readability hexadecimal notation is used for the resulting hash. One hex value represents 4 bits. 64 hex values times 4 bits = 256 bits. Source

Bitcoin uses the SHA-256 hash function (applied twice when hashing a block header).
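If you want to see these properties for yourself, the following small sketch uses Python's standard `hashlib` module. The helper name `sha256_hex` and the sample messages are my own choices for illustration.

```python
import hashlib

def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash of a text message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

h1 = sha256_hex("Hello")
h2 = sha256_hex("hello")  # only the first letter differs

print(h1)
print(h2)

# Each hex character encodes 4 bits, so 64 characters are 256 bits.
print(len(h1) * 4)

# The same input always gives the same hash.
print(sha256_hex("Hello") == h1)

# A tiny change in the input changes most positions of the output.
differing = sum(a != b for a, b in zip(h1, h2))
print(f"{differing} of {len(h1)} hex characters differ")
```

Running it shows two hashes that look unrelated, even though the inputs differ by a single character.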

How can such a function prove that the participants did spend a lot of computational power? In other words, how do we know that a miner actually did the work required for a specific block of transactions?

The rule is that miners have to find a number, called a nonce, that when added to the block of transactions and hashed together results in a 256-bit hash (called the block hash) that starts with a specific number of 0s. This number of leading 0s is defined by the difficulty level.

Remember, in the digital world everything is represented in bits:

Each block in the blockchain contains various information about the block (transactions, creation time, etc.). This information is represented as a set of bits as well. Those bits are unique for each block because each block has its own creation time and set of transactions.

The hard work for the miners is to find an extra set of bits (the nonce) that, when added to the existing block’s bits and hashed together using the SHA-256 hash function, results in a hash value with a specific number of leading 0s (defined by the difficulty level). The only way to produce a hash value matching a specific target is to try again and again, modifying the input (by trying a different nonce) until the desired hash value appears by chance.

Take time to read that again. That’s the cornerstone of Proof of Work.

Hashing numbers to find the nonce that satisfies the difficulty level. Source

This number (nonce) acts as proof that a certain amount of work was done to produce a resulting hash that matches a specific target. Thus the term Proof of Work.
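The search for a nonce can be sketched in a few lines. This is a simplified toy, not Bitcoin's real mining code: the block is a plain string, the nonce is simply appended to it, and the difficulty is counted in leading hexadecimal zeros (4 bits each). The function names `block_hash` and `mine` are hypothetical.

```python
import hashlib

def block_hash(block_data: str, nonce: int) -> str:
    """Hash the block's data together with a candidate nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()

def mine(block_data: str, difficulty: int):
    """Try nonces one by one until the hash starts with enough zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

nonce, digest = mine("Alice pays Bob 1 BTC", difficulty=4)
print(nonce, digest)
```

With four leading hex zeros this takes tens of thousands of attempts on average, which a laptop does in well under a second. Each extra zero multiplies the expected effort by 16.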

The nice thing about this method is that every participant can easily verify that the miner did a large amount of work, without having to repeat that effort. Having 1. the block’s data, 2. the nonce and 3. the resulting hash, anyone can quickly validate its correctness using the same function.

A block that satisfies the difficulty level is considered valid.

So, if a bad actor were to change a transaction (essentially some bits in the message), the block hash would be different and would most probably not start with the required number of 0s, invalidating the block.
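Verification, and the effect of tampering, can be sketched with the same toy model (block data as a string, nonce appended, difficulty counted in leading hex zeros). The names `is_valid` and `DIFFICULTY` are assumptions made for this example.

```python
import hashlib
from itertools import count

DIFFICULTY = 2  # hypothetical: required number of leading hex zeros

def block_hash(block_data: str, nonce: int) -> str:
    """Hash the block's data together with its nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()

def is_valid(block_data: str, nonce: int, claimed_hash: str) -> bool:
    """One hash computation is enough to check the miner's claim."""
    recomputed = block_hash(block_data, nonce)
    return recomputed == claimed_hash and recomputed.startswith("0" * DIFFICULTY)

# The miner does the expensive search...
data = "Alice pays Bob 1 BTC"
nonce = next(n for n in count() if block_hash(data, n).startswith("0" * DIFFICULTY))
digest = block_hash(data, nonce)

# ...while anyone else verifies it with a single hash.
print(is_valid(data, nonce, digest))                      # True
print(is_valid("Alice pays Bob 100 BTC", nonce, digest))  # False
```

The asymmetry is the point: finding the nonce takes many attempts, while checking it takes one.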

The block hash acts as a digital fingerprint or, let’s say, a block identifier. It identifies a block uniquely and can be independently derived by any node by simply hashing the block’s data (nonce, transactions, difficulty level, timestamp, etc.).

One thing I haven’t mentioned is that a bad miner could not just change any transaction. There is an extra bit of security, which I will discuss in Part 2 (transactions). But for now, take this for granted.

Having this extra bit of security, a bad miner could only change one of his own transactions.

So, in order to cheat the network, he would have to go to the specific block in the chain (the block where his transaction is saved), change the transaction in his favor and redo the work of finding a block hash that satisfies the block’s difficulty level. Such a block, once propagated to the rest of the network, would be considered totally valid.

Unless we add another bit of security.

All the blocks are connected with each other! By including the previous block’s hash in the data being hashed, and not just the transactions, nonce and timestamp, we get a tight relationship between all blocks.

Blocks’ contents. All transactions (Tx0, Tx1, Tx2, Tx3) can be represented with just one hash number, called the Merkle Root (Tx_Root in the diagram). Source

So, if you were to change any block, or try to swap the order of two blocks, it would change the block after it. Since a single change to a block changes its hash, it changes the next block’s hash, and so on, invalidating all blocks after it. That would require redoing all the work, finding a new special number for each of these blocks that makes their hashes start with the required number of 0s. This requires lots of work!
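Here is a rough sketch of that chaining, again as a toy model rather than Bitcoin's real block format. Blocks are plain dictionaries, and `DIFFICULTY`, `mine_block` and `first_invalid_block` are hypothetical names chosen for this illustration.

```python
import hashlib

DIFFICULTY = 3        # hypothetical: required number of leading hex zeros
GENESIS_PREV = "0" * 64

def block_hash(prev_hash, transactions, nonce):
    """The previous block's hash is part of the data being hashed."""
    payload = f"{prev_hash}|{transactions}|{nonce}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def mine_block(prev_hash, transactions):
    """Search for a nonce that satisfies the difficulty for this block."""
    nonce = 0
    while not block_hash(prev_hash, transactions, nonce).startswith("0" * DIFFICULTY):
        nonce += 1
    return {"prev_hash": prev_hash, "transactions": transactions,
            "nonce": nonce, "hash": block_hash(prev_hash, transactions, nonce)}

def first_invalid_block(chain):
    """Return the index of the first block that fails validation, or None."""
    prev_hash = GENESIS_PREV
    for index, block in enumerate(chain):
        recomputed = block_hash(block["prev_hash"], block["transactions"], block["nonce"])
        if (block["prev_hash"] != prev_hash
                or recomputed != block["hash"]
                or not recomputed.startswith("0" * DIFFICULTY)):
            return index
        prev_hash = block["hash"]
    return None

# Build a small, valid chain of three blocks.
chain, prev = [], GENESIS_PREV
for txs in ["Alice->Bob 1", "Bob->Carol 2", "Carol->Dave 3"]:
    block = mine_block(prev, txs)
    chain.append(block)
    prev = block["hash"]
print(first_invalid_block(chain))  # None

# A cheater rewrites the first block and even redoes its proof of work...
chain[0] = mine_block(GENESIS_PREV, "Alice->Bob 100")
# ...but the second block still points at the old hash.
print(first_invalid_block(chain))  # 1
```

To get the altered chain accepted, the cheater would have to re-mine the second and third blocks too, and keep up with every new block the honest network adds in the meantime.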

Side note:
At the time of writing, the hash rate of the network is around 65 quintillion hashes per second. A bad actor would need to control more than half of it to trick the network (known as a majority or 51% attack). An attacker with that much power can generate blocks faster than the rest of the network, so he can simply persevere with his private fork until it becomes longer than the branch built by the honest network. However, the network’s current hash rate makes this financially impractical (the machines and energy needed cost millions of dollars). And that’s just to change one of his own transactions. So it had better be worth it!

Difficulty level

Earlier I said that the PoW is to find a special number (nonce) so that the hash of the block starts with a specific number of 0s.

The more leading 0s, the harder it is for the miners to find a nonce that satisfies that hash. That’s because the range of acceptable resulting hashes becomes smaller.

But why make it hard for the miners in the first place?

The goal is to keep the average time to mine one block at about 10 minutes for the entire network.
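A quick back-of-the-envelope calculation shows how fast the range shrinks. Assuming every output bit of the hash is equally likely to be 0 or 1, the chance that the first k bits are all 0 is 1 in 2^k, so a miner needs about 2^k attempts on average. The function name below is my own.

```python
def expected_attempts(leading_zero_bits: int) -> int:
    """Average number of hashes needed to get this many leading zero bits."""
    return 2 ** leading_zero_bits

for bits in (8, 16, 32, 64):
    attempts = expected_attempts(bits)
    print(f"{bits} zero bits: about {attempts:,} attempts "
          f"(1 hash in {attempts:,} is acceptable)")
```

Every additional zero bit doubles the expected work, and every additional zero hex character multiplies it by 16.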

This is a design compromise between fast confirmation times (settlement of transactions) and the probability of a fork. A faster block time would make transactions clear faster but lead to more frequent blockchain forks, whereas a slower block time would decrease the number of forks but make settlement slower.

The difficulty parameter is adjusted dynamically, every 2016 blocks, to keep the block interval at the 10-minute target. Every 2016 blocks, the miners' average block creation time is calculated, and if it differs from 10 minutes, the difficulty parameter is adjusted proportionally. If, for example, it took the network 12.5 minutes on average to build a block (25% more than the 10-minute target), the difficulty will be lowered by 20% (multiplied by 10 / 12.5 = 0.8) for the next 2016 blocks. This adjustment tracks changes in the miners’ hashing power (essentially the hardware).
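The proportional adjustment can be sketched as follows. This is a simplification under my own assumptions: difficulty is treated as a plain number, and details of the real protocol, such as limits on how far a single adjustment may move, are left out. The constant and function names are hypothetical.

```python
TARGET_BLOCK_MINUTES = 10
BLOCKS_PER_PERIOD = 2016

def adjust_difficulty(old_difficulty: float, actual_minutes_for_period: float) -> float:
    """Scale the difficulty by how far the last period was from the target."""
    expected_minutes = TARGET_BLOCK_MINUTES * BLOCKS_PER_PERIOD
    return old_difficulty * expected_minutes / actual_minutes_for_period

# Blocks took 12.5 minutes on average: too slow, so difficulty drops by 20%.
slow = adjust_difficulty(100.0, 12.5 * BLOCKS_PER_PERIOD)
print(slow)  # 80.0

# Blocks took 8 minutes on average: too fast, so difficulty rises by 25%.
fast = adjust_difficulty(100.0, 8 * BLOCKS_PER_PERIOD)
print(fast)  # 125.0
```

The adjustment works in both directions: when miners add hashing power and blocks arrive too quickly, the difficulty goes up; when hashing power leaves, it comes back down.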

Side note:

Today, many miners coordinate in mining pools, competing together for the creation of a block. They split the work of searching for a solution and each earn a share of the rewards. This centralizes mining, which runs against the original idea, but it also has some advantages. In any case, just as current financial institutions are closely watched for insider trading, big mining pools are watched for misbehavior. Any misbehavior would make them lose credibility or, even worse, cost users their faith in Bitcoin, dumping the whole project into the abyss. This would put them out of business.

Today, mining pools act as a bank, keeping people’s funds secure. But, at the end of the day, most of them are centralized corporations! 😉 Peer-to-peer pools also exist, but they are a minority.

Interested in learning more? Read my article Part 2 — transactions for more knowledge.
