Saturday, 1 June 2019

All About Blockchain

I remember back in the mid-nineties there was this huge fad sweeping the world. People could see that it was going to be the future, and entrepreneurs were making bucket loads of money off ideas that in reality had no chance whatsoever of actually making any money. Then, in 1998 the entire house of cards came crashing down. No, I'm not talking about the Asian tiger but rather the growing pains of the internet. Mind you, this was back in the days of dial-up modems, and while security algorithms existed they were nowhere near the scale that they are today.

Well, it seems as if the bust was centuries ago, and we now live in an age where we are wandering around with computers in our pockets that are more powerful than those that sent the original space shuttle into space (and even then the space shuttle is starting to become a distant memory, as is the fall of the Berlin Wall). So, the reason that I am talking about the past is because it seems that we are seeing another technology that has the potential to change the world in which we live, and that is digital currency such as bitcoin.

In fact we have even seen what some could consider to be a bitcoin crash, if this chart is anything to go by:

I guess anybody who bought into bitcoin at the beginning of 2018 probably hasn't had all that much of a return on their investment. The thing is that fads like this happen quite often: everybody pours their money into the next big thing, only to discover that the next big thing is nowhere near as great as they anticipated - just ask the Dutch about the tulips.

Anyway, there is more, in fact a lot more, to the concept of blockchain than simply cryptocurrency, though cryptocurrency is basically the thing that pretty much everybody has heard about, and thinks about, whenever blockchain is mentioned. So, while I might touch upon cryptocurrencies, here I'll be looking more at the concept behind them as opposed to the product itself.

So, What is a Blockchain?

So, a blockchain is a distributed ledger that is shared across a network. Each block in the chain carries a list of transactions, and a hash of the previous block, which in turn depends on every block that came before it. The exception to this is the first block in the chain, known as the genesis block, which has no previous block. The way my lecturer explained it is that it is like a spreadsheet, of which there are copies on every single computer in a network. You can add rows to this spreadsheet, but you cannot change any of the existing rows. The reason that blockchain has worked so well with cryptocurrency is that it solves the double spending problem, that is, the problem where you use one bitcoin to purchase something, and then turn around and use that exact same bitcoin to purchase something else.
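To make the "spreadsheet you can only add rows to" idea a bit more concrete, here is a minimal sketch in Python. The block layout (`index`, `transactions`, `prev_hash`) and the helper names `block_hash` and `add_block` are my own assumptions for illustration, not how bitcoin or any particular blockchain actually lays out its blocks.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents with SHA-256 (hypothetical block layout)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, transactions):
    """Append a new block that stores the hash of the block before it."""
    # The genesis block has no predecessor, so a placeholder of zeros is used.
    prev = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "transactions": transactions, "prev_hash": prev}
    chain.append(block)
    return block


chain = []
add_block(chain, ["genesis"])
add_block(chain, ["alice pays bob 1"])
add_block(chain, ["bob pays carol 1"])

for block in chain:
    print(block["index"], block["prev_hash"][:16], block["transactions"])
```

Because each block stores the hash of the one before it, the last block indirectly depends on every earlier block, which is what makes the existing "rows" hard to change quietly.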

So, because the blockchain is replicated across a network, it is very difficult to corrupt. As such, this means that you can use it in what is known as a 'trustless network'. Basically it can be used even though you do not necessarily trust the source. In being able to do that you no longer need a central authority, such as a central bank in the case of currency, to regulate the chain. In a way the blockchain regulates itself by being able to authenticate itself.

Blockchain in Action

Okay, we already know about cryptocurrencies, but they can be a little confusing, and rather complex, so let us look at another use for blockchain - logistics. Say we have a company that produces medicine, and that the medicine needs to be kept at, or below, a certain temperature from production to its final destination at the hospital. Previously we would have had to simply trust the paperwork, but blockchain actually solves that problem for us.

Basically, at every point in the production line a block is produced, and the temperature is recorded at that point. When the next stage of production commences, a new block is generated, and a hash of the earlier blocks is added to this new block. So, we have production, packaging, transportation, and finally retail. Now, just remember that all of this information is being recorded by a computer, so you don't have any dodgy truck drivers or warehouse managers fiddling the books, and when the medicine finally arrives at the hospital, the doctors can make sure that the medicine is okay by simply looking at the blocks. If the temperature exceeded the maximum at any point in the process, this will be known.

So, what if the truck driver decided to go back and change the block after it was generated? Well, remember what we said about the hashes in a previous post? Well, these hashes are added to the blocks, so if anybody changes a previous block then this will be noticed, since the hash of the changed block will no longer match the hash that was automatically recorded in the block that follows it.
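Here is a rough sketch of how that check could look for our medicine example. The stage names, the temperatures, and the functions `build_chain` and `first_broken_link` are all made up for illustration; a real system would also sign the blocks and replicate them across the network.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 over the block's contents (hypothetical block layout)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(readings):
    """Create one block per stage, each storing the previous block's hash."""
    chain = []
    prev = "0" * 64  # placeholder for the first block
    for stage, temp_c in readings:
        block = {"stage": stage, "temp_c": temp_c, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain):
    """Return the index of the first block whose stored hash no longer
    matches the block before it, or None if every link is intact."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(
    [("production", 4.0), ("packaging", 4.5), ("transport", 9.0), ("retail", 5.0)]
)
print(first_broken_link(chain))  # None, nothing has been touched

chain[2]["temp_c"] = 5.0  # the truck driver quietly edits the transport block
print(first_broken_link(chain))  # 3, the retail block no longer matches
```

Note that this check on its own cannot catch an edit to the very last block, since nothing follows it. That is one reason the copies held elsewhere on the network matter.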

In fact, the temperature is just one aspect of this process, because you can also throw in a time frame as well. The time at which the medicine arrives at each stage is also recorded, so that if they took too long to put it through one of the stages, this will also be noted in the final block, and the medicine can be rejected.
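The checking side of this is simple enough to sketch. The limits `MAX_TEMP_C` and `MAX_HOURS`, the record fields, and the `audit` function below are hypothetical values and names chosen just for this example.

```python
MAX_TEMP_C = 8.0  # hypothetical maximum safe temperature
MAX_HOURS = 24    # hypothetical maximum time allowed in any one stage

# One record per stage, as it might be read back out of the blocks.
records = [
    {"stage": "production", "temp_c": 4.0, "hours": 6},
    {"stage": "packaging", "temp_c": 4.5, "hours": 3},
    {"stage": "transport", "temp_c": 9.0, "hours": 30},
    {"stage": "retail", "temp_c": 5.0, "hours": 12},
]


def audit(records, max_temp_c, max_hours):
    """Return a list of (stage, problem) pairs for every limit that was broken."""
    problems = []
    for record in records:
        if record["temp_c"] > max_temp_c:
            problems.append((record["stage"], "temperature"))
        if record["hours"] > max_hours:
            problems.append((record["stage"], "time"))
    return problems


for stage, problem in audit(records, MAX_TEMP_C, MAX_HOURS):
    print(f"{stage}: {problem} limit exceeded")
```

The point is that the hospital does not need to ask anybody anything: the stage where the problem occurred falls straight out of the recorded data.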

You see, with the traditional method, simply by going back down the chain and asking people you will basically get plausible deniability. The reality is that people, over time, learn that the best way to retain their job is to basically deny everything - in fact that is one of the key aspects of our legal system - denial. What blockchain does is that it allows us to trace where the problem arose, and to then deal with the problem at that point as opposed to attempting to dig up the truth from a bunch of people who are basically being very uncooperative.

You might also note that the hash also contains a private ID. That means that, once again, only the section in which the block was generated can actually create, or change, the block. Without the private ID, it is not possible to access that particular block, or that particular part of the system.

So, here is the blockchain that has basically been produced through our hypothetical logistics chain:

Now, what happens if the temperature during the process went over the maximum value? Well, the auditor, who has full access to all of the blocks, and the private keys to the blocks, can then review each of them to see where the problem occurred. The auditor also knows all of the instructions, and as such is the only one that is able to reproduce the hashes. So, because the auditor can reproduce the hashes, the auditor can not only see that something is wrong, but also where it went wrong. For instance, consider below:

So, from looking at the blocks, the auditor can see that something went wrong with the shipment, and can then approach the shipper and request that they produce their records. However, can the shipper change their records? Well, no, because not only do they not have access to the private key, but they also do not have access to the retailer's system. In fact, changing their block will produce a discrepancy in not only their block, but the retailer's block as well, which will once again be picked up by the auditor.

In fact there are many uses for blockchain, including land registry and health care. The thing is that they all run on the same principle, in that all of the previous details are encoded into the latest details. Further, since the chain is stored on a distributed system, one copy can be compared with all of the other copies as a way to prevent fraud. If you change one of the recorded blocks, this will be noticed when it is compared with the copies of that block that have been stored elsewhere on the network.

Now, bitcoin works similarly, but there is something else that needs to be considered, if we take into account the diagram below:

So, what is that Merkle Root? Well, I'm glad you asked.

Merkle Trees

Isn't it interesting that a lot of the stuff that I learnt last semester seems to be rearing its ugly head this semester, particularly the stuff from Discrete Maths. The reason I raise this is because of the concept known as the tree. Basically trees are connected graphs that contain no cycles. Now, I'm not going to go into huge detail here, but I can refer you elsewhere, such as the Wikipedia article on trees in graph theory. All we need to know here is that a Merkle Tree is basically one of those trees.

So, a Merkle Tree is a hash-based data structure where every leaf node is the hash of a piece of data, and every non-leaf node is a hash of its children. In particular, it is a generalisation of a hash list, and in most cases it is implemented as a binary tree. Take the diagram below for instance:

So, each of the leaf nodes is hashed (say that each of the leaves is a transaction that is to be recorded). Then each pair of leaf hashes is hashed together, and the resulting hashes are then paired off and hashed together, and this continues until you reach the top hash, which depends on all of the nodes below it. This top node is a single hash summarising all of the transactions that have taken place since the last block. This is also known as the root hash.
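The pairing-off process can be sketched in a few lines of Python. This is a simplified version under a couple of assumptions of my own: it uses a single round of SHA-256 for every node, and when a level has an odd number of nodes it duplicates the last one so that everything can be paired. The function names `h` and `merkle_root` are just for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of some bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute the root hash for a list of leaves (each leaf is bytes)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]  # hash every leaf first
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the odd one out
        # Hash each neighbouring pair together to build the next level up.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


transactions = [b"a", b"b", b"c", b"d"]
print(merkle_root(transactions).hex())

# Changing any one leaf produces a completely different root.
print(merkle_root([b"a", b"b", b"X", b"d"]).hex())
```
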

Merkle Trees are used in distributed systems where data verification is paramount. Using hashes instead of full files means that there is a saving when it comes to space. However, because they exist on a distributed system, if a block is changed in one place, then it must be changed everywhere for the change to be valid.

Now, there are two great features with regards to the Merkle Tree. The first is obvious, if we have been following along with regards to the hashes. Namely, if one of the leaves is changed then that change will be reflected all the way up to the root hash. The other aspect is not so obvious - basically if you wish to confirm that something is in the Merkle Tree, you do not need to download the entire tree - you only need to download a part of it. Namely, you only need the block header, and then the hashes along the path to the transaction.

So, considering the above, if we want to validate leaf c, all we need is node H(d) and node H(ab). From leaf c, we can recreate node H(c), and then with node H(d) we can recreate node H(cd). Since we also have node H(ab) we can then recreate the root node H(abcd). As such, instead of downloading the entire tree, we only need to download two nodes, and of course the root node for comparison. This is known as 'proof of existence'.
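The walk from leaf c up to the root can be written out directly. In this sketch the proof is a list of (sibling hash, side) pairs, which is a format I have assumed for the example, as are the names `h` and `verify_proof` and the use of plain SHA-256.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of some bytes."""
    return hashlib.sha256(data).digest()


def verify_proof(leaf, proof, root):
    """Rebuild the path from a leaf to the root.

    proof is a list of (sibling_hash, side) pairs, where side says whether
    the sibling sits to the 'left' or the 'right' of the node built so far.
    """
    current = h(leaf)
    for sibling, side in proof:
        if side == "left":
            current = h(sibling + current)
        else:
            current = h(current + sibling)
    return current == root


# Build the four-leaf tree from the example by hand.
h_a, h_b, h_c, h_d = (h(x) for x in (b"a", b"b", b"c", b"d"))
h_ab = h(h_a + h_b)
h_cd = h(h_c + h_d)
root = h(h_ab + h_cd)

# To prove c is in the tree we only need H(d) and H(ab), plus the root.
proof_for_c = [(h_d, "right"), (h_ab, "left")]
print(verify_proof(b"c", proof_for_c, root))  # True
print(verify_proof(b"x", proof_for_c, root))  # False
```

With four leaves the saving is small, but the number of hashes needed only grows with the height of the tree, not with the number of transactions in it.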

Another example of where a Merkle Tree is used is in a system known as IPFS, or the InterPlanetary File System. Basically IPFS is a distributed file sharing system that provides a distributed hash table as a form of catalogue for the files on the system. In a way it is similar to the world wide web, except that it is more like a BitTorrent swarm. In fact, the distributed nature of the network means that it is much more resistant to attacks such as DDoS. When a file is added to the network it is referenced through the use of a hash, and it is through this hash that the file can be accessed.
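The idea of referencing a file by its hash can be shown with a toy, single-machine store. To be clear, this is not the IPFS API: the `ContentStore` class is something I have made up, and it uses a plain SHA-256 hex string as the address, whereas IPFS has its own identifier format and spreads the data across many machines.

```python
import hashlib


class ContentStore:
    """A toy content-addressed store: the key for a file is its own hash."""

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        """Store the data and return the hash that now acts as its address."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch data by address, checking that it still matches its hash."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.add(b"hello blockchain")
print(address)
print(store.get(address))
```

A nice side effect is that whoever fetches the file can check for themselves that they got the right thing, simply by hashing it and comparing the result with the address they asked for.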

Proof of Work

Now, it is time to start looking at what is known as a consensus mechanism, and the one that bitcoin uses is known as proof of work. Basically a consensus mechanism makes sure that all of the members of a distributed network agree on the same version of the ledger, and keeps the network going even if some of the members begin to fail. Proof of work is basically a system where miners, through trial and error, compete to solve a puzzle that allows their block of transactions to be accepted as valid by the rest of the network. Due to the computational power that is required to perform the proof of work, miners are rewarded with bitcoin when their block has been accepted.

Basically, proof of work involves searching for a nonce (a number used only once) that, when combined with the contents of the block, including the previous hash, will produce a hash that begins with a specific number of 0 bits:

Confused? Well, let's go through an example. Say we have a value 1034 and we want to find a hash that begins with 4 zeros. So, we append the number one to our number, producing 10341. We then hash it producing:


This is not the hash we want, so we add a 2 instead, producing 10342:


Well, that isn't what we wanted, so let us try a 3:


It seems like I could be going on for a very long time if I continue to do it one by one, so let us jump ahead and add 599 to produce 1034599:


There we go, after 599 hashes we finally have the hash that we were looking for. So, the nonce that we were looking for is 599. This is how it works with bitcoin, and the difficulty is adjusted regularly so that, on average, a new block is found about every 10 minutes. It is the search for the nonce that is computationally expensive.
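If you want to try the search yourself, here is a small Python sketch of it. It is only a toy under assumptions of my own: it uses SHA-256, it counts leading zeros as hexadecimal digits of the hash (each hex zero is four 0 bits), and the function name `find_nonce` and the `max_tries` cut-off are made up for the example. It makes no claim about which nonce turns up for any particular input, so run it and see.

```python
import hashlib


def find_nonce(data: str, zeros: int, max_tries: int = 10_000_000):
    """Try nonces 1, 2, 3, ... until the hash of data + nonce starts with
    the requested number of hexadecimal zeros. Returns (nonce, digest),
    or None if nothing is found within max_tries attempts."""
    prefix = "0" * zeros
    for nonce in range(1, max_tries + 1):
        candidate = f"{data}{nonce}".encode("utf-8")  # e.g. "1034" + "1" -> "10341"
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    return None


result = find_nonce("1034", 3)
if result is not None:
    nonce, digest = result
    print("nonce:", nonce)
    print("hash: ", digest)
```

Each extra hex zero makes the search about sixteen times longer on average, which is why asking for more zeros is an easy way to make the puzzle harder, while checking a proposed nonce still takes only a single hash.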

Just for another example, here is a list of SHA-256 hashes that begin with three zeroes:

Well, this was a pretty long post, but it is a pretty detailed topic of which we have only scratched the surface. The thing is, that as with all new technologies, people seem to want to use it anywhere and everywhere. The reality is though that while blockchain may have a lot of useful characteristics, there may be a lot of situations where it may not be useful at all.
Creative Commons License

All About Blockchain by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images is for illustrative purposes only and is not intended to assert ownership. If you wish to use this work commercially please feel free to contact me.
