The concept of "blockchain" dates back to 1991, when Haber and Stornetta described a "cryptographically secured chain of blocks". It was popularized by Bitcoin, which provided the first real-world implementation of the concept in 2009, even though the Bitcoin white paper itself did not use the word "blockchain". This post outlines the main features of blockchains and briefly discusses some of their design options.
What is a blockchain?
While there is no official definition of what constitutes a blockchain, the following is an attempt to describe some of the main features of this technology.
+ A shared, unalterable and sequential database
A blockchain consists of a sequence of data "blocks", each one cryptographically linked to its predecessor. The data blocks can of course contain any type of data, but are usually said to contain "transactions."
Each block generally contains a quasi-timestamp, representing the time of its creation.
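To make the "cryptographic link" concrete, here is a minimal sketch in Python. It assumes SHA-256 over a JSON encoding of the block fields; the names `Block`, `append_block` and `is_linked` are made up for this illustration and do not come from any particular blockchain.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int
    timestamp: float      # quasi-timestamp chosen by the block's creator
    transactions: tuple   # any payload; here a tuple of strings
    prev_hash: str        # hash of the predecessor block

    def hash(self) -> str:
        # Hash every field, including prev_hash, so each block commits
        # to the entire history before it.
        payload = json.dumps(
            [self.index, self.timestamp, list(self.transactions), self.prev_hash]
        ).encode()
        return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions, timestamp=None):
    """Return a new chain with one more block linked to the current tip."""
    prev_hash = chain[-1].hash() if chain else "0" * 64
    ts = time.time() if timestamp is None else timestamp
    return chain + [Block(len(chain), ts, tuple(transactions), prev_hash)]


def is_linked(chain) -> bool:
    """True if every block's prev_hash matches its predecessor's hash."""
    return all(chain[i].prev_hash == chain[i - 1].hash() for i in range(1, len(chain)))


chain = []
chain = append_block(chain, ["genesis"], timestamp=0.0)
chain = append_block(chain, ["alice pays bob 5"], timestamp=600.0)
chain = append_block(chain, ["bob pays carol 2"], timestamp=1200.0)

# Rewriting an old block changes its hash, so the next block no longer points to it.
forged = Block(1, 600.0, ("alice pays bob 500",), chain[1].prev_hash)
tampered = [chain[0], forged, chain[2]]
print(is_linked(chain), is_linked(tampered))
```

The check only detects tampering. What makes rewriting the later blocks expensive is the consensus mechanism, which this sketch leaves out.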
The database is shared between multiple parties, who run "nodes" – which are the computer systems running specific blockchain software and communicating with each other. Usually, each node will be connected to a number of nearby nodes, with whom it constantly exchanges information about the state of the blockchain.
The value of each block appended to the database is determined via a "consensus mechanism", which ensures that all nodes agree on the latest state of the database. It also determines the frequency of appends, which ranges from seconds to minutes. Typically, the consensus mechanism determines which open transactions submitted to the system are valid and can therefore be written to the database.
The nodes that actively participate in the consensus mechanism are called "validators". In some blockchains, all nodes validate all transactions. In other blockchains, the validation of a transaction is done by a subset of nodes. For example, in Ripple or Stellar, each node decides which other nodes to trust, so the set of validators depends on the transaction itself.
Because of the size of the network and propagation issues, different nodes of the network may have a slightly different view of the most recent block or blocks. The consensus protocol ensures that these views converge quickly. In general, once consensus has been reached on a block, then the probability that this consensus will change converges to zero very rapidly as new blocks are added. That is what is meant by "unalterable."
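For Bitcoin-style proof of work, the Bitcoin white paper gives a simple way to quantify this convergence: if an attacker controls a fraction q of the computing power and the honest nodes control p = 1 - q, the probability that the attacker ever catches up from z blocks behind is (q/p)^z when q < p, and 1 otherwise. The sketch below just evaluates that expression; the function name is an assumption, and other consensus mechanisms would need a different model.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance that an attacker with power share q ever erases a z-block deficit."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a fraction between 0 and 1")
    if z < 0:
        raise ValueError("z must be non-negative")
    p = 1.0 - q
    if q >= p:
        return 1.0          # an attacker with half the power or more always catches up
    return (q / p) ** z     # shrinks geometrically with every extra block


for z in range(7):
    print(z, catch_up_probability(0.1, z))
```

With q = 0.1 the ratio q/p is 1/9, so each additional block divides the attacker's chances by nine.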
Note that unalterable does not mean irreversible. It simply means that to change a record (e.g. reverse a transaction), the record itself cannot be modified, but a modification (or reversal) must be added to the chain later.
Because of the cryptographic link between blocks, it is practically impossible, using today’s technology, to alter the value of a past block once consensus has been reached and the block is old enough. (In the future, however, quantum computers may be able to break some of the underlying cryptography.)
+ Resilience against network and hardware failures
Because of the distributed architecture, the system continues to function even if a fraction of nodes become unavailable. When nodes come online again, or are added to the network, the protocol ensures that they "catch up" rapidly with the current state of the blockchain.
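As a rough sketch of what "catching up" involves, the Python below lets a node that was offline fetch the blocks it is missing from a peer and check that each one extends its own copy. Blocks are reduced to `(prev_hash, data)` pairs, and `make_chain`, `tip_hash` and `catch_up` are hypothetical helper names; real protocols also handle forks, multiple peers and malicious responses.

```python
import hashlib

GENESIS_PREV = "0" * 64


def block_hash(prev_hash: str, data: str) -> str:
    return hashlib.sha256((prev_hash + "|" + data).encode()).hexdigest()


def make_chain(items):
    """Build a list of (prev_hash, data) blocks from a list of data strings."""
    chain, prev = [], GENESIS_PREV
    for data in items:
        chain.append((prev, data))
        prev = block_hash(prev, data)
    return chain


def tip_hash(chain) -> str:
    """Hash of the most recent block, or the genesis marker for an empty chain."""
    return block_hash(*chain[-1]) if chain else GENESIS_PREV


def catch_up(local, peer):
    """Return local extended with the peer's newer blocks, checking every link."""
    synced = list(local)
    for prev_hash, data in peer[len(local):]:
        if prev_hash != tip_hash(synced):
            raise ValueError("peer block does not extend the local chain")
        synced.append((prev_hash, data))
    return synced


peer = make_chain(["a", "b", "c", "d"])
local = peer[:2]                      # this node went offline after block "b"
local = catch_up(local, peer)
print(len(local), local == peer)
```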
+ Resilience against cheating
The system is resilient against cheating (wilful or not), because the consensus mechanism generally requires a majority of validator nodes to agree. So the only way to cheat the system would be for a majority of nodes to collude.
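The majority rule can be illustrated with a toy vote count. This is a deliberately simplified sketch, assuming one vote per validator and a strict-majority threshold; actual consensus mechanisms (proof of work, Byzantine agreement protocols and so on) are far more elaborate, and `majority_decision` is a name invented for this example.

```python
from collections import Counter


def majority_decision(votes):
    """Return the value backed by a strict majority of validators, else None.

    votes maps a validator name to the block (or state) it endorses.
    """
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count * 2 > len(votes) else None


honest_majority = {"v1": "block-A", "v2": "block-A", "v3": "block-A",
                   "v4": "block-X", "v5": "block-X"}
colluding_majority = {"v1": "block-A", "v2": "block-A",
                      "v3": "block-X", "v4": "block-X", "v5": "block-X"}

print(majority_decision(honest_majority))     # two cheaters are outvoted
print(majority_decision(colluding_majority))  # three colluders win the vote
```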
+ Limited attack surface
When all or part of the data in the blockchain is held by "accounts" that are identified by their public key, and the private key is needed to initiate a transaction, there is an additional level of security. In that case, even if the majority of nodes colluded in a malicious attack, these nodes would still not be able to initiate transactions for other users. (However, a successful attack would be able to reverse previously registered transactions.)
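The reason colluding nodes cannot initiate transactions for someone else is that anyone can verify a signature with the public key, but only the holder of the private key can produce it. The sketch below shows the idea with a Lamport one-time signature, chosen here only because it fits in a few lines of standard-library Python. It is a swapped-in teaching scheme: Bitcoin, for instance, uses ECDSA, and a Lamport key pair must never sign more than one message.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk


def message_bits(msg: bytes):
    digest = H(msg)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, msg: bytes):
    # Reveal one secret per bit of the message digest.
    return [sk[i][bit] for i, bit in enumerate(message_bits(msg))]


def verify(pk, msg: bytes, sig) -> bool:
    if len(sig) != 256:
        return False
    return all(H(s) == pk[i][bit]
               for i, (s, bit) in enumerate(zip(sig, message_bits(msg))))


sk, pk = keygen()
signature = sign(sk, b"alice pays bob 5")

print(verify(pk, b"alice pays bob 5", signature))    # signed by the key holder
print(verify(pk, b"alice pays bob 500", signature))  # altered by someone else
```

A validator that only knows `pk` can run `verify`, but it cannot run `sign`, which is why even a colluding majority cannot forge a user's transaction.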
The basic procedure
If we consider that every data item in the database is a transaction, then the basic procedure between blocks can be described as follows:
- take all "open" transactions from the last block
- collect all new transactions initiated by users
- run the consensus
- some transactions are confirmed
- some remain "open"
- some are rejected
- all of this goes into the new block
Note that this description would have to be expanded in the case of smart contract platforms.
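The steps above can be sketched as a single function. The validity rule (the sender must have enough funds) and the reason a transaction stays open (the block has a fixed capacity) are assumptions made for this example, as are the names `Tx` and `next_block`; an actual consensus mechanism decides these things in its own way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    sender: str
    receiver: str
    amount: int


def next_block(open_txs, new_txs, balances, capacity=2):
    """Classify pending transactions as confirmed, still open, or rejected."""
    balances = dict(balances)                  # work on a copy
    confirmed, still_open, rejected = [], [], []
    for tx in list(open_txs) + list(new_txs):  # carried-over transactions go first
        if tx.amount <= 0 or balances.get(tx.sender, 0) < tx.amount:
            rejected.append(tx)                # invalid under the toy rule
        elif len(confirmed) >= capacity:
            still_open.append(tx)              # valid, but this block is full
        else:
            balances[tx.sender] -= tx.amount
            balances[tx.receiver] = balances.get(tx.receiver, 0) + tx.amount
            confirmed.append(tx)
    block = {"confirmed": confirmed, "open": still_open, "rejected": rejected}
    return block, balances


carried_over = [Tx("alice", "bob", 4)]
submitted = [Tx("bob", "carol", 3), Tx("alice", "carol", 100), Tx("alice", "carol", 1)]
block, balances = next_block(carried_over, submitted, {"alice": 10})

for status, txs in block.items():
    print(status, txs)
```

The transactions listed under "open" are the ones the next round would pick up first.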
There are multiple design choices for blockchains. Initially, blockchain design was influenced by the success of the Bitcoin blockchain, but over time this influence has waned.
Bitcoin provides a practical solution where trustworthy validation can occur even if anyone can join, nodes are completely anonymous and we don't even know how many nodes there are in the system, as long as the majority of the nodes are honest. Or, to be more precise, the majority of CPU power. For that reason, there was initially an emphasis on systems with anonymous nodes. However, the ability for anyone to join can also become a problem. We see this with Bitcoin today: a majority of miners are in China, which is a potential political risk.
Blockchain technology is quickly evolving towards less node anonymity. In most real-world cases, the validators will know each other, and will in fact be bound by legal agreements. The technical validation procedure will just enforce the legal agreement.
In Ripple and Stellar, anyone can join to become a node, but because every node can choose whom it trusts, validation is mostly done by known nodes.
Another possibility is to provide incentives against cheating. For example, Ethereum has announced that in a future version, nodes will have to deposit funds in order to become validators.
In a (non-sharded) blockchain, all nodes necessarily see the contents of the entire database. Users may not want to show data, and protecting privacy using pseudonymous accounts as in Bitcoin or Ethereum is far from sufficient. This has been a major barrier for using blockchain in situations where users do not completely trust each other.
With new technologies, such as zk-SNARKs, users will be able to define precisely who sees which part of the data. Taking a stock trade as an example, the public will see only the securities code, volume and price, the stock exchange will also see who the buyer and seller are, while the complete trade data will be seen by the two parties to the transaction and maybe a regulator. This kind of selective data disclosure will make blockchain acceptable in many new settings.
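The disclosure policy in the stock trade example can be written down as a simple per-role filter. Note what this sketch is not: it is a plain access filter over a record that some system already holds in full, not a zero-knowledge proof, and it says nothing about how zk-SNARKs achieve the same effect cryptographically. The field names and the `view` function are hypothetical.

```python
# Hypothetical trade record; every field name here is an assumption.
TRADE = {
    "security": "XYZ",
    "volume": 100,
    "price": 12.5,
    "buyer": "alice",
    "seller": "bob",
    "settlement_instructions": "deliver versus payment",
}

PUBLIC_FIELDS = {"security", "volume", "price"}
EXCHANGE_FIELDS = PUBLIC_FIELDS | {"buyer", "seller"}


def view(trade, role):
    """Return the part of the trade that the given role is allowed to see."""
    if role in ("party", "regulator"):
        return dict(trade)                     # complete trade data
    if role == "exchange":
        allowed = EXCHANGE_FIELDS
    elif role == "public":
        allowed = PUBLIC_FIELDS
    else:
        raise ValueError(f"unknown role: {role}")
    return {key: value for key, value in trade.items() if key in allowed}


for role in ("public", "exchange", "regulator"):
    print(role, sorted(view(TRADE, role)))
```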
Ledger vs Journal
Blockchains are often called "distributed ledgers". In accounting, a "ledger" generally refers to a record of account balances, while a “journal” is a record of transactions. If we look at how accounting was usually done before computers, transactions were entered into journals on an ongoing basis, then from time to time (maybe at the end of the day or the week), the ledgers were updated.
It seems quite obvious that only recording transactions, without account balances, is not very efficient, but this is precisely what Bitcoin is doing, although there is pruning to reduce database size. In that sense, one could refer to the Bitcoin blockchain as a “distributed journal”.
Ethereum's approach seems more reasonable: it could be described as a “distributed ledger cum journal”, with the ledger updated in real time.
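The difference between the two approaches can be sketched in a few lines. In the journal-only version, a balance has to be recomputed by replaying every transaction. In the combined version, balances are updated each time a transaction is recorded. This is a simplified account model with invented names (`balances_from_journal`, `LedgerWithJournal`); Bitcoin's actual data model, based on unspent transaction outputs, is more involved.

```python
from collections import defaultdict


def balances_from_journal(journal):
    """Journal only: replay every (sender, receiver, amount) entry."""
    balances = defaultdict(int)
    for sender, receiver, amount in journal:
        balances[sender] -= amount
        balances[receiver] += amount
    return dict(balances)


class LedgerWithJournal:
    """Journal plus ledger: balances are kept current as entries are recorded."""

    def __init__(self, opening_balances):
        self.journal = []
        self.ledger = dict(opening_balances)

    def record(self, sender, receiver, amount):
        self.journal.append((sender, receiver, amount))
        self.ledger[sender] = self.ledger.get(sender, 0) - amount
        self.ledger[receiver] = self.ledger.get(receiver, 0) + amount


# Journal only: the cost of a balance lookup grows with the length of the history.
journal = [("issuer", "alice", 10), ("alice", "bob", 4)]
print(balances_from_journal(journal))

# Ledger cum journal: a balance lookup is a single dictionary access.
book = LedgerWithJournal({"alice": 10})
book.record("alice", "bob", 4)
print(book.ledger["alice"], book.ledger["bob"])
```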
Types of nodes
Beginning with Bitcoin, early blockchains started as peer-to-peer (p2p) systems in which all nodes were equivalent. The only difference was that some would be "mining", others not. As blockchains grow, it becomes increasingly difficult to run Bitcoin or Ethereum on a normal PC: it takes time to download the chain and processor power to run it.
Therefore, the need for different types of nodes is becoming clear, for example:
- Client/light nodes, who occasionally connect to the network (typically to just a few “trusted” nodes)
- Passive nodes, who are permanently connected to the network but do not do validation
- Validating nodes
- Archiving nodes
For example, in Ethereum, there are already some nodes validating but not archiving the full history, while others do archive.
Smart contracts are a way of automating certain actions on the blockchain, based on transactions occurring on the blockchain itself, or on data received from so-called "Oracles" – interfaces to data from outside the blockchain. They are simply programs executing on the blockchain, their inputs and outputs being data available on the blockchain.
These programs can be configured so that they cannot be modified. For example, consider a bet about the level of a certain stock on a certain date. The parties to the bet deposit some amount of the blockchain's own coin into an escrow account. Based on the Oracle's price feed at the agreed-on date and time, the program will release the funds to the winner. No one will be able to modify the program, or take or release the funds before the appointed date. In fact, if the program contains a bug, the funds may remain blocked forever. (Of course, if the funds were not in the blockchain's own coin, the escrow account would only hold a pointer to some outside account, in which case the funds could in fact be stolen or otherwise transferred, or the bank holding the funds could fail.)
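The logic of such a bet can be simulated in ordinary Python. This is only a model of the contract's rules, with hypothetical names (`StockBet`, `deposit`, `settle`) and a price passed in by hand instead of a real Oracle feed. Nothing in Python prevents the object from being modified, whereas on a blockchain the immutability would be enforced by the platform.

```python
class StockBet:
    """Two parties bet on whether a stock closes above a strike price."""

    def __init__(self, party_over, party_under, stake, strike, settle_time):
        self.party_over = party_over      # wins if the price ends above the strike
        self.party_under = party_under    # wins otherwise
        self.stake = stake
        self.strike = strike
        self.settle_time = settle_time
        self.escrow = 0
        self.deposited = set()
        self.settled = False

    def deposit(self, party):
        if party not in (self.party_over, self.party_under):
            raise ValueError("not a party to this bet")
        if party in self.deposited:
            raise ValueError("stake already deposited")
        self.deposited.add(party)
        self.escrow += self.stake

    def settle(self, now, oracle_price):
        """Release the escrow to the winner, but never before the agreed time."""
        if self.settled:
            raise ValueError("bet already settled")
        if now < self.settle_time:
            raise ValueError("too early to settle")
        if len(self.deposited) < 2:
            raise ValueError("both parties must deposit first")
        winner = self.party_over if oracle_price > self.strike else self.party_under
        payout, self.escrow = self.escrow, 0
        self.settled = True
        return winner, payout


bet = StockBet("alice", "bob", stake=5, strike=100.0, settle_time=1000)
bet.deposit("alice")
bet.deposit("bob")
print(bet.settle(now=1000, oracle_price=120.0))
```

A bug in `settle`, for example a condition that can never become true, would leave the escrow locked, which is the failure mode described above.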
The first blockchain using smart contracts was Ethereum. As of late 2016, it is not yet considered fit for business-critical applications. There are some inconsistencies in the Solidity compiler which may provide openings for attacks. But the main issue is the level of testing and validation required before a smart contract, which may have the ability to transfer significant assets without recourse, is set loose on a blockchain. Most programmers are not used to such rigorous levels of testing and validation, and best practices are still being developed.
Another significant issue with Ethereum smart contracts is that they are visible to all, which will not be acceptable to many business users. It also means the virtual collapse of any fee model. If a useful program is deployed, with a built-in fee to pay the developer, another developer (anonymous if necessary) can just take the program and set a lower fee, or remove it entirely.
A solution where the source code, or at least parts of it, can only be seen by certain parties seems almost mandatory. Auditors will then be needed to certify that the hidden code works as designed. An alternative is to do the processing off-chain, with only certain critical actions that require public scrutiny done on-chain.
The question of coins occurs in the context of public or semi-public blockchains. While there must of course be an incentive to participate in the validation, having a blockchain-specific coin is but one of several options.
In any event, it does not appear to make any sense to link the remuneration of the validators to the equity value of the blockchain itself, as is currently the case in Bitcoin and Ethereum. This is another influence of Bitcoin's design that will fade over time.
It is true that if a "coin" goes up sharply in value, validators will likely reduce fees because of competitive pressure. But this may not be enough of an argument for potential business users.
That said, it might be a good idea to have an internal currency in which validators get paid, because that gives the system some flexibility. What is meant by this is that if the costs of computing and storage go down, the value of an internal "validation currency" could be adjusted downwards over time. On the other hand, it could also be adjusted upwards to attract more validators if necessary.
In general, the discussion about coins should be replaced by a discussion about the business model and the interaction between the various parties:
- blockchain owners (all kinds of possible models)
- blockchain validators (generally need to get paid on an ongoing basis, unless they are also the blockchain owners)
- blockchain users (generally pay-as-you-go)
When to use blockchains
The two questions to ask are the following:
1) is there a critical reliance on a database that holds certain information that is key to the project, and on which all users must be in agreement at all times (a typical example being financial information)?
2) is there reluctance to entrust one single entity with the management of that database?
If the answers to these questions are "yes", then the use of blockchain / distributed ledger technology may make sense.