Blockchain Technology: Architecture, Security Mechanism, and Blockchain Transactions
Blockchain has been a buzzword for some time, and blockchain technology has been gaining popularity by the day. However, the real potential of this technology is yet to be tapped.
For any emerging technology or concept, initial acceptance and adoption are always slow. However, once the momentum picks up, there is no stopping it. We all remember the early 2000s, when cloud computing was in its infancy. Cloud computing was a buzzword, but companies were apprehensive about adopting it because of security and other concerns. By 2023, cloud computing had become the norm. That day should not be far off for blockchain technology.
To tap into the full potential of an emerging technology like blockchain, we need to understand how it works, its architecture, its security aspects, and its advantages and disadvantages.
In this article, we will understand the need for blockchain, categorize types of blockchain networks and cryptocurrencies, and continue the discussion to understand the key technologies behind the blockchain network.
What is Blockchain?
Blockchain technology creates a peer-to-peer distributed ledger that tracks the transactions in a network without any central control. A blockchain combines decentralized and distributed databases that hold a registry of transactions shared among peers, the fellow participants in the network. Therefore, we can define a blockchain as a globally decentralized and distributed ledger, built on a distributed architecture that no centralized entity controls. As the name implies, a blockchain is a chain of blocks.
As a data structure, each block of the blockchain contains transaction records, which are accessible to all computers in the network or the “chain.”
- The registry includes a long list of transactions, and new transactions continuously update this registry.
- Every block starts with its first transaction, and the subsequent transactions are grouped into it. This process continues till the block achieves its predefined block size—for example, 1 MB in the case of Bitcoin.
- Once one block achieves its size limit, the next set of transactions forms another block. This new block links to the previously formed block.
- Over time, we get a series of blocks where each block is connected to another block created just before. Thus, we call this chain of blocks the Blockchain.
- The Blockchain has complete information about addresses and balances from the genesis block (the block holding the very first transactions ever executed) to the most recently completed block.
Why Blockchain?
Let’s check the reasons behind the idea of blockchain networks for executing transactions.
In the traditional system, a centralized trusted party acts as the intermediary between two known or unknown parties, even though it is not itself a party to the transaction. For example, a bank is a centralized party that executes the financial transaction between two parties.
In such a centralized system, the central party maintains all the data and controls all operations. The other participants cannot see how it does so, and this lack of transparency can raise security concerns.
The above concerns about the lack of transparency and security were the foundation of Blockchain networks. The revolutionary blockchain technology can potentially disrupt existing business frameworks and provide transparency in transactions among different entities.
What is a Block in a Blockchain?
Blocks are the fundamental components of the blockchain. Each block has a block header and a block body. The block body holds the transactions that are part of the created or proposed block, organized in the form of a Merkle tree.
Structure of the Block
The overall structure of a Bitcoin block always includes the following elements:
- Magic number: This is a 4-byte field that always contains the value 0xD9B4BEF9 (for Bitcoin blockchain), indicating the start of the block.
- Block size: This 4-byte field gives the size, in bytes, of the block data that follows. For example, the Bitcoin block size is limited to 1 MB.
- Block header: This 80-byte field consists of six components: version, previous hash pointer, Merkle root, timestamp, bits, and nonce.
- Transaction counter: This field can range from one to nine bytes, indicating the number of transactions in the block.
- Transactions: This variable-size field contains the list of all transactions in the block; the number of transactions is bounded by the 1 MB block size.
Block Header of Blockchain’s Block
The Block header contains 80 bytes of cryptographically verifiable information. The header of Blockchain’s block has the following components:
- Version: This 4-byte field indicates the version number of the Bitcoin protocol used in the network; early blocks contain the value 1.
- Previous hash pointer (hashPrevBlock): This 32-byte field contains the 256-bit hash of the previous block’s header. The first block of the blockchain, the genesis block, has no parent, so its field is all zeros.
- Merkle root (hashMerkleRoot): This 32-byte field aggregates the hash values of all transactions in the block into a single 256-bit hash value.
- Timestamp (Time): This 4-byte field contains the timestamp of the current block, which helps arrange the blocks chronologically.
- Nonce: This 4-byte field contains a 32-bit value that miners alter repeatedly, by trial and error, until the hash of the block header meets the required difficulty level.
- Bits: This 4-byte field contains the difficulty target of the current Bitcoin block in a compact form, which determines how difficult a valid hash will be to find.
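As a rough illustration of the header layout listed above, the following Python sketch packs the six fields into 80 bytes and hashes the result. The field values are hypothetical placeholders, and the helper names (`serialize_header`, `block_hash`, `bits_to_target`) are made up for this example; the sketch assumes the little-endian integer encoding and double SHA-256 hashing that Bitcoin uses.

```python
import hashlib
import struct


def serialize_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    """Pack the six header fields into 80 bytes (little-endian integers)."""
    # 4 + 32 + 32 + 4 + 4 + 4 = 80 bytes
    return struct.pack("<L32s32sLLL", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


def block_hash(header):
    """Double SHA-256 of the header, in the reversed byte order explorers display."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


def bits_to_target(bits):
    """Expand the compact 'bits' field into the full integer target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


# Hypothetical field values, chosen only for illustration.
header = serialize_header(
    version=1,
    prev_hash=bytes(32),  # all zeros, as for a block with no parent
    merkle_root=hashlib.sha256(b"example transactions").digest(),
    timestamp=1_700_000_000,
    bits=0x1D00FFFF,
    nonce=0,
)
print(len(header))  # 80
print(block_hash(header))
print(hex(bits_to_target(0x1D00FFFF)))
```

A header counts as valid proof of work only when its hash, read as a number, is at or below the target that `bits_to_target` returns.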
If you wish to see an actual block in the Bitcoin network, head over to https://www.blockchain.com/explorer, and you can see the actual blocks mined in the network.
What are Nodes in a Blockchain?
While blocks are logical entities in a blockchain, nodes are physical electronic devices with IP addresses. The nodes of a blockchain are essentially computers that store and maintain the transaction history of the blockchain network. In general networking usage, other physical devices with IP addresses, such as routers, modems, switches, hubs, and printers, also count as network nodes.
Blockchain Architecture
There are four fundamental components of blockchain architecture. These are as follows:
- Decentralized and Distributed Database
- Network Layer
- Consensus Layer
- Application Layer
Decentralized and Distributed Databases
Decentralized and distributed databases are the core of blockchain technology. Before diving into the deeper technological aspects, security, and usage, let’s look at the two types of database systems that a blockchain builds on.
- Decentralized
- Distributed
Decentralized Databases – the Core of Blockchain Technology
A decentralized database is spread over multiple locations and devices. Therefore, there is no single point for overall decision-making in a decentralized database, which is what a blockchain is all about. Instead, every node in the system makes its own decision, and the system behavior is the sum of those responses. Also, depending on the architecture, a single node may or may not have complete information about the system.
Distributed Database for Blockchain’s Distributed Architecture
A distributed database extends the idea of a decentralized database. Distributed databases are best described as systems where data processing is shared across all the nodes. However, decisions for the system may still be made centrally, based on the state of the complete system.
Network Layer
The network layer is a bridge between the nodes of a blockchain network. It facilitates the communication between nodes on the network to send and receive data across the network. A blockchain’s network layer uses protocols such as TCP/IP, HTTP, and WebSockets.
Besides the decentralized and distributed database, the blockchain architecture has three layers: the network, consensus, and application layers. The network layer is responsible for communication between nodes on the network. The consensus layer ensures that all nodes on the network agree on the state of the blockchain. The application layer is where the actual blockchain applications are developed and deployed.
Consensus Layer
Apart from exchanging data, the nodes must reach consensus. The consensus layer ensures that all network nodes agree on the blockchain state. This layer includes consensus algorithms such as PoW (Proof-of-Work), PoS (Proof-of-Stake), and DPoS (Delegated Proof-of-Stake).
Application Layer
Like any software architecture, the application layer is where blockchain application development and deployment occur. This layer includes blockchain services like smart contracts, decentralized applications (DApps), etc.
Key Technologies Behind Blockchain
I am sure that we have heard about the following technologies used for blockchain. I first encountered these technologies in my networking and algorithm design classes at my engineering college.
- Digital Signatures
- Merkle Tree
- Hashing and Cryptography
- Hash Pointer
- Hash Cash
- TCP/IP and Peer to Peer Network
Satoshi Nakamoto, a pseudonym in the form of a Japanese male name, took technologies that had existed for decades and combined them into a revolutionary network in 2008. No one knows whether the name belongs to an individual, an organization, or a group of professionals.
Let’s move on and talk about the above-listed technologies used in blockchain.
Digital Signatures for Blockchain
To pay by bank cheque, the account holder must authorize the cheque with their signature. The payer then gives or sends the cheque to the payee, who can, in turn, present it to the bank to receive the money.
A digital signature is the digital counterpart of that handwritten signature. Because it relies on cryptography, a digital signature stands apart from a plain electronic signature. It is akin to an electronic fingerprint that verifies a person’s identity and secures documents.
Moreover, digital signatures are more secure and reliable than electronic signatures. In conventional document signing, a certification authority vouches for the signer’s key; on a blockchain, each signature is checked directly against the signer’s public key. For protecting transactions against forgery and tampering, digital signatures are the ideal choice for blockchain.
Digital signatures can be of two types:
- Symmetric Digital Signatures
- Asymmetric Digital Signatures
Symmetric Digital Signatures
A Symmetric digital signature uses a single key to encrypt the messages.
- The sender encrypts the message with that key and sends it to the receiver.
- Once the receiver receives it, they need the SAME key to decrypt or unlock that message.
- So, the sender also shares the key they encrypted the message with so that the receiver can use it to decrypt the message.
However, symmetric digital signatures aren’t popular because the key must be transmitted by secure means. Sending the key along with the message defeats the purpose, as an intruder who intercepts both can easily decrypt the message in transit. Moreover, anyone who holds the shared key can produce a valid message, so the scheme cannot prove which party created it.
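To make the shared-key idea concrete, here is a small sketch that uses a message authentication code (HMAC) from Python’s standard library. Note that this swaps in a different technique from the encryption described above: an HMAC does not hide the message, it only lets a holder of the shared key check that the message was not altered. The key and the function names `tag_message` and `check_message` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; both sender and receiver must already hold it.
shared_key = b"hypothetical-shared-secret"


def tag_message(key, message):
    """Compute an authentication tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_message(key, message, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


message = b"pay 1 BTC to N2"
tag = tag_message(shared_key, message)

print(check_message(shared_key, message, tag))             # True
print(check_message(shared_key, b"pay 9 BTC to N2", tag))  # False
```

Because the sender and the receiver hold the same key, either of them could have produced the tag, which is exactly the limitation that asymmetric signatures remove.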
Asymmetric Digital Signatures
Asymmetric digital signatures, which use a pair of public and private keys, are much more widely accepted than symmetric digital signatures.
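- A message encrypted with a public key can only be decrypted with the corresponding private key of the public-private key pair and vice versa. For signing, the sender signs with the private key, and anyone can verify the signature with the matching public key.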
- The public key of each participant is shared across the network, and the private key is held secretly by the individual.
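The sign-with-private, verify-with-public flow can be sketched with textbook RSA using only Python’s standard library. This is a toy under loud assumptions: the two primes are small, well-known Mersenne primes, there is no padding, and the names `sign` and `verify` are made up for this example, so it must not be used for real security. Bitcoin itself uses a different algorithm (ECDSA over an elliptic curve), which is not shown here.

```python
import hashlib

# Toy key pair built from two well-known Mersenne primes (far too small for real use).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(message):
    """Hash the message and reduce the digest into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message, private_exponent=D):
    """Only the holder of the private exponent can compute this value."""
    return pow(digest_as_int(message), private_exponent, N)


def verify(message, signature, public_exponent=E):
    """Anyone who knows the public key (N, E) can run this check."""
    return pow(signature, public_exponent, N) == digest_as_int(message)


signature = sign(b"N1 pays N2 1 BTC")
print(verify(b"N1 pays N2 1 BTC", signature))    # True
print(verify(b"N1 pays N2 100 BTC", signature))  # False
```

The private exponent `D` never leaves the signer, while `N` and `E` can be shared across the network, mirroring the key handling described in the list above.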
Hash Function in Blockchain
The hash function is a mathematical function that converts an input of any length into a fixed-length output. The following figure explains the working of a hash function to generate fixed-length output from an input of any length:
The hash function has some fantastic features/properties, which are as follows:
- For practical purposes, a hash uniquely represents the data.
- The hash function in Blockchain converts the input data (for example, text) into a string of bytes. Whatever the input data’s length or structure, the hash function converts it into an output string of a fixed length and structure. The output we get through the hash function is called a “hash value” or “checksum.” There are different hashing algorithms, and the hash value created using a specific hashing algorithm is always the same length. Hashing is also one-way, i.e., from input to output, and irreversible.
- Even a tiny change in the data generates an entirely different, random-looking hash.
- Each hashing algorithm, irrespective of the data size, generates the hash of a fixed length, an essential characteristic of the hashing algorithm.
- We cannot retrieve the original string from the hashed string.
Blockchains use different types of hash functions. For example, there are Secure Hashing Algorithm (SHA-2 and SHA-3), RACE Integrity Primitives Evaluation Message Digest (RIPEMD), Message Digest Algorithm 5 (MD5), or BLAKE2.
For example, every digital asset on the Blockchain gets a unique identification via a cryptographic hash value calculated with the “SHA256” algorithm. “SHA” is the acronym for “Secure Hash Algorithm,” and 256 is the length of the hash value in bits.
In simple terms, the use of hashing in blockchain networks protects privacy and prevents data tampering. In addition, hashing helps secure blockchain networks.
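The fixed-length and deterministic properties are easy to observe with Python’s standard `hashlib` module. The inputs below are arbitrary examples, and `sha256_hex` is just a small helper written for this sketch.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 hash of the input as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Inputs of 1 byte and 1,000,000 bytes both give 64 hex characters (256 bits).
print(len(short_digest), len(long_digest))  # 64 64

# The same input always gives the same hash.
print(sha256_hex(b"blockchain") == sha256_hex(b"blockchain"))  # True

# Changing a single character gives a different hash.
print(sha256_hex(b"blockchain") == sha256_hex(b"Blockchain"))  # False
```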
What is Hash Pointer in Blockchain?
Hash Pointer is nothing but a data structure. It contains two parts:
- Address of data
- The hash of the data
The system securely stores the hash pointer in a place other than the actual data storage. Storing hash pointers at a different location than the data helps prevent tampering with the transaction data.
We can easily detect any data tampering by recalculating the hash value of the stored data and comparing it with the hash held in the hash pointer for that address. Any mismatch confirms data tampering and tells us that the data is not the same as the originally stored data.
A hashing algorithm is also used to generate account addresses.
Hash pointers connect the data blocks in a blockchain, ensuring data immutability.
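A minimal sketch of how hash pointers link blocks and expose tampering might look like this. The block layout (a dictionary with `index`, `transactions`, and `prev_hash`) and the helper names are assumptions made for illustration, not the format of any real blockchain.

```python
import hashlib
import json


def hash_block(block):
    """Hash a block's full contents, including its pointer to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches):
    """Create one block per batch; each block stores the hash of the block before it."""
    chain = []
    prev_hash = "0" * 64  # the first block has no parent
    for index, transactions in enumerate(batches):
        block = {"index": index, "transactions": transactions, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = hash_block(block)
    return chain


def is_chain_valid(chain):
    """Recompute every hash and compare it with the pointer stored in the next block."""
    for previous, current in zip(chain, chain[1:]):
        if current["prev_hash"] != hash_block(previous):
            return False
    return True


chain = build_chain([["A pays B 1"], ["B pays C 2"], ["C pays A 3"]])
print(is_chain_valid(chain))  # True

chain[0]["transactions"][0] = "A pays B 100"  # tamper with an old block
print(is_chain_valid(chain))  # False
```

In this sketch nothing points at the last block, so a change there would go unnoticed; a real network also needs an agreed hash for the newest block, which is what the consensus mechanism provides.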
Merkle Tree for Data Security
A Merkle tree is a data structure for storing data securely in a blockchain that also helps with efficient data retrieval.
Say we have four different data points. The simplest idea is to concatenate the data points and generate a single hash. A Merkle tree instead hashes the data points in pairs, level by level, until one hash remains.
Storing data in the Merkle tree fashion has an advantage in the retrieval process compared to the traditional way. In the Merkle tree, we don’t need access to all the data points to verify the data/transaction.
To arrive at the Merkle root of the “n” number of elements, we follow the below steps:
- Compute the individual hashes of each element.
- We form strings from this set of hashes by concatenating each pair of consecutive hashes. If the last hash has no partner because the number of hashes is odd, it is concatenated with itself.
- Compute the individual hashes of the obtained strings.
- Now, if the number of hashes is more than one, we go back to step 2 and perform steps 2 and 3 until there is one single hash.
NOTE: Even a tiny change in any data point will change its hash, which also changes the root hash. Thus, the root hash is the fingerprint representing all the data points in the blockchain.
Refer to the above diagram illustrating the calculation of a root hash of the four data points.
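The steps for arriving at the Merkle root can be sketched in a few lines of Python. This is a simplified illustration that uses a single SHA-256 per node (Bitcoin applies SHA-256 twice), and the function name `merkle_root` and the sample data are hypothetical.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def merkle_root(items):
    """Reduce a list of data points (bytes) to a single root hash."""
    if not items:
        raise ValueError("need at least one data point")
    level = [sha256(item) for item in items]       # step 1: hash each element
    while len(level) > 1:                          # step 4: repeat until one hash is left
        if len(level) % 2 == 1:
            level.append(level[-1])                # odd count: pair the last hash with itself
        level = [sha256(level[i] + level[i + 1])   # steps 2 and 3: concatenate pairs, then hash
                 for i in range(0, len(level), 2)]
    return level[0]


data_points = [b"tx1", b"tx2", b"tx3", b"tx4"]
print(merkle_root(data_points).hex())

# Changing any data point changes the root.
print(merkle_root([b"tx1", b"tx2", b"tx3", b"txX"]) == merkle_root(data_points))  # False
```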
Levels of Merkle Tree
There are three levels in a Merkle tree, which are as follows:
- Leaf level: The lowest level of the Merkle tree is called the leaf level
- Merkle root: The top node is called the Merkle root
- Non-Leaf level: Non-leaf levels are in-between levels between leaf and root.
Binary Merkle Tree in Blockchain
We call a Merkle tree a binary Merkle tree if the leaf level has \(2^n\) nodes. If it does not, we can make it a binary Merkle tree by adding duplicate nodes. There are a few essential properties of the binary Merkle tree, which are as follows:
- Nodes and Levels: The binary Merkle tree has \(N = 2L - 1\) nodes and \(\log_2 L + 1\) levels, where \(L\) represents the total number of nodes in the leaf level. In our example above, \(L = 4\), as the number of data points in the leaf level is four, which gives \(N = 7\) nodes and \(3\) levels.
- Merkle path: The Merkle path is the set of hashes needed to verify a given data point against the Merkle root.
The Merkle tree allows verifying that a transaction exists in a blockchain block without having the entire block by following its Merkle branch, as the algorithm also stores intermediate hash results.
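To show why the whole block is not needed, the sketch below builds the tree levels, extracts the Merkle path for one data point, and verifies that point using only the path and the root. It again assumes a single SHA-256 per node, and all function names are made up for this example.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def build_levels(items):
    """Return every level of the tree: leaf hashes first, root level last."""
    levels = [[sha256(item) for item in items]]
    while len(levels[-1]) > 1:
        current = list(levels[-1])
        if len(current) % 2 == 1:
            current.append(current[-1])  # duplicate the last hash on odd levels
        levels.append([sha256(current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def merkle_path(levels, index):
    """Collect the sibling hash needed at each level below the root."""
    path = []
    for level in levels[:-1]:
        padded = level + [level[-1]] if len(level) % 2 == 1 else level
        sibling = index ^ 1  # the neighbour within the same pair
        path.append((padded[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path


def verify(item, path, root):
    """Rebuild the root from one data point and its Merkle path."""
    current = sha256(item)
    for sibling, sibling_is_left in path:
        current = sha256(sibling + current) if sibling_is_left else sha256(current + sibling)
    return current == root


items = [b"tx1", b"tx2", b"tx3", b"tx4"]
levels = build_levels(items)
root = levels[-1][0]
path = merkle_path(levels, 2)          # path for b"tx3"

print(len(path))                       # 2 hashes for 4 leaves
print(verify(b"tx3", path, root))      # True
print(verify(b"forged", path, root))   # False
```

For four leaves the verifier needs only two sibling hashes, and in general about \(\log_2 L\) of them, which is why lightweight clients can check a transaction without downloading every transaction in the block.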
Hash Cash
Hash cash is a proof-of-work technique that incentivizes the right behavior and penalizes the wrong behavior by making an action cost computing effort.
In the early days, when email technology was evolving, hash cash was a technique to identify spam emails. In this technique, the sender calculates the hash cash over the recipient’s address, cc, body, etc., of the mail and attaches it to the mail. It is proof of the work the sender did before sending the email.
This limits the scope for spam emails, as computing hash cash is computationally intensive and time-consuming, which makes sending mail in bulk expensive. Blockchain technology uses a hash cash style proof of work in its consensus mechanism for mining blocks.
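The idea of work that is costly to produce but cheap to check can be sketched as a search for a nonce. This simplified example looks for a SHA-256 hash that starts with a given number of zero hex digits; the original hash cash scheme and Bitcoin’s mining rule differ in their details, and the function names and the `difficulty` parameter are assumptions for illustration.

```python
import hashlib


def proof_of_work(data, difficulty):
    """Try nonces one by one until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def check_work(data, nonce, difficulty):
    """Verification needs a single hash, no matter how long the search took."""
    digest = hashlib.sha256(data + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = proof_of_work(b"mail to alice@example.com", difficulty=4)
print(nonce, digest)
print(check_work(b"mail to alice@example.com", nonce, 4))  # True
```

Each extra zero digit multiplies the expected search effort by 16, while the check stays a single hash; this asymmetry is what makes the technique usable both against spam and for mining.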
Blockchain and TCP/IP & Peer-to-Peer Network
TCP/IP is the protocol suite that underlies the Internet. It lets us request data packets from a server using an IP address or URL.
We can establish a peer-to-peer network by connecting multiple participants using TCP/IP protocols. Peer-to-peer networks are widely used for torrents, where we download pieces of a file from other peers rather than from a single server.
By combining the above technologies, we establish an ideal blockchain network with two essential properties:
- A network where participants interact with complete transparency
- A network that maintains a single truth
Cryptography and Blockchain’s Security Mechanism
Cryptography is the act of keeping digital assets secure through encryption in a blockchain. Encryption is the process of encoding digital files or assets so that only authorized parties can access them. A digital asset may be a message, an audio or video file, electronic cash (as in the case of cryptocurrency), or a document.
When encrypted digital assets get into the wrong hands, they remain inaccessible as long as the holder cannot decrypt them. Decryption is the process of converting encrypted digital assets back to their original form. Fundamentally, cryptography is a safe way to send and receive digital assets.
For example, from the 1500s to the Meiji era (October 23, 1868, to July 30, 1912), Japan used a “checkerboard” cipher that placed the “i-ro-ha” alphabet in a grid with rows and columns numbered 1 to 7. Foreigners who did not understand the checkerboard system, also called a Polybius square, could not decrypt messages encrypted with it.
Cryptographic Hashing and Blockchain Security
As mentioned under the “Hash Function in Blockchain” section, every digital asset on the Blockchain gets a unique identification via a cryptographic hash value. We build blockchain security using Cryptographic Hash Functions, a special class of functions with unique properties.
Every digital asset has a hash value. We can derive the hash value from the asset but not vice versa. The hash value remains the same as long as the asset content is unchanged. Even a small change in the asset content produces a completely different hash value, a concept called the Avalanche Effect. A collision occurs if two different inputs produce precisely the same hash value. Data security and integrity require that such collisions be practically impossible to find, a property that makes the hashing algorithm Collision-Resistant.
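The avalanche effect can be measured by counting how many of the 256 output bits differ between the hashes of two nearly identical inputs. The sample inputs and the helper name `differing_bits` are made up for this sketch.

```python
import hashlib


def differing_bits(a, b):
    """Count the bit positions where the SHA-256 hashes of a and b differ."""
    hash_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    hash_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(hash_a ^ hash_b).count("1")


# Two inputs that differ in a single character.
print(differing_bits(b"file contents v1", b"file contents v2"))

# Identical inputs give identical hashes.
print(differing_bits(b"file contents v1", b"file contents v1"))  # 0
```

For a well-behaved hash function, about half of the 256 bits are expected to flip on average, so the first number is typically somewhere near 128.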
For example, let us assume that we need to hash a digital asset with the name “file.txt” and send it through the Blockchain; the application of the hashing algorithm would convert the file to a 64-character hexadecimal hash value (256 bits, in the case of SHA-256) through a hash function.
We also use hashing algorithms to generate math puzzles that computers try to solve for a prize. After solving the puzzle, the computer is selected to help handle the transactions.
As discussed above, Merkle trees summarize extensive data with small hashes. Therefore, they are useful for lightweight wallets, which run on hardware-constrained devices such as mobile phones.
Cybersecurity in Blockchain Technology
The Confidentiality, Integrity, and Availability (CIA) triad is a popular concept in cybersecurity. Confidentiality means keeping digital information or data hidden from unauthorized people. Integrity means protecting information or data from unauthorized tampering. Finally, availability refers to on-time and reliable access to data.
Incorporating the CIA security triad model for Blockchain involves the following:
- Keeping the transaction history and details hidden from third-party access.
- Masking the transactions of businesses that adopt cryptocurrencies as payment systems for security purposes.
- Protecting digital assets like data from corruption caused by configuration errors, software bugs, or espionage attempts.
- Making the records of all transactions available; these transactions could be entries of business activities, asset entries, supply chain management records, and many more.
How Do Transactions Happen in a Blockchain?
To add a transaction to the blockchain network, it must undergo specific steps, including authorization, authentication, validation, and consensus from the network. The diagram below shows the high-level flow for adding the transaction to the network.
In the rest of the post, I will explain the steps, and we will demystify every step shown in the diagram above together. But before we do that, let us consider a simple network that will help us understand how the transaction will happen in the blockchain network.
In the sample network shown above, multiple organizations or individuals (represented as Org 1, Org 2, Org 3, and Individual) each represent a node (N1, N2, N3, and N4) in the network. Each node of the blockchain network has a digital signature that gives the node its identity. In reality, there can be millions of nodes. But for demo purposes, we are considering only a four-node network.
The above diagram is the basic setup of the blockchain network. For example, node N1 wants to transact with node N2 in the network. How will this transaction happen in the distributed blockchain network?
Node N1 wants to transfer 1 BTC (one bitcoin) to node N2. Let’s understand the step-by-step process for this.
Explanation of Blockchain Transaction Execution
- Node N1 initiates the transaction T1 of 1 BTC. Node N1 signs the transaction with its private key, producing a digital signature, and the transaction contains the address of node N2 (which is derived from the public key of node N2).
- Transaction T1 is flooded into the network using the “gossip” protocol, which communicates T1 to all four nodes. Nodes may receive the same transaction several times. For example, consider node N3, which gets information about transaction T1 from both nodes N4 and N1. In this case, only the first copy received is recorded, preventing duplicate records.
- The Blockchain network maintains record-keeping during the transaction flooding. You can think of this as a bank passbook containing the list of transactions for an account, showing the balance going up or down as the account receives and sends funds. Different networks use different record-keeping methods. One of the standard methods is the account balance method, which works like traditional banks.
- Next, all the nodes validate the transaction by checking the following:
- Does node N1 have a balance of 1 BTC?
- Does transaction T1 contain the digital signature of node N1?
- Does the transaction have the address (derived from the public key) of node N2?
- Each node maintains a transaction pool, as shown in the diagram above.
- After a node validates transaction T1, it adds T1, identified by its hash, to its transaction pool; a transaction that fails validation is not added.
- Let’s say node N3 validated T1 and added it to its transaction pool. N3 then floods this information to the network, so the other nodes learn that transaction T1 has been validated and added to a transaction pool.
- If most nodes in the network agree that transaction T1 is valid, it is added to the block.
- The nodes are also incentivized to validate the transactions, generally in cryptocurrency.
- Some nodes in the network are called validators or miners. After a fixed interval of time (which varies from network to network), these nodes generate a block from the valid transactions in the transaction pool.
- Since blocks are created after a fixed interval, the blocks will likely contain different numbers of transactions.
- Every block also has a block size, which refers to the maximum data limit a block can hold. Different transactions can have different transaction sizes. Therefore, the combined size of the valid transactions (taken from the transaction pool) should not exceed the block size.
- If the size of a validated transaction is more than the space remaining in the block, the transaction stays in the transaction pool and goes into the next block.
- The consensus mechanism validates and agrees that the block created by a special node (using the Merkle tree, hash pointer, and hash cash techniques) is valid. This maintains the network’s single truth, so that everyone knows what is happening on the network.
- Finally, the validated block is added to the Blockchain, which completes the transaction in the blockchain network.
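Two of the steps above, validating a transaction and filling a block up to its size limit, can be sketched as follows. Everything here is a simplified assumption: the `Transaction` fields, the `signed` flag that stands in for a real signature check, the account balance record-keeping, and the byte sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str
    receiver: str
    amount: int
    size: int     # size in bytes (hypothetical values)
    signed: bool  # stands in for a real digital signature check


def is_valid(tx, balances):
    """Apply the three checks: signature present, receiver address set, enough balance."""
    return tx.signed and bool(tx.receiver) and balances.get(tx.sender, 0) >= tx.amount


def assemble_block(pool, max_block_size):
    """Fill a block from the pool; transactions that do not fit wait for the next block."""
    block, remaining, used = [], [], 0
    for tx in pool:
        if used + tx.size <= max_block_size:
            block.append(tx)
            used += tx.size
        else:
            remaining.append(tx)
    return block, remaining


balances = {"N1": 1, "N3": 0}
candidates = [
    Transaction("N1", "N2", 1, 400, True),   # valid
    Transaction("N3", "N4", 5, 300, True),   # rejected: insufficient balance
    Transaction("N1", "N2", 1, 400, False),  # rejected: not signed
]
pool = [tx for tx in candidates if is_valid(tx, balances)]
block, remaining = assemble_block(pool, max_block_size=1000)
print(len(pool), len(block), len(remaining))  # 1 1 0
```

A real node would also update balances as transactions are accepted and would usually choose transactions by fee rather than by arrival order; the sketch leaves both out to keep the size rule visible.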
Types of Networks in Blockchain
Blockchain networks generally fall under either the cryptocurrency or the enterprise network category. Both kinds of network are in use today, and research continues to evolve them further. Let’s dive into the details of these networks.
Cryptocurrency Networks
The main intention of these networks is to introduce cryptocurrencies that work the same way as common currencies such as rupees, dollars, etc. This means there must be an underlying value behind the cryptocurrency, and it should be usable to trade for goods and services in the market (which is currently not the case in many countries).
These currencies exist only digitally and can be used only on a digital network. They resemble the money we hold in a bank account, which is a digital balance without any physical presence.
Bitcoin was the first blockchain cryptocurrency network. Other examples of this type of network are Ethereum, Polygon, Ripple, Litecoin, etc. The advantages are evident as we can transfer these currencies without financial institutions like banks.
Every cryptocurrency network has its pros and cons. Bitcoin is the most widely used and accepted cryptocurrency. However, its transaction speed is limited; hence, scalability is a challenge.
Enterprise Blockchain Networks
As the name suggests, enterprise networks are a popular choice for businesses. They are a permissioned blockchain (more on it later) where businesses can utilize the advantages of blockchain with proper visibility and access restricted to a specific group of stakeholders.
Enterprise blockchains can be private or consortium networks.
Examples of these networks (open source) include Hyperledger, R3 Corda, etc. Big companies such as Facebook, IBM, Walmart, Mastercard, and many others are actively exploring the enterprise blockchain network at the time of this publication.
Types of Blockchains
In the above-discussed networks, multiple types of blockchains can exist. There are primarily two types of blockchains: permissionless and permissioned. However, there are several variations, such as consortium or hybrid blockchains.
Let’s go ahead and discuss every type in detail.
Public Permissionless Blockchain
These blockchains are permissionless and free for anyone to join or leave. All nodes in such a network have equal access to perform network operations. This network provides anonymity, immutability, and transparency but compromises efficiency.
Example: The Bitcoin blockchain is the best example of a public permissionless network. Other examples include Ethereum, Litecoin, etc.
Public Permissioned Blockchain
Confused to see the words public and permissioned together? It sounds contradictory. A public permissioned blockchain is a new type of network that is being researched to fill the gap between public permissionless and private consortium networks. Such a network still supports the decentralization and immutability properties of blockchain. However, every node in the network knows the identity of the other nodes.
Example: Such a network can help fulfill use cases that cannot be addressed in public or private networks. One such example is the Goods and Services Tax (GST) network in India, which would be well suited to a permissioned blockchain since known entities operate it, and all participants are verified before they join the network.
Private Blockchain
The private blockchain operates within a closed network where only specific participants can access the network. It is operated and managed by the controlling authority, which holds the power of security, authorization, permission, and accessibility. The controlling authority controls even the right to perform a function.
Organizations that can control the participating stakeholders in the network primarily use private blockchain networks. Moreover, such networks prioritize efficiency over anonymity, transparency, and immutability.
Examples: Hyperledger projects, Corda, etc.
Consortium Blockchain
A consortium blockchain is like a private blockchain but differs in who controls access and operations.
Unlike a private blockchain, where there is only one controlling authority, a consortium blockchain distributes the authority across multiple groups. Therefore, it is more decentralized than a private blockchain.
Example: A supply chain where different units, such as sales, logistics, and marketing, each act as a controlling authority.
Hybrid Blockchain
A hybrid blockchain is a combination of private and public blockchains. It offers the best of both worlds, combining the advantages of private permissioned systems and public permissionless systems. In such a network, only a specific section or set of records is placed on the public network; the rest of the data is restricted to the private network. Only authorized stakeholders are given access to the data in the private network.
Example: Dragonchain, a blockchain initially developed by Disney.
Summary
The blockchain concept started with the Bitcoin cryptocurrency, but innovative blockchain technology has been catching the attention of various industries that are exploring its uses. These uses range from smart contracts for purposes like automated escrow, to distributed ledgers for banking, to monetization and in-game optimization in online games, and even to industries like insurance and supply chain.
This article discussed blockchain technology, including its architecture, security, and transaction processing. We hope that this will be helpful for people who wish to familiarize themselves with the concepts of blockchain technology.
Tavish lives in Hyderabad, India, and works as a result-oriented data scientist specializing in improving the major key performance business indicators.
He understands how data can be used for business excellence and is a focused learner who enjoys sharing knowledge.