2018 has been a boom year for cryptocurrency. A vivid example is that the number of Ethereum addresses in use grows every day. More users mean more funds entrusted to the Ethereum blockchain, which people use to store their capital.
The market capitalization of the Ethereum platform was $64 billion USD as of May 23rd, 2018. In addition, the Ethereum-based tokens ranked in the top 100 reached a combined $39 billion USD.
To understand what happens to money once people have entrusted it to the Ethereum blockchain, we need to know how Ethereum storage works and what can be stored there.
First of all, let’s take a closer look at the Ethereum platform.
What Is Ethereum?
Ethereum is a blockchain-based decentralized computing platform. It executes programs (smart contracts) that are deployed to the Ethereum blockchain and are most commonly written in Solidity, a contract-oriented programming language.
In short, Ethereum is a network of many computers operating together as a single “supercomputer” that democratizes the current client-server model. Instead of servers and clouds, it relies on thousands of “nodes” run by volunteers all over the world.
Ethereum can execute insurance contracts, exchange financial instruments, and store data. It gives all users an equal opportunity to offer services on top of this infrastructure. By contrast, many of the applications users download and install today rely on a company (or another third-party service) to store credit card details, purchase history, and other private information, which means those parties have access to all of it.
Relying on third parties in this way opens the door to theft, leaks, and changes to users’ important private information without their knowledge. Users store their personal information on clouds and servers owned by companies like Amazon, Google, and Facebook. Along with conveniences, such as data security and no hosting or uptime costs for the user, these companies also introduce vulnerabilities into the entire system.
Moreover, applications such as Evernote, Google Docs, and iTunes are governed exclusively by the third parties behind them, such as Google and Apple.
Ethereum is a disruptive technology that helps users withstand the vulnerabilities that internet third parties introduce. It returns control of personal data to users, along with the rights to what they create. The main idea is that third-party companies cannot make changes to user data. If the owner of the data saves edits, or adds or deletes notes, every node on the network sees the changes.
From the software perspective, Ethereum consists of a set of accounts, each of which has a balance (in Ether).
After creating an Ethereum wallet, a user can execute transactions, transfer Ether to other users, or deploy smart contracts.
The transaction processing system of Ethereum consists of:
- a state: a set of all accounts and their balances
- a transaction
- a new state: an updated set of accounts and their balances
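The three parts above can be sketched as a function that takes a state and a transaction and returns a new state. This is only an illustration, not how any Ethereum client is implemented: the names `Transaction` and `apply_transaction` are hypothetical, balances are plain integers, and gas, nonces, and signatures are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical, simplified transfer (no gas, nonce, or signature)."""
    sender: str
    recipient: str
    value: int  # amount to move, in the smallest unit


def apply_transaction(state: dict, tx: Transaction) -> dict:
    """Return the new state that results from applying tx to state."""
    if tx.value < 0:
        raise ValueError("value must be non-negative")
    if state.get(tx.sender, 0) < tx.value:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # the old state is left untouched
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    return new_state


old_state = {"alice": 10, "bob": 2}
new_state = apply_transaction(old_state, Transaction("alice", "bob", 3))
print(new_state)  # {'alice': 7, 'bob': 5}
```

Returning a fresh dictionary rather than mutating the old one mirrors the idea that each transaction produces a distinct new state.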
Each account has an address and a balance. There are two types of accounts in Ethereum: user accounts and smart-contract accounts. A smart contract is a program deployed to the blockchain and run by the EVM (the Ethereum Virtual Machine); it can store data, hold Ether, and execute functions.
What Is an Ethereum Array?
All smart contracts run in the Ethereum Virtual Machine (EVM), and each keeps its state in its own permanent storage. This storage can be thought of as a gigantic array that initially contains only zeros. Each value in the array is 32 bytes wide, and there are 2^256 values in total. A smart contract can read from or write to the value at any location.
Zeros do not occupy any space, so storage can be reclaimed by setting a value back to zero. A gas refund is granted when a value is changed to zero.
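One way to picture this array is as a sparse map in which only non-zero values are actually kept. The class below is a toy model under that assumption; the name `ContractStorage` and its methods are hypothetical, and gas accounting is left out entirely.

```python
class ContractStorage:
    """Toy model of one contract's storage: 2**256 slots of 32 bytes each."""

    SLOT_COUNT = 2 ** 256
    WORD_MAX = 2 ** 256 - 1  # largest value that fits in 32 bytes

    def __init__(self):
        self._slots = {}  # only non-zero slots are kept

    def _check_slot(self, slot: int) -> None:
        if not 0 <= slot < self.SLOT_COUNT:
            raise IndexError("slot out of range")

    def load(self, slot: int) -> int:
        self._check_slot(slot)
        return self._slots.get(slot, 0)  # untouched slots read as zero

    def store(self, slot: int, value: int) -> None:
        self._check_slot(slot)
        if not 0 <= value <= self.WORD_MAX:
            raise ValueError("value does not fit in 32 bytes")
        if value == 0:
            self._slots.pop(slot, None)  # writing zero frees the entry
        else:
            self._slots[slot] = value

    def occupied(self) -> int:
        """Number of slots that currently take up space."""
        return len(self._slots)


storage = ContractStorage()
storage.store(2 ** 200, 42)      # any location can be written
print(storage.load(2 ** 200))    # 42
print(storage.load(7))           # 0
storage.store(2 ** 200, 0)       # resetting to zero reclaims the space
print(storage.occupied())        # 0
```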
Locating Fixed-Sized Values
A variable of known, fixed size is given a reserved place in storage. Solidity assigns each such variable a slot (a specific location within storage). Consider a contract that declares, in order, a single value “a”, a pair of values “b”, and a structure “c” of type Entry:
- “a” is stored at slot 0
- the two values of “b” are stored at slots 1 and 2
- “c” occupies slots 3 and 4, because the Entry structure stores two 32-byte values
Slots are assigned at compile time, in the order in which the variables appear in the contract code.
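The assignment of slots in declaration order can be sketched as follows. This is a simplification that assumes every variable occupies a whole number of 32-byte slots (Solidity’s packing of smaller types into a shared slot is ignored), and `assign_slots` is a hypothetical helper, not part of any compiler API.

```python
def assign_slots(variables):
    """Map each variable name to the slots it occupies, in declaration order.

    `variables` is a list of (name, size_in_32_byte_words) pairs.
    """
    layout = {}
    next_slot = 0
    for name, words in variables:
        layout[name] = list(range(next_slot, next_slot + words))
        next_slot += words
    return layout


# "a" is one word, "b" holds two values, "c" is a struct of two words.
declared = [("a", 1), ("b", 2), ("c", 2)]
layout = assign_slots(declared)
print(layout)  # {'a': [0], 'b': [1, 2], 'c': [3, 4]}
```

Reordering the declarations changes the layout, which is why the order of variables in the contract code matters.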
Dynamically-sized arrays need locations to keep both their size and their elements.
For the dynamically-sized array “d”, which is assigned slot 5, the only thing stored in that slot is the array’s length. The elements themselves are stored consecutively, starting at the hash of the slot number.
Note: Reserved slots cannot be used for the contents of dynamically-sized arrays and mappings, because the number of slots to reserve is not known in advance. Instead, Solidity uses a hash function to compute the locations of dynamically-sized values in a repeatable way.
A mapping needs a way to find the location that corresponds to a given key. Hashing the key is a natural choice, but different mappings must produce different locations for the same key.
Mapping “e” is assigned slot 6 and mapping “f” slot 7, but nothing is stored at those slots: no length and no individual values.
The location of a particular value within a mapping is determined by hashing the key together with the mapping’s slot.
Dynamically-sized arrays and mappings can also be nested inside one another. The location of a value within such a complex type is found by applying the same calculations recursively.
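The location rules for dynamically-sized arrays and mappings can be sketched in Python. Solidity hashes with Keccak-256, which differs from the `hashlib.sha3_256` in Python’s standard library only in its padding byte, so a small pure-Python Keccak-256 is included here. The helper names are hypothetical, keys are assumed to be 32-byte value types such as `uint256`, and the nested mapping “g” at slot 8 is an invented example.

```python
MASK = (1 << 64) - 1


def _rol(value, n):
    n %= 64
    return ((value << n) | (value >> (64 - n))) & MASK


def _keccak_f(lanes):
    """Keccak-f[1600] permutation on 25 64-bit lanes (index = x + 5*y)."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x] ^ lanes[x + 5] ^ lanes[x + 10] ^ lanes[x + 15]
             ^ lanes[x + 20] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [lanes[i] ^ d[i % 5] for i in range(25)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x + 5 * y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x + 5 * y] = (
                lanes[x + 5 * y], _rol(current, (t + 1) * (t + 2) // 2))
        # chi
        for y in range(5):
            row = lanes[5 * y:5 * y + 5]
            for x in range(5):
                lanes[5 * y + x] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated by an 8-bit LFSR)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes, suffix: int = 0x01) -> bytes:
    """Keccak-256 as used by Ethereum (suffix=0x06 would give SHA3-256)."""
    rate = 136  # bytes absorbed per permutation for a 256-bit digest
    padded = bytearray(data) + bytes([suffix])
    padded += bytes(-len(padded) % rate)
    padded[-1] ^= 0x80
    lanes = [0] * 25
    for start in range(0, len(padded), rate):
        block = padded[start:start + rate]
        for i in range(rate // 8):
            lanes[i] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f(lanes)
    return b"".join(lane.to_bytes(8, "little") for lane in lanes[:4])


def word(n: int) -> bytes:
    """Encode an integer as a 32-byte big-endian word."""
    return n.to_bytes(32, "big")


def array_element_slot(array_slot, index, words_per_element=1):
    """Elements start at keccak256(slot) and are laid out consecutively."""
    base = int.from_bytes(keccak256(word(array_slot)), "big")
    return (base + index * words_per_element) % 2 ** 256


def mapping_value_slot(key, mapping_slot):
    """A mapping value lives at keccak256(key . slot)."""
    return int.from_bytes(keccak256(word(key) + word(mapping_slot)), "big")


# Slot numbers follow the text: array "d" at slot 5, mappings "e"/"f" at 6/7.
d_first = array_element_slot(5, 0)
d_second = array_element_slot(5, 1)
e_value = mapping_value_slot(123, 6)
f_value = mapping_value_slot(123, 7)   # same key, different mapping

# Nested case: a hypothetical mapping of key -> dynamic array at slot 8.
inner_array_slot = mapping_value_slot(123, 8)
nested_element = array_element_slot(inner_array_slot, 2)

print(hex(d_first), hex(e_value), hex(nested_element))
```

Because the mapping’s slot is part of the hashed input, the same key in “e” and “f” lands in different locations.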
What Data Can Be Stored in Ethereum Storage?
First, let’s find out what kind of information must be stored for a blockchain network to operate. A simple example is a transaction.
A blockchain has to keep track of the balances and other data of its different users, as well as every piece of data that is processed on it, such as transactions. Every platform does this differently. Let’s discover what happens on the Ethereum platform.
Ethereum is a transaction-based “state” machine: starting from its genesis block, every transaction moves the blockchain from one state to the next. Transactions, contracts, and mining constantly change the state of the Ethereum blockchain. For example, an account balance stored in the state trie changes whenever a transaction involving that account is executed.
The state trie, a Merkle Patricia trie, holds one key-value pair for every account in the system. The “key” is a single 160-bit identifier (the address of the Ethereum account). The “value” is generated by encoding the following details of the Ethereum account with the recursive-length prefix (RLP) encoding method:
- nonce
- balance
- storageRoot
- codeHash
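To make the encoding step concrete, here is a minimal sketch of RLP encoding applied to an account’s four fields (nonce, balance, storageRoot, codeHash). The function names are hypothetical and only encoding is covered. The two constants are the widely published hashes for an empty storage trie and empty contract code, which is what an account without contract data or code would carry.

```python
def _length_prefix(length: int, offset: int) -> bytes:
    """Short payloads get a one-byte prefix; long ones also encode the length."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """RLP-encode bytes, non-negative integers, or (nested) lists of them."""
    if isinstance(item, int):
        if item < 0:
            raise ValueError("RLP integers must be non-negative")
        # Integers are big-endian with no leading zeros; 0 becomes b"".
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, (bytes, bytearray)):
        if len(item) == 1 and item[0] < 0x80:
            return bytes(item)  # a single small byte encodes as itself
        return _length_prefix(len(item), 0x80) + bytes(item)
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(element) for element in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("unsupported type for RLP")


# Well-known hashes for an empty storage trie and for empty code.
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421")
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")

# An illustrative user account: nonce 0, balance of 10**18 (one Ether in wei).
account = [0, 10 ** 18, EMPTY_TRIE_ROOT, EMPTY_CODE_HASH]
encoded_account = rlp_encode(account)
print(len(encoded_account), encoded_account[:2].hex())  # 78 f84c
```

The resulting byte string is what would be stored as the value under the account’s address in the state trie.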
The state trie’s root node (a hash of the current state trie) can be used to identify the state trie; the state trie’s root node cryptographically depends upon all internal state-trie data.
The storage trie is where contract data is stored. Every account has its own storage trie. A 256-bit hash of the storage trie’s root node is kept as the storageRoot value in the global state trie.
Every Ethereum block has its own transaction trie. The miner who assembles the block determines the order of the transactions in it.
One can gain access to a particular transaction in the transaction trie through its index. A transaction will always have the same position, as the mined blocks are immutable. Having determined the location of the transaction, you will use the same path to it every time you need it.
The Ethereum blockchain itself stores only the root node hashes of the transaction, state, and receipt tries; data such as account balances is not kept in the blocks directly.
In this scheme, the root node hash of each storage trie, which contains a smart contract’s data, is referenced from the state trie, and the root node hash of the state trie is in turn referenced from the blockchain.
The Ethereum blockchain stores two types of data:
- Permanent (e.g., the record of a transaction that has already been confirmed and cannot be changed)
- Ephemeral (e.g., the balance of a particular Ethereum account address, which changes after every transaction involving that account)
Ethereum structures these two types of data differently and stores them separately in order to manage them.
We can compare Ethereum’s record-keeping system to the one banks use for ATM/debit cards: the bank tracks how much money is available to each card and checks whether the card has enough before approving a transaction.
Types of Ethereum Tries
Ethereum clients mainly use two database software solutions to store their tries:
RocksDB is built on LevelDB. It is designed to scale on servers with many CPU cores, to use fast storage efficiently, to support IO-bound, in-memory, and write-once workloads, and to be flexible enough to allow for innovation. RocksDB eliminates many of LevelDB’s drawbacks and significantly expands its functionality. Well-known users of RocksDB include Facebook, Yahoo, LinkedIn, Airbnb, Pinterest, Uber, and Netflix. In addition, RocksDB is used in projects like CockroachDB, TiDB, and Apache Flink.
LevelDB is a commonly used open-source key-value storage library from Google that provides persistent storage. It offers an ordered mapping from string keys to string values, forward and backward iteration over data, custom comparison functions, and automatic data compression through Google’s open-source compression/decompression library, “Snappy.” The Go, C++, and Python Ethereum clients use this database.
Querying LevelDB directly returns encoded results, because Ethereum stores the nodes of its own Modified Merkle Patricia Tree implementation there in encoded form.
Information on the design and implementation of both Ethereum’s Modified Merkle Patricia Tree and Recursive Length Prefix (RLP) encoding can be found on the Ethereum Wiki.
Because Ethereum uses a tree data structure, the Modified Merkle Patricia Tree provides “extension” nodes, which allow a lookup to shortcut the descent through the tree.
A Modified Merkle Patricia Tree node is one of the following:
- an empty string (referred to as NULL)
- an array that includes 17 items (referred to as a branch)
- an array that includes 2 items (referred to as a leaf)
- an array that includes 2 items (referred to as an extension)
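Leaf and extension nodes both have two items, so something else must tell them apart. In Ethereum’s trie they are distinguished by a flag in the first nibble of the node’s encoded path (the “hex-prefix” encoding): 0 or 1 marks an extension, 2 or 3 marks a leaf. The sketch below classifies an already-decoded node on that basis; `classify_node` is a hypothetical helper, and the sample nodes are made up for illustration.

```python
def classify_node(node) -> str:
    """Classify a decoded trie node as null, branch, leaf, or extension."""
    if node == b"":
        return "null"
    if isinstance(node, (list, tuple)):
        if len(node) == 17:
            return "branch"  # 16 child references plus one value
        if len(node) == 2:
            encoded_path = node[0]
            if not encoded_path:
                raise ValueError("two-item node has an empty path")
            flag = encoded_path[0] >> 4  # first nibble of the encoded path
            if flag in (2, 3):
                return "leaf"
            if flag in (0, 1):
                return "extension"
    raise ValueError("not a valid trie node")


print(classify_node(b""))                                    # null
print(classify_node([b""] * 17))                             # branch
print(classify_node([bytes([0x20, 0x12]), b"value"]))        # leaf
print(classify_node([bytes([0x00, 0x12]), b"\x11" * 32]))    # extension
```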
Interestingly, an Ethereum account is included in the state trie only once a transaction involving it has taken place. If one creates a new account using the command “geth account new,” the account is not included in the state trie, and this remains the case even after many blocks have been mined.
On the other hand, once a successful transaction (one that costs gas and is included in a mined block) is recorded against that account, the account is added to the state trie. This helps prevent attacks in which new accounts are registered simply to bloat the state trie.
One of the main issues users face is the rapid growth of the database, which results in significant storage costs. The database, also known as the Ethereum state, contains the results of all the computations the network has performed. The cost of storing it is constantly increasing, which makes running a full node impossible for more and more users. As a result, the network could become centralized and accessible only to a few users.
Ethereum developers have proposed protocol-level changes, such as sharding, to reduce the size of the database.
To cope with other issues, the following updates have recently been developed for Geth and Parity.
First, the developers proposed removing temporary files from the network history in order to reduce storage requirements. Because of long synchronization times, running Ethereum on a hard disk drive has been impractical since last summer. This update makes it possible to run full nodes and Ethereum software on a hard drive, instead of a solid-state drive (SSD), with faster synchronization times. It has drawn many positive comments from Ethereum users.
The independent developer Alexey Akhunov suggested a second update: simplifying how the overall state is handled by rewriting the Geth client (“Turbo-Geth”). In his opinion, the reworked software can make much better use of random access memory (RAM) and enable users to synchronize with the network instantly.
Additionally, Ethereum could implement “stateless clients,” which store only a compressed representation of the overall state. Such clients would not need to read storage from files in order to validate blocks. With old, irrelevant data and empty or long-inactive accounts pruned, storage needs and the time required for checks would drop significantly. There are also coordination difficulties that need to be overcome to ensure the code continues to operate as designed.
The Constantinople update, which will be released in October 2018, will bring a new challenge for Ethereum users: balancing the interests of a web of diverse stakeholders. The “difficulty bomb,” a piece of code addressed by the Constantinople update, was developed to promote the adoption of new technology on the platform. However, it also makes blocks steadily slower to mine. Special measures must be taken so that transactions can still be processed and the blockchain can operate as originally conceived.
The delay of the difficulty bomb also has a negative effect on Ether inflation; therefore, the Ethereum platform must upgrade its code before the difficulty bomb hits.
Three Ethereum Improvement Proposals (EIPs) are on the table: EIP 145, EIP 1014, and EIP 1052, which are intended to facilitate state channels and increase the speed of contract verification.
Some users think the Constantinople update should also reduce the amount of Ether paid to miners (people who run special computing hardware to ensure the security of transactions). ETH traders share this opinion, believing that it will preserve the value of the currency and the network.
In response, miners state that a decrease in their income will cause a decrease in the security of the Ethereum platform.
It is also worth learning about common smart contract bugs and implementation pitfalls in order to avoid technical issues.
Having explored how the Ethereum network and its storage operate, we can conclude that Ethereum provides an effective alternative to storage that depends on third-party companies.
The constantly growing number of Ethereum adherents suggests that people today prefer decentralized applications to centralized ones. This means that the era of blockchain-based technologies will continue for a long time to come. By the way, you can contact the Applicature team to develop your own smart contract that meets all regulatory requirements in your country.