*What is decentralization – and how do you measure it? A group of Chinese researchers compared decentralization in Bitcoin and Ethereum through three metrics. But the result mainly shows how difficult it is to measure the decentralization of a cryptocurrency.*

If there is a sacred cow in the crypto space, it goes by the name of decentralization. Decentralization is the purpose and cause and goal of a blockchain. It makes the decisive difference between PayPal and Bitcoin. Everything stands and falls with it. The market capitalization of all cryptocurrencies, which is now a trillion dollars, can be seen as the value that the market gives to a decentralized solution for transactions.

*But – what is decentralization? Is a thing either centralized or decentralized? Can you measure whether something is more or less decentralized? And if so, how?*

## Schrödinger’s sacred cow

These questions are insanely difficult to answer. Decentrality is in itself just the opposite of centrality. As soon as a system no longer has a center, it can be considered decentralized. In that sense, three equally strong and independent parties would be enough for something to be decentralized. At the same time, no one disputes that a system with 100 independent and equally strong parties is more decentralized than one with three. But how exactly is this to be defined and measured?

Or is decentralization just a function to produce an outcome – to keep cryptocurrencies secure, censorship-resistant and permission-free? Accordingly, is a cryptocurrency that achieves this state with fewer independent parties just as decentralized as one with more parties?

And is a theoretically achievable decentralization worth more than the practically realized one? Is the cost of running a full node a better indicator of decentralization than the number of nodes actually running? Or the cost of centralizing the network?

What are the essential metrics? The number of miners? The number of mining pools? Of the nodes in the network? Or of the connections between the nodes, the edges in the graph? Should we look at the distribution of units of currency? Or at the number and distribution of software implementations of the protocol? And what role does the consensus mechanism play?

All these questions are far from settled. Probably they can never be answered universally. After all, they are more a matter of perspectives, views and beliefs that are accompanied by strong prejudices: Everyone has their favorite cryptocurrency and has lost their impartiality, at the latest, when they invested in it.

Thus, for Bitcoiners, there is absolutely no question that Bitcoin is the only truly decentralized cryptocurrency, and that only Proof of Work achieves decentralization. Ethereum, on the other hand, is hoping for more decentralization from Proof of Stake, while Cardano and Polkadot already praise themselves for being much more decentralized than Bitcoin. Craig Wright’s BSV scene claims that only protocol immutability creates decentralization, and Ripple Labs touts XRP as being more decentralized than Bitcoin, even though the rest of the world thinks Ripple is a hideously centralized affair. And so on.

Decentralization is the sacred cow of the crypto scene. But what the nature of this cow is and what pasture it grazes on – that is impossible to determine. You could also say we have spotted Schrödinger’s cow – or not. Fortunately, there are scientists who are trying to get down to the bare facts on this subject.

## Miners as a key metric for decentralization

In a recent paper, three computer scientists at Beijing University explore the question of whether Bitcoin or Ethereum is more decentralized. Their approach is very straightforward: “We measure the degree of decentralization of the two blockchains over the course of 2019 by calculating the distribution of mining power.”

Many of you will now be shaking your heads. Isn’t it clear by now that decentralization is defined as much – if not more – by the number of full nodes as it is by the miners? After all, what else have we been arguing about for all these years?

But let’s leave the academics their definition, which makes sense on its own terms: One “intuitively recognizes that it is indicative of a more centrally controlled – and potentially less secure – blockchain when fewer parties own the majority of resources” and can, through collaboration, amass enough power to attack the blockchain and possibly falsify past transaction data. While this doesn’t do justice to the full complexity of the issue (technical, sociological, and economic), it’s not wrong either. So it is to be applauded that researchers are trying to bring clarity to this sub-issue.

The researchers measure miner decentralization through three metrics: the Gini coefficient, the Shannon entropy, and the Nakamoto coefficient. As the period under consideration they chose blocks 556,459 to 610,690 for Bitcoin and blocks 6,988,615 to 9,193,265 for Ethereum – or, put more simply, the year 2019.

## What does each of these values mean?

The first metric is the Gini coefficient. It is relatively well-known, as it is usually used to measure inequality in nation-states. To calculate it, one calculates the so-called Lorenz curve of income distribution and measures its distance from the curve of perfectly equal distribution. If the distance is 0, everyone has exactly the same amount; if it is 1, someone has everything. To express it as a percentage, it is multiplied by 100.

### Formula for calculating the Gini coefficient
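
One common way to write the Gini coefficient for $n$ miners, where miner $i$ found $x_i$ blocks in the window under consideration, is the normalized mean absolute difference. The notation is chosen here for illustration and may differ from the one used in the paper:

$$
G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} \lvert x_i - x_j \rvert}{2n\sum_{i=1}^{n} x_i}
$$

$G = 0$ means every miner found the same number of blocks. With $n$ miners the maximum is $(n-1)/n$, which approaches 1 as one miner finds nearly all blocks among many.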

In countries such as the Czech Republic, Finland or Belgium, the Gini coefficient is below 30. In many countries, including Germany, Pakistan and Japan, it is between 30 and 35. In Latin America and the Caribbean it is mostly between 30 and 40, sometimes even above 50, and in some African countries it is above 60.

For Bitcoin and Ether, the Gini coefficient is calculated in much the same way. The population consists not of all inhabitants of a country but of the miners, and the decisive value is not income but the number of blocks found.
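
To make the mining variant concrete, here is a minimal sketch that computes the Gini coefficient from block counts per miner. The function name `gini` and the two example days are made up for illustration and are not taken from the paper's data.

```python
def gini(block_counts):
    """Gini coefficient of non-negative block counts (0 = perfectly equal)."""
    xs = sorted(block_counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one miner with at least one block")
    # Rank-weighted form; equivalent to the mean absolute difference definition.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical days of 144 Bitcoin blocks each, split across six miners.
equal_day = [24, 24, 24, 24, 24, 24]
skewed_day = [60, 40, 20, 12, 8, 4]

print(f"equal day:  {gini(equal_day):.3f}")
print(f"skewed day: {gini(skewed_day):.3f}")
```

The result is on the 0-to-1 scale; multiplying by 100 gives the percentage form used for countries.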

The second value is the entropy according to Claude Shannon’s information theory. Entropy here means the uncertainty of a piece of information, which makes redundancies necessary and therefore reduces the information density. If someone expresses themselves unclearly, they must repeat their message to prevent misunderstandings. The researchers apply Shannon’s entropy formula to mining. They rate higher entropy as an indicator of decentralization. The harder it is to guess who will find the next block, the more widely hashing power is distributed.

### Shannon’s formula for entropy
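
Applied to mining, Shannon's entropy can be written as follows, where $p_i$ is the share of blocks found by miner $i$ and $x_i$ the number of blocks that miner found. The base-2 logarithm (entropy in bits) is an assumption made here; it fits the values reported in this article, but the notation may differ from the paper's:

$$
H = -\sum_{i=1}^{n} p_i \log_2 p_i, \qquad p_i = \frac{x_i}{\sum_{j=1}^{n} x_j}
$$

$H$ is 0 when a single miner finds every block and reaches its maximum of $\log_2 n$ when all $n$ miners find blocks equally often.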

The third metric is the Nakamoto coefficient, named after Satoshi Nakamoto, the inventor of Bitcoin. It makes a concrete connection to the security of a blockchain and is defined as the minimum number of parties that must cooperate to gain 51 percent of the mining power in the entire system and thereby attack it. The more of these parties needed, the higher the decentralization.
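
The definition translates almost directly into code. The following is a minimal sketch, assuming block counts per party as input and the 51 percent threshold named above; the function name and the example numbers are illustrative only.

```python
def nakamoto_coefficient(block_counts, threshold=0.51):
    """Smallest number of parties whose combined block share reaches `threshold`."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    total = sum(block_counts)
    if total <= 0:
        raise ValueError("need at least one block")
    cumulative = 0
    # Add up the largest parties first until they jointly cross the threshold.
    for parties, blocks in enumerate(sorted(block_counts, reverse=True), start=1):
        cumulative += blocks
        if cumulative >= threshold * total:
            return parties
    return len(block_counts)


# Hypothetical windows of 144 blocks each.
spread_out = [40, 30, 25, 20, 15, 14]
dominated = [80, 30, 20, 14]

print(nakamoto_coefficient(spread_out))
print(nakamoto_coefficient(dominated))
```

In the first window the three largest parties are needed to cross 51 percent; in the second a single party already holds more than half of the blocks.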

The results are very clear: the Gini coefficient is significantly lower for Bitcoin than for Ethereum, while the Nakamoto coefficient is higher. Shannon’s entropy also confirms this verdict, albeit less clearly. However, the data has some irregularities. For example, the Gini coefficient varies widely depending on the time window chosen. If we look at individual days, it varies between 0.45 and 0.6. If we scale it to a week, it ranges between 0.6 and 0.7, and in the monthly average it even reaches 0.9 in some cases.

The reason is easy to find. First, it is due to the Gini coefficient itself. A lower resolution of the data, Wikipedia warns, makes for a lower value. Second, it is due to mining itself, in which finding a block is a random event. And anyone who has ever dealt with dice knows that with 1000 rolls the result is on average 3.5, while with 10 rolls it can sometimes be 2.3 or 4.8. In short: To work out a structure in a random series of events, you need a large sample.

- One day in Bitcoin comprises 144 blocks, too few to show the inequality that exists. Even the 1,008 blocks per week seem too few. Only the monthly scale reveals the actual inequality of mining, and with it the centralization, at least approximately.
- Ethereum is different: Here, the miners find a block every 12.5 seconds, which results in a good 6,900 blocks per day. The Gini coefficient increases significantly less on a weekly or monthly scale than it does for Bitcoin. The fact that it still does, however, shows that even the monthly perspective for Bitcoin does not quite capture the actual inequality.
- When comparing the Gini coefficient of Bitcoin and Ethereum, it would make sense to compare the daily scale on Ethereum with the monthly scale on Bitcoin. This narrows the difference, but Bitcoin remains more decentralized.
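
The window effect can be reproduced with a small simulation. The sketch below assumes an entirely made-up hashpower distribution (five pools with 15 percent each and 400 small miners sharing the remaining 25 percent) and draws block winners at random. It then compares the Gini coefficient of 30 single days with that of the same blocks taken as one month. Only miners who found at least one block in a window are counted, which is an assumption about how such measurements treat invisible miners.

```python
import random
from collections import Counter


def gini(block_counts):
    """Gini coefficient of non-negative block counts."""
    xs = sorted(block_counts)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def gini_of_window(winners):
    """Gini over the miners that found at least one block in this window."""
    return gini(list(Counter(winners).values()))


# Made-up hashpower shares: 5 pools at 15% each, 400 small miners share 25%.
shares = {f"pool{i}": 0.15 for i in range(5)}
shares.update({f"solo{i}": 0.25 / 400 for i in range(400)})

rng = random.Random(2019)  # fixed seed so the run is repeatable
blocks_per_day, days = 144, 30
month = rng.choices(list(shares), weights=list(shares.values()),
                    k=blocks_per_day * days)

daily = [gini_of_window(month[d * blocks_per_day:(d + 1) * blocks_per_day])
         for d in range(days)]
mean_daily = sum(daily) / len(daily)
monthly = gini_of_window(month)

print(f"mean daily Gini: {mean_daily:.2f}")
print(f"monthly Gini:    {monthly:.2f}")
```

The hashpower distribution never changes during the run, yet the monthly value comes out higher than the daily ones: in a short window most small miners find no block at all and simply drop out of the statistic.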

The Shannon entropy is also somewhat confusing. It fluctuates between 3.5 and 4 for Bitcoin and between 3.3 and 3.5 for Ethereum, so it seems to be slightly higher for Bitcoin in general. The higher fluctuations are again likely due to the fact that the resolution is lower for Bitcoin than Ethereum. It is curious, however, that the Shannon entropy develops similarly to the Gini coefficient over the course of a year. Since higher means more centralized for the Gini coefficient, but more decentralized for Shannon entropy, the two values should actually correlate negatively – but they don’t.
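
One way to get a feel for these entropy values is to convert them into an "effective number" of equally strong miners, $2^H$. The sketch below assumes the entropy is measured in bits (base-2 logarithm), which is not stated explicitly above; the helper names are made up for illustration.

```python
import math


def entropy_bits(shares):
    """Shannon entropy in bits of a list of block shares that sum to 1."""
    return -sum(p * math.log2(p) for p in shares if p > 0)


def effective_miners(entropy):
    """Number of equally strong miners that would produce this entropy."""
    return 2 ** entropy


# Sanity check: 16 equally strong miners give exactly 4 bits.
print(entropy_bits([1 / 16] * 16))

# Entropy ranges quoted above, converted to effective miner counts.
for label, h in [("Bitcoin low", 3.5), ("Bitcoin high", 4.0),
                 ("Ethereum low", 3.3), ("Ethereum high", 3.5)]:
    print(f"{label}: H = {h} -> about {effective_miners(h):.1f} miners")
```

Read this way, and under the base-2 assumption, Bitcoin's range corresponds to roughly 11 to 16 equally strong miners and Ethereum's to roughly 10 to 11.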

## Entropy in Ethereum

The researchers find a possible explanation for this when they take a closer look at the blocks. This explanation, by the way, undermines their entire method.

On January 14, 2019, the Gini coefficient drops to 0.34 and the Shannon entropy jumps to 6.2, both of which are extreme values that have the potential to skew the entire statistic. The cause of the anomaly lies in two blocks that contain more than 80 independent coinbase addresses. These are the addresses to which miners’ earnings are paid out. Usually, this is the address of the pool, which then forwards the earnings. On this day, one pool probably tested paying its miners directly through the coinbase transaction, and then apparently abandoned the idea.

So, given an identical distribution of hashpower, the researchers’ methodology produces completely different results when a pool pays out the proceeds directly instead of indirectly. In principle, this makes the method unsuitable to determine anything at all.
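
A toy calculation shows how strongly the attribution matters. The sketch below uses invented numbers: three pools with 60, 25 and 15 percent of 1,200 blocks. In the second scenario the largest pool pays out directly, and its blocks are credited evenly to 80 member addresses. That even split is a simplifying assumption, not the paper's procedure.

```python
import math


def entropy_bits(block_counts):
    """Shannon entropy in bits of the block distribution."""
    total = sum(block_counts)
    return -sum((x / total) * math.log2(x / total) for x in block_counts if x > 0)


def nakamoto_coefficient(block_counts, threshold=0.51):
    """Smallest number of parties whose combined share reaches `threshold`."""
    total = sum(block_counts)
    cumulative = 0
    for parties, blocks in enumerate(sorted(block_counts, reverse=True), start=1):
        cumulative += blocks
        if cumulative >= threshold * total:
            return parties
    return len(block_counts)


# Scenario 1: rewards go to three pool addresses (hypothetical numbers).
via_pool_address = [720, 300, 180]

# Scenario 2: same hashpower, but the largest pool's 720 blocks are credited
# evenly to 80 member addresses (9 blocks each).
direct_payout = [9] * 80 + [300, 180]

for label, counts in [("pool address", via_pool_address),
                      ("direct payout", direct_payout)]:
    print(f"{label}: entropy = {entropy_bits(counts):.2f} bits, "
          f"Nakamoto coefficient = {nakamoto_coefficient(counts)}")
```

Nothing about who controls the hashpower has changed between the two scenarios, yet both metrics shift sharply toward "more decentralized".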

The Nakamoto coefficient, finally, is more consistent. For Bitcoin it stays mostly at 4, largely independent of the time scale, and occasionally rises to 5. For Ethereum, on the other hand, it ranges very consistently between 2 and 3. So the clear result is that on Ethereum only 2 to 3 parties need to ally to mount a 51 percent attack, while on Bitcoin it takes (after all) 4 to 5.

### No robust results

The paper shows rather weak values for decentralization in all metrics for both Bitcoin and Ethereum. The Gini coefficient indicates extreme inequality, and the Nakamoto coefficient indicates an extremely small group of entities that would need to collaborate to attack the system.

This finding suffers, however, from the fact that the researchers equate pools with miners. Once a pool pays the block reward directly to the miners involved, the Gini coefficient drops to 0.34 and the Nakamoto coefficient rises to more than 35, the highest value ever observed. Both values indicate a very high degree of decentralization. The data basis is far too thin to draw any serious conclusions from this. What is clear, though, is that the measurement made here is in no way suitable for depicting the miners involved.

For example, on Ethereum, 2-3 pools can cooperate to launch a 51 percent attack. But as soon as the miners whose hashpower the pools are pooling withdraw it, the attack is over and the pool is ruined. To do even short-term damage, the pools rely on numerous parties who, even if they do not join in, at least remain inattentive. This also makes the Nakamoto coefficient rather meaningless at this point.

If the paper shows anything, it is how difficult, if not impossible, it is to map even one aspect of decentralization – mining – into precise metrics. This is interesting, but will do little to answer the question of decentralization.