Dec 8

Dec 8 When doomed stubs attack: blockchain voting and proof of work

Pizza or ice cream? It’s a question that we as a species have grappled with since the dawn of time, and I think we’d all agree that the only sensible answer is “Both!” If you have the means, of course.

In Bobcoin, blockchains, and cryptocurrency we ran into a related problem with our otherwise delightful digital currency scheme: what’s to stop Sam the scammer from spending the same Bobcoin twice, once on pizza, and again on ice cream?

Digests and hashing

One way is to use a mathematical function to compute a kind of unique fingerprint of each transaction. The function takes a block of Bobcoin data, such as the one containing sketchy Sam’s pizza purchase, and boils its bits down to a single value. Anyone in the Bobcoin network can use the same function independently to compute the fingerprint value, called a digest of the block.

As the name suggests, a digest is usually shorter than the data it summarises. In fact, a digest usually has some fixed length—for example, 256 bits—regardless of the size of the message it represents.

The problem of taking some arbitrarily long string of bits (a message), and turning it into some representative fixed-length string is called hashing, as in “corned-beef hash”, or “making a hash of things”.

My book Explore Go: Cryptography is all about such mathematical magic, and in it you’ll learn how to keep your messages secret, how to protect your digital currency from scammers, and how to be truly random. We’ll implement lots of cryptographic code in Go together, including various kinds of hash functions, and see just how effective they are (or aren’t).

The specific hash function that we use doesn’t matter here, as long as it satisfies some key properties. The most important of these is that, while it’s straightforward to compute the hash of a message, it should be very difficult to reconstruct the original message if you only have the hash. Hash functions are sometimes called “one-way” functions for this reason.

The blockchain

And that’s what we use to stymie Sam. When each node—each computer in the Bobcoin network—has received a block’s worth of transactions, it starts trying to compute the next block. It hashes the contents of the previous block together with the transaction data, and the result is the next block, which it then sends out to the network.

Now we have a way for nodes to verify the order of blocks: by verifying their hashes. Since the hash of any block depends uniquely (or very nearly uniquely) on the previous block, a given set of blocks can only be verified in their true order.

This means we can order all transactions. When Sam spends their Bobcoin on Pat’s pizza, and then tries to spend it again on Irene’s ice cream, Irene can detect that double spending and reject the transaction. Success!

Well, nearly. There’s still a problem: what if two nodes independently compute and send out different versions of the “next” block? Won’t the chain then split into two independently-evolving versions?

To avoid that, a kind of consensus voting takes place. If a node receives two different candidates for the “current” block, it keeps both of them until it can decide which is the authoritative block.

Doomed stubs

How can the node decide which block wins this race? Well, sooner or later it’ll receive another block from the rest of the network. This will be “chained” to one or other of the two blocks.

In other words, the new block will be verifiable only against the hash of either Block A or Block B. Whichever it is, that’s what the network considers to be the authoritative block, and the node can safely discard the other one.

This doomed “stub” version of the chain might still grow for another block or two, depending on the speed of propagation, but eventually it will be discarded. When faced with two competing versions of the chain, nodes will always opt for the longer one, because it represents the majority opinion that this is the “correct” chain. More work has gone into it, so the network votes with its feet—or rather, with its hashes.

Verifying a given transaction, then, is simply a matter of looking at the block that contains it. The more valid blocks that follow this one, the more confident we can be that the transaction has been accepted by other nodes.

Since hashing takes some non-zero amount of computing power, it’s in a node’s interest not to waste time, heat, and money by computing hashes for a doomed stub. Thus, the network rapidly converges on a single authoritative ordering of Bobcoin transactions.

To repay nodes for their hard work in keeping the blockchain going, the node that successfully creates the next block earns a small fee, made up of the leftover fractional bits of Bobcoin from each transaction. Each node’s version of the next block includes a transaction in which this fee is paid to their account. Thus, their reward for winning the race is to have their fee-earning transaction validated by the rest of the network.

Integrity

In a decentralised scheme like this where each node marks its own homework, isn’t there a danger of fraud?

Surely all a malicious node has to do is compute the next block, based on a bogus transaction (“Alice pays Mallory one million Bobcoin”, let’s say), and then keep this stub chain going by computing all future blocks based on it (plus the subsequent genuine transactions). What’s to prevent this?

First, transactions are signed by the person who makes them: only Alice can create a transaction where Alice pays money to someone else, and anyone can independently verify it.

Second, hashing is hard work. It costs some amount of money to compute the hash of each block, so the longer this “fake” chain is kept up, the more it costs Mallory. Eventually it’ll cost them more than they could steal with the fake transaction.

In effect, Mallory is fighting the combined hashing power of all the other nodes in the network, which are collaborating to keep the genuine chain going.

The genuine chain will rapidly become longer than the fake one, because it simply has more hashing power behind it. Very soon, all nodes will discard the shorter chain as invalid, and Mallory’s efforts will be for nothing.

Sure, Mallory could recruit other nodes into their Bobcoin-stealing gang, but they’ll still be defeated by the honest nodes. Only if Mallory can commandeer more than half of the total hashrate in the network can they hope to subvert the blockchain.

Such a 51% attack is expensive and infeasible for large networks, which means that the more nodes join a given blockchain network, the more secure it is: safety in numbers.

Proof of work

A third way in which malicious blocks are discouraged is to make the hashing process as slow and expensive as possible. Usually, we want hashing to be fast and efficient, but in this case the point is that it should take a lot of time and effort.

In the case of the real-life Bitcoin network, for example, the hashing problem is made harder by requiring miners—nodes that want to produce blocks—to produce a hash value within a specific range.

For example, suppose you’re given some transaction data and asked to hash it so that the 256-bit hash value is less than, say, a million. Of course you can only do this by mixing in some extra data of your own—some nonce—to produce the required value. And since you can’t reverse the SHA-256 algorithm to find the input number you need (it’s preimage-resistant), the only way you can do it is by guessing.

It’s like a kind of benign brute-force attack, or a version of the “guess the number I’m thinking of” game for extremely patient players. You just keep guessing a different value for the nonce and hashing the block until, by pure chance, your hash value comes out as some number less than 1,000,000.

That might take a while: given the size of the 256-bit hashspace, your odds of producing an acceptable hash are about one in 10⁷¹ for each attempt. It therefore costs quite a bit of money just to produce each block, making attempted fraud less attractive. Mining fees are structured so as to make it more rewarding to mine genuine blocks than fake ones.

In Bitcoin, the required difficulty target is a little easier than our example: at the time of writing, the chances of producing an acceptable hash are currently about one in 10⁶⁴. The difficulty target gradually increases over time, to keep pace with the expected increase in computing power.

Up in smoke

If I tell you that, despite this astronomical difficulty, a new Bitcoin block is mined, on average, about every ten minutes, that gives you some idea of the enormous computing resources devoted to the network.

Tens of thousands of Bitcoin nodes are all hashing as fast as they can, five hundred billion billion hashes a second, trying to guess a nonce that will produce a valid hash, and only one node can win that race. All the work done by the losing nodes is wasted, and simply goes up in smoke. Literally, since most of the electricity they use comes from burning fossil fuels.

This proof of work scheme, then, depends on wasting a massive amount of precious energy. Bitcoin alone consumes over 120 terawatt-hours of electricity every year, which is more than many countries.

In effect, Bitcoin adds another energy-hungry country to the world, and we don’t really need another one of those.

It seems a shame that the most powerful distributed supercomputer in human history is being wasted on polluting our atmosphere and heating up the planet by solving math problems that are totally useless to humanity.

A stake in the future

Proof of work, then, is perhaps the worst idea ever, but fortunately there are some alternatives in prospect.

One is so-called proof of stake. In this scheme, rather than the most trusted nodes being the ones that have burnt up the most energy, they are simply the ones who have the most money. These nodes, the reasoning goes, have the biggest stake in the network, so they’re least inclined to undermine it.

That’s true to some extent, but those who find the democratising nature of cryptocurrency appealing may think that this is just another way of rigging the system in favour of its richest participants. And we have that already: it’s called “capitalism”.

Clearly we need a fair, efficient way of preventing fraud and double-spending that doesn’t literally set fire to the planet, and we don’t have that at the moment. Maybe someone will have a bright idea about this soon. Maybe it’ll be you.