Cryptography in Go: AES explained

Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can’t break.
—Bruce Schneier, “Memo to the Amateur Cipher Designer”

This is the first in a three-part series about cryptography in Go using the AES cipher.

  1. AES explained
  2. AES implementation (coming soon)
  3. AES encryption (coming soon)

You should never try to invent your own cipher, or, for that matter, a perpetual motion machine. In both cases the results are likely to be disappointing, and it’s not a good use of your time.

Nevertheless, in my book Explore Go: Cryptography, we’ll learn all about how ciphers work, or don’t, principally by trying to invent them ourselves. That’s the best way to learn anything, actually: by trying to do it yourself.

But now you might be wondering what the upshot of it all is, in practical terms. I mean, if you need to encrypt stuff in a Go program in real life, you definitely shouldn’t use a simple cipher that you invented yourself. What should you do, then?

That’s easy, and I can give you the answer in three words. Just use AES.

In the first post of this series, we’ll talk about where AES comes from, and how it works, at a high level. Next, we’ll take a deep dive into the Go implementation in the crypto/aes package. Finally, we’ll see how to use AES for encryption, decryption, and authentication in our own Go programs.

Ready? Buckle up, we’re off!

A little history

The initials “AES” stand for “Advanced Encryption Standard”, as perhaps you know. To explain what’s so advanced about it, we’ll need to say a word or two about the history of previous encryption standards.

In the beginning was the Caesar cipher (actually, he stole the idea from the ancient Egyptians). Fast-forward through two thousand years of fascinating cryptographic developments, which I urge you to look into after class, including Vigenère ciphers, the Enigma machine, and one-time pads. It’s fun stuff, but we can’t stop to talk about it: we’ve already arrived in 1973. Sorry about the bumpy landing.

The origins of DES

It makes a lot of sense for a government, or any other big organisation with secrets, to have an official standard cipher. The United States government is pretty big, and has a lot of secrets (not always terribly well-kept), so in 1973 it invited proposals for just such a national standard encryption system.

The result was the Data Encryption Standard (DES). DES uses a key size of 56 bits. Yes, that sounds rather small, especially to those who’ve read my book, and so it is. The restricted key size was imposed by the National Security Agency (NSA), the US cryptographic intelligence bureau, for predictable reasons: they didn’t want anyone using a cipher that they themselves couldn’t break.

You’d think governments would be in favour of encryption that’s as strong as possible in order to keep their own secrets safe, and so they are, up to a point. The problem is that they also want to know other governments’ secrets. The NSA’s preferred cipher, then, would have a key short enough that they can break it, with their tremendous resources, but also long enough that other, less well-equipped spy agencies can’t.

The chosen length of 56 bits thus tells us something about the brute-force cracking abilities of the NSA in the early 1970s, and by extension about those of its rivals (both foreign and domestic).

However, the 1970s were a long time ago, as I’m increasingly aware, and as an old cryptographer’s adage has it, “Attacks always get better; they never get worse.”

Tripling down on DES

So, by the mid-1980s, concern about practical brute-force attacks on DES had become widespread, and the quick-fix solution, for really important secrets, was to use “Triple DES”. As the name suggests, it simply consists of applying DES three times.

This doesn’t sound all that promising on the face of it: three weak passwords are not much better than one, for example. It’s clearly a case of kicking the can down the road, but governments have a bad habit of doing just that, as you may have noticed.

Something new would soon be needed.

AES

By the late 1990s, brute-forcing a DES key within a few days was possible on a machine costing a couple of hundred thousand dollars. This is cheap enough, relatively speaking, to make any government nervous. Once DES had fallen, indeed, could Triple DES be far behind? So proposals were invited for a new standard, to be called the Advanced Encryption Standard (“This time, we’ll get it right!”).

Rijndael

The cipher selected was one named Rijndael, after its inventors Vincent Rijmen and Joan Daemen. While Rijndael can work with a range of block sizes, the AES standard itself specifies a fixed block size of 128 bits (16 bytes). Similarly, while the underlying algorithm supports many different key sizes, AES limits the selection to either 128, 192, or 256 bits.

We won’t bother with the shorter key sizes (nor should anyone else), so let’s assume from now on that we’re talking only about AES-256, which is to say Rijndael using 256-bit keys. Rijndael is a very strong and sophisticated cipher, so using it with short keys would be silly, like owning a Ferrari but driving it everywhere in first gear.

Now, you don’t actually need to know anything about how AES works in order to use it, as long as you follow the usual good cryptographic practices like choosing secure keys, and so on. But you’re here, and it’s interesting, so let’s take a quick overview of the internal structure of AES, at a functional level.

All block ciphers work the same way, in principle, and AES is no exception: it operates on a single block of plaintext data at a time, combining it with a key to produce the corresponding block of ciphertext.

And, just as with any other block cipher, AES can be used in a variety of operating modes, some of which you’ll know about if you’ve read Explore Go: Cryptography: ECB (bad), CBC (good), and so on.

We’ll talk more about operating modes for AES in a later post, but let’s first see how the key is combined with the plaintext, specifically. That’s the clever bit, of course, so it might take a little explaining.

The starting grid

As we’ve seen, AES uses a block size of 16 bytes, which is a nice round number, or rather a nice square number: 4 × 4.

That’s not a coincidence, because we’re going to arrange the 16 bytes of block data in a 4 × 4 grid. Instead of writing them in rows, left to right, we’ll write them top to bottom, in so-called column-major order, like this:

01  05  09  13  
02  06  10  14
03  07  11  15
04  08  12  16

Why use a grid, you ask? Well, it’s a bit like one of those puzzle games where you have to shuffle the tiles around to form a sentence or reveal a picture. We’re going to shuffle the rows and columns around in a similar way to obscure the plaintext before encrypting it.

In any block cipher, the plaintext block is encrypted by combining it with the key in some way. AES does this, too, but not just once: it repeats the procedure many times, over multiple successive rounds.

Round and round

In AES-256, there are fourteen rounds, and before we even start encrypting any blocks, we expand the cipher key into a series of separate 128-bit round keys: one for an initial key-mixing step, plus one for each round. All these round keys are derived from the 256-bit cipher key itself, by a complicated mathematical process (the “key schedule”) we’ll glide smoothly over here.

Once we have the round keys, we can begin encrypting. The initial step is the simplest: we combine each byte of the plaintext block with the corresponding byte of the first round key (let’s call this phase “AddRoundKey”).

Despite the name, rather than adding the two bytes together directly, AES uses the XOR operation. Just like addition, this is a fast, simple, reversible way of combining two bytes. As a bonus, it’s very easily implemented in hardware.

The output grid from this initial step now becomes the input to the first round proper, and so on. In each round before the last, we’ll again combine the data with the corresponding round key, but we also perform three extra steps beforehand (I did warn you it was complicated).

Confusion and diffusion

In the first of these phases, “SubBytes”, we substitute every byte of the block data with a different byte, obtained from a lookup table called the S-box. This is just a fixed table with 256 entries, mapping every possible input byte to some different output byte. All AES implementations use the same S-box table, and it’s published, so the enemy has it too, but that’s okay. Its job isn’t secrecy: it’s to introduce non-linearity, making it much harder to work backwards from the ciphertext to the key.

The next phase is called “ShiftRows”: we shift the rows horizontally, like the sliding tile puzzle. We leave the first row as it is, then we shift the second row by one place to the left, the third row by two places, and the last by three places. The overall effect of this is to make sure that each column in the output contains a byte from each of the four input columns.

Why do this? Good ciphers obscure the plaintext in two main ways, known in the trade as confusion and diffusion. Confusion involves changing the value of each byte, as in the previous SubBytes phase.

Diffusion, on the other hand, involves moving those bytes around within the block. The point, again, is to make it even harder for Eve—a hypothetical eavesdropper on the conversation—to work backwards from the ciphertext to the plaintext.

It’s just like the simple “rail fence” cipher you might have learned as a kid. For example, we might arrange the phrase “ATTACK AT DAWN” in a column-major grid as follows:

A C T W
T K . N
T . D .
A A A .

The resulting ciphertext, reading off each row in turn, is “ACTW TK.N T.D. AAA.”, and all the recipient has to do is write it out in a similar 4 × 4 grid to read off the plaintext in columns. Diffusion on its own, of course, isn’t terribly secure, which is why it needs to be combined with confusion to produce an effective cipher.

The third phase of AES encryption, “MixColumns”, is also a diffusion step, but operating on columns instead of rows. It applies a matrix transformation to each column such that each byte in the output column depends on all four input bytes of that column. Again, the details don’t matter for this discussion, but you can take it for granted that they’re mathemagical.

Putting it all together

Finally, we apply the round key, and the whole process is then repeated in the next round. This continues until the fourteenth and final round, which is the same as the others except for the omission of MixColumns. So here’s the recipe:

  • Initial step: AddRoundKey
  • Rounds 1-13: SubBytes, ShiftRows, MixColumns, AddRoundKey
  • Round 14: SubBytes, ShiftRows, AddRoundKey

And that’s it. It sounds a bit complicated, and of course cryptographically speaking it is: that’s the point. If you really want to dive into the details, read Jeff Moser’s wonderful Stick Figure Guide to AES.

However, complicated as each of these phases is, they can all be implemented very simply using pre-generated lookup tables. This is tremendously efficient, and it doesn’t take much code. So, in practice, AES implementations can be remarkably concise, straightforward, and easy to read.

To prove that point, in the next post, we’ll take a look at the implementation in Go’s crypto/aes package. See you then.
