Cryptography’s inner workings have long been regarded as exclusive to the realms of experts and mathematicians, its technicalities being largely attributed to magic. Given the complicated nature of the state of modern cryptography, this is understandable. However, through many of the global movements that stem from a lack of understanding on the subject such as the UK’s proposed Encryption Ban and Australia’s Assistance and Access Bill, it is clear that this approach is doing far more harm than good.
In this guide, I will give a crash course in everything that you might need to know to get started in understanding Cryptography. I will give a rundown on the history of various cryptographic systems, and give a crash course on the three most prevalent areas of cryptography: Stream Ciphers, Block Ciphers and Public Key Cryptography.
Ciphers are the cornerstone of cryptography. A cipher is a set of algorithms that performs encryption or decryption on a message. An encryption algorithm (E) takes a secret key (k) and a message (m), and produces a ciphertext (c). Similarly, a Decryption algorithm (D) takes a secret key (K) and the previous resulting Ciphertext (C). They be represented as follows:
This also means that in order for it to be a cipher, it must satisfy the consistency equation as follows, making it possible to decrypt.
This just means that if you encrypt a message with the key K, you would get the exact same message back by decrypting it with K as well.
One of the oldest and simplest ciphers is the Caesar Cipher, which simply replaces each letter in a message with a letter some fixed number of positions down the alphabet.
In the following case, the message has been shifted three characters up:
This cipher can be represented mathematically as follows:
While this meets our definitions of a cipher, it is hardly secure. If an attacker knows that this cipher has been used, they could just try all 25 combinations. Even if they were not sure what cipher has been used, they could figure this out by noticing the pattern of letters in the ciphertext.
To move onto some more secure encryption algorithms, we will have to discuss an operator known as Xor.
An XOR, or ‘Exclusive Or’ gate is a boolean digital logic gate that takes two inputs of 1 or 0, and returns 1 if both inputs are different, or 0 if both are the same. It can be represented with the following truth table covering all possible inputs and outputs:
This operator is often represented with the ⊕ sign. In this notation:
0 ⊕ 0 = 0,
0 ⊕ 1 = 1,
1 ⊕ 0 = 1, and
1 ⊕ 1 = 0.
There are several important properties that apply to Xor.
- You can apply them in any order: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c
- Anything XOR’d with itself is 0: a ⊕ a = 0
- Anything XOR with 0 is itself: a ⊕ 0 = a
This implies that a ⊕ b ⊕ a = b, as this equals a ⊕ a ⊕ b which equals 0 ⊕ b which equals b, according to the laws stated above. It’s important to note that this only works on ones and zeroes, so numbers of a different radix must be first be converted to binary.
87 ⊕ 73 = 1010111b ⊕ 1001001b = 0011110b = 30
Now we can move on to our first secure cipher.
Described by Frank Miller in 1882, a One-time pad (also known as a Vernam cipher) works by performing an XOR with a message and a private key, and then using XOR on the private key and the ciphertext to re-obtain the message. This can be done owing to the property described above, that a ⊕ b ⊕ a = b. The pair of algorithms that define the One-time pad can be represented as follows:
The consistency equation for this pair of ciphers can be easily proven as follows:
We’ll do a simple example to show how easy one-time pads can be to use. Say we want to encrypt the word “Message”. The first step would be to convert the message into binary (just ones and zeroes.). We can do this by converting each letter according to the ASCII character set.
We now need 56 random bits to make a key to XOR the clear-text with. It is important that the private key is as random as possible.
We then XOR each of the bits in either message with each-other.
The resulting set of bits is our ciphertext. To decrypt the ciphertext, we simply need to do the same operation, but this time XORing the ciphertext with the same randomly generated key, and then converting it back to ASCII.
This cipher is very simple to use and understand, but it also has another interesting feature about it. The One-time pad has what is known as perfect secrecy, essentially meaning that if an attacker only knew the ciphertext (the result of m⊕k), it is mathematically impossible to know anything about the plain-text message, and it is completely impossible to crack.
So, we have a cipher that is easy to understand and even do by hand, but mathematically impossible to break. Why would we ever want to use anything else? The reason being that while One-time pads are effective and easy to use, they have a number of important downfalls.
The first major issue is that for any message being sent, the private key must be as long as, or longer than the clear-text message. For the recipient to be able to decrypt said message, you must have a way to securely provide them with the private key. If this is the case, then it would simply be easier to transfer the original message across the secure channel.
This problem is exacerbated with the second major drawback which lies within the name. The private key of a One-time pad may only be used one time, meaning that each message being encrypted must use a unique and random key. The major weakness that comes from encrypting multiple messages with the same private key can be demonstrated mathematically quite easily.
Lets say we have two messages, m1 and m2, which we will both encrypt using the same key, K (making our system a two-time pad). We can get the following by XORing both cipher texts together.
From this, we get m1⊕m2. Now an attacker can perform various forms of analysis on the resulting text, such as statistical analysis, frequency analysis, pattern matching, or using a natural language approach as described in this 2006 paper. I won’t get fully into explaining why this is so insecure, but for those interested, this answer here explains it graphically. The more times the same key is used (E.g, a three-time pad or a four-time pad) it is intuitive that the less secure it is.
Now that we’ve gone over the basics of XOR encryption and One-time pads, we can move onto a more practical form of encryption.
The most important takeaway from One-time pads is that they have perfect secrecy, meaning that there are no possible attacks if an attacker is only given the Cipher text. However, having perfect secrecy means that the key length must be equal or greater than the message length. This makes the One time pad impractical because if two parties have a way to secretly agree on a key to encrypt a message, they may as well use that mechanism to transmit the message.
To make the One-time pad more practical, we introduce ‘Stream Ciphers’. The key idea behind a Stream Cipher is to replace the ‘random’ key in a One-time pad with a ‘pseudorandom’ key, meaning that the message is XORed with a key produced with a Cryptographically Secure Pseudo-random number generator or CSPRNG. (Note that this differs from a Pseudo-random number generator, in that data generated from a CSPRNG must be indistinguishable from true randomness).
A CSPRNG is simply an algorithm (or a function) that generates large sequences of numbers, approximating the properties of random numbers. Because random numbers are difficult to generate, CSPRNGs rely on a seed to determine the starting state as well as the numbers generated in the future. They allow for a disproportionately large amount of random numbers to be generated from a comparatively small starting seed (e.g, a 128 bit seed generating gigabytes of random data). If the starting seed is known, then all the subsequent numbers generated are known, meaning that a CSPRNG is deterministic. Because of this, the limit of how random the generated numbers are depends on how random the starting seed is.
To make one time pads practical, we can replace the private key with the desired length of output from a pseudo random number generator, and treat the starting seed as the new private key. Because a CPRNG is deterministic, using the same starting seed the same output will be given.
To clarify, while this is a traditional One-time pad:
Given a Pseudorandom number generator G(K), we replace K as follows:
(Note that this example is only of one type of stream cipher, known as a Synchronous stream cipher. Another related approach is a Self-synchronizing stream cipher, in which several of the previous digits in the ciphertext are used to compute each digit of the encrypting key.)
This is far more practical than the one time pad for a number of reasons, namely that the private key can be much shorter than the message being encrypted, making distribution more manageable. However, this different approach comes with a downside.
By switching the private key for the output of a secure number generator our cipher no longer has perfect secrecy, since the key is shorter than the message. Because of this, how secure a stream cipher is now relies upon how unpredictable our pseudorandom number generator is. If the output of a CSPRNG can be predicted, then the cleartext message can be obtained. Here are a few well known cryptosystems with weak stream ciphers:
802.11b WEP: WEP is an algorithm for encrypting data across WiFi, which used a stream cipher called RC4. Because the same key can not be used more than once in a stream cipher, the long-term secret key was concatenated with a value ‘IV’ which would change every time. However, ‘IV’ was only 24 bits long, meaning that after 5000 messages were encrypted, there is a 50% chance that the same key would be used.
CSS: The Content Scramble System was used by the DVD Forum as a form of Digital Rights Management which would encrypt DVDs, restricting access to the content to only be used in licensed applications. CSS used a 40 bit key which could be brute-forced relatively quickly, owing to the small keyspace of the system. (Although the keys were 40 bits, the system could be cracked after only generating 17 bit combinations owing to the technicalities of the CSPRNG.)
Now that we’ve covered stream ciphers, we can move onto another cryptosystem known as Block Ciphers.
Block Ciphers are another way to encrypt and decrypt data. A Block Cipher consists of two algorithms, E and D for encryption and decryption, which both take a secret key K.
The primary point behind a Block Cipher is that it the length of the input text and the resulting ciphertext is always going to be an equal, fixed amount. This amount is known as the Block Size, and is dependent on which Block Cipher is being used. Additionally, the length of the private key K is known as the Key Size and this is also a fixed amount. Two of the common block ciphers are 3DES which has a Block Size of 64 bits and a Key Size of 168 bits, and AES which has a Block Size of 128 bits and a Key Size of 128, 192 or 256 bits.
Block Ciphers are also known as Keyed Permutations, or Pseudorandom Permutations, as they map every possible block to some other block. It’s important that it is Keyed, as the private key determines which input block maps to its corresponding ciphertext block. Because this is a one-to-one permutation, the ciphertext can be decrypted if the key is known.
The first notable block cipher was the Data Encryption Standard (DES) which was developed during the 1970s at IBM, however it was quickly found to be insecure and replaced with 3DES, but this was soon replaced with the Advanced Encryption Standard (AES), which was developed in 1997 following a call by the National Institute of Standards and Technology for a new standardized block cipher. I’ll be focusing on AES because it is the most common Block Cipher used today, as DES and 3DES are quite weak in comparison.
Lets now look at an overview of how AES works. Note that for simplicity’s sake i’m going to be skipping over lots of the technical details. For those interested, the lecture notes here go very in depth.
AES, and most other Block Ciphers work through iteration, where input text will be iteratively encrypted with a series of keys. The first step is to take a Private Key K as input which is typically 128, 192 or 256 bits (We will just focus on 128 bit AES), and expand it to a series of Round Keys to encrypt our message with.
In this case, we take our 128 bit (16 byte) input key and expand it into 11 seperate 16 byte keys through a Key Expansion function known as a Rijndael key schedule. AES now uses each of these keys by encrypting the message 10 times by passing it through 10 seperate Round Functions R(kₙ, m) which takes a round key kₙ and the message state m as input.
Because AES works on 128 bit blocks, we can represent the input message m as a 4x4 matrix of one-byte cells. We also represent each Round Key as a 4x4 matrix, so that they can be XORed with the cell representing the message state.
Firstly, the input message is XORed with the first Round Key, and then the resulting message state is passed through an function with the steps
MixColumns to update the state of the message (these steps will be explained in a bit). We then repeat these steps 10 times with each of the Round Keys, the only difference being that the
MixColumn step is missing from the final round. The final state is then XORed with the 10th round key to produce the output. Here’s a quick overview of each of the three steps involved in a Round:
ByteSub: Each byte in the message state matrix is replaced by its corresponding byte as defined in a Substitution Box shown here:
For example, the byte
9a would be replaced with
ShiftRow: Each row is shifted a certain amount. The first row is not shifted, the second row is shifted left once, the third row is shifted left twice, and the fourth row is shifted left three times.
MixColumns: A linear transformation is applied to each column in the matrix representing the message state
From here, we can now start to use AES to encrypt data. However, you may be quick to realise one very visible limitation, you cannot use AES to encrypt more than 128 bits of information (or 16 bytes) at a time. To encrypt more than 16 bytes, we need to introduce a concept known as Modes of operation.
Modes of operation
There are a variety of methods (or modes of operation) for encrypting more than 128 bits at once. The simplest of such is known as Electronic Codebook, or ECB.
ECB is the simplest mode of operation for encrypting large amounts of data with block ciphers. ECB simply divides a message into 16 bit blocks, and uses the private key to encrypt each of these blocks separately. The plaintext may have to be padded in order to align it with 16 byte boundaries. For example, a 28 byte message would need to have an additional four bytes added to it to make two blocks of 16 bytes. After the message has been divided into its blocks, each block is encrypted individually with the secret key.
While this is quite simple to understand, it comes with a significant downside. Because identical plaintext blocks will produce identical ciphertext blocks, patterns are easy to see. For example: look at the following bitmap image encrypted in ECB mode.
Areas of uniform information are still identical after encryption, and so general patterns can still be discerned. To protect against this, we need to use a more secure mode of operation.
Cipher Block Chaining, or CBC, works very similarly to ECB with a small difference. Before each block of plaintext is encrypted, it is first XORed with the ciphertext of the block preceding it to make each plaintext block unique. For the first block, the plaintext is first XORed with a randomly generated Initialisation Vector.
These are the two main modes of operation that we’ll go over, but there are other ones that you can read up about here if you’re interested.
That’s about all you need to know to get started with Block Ciphers, so now we’re going to move onto our last important area in Cryptography.
Public Key Cryptography
All of the encryption techniques that you have learnt up until this point are what’s known as Symmetric Encryption, meaning that only one key is used to encrypt and decrypt data. Public Key Cryptography, also known as Asymmetric Encryption instead uses Key pairs, consisting of ‘Public Keys’ and ‘Private Keys’, allowing for communication across an untrusted channel. This will be explained soon.
Public Key Cryptography is often used to encrypt communications between servers or machines, commonly being used in sending emails, and web browsing. However for the sake of simplicity, we will use the names ‘Alice’ and ‘Bob’ as placeholders for users
Here we have the two users, Alice and Bob, sending and receiving messages with one another. Here we will stick to the analogy of two people exchanging messages, but this can be viewed as a user accessing a website, email servers exchanging messages, or any other secure exchange.
This situation works well, Alice and Bob are able to communicate with each other freely. However, lets say that there were an eavesdropper listening in on the conversation. This eavesdropper can listen in on the messages, but cannot interfere with the data.
Here, all of the data being sent can be seen by the attacker, but it cannot be tampered with. In this situation, Alice and Bob are unable to communicate sensitive or confidential information, as it is all known by the Attacker.
They can try to use Symmetric Encryption, as we have been discussing up until this point, but to no avail.
This is next to useless, as the Attacker can simply intercept the Key, and use that to decrypt the message. Alice and Bob can get around this by using Public Key Cryptography, but now we have to face the question of what this actually is.
What is Public Key Cryptography?
Public Key Cryptography works by having users each create a pair of keys: a Public Key, and a Private Key. The Public Key may be spread widely and openly distributed, while the Private Key is known only by the owner. In this system, any person can encrypt a message using the the Public Key of the receiver, but the message can only be decrypted by the Private Key of the recipient.
Because of this, if Bob wanted to send an encrypted message to Alice, he would encrypt it using Alice’s Public Key before sending. Alice could then decrypt the message using her private key. (We’ll get into the technicalities of how this works soon).
One of the reasons that making a switch to Asymmetric Encryption is useful, is that the keys used to decrypt a message are always kept private, and never have to be distributed across a channel for communication. As you can see here, using this technique solves a part of the eavesdropper problem we stated before:
If the attacker were to intercept all the traffic being sent between Alice and Bob, they would not be able to decrypt any information being sent because they do not have access to any of the private keys for decryption.
While this good, it simply leads onto another major security threat. Until this point, we’ve assumed that an attacker can listen in to conversations, but cannot modify the data being sent. However if the attacker in our threat model can alter messages in transit, they could still commit a Man in the Middle Attack. If Alice and Bob are using regular mail in communications, an attacker might work at the post office. Or when using a computer network, an attacker might work for an ISP, or have access to the networking equipment being used.
In this situation, the attacker will store Alice’s Public key and pass on their own Public key to Bob. When Bob encrypts his message intended for Alice, he actually sends it to the Attacker, who will be able to read the message before sending it along to Alice, possible after altering it.
This attack is possible because of two major flaws. A lack of Integrity, and a lack of Authenticity, meaning that there is no way to check that a message has not been interfered with, and that there is no way to verify that a message was sent by the genuine person. These vulnerabilities can be addressed with both Digital Signatures and Certificate Authorities.
A digital signature is a way to ensure that the contents of a message have not been altered and that the message was created by the right person. As stated before, a message encrypted with a person’s Public Key can only be decrypted with their Private Key, but the opposite is also true. A message encrypted with a person’s Private Key can only be decrypted with their Public Key, and so if Alice were to encrypt a message with her Private Key before sending it to Bob, Bob can then decrypt the message with her Public Key, thus confirming that the message was created by someone with access to Alice’s Private Key. This process is known as ‘Signing’.
While this can be done by signing the entire message, it is often more efficient and simpler to sign a Cryptographic Hash of the message and include it in the message body, proving that the message has not changed since it was created.
(Note that this is a massive oversimplification. While the idea of “encrypting with the private key” is approximately the case with RSA, other systems such as Diffie-Hellman or ECC use completely separate Signing algorithms such as ECDSA)
However, this only pushes our problem back slightly. Bob can now securely verify that the message he received was made with the private key of the person he is communicating with and that it has not been tampered with. But, he still cannot verify that the Public Key that he has received truly belongs to Alice. To do this, we need to introduce Certificate Authorities.
A Certificate Authority (or CA) is a trusted third party that will digitally sign and publish the public key bound to a user or entity.
This is done using the Certificate Authority’s own private key, so any person can verify that a certificate was issued in the name of the trusted Certificate Authority. Because of this, trust in another user relies upon the validity of the CA’s key.
In practice, the most common use of these certificates is in securing web traffic. When you navigate to a website that uses HTTPS, you can view the digital certificate in the browser.
This system relies upon trust in the Certificate Authority issuing the Certificate, devices and computers will come with a pre-installed list of trusted Certificate Authorities and their Public Keys.
This article is long enough already so I won’t be going over the technicalities of how the internals of Public Key Cryptosystems work. For those interested, this article explains the internals of RSA, a system that provides both secrecy and signing.
One significant difference between Stream/Block Ciphers and Public Key crypto is that while Stream/Block Ciphers work by doing operations on the individuals bits and bytes of a message, Public Key crypto simply works with numbers. This means that any messages would have to be converted into a number before being encrypted. Because of this, Public Key Crypto is quite inefficient at encrypting large messages. To make this more efficient for large messages, we can combine Symmetric and Asymmetric encryption together, in order to efficiently encrypt large amounts of data with a recipient’s Public Key.
Here is one such method which is used by PGP. The message is encrypted with a Randomly generated key by a Block Cipher, and then that smaller key is encrypted with the receiver’s Public Key. The Ciphertext message is then concatenated with the encrypted random key.
This is far more quick and efficient because Block Ciphers are much faster at encrypting large amounts of data, while an asymmetric cipher is used to encrypt the smaller key which is needed for decryption. This still works as a Public Key Cryptosystem, as the key for decryption can only be obtained by the owner of the Private Key.
Where to from here?
Thank you and good job for making it this far, this concludes the article. From here you should have enough of an understanding of Stream Ciphers, Block Ciphers and Public Key Cryptography to appreciate the modern state of encryption and learn more going forward.
For those who want to learn more, I highly recommend completing Stanford’s online Cryptography I Course which gives a great technical and theoretical overview on cryptography. Another useful resource is the book Crypto 101, which goes over many useful areas in Cryptography, and how to exploit common flaws in cryptosystems. Those interested in demonstrating attacks on real-world crypto through practical programming should try the Cryptopals challenges.