A Crash Course in Everything Cryptographic

Read this article on Medium

Cryptography’s  inner workings have long been regarded as exclusive to the realms of  experts and mathematicians, its technicalities being largely attributed  to magic. Given the complicated nature of the state of modern  cryptography, this is understandable. However, through many of the  global movements that stem from a lack of understanding on the subject  such as the UK’s proposed Encryption Ban and Australia’s Assistance and  Access Bill, it is clear that this approach is doing far more harm than  good.

In  this guide, I will give a crash course in everything that you might need  to know to get started in understanding Cryptography. I will give a  rundown on the history of various cryptographic systems, and give a  crash course on the three most prevalent areas of cryptography: Stream Ciphers, Block Ciphers and Public Key Cryptography.

Ciphers

Ciphers  are the cornerstone of cryptography. A cipher is a set of algorithms  that performs encryption or decryption on a message. An encryption  algorithm (E) takes a secret key (k) and a message (m), and produces a  ciphertext (c). Similarly, a Decryption algorithm (D) takes a secret key  (K) and the previous resulting Ciphertext (C). They be represented as  follows:

An Encryption algorithm 'E', and a Decryption algorithm 'D'.

This also means that in order for it to be a cipher, it must satisfy the consistency equation as follows, making it possible to decrypt.

This just means that if you encrypt a message with the key K, you would get the exact same message back by decrypting it with K as well.

One  of the oldest and simplest ciphers is the Caesar Cipher, which simply  replaces each letter in a message with a letter some fixed number of  positions down the alphabet.

The Caesar Cipher was used around 50 C.E by Julius Caesar, where he shifted each letter in a message by 3 places to send strategic military communications.

In the following case, the message has been shifted three characters up:

This cipher can be represented mathematically as follows:

While  this meets our definitions of a cipher, it is hardly secure. If an  attacker knows that this cipher has been used, they could just try all  25 combinations. Even if they were not sure what cipher has been used,  they could figure this out by noticing the pattern of letters in the  ciphertext.

To move onto some more secure encryption algorithms, we will have to discuss an operator known as Xor.

XOR

An XOR,  or ‘Exclusive Or’ gate is a boolean digital logic gate that takes two  inputs of 1 or 0, and returns 1 if both inputs are different, or 0 if  both are the same. It can be represented with the following truth table  covering all possible inputs and outputs:

This operator is often represented with the ⊕ sign. In this notation:

0 ⊕ 0 = 0,

0 ⊕ 1 = 1,

1 ⊕ 0 = 1, and

1 ⊕ 1 = 0.

There are several important properties that apply to Xor.

  1. You can apply them in any order: a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c
  2. Anything XOR’d with itself is 0: a ⊕ a = 0
  3. Anything XOR with 0 is itself: a ⊕ 0 = a

This  implies that a ⊕ b ⊕ a = b, as this equals a ⊕ a ⊕ b which equals 0 ⊕ b  which equals b, according to the laws stated above. It’s important to  note that this only works on ones and zeroes, so numbers of a different radix must be first be converted to binary.

87 ⊕ 73 = 1010111b ⊕ 1001001b = 0011110b = 30

Now we can move on to our first secure cipher.

One-time pads

Described  by Frank Miller in 1882, a One-time pad (also known as a Vernam cipher)  works by performing an XOR with a message and a private key, and then  using XOR on the private key and the ciphertext to re-obtain the  message. This can be done owing to the property described above, that a ⊕  b ⊕ a = b. The pair of algorithms that define the One-time pad can be  represented as follows:

The consistency equation for this pair of ciphers can be easily proven as follows:

We’ll do a simple example to show how easy one-time pads can be to use.  Say we want to encrypt the word “Message”. The first step would be to  convert the message into binary (just ones and zeroes.). We can do this  by converting each letter according to the ASCII character set.

We now  need 56 random bits to make a key to XOR the clear-text with. It is  important that the private key is as random as possible.

Randomly generated bits from Random.org

We then XOR each of the bits in either message with each-other.

The  resulting set of bits is our ciphertext. To decrypt the ciphertext, we  simply need to do the same operation, but this time XORing the  ciphertext with the same randomly generated key, and then converting it  back to ASCII.

This  cipher is very simple to use and understand, but it also has another  interesting feature about it. The One-time pad has what is known as  perfect secrecy, essentially meaning that if an attacker only knew the  ciphertext (the result of m⊕k), it is mathematically impossible to know  anything about the plain-text message, and it is completely impossible  to crack.

So,  we have a cipher that is easy to understand and even do by hand, but  mathematically impossible to break. Why would we ever want to use  anything else? The reason being that while One-time pads are effective  and easy to use, they have a number of important downfalls.

The  first major issue is that for any message being sent, the private key  must be as long as, or longer than the clear-text message. For the  recipient to be able to decrypt said message, you must have a way to  securely provide them with the private key. If this is the case, then it  would simply be easier to transfer the original message across the  secure channel.

This  problem is exacerbated with the second major drawback which lies within  the name. The private key of a One-time pad may only be used one time,  meaning that each message being encrypted must use a unique and random  key. The major weakness that comes from encrypting multiple messages  with the same private key can be demonstrated mathematically quite  easily.

Lets say we have two messages, m1 and m2, which we will both encrypt using the same key, K (making our system a two-time pad). We can get the following by XORing both cipher texts together.

From  this, we get m1⊕m2. Now an attacker can perform various forms of  analysis on the resulting text, such as statistical analysis, frequency  analysis, pattern matching, or using a natural language approach as  described in this 2006 paper. I won’t get fully into explaining why this is so insecure, but for those interested, this answer here explains it graphically. The more times the same key is used (E.g, a  three-time pad or a four-time pad) it is intuitive that the less secure  it is.

Now that we’ve gone over the basics of XOR encryption and One-time pads, we can move onto a more practical form of encryption.

Stream Ciphers

The  most important takeaway from One-time pads is that they have perfect  secrecy, meaning that there are no possible attacks if an attacker is  only given the Cipher text. However, having perfect secrecy means that the key length must be equal or greater than the message length.  This makes the One time pad impractical because if two parties have a  way to secretly agree on a key to encrypt a message, they may as well  use that mechanism to transmit the message.

To  make the One-time pad more practical, we introduce ‘Stream Ciphers’.  The key idea behind a Stream Cipher is to replace the ‘random’ key in a  One-time pad with a ‘pseudorandom’ key, meaning that the message is  XORed with a key produced with a Cryptographically Secure Pseudo-random number generator or CSPRNG. (Note that this differs from a Pseudo-random number  generator, in that data generated from a CSPRNG must be  indistinguishable from true randomness).

A  CSPRNG is simply an algorithm (or a function) that generates large  sequences of numbers, approximating the properties of random numbers.  Because random numbers are difficult to generate, CSPRNGs rely on a seed to determine the starting state as well as the numbers generated in the  future. They allow for a disproportionately large amount of random  numbers to be generated from a comparatively small starting seed (e.g, a  128 bit seed generating gigabytes of random data). If the starting seed  is known, then all the subsequent numbers generated are known, meaning  that a CSPRNG is deterministic. Because of this, the limit of how random the generated numbers are depends on how random the starting seed is.

To  make one time pads practical, we can replace the private key with the  desired length of output from a pseudo random number generator, and  treat the starting seed as the new private key. Because a CPRNG is  deterministic, using the same starting seed the same output will be  given.

To clarify, while this is a traditional One-time pad:

Given a Pseudorandom number generator G(K), we replace K as follows:

Note  that this example is only of one type of stream cipher, known as a  Synchronous stream cipher. Another related approach is a Self-synchronizing stream cipher, in which several of the previous digits in the ciphertext are used to compute each digit of the encrypting key.)

This  is far more practical than the one time pad for a number of reasons,  namely that the private key can be much shorter than the message being  encrypted, making distribution more manageable. However, this different  approach comes with a downside.

By  switching the private key for the output of a secure number generator  our cipher no longer has perfect secrecy, since the key is shorter than  the message. Because of this, how secure a stream cipher is now relies  upon how unpredictable our pseudorandom number generator is. If the  output of a CSPRNG can be predicted, then the cleartext message can be  obtained. Here are a few well known cryptosystems with weak stream  ciphers:

802.11b WEP: WEP is an algorithm for encrypting data across WiFi, which used a  stream cipher called RC4. Because the same key can not be used more than  once in a stream cipher, the long-term secret key was concatenated with  a value ‘IV’ which would change every time. However, ‘IV’ was only 24 bits long, meaning that after 5000 messages were encrypted, there is a 50% chance that the same key would be used.

CSS: The Content Scramble System was used by the DVD Forum as a form of Digital Rights Management which  would encrypt DVDs, restricting access to the content to only be used in  licensed applications. CSS used a 40 bit key which could be  brute-forced relatively quickly, owing to the small keyspace of the  system. (Although the keys were 40 bits, the system could be cracked  after only generating 17 bit combinations owing to the technicalities of  the CSPRNG.)

Now that we’ve covered stream ciphers, we can move onto another cryptosystem known as Block Ciphers.

Block Ciphers

Block Ciphers are another way to encrypt and decrypt data. A Block Cipher consists of two algorithms, E and D for encryption and decryption, which both take a secret key K.

The  primary point behind a Block Cipher is that it the length of the input  text and the resulting ciphertext is always going to be an equal, fixed  amount. This amount is known as the Block Size, and is dependent on which Block Cipher is being used. Additionally, the length of the private key K is known as the Key Size and this is also a fixed amount. Two of the common block ciphers are  3DES which has a Block Size of 64 bits and a Key Size of 168 bits, and  AES which has a Block Size of 128 bits and a Key Size of 128, 192 or 256  bits.

Block Ciphers are also known as Keyed Permutations, or Pseudorandom Permutations, as they map every possible block to some other block. It’s important that it is Keyed,  as the private key determines which input block maps to its  corresponding ciphertext block. Because this is a one-to-one  permutation, the ciphertext can be decrypted if the key is known.

The  first notable block cipher was the Data Encryption Standard (DES) which  was developed during the 1970s at IBM, however it was quickly found to  be insecure and replaced with 3DES, but this was soon replaced with the  Advanced Encryption Standard (AES), which was developed in 1997  following a call by the National Institute of Standards and Technology  for a new standardized block cipher. I’ll be focusing on AES because it  is the most common Block Cipher used today, as DES and 3DES are quite  weak in comparison.

Lets  now look at an overview of how AES works. Note that for simplicity’s  sake i’m going to be skipping over lots of the technical details. For  those interested, the lecture notes here go very in depth.

AES,  and most other Block Ciphers work through iteration, where input text  will be iteratively encrypted with a series of keys. The first step is  to take a Private Key K as input which is typically 128, 192 or 256 bits (We will just focus on 128 bit AES), and expand it to a series of Round Keys to encrypt our message with.

In this  case, we take our 128 bit (16 byte) input key and expand it into 11  seperate 16 byte keys through a Key Expansion function known as a Rijndael key schedule. AES now uses each of these keys by encrypting the message 10 times by passing it through 10 seperate Round Functions R(k, m) which takes a round key kₙ and the message state m as input.

Because AES works on 128 bit blocks, we can represent the input message m as a 4x4 matrix of one-byte cells. We also represent each Round Key as a  4x4 matrix, so that they can be XORed with the cell representing the  message state.

Firstly,  the input message is XORed with the first Round Key, and then the  resulting message state is passed through an function with the steps ByteSub , ShiftRows and MixColumns to update the state of the message (these steps will be explained in a  bit). We then repeat these steps 10 times with each of the Round Keys,  the only difference being that the MixColumn step is missing from the final round. The final state is then XORed  with the 10th round key to produce the output. Here’s a quick overview  of each of the three steps involved in a Round:

ByteSub: Each byte in the message state matrix is replaced by its corresponding byte as defined in a Substitution Box shown here:

For example, the byte 9a would be replaced with b8 .

ShiftRow: Each row is shifted a certain amount. The first row is not shifted, the  second row is shifted left once, the third row is shifted left twice,  and the fourth row is shifted left three times.

MixColumns: A linear transformation is applied to each column in the matrix representing the message state

From  here, we can now start to use AES to encrypt data. However, you may be  quick to realise one very visible limitation, you cannot use AES to  encrypt more than 128 bits of information (or 16 bytes) at a time. To  encrypt more than 16 bytes, we need to introduce a concept known as Modes of operation.

Modes of operation

There  are a variety of methods (or modes of operation) for encrypting more  than 128 bits at once. The simplest of such is known as Electronic Codebook, or ECB.

ECB

ECB  is the simplest mode of operation for encrypting large amounts of data  with block ciphers. ECB simply divides a message into 16 bit blocks, and  uses the private key to encrypt each of these blocks separately. The  plaintext may have to be padded in order to align it with 16 byte  boundaries. For example, a 28 byte message would need to have an  additional four bytes added to it to make two blocks of 16 bytes. After  the message has been divided into its blocks, each block is encrypted  individually with the secret key.

While  this is quite simple to understand, it comes with a significant  downside. Because identical plaintext blocks will produce identical  ciphertext blocks, patterns are easy to see. For example: look at the  following bitmap image encrypted in ECB mode.

Areas  of uniform information are still identical after encryption, and so  general patterns can still be discerned. To protect against this, we  need to use a more secure mode of operation.

CBC

Cipher Block Chaining, or CBC,  works very similarly to ECB with a small difference. Before each block  of plaintext is encrypted, it is first XORed with the ciphertext of the  block preceding it to make each plaintext block unique. For the first  block, the plaintext is first XORed with a randomly generated  Initialisation Vector.

Using the previous example of the bitmap image encryption, we can intuitively see that this is much more secure.

These are the two main modes of operation that we’ll go over, but there are other ones that you can read up about here if you’re interested.

That’s  about all you need to know to get started with Block Ciphers, so now  we’re going to move onto our last important area in Cryptography.

Public Key Cryptography

All of the encryption techniques that you have learnt up until this point are what’s known as Symmetric Encryption, meaning that only one key is used to encrypt and decrypt data. Public Key Cryptography, also known as Asymmetric Encryption instead uses Key pairs,  consisting of ‘Public Keys’ and ‘Private Keys’, allowing for  communication across an untrusted channel. This will be explained soon.

Public  Key Cryptography is often used to encrypt communications between  servers or machines, commonly being used in sending emails, and web  browsing. However for the sake of simplicity, we will use the names  ‘Alice’ and ‘Bob’ as placeholders for users

Here we  have the two users, Alice and Bob, sending and receiving messages with  one another. Here we will stick to the analogy of two people exchanging  messages, but this can be viewed as a user accessing a website, email  servers exchanging messages, or any other secure exchange.

This  situation works well, Alice and Bob are able to communicate with each  other freely. However, lets say that there were an eavesdropper  listening in on the conversation. This eavesdropper can listen in on the  messages, but cannot interfere with the data.

The ‘Attacker’ can be viewed as other users on an unsecured WiFi Network, ISPs or network administrators.

Here,  all of the data being sent can be seen by the attacker, but it cannot be  tampered with. In this situation, Alice and Bob are unable to  communicate sensitive or confidential information, as it is all known by  the Attacker.

They can try to use Symmetric Encryption, as we have been discussing up until this point, but to no avail.

This is  next to useless, as the Attacker can simply intercept the Key, and use  that to decrypt the message. Alice and Bob can get around this by using  Public Key Cryptography, but now we have to face the question of what  this actually is.

What is Public Key Cryptography?

Public  Key Cryptography works by having users each create a pair of keys: a  Public Key, and a Private Key. The Public Key may be spread widely and  openly distributed, while the Private Key is known only by the owner. In  this system, any person can encrypt a message using the the Public Key  of the receiver, but the message can only be decrypted by the Private  Key of the recipient.

Because  of this, if Bob wanted to send an encrypted message to Alice, he would  encrypt it using Alice’s Public Key before sending. Alice could then  decrypt the message using her private key. (We’ll get into the  technicalities of how this works soon).

One of  the reasons that making a switch to Asymmetric Encryption is useful, is  that the keys used to decrypt a message are always kept private, and  never have to be distributed across a channel for communication. As you  can see here, using this technique solves a part of the eavesdropper  problem we stated before:

If the  attacker were to intercept all the traffic being sent between Alice and  Bob, they would not be able to decrypt any information being sent  because they do not have access to any of the private keys for  decryption.

While  this good, it simply leads onto another major security threat. Until  this point, we’ve assumed that an attacker can listen in to  conversations, but cannot modify the data being sent. However if the  attacker in our threat model can alter messages in transit, they could  still commit a Man in the Middle Attack.  If Alice and Bob are using regular mail in communications, an attacker  might work at the post office. Or when using a computer network, an  attacker might work for an ISP, or have access to the networking  equipment being used.

In this  situation, the attacker will store Alice’s Public key and pass on their  own Public key to Bob. When Bob encrypts his message intended for  Alice, he actually sends it to the Attacker, who will be able to read  the message before sending it along to Alice, possible after altering  it.

This attack is possible because of two major flaws. A lack of Integrity, and a lack of Authenticity,  meaning that there is no way to check that a message has not been  interfered with, and that there is no way to verify that a message was  sent by the genuine person. These vulnerabilities can be addressed with  both Digital Signatures and Certificate Authorities.

Digital Signatures

A  digital signature is a way to ensure that the contents of a message  have not been altered and that the message was created by the right  person. As stated before, a message encrypted with a person’s Public Key  can only be decrypted with their Private Key, but the opposite is also  true. A message encrypted with a person’s Private Key can only be  decrypted with their Public Key, and so if Alice were to encrypt a  message with her Private Key before sending it to Bob, Bob can then  decrypt the message with her Public Key, thus confirming that the  message was created by someone with access to Alice’s Private Key. This  process is known as ‘Signing’.

While this can be done by signing the entire message, it is often more efficient and simpler to sign a Cryptographic Hash of the message and include it in the message body, proving that the message has not changed since it was created.

(Note  that this is a massive oversimplification. While the idea of  “encrypting with the private key” is approximately the case with RSA, other systems such as Diffie-Hellman or ECC use completely separate Signing algorithms such as ECDSA)

However,  this only pushes our problem back slightly. Bob can now securely verify  that the message he received was made with the private key of the  person he is communicating with and that it has not been tampered with.  But, he still cannot verify that the Public Key that he has received  truly belongs to Alice. To do this, we need to introduce Certificate Authorities.

Certificate Authorities

A  Certificate Authority (or CA) is a trusted third party that will  digitally sign and publish the public key bound to a user or entity.

This is  done using the Certificate Authority’s own private key, so any person  can verify that a certificate was issued in the name of the trusted  Certificate Authority. Because of this, trust in another user relies  upon the validity of the CA’s key.

In practice, the most common use of these certificates is in securing web traffic. When you navigate to a website that uses HTTPS, you can view the digital certificate in the browser.

This system relies upon trust in the Certificate Authority issuing the  Certificate, devices and computers will come with a pre-installed list  of trusted Certificate Authorities and their Public Keys.

This  article is long enough already so I won’t be going over the  technicalities of how the internals of Public Key Cryptosystems work.  For those interested, this article explains the internals of RSA, a system that provides both secrecy and signing.

One  significant difference between Stream/Block Ciphers and Public Key  crypto is that while Stream/Block Ciphers work by doing operations on  the individuals bits and bytes of a message, Public Key crypto simply  works with numbers. This means that any messages would have to be  converted into a number before being encrypted. Because of this, Public  Key Crypto is quite inefficient at encrypting large messages. To make  this more efficient for large messages, we can combine Symmetric and  Asymmetric encryption together, in order to efficiently encrypt large  amounts of data with a recipient’s Public Key.

Here is one such method which is used by PGP.  The message is encrypted with a Randomly generated key by a Block  Cipher, and then that smaller key is encrypted with the receiver’s  Public Key. The Ciphertext message is then concatenated with the  encrypted random key.

This is  far more quick and efficient because Block Ciphers are much faster at  encrypting large amounts of data, while an asymmetric cipher is used to  encrypt the smaller key which is needed for decryption. This still works  as a Public Key Cryptosystem, as the key for decryption can only be  obtained by the owner of the Private Key.

Where to from here?

Thank  you and good job for making it this far, this concludes the article.  From here you should have enough of an understanding of Stream Ciphers,  Block Ciphers and Public Key Cryptography to appreciate the modern state  of encryption and learn more going forward.

For those who want to learn more, I highly recommend completing Stanford’s online Cryptography I Course which gives a great technical and theoretical overview on cryptography. Another useful resource is the book Crypto 101,  which goes over many useful areas in Cryptography, and how to exploit  common flaws in cryptosystems. Those interested in demonstrating attacks  on real-world crypto through practical programming should try the Cryptopals challenges.