RSA Cryptography Behind the Scenes

Read this article on Medium

Public-Key Cryptography is an increasingly important area of mathematics that forms the cornerstone of modern communication. However owing to the growing complexity of such systems, even developers and those that interact with such systems often don’t understand their maths and inner workings, instead relying on high-level abstractions and preexisting implementations. Owing to the inherent danger of implementing your own cryptographic algorithms, this often isn’t a bad thing. However, if your goal is to understand these systems from the ground up and learn of their common weaknesses, having a deep understanding can be highly useful.

In this article I’ll primarily be discussing the mathematics that underpin the RSA cryptosystem, with it being one of the oldest and most widely used forms of public-key cryptography. I will also be explaining its common weaknesses and how it is used in practice. If you are not familiar with the concept of public-key cryptography, I would recommend first reading the relevant section of my previous article.


RSA

RSA, named after its creators Ron Rivest, Adi Shamir, and Leonard Adleman, is an asymmetric cryptosystem that functions through the use of public-private key pairs. In this system a user wishing to receive messages generates both a public key and a private key, which are used for encryption and decryption respectively. The public key consists of the numbers e and n, and the private key consists of numbers d and n.

Here, n is the modulus, e is the public exponent, and d is the private exponent. The algorithms for encrypting and decrypting a message m and corresponding ciphertext c use a technique known as modular exponentiation as shown:

E is the encryption algorithm, and D is the decryption algorithm

Because RSA is a public-key cryptosystem, a message encrypted with a public key can only be decrypted using the corresponding private key. In order to receive encrypted messages an individual can widely share and distribute their public key, allowing for anyone to encrypt a message to send to them. However, these messages can only be decrypted by them.

Meeting the definition of a cipher, RSA follows the consistency equation being expressed as:

This just means that after encrypting a message m with a public key (e, n), the message can then only be decrypted with the corresponding private key (d, n) to recover the message. The magic of how this works comes from the very careful generation of the numbers n, e and d, which I will be covering shortly.

Additionally, to verify the authenticity of the source of a digital message, we can include a digital signature along with the message. As a message encrypted with a public key can only decrypted with the corresponding private key, it also holds that a message encrypted with the private key can only be decrypted using the corresponding public key. Therefore, if you can decrypt a message with someone’s public key, you can guarantee that it was created by the person who possessed the corresponding private key.

The typical way to do this is to calculate a message digest using a cryptographic hash function such as SHA-512, encrypt it with their private key, and concatenate it to the ciphertext being sent.

For example, if Alice wants to send a digitally signed message to Bob, Alice can encrypt the message with Bob’s public key, then encrypt a hash of the message with her own private key.

Here, || denotes concatenation

To verify this digital signature, Bob can decrypt the message with his private key, create a hash of the message, decrypt the signature with Alice’s public key and compare the two to ensure that Alice was the one that wrote the message.

Ignoring signatures for now, here’s a simple example with some pre-calculated values. Say we want to encrypt the number 42 with a public key, where e=17 and n=3233. (Note that these numbers are very small for demonstration purposes, and are much much larger in practice).

The resulting ciphertext of this operation is the number 2557. We can now decrypt this message using the corresponding private key, where d=2753 and n=3233.

As demonstrated, this makes RSA fairly easy to understand for most use cases. However, the magic of how RSA works comes from how the public and private keys are generated, which tends to get more complicated.

RSA Key Generation

One of the primary ideas behind RSA’s security are actions that are easy to compute but impractical to do in reverse such as the technique of modulo exponentiation, which is near impossible to reverse in optimal implementations. Because of this, modulo exponentiation can be considered a Trapdoor function.

As covered earlier, the algorithm for decryption can be rewritten using index laws as follows:

As e is a component of the public key, and d is a component of the private key, we can think of e as the exponent for encryption, and d as the exponent for decryption. In order to make this system work, we must find a pair of numbers e and d that satisfy two very important conditions.

  1. Raising a ciphertext (c or mᵉ) to the power of d must undo the original operation applied to the message, resulting in the original cleartext message (m).
  2. It must be (practically) impossible to calculate d just by knowing e and n.

In order to satisfy these conditions, we can turn to another trapdoor function known as prime factorisation.

As a trapdoor function, there is no known fast and efficient way to find the prime factors of a large number. However, multiplying two prime numbers together and finding the result is trivial to do. For example, the following question is very difficult to calculate by hand:

14078417 is the product of two prime numbers. What are those numbers?

However, rephrasing the question slightly makes it far easier to work it out manually:

What is 1801×7817?

Note that in this case, ‘14078417’ is a 24 bit number, and so can be factorised by a computer trivially. However finding the prime factors of the following 2048 bit number:

28054858894067023484001444540318453571018106101097278546878654383462804693324724266179068736999021606591696140306368101007527587969081983087482330183244549608668390742295364641677132049586501132135729395873453995217546629615237960162783190031657288708106646763170912197351177871907428727339739967246045009611849617088464032097342684035032810437813967876189592255029228354836446888654251720287338691938771818905976064989199133637615430931722685466870972608988899507683650966189062385231180105749751696410490948186037908771602234032431258650407531488547072384375563523617112604767003493676401338233014749354971625136557

would take — for all intents and purposes — an infinite amount of time with our current methods. With the current state of technology, I am likely the only person who will ever know the prime factors of the above number.

Owing this, we can use prime factorisation as a trapdoor function in this algorithm by finding a function that depends upon knowing the prime factors of a large number. We can do this by using Euler totient function.

Euler’s totient function, denoted φ(n), counts the amount of positive integers that are relatively prime to n. I.e, The number of integers in the range 1 ≤ kn where the greatest common divisor gcd(n, k) = 1. For example:

Here, φ(8) is 4, as 8 is relatively prime to 1, 3, 5, and 7. Additionally, φ(9) is 6, as 9 is relatively prime to 1, 2, 4, 5, 7, and 9.

In order to make this function useful, there are two very important properties we can look at:

  1. If n is prime, φ(n) = n-1, as prime numbers have no factors greater than one. For example, φ(7) = 6, as 7 is coprime with 1, 2, 3, 4, 5 and 6. This holds true for any prime number.

2. Euler’s totient function is multiplicative, and as such:

Owing to this, we can combine these properties to derive a function that is only possible to calculate given a number’s prime factors. Therefore, if we have a significantly large number n which is equal to the product of the two prime numbers p and q, then:

As such, we now have a trapdoor function for φ(n) that is only computable if you know the prime factors of n. Now, we need to find a way to link the totient function to modular exponentiation. For this we can use Euler’s theorem, which states that for two numbers m and n, if n and m are coprime, then:

We can now rearrange this equation using simple index laws. As 1 to the power of any exponent is itself 1, we can rewrite the equation as:

Additionally, as any number m, multiplied by 1 is m, we can again rewrite this equation as:

Now, compare this to the original equation we are looking for,

Rearranging, we can now solve for d with the following equation:

n this equation, k can be substituted for a value such that d is a natural number. Additionally, e must be chosen such that it is odd, and does not share a factor with φ(n). What is so great about this equation is that to find d, the only variables used are n and e. However, for large numbers it is impossible to calculate without knowing the prime factors of n.

Using what we’ve gathered, we can generate an RSA public-private key pair (e, n) and (d, n) quite simply.

  1. Generate two large random prime numbers p and q (In RSA-2048, p and q are 1024-bit numbers)
  2. Multiply p×q and store the result as n.
  3. Calculate φ(n) as (p-1)(q-1)
  4. Choose a value of e, such that e and φ(n) are coprime, and 1 < e < φ(n).
  5. Calculate d = (k×φ(n)+1) / e for some value k, such that d is natural.

Common Weaknesses

Typically speaking, RSA is a very secure cipher with no known ways to break it with traditional computers. However, if not done properly, key-generation is the stage at which many weaknesses will arise.

  1. Using small primes

If either of p or q are too small, then N can be factorised with ease.

2. The primes being too close

If the distance between primes p and q (Δ = |p-q|) is too small, N’s primes can be deduced using Fermat’s Factorisation Method. However, this problem rarely arises, as if p and q are randomly selected in a standard way, the chance that |p-q| is too small is negligible.

3. Using a small value of d

Utilising Wiener’s attack, if the value of d is too small relative to n— typically when d < ⅓N^¼ — It is possible to recover the value of d, given only e and n.

4. Using a common factor with another RSA modulus.

If two RSA keys share a common factor, an attacker can use the Euclidean Algorithm to find a common value for p or q between keys that allow for the private key to be discovered. A team of researchers have discovered that owing to poor random number generation, around 1 in every 172 digital certificates are vulnerable to this form of attack by sharing a factor of n.

However, keep in mind that even if a given key pair is generated mathematically correctly, poor implementation of algorithms can give way to side-channel attacks such as timing attacks, power analysis and fault analysis.

RSA In Practice

Everything that we have covered up until this point is what’s known as “Textbook RSA” which describes the underlying mathematics and processes, but is very rarely used on its own in practice owing to its numerous weaknesses. In particular, there are several undesirable properties of Textbook RSA that restricts its use.

  1. Textbook RSA is malleable, as an attacker can make predictable changes to a message by modifying the ciphertext. For example, given a cipher text c, an attacker can compute c′≡c×2ᵉ (mod n), such that the decryption of c′ results in 2m (mod n)
  2. Textbook RSA is deterministic, as identical messages will produce identical ciphertexts. Using this, an attacker can perform traffic analysis to distinguish if the same encrypted message is being sent.
  3. The security of a cipher can vary widely depending on the value of m. For example, with a small value of m paired with a common low value of e such as 3, it can be the case that mᵉ < n, which would completely bypass the modular exponentiation and set the ciphertext c=mᵉ, allowing an attacker to guess m as the eᵗʰ root of c.

In order to lessen the effect of these problems, commonly followed standards of RSA such as PKCS#1 call for the use of Padding , in which the message m is modified and manipulated prior to encryption.

One of the simplest and widely deployed padding schemes is the one outlined in PKCS#1 v1.5 mode 2, in which a message m is padded as follows:

The resultant padded message begins with the bytes 02 denoting mode 2, followed by a random pad, the bytes FF, and is then followed by the message. The random pad is chosen in a way such that it does not contain the bytes FF, so that the recipient knows to treat all bytes following FF as the message. In this scheme the message is padded to the same length as the modulus (e.g, 2048 bits). Additionally, this scheme is defined such that it must add at least 11 bytes of data to the message, meaning that the message m can be at most 245 bytes.

The use of this padding scheme greatly improves the security of Textbook RSA. The resultant message is no longer malleable as (most) changes to the ciphertext will result in an invalid message when decrypted, the use of the random pad prevents the ciphertexts from being deterministic, and it prevents small messages from being vulnerable to eᵗʰ root attacks, as it pads messages of any length to their maximum size.

However, this scheme does have it’s weaknesses. Although this scheme is not as predictively malleable, if a ciphertext is modified in a way that causes the first two bytes to remain as 02, an invalid or modified ciphertext when decrypted will appear to be valid. In RSA-2048, the chance that any v1.5 ciphertext is valid against any key is roughly 1 in 100,000. If an attacker has access to an oracle that can determine whether a specific ciphertext is valid (such as a web server that returns an error when a message is invalid), the attacker could use Bleichenbacher’s million message attack to recover the message.

In order to prevent these forms of attacks, people often instead opt for OAEP (Optional asymmetric encryption padding), as it has been proven secure against such attacks.

OAEP is more complex than v1.5 padding as it utilises a variety of cryptographic techniques.

In this diagram, ⊕ denotes XOR, and G and H are mask generation functions, which are similar to cryptographic hash functions, although they support outputs of a variable length. In the PKCS#1 v2 standard, G and H are identical. k₁ and k₀ are fixed integers depending on the protocol being used, and n is the amount of bits in the modulus (E.g, 2048).

Here, m is a message of length n-(k₁+k₀), r is a randomly generated string of k₀ bits, and ‘000’ is a string of zeroes of length k₁.

The steps to encode a message with this scheme are as follows:

  1. Expand r to n-k₀ bits with G
  2. Calculate X as (m||00…0)⊕G(r)
  3. Reduce X to k₀ bits with H
  4. Calculate Y as rH(X)
  5. Return the result of X||Y

These steps can be represented as such:

After receiving and decrypting an OAEP encoded ciphertext, retrieving the message is very simple, and can be done by passing the values back into the Feistel network diagram above, following the steps:

  1. Recover the random string as r=YH(X)
  2. Recover the message as m||00…0=XG(r)

This form of padding greatly improves upon the security of PKCS#1 v1.5 padding through the use of cryptographic hash functions as trapdoor functions. Because changing a single bit on the input of a cryptographic hash function greatly changes the resulting output, attackers cannot recover any portion of the plaintext message without being able to invert the hash function. As such, decoding an OAEP encoded message is only possible knowing the entire values of X and Y. Because of this important property, OAEP prevents the partial decryption of ciphertexts along with other information leakage, and is effective against chosen ciphertext attacks.


Hybrid Encryption

If you’ve been reading carefully you may have noticed a major limiting restriction of RSA, which is the maximum size that the cleartext message can be. The length of a message cannot exceed the length of the modulus (commonly 2048 bits), meaning that you’d often be unable to encrypt more than 256 bytes of data. When taking into account bytes lost due to padding, the maximum message size gets even smaller — 245 bytes for PKCS#1 v1.5, or 214 for OAEP. With this limitation, I wouldn’t even be able to encrypt the text in this paragraph. However there are several ways to get around this constraint that involve treating RSA as a cryptographic primitive and building upon it.

One approach would be to split the message into blocks and use a chaining mode such as CBC in the same way that block ciphers encrypt large messages. However this is rarely done owing primarily to the slow speed of RSA operations when compared to other encryption methods. The most commonly used solution is to implement a hybrid cryptosystem which combines the benefits of public key encryption with the efficiency of symmetric key encryption. Typically this involves using RSA to encrypt a symmetric key, which is then used to encrypt the message.

Such a system is fairly simple to operate. First, a randomly generated secret key is used to encrypt the message with a symmetric encryption algorithm such as AES, the key is then encrypted with the recipient’s public key using RSA. Then, the symmetricly encrypted message is then transmitted alongside the asymmetrically encrypted key. As the ciphertext can only be unencrypted using the RSA-encrypted symmetric key, the message is only readable to the recipient using their own private key. Using this method of encryption allows for the use cases of asymmetric encryption along with the speed and convenience of symmetric encryption.

Represented more formally, a sender would compute and transmit the following message using the recipient’s public key:

Upon receiving the message, the recipient could then decrypt the randomly generated secret key with their own private key, and then use that to decrypt the AES-encrypted message.

Using a hybrid cryptosystem is far more efficient than using plain RSA as the majority of encryption/decryption is being done by an efficient symmetric-key algorithm, while the relatively inefficient asymmetric-key algorithm is being done on a much smaller message. Because of this, most practical implementations of public-key cryptography use some form of hybrid encryption. For example, OpenPGP uses the system described above to encode messages. Additionally, TLS uses symmetric keys for encrypting data along with a Diffie-Hellman public key exchange, which internally functions similar to RSA.


Where to from here?

This article is long enough and understandably, there is a lot of content relating to public-key cryptosystems that I’ve been unable to cover. If you’d like to understand RSA more deeply I’d recommend reading more about signatures, and in particular, signature padding as outlined in PKCS#1 v1.5 and PSS. Additionally, reading about Diffie-Hellman key exchanges and their applications can be very helpful in understanding common cryptosystems. If you’d like to learn about some more recent trends in public key cryptography, consider reading up on Elliptic-curve cryptography. Additionally, as Quantum computers have been shown to break many common public-key algorithms, many current efforts in cryptography centre around Post-quantum cryptography.