DI Management Home > Cryptography > RSA > RSA Theory

RSA Theory


The RSA algorithm is a public key algorithm that can be used to send an encrypted message without a separate exchange of secret keys. It can also be used to sign a message.

This page looks at the mathematics behind the algorithm. The page has been completely re-written as of November 2011. We discuss the RSA algorithm and its implementation in more detail on our RSA Algorithm page.

You will need some understanding of elementary number theory. First a quick summary of the RSA cryptosystem.

The RSA cryptosystem

Recommended reading

Join Amazon Student FREE Two-Day Shipping for College Students

Affiliate disclosure: we get a small commission for purchases made through the above links

Key generation

  • Choose two distinct primes p and q of approximately equal size so that their product n = pq is of the required length.
  • Compute φ(n) = (p-1)(q-1).
  • Choose a public exponent e, 1 < e < φ(n), which is coprime to φ(n), that is, gcd(e, φ(n))=1.
  • Compute a private exponent d that satisfies the congruence ed ≡ 1 (mod φ(n)).
  • Make the public key (n, e) available to others. Keep the private values d, p, q, and φ(n) secret.

RSA Encryption scheme

Encryption rule: ciphertext, c = RsaPublic(m) = me mod n, 1 < m < n-1
Decryption rule: plaintext, m = RsaPrivate(c) = cd mod n
Inverse transformation: m = RsaPrivate(RsaPublic(m))

RSA Signature scheme

Signing: signature, s = RsaPrivate(m) = md mod n, 1 < m < n-1
Verification: check, v = RsaPublic(s) = se mod n
Inverse transformation: m = RsaPublic(RsaPrivate(m))

First, note that the inverse transformations for encryption and signing are equivalent, since

RsaPrivate(RsaPublic(m)) = (me mod n)d mod n = med mod n
RsaPublic(RsaPrivate(m)) = (md mod n)e mod n = med mod n

So we only need to show that the decryption rule works; that is, for (n, e) as defined above,

If c = me mod n for 0 < m < n then m = cd mod n, where d is the secret exponent that satisfies the relation ed ≡ 1 (mod φ(n)).

All proofs of how RSA works boil down to the fact that for any integer x, x1+kφ(n) ≡ x (mod n).

Note that the algorithm strictly works for 0 ≤ m < n, but we exclude the values m = 0, 1 and m = n-1 from the cryptosystem because there is no secrecy for such messages. Why?

First proof

The first proof is short. It uses the Euler-Fermat Theorem. It works for messages m that are relatively prime to the modulus n, that is where gcd(m, n) = 1.

Proof. Suppose gcd(m, n) = 1. The relation ed ≡ 1 (mod φ(n)) gives that ed = 1 + kφ(n) for some integer k. If c ≡ me (mod n) then, working modulo n,

cd ≡ med
   ≡ m1+kφ(n)
   ≡ m.(mφ(n))k
   ≡ m.1k, since mφ(n) ≡ 1 (mod n), by the Euler-Fermat theorem, as gcd(m, n)=1
   ≡ m (mod n).

Hence m = cd mod n is a unique integer in the range 0 ≤ m < n.   ♦

Second proof

The first proof does not include the special cases where the message m is divisible by one of the prime factors, p or q. There are p numbers where q|m: that is, m = 0, q, 2q, 3q, ..., (p-1)q. Similarly there are q numbers where p|m, and thus p+q-1 possible numbers where gcd(m, n) > 1 (zero occurs twice). Note that p and q cannot both divide m since we restrict m < n = pq.

The second proof specifically considers the case where gcd(m, n) > 1, but it could easily be written to include the first case above (if gcd(q, n)=1 then just say "Similarly for q" after the result mod p). It uses Fermat's Little Theorem and the fact that if s and t are coprime and a ≡ b (mod s) and a ≡ b (mod t), then a ≡ b (mod st).

If s and t are coprime and a ≡ b (mod s) and a ≡ b (mod t), then a ≡ b (mod st).
Proof. a ≡ b (mod s) ⇒ s|a-b. a ≡ b (mod t) ⇒ t|a-b. Since s and t are coprime then st|a-b. Hence a ≡ b (mod st). ♦

Proof. Suppose one of the primes, say q, divides m. Then m ≡ 0 (mod q) and so, trivially, we have

0 ≡ m1+kφ(n) ≡ m (mod q).

If q divides m then p cannot also divide m since m < n = pq and so we must have gcd(m, p) = 1. By Fermat's Little Theorem we have

mp-1 ≡ 1 (mod p)
m(p-1)(q-1) ≡ 1q-1 ≡ 1 (mod p), raising both sides to the power q-1
mφ(n) ≡ 1, since φ(n) = φ(pq) = (p-1)(q-1)
mkφ(n) ≡ 1k ≡ 1 (mod p), raising both sides to the power k
m1+kφ(n) ≡ m (mod p), multiplying both sides by m.

Now since m1+kφ(n) ≡ m (mod q) and m1+kφ(n) ≡ m (mod p), and p and q are coprime, then it follows that (see † above)

m1+kφ(n) ≡ m (mod pq) ≡ m (mod n), for all integers m.

We have ed ≡ 1 (mod φ(n)) ⇒ ed = 1 + kφ(n). So, if c ≡ me (mod n) then cd ≡ med ≡ m1+kφ(n) ≡ m (mod n). Hence m = cd mod n is a unique integer in the range 0 ≤ m < n.   ♦

Longer proof of the RSA algorithm

We wrote this proof of the RSA algorithm (pdf, 93 kB) back in 2006, and that in turn is a revised version of something we wrote in 2002. It uses very elementary principles of number theory. Apologies to all university tutors who have received plagiarised versions in students' assignments. The PDF document was written in LaTex and generated on a Windows system using the wonderful TeXniCenter.

Using λ(n) instead of φ(n)

The original version of RSA defined the secret exponent as d = e-1 (mod φ(n)), where φ is the Euler totient function and φ(n) = φ(pq) = (p-1)(q-1). In fact you can use the smaller Charmichael function instead, so that an alternative secret exponent (we'll call it d') is defined by d' = e-1 (mod λ(n)), where λ(n) = lcm(p-1, q-1) = (p-1)(q-1)/gcd(p-1, q-1). Later refinements of the RSA algorithm like PKCS#1 use this definition.

But it doesn't matter which one you use. Both d and d' will give you the same results.

Note that the proofs above use the fact that x1+kφ(n) ≡ x (mod n) for any integer x. Well, by definition, xλ(n) ≡ 1 (mod n), so we can just write x1+kλ(n) ≡ x (mod n), and all the proofs will still work.

Both values of d and d' will uniquely decrypt a message c encrypted with c = me mod n, and both will give the same signature s = md mod n = md' mod n.

In fact, any secret exponent of the form d' + rλ(n) will work, for any integer r. Since λ(n) divides φ(n), then our d is one of these.

For practical values of n in the thousands of bits, the difference between d and d' is a couple of bits and makes no practical difference.

Revision History

Feedback

To comment on this page or ask a question, please send us a message.

This page last updated 22 May 2016