Appendix: Mathematical Details#

This appendix contains several sections with detailed examinations of mathematical topics that were touched on in the main chapters.

Euclidean Algorithm#

The Euclidean algorithm computes the greatest common divisor of two integers. It can do so very quickly, even for extremely large numbers.

For inputs \(a\) and \(b\), the algorithm works as follows:

  1. Let’s assume \(a\) is the bigger of the two inputs. Find the quotient and remainder when dividing it by \(b\). That is, write the equation:

    \[a = q \cdot b + r \textrm{, where } 0 \leq r < b\]

    Recall that there is only one choice of \(q\) and \(r\) that fulfills these properties.

  2. If \(r = 0\), terminate the algorithm; the result is \(b\). Otherwise, continue.

  3. Replace \(a\) with \(b\), and \(b\) with \(r\), and go back to step 1.

We’ll omit the proof of this algorithm’s correctness; such proofs are easily found in discrete math literature[JS13] and online.

Let’s run through an example with \(a = 27\) and \(b = 15\).

\(a\)

\(q\)

\(b\)

\(r\)

Equation

27

1

15

12

\(27 = 1 \cdot 15 + 12\)

15

1

12

3

\(15 = 1 \cdot 12 + 3\)

12

4

3

0

\(12 = 4 \cdot 3 + 0\)

The value of \(b\) when \(r\) reaches zero is the answer.

Let’s see another example, of two relatively prime numbers, 9 and 23.

\(a\)

\(q\)

\(b\)

\(r\)

Equation

23

2

9

5

\(23 = 2 \cdot 9 + 5\)

9

1

5

4

\(9 = 1 \cdot 5 + 4\)

5

1

4

1

\(5 = 1 \cdot 4 + 1\)

4

4

1

0

\(4 = 4 \cdot 1 + 0\)

The answer is 1, which tells us that 9 and 23 have no nontrivial factors in common. Also, note that the value of \(q\) is irrelevant; the algorithm does not use it after calculating it. Most implementations of the Euclidean algorithm don’t bother explicitly calculating \(q\). We’ve included it here for clarity.

However, hidden within the Euclidean algorithm, there is actually a way to calculate modular multiplicative inverses, which does involve \(q\). This is called the extended Euclidean algorithm.

Extended Euclidean Algorithm#

Formally, the problem of finding modular multiplicative inverses is as follows. Given a modulus \(n\), and a member \(a\) of the multiplicative group modulo \(n\), we want to find a value \(x\) such that:

\[a\cdot x \equiv 1 \pmod{n}\]

Recall that a modular congruence means that the modulus divides the difference of the two sides. Therefore, we can express this in another way: \(a\cdot x - 1 = n\cdot y\) for some integer \(y\), and then rearrange:

\[a\cdot x - n\cdot y = 1\]

Remember that \(a\) and \(n\) are known values, and we want to find values of \(x\) and \(y\) so that this equation is satisfied; \(x\) is the inverse we’re looking for.

The trick is that we can run the Euclidean algorithm on \(n\) and \(a\), and use the equations from that to easily find \(x\). First, we rearrange the equations so that \(r\) is by itself on one side. Returning to the example of 9 and 23 from before, here are the original equations on the left and the rearranged ones on the right:

\begin{align*} 23 &= 2 \cdot 9 + 5 & 5 &= 23 - 2 \cdot 9 \\ 9 &= 1 \cdot 5 + 4 & 4 &= 9 - 1 \cdot 5 \\ 5 &= 1 \cdot 4 + 1 & 1 &= 5 - 1 \cdot 4 \end{align*}

Then we start from the bottom equation on the right, and substitute in the right-hand side of the equation above it for the value 4, as follows:

\[\begin{split} 1 &= 5 - 1 \cdot 4 \\ &= 5 - 1 \cdot (9 - 1 \cdot 5) \\ &= 5 - 1 \cdot 9 + 1 \cdot 5 \\ &= - 1 \cdot 9 + 2 \cdot 5 \end{split}\]

Now the right-hand side is in terms of 9 and 5, instead of 5 and 4. Using the same technique, we can substitute the right-hand side of \(5 = 23 - 2 \cdot 9\) to get an equation in terms of 23 and 9 instead:

\[\begin{split} 1 &= -1 \cdot 9 + 2 \cdot 5 \\ &= -1 \cdot 9 + 2 \cdot (23 - 2 \cdot 9) \\ &= -1 \cdot 9 + 2 \cdot 23 - 4 \cdot 9 \\ &= -5 \cdot 9 + 2 \cdot 23 \end{split}\]

And now we have our desired solution: \(x = -5\) and \(y = 2\). Because we were looking for the multiplicative inverse of 9 modulo 23, \(-5\) is not exactly what we need: it’s not a member of the group. We simply take its equivalent modulo 23 to get our final answer, which is 18. You can confirm this: \(9 \cdot 18 = 162\), and \(162 \Mod 23 = 1\) (because \(23 \cdot 7 = 161\)).

Notice that we’ve relied on the fact that \(\gcd(a, n) = 1\). If that weren’t the case, the Euclidean algorithm would not have given us an equation with 1 by itself on one side, and we would have had nowhere to start this chain of substitutions. This is why only group members that are relatively prime to the modulus have an inverse.

In practice, implementations of the extended Euclidean algorithm don’t work backwards like this; they compute both the inverse and the GCD at the same time as they work forwards. This is how it is done, given modulus \(n\) and group element \(a\):

  1. Set \(t_0 \leftarrow 0\), and \(t_1 \leftarrow 1\).

  2. Write the equation \(n = q \cdot a + r\), where \(0 \leq r < a\).

  3. If \(r = 0\), terminate the algorithm: \(a\) is the GCD, and if \(a = 1\), then \(t_1\) is the inverse of \(a\) modulo \(n\). (If \(a \neq 1\), then the inverse does not exist.) Otherwise, continue.

  4. Compute \(t \leftarrow t_0 - q\cdot t_1\). Replace \(t_0\) with \(t_1\), and \(t_1\) with \(t\).

  5. Replace \(n\) with \(a\), and \(a\) with \(r\). Go back to step 2.

Galois Fields#

A field is similar to a group, but with two operations instead of just one. These two operations, whatever they actually are, are denoted as “addition” and “multiplication”. Both operations must be associative, and both must have an identity element. Every element must have an additive inverse, and every element except the additive identity must have a multiplicative inverse.

For example, the rational numbers under standard addition and multiplication form a field. The identity elements are 0 (for addition) and 1 (for multiplication). The additive inverse of \(\frac{a}{b}\) is \(-\frac{a}{b}\). The multiplicative inverse of \(\frac{a}{b}\) is \(\frac{b}{a}\), which is defined for every rational number except zero.

A Galois field is one where the set of elements is finite; they are also called finite fields. We’ve already seen an example of a Galois field: the integers modulo any prime \(p\), under modular addition and multiplication, form a Galois field. Such fields are called prime fields, and are denoted as \(\GF(p)\). In general, Galois fields only exist when the modulus is a prime, or a power of a prime.

Binary fields#

When the modulus of the Galois field is a power of two, it is called a binary field. These are used in cryptography, so we’ll look closely at how they operate.

The members of \(\GF(2^n)\) are sequences of \(n\) bits. The idea behind operations on them is to treat the bit sequence as if the bits were coefficients of a polynomial. For example, the bit sequence 110101 is taken to represent the polynomial \(x^5 + x^4 + x^2 + 1\) (recall that \(x^0 = 1\)). Note that \(x\) is just a placeholder that we use in doing algebra; it is never given a value.

Adding two of these polynomials together is similar to standard polynomial addition: you add the coefficients of like terms. However, this addition is performed in \(\GF(2)\) — i.e. it is addition modulo 2. You may recall that “addition modulo 2” is another name for XOR, and indeed, bitwise XOR of two \(n\)-bit sequences is addition of those sequences in \(\GF(2^n)\).

So, for example, let’s add the 4-bit sequences 1100 and 0101. These correspond to the polynomials \(x^3 + x^2\) and \(x^2 + 1\). Doing polynomial addition gives \(x^3 + (1 \oplus 1)x^2 + 1 = x^3 + 1\). Expressed as a bit sequence, \(x^3 + 1\) is 1001, and that is in fact the result of 1100 \(\oplus\) 0101.

Multiplication starts similarly to standard polynomial multiplication: you multiply out each term, then add (modulo 2) the coefficients of like terms. However, multiplying two polynomials with \(n\) terms can result in a polynomial with more than \(n\) terms, which would mean the result isn’t a member of the group. There must be a way to deal with that.

The answer is that the product of two polynomials must be reduced modulo a reducing polynomial. The reducing polynomial must be specified when a usage of Galois-field multiplication is specified1No matter what irreducible polynomial you choose, the resulting field will have the same underlying structure. However, the specific polynomial does affect the bit-sequence representation, so if that matters, the reducing polynomial must be specified., and its highest-order term must be \(x^n\). It must also be irreducible: it cannot be expressed as a product of two polynomials. The reduction process is analogous to reduction in modular arithmetic — just as polynomials can be multiplied, they can also be divided, and thus the remainders of division can be computed.

Let’s go through an example. We will multiply the 4-bit sequences 1011 and 1100, which corresponds to the polynomials \(x^3 + x + 1\) and \(x^3 + x^2\), and reduce the result modulo \(x^4 + x + 1\).

\[\begin{split} (x^3 + x + 1)(x^3 + x^2) &= (x^6 + x^5) + (x^4 + x^3) + (x^3 + x^2) \\ &= x^6 + x^5 + x^4 + x^2 \end{split}\]

Then, the result of reducing \(x^6 + x^5 + x^4 + x^2\) modulo \(x^4 + x + 1\) is \(x^3 + x^2 + 1\) — our final answer. The details of polynomial division aren’t important, so we won’t go into them; in this case, the.

Back to the bit-sequence representation, this means that 1011 multiplied by 1100 in \(\GF(2^4)\) is 1101. Note that this is different from the result you’d get from interpreting those bit sequences as integers and doing modular multiplication: 1011 is 11, 1100 is 12, the reducing polynomial is 19, and the final result is 13. However, \(11 \cdot 12 \Mod 19 = 18\). Using \(2^4\) as the modulus does not give the same result either: \(11 \cdot 12 \Mod 16 = 4\).

From this description, it is not at all obvious that this multiplication operation is invertible. We won’t prove it here, but it is indeed invertible, as long as the reducing polynomial is irreducible. They can be computed with an efficient algorithm called the Itoh-Tsujii algorithm[IT88].

Proof of RSA’s Correctness#

There are a lot of different ways to do this proof, but in my opinion, this is the easiest one to understand.

The end result that we’re aiming to prove is that, for \(p\) and \(q\) prime:

\[m^{ed} \equiv m \pmod{pq}, \textrm{where } ed \equiv 1 \pmod{(p-1)(q-1)}\]

To do so, we’ll prove the following statements. Throughout, assume that \(p\) and \(q\) are prime and not equal to each other, and that \(e\) and \(d\) are related as above. \(a\) and \(b\) are any integers.

  1. If \(p\) divides \(ab\) and \(p\) does not divide \(a\), then \(p\) divides \(b\).

    This is called Euclid’s Lemma. We’ll state it without proof here; it should be intuitive. The proof relies on the Fundamental Theorem of Arithmetic.

    It’s important to note that this is not always true if \(p\) is not prime. For example, 6 divides \(3 \times 4\) even though 6 divides neither 3 nor 4.

  2. If \(a = b\), and \(p\) divides \(a\), then \(p\) also divides \(b\).

    Again, this should be intuitive and we’ll state it without proof. (This also holds even if \(p\) is not prime, but we won’t use that fact here.)

  3. If \(xa \equiv xb \pmod{p}\) and \(x<p\), then \(a \equiv b \pmod{p}\). In other words, we can “cancel out” the common factor of \(x\) on both sides.

    By definition, \(xa - xb = kp\) for some integer \(k\). Rearranging the left-hand side, we have \(x(a-b) = kp\). Since \(p\) divides the right-hand side, it must also divide the left-hand side (by statement 2). Since \(x<p\), \(p\) cannot divide \(x\); therefore \(p\) must divide \(a-b\) (by statement 1). Therefore, \(a \equiv b \pmod {p}\).

  4. For \(a\) greater than zero and less than \(p\), if we reduce each member of the the sequence \(a, 2a, 3a, ..., (p-1)a\) modulo \(p\), the resulting sequence is a rearrangement of the sequence \(1, 2, 3, ..., p-1\).

    An example will help illustrate this. Let’s take \(a = 4\) and \(p = 11\). We write down the first sequence:

    \[4, 8, 12, 16, 20, 24, 28, 32, 36, 40\]

    Then reduce each one modulo 11 to get:

    \[4, 8, 1, 5, 9, 2, 6, 10, 3, 7\]

    Which is a rearrangement of the numbers 1 through 10.

    First, we prove that no member of the sequence \(a, 2a, ..., (p-1)a\) is congruent to zero modulo \(p\). By statement 1, if \(p\) divides \(k\cdot a\), then either \(p\) divides \(k\) or \(p\) divides \(a\). However, we have defined \(a\) as less than \(p\), and \(k\) is less than \(p\) for every member of this sequence, so this is not possible.

    Second, we prove that no two members of the reduced-modulo-\(p\) sequence are equal. We will prove by contradiction. Suppose there are two members, \(xa\) and \(ya\), that are not equal but are congruent modulo \(p\). This means that \(kp = xa - ya = a(x - y)\) for some \(k\), and for \(x\) and \(y\) greater than 0 and less than \(p\).

    By statement 2, since \(p\) divides \(kp\), it must also divide \(a(x-y)\). \(p\) is greater than \(a\), so \(p\) does not divide \(a\). Therefore, by statement 1, it must divide \(x-y\).

    Since \(x\) and \(y\) are both positive and less than \(p\), \(x-y\) must be greater than \(-p\) and less than \(p\). Therefore, the only way for \(p\) to divide \(x-y\) is if \(x-y = 0\), or in other words that \(x=y\). Therefore \(xa\) and \(ya\) are equal, contradicting our assumption.

    Therefore, all \(p-1\) members of the reduced-modulo-\(p\) sequence are greater than zero and less than \(p\), and are all distinct. Therefore, they must be a rearrangement of the sequence \(1, 2, 3, ..., p-1\).

  5. \(a^{p-1} \equiv 1 \pmod{p}\) for any \(a\) other than zero. This is known as Fermat’s Little Theorem2Named for the mathematician who came up with it in 1640, it’s probably called “little” to contrast it with Fermat’s Last Theorem, an infamous math problem that went unsolved for hundreds of years despite mathematicians’ best efforts..

    Consider multiplying together all elements of the sequence \(a, 2a, 3a, ..., (p-1)a\). By statement 4, we can express this as:

    \[a \cdot 2a \cdot 3a \cdots (p-1)a \equiv 1 \cdot 2 \cdot 3 \cdots (p-1) \pmod{p}\]

    Rearranging the left-hand side, we have:

    \[a^{p-1}\cdot(1\cdot 2 \cdots (p-1)) \equiv 1 \cdot 2 \cdots (p-1) \pmod{p}\]

    Then, by statement 3, we can cancel out everything except the \(a^{p-1}\) factor to get: $\(a^{p-1} \equiv 1 \pmod{p}\)$

  6. If \(ed \equiv 1 \pmod{(p-1)(q-1)}\), then for any \(a\): \(a^{ed} \equiv a \pmod{p}\) and \(a^{ed} \equiv a \pmod{q}\).

    We will prove this for the case of \(p\) only. The proof for \(q\) is identical in structure.

    First, we have that \(ed \equiv 1 \pmod{(p-1)(q-1)}\), and therefore that \(ed - 1 = k(p-1)(q-1)\) for some integer \(k\), and therefore that \(ed = k(p-1)(q-1) + 1\). Then we substitute into \(a^{ed}\) and simplify:

    \[a^{ed} = a^{k(p-1)(q-1) + 1} = a\cdot a^{k(p-1)(q-1)} = a\cdot (a^{p-1})^{k(q-1)}\]

    Statement 5 states that \(a^{p-1} \equiv 1 \pmod{p}\) for \(a \neq 0\). We then use this fact in the above equation:

    \[\begin{split} a^{ed} &\equiv a\cdot (a^{p-1})^{k(q-1)} \pmod{p} \\ &\equiv a\cdot 1^{k(q-1)} \pmod{p} \\ &\equiv a\cdot 1 = a \pmod{p} \end{split}\]

    For the case of \(a = 0\), the congruence \(0^{ed} \equiv 0 \pmod{p}\) is obviously true.

  7. If \(a \equiv b \pmod{p}\) and \(a \equiv b \pmod{q}\), then \(a \equiv b \pmod{pq}\).

    We have that \(a \equiv b \pmod{p}\) and \(a \equiv b \pmod{q}\), which by definition means that \(a-b = xp\) and \(a-b = yq\) for some integers \(x\) and \(y\). Therefore, \(xp = yq\). Since \(p\) divides the left-hand side, it must also divide the right-hand side (by statement 1). Since \(p\) and \(q\) are distinct primes, \(p\) does not divide \(q\), and thus it must divide \(y\) (by statement 2). Therefore, \(xp = y'pq\).

    Substituting back into \(a-b = xp\), we get \(a-b = y'pq\). Since \(pq\) divides the right-hand side, it must also divide the left-hand side. Therefore, \(a-b\) is divisible by \(pq\), and \(a \equiv b \pmod{pq}\).

It follows from statements 6 and 7 that \(m^{ed} \equiv m \pmod{pq}\), for any \(m\). That completes the proof of RSA’s correctness.

A side note on the number \((p-1)(q-1)\), which is the modulus under which \(d\) is computed as the inverse of \(e\). Look carefully at the proof of statement 6: it does not rest on the modulus being exactly equal to \((p-1)(q-1)\). All it really requires is that the modulus be divisible by both \(p-1\) and \(q-1\). So the modulus can be, instead, the least common multiple of \(p-1\) and \(q-1\). This can be computed as:

\[\mathrm{lcm}(p-1, q-1) = \frac{(p-1)(q-1)}{\gcd(p-1,q-1)}\]

Let’s redo the example from the first section, using this method instead. We’ll use the same \(p\), \(q\), and \(e\): 97, 47, and 17, respectively (\(n=4559\)). Now we compute the inverse of 17 modulo \(\mathrm{lcm}(96, 46) = 4416 / 2 = 2208\). By the extended Euclidean algorithm, we get \(d = 1169\). The ciphertext was 3707; to decrypt it, we compute \(3707^{1169} \Mod 4559 = 3048\), which again is the original message.