Sunday, 24 March 2019

RSA Encryption

Well, let us start of with a quick bit of revision using another diagram from our lecture notes. This is the basic concept of symmetric key cryptography, namely where we use the same key to encrypt, and decrypt, the message.

Now, the major problem with this form of cryptography is basically getting the key from one place to another. Like, you can't actually send the key with the message because that would sort of defeat the purpose of encrypting the message. However, you also need to send the key in a way that it gets there at the same time, or faster than, the original message. Sending an encrypted message by email, and the key by airmail, sort of defeats the entire purpose of email - you might as well send the message by pony express.

So, we have another way of doing it, and that involves generating two keys - a public and private key. The public key is basically shared to the entire world, while the private key is retained by the person who generated the key. The message is then encoded using the public key, sent to the person holding the private key who then decrypts the message. Sounds simple enough, but we are actually going to go over the maths that is used to do this to see how it works. Anyway, another image from the lecture notes to helpfully explain this:


First, the Maths

So, to be able to understand, and actually do some of these, we need to know about some mathematical functions (who would have thought that maths would actually be useful, but then again a maths degree is one of those degrees that has the capacity to open up an awful lot of doors). So, let us go through some definitions:

GCD: Greatest common divisor. This is basically the largest number that can divide both of the other numbers. For instance, the greatest common divisor of 18 and 30 is 6. The other way of writing this is GCD(18,30) = 6.

Prime Number: I hope you know what a prime number is, though if you are reading this post I assume that you do. However, just to clarify that we are talking about the same thing, a prime number is a number that is divisible only by 1 and itself.

LCM: Least common multiple. This is the lowest number that is divisible by the two numbers that we want the least common multiple. For instance, the least common multiple of 2 and 4 is 8. The other way of writing this is LCM(2,4) = 8.

Co-Prime: Two numbers, or more specifically integers, are co-prime if they do not share any common divisors. Basically, the greatest common divisor of both numbers happens to be 1 or to put it another way: GCD(a,b) = 1.

Modular Arithmetic: Now this is where it starts to mess with your head. Basically modular arithmetic involves getting the remainder if you divide two numbers. Then there is the inverse mod (or mod-1). Look, it is possible to work out the inverse mod on paper using something called the Euclidean algorithm, and the same goes for the normal mod, but since this involves some pretty large numbers (which is one of the reasons it happens to be secure), we basically use an online calculator. Actually, it turns out that Wolfram Alpha happens to be the best.

However, our lecturer did give us a way of sort of working out an inverse modulus problem - though it sort of works through intuition. So, we have d*7 = 1 mod 20. So, what we need is to know is that d*7/20 = something that happens to have a remainder of 1. Well, that is pretty easy to work out because we know that 21/20 gives us a remainder of 1 (or 41/20, or 61/20) so then d*7 = 21, which means that d happens to be 3. While that might work in principle, if we have a number like 8784625620317589042067872136735782638941256752940984658 (which is basically a bunch of random digits that I banged into my keyboard) using this intuitive method sort of hits a brick wall. Which is why we then go to Wolfram Alpha.

RSA Encryption

So, I should point out that most of these examples will be done using numbers. However, in reality the message would actually be a message as opposed to some numbers. It is just that you need to convert the message into a numerical value because that is how you employ the encryption mechanism. However, so that we don't have to do that, since each and every letter would be encrypted this way, we will just work on the principle that the number represents something. Oh, and we should also remember that when it comes down to it, computers basically work on 0s and 1s.

So, the first thing we need to do is to generate the public and private keys. This starts off with the server picking two prime numbers which are represented by p and q. Here, p = 3 and q = 11. Then we calculate n, which is p*q, and in this instance n=33. Then we calculate φ(n) which is (p-1)*(q-1). In this instance φ(n) = 20. Now we will determine e, which needs to be co-prime with φ(n), that is GCD(e,φ(n)) = 1. While we have several choices, we will pick 7, so e=7. Now we have the public and private keys. The public key is n=33 and e=7, and the private key is worked out as follows: de = 1 mod 20. We worked that out above where d=3. That is the private key and is retained by the server.

I'll do that so it is easier to read:

p=3
q=11
n=p*q = 3*11 = 33.
φ(n) = (p-1)*(q-1) = 2*10 = 20.
GCD(e,φ(n)) = 1 thus e = 7
de = 1 mod φ(n) = d*7 = 1 mod 20 = 3
M = 14 (which is the message).

Now, the cryptographic function is C = Me mod n and M = Cd mod n. So, with the above example:

C = 147 mod 33 =20 so our encrypted message is 20.

Let us decrypt it:

M = Cd mod n  = 203 mod 33 = 14.

So, we have the number which we started with, but in doing so we used two different keys to get to that place. This is the beauty of this form of encryption, namely that you don't need to pass the key on to another person.

However, there is a flaw. Say we have the public key n and e, and we also have the encrypted message C. Can we work out the private key? Well, we sort of can. First of all p and q are both prime numbers, which means that if we have n, then we can pretty quickly work out p and q through a system known as prime factorisation. The thing with that is that all numbers are made up of the product of two prime numbers, so if we have one, we also have the other. Also, once we have p and q, we can pretty quickly determine the rest of the algorithm, and then proceed decrypt the entire message.

There is a way to defeat that, which basically involves much more complicated mathematics, however we will look at that next time. Well, there is another way of making it hard, and that is by using ridiculously huge numbers, but then again that is a no brainer. As for where it happens to be used - well, it is pretty much used everywhere, or at least some version of it is used pretty much everywhere. For instance, when you go to a website, these algorithms are flying about across the internet, confirm your identity, and validating the website that you are visiting.

El-Gamal

Actually, since the next two versions of RSA encryption are also just a bunch of mathematical formulas, I'll just go over them here as well. Basically they are El-Gamal (named after the guy who came up with it) and Pallier (also named after the guy who came up with it). However, as we previously saw, the strength of RSA is the ability to factorise the prime numbers. El-Gamal is different in that cracking it requires the use, or rather computation, of logarithmic functions.

So, what we do first is that we select a prime number p and a primitive root g. We also select a private key x. We then generate the public key y by using gx mod p. The public key is then sent to the entire world to use. The world then picks a random number r and generates k through yr mod p. The world then generates C1 and C2 through C1 = gr mod p and C2 = M*k mod p. C1 and C2 are then sent back to the holder of the private key. Now, k is calculated as follows: k = C1x mod p, or rather the modular multiplicative inverse, so it is actually k-1 = k-1 mod p. The encrypted message is then decrypted as follows M = k-1 C2 mod p.

Let us look at it in a more readable way. So, first p, g, and x are selected:

p = 139
g = 3
x = 12

Then we calculate y:

y = gx mod p = 312 mod 139 = 44.

The public key is then sent to the entire world. Then r is selected:

r = 52.

Then we generate k:

k = yr mod p = 4452 mod 139 = 112

C1 = gr mod p  = 352 mod139 = 38

Oh, and we need a message:

M = 100.

C2 = M*k mod p = 100 * 112 mod 139 = 80.

The message is then sent back to the holder of the private key:

k = C1x mod p = k = 3812 mod 139 = 112

k-1 = 112-1 mod 139 = 36

M = k-1 C2 mod p = 36*80 mod 139 = 100.

Clear as mud? The thing is that this is only one variety, you also have a couple of others, which are even more complicated. The thing with complexity is that it makes it hard to crack, however the other problem is that the encrypted message ends up being twice the size of the original message.

Let's move on to Pallier.

Pallier

So, we have a plain text message M, being 42. Then we select two large prime numbers (which won't be all that large in his example) p and q, randomly and independent except that GCD (p*q, (p-1)*(q-1)) = 1. n is then calculated as p*q. g is then selected in that g is in fact a multiple of n. The public key n & g is done with what you normally do to public keys. Next we calculate 入 which is LCM(p-1, q-1).  Next we calculate k, which is L(g mod n2) where L(u) = (u-1)/n. Finally we generate μ as k-1 mod n. 入 & μ become the private keys. Now that we have the keys we can encrypt and decrypt messages. To encrypt the message we use c = gm * rn mod n2 and to decrypt it we use (c mod n2) μ mod n. Oh, and r is a random number. Let us put this in action:

m=42
p = 7,
q=11
r=23 ( a random number
gcd (p*q,(p-1)*(q-1)) = 1 thus gcd (77,60) = 1.
n = pq = 7*11 = 77.
g=5652
入 = lcm (p-1,q-1) = lcm(6,10) = 30
k = L(g mod n2) = L(565230 mod 772) = 3928
k = L(3928) = 3927/77 = 51
μ = k-1 mod n = 51-1 mod 77 = 74.

Now that we have the public and private keys, it is time to do some encrypting and decrypting.

So:

c = gm * rn mod n2 = 565242 * 2377 mod 772 = 4624

m = (c mod n2) μ mod n = (462430 mod 772) 74 mod 77 = L(4852) 74 mod 77

L(4852) = 4851/77 = 63

4662 mod 77 = 42

And that is it. Next we will look at ways these algorthims help our modern, digitised, world.


Creative Commons License

RSA Encryption by David Alfred Sarkies is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This license only applies to the text and any image that is within the public domain. Any images or videos that are the subject of copyright are not covered by this license. Use of these images are for illustrative purposes only are are not intended to assert ownership. If you wish to use this work commercially please feel free to contact me

No comments:

Post a Comment