Most people in computing-related fields will have heard of public key encryption. The goal of this article is to shed some light on how it works and how it is used. Most of the focus will be on providing explanations and examples that do not involve the mathematics of asymmetric cryptography.
Terminology
First we need to quickly define some terms that might be new to you.
- Encryption: the act of taking a readable piece of text and transforming it into a scrambled, unreadable text. This operation is always reversible by someone who holds the right key.
- Decryption: the act of taking a previously encrypted text and transforming it back into its original readable form.
- Plain Text: text that is readable by either humans or computers and not encrypted in any way.
- Cipher Text: the output of some encryption algorithm. This text is (if the encryption is good) not readable by anyone or anything that lacks the right key.
- Key: an object used to open a lock. In cryptography, “keys” are used to transform plain text to cipher text and cipher text to plain text.
- Public Key: a very large integer that is paired mathematically with a unique private key.
- Private Key: a very large integer that is paired mathematically with a unique public key.
What is Asymmetric Encryption?
Imagine a box with a lock on it, which we will call a “lock box”. A typical lock box may have a number of identical keys, any of which can lock or unlock the box. This is how typical, or “symmetric”, encryption works: a single key is used both to transform plain text into cipher text and to transform it back.
In asymmetric encryption, you have to imagine two different keys and a single lock box. One key can only lock the lock box (we will call this our “public” key), while the other key can only unlock it (we will call this our “private” key). To receive a message, we first give the public key to the person we wish to hear from; we will call this person Alice. Now suppose Alice wishes to send us a message. She composes her message, places it in the lock box, and shuts and locks the box with the public key we gave her. The lock box is then sent to us, and we open it with our private key.
What makes this work is that the public key, which in theory could be given to anyone, can only lock the lock box. Any message placed in the lock box can therefore only be read by someone holding the matching private key. If we wish to send a message to Alice, we need her public key and her lock box, and the same process occurs in the other direction.
Details and the Devil
Unfortunately, no exact physical representation of asymmetric encryption exists (to my knowledge). A real asymmetric encryption system uses only a public and private key pair; there are no separate lock boxes. Furthermore, the astute observer will notice that a single public-private key pair was used to communicate in only one direction. This is due to some properties of asymmetric cryptography.
Public Key Encryption
Suppose Alice and Bob wish to use asymmetric cryptography. They each have a public and private key, and they have both exchanged their public keys. Each can do the following:
- encrypt a message with the other’s public key
- encrypt a message with their private key
The previous example used the first case, where Alice used our public key to send us a message. This works because only our private key will decrypt the message, so only we, the recipient, can read it. However, if we wanted to send a message back to Alice such that no one else could read it, we could not use our private key to encrypt it, as anyone with our public key could decrypt and read the message; we would have to use Alice’s public key instead. We must always assume that public keys are available to anyone and everyone (indeed we want this to be the case, as we will see later on), and that private keys are kept absolutely secret and are never distributed.
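The two cases above can be sketched with textbook RSA, one common asymmetric scheme. This is only an illustration under stated assumptions: the article does not name an algorithm, the primes 61 and 53 are classroom-sized toy values, and the helper name `apply_key` is made up for this example. Real systems use far larger keys and add padding.

```python
# Toy RSA key pair built from classroom-sized primes (an assumption for
# illustration; real keys are hundreds of digits long and use padding).
P, Q = 61, 53
N = P * Q   # modulus, part of both the public and the private key
E = 17      # public exponent: (E, N) is the public key
D = 2753    # private exponent: (D, N) is the private key

# The two exponents are paired: E * D leaves remainder 1 modulo (P-1)*(Q-1).
assert (E * D) % ((P - 1) * (Q - 1)) == 1


def apply_key(value: int, exponent: int) -> int:
    """Transform a number with one half of the key pair."""
    return pow(value, exponent, N)


message = 65  # a message must be encoded as a number smaller than N

# Case 1: encrypt with the public key; the private key reverses it.
secret = apply_key(message, E)
recovered_by_owner = apply_key(secret, D)

# Case 2: encrypt with the private key; anyone holding the public key
# can reverse it, so this direction keeps nothing secret.
not_secret = apply_key(message, D)
recovered_by_anyone = apply_key(not_secret, E)

print(secret, recovered_by_owner, recovered_by_anyone)
```

Both directions recover the original number; what differs is who is able to do the recovering, which is why only the first case is suitable for secret messages.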
Public Key Signing
This property of anyone being able to decrypt a message that we encrypt with our private key is actually useful, just not for exchanging secret messages. It can be used to verify that a message could only have been sent by a specific individual. For example, suppose we wish to send a message to Alice (perhaps our physical mail address) but we cannot be sure that Mallory will not intercept the message and swap our address for his. Assuming that Alice has our public key, we can simply encrypt our mail address with our private key and then send it to Alice. She will then be able to decrypt it with our public key and get our mail address.
And what about Mallory? Suppose he finds a way of intercepting our encrypted mail address message (and don’t forget, he can decrypt it, since we assume he has access to our public key just as Alice does), replaces it with his own mail address, encrypts that with his own private key, and then forwards it to Alice so that she thinks it came from us. Alice, believing that the message came from us, will attempt to decrypt it with our public key. This will fail and result in garbage, since our public key will not decrypt Mallory’s message. Alice will then know that the message did not originate from us and will not trust the mail address that she received.
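The Mallory scenario can be sketched in a few lines, again with unpadded textbook RSA as an assumed stand-in for whatever asymmetric scheme is in use. The key pairs are built from small Mersenne primes purely because they are easy to write down; the addresses and the helper names (`make_keypair`, `to_int`, `to_bytes`) are hypothetical. None of this is secure.

```python
# Toy "encrypt with the private key" demo using unpadded textbook RSA.
E = 65537  # public exponent shared by both toy key pairs


def make_keypair(p: int, q: int):
    """Return (modulus, private exponent) for the primes p and q."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return n, d


def to_int(text: bytes) -> int:
    return int.from_bytes(text, "big")


def to_bytes(value: int) -> bytes:
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


our_n, our_d = make_keypair(2**89 - 1, 2**127 - 1)
mallory_n, mallory_d = make_keypair(2**61 - 1, 2**107 - 1)

# We encrypt our address with our private key and send it to Alice.
from_us = pow(to_int(b"12 Elm Street"), our_d, our_n)
# Alice decrypts with our public key and recovers the address.
alice_reads = to_bytes(pow(from_us, E, our_n))

# Mallory substitutes his address, encrypted with his own private key.
forged_text = b"66 Evil Road"
from_mallory = pow(to_int(forged_text), mallory_d, mallory_n)
# Alice still decrypts with our public key, so she does not get his text.
alice_reads_forgery = to_bytes(pow(from_mallory, E, our_n))

print(alice_reads)
print(alice_reads_forgery == forged_text)
```

In this sketch Alice has to notice by eye that the second result is garbage; hashing the message first, as described later in the article, turns that into a mechanical comparison.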
In Real Life
Asymmetric encryption is rarely (if ever) used to encrypt or sign whole messages; the computational cost is too high. For encryption, the sender generates a symmetric key (remember that symmetric encryption uses a single key for both encrypting and decrypting, much like a physical key and lock) and uses it to encrypt the message. This key is sometimes called a “session key”, since it is usually valid only for the immediate message being exchanged between two individuals. The session key is then encrypted with the public key of the recipient and sent to the recipient along with the encrypted message. The recipient decrypts the session key with their private key and can then decrypt the message. (For interactive communications, these session keys are almost never sent between two parties over the wire; protocols like Diffie-Hellman key exchange perform this function in a more secure fashion.) Symmetric encryption is used for the message itself because it is very fast compared with asymmetric encryption, and messages are typically fairly large (typical “large” keys are 1024 bits, or 128 bytes, whereas messages are typically greater than 512 bytes).
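The session-key flow described above might look roughly like the following sketch. Everything specific here is an assumption made for illustration: the recipient's key pair is toy textbook RSA built from two Mersenne primes, and `symmetric_xor` is a made-up stand-in for a real symmetric cipher (it XORs the data with a SHA-256 based keystream). It shows the shape of the exchange, not a secure implementation.

```python
import hashlib
import secrets

# Recipient's toy RSA key pair (hypothetical, insecure values).
E = 65537
P, Q = 2**89 - 1, 2**127 - 1
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+


def symmetric_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.

    The same call encrypts and decrypts, mirroring a single shared key.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)


# Sender: pick a fresh session key, encrypt the (large) message with it,
# then encrypt the small session key with the recipient's public key.
message = b"Meet me at the usual place at noon. " * 20
session_key = secrets.token_bytes(16)
encrypted_message = symmetric_xor(session_key, message)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)

# Recipient: recover the session key with the private key, then decrypt.
unwrapped_key = pow(wrapped_key, D, N).to_bytes(16, "big")
decrypted_message = symmetric_xor(unwrapped_key, encrypted_message)

print(decrypted_message == message)
```

Only the 16-byte session key passes through the slow asymmetric operation; the 720-byte message is handled entirely by the fast symmetric one.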
For signing, a hash of the message being sent is encrypted (or “signed”) with the private key of the sender. A hash is a one-way function that maps a block of data (the message in this case) to a fixed-size numeric value that is, for practical purposes, unique to that data. The message and signed hash are then sent to the recipient, who re-hashes the message and compares the result to the decrypted hash (the signature). If they match, the message was signed by the holder of the sender’s private key and has not been altered since.
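A minimal sketch of hash-then-sign, under the same assumptions as before: toy textbook RSA with insecure Mersenne-prime keys, SHA-256 as the hash (truncated to 16 bytes only so that it fits below the toy modulus), and hypothetical function names and messages. Real signature schemes add padding and use much larger keys.

```python
import hashlib

# Sender's toy RSA key pair (hypothetical, insecure values).
E = 65537
P, Q = 2**89 - 1, 2**127 - 1
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse, needs Python 3.8+


def digest_as_int(message: bytes) -> int:
    """Hash the message and keep 16 bytes so the value fits below N."""
    return int.from_bytes(hashlib.sha256(message).digest()[:16], "big")


def sign(message: bytes) -> int:
    """Encrypt the message hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


original = b"Please deliver to 12 Elm Street"
signature = sign(original)

genuine_ok = verify(original, signature)
tampered_ok = verify(b"Please deliver to 66 Evil Road", signature)

print(genuine_ok, tampered_ok)
```

The message itself travels in the clear here; the signature only lets the recipient check who signed it and that it was not changed along the way.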
Problems
There are a few problems with asymmetric encryption. The most glaring one is this: how does an individual know that the public key they have actually belongs to who they think it does? This seems trivial; one just has to be given it by the owner of the public key. However, in real life we may not know the recipient of a message, or have had any contact with them, before we wish to send them a message. This is where Public Key Infrastructure (PKI) comes into the discussion: a system that provides a mechanism for the distribution and storage of public keys. Key exchange is a complicated problem (and far, far outside the scope of this article), and at this time there exists no fool-proof scheme that is convenient enough for the general public to use.
Another major hurdle to adoption for this type of encryption and signing was hinted at earlier: there is no good physical representation of asymmetric encryption, and so it is difficult for individuals to grasp how it actually works and how to use it correctly.
But perhaps the biggest problem with this (and many other forms of encryption) is that people don’t feel that their email to grandma is important enough to warrant the added complexity encryption brings. This view is changing slowly; until fairly recently the whole idea of “encrypted” web sites (such as your bank’s web site for example) was very abstract for the general population.
Conclusions
Asymmetric encryption serves two primary purposes: to encrypt messages and to sign messages. For bi-directional communications, each participant must have their own public and private keys, and must have the public key of every participant that they wish to communicate with. In practice, messages are rarely (if ever) encrypted directly with asymmetric keys; a symmetric (session) key is used instead, and only that key is protected asymmetrically. Hopefully this article provides a bit of clarity on how asymmetric encryption technology can be used to aid in secure and trusted communication.
>Unfortunately, no exact physical representation of asymmetric encryption exists
A good analogy is that Bob gives an opened padlock to Alice, who puts the message in a box and then locks it with Bob’s open padlock. She then sends the box to Bob, who opens it using the key/combination to the padlock. Having the padlock doesn’t give Alice any special knowledge of the key, assuming she can’t open and disassemble the padlock, of course. Even if she did try to disassemble it, working out the key should be difficult (a bit like the integer factorisation Alice would face if she tried to recover Bob’s private key).