# Cryptography Notes

1. PLAINTEXT is the unencrypted data. It's called plainTEXT even if it's really binary data. The name is a throwback to the days when handwritten messages were encrypted manually.

2. CIPHERTEXT is the encrypted data. Ideally the ciphertext would be statistically indistinguishable from random noise no matter how regular the plaintext was. The "randomness" of the ciphertext is a measure of the quality of the encryption algorithm. Be aware: humans are generally lousy at evaluating "randomness." Just because it looks like gibberish to you doesn't mean it's random. You need to apply appropriate statistical tests.

3. When analyzing cryptographic systems one always adopts an ATTACKER MODEL where the attacker is assumed to have full knowledge of all algorithms and methods used. The attacker is assumed to not know any secret keys or similar data items.

4. SYMMETRIC ENCRYPTION algorithms use the same key for encryption as for decryption. When using such an algorithm for secure communication a major problem is: how do Alice and Bob share the key? You need a secure channel for the key material, and if you have such a channel why not just use it for the data itself?

Secure channels tend to be low bandwidth and intermittent (think about Alice and Bob meeting in a Cafe to exchange keys). Thus the value of encryption is that it lets you use a reliable, high bandwidth, and probably also very public channel for the bulk communication.

5. Block ciphers like DES operate on a block of data at a time... never more or less. Typical block sizes are 64 bits or 128 bits but some algorithms use other sizes.

6. The key size is an important parameter. Small keys can be attacked by brute force: just try them all. This assumes you can recognize the plaintext when you see it (normally not a big problem but sometimes it's an issue). Notice that brute force can be used against any encryption algorithm; if the key is too small it doesn't matter how clever the algorithm is.

7. Good algorithms have long enough keys to be "computational infeasible" to brute force. What that means exactly depends on several factors:

• The resources of the attacker. Very resourceful attackers with lots of time/money to spend can feasibly do more than your 10 year old sister.
• The value of the protected data. If it costs \$10,000,000 to build a machine to crack your encryption and the data you encrypted is only worth \$20, who is the real loser here?
• The lifetime of the protected data. If it takes a month to crack the encryption of (for example) battlefield orders that are only meaningful for 24 hours, the data is secure enough.

The parameters above are interrelated. Spending more money will likely reduce the time needed to crack the encryption (more money buys faster hardware) but there is no point in spending more money than the data is worth. Even a weak algorithm is fine if the data it is protecting is relatively worthless or very short lived (or both).

That said, strong encryption algorithms are readily available so there's no reason not to use one.

8. Some algorithms have mathematical weaknesses that make them easier to attack than brute force. However, this is not a black and white issue. Often such attacks are still quite difficult or even infeasible. If you are worried about mathematical weaknesses you can try increasing the key size. Often (not always) that makes such attacks harder. Thus there are reasons for using very long keys even if such keys would be far beyond the reach of brute force.

9. Good encryption algorithms use the principles of "diffusion" and "confusion."

DIFFUSION means that changing one bit of plaintext will impact a large amount of cipher text. I use "impact" instead of "change" because changing bits is actually very regular since a bit is either 1 or 0. With a good encryption algorithm changing one bit of plaintext "randomizes" a large amount of ciphertext (the entire block).

CONFUSION means the transformation implemented by the encryption is highly non-linear. If the transformation were linear it would be possible to represent each bit of ciphertext as a key dependent linear combination of the plaintext bits (using modulo 2 arithmetic). This isn't great because if that kind of relationship existed it would be relatively easy to algebraically invert it.

10. Most encryption algorithms use multiple "rounds." They transform the key to produce multiple "subkeys" (also called "round keys"). A typical algorithm works by doing a basic operation on the input using one of the subkeys. That operation is then repeated multiple times (rounds) using the various subkeys. Generally the strength of the algorithm increases as the number of rounds increases. Many algorithms have been shown to be vulnerable in "reduced round variants." However, doing lots of rounds makes the algorithm slower.

For efficiency reasons you only want the minimum security you can get away with!

11. The ONE TIME PAD encryption algorithm entails just XORing a "key" as long as the message with the message itself. It is critical that a) the pad remain secure, b) the pad be used only once, and c) the pad be generated from a true source of random numbers. If these conditions are met it is theoretically impossible to crack the One Time Pad. The reason is that all possible messages are potential decryptions so the attacker has no way to know which one is the true message.

12. A SUBSTITUTION CIPHER entails just mapping each unit of plaintext to a different unit of cipher text. Typically a "unit" is a byte. If the unit is small the algorithm is very weak because it doesn't sufficiently disrupt the statistical properties of the plaintext. If the unit of substitution is larger the plaintext statistics tend to be flatter and the method is safer. For example ECB mode of a normal block cipher is a kind of substitution over block-sized units. The One Time Pad could be regarded as a substitution where the entire message is exactly one unit (and thus no statistics can be obtained for analysis).

13. An encryption algorithm can be regarded as a key-dependent permutation of all possible blocks. For example, for a given key the plaintext block 0x0000000000000000 might map to the ciphertext block 0x4F729AAB39382A4F. The mapping is one-to-one and onto (a permutation) because a) every possible plain text must be encryptable to something, and b) every possible ciphertext must be uniquely decryptable.

There are a hug number of possible permutations. For 64 bit blocks the number is (2^64)!. This is vastly larger than the number of possible keys meaning that for any given algorithm the key only gives you access to a tiny fraction of possible block permutations. This is why it is normally no problem to recognize the plaintext when you do a brute force attack; there is probably only one key that produces anything resembling reasonable plaintext. In contrast the One Time Pad "algorithm" gives you access to all of the permutations making it impossible to distinguish the correct plaintext from irrelevant plaintext.

14. In addition to the various encryption algorithms there are also various encryption MODES. Each mode represents a different way to use an encryption algorithm and they differ in their security, support for random read/write, and error propagation. Examples include electronic codebook (ECB), cipher block chaining (CBC), cipher feedback (CFB), output feedback (OFB), counter mode (CTC), etc.

15. In PUBLIC KEY CRYPTOGRAPHY each user has a private key and a corresponding public key. It is computationally feasible to generate the keys together but not feasible to compute the private key given knowledge of the public key. This allows the public key to distributed freely. Anyone can use it to encrypt a message that only the owner of the private key can decrypt.

A major issue with public key cryptography is ensuring that you have the correct public key. An attacker can post a public key claiming to belong to someone else; any message encrypted with it can then be read by the attacker who actually has the corresponding private key. In real life this problem is addressed by "certifying" public keys... having them digitally signed by a "certificate authority" who is trusted to only endorse (sign) public keys that are truly controlled by the entity to which they are associated.

16. The key sizes of public key algorithms can't be compared with the key sizes of symmetric algorithms. Public key cryptography relies on "hard" mathematical problems and the keys need to be large enough to make solving the underlying problem infeasible. How large this is depends on the complexity of the underlying problem.

17. Typically public key algorithms have much larger keys than symmetric algorithms and require far more computational effort to encrypt/decrypt. Thus they are much slower and more difficult to run on resource constrained systems. As a result most uses of public key cryptography entail one principal generating a random "session key" for a suitable symmetric algorithm and then using the public key of the recipient to encrypt just the session key. The bulk data communicated is then encrypted with the session key. This uses the public key algorithm on a small object (the session key) and lets the large data set be handled by the much faster symmetric algorithm.

However, this approach does provide three points of attack: 1) The public key algorithm could be attacked in an effort to recover the session key. 2) The symmetric key algorithm could be attacked in an effort to decrypt the data without the session key. 3) The random number generator used to create the session key could be attacked in an effort to predict the session key closely enough to make guessing it feasible (it is not necessary to predict the session key exactly). This last attack vector is often overlooked.

18. A HASH FUNCTION is a one-way function that takes a message of arbitrary size and computes a fixed size (usually smaller) value that represents the message. The function is one-way in that it is easy to compute the hash of a message but infeasible to find a message that has a given hash. This property is also sometimes called "pre-image resistence." Notice that because the message is large and the hash value is usually smaller it is normally the case that a large number of messages do, in fact, hash to the same value. Pre-image resistence doesn't mean there is no pre-image (in fact there normally a huge number of pre-images), only that finding one is computationally infeasible.

19. Hash functions used for security applications should also have two other properties.

• COLLISION RESISTANCE. Given a message M it should be computationally infeasible to find another message M' such that H(M) = H(M') where H is the hash function.

• STRONG COLLISION RESISTANCE. It should be computationally infeasible to find two messages M and M' such that H(M) = H(M').

Strong collision resistance is a stronger property. The attacker is free to choose both messages. This is in contrast with collision resistance where one of the messages is fixed and the attacker must try to match it in the sense of finding another message with the same hash.

20. Strong collision resistence is important because of the BIRTHDAY ATTACK. In this attack the attacker prepares two messages; one desirable to the victim and one not. The attacker than identifies a number of independent but semantically insignificant changes to both documents (extra white space, sentence rewordings, etc). The attacker computes a table of hash values made from the first message while ranging over every combination of insignificant change. The attacker then computes the hash of the second document while trying every combination of insignificant change there as well. For each hash of the second document the attacker tries to match it against one of the hashes in the previously computed table. If a match is found the attacker will have succeeded in finding two documents with the same hash. This means that a digital signature of one will verify correctly when applied to the other.

This attack can be surprisingly effective because each hash of the second document is compared against all the hashes of the first. Because of a statistical quirk called the "birthday surprise," the probability of finding a match is higher than one might expect. Specificially, if a hash function produces a 64 bit hash, it is only necessary to use 232 variations (the square root of 264) of each document to create a high chance of finding a match. For this reason hash values need to be about twice the size you might otherwise expect. An 128 bit hash value "only" entails about 264 amount of work to break strong collision resistance. Note that this attack is independent of the hash algorithm.

21. A MESSAGE AUTHENTICATION CODE (MAC) is a hash value that was computed with the help of a secret key. Only people in possession of the key can make or verify MACs. A MAC thus provides an authentication and data integrity service. If the MAC checks the sender must have the right key and furthermore the message must not have been modified.

The difference between MACs and digital signatures (below) is that MACs use symmetric key cryptography whereas digital signatures use public key cryptography. The usual pros and cons apply: MAC computation is faster, but it requires the communicating parties to previously exchange secret key material.

22. Most public key encryption systems also allow for the possibility of using the private key to make a DIGITAL SIGNATURE. In the case of the RSA algorithm the signature is made by just encrypting with the private key. However, other public key algorithms use different approaches.

23. The normal way to make a digital signature entails computing a hash of the document or message to be signed and then applying the digital signature algorithm to the hash. This avoids processing a potentially large message with the slow public key algorithm. It also allows the message itself to be sent or saved "in the clear" as would be appropriate for public information.

24. Digital signatures provide three security services.

• Data Integrity. If an attacker changes the message, he/she will not be able to make a new signature without the private key of the original signer. Assuming the hash function is good, the attacker can't change the message or produce another message with the same hash. This also prevents the attacker from attaching a signature on one message to a different message.

• Authentication. Assuming the signer's private key remains secure a digital signature shows that the signed document was in fact signed by the specified principal. It is important, however, that the entity verifying the signature be confident they have the true public key of the signer. Also note that a valid signature does not prove that the signer was the author/creator of the document.

• Non-Repudiation. The signer can't later deny signing the document ("I never said that!") without also claiming that his/her private key has been compromised.

25. A CERTIFICATE is a digitally signed public key. Certificates typically also include other information. The most commonly used certificate format, X.509, is normally used to bind a public key, which is a large binary number, to the identity (or name) of the key's owner. Thus X.509 certificates are often called "identity certificates." Other kinds of certificates also exist. Certificates often contain additional information such as a validity interval (the time over which the certificate is considered valid) and information about the intended purpose of the certificate.

The entity that signs public keys and thus creates certificates is called a CERTIFICATE AUTHORITY (CA). Depending on the application domain a CA might be a large company but in other cases it can also be a private individual.

26. To verify a certificate it is necessary to check the digital signature made by the CA. This requires having the public key of the CA. Typically that is provided in another certificate signed by a "higher level" CA. Thus it is common for certificates to be arranged in a CERTIFICATE CHAIN where the key used to sign one certificate is provided in the next certificate. The chain ends on a "self signed" certificate made by the trusted or root authority. This approach allows for distributed CAs whereby a high level CA can certify the public keys of the lower level CAs that then certify user's keys.

27. A RANDOM NUMBER GENERATOR is an entity (hardware and/or software) that produces a stream of bits that have properties consistent with those of random data. In security applications the primary property of interest is unpredictability: No amount of knowledge of the previous output of the generator allows you to predict the next bit with probability greater than 0.5 (even with knowledge of the algorithms used by the generator). This property implies that the numbers are statistically random but the converse is not true: a generator that produces output that looks statistically random may still be predictable, especially if the inner workings of the generator are known.

For example, one popular (high speed) way to generate random looking data is to use a linear congruential generator. Such generators compute Xn+1 = (aXn + b) mod c. For suitably choosen a, b, and c the output data stream can have good statistical properties. However, there is no security since knowing the output allows one to trivially compute the following output, assuming knowledge of a, b, and c. Even if a, b, and c are not known it is algebraically uncomplicated to solve for them given a sufficiently large sample of the generator's output.

28. A PSEUDO RANDOM NUMBER GENERATOR (PRNG) is a deterministic algorithm for computing random looking data. The linear congruential generator above is a PRNG.

All PRNGs have internal state that they update as they work. The initial state of the PRNG is called the "seed" and it is considered secret information (similar to a key). Since that state is finite it is necessary that as a PRNG works eventually an older state will be revisited. Thus all PRNG generators produce periodic output. High quality PRNGs have very long periods, however. One could attempt to "brute force" an PRNG by trying every possible seed. Thus to be secure the internal state of a PRNG must be large. Also seeding a PRNG with the date/time is not secure because it is too easily guessed.

29. In security applications PRNGs must be seeded with random data from a "true" RNG. Such generators typically rely on some physical process to collect randomness. Ultimately these physical processes are random because of quantum mechanics and thus a true RNG gets its randomness from very fundamental physics. Yet care must still be taken in the design of a true RNG to avoid introducing biases that make their output more predictable than it should be.

30. True RNGs tend to produce random numbers slowly so the usual procedure is to use the true RNG to seed a cryptographically strong PRNG which is then used to generate the bulk random data. The PRNG can be periodically reseeded from the true RNG to frustrate analysis. Thus PRNGs don't really generate any randomness at all. Instead they effectively just amplify the bandwidth of the true RNG.

Last Revised: 2018-10-01