Symmetric Ciphers. The One Time Pad

sourses: one, two





What is a one-time pad?


A one-time pad is a cryptosystem invented by Vernam. It's a very simple system and is unbreakable if used correctly. To use a one-time pad, you need 2 copies of the "pad" which is a block of random data equal in length to the message you wish to encode. The word "random" is used in its most literal possible sense here. If the data on the pad is not TRULY RANDOM, the security of the pad is reduced, potentially to near zero.

One-time pads are used in pairs. The more copies of a given pad, the greater the likelihood is that one may be captured, in which case the system is completely broken. One copy of the pad is kept by each user, and pads must be exchanged via a secure channel [e.g.: face to face on floppy disks]. The pad is used by XORing every bit of the pad with every bit of the original message. Once the message is encoded with the pad, the pad is destroyed and the encoded message is sent. On the recipient's side, the encoded message is XORed with the duplicate copy of the pad and the plaintext message is generated.

Think of a one-time pad as a 100% noise source which is used to mask the message. Since both parties of the communication have copies of the noise source they are the only people who can filter it out.

Are one-time pads really unbreakable?


Yes. But only if used properly. A one-time pad must be truly random data and must be kept secure in order to be unbreakable.

Consider if the one-time pad is used to encode the word "otter." If an attacker tries to brute force "guess" the contents of the pad, the message will "decrypt" into every possible combination of 6 characters (e.g.: "lemur." "badger" etc..) Since the pad is truly random there are no statistical methods that the attacker can hope to use to infer which combination is correct.

Can I reuse my pad?


Never, without reducing the security of the pad to something less than 100%, usually close to 0%. As long as the pads are unique and never reused there is nothing that can be used to attack the encryption using statistical analysis or pattern matching.

Wright recounts a situation in which Soviet intelligence re-used one-time pads years after they had originally been distributed to field agents in Britain. The British noticed a pattern in some coded messages and began searching for "hits" through a complete archive of all encrypted communications intercepts. Over a period of years, various secret communications slowly were compromised. This operation took place under the legendary code word VENONA which was a closely held secret until recently. NSA has recently declassified parts of the story and put information about VENONA on the Web.

What are good sources of randomness?


Feeding cosmic background radiation through one-way functions (such as MD5) is one approach. Remember that the one-way function is used, in such a case, to adjust the statistical properties of the data, not as super-encryption - so a lower-quality one-way function such as DES or MD5 is acceptable, even though they might not be good enough to use for encryption in their own right, anymore. Using radioactive decay is another option! The randomness of radioactive decay is an open question depending on whose cosmology you believe, which is another reason that feeding samples of background radiation through another permutation generator such as a one-way hash is a good idea. See Internet RFC 1750 for more information on generating random numbers.

Quality random numbers are a very important problem in cryptography. Not only must the number have the properties of a random number, it must be unpredictable. For example, the output of random() might appear to be statistically random, but is absolutely predictable if you know the seed that went into it. Many begining one-time-padders think that using the output of random() will give them something useable as a one-time-pad. In fact, it's only as strong as the seed that was used, which is often getpid() or time(). These are basic beginner flaws, but unfortunately, even commercial products containing cryptography have been known to contain such elementary weaknesses. (e.g., NetScape).

HOW THE ENCRYPTION IS DONE

There are many variations in how the One-Time Pad encryption can be done. We will be focussing on only one method: the method that was used by the Soviets.

Let's assume that the message to be encrypted is "Retreat at dawn."

STEP 1.
To begin our hypothetical encryption, we will be using a Code Book (sort of a "code dictionary"). A Code Book has words, listed alphabetically, along with corresponding numbers that represent each word (the period at the end of the sentence is also considered to be a "word"). The Soviets used 4-digit numbers in their Code Book. The Code Book can be re-used for future messages. Both the sender and the recipient possess two copies of the Code Book: one copy is used for sending a message, and another copy (which is formatted differently) is used when one has received a message.

So, let's say that, after consulting our Code Book, we find that "Retreat at dawn." translates to the following four numbers:

0441 0412 2123 9000

[the number "9000" represents a period at the end of the sentence]

Note that, even though we have converted the characters into numbers, the message is not encrypted enough to be uncrackable by an enemy. To achieve a high level of encryption, we must proceed to the next step.

STEP 2.
Next, we rearrange the 4-digit numbers into 5-digit numbers by moving the left-most digit from the number group to the right, over to the adjacent number group on the left. Since there is a shift of some digits one digit to the left, you will find that you run out of digits for the last (right-most) numbers, so simply add zeros until the last number has 5 digits:

It is easier to show it than to explain it. Shifting one digit from each number to the left, and adding some zeros to the right-most number, the code now becomes:

04410 41221 23900 00000

STEP 3 (the most important step).
To finish the encryption of our message, we consult yet another book (called the One-Time Pad Book, a reference book that has rows of randomly generated 5-digit numbers). There are only two copies of the One-Time Pad Book: the sender has one copy, and the recipient has the other copy. We must arithmetically add these random numbers to the coded message. So, the 1st random number in the row is added to the 1st coded word in the message; the 2nd random number is added to the 2nd coded word in the message; and so on. The numbers on the One-Time Pad page are read from left to right across a row, and then the next row of random numbers is used. NOTE: To do the addition correctly, we must use a method called "non-carrying addition", meaning that any digit carried over is not used in the addition.

So, let's assume that the first four random numbers on the top row of the One-Time Pad page are 23402 89524 94742 and 00425. The non-carrying math results in the following:

04410 41221 23900 00000 <----The message (rearranged 5-digit numbers from Code Book).
+
23402 89524 94742 00425 <-----random numbers copied from the One-Time Pad book.
______________________
27812 20745 17642 00425 <-----resultant encrypted message.

STEP 4 (OPTIONAL).
Now, if you wish, you can convert the resulting sum back into characters, using a chart. Note that this step does not add another layer of security to the encrypted message. So, if it doesn't add any extra security to the message, why go through the trouble of converting the numbers to characters? The reason was economics; transmitting characters via wire was cheaper than transmitting numbers (the Western Union company charged its customers a premium for sending numbers), and one could save a lot of money by using the conversion chart provided by the Western Union Company (shown below). Therefore, the Soviets converted the numbers back into alphabetic characters soley to save money.

HOW TO CRACK THE CODE? 

(Answer: wait for your enemy to make a mistake)

Two huge unknowns confront the enemy code-breaker: 1) The randomly-generated numbers that were on a particular One-Time Pad Book's page(s) are unknown; and 2) the numeric code from the Code Book and its corresponding Russian words are unknown. If some random numbers on a One-Time Pad sheet could be identified, then it might be possible to uncover some of the numerical "words" in the Code Book. In theory, this should be impossible. But what if the encrypter/sender made a mistake during the encryption process?

Let's set up a scenario in which the encryptor forgot to destroy the page in his One Time Pad book and then, mistakenly, he re-used it to encrypt another message. Let's further assume that both of these encrypted messages were intercepted and are in the possession of enemy code-breakers. The enemy can now use a technique called "pattern matching and analysis" on both of the messages to find similarities in the code structure between the two messages. The code-breaker is looking for at least two matching pairs of 5-digit numbers in the two different encrypted messages. If the same One-Time Pad page was used to encrypt both messages, then two (or more) matching pairs of numbers might be found in both messages and, further, if they are found, they must occur in the same localities within both messages (this is because the 50th random number on the One-Time Pad page will be added to the 50th code "word" in both messages. Finding these matches then triggers a search for other matching pairs of numbers in the two messages. The chances of finding such an encryption mistake are miniscule, and the task is daunting. But if matches are found, decryption can then proceed, but it will only work for those twoparticular messages. Any other messages that correctly used a One-Time Pad sheet only once will still remain "dark" to the code-breaker.

After some of the original random numbers from the One-Time Pad page(s) had been uncovered by the code-breakers, then another technique, called "book-breaking", is used to recreate portions of the original Code Book (i.e., the 4-digit number groups and their respective words). Tying a particular 4-digit number to its corresponding Russian word is not as difficult as it may appear. However, book-breaking requires a thorough knowledge of the language and grammar of the country who's code is being cracked. One of the best book-breakers of coded Russian language messages during the 1940s was U.S. govt. linguist Meredith Gardner. As one example, Gardner found that the Soviet Code Book number "0669" represented the English word "into". By 1946, Gardner was slowly cloning the Soviet Code Book, piece by piece, even though he didn't have a physical copy in his hands.

Around 1950, largely due to Meredith Garner's book-breaking talent and his hard work, American nuclear physicist Theodore Hall and his friend Saville Saks were uncovered as nuclear spies for the Soviet Union. While working at the Los Alamos National Laboratory between 1943-1946, Hall copied top secret nuclear physics equations and other information in invisible ink (i.e., regular milk) onto the margins of newspapers. He then mailed the newspapers to Saville Saks, who made the milk ink visible by applying heat to the newspaper with a clothes iron. Saks then handed the information over to KGB agents. The Soviet Embassy then encrypted the message using the One-Time Pad system and sent the message by telegraph to Moscow via the Western Union Company. Because of a War-time agreement, all Soviet diplomatic messages that were sent from U.S. soil were also made available to the U.S. government. The Soviets were not concerned about this, because they thought that their code was unbreakable. The Soviets rarely made proceedural mistakes with their encryption. Only 1% of all encrypted Soviet messages sent to Moscow between 1944 and 1948 contained encryption errors that allowed them to be cracked by American intellegence agencies.

[Footnote: Even though Hall and Sacks were identified as spies for the Soviet Union, neither of them was prosecuted. This is because the U. S. government didn't want the Soviets to know that portions of their messages were being decrypted. But in 1948, the Soviets finally learned, through another KGB spy named William W. Weisband, that the U. S. had breached their coded channels. The Soviets then switched over to a different encryption system. Hall and Saks lived as free men into old age. Saville Saks died in 1980. Theodore Hall died in 1999 after a distinguished career in England as a biophysicist.

Footnote #2: Ironically, Senator Joseph McCarthy (R-Wisc.), a witch hunter, a drunk and a kook who destroyed the reputations of many innocent American citizens, was totally unaware of Hall's or Saks' spying activities, nor of the U.S.'s ability to break into some Soviet encrypted messages. The moral of the story is to leave the job of spy hunting to the intelligence community and to reject the claims of pathologically paranoid zeolots and politicians.]
















Categories:

Blogger news

Popular Posts

Blogroll

Popular Posts