CDR: A cool idea that didn't hold up under cryptanalysis.
Ray Dillinger
bear at sonic.net
Wed Sep 20 17:49:34 PDT 2000
Recently, I've been making and breaking pencil-and-paper
ciphers specifically to acquaint myself with the art of
cryptanalysis.
I've developed methods for solving vignere, general
transpositions, playfair, etc... just basically bringing
myself up to speed with the general background of the field.
And then, with my brain sharpened by the classics, I've been
returning to things I formerly thought were "really neat
ideas" in crypto but couldn't formulate a good reason for
why they would be secure, and demolishing them.
On most of them, the attacks have turned out to be embarassingly
simple; On this one, although I found an attack where known
plaintexts or chosen plaintexts could demolish it, to a
cryptanalyst guessing cribs it's pretty resistant (it requires
depressingly long cribs of a depressingly specific form unlikely
to be found at random).
Anyway, any fans of pencil-and-paper systems out there, enjoy!
Bear
The Overlapping Additive Nomenclator
(Pencil-and-paper cipher; believed secure against amateur
pencil-and-paper cryptanalysts but insecure against
computer-assisted or really good pencil-and-paper
cryptanalysts -- falls under the heading of "a cool
idea that didn't hold up under cryptanalysis")
This is a pencil-and-paper cipher, moderately difficult to
use and extremely difficult to solve by pencil-and-paper
cryptanalysis.
I specifically do not assert that it is computationally
secure. The cryptanalysis explaining why not follows
at the bottom of this paper.
In this cipher, the key is a list of numbers, each number
corresponding to one cleartext symbol. The following property
must hold true of these numbers.
For some number N, (N=2 in the following example)
One of the two following conditions must hold:
Either:
All numbers used must differ from one another
in first N digits. If the first difference
is in the Nth digit, then the difference in
that digit must be at least two.
Or:
All numbers used must differ from one another
in either one of the last two digits.
(In the example, both conditions are true).
For some number D, (D=3 in the following example),
Each code number must be at least N x D in length.
The cipher provides sufficient confusion to "cover up"
runs of identical plaintext characters of length D
or smaller -- runs larger than length D will start
showing a discernible pattern in the ciphertext.
For a key on a small alphabet, consider the following:
A 129754
B 148943
D 208531
E 259146
F 268953
Now, if we are to encrypt the message "DEADBEEF" we would
proceed as follows:
First, the key number is written lined up beneath each
plaintext letter, with N columns of digits for each column
of letters; with a leading-difference system, the numbers
are written starting in the column of the corresponding
letter: With a trailing-difference system, the numbers
are written ending in the column of the corresponding letter.
D E A D B E E F leading difference
D E A D B E E F trailing difference
208531 148943
259146 259146
129754 259146
208531 268953
Next, the numbers in each column are added, with any carries
spilling over to the next column;
208531 148943
259146 259146
129754 259146
208531 268953
--------------------
21113564544660643553
The sum is the ciphertext.
Decryption proceeds as follows:
If we have chosen "Leading difference" as our condition, then
we look at the first six digits of the sum and determine which
code number is closest to it but not over. We subtract
that value from the whole, recovering a new sum. We write the
plaintext symbol corresponding to the number we have just
subtracted in place of the leading two digits, as the new sum
will be two digits shorter than the old. Repeating the process
until completion, we get the transformation in the left column.
If on the other hand we have chosen "trailing difference" as our
criterion, then we look at the trailing two digits, pick the
cleartext letter corresponding to these digits, and subtract it
from the result. The difference will end in two zeros, which
we remove and replace by the cleartext symbol whose code value we
have just subtracted. Repeating the process until completion,
we obtain the transformation in the right column.
21113564544660643553 21113564544660643553
208531 268953
-------------------- --------------------
D 260464544660643553 211135645446603746 F
259146 259146
-------------------- --------------------
D E 1318544660643553 2111356454463446 E F
129754 259146
-------------------- --------------------
D E A 21004660643553 21113564542043 E E F
208531 148943
-------------------- --------------------
D E A D 151560643553 211135643931 B E E F
148943 208531
-------------------- --------------------
D E A D B 2617643553 2111354354 D B E E F
259146 129754
-------------------- --------------------
D E A D B E 26183553 21112246 A D B E E F
259146 259146
-------------------- --------------------
D E A D B E E 268953 208531 E A D B E E F
268953 208531
-------------------- --------------------
D E A D B E E F D E A D B E E F
In a real system, we could have chosen "leading difference" or
"trailing difference" and made the key such that it would have
been unusable in the other direction.
With either mode, we can use code numbers of varying numbers
of digits; the requirement of a constant number of digits
holds only for systems such as the above which may be solved
in either direction. With a leading-difference system, the
correct method is that the numbers be lined up with their
leading digits under the letter before the addition is done;
with a trailing-difference system, it is important that the
numbers be lined up with their trailing digits under the
letter before the addition is done.
Since the "trailing difference" system need not worry about
a carry digit altering the key, more symbols may be included
in its alphabet.
I have used N=2 here as the range of sensitivity, thinking
of the roman alphabet plus perhaps seventy code signs. If
you prefer a nomenclator-like alphabet of nine thousand code
signs, the same technique may be used; you will need to use
N=4 instead.
I have also used a "digit overlap" or D of three here, meaning
that each digit of the ciphertext is the result of adding
three digits in a sum, plus a possible carry digit. Better
security may be had by increasing the digit overlap (the
minimum length of each numeric element becomes N times the
digit overlap) but real computational security cannot be
achieved by this method; it will frustrate only pencil-and-
paper cipherers.
Finally, since the ciphertext is "open at its ends", it is best
for security to use a predetermined number of NULL symbols
at both ends of the cipher; Null code numbers, unlike the actual
key numbers, need not differ from legitimate key numbers in
the first or last few digits; they need only to differ in the
first or last few digits from the code numbers assigned to
other null symbols. Leading and trailing NULL symbols make
this cipher quite difficult of solution; if you keep track
of your supply of leading and trailing nulls and do not reuse
them, the cipher is reasonably secure against pen-and-paper
cryptanalysis.
A successful method for pen-and-paper analysis, however, does
exist. If a known plaintext enciphered with some key is captured,
a cryptanalyst may set up linear equations constraining the
values of the code numbers. If a set of D+1 consecutive symbols
is used more than D times in the message under the same key,
then the linear equations may be solved and will generally
find the code number corresponding to at least one symbol.
Once at least a few such code numbers are known, a cryptanalyst
may start looking for "cribs" consisting of a seqence of length
D consisting of known symbols, one unknown code symbol, and another
sequence of D known symbols. Each time such a crib is found, the
cryptanalyst will learn another key code number.
A cryptanalyst may also look for "cribs" consisting of a sequence
of length D consisting of known symbols, two unknown symbols,
and another sequence of length D of known symbols; each time
this happens, he will obtain a linear equation constraining the
values of two symbols. If more than one linear equation is found
in this way constraining the value of any two symbols, the key
for those symbols can be found. Then of course, the other
symbols captured in the linear equations with them from other
cribs may also be solved for.
What security this cipher has is as a result of the abstracted
and difficult form that a crib must take, and the difficulty of
finding and identifying suitable cribs.
*sigh.* Oh well, it's better than a standard nomenclator or
a pigpen cipher. Polyphony or polyalphabetic code substitution
could of course be applied to good effect, but that's just
obfuscation.
Bear
More information about the Testlist
mailing list