# Progress Report #11

Moshe Rubin (mosher@mountainvistasoft.com)

Table of Contents

- Correspondences relating to Jeff Hill's Paper "Chaocipher: Analysis and Models"
  - Moshe Rubin to Jeff Hill (16 April 2009): Comments on Jeff's paper
  - Jeff Hill to Moshe Rubin (19 April 2009): Reply to comments
  - Moshe Rubin to Jeff Hill (20 April 2009): Research that C98 with step probabilities of 1/2, 1/4, and 1/4 must be given up
  - Jeff Hill to Moshe Rubin (20 April 2009): Initial response to C98 analysis
  - Jeff Hill to Moshe Rubin (21 April 2009): Final response to C98 analysis
  - Mike Cowan's Chaocipher Strategy Paper (20 April 2009): MC presents a strategy paper
  - Mike Cowan to Moshe Rubin (21 April 2009): Comments why Friedman might not have been interested in Chaocipher

## Correspondences relating to Jeff Hill's Paper "Chaocipher: Analysis and Models"

Since Jeff Hill's exciting paper "Chaocipher: Analysis and Models" was uploaded to The Chaocipher Clearing House, there has been considerable e-mail correspondence about the models mentioned in the paper.  The correspondents include Jeff Hill, Moshe Rubin, and Mike Cowan (the current editor of the Computer Column in the American Cryptogram Association's periodical, The Cryptogram).  Here is a selection of that correspondence.

### Moshe Rubin to Jeff Hill (16 April 2009)

I stayed up till 3 AM this morning reading the article -- I enjoyed every minute of it.  I do have some comments and questions which I would like to list for you.

- You write on page 6 "Thus the possibility that each sector contained 1 to 3 teeth is considered first and when this does not yield a convincing fit to the graph of Figure 1, ...".  Solving the equations (1) p1 + p2 + p3 = 1; (2) p1 + 2p2 + 3p3 = 2 yields the equations (3) p2 = 1 - 2p3 and (4) p1 = p3.  So, given a value for p3 we can compute p1 and p2.  It would take work for me to set up code to fit p1/p2/p3 to the graph.  I take it you weren't able to find a good fit?
- MM1, with steps of five, does not explain why Exhibit 1 does not repeat machine states in under nine steps, but MM2 does.  Playing with steps of 1, 2, and 4, we can reach 26 steps by individual steps of +4, +4, +4, +4, +4, +4, +2.  The probability of this occurring is p = (0.25)^7 = 0.000061035.  In 13,615 characters we would expect 13,615p = 0.831+ occurrences.  It is plausible that a repeat could have happened within nine steps but didn't statistically.  On the other hand, how would you explain repeat machine steps in the other exhibits?
- BTW, MM2's steps of 1, 2, and 4 seem very binary and logical, allowing multiple step lengths from 1 to 7 economically.
- On page 8: "These probabilities arise when a trial has two outcomes, A and B, ...".  Do you have any ideas on what these outcomes could be based on in Chaocipher?  Comparing the plaintext letter with the ciphertext letter, i.e., some autokey condition?  Byrne alludes to a simple system that even a ten-year-old could learn quickly.  I've wondered if it's some grouping of the A-Z alphabet.  For example, A-M could be outcome 'A', while N-Z would be outcome 'B'.  If it were some autokey, that would explain the extreme sensitivity to error.
- On page 9, your discussion of possible cryptographs is most interesting.  Regarding half-rotors, have you had an opportunity to see John Savard's attempt at a half-rotor?
- Regarding M2 periods discussed starting from page 10: you raise the point that, within a single M2 period, the equivalent C94 cryptograph will have no more than three contact varieties.  I haven't checked yet, but can any counter cases be found where there are too many varieties within a single M2 block?  Although we don't know what the M2 periods are, a plethora of counter cases could be indicative.
- Still regarding M2 blocks: if two adjacent M2 blocks *can* cause identical adjacent machine states, wouldn't we expect more such adjacent machine states in Exhibit 1 at M2 boundaries?
- I like your section on Complexity Reduction.  It is a clever nuance to reduce the cryptanalytic complexity by reducing the work from N alphabets to fewer ones.
- In Table 7, N=14 shows that C98U had two (2) occurrences.  How odd!  Is this correct?
- Figure 14 shows rows 22-34 and 43-55 juxtaposed.  Do any of the models you mention explain how, say, there are five consecutive repeat machine states which suddenly stop matching?  What keying factor would have allowed five consecutive states to be identical, but then suddenly become non-identical?
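The arithmetic in the e-mail above can be checked with a short Python sketch. It is only an illustration: the function and variable names are hypothetical, it covers only the single step sequence quoted in the e-mail (+4 six times, then +2), and it assumes, as the e-mail does, that each of those steps occurs with probability 1/4. It makes no attempt to fit p1/p2/p3 to the graph in Figure 1 of Jeff's paper.

```python
from fractions import Fraction
from itertools import combinations


def tooth_probabilities(p3):
    """Return (p1, p2, p3) solving p1 + p2 + p3 = 1 and p1 + 2*p2 + 3*p3 = 2.

    Subtracting the first equation from the second gives p2 + 2*p3 = 1,
    hence p2 = 1 - 2*p3 and p1 = p3 (equations (3) and (4) in the e-mail).
    """
    p3 = Fraction(p3)
    if not 0 <= p3 <= Fraction(1, 2):
        raise ValueError("p3 must lie in [0, 1/2] so that p2 is non-negative")
    return p3, 1 - 2 * p3, p3


# Verify both constraints for a few sample values of p3.
for sample in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    p1, p2, p3 = tooth_probabilities(sample)
    assert p1 + p2 + p3 == 1
    assert p1 + 2 * p2 + 3 * p3 == 2

# The specific seven-step sequence from the e-mail, which sums to 26.
steps = [4, 4, 4, 4, 4, 4, 2]
assert sum(steps) == 26

# Assumption: each of these steps is taken with probability 1/4.
p_sequence = Fraction(1, 4) ** len(steps)

# Expected number of occurrences in the 13,615 characters cited.
expected = 13615 * p_sequence

# Sums of non-empty subsets of {1, 2, 4}: the "binary" step lengths.
subset_sums = sorted(
    {sum(c) for r in (1, 2, 3) for c in combinations((1, 2, 4), r)}
)

print(float(p_sequence))
print(float(expected))
print(subset_sums)
```

Exact fractions are used so the two constraints can be tested with equality instead of a floating-point tolerance.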

### Jeff Hill to Moshe Rubin (20 April 2009)

Hi Moshe,

Thank you for a very interesting analysis.  I will need some time to read this carefully and consider the consequences.  I'm not sure that I will have it completely evaluated before next weekend.

However, if I understand the main thesis of your analysis, what you seem to have invalidated is the possibility that C98 uses simple probabilities of 1/2, 1/4, and 1/4.  One reason that I included both MM1 and MM2 in my paper is that this shows that there are more ways than one to interpret the Exhibit 1 repeat frequencies.

As far as which models in the C98-series are still viable, I am sure, as I wrote in my paper, that C94 is not.  I am also sure that C98A is not viable based on analysis similar to that in Table 7 of my paper.

The main feature that gives either C98 or C98U a chance is that both divide the text up into M2 periods.  However, I doubt that C98 is a model of Byrne's device even though it comes closest to matching Langen's description.  My doubts stem from the fact that the M1 disk must also serve as the KY disk which, as your analysis confirms, exposes the Key transitions to direct analysis.

Best regards,
Jeff

### Jeff Hill to Moshe Rubin (21 April 2009)

Hi Moshe,

You have proven that C98 cannot be stepping with simple probabilities of 1/2, 1/4, and 1/4, but the question is, which do we reject: C98, M2 periods of 55 letters, or the simple stepping probabilities?

After reviewing the 14 pt/ct matches that your analysis located, I believe that these can be reconciled with C98 by using a Markov model that includes steps of 3, 5, 6, and 7 in addition to steps of 1, 2, and 4, thereby giving up the simple probabilities.  However, I prefer to keep my options open on this question and not choose between C98 and simple probabilities until further research has clarified the issue.

I find after reviewing my Cryptograph simulators that the 55-letter block will have to be given up.  I wrote my paper C:A&M as a concept piece and was focused on what I consider the essential clues found in Silent Years and the Chaocipher Exhibits.  By not going into greater detail about the construction of the C98 series of cryptographs and the statistics generated by them, I overlooked the fact that the M2 period has to be much longer than 55 letters in order for C98 and C98U to replicate the probabilities derived from Exhibit 1.  My apologies for any confusion this may have caused.

Best regards,
Jeff

### Mike Cowan's Chaocipher Strategy Paper (20 April 2009)

Mike Cowan is the editor of the American Cryptogram Association's (ACA) "Computer Column".  This column discusses computer programming approaches to cryptanalysis.  Mike is also the author of a paper published in the January 2008 issue of Cryptologia entitled "Breaking Short Playfair Ciphers with the Simulated Annealing Algorithm".  This paper describes the adaptation of simulated annealing to solve short Playfair ciphers (80-120 letters) without using a probable word.

While corresponding with Mike about using simulated annealing to solve the Wheatstone cryptograph, I told him about The Chaocipher Clearing House.  As you can see, Mike has joined the corps with gusto <g>.  He sent in a paper describing his view of a strategy for tackling Chaocipher research.

Hi Moshe,

I've found your website really excellent as a source of interesting and useful information on Chaocipher and it's good to have it all in one place.

In trying to get thoughts in order on how to tackle this one, I've drafted a strategy on which I'd much appreciate comments, criticisms, additions...or whatever may occur to you and other folk.  This is attached.  It includes one or two aspects not previously reported, that may be of interest.

For me the hunt is now on for a machine that churns out the right kind of ciphertext!  As Jeff Hill has commented, once we've found that, cryptanalysis to find Byrne's keys should be possible.

Look forward to hearing findings on a Wheatstone-type machine when you're ready.

Best,
Mike.

Mike's e-mail for comments is .

### Mike Cowan to Moshe Rubin (21 April 2009)

Hi Moshe,

I think I'm beginning to see why this is so difficult for us but why Friedman really wasn't interested.

I can see that with 2 cipher wheels the machine will use [676] different alphabets during enciphering.  The order it uses these alphabets will be set by some kind of mechanism that advances one of the disks by a different amount after each encipherment -- for example in a cycle of 1,2,1,4 as Jeff wrote about -- and the other disk at less frequent intervals.  With so many alphabets there are fewer than 10 letters enciphered with any one alphabet in the 5500 cipher letters we've been given -- and we can't find out which letters belong to each alphabet because we don't know the advancement pattern.

However if we stole the machine, as the French stole Enigma, we'd be laughing.  Each of the [676] cipher disk positions could be tried in little time.  In a computer driven world, you could think of having daily changes to the mixed alphabets and the advancements but I doubt this was a practical proposition for Byrne's machine in the 1920s.  Friedman must have thought he had better devices available in-house, and taken the easy way out to put Byrne down gently.

All that doesn't help us!  So on with the merry activity...

Best,
Mike.
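Mike's figures can be reproduced with a few lines of Python. This is a rough sketch, not a model of Byrne's machine: the names are hypothetical, the 26-letter disk size is an assumption consistent with the [676] figure, and the 5500 count is taken as an average spread evenly over all disk positions. The `disk_offsets` helper only illustrates one disk advancing in the repeating 1,2,1,4 cycle mentioned in the e-mail; the second disk's "less frequent" advancement is not specified and is left out.

```python
from itertools import accumulate, cycle, islice

ALPHABET_SIZE = 26         # assumption: each disk carries 26 letters
CIPHERTEXT_LETTERS = 5500  # figure quoted in the e-mail

# Two disks with 26 positions each give 26 * 26 relative settings.
disk_positions = ALPHABET_SIZE * ALPHABET_SIZE

# Average number of ciphertext letters per alphabet, assuming an even spread.
average_letters = CIPHERTEXT_LETTERS / disk_positions


def disk_offsets(step_cycle, count, size=ALPHABET_SIZE):
    """Offset of one disk after each of `count` encipherments.

    The disk advances by the next value of the repeating `step_cycle`
    after every letter, and offsets wrap around modulo `size`.
    """
    steps = islice(cycle(step_cycle), count)
    return [total % size for total in accumulate(steps)]


print(disk_positions)
print(round(average_letters, 2))
print(disk_offsets((1, 2, 1, 4), 8))
```

With a known mechanism, trying all 676 starting positions is a trivially small search, which is the point of Mike's remark about stealing the machine.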