Stochastic algorithms

The next few pages give an outline of traditional stochastic algorithms, such as hillclimbing, preparatory to a full description of the Churn algorithm and an illustrative computer program. Those familiar with stochastic algorithms may want to jump directly to the Churn algorithm.

I will illustrate the use of these algorithms for solving Classical ciphers – by which I mean those used up to the 2nd World War. One or two of them were also used in the 1939-45 period, for example the Double Playfair cipher. You will find descriptions of all Classical ciphers at the web site of the American Cryptogram Association. 

Stochastic algorithms are based on random processes. The simplest is a random search. When applied to cipher solving, it keeps cycling as below:
do forever:

-select a key at random,
-decipher,
-rate the ‘fitness’ of the plaintext in some way and display it if it is the best to date.

Hopefully the correct plaintext will eventually appear – much as a monkey with a typewriter might eventually produce a passage from Shakespeare. Neither is an efficient or practical approach to these tasks.

I need to explain how ‘fitness’ can be assessed.

A wrong key will produce poor plaintext like “qstxindujsolzltdsfwh”. The quality of this plaintext can be judged by looking at each four-letter group (tetragraph) and giving it a score dependant on its frequency in English text. Such frequencies can be determined by looking through a number of books. On this basis, ‘qstx’ gets a score of zero because it is never found. ‘stxi’ scores 3, ‘txin’ 2, 'xind' 4, 'indu' 11 and all the rest zero. The whole of this plaintext scores 20. In comparison, the first 20 letters of this current sentence score 218. You can see that tetragraph scoring clearly differentiates between the quality, or fitness, of these two plaintexts.

In actual fact, the frequencies used are log frequencies. Using the logs reduces the impact of very frequent tetragraphs, like the-, that, -ing, -ion and so on which otherwise have too great an effect on the score.