Deciphering the ciphers
Mrs Bennet, Mathematics Department, reports on the progress of the 2016 Soton Cipher Competition and the mathematical maze through which our students are finding their way!
‘So far the Priory Decoders and the Cipher Guild (our two teams from Upper IV and Lower V) have managed to decrypt all of the challenges 1 to 6 in the annual Soton Cipher Competition, (with each one consisting of two parts A and B; B being much harder than A). Challenges are released on a Thursday (3:00 pm) with maximum marks for a solution submitted by midnight and a sliding scale of points for correct solution submitted during the following two weeks. However, Cipher 7 proved particularly challenging and time consuming.
7A required the girls to reverse a string of more than 3,000 characters before decryption was possible. The teams used Python to reverse the text, saving many hours and reducing the likelihood of errors creeping in. We used Mr Dellow’s expertise here as he guided us towards a simple piece of Python code to reverse the string [: : -1] – so easy when you know it!
So with the string reversed, the girls set about figuring out 7A and reading the message for clues to 7B. 7A turned out to be a Vigenere (polyalphabetic) cipher with a key length of 7, but it gave us no further clues as to what to do with 7B. Frequency analysis soon told us that it was not a monoalphabetic cipher and Naomi correctly spotted the lack of J in a polyalphabetic cipher could indicate that it was a Bifid or a Playfair. Playfair encrypts in bigrams (pairs of letters), and thus must produce an even number of cipher characters. Our message had 1,639 characters and so the teams decided it must a Bifid.
This year however, the compilers made it exceptionally hard for us, because 7A gave us no new information or possible lines of attack. The cipher text has an alphabet which consists only of 25 letters, but what makes it very difficult is that a letter does not encrypt to the same letter each time. For those of you who have seen The Imitation Game, it is a little like the Enigma code. For a Bifid, the output depends on where each letter is in a string of text and what letters it is next to.
Keys for the Bifid cipher consist of a 25 letter ‘key square’. e.g.
1 2 3 4 5
1| p h q g m
2| e a y l n
3| o f d x k
4| r c v s z
5| w b u t I
The order of the letters within the key square depends on a key word known only to the sender and recipient.
Example: encrypt ‘defend the east wall of the castle’ using the key shown above.
When enciphering a plaintext, each letter is replaced by the numbers on the left hand side and top of the keysquare. These are then written on top of one another as shown in step 1 (below). E.g. ‘d’ is in row 3, column 3 of the key square so 3 is written in the top row and 3 is written in the second row. This is done for all plaintext letters.
Step 2: The numbers are then grouped into blocks of a certain size (this is called the period, and forms part of the key). In this example the period is 5. The groups are then read off left to right (this is the fractionating step that makes bifid slightly more difficult to crack than a simple substitution cipher).
Step 3 shows the new sequence of numbers after reading the groups left to right, first the top row of the group followed by the bottom row. The entire string is then re-enciphered using the original keysquare (shown in step 4) e.g. ‘row 3, col 2’ is ‘f’ in the original keysquare.
An example encryption using the above key:
plaintext: defend the east wall of the castle
step 1: row 323223 512 2245 5222 33 512 424522
col 312153 421 1244 1244 12 421 224441
step 2: 32322 35122 24552 22335 12424 522
31215 34211 24412 44124 21224 441
step 3: 3232231215 3512234211 2455224412 2233544124 1242421224 522441
step 4: f f y h m k h y c p l i a s h a d t r l h c c h l b l r
Thus ‘defendtheeastwallofthecastle’ becomes ‘ffyhmkhycpliashadtrlhcchlblr’ using the key square shown above and a period of 5 during the enciphering step.
Sadly, we were nowhere near solving the problem even by the end of the first week, as without the block length or the keyword we hit a brick wall.
There was nothing for it but to look at bigram counts:
It turns out that estimating the period is much easier when the period is even; when it is odd we need more ciphertext to get good results. This depends on calculating the bigram distribution for our ciphertext with different ‘steps’. The start of our cipher text is HTPEGWEEHWAOHCP
* Bigrams with a step of zero will be the following: HT,TP,PE,EG,GW, etc.
* Bigrams with a step of one will be the following: HP, TE, PG, EW, GE etc.
* Bigrams with a step of two will be the following: HE, TG, PW, EE, GE etc.
and so forth. We calculate the bigrams with steps up to 20 or so. Modern scientific calculators can easily calculate the variance of the bigram counts merely by inputting strings of numbers. Plotting the variance for step length produces a typical peak at the correct period length.
For example:
There is a clear spike at 6, which is half of 12.
In fact this part of the project was not completed correctly before the organisers kindly published an official clue telling us that the period was 4.
So next we started on trying to work out the arrangement of the keysquare and thinking about what the key word might be. Knowing the length of the keyword is a good first step, but knowing the keyword itself allows decryption in full. We concentrated on finding the keyword rather than its length!
Along the way we discovered all sorts of details about Edgar Allan Poe, who was fascinated by cryptography. His book ‘The Purloined Letter’ forms part of a clue in an earlier challenge. Phrases that he used such as ‘The Gold Bugs’, and his detective ‘Augustus Dupin’ were tried and tested as key words for the Bifid, but without success.
Dorothy L Sayers, also wrote about cryptography with her hero Lord Peter Wimsey decrypting a Playfair cipher in the 1932 novel ‘Have His Carcase’.
However, despite clues pointing in this direction, we found nothing useful.
A very helpful site bionsgadgets.appspot.com allowed us to test possible arrangements of letters to see if we could generate the first word in the message, which we suspected might be “Martin”. The messages are sent between fictitious characters called Harry, Charlie, Martin, Trinity and Jamelia. We all thought it was time for a letter to be addressed to Martin. Using a crib like this was one of the ways that the enigma cipher was broken during the second world war. Assuming the keyword did not contain z, y, x or other infrequent letters towards the end of the alphabet, then we knew that these would appear in the bottom row of our keysquare. We also tried to make one of the most common groups of 4 letters SOQR in the message decrypt to give either ‘tion’ or ‘ther’ or similar.
A post on the forum on Thursday 8th gave us hope that it was solvable – the poster urged everyone to think deep and hope for a “lightbulb moment” as he himself had had, and which had enabled him to solve the puzzle. Cipher Club on Friday lunchtime was spent brainstorming what we could have missed in the clues along the way.
The lightbulb moment came when we discarded all thoughts about Edgar Allan Poe and old detective stories (which we knew the fictional Martin loved), and instead turned our attention to the science behind the story – the weaponisation of gravitational waves. As those of you who follow the scientific news, gravitational waves (which were proposed in 1916 by Albert Einstein) were first detected as distortions in the space time by a lab within the California Institute of Technology on September 14, 2015. So as not to spoil the fun of anyone who wishes to follow the Cipher Guild and the Priory Decoders in their decryption quests, I will not publish the key word here. But this allowed us to piece together the remaining parts of the puzzle and find out what Martin was trying to tell his team in the message.
And so we turn our attentions to next week and the final challenges, 8A and 8B, which are traditionally so hard that the final deadline is extended into January. We know that one of them will be a Hill cipher. A Hill cipher involves matrix multiplication of letters turned into numerical values, using modulo 26, which some of the team members studied last year in Cipher Club. As always it is far, far easier to encrypt than to decrypt!
To find the decryption matrix:
- Find the determinant d of the encryption matrix
The determinant d of matrix () is −
- Find the multiplicative inverse d-1 of the determinant working modulo 26. In other words d d-1 = 1 mod 26
- Find the adjugate matrix adj
adj () = (−−)
Multiply the multiplicative inverse of the determinant by the adjugate matrix to get the decryption matrix. Apply this to the encoded message to work out the original!
Budding mathematicians who are interested in working for GCHQ can try the GCHQ Mathematics Aptitude Test here: https://www.gchq-careers.co.uk/media/30806/Sample_Maths_Aptitude_Test.pdf
And for anyone who hasn’t sent off their Christmas list to Santa Claus yet, there is the wonderful GCHQ Puzzle Book, which was published in October.’
Categories: Senior Whole School