6.3 The Effects of Swaps on Bigram Tables

Next: 6.4 Continuing to Investigate Up: 6. Improving Decryption Methods Previous: 6.2 Why Swaps Are

6.3 The Effects of Swaps on Bigram Tables

Let us look at the effect of swapping two letters in a key on bigram frequency tables. In Figure 33 we take some plaintext through a series of Stages. Figure 32 shows the plaintext, summarizes the Stages, and includes the encryption keys that could be used to transform the plaintext into the text at each Stage.

**Figure 32:** The Pipeline of Stages that Plaintext Goes Through
$\begin{figure} \begin{center} \vspace{2ex} % \begin{tabular}{ll} Stage 0, plaint... ...a abc ac ad ada add baa bad cab cb cdc dab dad dada dc}}\end{center}\end{figure}$

**Figure 33:** Effects of Swaps on Bigram Frequencies
$\begin{figure}\small \noindent \begin{tabular}{rrrl} \multicolumn{4}{l}{Text: \t... ...}{30ex}}\shortstack{S\\ t\\ a\\ g\\ e\\ [2ex]4}} } \\ \end{tabular}\end{figure}$

Figure 33 has a lot of information. Let us look at it a bit at a time.

At each Stage, we show the text so far and three pairs of unigram and bigram frequencies. These form three separate columns (left, middle, right) of frequency tables. For brevity, we use tallies instead of percentages. Verify the observation from Section 3.5 that adding up the tallies in each row or each column in the bigram table yields the tallies in the unigram table. (This property is also true for percentages.)

The left and middle columns contain the same frequency information --the unigram and bigram frequencies of the text so far-- but organize it slightly differently; we will explain the right column in Section 6.4. A little terminology at this point will be helpful. Let us call the frequencies, the numbers themselves, the contents of a table, as distinguished from the the character labels.

The left column leaves the contents of the unigram and bigram tables unchanged, but swaps labels. This directly reflects two letters getting renamed to each other.
The middle column leaves the labels unchanged, but rearranges the contents of the unigram and bigram tables to reflect the new frequencies.

Take a careful look to see that the left and middle columns do contain the same frequency information, namely the unigram and bigram tallies for the text so far. For example, in Stage 1, "-d" is shown to occur 8 times in both the left and middle bigram tables and also in the ciphertext.

Now take a closer look at how the contents of the tables in the middle column are rearranged. On the far right between each stage, we have a large brace $\fbox{$\left.\rule{0in}{4ex}\right\}$ }$ labeled with the following information: the two letters that were swapped, and a picture of the changes to bigram table in the middle column -- entries that are changed are drawn as xs. Observe that when two letters are swapped, the effect on the tables in the middle column are to swap the corresponding rows and columns. This makes sense: If two letters are swapped (renamed to each other), then their corresponding frequencies get swapped, e.g. if 'a' and 'd' are swapped, then the frequencies of "ab" and "db" are swapped (rows 'a' and 'd' are swapped), as are the frequencies of "ba" and "bd" (columns 'a' and 'd' are swapped).

This tells us how to quickly (re)compute bigram frequencies when we swap two letters in text. Instead of recounting all the bigram frequencies by scanning the text, we merely take the old bigram table and swap the corresponding two rows and swap the corresponding two columns!

So, we have learned the effects of swaps on bigram frequency tables.

However, there is still more information to be learned from Figure 33.

Next: 6.4 Continuing to Investigate Up: 6. Improving Decryption Methods Previous: 6.2 Why Swaps Are

Thomas Yan
2000-05-01