3.6 Example of Bigrams
The string "abccc" has bigrams "-a", "ab", "bc",
"cc", "cc", and "c-", where "-" stands for a space. Where did the "-a" and "c-"
come from? We insert a space at the start and at the end of each line of
text to mark the start of the first word and the end of the last word on
that line. Figure 15 collects and organizes the tallies of
unigrams and bigrams for "abccc".
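
The padding step can be sketched in a few lines of Python. This is only an illustration, not code from these notes: the function name `bigrams` and the `pad` parameter are assumptions, and "-" is used as a visible stand-in for the inserted space.

```python
def bigrams(line, pad="-"):
    """List the bigrams of one line of text.

    A boundary marker (standing in for a space) is added at the start
    and at the end of the line before adjacent pairs are read off.
    """
    padded = pad + line + pad
    # Every pair of adjacent characters in the padded line is one bigram.
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


print(bigrams("abccc"))  # ['-a', 'ab', 'bc', 'cc', 'cc', 'c-']
```

A line of n characters yields n + 1 bigrams under this convention, which is why "abccc" (5 characters) produces 6.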
Figure 15: Tallies for Example String "abccc"
![Table of unigram and bigram tallies for "abccc"; total 6](img33.gif)
Observe that the unigram tallies are equal to the row and column sums
of the bigram tallies. Further observe that the total number of
unigrams is equal to the total number of bigrams (6). You might wonder
why the tally for the space is 1 rather than 2. This is because we treat the
spaces inserted at the start and end of each line of text as half-spaces, or
shared spaces, so the two together count as one; this is also what makes the
bigram and unigram tables match.
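
These observations can be checked mechanically. The sketch below is a hedged illustration rather than the notes' own method: the name `tally` is hypothetical, "-" again stands in for the space, and it assumes the rows of the bigram table are indexed by a bigram's first character and the columns by its second.

```python
from collections import Counter


def tally(line, pad="-"):
    """Return (unigram_counts, bigram_counts) for one line of text."""
    padded = pad + line + pad
    bigram_counts = Counter(padded[i:i + 2] for i in range(len(padded) - 1))
    unigram_counts = Counter(line)
    # The two inserted boundary spaces each count as half a space,
    # so together they add exactly 1 to the space tally.
    unigram_counts[pad] += 1
    return unigram_counts, bigram_counts


unigram_counts, bigram_counts = tally("abccc")

# Sum the bigram table along each axis.
row_sums = Counter()  # keyed by first character of the bigram
col_sums = Counter()  # keyed by second character of the bigram
for bigram, n in bigram_counts.items():
    row_sums[bigram[0]] += n
    col_sums[bigram[1]] += n

print(dict(unigram_counts))         # {'a': 1, 'b': 1, 'c': 3, '-': 1}
print(row_sums == unigram_counts)   # True
print(col_sums == unigram_counts)   # True
print(sum(bigram_counts.values()))  # 6
```

If each boundary space were counted as a full space instead, the space tally would be 2 and the unigram total 7, which would no longer agree with the 6 bigrams.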
Thomas Yan
2000-05-01