To calibrate what distances count as ``close,'' we can compute the
distances between the frequencies of the large plaintexts we used to
discover intrinsic frequencies.
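The calibration step can be sketched in Python as follows. This is a minimal illustration, not the course's own code. It assumes frequencies are relative letter counts over a..z and that ``distance'' means the L1 distance between two such frequency vectors. The names `letter_frequencies`, `l1_distance`, `pairwise_distances`, and `training_texts` are hypothetical, and the short strings stand in for the large plaintexts.

```python
from collections import Counter
from itertools import combinations
import string


def letter_frequencies(text):
    """Return the relative frequencies of a..z in text, ignoring case and non-letters."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    if not letters:
        return [0.0] * 26
    counts = Counter(letters)
    return [counts[c] / len(letters) for c in string.ascii_lowercase]


def l1_distance(p, q):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def pairwise_distances(texts):
    """Map each pair of text names to the L1 distance between their frequencies."""
    freqs = {name: letter_frequencies(body) for name, body in texts.items()}
    return {
        (a, b): l1_distance(freqs[a], freqs[b])
        for a, b in combinations(sorted(freqs), 2)
    }


# Placeholder training texts (assumption): substitute the large plaintexts here.
training_texts = {
    "text1": "the quick brown fox jumps over the lazy dog",
    "text2": "frequencies of letters in english text are far from uniform",
    "text3": "a medium to large sample gives more stable frequency estimates",
}

for pair, dist in pairwise_distances(training_texts).items():
    print(pair, round(dist, 3))
```

The largest distance observed between genuine plaintexts gives a rough sense of how far apart two frequency vectors can be while still counting as ``close.''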
At this point, we should also perform a sanity check: we should check at least one example to confirm that frequencies for ciphertext are not ``close'' to intrinsic frequencies. If ciphertext frequencies are also ``close'' to intrinsic frequencies, then ``closeness'' is not a good criterion for recognizing unscrambled frequencies. We use the following as our sample ciphertext:
http://courses.cs.cornell.edu/cs100/2000sp/cryptannounce.txt
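The sanity check can be sketched as follows. This is an illustration under stated assumptions, not the course's procedure: a Caesar shift stands in for the enciphering method of Section 3.7, a short English sentence stands in for the training text, and the helper names (`letter_frequencies`, `l1_distance`, `caesar_encipher`) are hypothetical. The sample ciphertext at the URL above is not fetched here.

```python
from collections import Counter
import string


def letter_frequencies(text):
    """Return the relative frequencies of a..z in text, ignoring case and non-letters."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    if not letters:
        return [0.0] * 26
    counts = Counter(letters)
    return [counts[c] / len(letters) for c in string.ascii_lowercase]


def l1_distance(p, q):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def caesar_encipher(text, shift):
    """Shift each letter by `shift` places (stand-in cipher, an assumption)."""
    out = []
    for c in text.lower():
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(c)
    return "".join(out)


# Placeholder texts (assumption): use large training and sample texts in practice.
training = "we approximate intrinsic frequencies with a large training text"
plaintext = "the unscrambled frequencies should resemble intrinsic frequencies"
ciphertext = caesar_encipher(plaintext, 3)

intrinsic = letter_frequencies(training)
print("plaintext  vs intrinsic:", round(l1_distance(letter_frequencies(plaintext), intrinsic), 3))
print("ciphertext vs intrinsic:", round(l1_distance(letter_frequencies(ciphertext), intrinsic), 3))
```

If the ciphertext distance is not clearly larger than the distances seen between plaintexts, ``closeness'' under this measure would not separate scrambled from unscrambled frequencies.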
| Roadmap | |
| --- | --- |
| Section 3.7 | Encipher plaintext |
| Section 3.7 | Decipher ciphertext |
| Section 4.2 | (Hope) Unscramble frequencies |
| Section 4.3 | Unscramble = Bring ``close'' to intrinsic frequencies |
| | Approximate intrinsic frequencies with training text |
| | Assume ciphertext is medium to large so that unscrambled frequencies resemble intrinsic frequencies |
| Section 4.4 | Use the L1 distance to measure ``closeness''; ignore labels. |
| | Q: What are legal and effective ways to rearrange frequencies? |