How it works
Encoding
Index of coincidence (abbreviated as “IC”) will be different based on the character encoding of your input.
For all data encodings, the data is first decoded to a Uint8Array and then converted to a string based on its encoding type (ASCII, Latin1, UTF-8, UTF-16, UTF-32). This step takes precedence over any other ciphertext settings, unless the widget evaluates the ciphertext bytes rather than the encoded text data.
Display encodings are not decoded before analysis and are analyzed as-is, unless the widget is designed to evaluate the ciphertext bytes.
Ciphertext Settings
IC takes into consideration the settings for each ciphertext. These settings currently include:
- Ignore whitespace
- Ignore punctuation
- Ignore casing
- Genericize text
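As a rough illustration of how such toggles could change the text before IC is computed, here is a minimal Python sketch. The function name `apply_settings` and its parameters are hypothetical, not the widget's actual API, and "Genericize text" is left out because its behavior is not defined here. The sketch also assumes punctuation means ASCII punctuation only.

```python
import string

def apply_settings(text, ignore_whitespace=True, ignore_punctuation=True,
                   ignore_casing=True):
    """Hypothetical preprocessing that mirrors three of the toggles above."""
    if ignore_whitespace:
        # Drop spaces, tabs, and newlines.
        text = "".join(ch for ch in text if not ch.isspace())
    if ignore_punctuation:
        # Assumption: only ASCII punctuation is stripped.
        text = "".join(ch for ch in text if ch not in string.punctuation)
    if ignore_casing:
        # Fold case so that "h" and "H" count as the same character.
        text = text.upper()
    return text

print(apply_settings("Hello, World!"))  # HELLOWORLD
```

With all three toggles on, only the ten letters remain, so the IC is computed over a shorter and more uniform alphabet than the raw input.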
IC Calculations
After all of the above steps are performed, the IC calculations are executed and displayed.
IC formula
This widget uses the following formula for calculating index of coincidence:
$$\mathrm{IC} = \frac{\sum_i n_i (n_i - 1)}{N (N - 1)}$$
Where:
- n_i = frequency of n-gram i (or character i when n-gram size = 1)
- N = total number of n-grams (or characters)
- Σ = sum over all unique n-grams/characters
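A minimal Python sketch of this calculation, assuming the standard unnormalized form Σ n_i (n_i − 1) / (N (N − 1)), which matches the symbols defined above. The function name is hypothetical, and returning 0 for fewer than two items is an assumption rather than documented widget behavior.

```python
from collections import Counter

def index_of_coincidence(ngrams):
    """Compute sum(n_i * (n_i - 1)) / (N * (N - 1)) over a sequence of n-grams.

    `ngrams` may be a string (treated as single characters) or a list of
    n-gram strings.
    """
    counts = Counter(ngrams)
    total = sum(counts.values())  # N
    if total < 2:
        return 0.0  # assumption: IC is reported as 0 when undefined
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

# "AABB": n_A = 2, n_B = 2, N = 4, so (2*1 + 2*1) / (4*3) = 1/3
print(index_of_coincidence("AABB"))
```

A text made of a single repeated character gives 1, and a text with no repeated characters gives 0.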
Periodic IC formula
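Periodic IC is different based on the mode.
N-gram mode: Block
For this mode, the process is:
- Generate n-grams from the entire text
- Group n-grams by their index modulo the period: $G_k = \{ g_j : j \bmod p = k \}$ for $k = 0, \dots, p - 1$, where $g_j$ is the n-gram at index $j$ and $p$ is the period
- Calculate IC for each group using the basic IC formula
- Average the IC values across all groups: $\mathrm{IC}_p = \frac{1}{p} \sum_{k=0}^{p-1} \mathrm{IC}(G_k)$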
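The block-mode steps can be sketched in Python as follows. This is an illustration under assumptions, not the widget's code: the function names are hypothetical, trailing characters that do not fill a whole n-gram are dropped, and a group with fewer than two n-grams contributes an IC of 0.

```python
from collections import Counter

def ic(items):
    """Basic IC: sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    counts = Counter(items)
    total = sum(counts.values())
    if total < 2:
        return 0.0  # assumption: undefined IC is treated as 0
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

def block_ngrams(text, n):
    """Non-overlapping n-grams; a short trailing remainder is dropped."""
    return [text[i:i + n] for i in range(0, len(text) - n + 1, n)]

def periodic_ic_block(text, period, n=1):
    ngrams = block_ngrams(text, n)
    # Group k holds the n-grams whose index is congruent to k modulo period.
    groups = [ngrams[k::period] for k in range(period)]
    # Average the per-group IC values.
    return sum(ic(g) for g in groups) / period

print(periodic_ic_block("ABABABAB", 1))  # 24 / 56, about 0.4286
print(periodic_ic_block("ABABABAB", 2))  # 1.0: each group is all A or all B
```

The jump at period 2 is the kind of spike the periodic graph is meant to expose: when the period matches the text's repetition, every group becomes uniform.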
N-gram mode: Sliding window
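For this mode, the process is:
- Slide a window of size ngramSize across the text
- For each window:
  - Generate n-grams within the window
  - Group by index modulo period (same as block mode)
  - Calculate and average group ICs
- Average the window ICs: $\mathrm{IC} = \frac{1}{W} \sum_{w=1}^{W} \mathrm{IC}_w$, where $W$ is the number of windows and $\mathrm{IC}_w$ is the averaged IC of window $w$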
Index of Coincidence Settings
N-grams and sliding window vs. block analysis
IC can be performed on n-grams where n >= 1. For n > 1, it is important to understand the difference between sliding window and block analysis.
N-gram mode: Sliding window
Sliding window analysis “slides” across the ciphertext to create overlapping n-grams. For example, with an n-gram size of 2, the text Hello yields He, el, ll, and lo.
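A short Python sketch of sliding-window n-gram generation. The function name `sliding_ngrams` is hypothetical, and the n-gram size of 2 is chosen for illustration.

```python
def sliding_ngrams(text, n):
    """Overlapping n-grams: one starting at every position that fits."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(sliding_ngrams("Hello", 2))  # ['He', 'el', 'll', 'lo']
```

Every character takes part in at least one n-gram, and interior characters are counted in more than one.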
N-gram mode: Block analysis
Block analysis evaluates your n-grams as non-overlapping chunks. For example, with an n-gram size of 2, the text Hello yields He and ll. The final character, o, is not present. When using block analysis, beware of missing data. The ciphertext length (after all toggles are applied) must be divisible by your n-gram size for all characters to be represented in the IC!
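To make the missing-data warning concrete, this Python sketch splits a text into non-overlapping n-grams and also returns whatever is left over. The function name and the returned leftover are illustrative assumptions, not part of the widget.

```python
def block_ngrams(text, n):
    """Return (non-overlapping n-grams, leftover characters that did not fit)."""
    usable = len(text) - len(text) % n  # largest multiple of n that fits
    ngrams = [text[i:i + n] for i in range(0, usable, n)]
    return ngrams, text[usable:]

ngrams, leftover = block_ngrams("Hello", 2)
print(ngrams, repr(leftover))  # ['He', 'll'] 'o'
```

A non-empty leftover means the text length is not divisible by the n-gram size, so those trailing characters never reach the IC calculation.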
Graph vs. Table
There are two display options for IC.
Graph (Periodic analysis)
Shows a line chart, where each line is the periodic IC of a ciphertext. The height represents the IC, and the horizontal axis represents how far into the ciphertext the measurement is.
Table
Shows a table of values, with a single IC value for each ciphertext. This is the IC for the entire ciphertext.
Max Period
The max period is used in the graph to determine how far the x-axis will go. This helps with scaling the data in case characters are of varying lengths.
Show Average Lines
In the periodic analysis graph, shows a dotted line for each ciphertext indicating its average IC.
Practical Application
IC can be leveraged to:
- Determine if a cipher is periodic or aperiodic.
- Discover key lengths of periodic ciphers, such as Vigenère with a repeating key.
- Compare the periods of two or more ciphers.
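As a hedged illustration of key-length discovery, the Python sketch below encrypts a sample text with a repeating-key Vigenère cipher and prints the periodic IC for periods 1 through 10. All names, the sample plaintext, and the key LEMON are assumptions chosen for the example. It uses single characters and the block-style grouping by index modulo period.

```python
from collections import Counter

def ic(items):
    """Basic IC: sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    counts = Counter(items)
    total = sum(counts.values())
    if total < 2:
        return 0.0  # assumption: undefined IC is treated as 0
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

def periodic_ic(text, period):
    """Average IC of the columns formed by taking every period-th character."""
    columns = [text[k::period] for k in range(period)]
    return sum(ic(c) for c in columns) / period

def vigenere_encrypt(plaintext, key):
    """Classic Vigenère over uppercase A-Z with a repeating key."""
    out = []
    for i, ch in enumerate(plaintext):
        shift = ord(key[i % len(key)]) - ord("A")
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)

plaintext = ("THEINDEXOFCOINCIDENCEMEASURESHOWOFTENTWOLETTERSDRAWNFROMATEXTMATCH"
             "ANDITISHIGHERFORNATURALLANGUAGETHANFORUNIFORMLYRANDOMLETTERS")
ciphertext = vigenere_encrypt(plaintext, "LEMON")

for period in range(1, 11):
    print(period, round(periodic_ic(ciphertext, period), 4))
```

At periods that are multiples of the key length (5 and 10 here), each column is a simple Caesar shift of the matching plaintext column, so the periodic IC of the ciphertext equals that of the plaintext at the same period. With a sample this short the other periods can still be noisy, which is why longer ciphertexts give clearer spikes.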
Caveats
- Noisy results may mislead you.
- An autokey is enough to remove the periodic spikes in IC, so a lack of spikes does not rule out a polyalphabetic cipher.
- Coincidences in sufficiently short text may also mislead you.
- Some languages have similar IC patterns. If you don’t know the language of the plaintext, you may be misled.

