How it works
Character association
Genericized text is built by counting how often each character occurs in the ciphertext. The most frequent character is then associated with the letter A, the next most frequent with the letter B, and so on.
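The association described here can be sketched in a few lines of Python. This is only an illustration, not CipherInspector's actual code: the function name `genericize` is hypothetical, the 26-letter limit is a simplification, and breaking ties by first appearance is an assumption.

```python
from collections import Counter
import string


def genericize(symbols):
    """Replace each distinct symbol with a letter, ranked by frequency."""
    counts = Counter(symbols)
    # most_common() sorts by descending count; symbols with equal counts
    # keep first-seen order (an assumed tie-break, not a documented one).
    ranked = [symbol for symbol, _ in counts.most_common()]
    if len(ranked) > len(string.ascii_uppercase):
        raise ValueError("more than 26 distinct symbols")
    mapping = {symbol: string.ascii_uppercase[i] for i, symbol in enumerate(ranked)}
    return "".join(mapping[symbol] for symbol in symbols)


print(genericize("hello"))  # BCAAD
print(genericize("xyzzw"))  # BCAAD
```

Both inputs produce the same output because their characters follow the same frequency pattern, even though the characters themselves differ. The function accepts any sequence of hashable symbols, so a `bytes` object can be passed to count bytes rather than characters.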
Display vs. non-display formats
CipherInspector supports many character encodings. Some are considered “display” formats, such as UTF-8, ASCII, UTF-16, and UTF-32. Others are used to store bytes of data, such as hexadecimal, binary, octal, decimal, and base64. For display formats, the genericized text is calculated by counting each character. For non-display formats, the genericized text is calculated by counting each byte.
Interactions
The genericized text is computed only after any ciphertext updates are saved. It interacts with the following options:
- Ignore whitespace
- Ignore casing
- Ignore punctuation
- Reverse text
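
How these options change the text before counting is not spelled out here, so the following is a rough sketch under stated assumptions: "ignore" is taken to mean that whitespace and ASCII punctuation are dropped, casing is folded to upper case, and reversal is applied last. The function name `apply_options` and its parameters are hypothetical.

```python
import string


def apply_options(text, ignore_whitespace=False, ignore_casing=False,
                  ignore_punctuation=False, reverse=False):
    """Normalize text before it is genericized (assumed semantics)."""
    if ignore_casing:
        # Assumption: fold to upper case so 'a' and 'A' count as one symbol.
        text = text.upper()
    if ignore_whitespace:
        # Assumption: whitespace is removed rather than passed through.
        text = "".join(ch for ch in text if not ch.isspace())
    if ignore_punctuation:
        # Assumption: only ASCII punctuation is considered.
        text = "".join(ch for ch in text if ch not in string.punctuation)
    if reverse:
        text = text[::-1]
    return text


print(apply_options("Hi, Bob!", ignore_whitespace=True, ignore_casing=True,
                    ignore_punctuation=True, reverse=True))  # BOBIH
```

Reversing does not change the character counts, only the order in which the resulting letters appear.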
Practical Application
Genericized text is useful for evaluating whether two pieces of text that use different characters share the same character distribution.
Caveats
One major caveat is that genericized text is only practical with an alphabet of roughly 62 symbols at most, and ideally no more than 26, because intuitive symbols to use quickly run out. It is recommended to limit usage of genericized text to:
- Hexadecimal
- Decimal
- Octal
- Base64
- UTF-8, ASCII, UTF-16, and UTF-32 strings that contain between 1 and 62 unique characters.
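
The size limits above can be checked before genericizing. The sketch below assumes that the 62-symbol alphabet consists of uppercase letters, lowercase letters, and digits, which is a guess based on the number 62 and is not confirmed here. The helper name `pick_symbols` is hypothetical.

```python
import string

# Assumed output alphabets: 26 uppercase letters, or 62 letters and digits.
SYMBOLS_26 = string.ascii_uppercase
SYMBOLS_62 = string.ascii_uppercase + string.ascii_lowercase + string.digits


def pick_symbols(text):
    """Return an output alphabet large enough for text, or None if too large."""
    distinct = len(set(text))
    if distinct <= len(SYMBOLS_26):
        return SYMBOLS_26
    if distinct <= len(SYMBOLS_62):
        return SYMBOLS_62
    return None  # too many unique characters for genericized text to be practical


print(len(pick_symbols("hello")))  # 26
```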

