Understand Information Theory and Entropy
kairenner-gh/slates
Created on March 23, 2026. Last updated 2 weeks ago.

The Mathematical Foundation of Compression, Communication, and Secrecy

Claude Shannon's 1948 paper "A Mathematical Theory of Communication" defined information precisely, derived the limits of compression and reliable transmission, and — one year later — proved the conditions for perfect secrecy in cryptography. These results share the same mathematical core.

Shannon Entropy

Shannon entropy H(X) = -sum(p_i * log_2(p_i)) measures the average surprise in a random variable X. If an event is certain (p = 1), it contributes zero information: log_2(1) = 0. A fair coin flip has H = 1 bit. A fair six-sided die has H = log_2(6), approximately 2.58 bits.
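As a quick sketch, the formula above translates directly into a few lines of Python (the function name is illustrative):

```python
from math import log2

def entropy(probs):
    """Shannon entropy H(X) = -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1.0]))        # certain event: 0 bits
print(entropy([0.5, 0.5]))   # fair coin: 1 bit
print(entropy([1/6] * 6))    # fair die: log2(6), about 2.585 bits
```

The `if p > 0` guard matters: outcomes with zero probability contribute nothing, and the convention 0 * log(0) = 0 keeps the sum well defined.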

The Noiseless Coding Theorem

Shannon's first theorem: the minimum average number of bits required to encode one symbol from a source X equals H(X). No compression scheme can do better on average. Huffman coding and arithmetic coding approach this limit. This defines what compression can and cannot achieve: lossless compression removes redundancy, and once the average code length reaches H(X) bits per symbol there is nothing left to remove.
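To make the limit concrete, here is a minimal Huffman-code sketch (the heap-based construction is standard; the helper name and the example distribution are mine). When the probabilities are powers of 1/2, the Huffman average length meets the entropy exactly:

```python
import heapq
from math import log2

def huffman_lengths(freqs):
    """Return the Huffman code length for each symbol, given symbol -> probability."""
    # Heap entries: (weight, tiebreak, {symbol: depth}); depths grow as trees merge.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**c1, **c2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

freqs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = huffman_lengths(freqs)
avg = sum(freqs[s] * lengths[s] for s in freqs)
H = -sum(p * log2(p) for p in freqs.values())
print(avg, H)  # both 1.75 bits: Huffman hits the entropy for dyadic probabilities
```

For non-dyadic distributions the Huffman average exceeds H(X) by less than one bit per symbol; arithmetic coding closes the remaining gap by coding long blocks.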

Channel Capacity and the Shannon-Hartley Theorem

The maximum reliable information rate over a noisy channel is C = B * log_2(1 + S/N) bits per second, where B is bandwidth and S/N is the signal-to-noise ratio. This is not an engineering limitation — it is a theorem. Any transmission scheme can approach this capacity, but no scheme can exceed it while keeping the error rate arbitrarily low; modern error-correcting codes come remarkably close in practice.
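A quick worked instance of the formula (the function name and the telephone-line example are illustrative; SNR is usually quoted in decibels, so the sketch converts first):

```python
from math import log2

def capacity_bps(bandwidth_hz, snr_db):
    """Shannon-Hartley capacity C = B * log2(1 + S/N), with SNR given in decibels."""
    snr_linear = 10 ** (snr_db / 10)  # dB -> linear power ratio
    return bandwidth_hz * log2(1 + snr_linear)

# A classic example: a 3 kHz telephone channel at 30 dB SNR.
print(capacity_bps(3000, 30))  # roughly 29.9 kbit/s
```

That ~30 kbit/s figure is why analog modems topped out near this range: the channel, not the hardware, sets the ceiling.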

Perfect Secrecy and the One-Time Pad

Shannon proved in 1949 that a cipher has perfect secrecy if and only if the ciphertext reveals zero information about the plaintext — formally, if the ciphertext and plaintext are statistically independent. Achieving this requires key entropy at least as great as the entropy of the message: in practice, a truly random key as long as the plaintext, used exactly once. This is the one-time pad.
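The one-time pad itself is just XOR with a fresh random key, which a short sketch can show (variable names are illustrative):

```python
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """XOR each byte with the key; since XOR is its own inverse, this also decrypts."""
    if len(key) != len(plaintext):
        raise ValueError("one-time pad key must match the message length")
    return bytes(p ^ k for p, k in zip(plaintext, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # truly random, as long as the message, never reused
ciphertext = otp_encrypt(message, key)
assert otp_encrypt(ciphertext, key) == message  # applying the same key recovers the plaintext
```

Every requirement in the code mirrors the theorem: the key must be as long as the message (enough entropy), uniformly random, and never reused — reusing a pad leaks the XOR of the two plaintexts and destroys perfect secrecy.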

Entropy as Password Strength

A password chosen uniformly at random from a set of M equally likely symbols has log_2(M) bits of entropy. Lowercase letters give log_2(26), approximately 4.7 bits per character. Adding digits and symbols increases entropy per character slightly — but adding characters increases total entropy linearly: each extra character contributes a full log_2(M) bits, so length matters more than alphabet size.
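The length-versus-alphabet trade-off is easy to check numerically (a sketch; the function name is illustrative, and the formula assumes characters are chosen uniformly and independently):

```python
from math import log2

def password_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random password: length * log2(alphabet size)."""
    return length * log2(alphabet_size)

print(password_entropy_bits(8, 26))   # 8 lowercase letters: about 37.6 bits
print(password_entropy_bits(8, 94))   # 8 printable-ASCII characters: about 52.4 bits
print(password_entropy_bits(12, 26))  # 12 lowercase letters: about 56.4 bits
```

Note that 12 lowercase letters beat 8 characters drawn from the full printable-ASCII set — four extra characters outweigh more than tripling the alphabet. The formula only holds for passwords generated uniformly at random; human-chosen passwords have far less entropy than it suggests.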

Go Deeper: Entropy, Logarithms, and Number Theory

Shannon entropy is defined using logarithms and probability — the same mathematical objects appear throughout number theory and analysis. The number-theoretic results that cryptography directly uses, including Fermat's little theorem and the structure of prime fields, are worth studying as a foundation in their own right.

Understand Number Theory for Cryptography

A focused introduction to the number theory results that cryptography actually uses — Euler's theorem, the Chinese Remainder Theorem, and the primality tests that make RSA key generation fast.