The data processing inequality is an intuitive yet powerful result in information theory. Lemma: If random variables $X$, $Y$, and $Z$ form a Markov chain $X \to Y \to Z$, then $I(X;Y) \ge I(X;Z)$. Unlike some results in information theory, the inequality holds for both discrete and continuous random variables. (Actually it works even for…

## Data Processing Inequality
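
The lemma can be checked numerically on a small discrete example. The sketch below assumes the standard form of the inequality, $I(X;Y) \ge I(X;Z)$ for a Markov chain $X \to Y \to Z$. The variable names, the helper functions, and the specific probabilities are all hypothetical choices made for illustration. It builds the chain from a prior on $X$ and two conditional distributions, then compares the two mutual informations in bits.

```python
import math


def mutual_information(joint):
    """Return I(A;B) in bits for a joint pmf given as joint[a][b]."""
    p_a = [sum(row) for row in joint]          # marginal of A
    p_b = [sum(col) for col in zip(*joint)]    # marginal of B
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                mi += p * math.log2(p / (p_a[a] * p_b[b]))
    return mi


def joint_through_channel(p_in, channel):
    """Joint pmf of (input, output): joint[a][b] = p_in[a] * channel[a][b]."""
    return [[p_in[a] * q for q in channel[a]] for a in range(len(p_in))]


def compose(ch1, ch2):
    """Channel X -> Z from X -> Y (ch1) followed by Y -> Z (ch2).

    This uses the Markov property: p(z|x) = sum_y p(y|x) * p(z|y).
    """
    return [
        [sum(ch1[x][y] * ch2[y][z] for y in range(len(ch2)))
         for z in range(len(ch2[0]))]
        for x in range(len(ch1))
    ]


# Hypothetical distributions, chosen only for illustration.
p_x = [0.3, 0.7]
p_y_given_x = [[0.9, 0.1], [0.2, 0.8]]   # rows indexed by x
p_z_given_y = [[0.7, 0.3], [0.4, 0.6]]   # rows indexed by y

i_xy = mutual_information(joint_through_channel(p_x, p_y_given_x))
i_xz = mutual_information(
    joint_through_channel(p_x, compose(p_y_given_x, p_z_given_y))
)

print(f"I(X;Y) = {i_xy:.4f} bits")
print(f"I(X;Z) = {i_xz:.4f} bits")
print("I(X;Y) >= I(X;Z):", i_xy >= i_xz)
```

Processing $Y$ into $Z$ through a second noisy channel can only lose information about $X$, so the second value should never exceed the first. This covers only the discrete case; it is a sanity check, not a proof.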

