Confusing things

I tend to confuse the partition function with the evidence. When they appear together, I can tell them apart. But when they appear in different contexts, I get confused, as if falling for an optical illusion. Given data $latex x_D$ and parameter $latex \theta$, the evidence is simply $latex p(x_D;\theta)$. And…

Distributed representation

It sounds like a misnomer to me. I would probably just call it a “vector” representation. It doesn’t have the “distributed” meaning of scattering information into different places. For example, to recognize a cat with a “distributed” representation, we may distribute features into questions like “does it have a tail?”, “does it have four legs?”, and “does…

LeCun’s first lecture

LeCun has a new course on deep learning this spring. I found two things he mentioned that are worth jotting down. First, natural data lives on a low-dimensional manifold. I probably should have come across that before, but it didn’t register earlier. Come to think of it, this is a very important fact. Second, as it is…

Framing and prospect theory

The Asian disease problem illustrates that framing can alter one’s decision depending on whether gain or loss is emphasized. Prospect theory is just a fancy name for a conjecture about what happens when the utility function is indeed what economists believe.

Free energy

When we model the probability of a variable $latex x$ by $latex p(x) = e^{-\frac{F(x)}{T}}$, $latex F(x)$ is often referred to as the free energy. The name exists for historical reasons. The Gibbs-Boltzmann distribution for a configuration is proportional to $latex e^{-\frac{H}{k_B T}}$. The closest explanation I found is from here $latex p(H,T) =…
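
As a rough illustration of the form $latex e^{-\frac{F(x)}{T}}$, here is a minimal Python sketch over a handful of discrete states. It is only a toy under stated assumptions: the function name `boltzmann`, the example free-energy values, and the explicit normalization by a sum over states are all mine, not something taken from the post.

```python
import math

def boltzmann(free_energies, temperature=1.0):
    """Hypothetical helper: probabilities proportional to exp(-F(x)/T).

    Assumes a finite list of states and normalizes explicitly so the
    probabilities sum to one.
    """
    # Shift by the minimum for numerical stability; the shift cancels
    # when we divide by the normalizer.
    f_min = min(free_energies)
    weights = [math.exp(-(f - f_min) / temperature) for f in free_energies]
    z = sum(weights)  # normalizer (partition function up to the shift)
    return [w / z for w in weights]

# Made-up free energies for three states.
probs = boltzmann([0.0, 1.0, 2.0], temperature=1.0)
hot_probs = boltzmann([0.0, 1.0, 2.0], temperature=1000.0)
```

Lower free energy gives higher probability, and raising $latex T$ flattens the distribution toward uniform, which is why `hot_probs` comes out nearly equal across the three states.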