I keep confusing the partition function with the evidence. When they appear together, I am not confused; but when they appear in different contexts, I get confused, as if falling for an illusion trick. Given data $latex x_D$ and parameter $latex \theta$, the evidence is simply $latex p(x_D;\theta)$. And…
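The excerpt cuts off, but the distinction can be sketched in a couple of lines. This is a hedged side note, not a reconstruction of the post: the latent variable $latex z$, the energy $latex E$, and the normalizer $latex Z$ below are assumptions introduced for illustration.

```latex
% Evidence: the marginal likelihood of the observed data, as a function of \theta
% (here assuming a latent-variable model with latent z):
p(x_D;\theta) = \int p(x_D, z;\theta)\, dz .

% Partition function: the normalizer of an unnormalized (e.g. energy-based) model:
p(x;\theta) = \frac{e^{-E(x;\theta)}}{Z(\theta)},
\qquad
Z(\theta) = \int e^{-E(x;\theta)}\, dx .
```

Both are integrals that marginalize something out, which may be one source of the illusion: the evidence integrates over latents at fixed data, while the partition function integrates over all possible data.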

# Month: October 2020

## Distributed representation

It sounds like a misnomer to me. I would probably just call it a “vector” representation. It doesn’t carry the “distributed” sense of scattering information across different places. For example, to recognize a cat with a “distributed” representation, we might distribute features into questions like “does it have a tail?”, “does it have four legs?”, and “does…
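A minimal sketch of the contrast the post is gesturing at: a “local” (one-hot) representation dedicates one unit per concept, while a “distributed” one spreads a concept across shared feature units. The feature names below are hypothetical, purely for illustration.

```python
# Local/one-hot: one slot per concept; "cat" activates a single unit.
concepts = ["cat", "dog", "car"]
one_hot_cat = [1 if c == "cat" else 0 for c in concepts]

# Distributed: the concept is spread across shared feature units
# (hypothetical features: has_tail, has_four_legs, has_fur, has_wheels).
cat = [1, 1, 1, 0]
dog = [1, 1, 1, 0]   # overlaps with cat, so similarity is built in
car = [0, 0, 0, 1]

def overlap(a, b):
    """Dot product as a crude similarity between representations."""
    return sum(x * y for x, y in zip(a, b))

print(overlap(cat, dog))  # shared features -> large overlap
print(overlap(cat, car))  # disjoint features -> no overlap
```

One-hot vectors make every pair of concepts equally dissimilar; the distributed vectors encode similarity for free, which is the property the name is supposed to capture.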

## LeCun’s first lecture

LeCun has a new course on deep learning this spring. I found two things he mentioned worth jotting down. First, natural data lives on a low-dimensional manifold. I probably should have come across that before, but it didn’t register earlier. Come to think of it, this is a very important fact. Second, as it is…

## Framing and prospect theory

The Asian disease problem illustrates that framing can alter one’s decision depending on whether we emphasize gain or loss. Prospect theory is just a fancy name for a conjecture about what happens when the utility function is indeed what economists believe.

## One pager for proposal

A good slide from a workshop.

## Free energy

When we model the probability of a variable $latex x$ by $latex p(x) = {e^{-\frac{F(x)}{T}}}$, $latex F(x)$ is often referred to as the free energy. The name comes from historical reasons. The Gibbs-Boltzmann distribution for a configuration is proportional to $latex e^{-\frac{H}{k_B T}}$. And the closest reason I found is from here $latex p(H,T) =…
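The excerpt is truncated, so the equation it was building toward is left as-is. As a hedged aside, the standard thermodynamic connection between the partition function and the (Helmholtz) free energy goes like this:

```latex
% Gibbs-Boltzmann distribution over configurations x with energy H(x):
p(x) = \frac{e^{-H(x)/(k_B T)}}{Z},
\qquad
Z = \sum_x e^{-H(x)/(k_B T)},

% and the Helmholtz free energy is the (scaled) log-normalizer:
F = -k_B T \ln Z .
```

So a quantity appearing as $latex -T$ times a log-normalizer inherits the name “free energy”, which is plausibly the historical reason the post alludes to.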