1. Theoretical Background

(1) ELBO

Assuming i.i.d. observations $x_{1:N}$ with latent variables $z_n$, Jensen's inequality (concavity of $\log$) gives a lower bound on the log-likelihood for any variational distribution $q$:

$$ \begin{align} \log L(\theta;x) &= \log P(x_{1:N}|\theta) = \sum_{n=1}^{N}\log P(x_{n}|\theta) = \sum_{n=1}^{N}\log \sum_{z_n}P(x_n, z_n|\theta) \\ &= \sum_{n=1}^{N}\log \sum_{z_n}\frac{P(x_n, z_n|\theta)}{q(z_n|\theta)}q(z_n|\theta) = \sum_{n=1}^{N} \log E_{z_n\sim q(z|\theta)}\left[ \frac{P(x_n, z_n|\theta)}{q(z_n|\theta)}\right] \\ &\geq \sum_{n=1}^{N} E_{z_n\sim q(z|\theta)}\left[ \log\frac{P(x_n, z_n|\theta)}{q(z_n|\theta)}\right] = L(q(z|\theta), \theta) \end{align} $$

$$ \begin{align} L(q(z|\theta), \theta) &= \sum_{n=1}^{N}E_{z_n\sim q(z|\theta)}\left[ \log\frac{P(x_n, z_n|\theta)}{q(z_n|\theta)}\right] \\ &= \sum_{n=1}^{N}E_{z_n\sim q(z|\theta)}\left[\log P(x_n, z_n|\theta)\right] - \sum_{n=1}^{N}E_{z_n\sim q(z|\theta)}\left[\log q(z_n|\theta)\right] \end{align} $$
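A minimal numeric check of the bound, assuming a hypothetical two-component discrete model (the model, numbers, and names like `joint` and `elbo` are illustrative, not from the text): for any choice of $q$, the ELBO stays below $\log P(x|\theta)$.

```python
import numpy as np

# Hypothetical toy model (illustrative only): latent z in {0, 1},
# binary observation x, with
#   P(z|theta) = [0.4, 0.6],  P(x=1|z=0) = 0.2,  P(x=1|z=1) = 0.9
prior = np.array([0.4, 0.6])
lik_x1 = np.array([0.2, 0.9])

def joint(x, z):
    """P(x, z|theta) for the toy model."""
    px_given_z = lik_x1[z] if x == 1 else 1.0 - lik_x1[z]
    return prior[z] * px_given_z

x = 1
log_px = np.log(sum(joint(x, z) for z in (0, 1)))  # exact log P(x|theta)

def elbo(q):
    """E_{z~q}[log P(x, z|theta) - log q(z)]."""
    return sum(q[z] * (np.log(joint(x, z)) - np.log(q[z])) for z in (0, 1))

# Jensen's inequality: ELBO <= log P(x|theta) for every q.
for q0 in (0.1, 0.5, 0.9):
    q = np.array([q0, 1.0 - q0])
    assert elbo(q) <= log_px + 1e-12
    print(f"ELBO = {elbo(q):.4f}  <=  log P(x) = {log_px:.4f}")
```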

(2) KL-Divergence & Equality Condition

$$ \begin{align} KL(q(z|\theta)\,||\,p(z|x)) &= E_{q(z|\theta)}\left[\log\frac{q(z|\theta)}{p(z|x)}\right] \\ &= E_{q(z|\theta)}\left[\log q(z|\theta)\right] - E_{q(z|\theta)}\left[\log p(z|x)\right] \\ &= E_{q(z|\theta)}\left[\log q(z|\theta)\right] - E_{q(z|\theta)}\left[\log p(x, z|\theta)\right] + \log p(x) \\ &= E_{z\sim q(z|\theta)}\left[ \log\frac{q(z|\theta)}{P(x, z|\theta)}\right] + \log p(x) \\ &= -ELBO(q, \theta) + \log p(x) \end{align} $$

where $\log p(x)$ comes out of the expectation because it does not depend on $z$.

$$ \log p(x|\theta) = ELBO(q, \theta) + KL(q(z|\theta)\,||\,p(z|x)) $$

Since $KL \geq 0$, the bound is tight exactly when $q(z|\theta) = p(z|x)$: the ELBO equals the log-likelihood if and only if $q$ is the true posterior.
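The identity can be verified numerically; a minimal sketch using the same hypothetical two-component model as above (all numbers illustrative):

```python
import numpy as np

# Same hypothetical toy model: z in {0, 1}, binary x.
prior = np.array([0.4, 0.6])
lik_x1 = np.array([0.2, 0.9])
x = 1

joint = prior * (lik_x1 if x == 1 else 1.0 - lik_x1)  # P(x, z|theta) over z
px = joint.sum()                                      # P(x|theta)
posterior = joint / px                                # p(z|x)

q = np.array([0.3, 0.7])                              # an arbitrary q(z|theta)
elbo = np.sum(q * (np.log(joint) - np.log(q)))        # E_q[log P(x,z) - log q]
kl = np.sum(q * (np.log(q) - np.log(posterior)))      # KL(q || p(z|x))

# log p(x|theta) = ELBO + KL holds exactly; KL = 0 iff q is the posterior.
assert np.isclose(np.log(px), elbo + kl)
print(np.log(px), elbo + kl)
```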

(3) Other Forms

A. Expected Complete Log-Likelihood + Entropy

$$ \begin{align} ELBO(q, \theta) &= E_{z_n\sim q(z|\theta)}\left[\log P(x_n, z_n|\theta)\right] - E_{z_n\sim q(z|\theta)}\left[\log q(z_n|\theta)\right] \\ &= E_{z_n\sim q(z|\theta)}\left[\log P(x_n, z_n|\theta)\right] + H(q(z|\theta)) \end{align} $$

i.e., the expected complete-data log-likelihood plus the entropy of $q$.
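The same hypothetical toy model also checks form A numerically (names and numbers illustrative):

```python
import numpy as np

# Same hypothetical toy model: z in {0, 1}, binary x.
prior = np.array([0.4, 0.6])
lik_x1 = np.array([0.2, 0.9])
x = 1

log_joint = np.log(prior * (lik_x1 if x == 1 else 1.0 - lik_x1))  # log P(x, z|theta)

q = np.array([0.3, 0.7])                       # an arbitrary q(z|theta)
elbo = np.sum(q * (log_joint - np.log(q)))
expected_complete_ll = np.sum(q * log_joint)   # E_q[log P(x, z|theta)]
entropy = -np.sum(q * np.log(q))               # H(q)

# Form A: ELBO = expected complete log-likelihood + entropy.
assert np.isclose(elbo, expected_complete_ll + entropy)
print(elbo, expected_complete_ll + entropy)
```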

B. Reconstruction Loss + KL Regularization
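Assuming the joint factors as $P(x, z|\theta) = P(x|z, \theta)P(z|\theta)$ (an assumption made here to complete the decomposition; this is the standard form used in VAEs), the ELBO splits into a reconstruction term and a KL penalty against the prior:

$$ \begin{align} ELBO(q, \theta) &= E_{z\sim q(z|\theta)}\left[\log P(x|z, \theta)\right] + E_{z\sim q(z|\theta)}\left[\log \frac{P(z|\theta)}{q(z|\theta)}\right] \\ &= E_{z\sim q(z|\theta)}\left[\log P(x|z, \theta)\right] - KL(q(z|\theta)\,||\,P(z|\theta)) \end{align} $$

Maximizing the first term rewards $q$ for placing mass on latents that reconstruct $x$ well, while the second term keeps $q$ close to the prior.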