# Variational Inference

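Compared with MCMC sampling, VI is usually faster on large datasets.

1 Start from the product rule $$P(X,Z)=P(Z|X)P(X)$$ and introduce an arbitrary distribution $$q(Z)$$ over the latent variables: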

$$
\begin{eqnarray*}
\ln{P(X)}&=&\ln{P(X,Z)}-\ln{P(Z|X)}\\
&=&\ln{\frac{P(X,Z)}{q(Z)}}-\ln{\frac{P(Z|X)}{q(Z)}}\\
&=&\ln{P(X,Z)}-\ln{q(Z)}-\ln{\frac{P(Z|X)}{q(Z)}}
\end{eqnarray*}
$$

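2 Multiply both sides by $$q(Z)$$ and integrate over $$Z$$. Because $$\ln{P(X)}$$ does not depend on $$Z$$ and $$q(Z)$$ integrates to 1, the left-hand side is unchanged: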

$$
\begin{eqnarray*}
\int_{Z}q(Z)\ln{P(X)}\,dZ&=&\int_{Z}q(Z)\ln{P(X,Z)}\,dZ-\int_{Z}q(Z)\ln{q(Z)}\,dZ-\int_{Z}q(Z)\ln{\frac{P(Z|X)}{q(Z)}}\,dZ\\
\ln{P(X)}&=&\underbrace{\int_{Z}q(Z)\ln{P(X,Z)}\,dZ-\int_{Z}q(Z)\ln{q(Z)}\,dZ}_{L(q)\text{: ELBO}}\underbrace{-\int_{Z}q(Z)\ln{\frac{P(Z|X)}{q(Z)}}\,dZ}_{\text{KL}(q(Z)\,\|\,P(Z|X))}
\end{eqnarray*}
$$
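The decomposition $$\ln{P(X)}=L(q)+\text{KL}(q(Z)\,\|\,P(Z|X))$$ can be checked numerically on a small discrete model. The sketch below is only an illustration: the joint table `joint` and the distribution `q` are made-up numbers, and the integrals become sums because $$Z$$ takes just three values.

```python
import math

# Hypothetical toy model: Z takes 3 values, X is fixed to one observed value.
# joint[z] = P(X = x_obs, Z = z); the numbers are invented for illustration.
joint = [0.10, 0.25, 0.05]

evidence = sum(joint)                       # P(X) = sum_z P(X, Z = z)
posterior = [p / evidence for p in joint]   # P(Z | X)

# Any q(Z) with full support works; this one is arbitrary.
q = [0.5, 0.3, 0.2]

# ELBO: L(q) = sum_z q(z) ln P(X, z) - sum_z q(z) ln q(z)
elbo = sum(qz * (math.log(pz) - math.log(qz)) for qz, pz in zip(q, joint))

# KL(q || P(Z|X)) = -sum_z q(z) ln( P(z|X) / q(z) )
kl = sum(qz * (math.log(qz) - math.log(pz)) for qz, pz in zip(q, posterior))

log_evidence = math.log(evidence)
print(f"ln P(X)   = {log_evidence:.6f}")
print(f"ELBO + KL = {elbo + kl:.6f}")
```

The two printed values should agree up to floating-point error, and since the KL term is non-negative, `elbo` can never exceed `log_evidence`.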

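3 The true posterior $$P(Z|X)$$ is usually too complex to compute, so we approximate it with $$q(Z)$$. Because $$\ln{P(X)}$$ is fixed and the KL term is non-negative, maximizing $$L(q)$$ is equivalent to minimizing the KL divergence.\
We choose the factorized form $$q(Z)=q_1(Z_1)q_2(Z_2)\cdots q_M(Z_M) = \prod_{i=1}^{M}q_{i}(Z_{i})$$, which makes the integrals easier to compute. This is called mean field theory. It rests on the physical idea that local interactions between the individuals of a system can produce fairly stable behaviour at the macroscopic level.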

$$
\begin{eqnarray*}
L(q)&=&\int_{Z}q(Z)\ln{P(X,Z)}\,dZ-\int_{Z}q(Z)\ln{q(Z)}\,dZ\\
&=&\underbrace{\int_{Z}\prod_{i=1}^{M}q_{i}(Z_{i})\ln{P(X,Z)}\,dZ}_{L_{1}} - \underbrace{\int_{Z}\prod_{i=1}^{M}q_{i}(Z_{i})\ln{\prod_{i=1}^{M}q_{i}(Z_{i})}\,dZ}_{L_{2}}
\end{eqnarray*}
$$

For the $$L_{1}$$ term, isolate the $$j$$-th factor:

$$
\begin{eqnarray*}
L_{1}&=&\int_{Z}\prod_{i=1}^{M}q_{i}(Z_{i})\ln{P(X,Z)}\,dZ\\
&=&\idotsint\limits_{Z_{1},Z_{2},\cdots,Z_{M}} \prod_{i=1}^{M}q_{i}(Z_{i})\ln{P(X,Z)}\,dZ_{1}dZ_{2}\ldots dZ_{M}\\
&=&\int_{Z_{j}} q_{j}(Z_{j})\left(\idotsint\limits_{Z_{i\neq j}} \prod_{i\neq j}^{M} q_{i}(Z_{i})\ln{P(X,Z)}\prod_{i\neq j}^{M}dZ_{i}\right)dZ_{j}\\
&=&\int_{Z_{j}}q_{j}(Z_{j})\,E_{i\neq j}(\ln{P(X,Z)})\,dZ_{j}
\end{eqnarray*}
$$
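As a sanity check on this rearrangement, the following sketch evaluates $$L_{1}$$ both ways for two discrete latent variables with $$j=1$$. The joint table and the factors `q1`, `q2` are hypothetical values chosen only for the demonstration.

```python
import math

# Hypothetical joint P(X = x_obs, Z1, Z2) on a 2 x 3 grid (invented numbers).
joint = [[0.10, 0.05, 0.15],
         [0.20, 0.02, 0.08]]
q1 = [0.4, 0.6]         # q_1(Z_1)
q2 = [0.2, 0.3, 0.5]    # q_2(Z_2)

# Direct evaluation: L_1 = sum_{z1, z2} q1(z1) q2(z2) ln P(X, z1, z2)
l1_direct = sum(q1[a] * q2[b] * math.log(joint[a][b])
                for a in range(2) for b in range(3))

# Isolating j = 1: average ln P over q_2 first, giving a function of z1 only,
# i.e. E_{i != 1}[ln P(X, Z)] ...
expected_log_joint = [sum(q2[b] * math.log(joint[a][b]) for b in range(3))
                      for a in range(2)]
# ... then average that function over q_1.
l1_isolated = sum(q1[a] * expected_log_joint[a] for a in range(2))

print(l1_direct, l1_isolated)
```

Both numbers should match up to rounding, which is all the derivation claims: the inner expectation depends on $$Z_{j}$$ alone.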

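For the $$L_{2}$$ term, again keep only the $$j$$-th factor. Each $$q_{i}$$ integrates to 1, so the sum separates, and the terms with $$k\neq j$$ do not depend on $$q_{j}$$:

$$
\begin{eqnarray*}
L_{2}&=&\int_{Z}\prod_{i=1}^{M}q_{i}(Z_{i})\sum_{k=1}^{M}\ln{q_{k}(Z_{k})}\,dZ\\
&=&\sum_{k=1}^{M}\int\limits_{Z_{k}}q_{k}(Z_{k})\ln{q_{k}(Z_{k})}\,dZ_{k}\\
&=&\int\limits_{Z_{j}}q_{j}(Z_{j})\ln{q_{j}(Z_{j})}\,dZ_{j}+const
\end{eqnarray*}
$$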
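The separation of $$L_{2}$$ into one term per factor can be verified in the same way. This is a small sketch with two hypothetical discrete factors `q1` and `q2`; it compares the full double sum with the sum of the two single-factor terms.

```python
import math

# Hypothetical mean-field factors over two discrete latent variables.
q1 = [0.4, 0.6]         # q_1(Z_1)
q2 = [0.2, 0.3, 0.5]    # q_2(Z_2)

# Full form: L_2 = sum_{z1, z2} q1(z1) q2(z2) ln( q1(z1) q2(z2) )
l2_direct = sum(q1[a] * q2[b] * math.log(q1[a] * q2[b])
                for a in range(2) for b in range(3))

# Separated form: one term per factor, because each q_i sums to 1.
l2_term1 = sum(p * math.log(p) for p in q1)
l2_term2 = sum(p * math.log(p) for p in q2)

print(l2_direct, l2_term1 + l2_term2)
```

With $$j=1$$, `l2_term2` plays the role of the constant: it does not change when only `q1` is varied.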

4 So $$L(q)$$ finally simplifies to:

$$
\begin{eqnarray*}
L(q)&=&L_{1}-L_{2}\\
&=&\int\limits_{Z_{j}}q_{j}(Z_{j})\,E_{i\neq j}(\ln{P(X,Z)})\,dZ_{j}-\int\limits_{Z_{j}}q_{j}(Z_{j})\ln{q_{j}(Z_{j})}\,dZ_{j}+const\\
&=&\int\limits_{Z_{j}}q_{j}(Z_{j})\left(E_{i\neq j}(\ln{P(X,Z)})-\ln{q_{j}(Z_{j})}\right)dZ_{j}+const
\end{eqnarray*}
$$

Simplifying further, define $$\tilde{P}(X,Z_{j})$$ by:

$$\begin{equation} \ln{\tilde{P}(X,Z_{j})}=E_{i\neq j}(\ln{P(X,Z)}) \end{equation}$$

$$\begin{equation} L(q)=\int\limits_{Z_{j}}q_{j}(Z_{j})\ln{\frac{\tilde{P}(X,Z_{j})}{q_{j}(Z_{j})}}\,dZ_{j}+const=-D_{KL}(q_{j}(Z_{j})\,\|\,\tilde{P}(X,Z_{j}))+const \end{equation}$$
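With the other factors held fixed, the KL term above is smallest when $$q_{j}$$ is proportional to $$\tilde{P}(X,Z_{j})$$, which gives the standard mean-field update $$\ln{q_{j}^{*}(Z_{j})}=E_{i\neq j}(\ln{P(X,Z)})+const$$. Cycling through the factors with this update is usually called coordinate ascent variational inference. The sketch below applies it to a made-up discrete joint over two latent variables; the table, the initial factors, and the helper names `normalize_exp` and `elbo` are assumptions for illustration, not part of the original notes.

```python
import math

# Hypothetical joint P(X = x_obs, Z1, Z2) on a 2 x 3 grid (invented numbers).
joint = [[0.10, 0.05, 0.15],
         [0.20, 0.02, 0.08]]
log_joint = [[math.log(p) for p in row] for row in joint]
n1, n2 = 2, 3

def normalize_exp(scores):
    """Exponentiate the scores and normalize them to sum to 1 (a softmax)."""
    top = max(scores)                       # subtract the max for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def elbo(q1, q2):
    """L(q) = E_q[ln P(X, Z)] - E_q[ln q(Z)] for q(Z) = q1(Z1) q2(Z2)."""
    value = 0.0
    for a in range(n1):
        for b in range(n2):
            value += q1[a] * q2[b] * (
                log_joint[a][b] - math.log(q1[a]) - math.log(q2[b]))
    return value

q1 = [0.5, 0.5]
q2 = [1 / 3, 1 / 3, 1 / 3]
history = [elbo(q1, q2)]
for _ in range(20):
    # Update q1 with q2 fixed: ln q1(z1) = E_{q2}[ln P(X, z1, Z2)] + const
    q1 = normalize_exp([sum(q2[b] * log_joint[a][b] for b in range(n2))
                        for a in range(n1)])
    history.append(elbo(q1, q2))
    # Update q2 with q1 fixed: ln q2(z2) = E_{q1}[ln P(X, Z1, z2)] + const
    q2 = normalize_exp([sum(q1[a] * log_joint[a][b] for a in range(n1))
                        for b in range(n2)])
    history.append(elbo(q1, q2))

log_evidence = math.log(sum(sum(row) for row in joint))
print(f"final ELBO = {history[-1]:.6f}, ln P(X) = {log_evidence:.6f}")
```

Each update maximizes $$L(q)$$ over one factor exactly, so the recorded ELBO values should never decrease. The final ELBO stays below $$\ln{P(X)}$$ here because this joint table does not factorize across $$Z_{1}$$ and $$Z_{2}$$, so no mean-field $$q$$ can match the posterior exactly.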

![](/files/-M7DeTQIM58HAn4-ZP73)

## Recommended reading

[Variational Inference: Deep Learning, Chapter 19 (变分推断——深度学习第十九章)](https://zhuanlan.zhihu.com/p/49401976)

[Daily Paper Reading (论文每日读): NIPS 2016 Tutorial: Variational Inference](http://weibo.com/ttarticle/p/show?id=2309404062185344101869)

[Automatic Differentiation Variational Inference](https://arxiv.org/abs/1603.00788)

[Hierarchical Variational Models](https://arxiv.org/pdf/1511.02386v1.pdf)\
[Fast hierarchical Gaussian processes](http://sethrf.com/files/fast-hierarchical-GPs.pdf)

[Hierarchical Variational Models](http://approximateinference.org/schedule/Ranganath2015.pdf)

[洪亮劼, Daily Paper Reading (论文每日读): Stein Variational Gradient Descent](http://weibo.com/ttarticle/p/show?id=2309404051346545369162)

[Variational Bayes (变分贝叶斯)](http://www.junnanzhu.com/?p=386)

[Variational Inference with Implicit Probabilistic Models: Part 1:Bayesian Logistic Regression](http://www.inference.vc/variational-inference-with-implicit-probabilistic-models-part-1-2/)

[Variational Inference with Implicit Models Part II: Amortised Inference](http://www.inference.vc/variational-inference-with-implicit-models-part-ii-amortised-inference-2/)

[The Beauty of Mathematics: The Fastest Path Between Two Points (数学之美：两点之间最快的路径)](http://jandan.net/2014/04/21/beauty-in-math.html)

