==Variational approach using the Bethe approximation==
The Boltzmann distribution is
:<math> P(\underline\sigma)=\frac{1}{Z}e^{-\beta E(\underline\sigma)}</math>
:<math> Z=\sum_{\underline\sigma}e^{-\beta E(\underline\sigma)}</math>

The Bethe approximation uses the ansatz
:<math> \hat P(\underline \sigma)=\frac{\prod_{\langle ij\rangle}P_{ij}(\sigma_i,\sigma_j)}{\prod_iP_i^{d_i-1}(\sigma_i)}.</math>
Here <math>\textstyle\langle ij\rangle </math> denotes a pair of nodes connected in the topology of the Ising model, i.e. <math>\textstyle J_{ij}>0</math>; <math>\textstyle d_i </math> is the degree of node ''i'' in the topology; and
:<math>\textstyle P_{ij}(\sigma_i,\sigma_j)=\sum_{\underline\sigma\backslash\sigma_i,\sigma_j}P(\underline\sigma),</math>
:<math>\textstyle P_{i}(\sigma_i) =\sum_{\underline\sigma\backslash\sigma_i}P(\underline\sigma)</math>
are the two-point and one-point marginals, respectively.

Then the task is to find the parameters of the Bethe approximation such that
:<math>\{\hat P_{ij},\hat P_i \}=\arg\min_{P_{ij},P_i}D_{\textrm{KL}}(\hat P\|P).</math>
The Kullback–Leibler divergence can be written as
:<math>
\begin{align}
D_{\textrm{KL}}(\hat P\|P) &= \sum_{\underline\sigma}\hat P(\underline\sigma)\log\frac{\hat P(\underline\sigma)}{ P(\underline\sigma)}\\
&=\sum_{\underline\sigma}\hat P(\underline\sigma)\log \hat P(\underline\sigma)-\sum_{\underline\sigma}\hat P(\underline\sigma)\log P(\underline\sigma)\\
&=-S_{\hat P}+\sum_{\underline\sigma} \hat{P}(\underline\sigma)\beta E(\underline\sigma)+\sum_{\underline\sigma}\hat P(\underline\sigma)\log Z\\
&=-S_{\hat P}+\beta\langle E\rangle_{\hat P} +\log Z\\
&=\beta F_{\hat P}-\beta F_{P}\\
&\geq 0.
\end{align}
</math>
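The identity in the last lines, <math>\textstyle D_{\textrm{KL}}=\beta F_{\hat P}-\beta F_P\geq 0</math>, can be verified numerically on a tiny system. Below is a minimal sketch; the 3-spin chain, <math>\beta</math>, and the random trial distribution are illustrative choices, not from the text.

```python
import itertools

import numpy as np

# Numerical check of D_KL = β⟨E⟩_P̂ − S_P̂ + log Z = βF_P̂ − βF_P on a
# 3-spin Ising chain with E(σ) = −σ0σ1 − σ1σ2. The trial distribution P̂
# is an arbitrary random distribution over the 2^3 configurations.

rng = np.random.default_rng(1)
beta, edges, n = 0.9, [(0, 1), (1, 2)], 3

confs = list(itertools.product([-1, 1], repeat=n))
E = np.array([-sum(c[a] * c[b] for a, b in edges) for c in confs])
w = np.exp(-beta * E)
P = w / w.sum()                              # Boltzmann distribution
log_Z = np.log(w.sum())

P_hat = rng.dirichlet(np.ones(len(confs)))   # arbitrary trial distribution

kl = float(np.sum(P_hat * np.log(P_hat / P)))
beta_F_hat = float(np.sum(P_hat * beta * E) + np.sum(P_hat * np.log(P_hat)))
beta_F = -log_Z                              # βF_P = −log Z
```

For any choice of the trial distribution, `kl` equals `beta_F_hat - beta_F` and is non-negative.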

This implies that
* the variational free energy <math>\textstyle F_{\hat P}</math> is an upper bound on the true free energy, and
* a good estimate of the parameters of the approximation can be obtained by minimizing it:
:<math>\{\hat P_{ij},\hat P_i \}=\arg\min_{P_{ij},P_i}F_{\hat P}.</math>

Since
:<math> \hat P(\underline \sigma)=\frac{\prod_{\langle ij\rangle}P_{ij}(\sigma_i,\sigma_j)}{\prod_iP_i^{d_i-1}(\sigma_i)},</math>
we have
:<math>\log \hat{P}(\underline\sigma)=\sum_{\langle ij\rangle}\log P_{ij}-\sum_i(d_i-1)\log P_i.</math>
Thus, setting <math>\textstyle\beta=1</math> for brevity (so that <math>\textstyle F_{\hat P}=\langle E\rangle_{\hat P}-S_{\hat P}</math>),
:<math>
\begin{align}
F_{\hat P}&=\sum_{\underline\sigma}\hat P(\underline\sigma)\sum_{\langle ij\rangle}E_{ij}(\sigma_i,\sigma_j)+\sum_{\underline\sigma}\hat P(\underline\sigma)\sum_{\langle ij\rangle}\log P_{ij}-\sum_{\underline\sigma}\hat P(\underline\sigma)\sum_i(d_i-1)\log P_i\\
&=\sum_{\langle ij\rangle}\sum_{\sigma_i,\sigma_j}P_{ij}(\sigma_i,\sigma_j)\left(E_{ij}+\log P_{ij}\right)-\sum_i(d_i-1)\sum_{\sigma_i}P_i(\sigma_i)\log P_i(\sigma_i).
\end{align}
</math>
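On a tree the ansatz is exact, so plugging the exact marginals into this expression must recover the true free energy <math>\textstyle -\log Z</math> (at <math>\textstyle\beta=1</math>). A brute-force sketch of that check; the star graph is an illustrative choice.

```python
import itertools

import numpy as np

# Check of the marginal form of F_P̂: on a tree, the exact marginals P_ij, P_i
# plugged into Σ_(ij) Σ P_ij (E_ij + log P_ij) − Σ_i (d_i−1) Σ P_i log P_i
# recover the exact free energy −log Z (β = 1, E_ij = −σ_iσ_j).

edges, n = [(0, 1), (1, 2), (1, 3)], 4
confs = list(itertools.product([-1, 1], repeat=n))
w = np.array([np.exp(sum(c[a] * c[b] for a, b in edges)) for c in confs])
P = w / w.sum()                       # exact Boltzmann distribution

def marg2(a, b):
    # exact two-point marginal P_ab(σ_a, σ_b)
    p = {}
    for c, pc in zip(confs, P):
        p[(c[a], c[b])] = p.get((c[a], c[b]), 0.0) + pc
    return p

def marg1(a):
    # exact one-point marginal P_a(σ_a)
    p = {}
    for c, pc in zip(confs, P):
        p[c[a]] = p.get(c[a], 0.0) + pc
    return p

deg = [sum(v in e for e in edges) for v in range(n)]
F = 0.0
for a, b in edges:
    for (sa, sb), pab in marg2(a, b).items():
        F += pab * (-sa * sb + np.log(pab))      # E_ab + log P_ab term
for v in range(n):
    F -= (deg[v] - 1) * sum(pv * np.log(pv) for pv in marg1(v).values())
```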

Now let us introduce a Lagrangian, with multipliers enforcing normalization of the one-point marginals and consistency between the two-point and one-point marginals:

:<math>
\begin{align}
\mathcal L&=\sum_{\langle ij\rangle}\sum_{\sigma_i\sigma_j}P_{ij}(\sigma_i,\sigma_j)(E_{ij}+\log P_{ij})-\sum_i\sum_{\sigma_i}P_i(\sigma_i)(d_i-1)\log P_i(\sigma_i)\\
&-\sum_i\lambda_i\left (\sum_{\sigma_i}P_i(\sigma_i)-1\right )-\sum_{\langle ij\rangle}\sum_{\sigma_i}\lambda_{j\to i}(\sigma_i)\left ( \sum_{\sigma_j} P_{ij}(\sigma_i,\sigma_j)-P_i(\sigma_i)\right)-\sum_{\langle ij\rangle}\sum_{\sigma_j}\lambda_{i\to j}(\sigma_j)\left ( \sum_{\sigma_i} P_{ij}(\sigma_i,\sigma_j)-P_j(\sigma_j)\right)
\end{align}
</math>

By setting derivatives to zero we have
:<math>
\begin{align}
\frac{\partial \mathcal L}{\partial P_{i}(\sigma_i)}&=-(d_i-1)(1+\log P_i(\sigma_i))-\lambda_i+\sum_{j\in\partial i}\lambda_{j\to i}(\sigma_i)=0,\\
\frac{\partial \mathcal L}{\partial P_{ij}(\sigma_i,\sigma_j)}&=E_{ij}+1+\log P_{ij}-\lambda_{j\to i}(\sigma_i)-\lambda_{i\to j}(\sigma_j)=0.
\end{align}
</math>
The last equations yield
:<math>
\begin{align}
P_i(\sigma_i)&\propto e^{\frac{1}{d_i-1}\sum_{j\in\partial i}\lambda_{j\to i}(\sigma_i)},\\
P_{ij}(\sigma_i,\sigma_j)&\propto e^{-E_{ij}+\lambda_{j\to i}(\sigma_i)+\lambda_{i\to j}(\sigma_j)}.
\end{align}
</math>
After introducing new variables, the ''cavity messages'' <math>\textstyle \psi^{i\to j}(\sigma_i)</math> along the directed edges of the graph (a reparametrization of the multipliers), and making use of the consistency constraints, the stationary point can be written as
:<math>
\begin{align}
P_i(\sigma_i)&\propto \prod_{j\in\partial i}\sum_{\sigma_j}e^{-E_{ij}(\sigma_i,\sigma_j)}\psi^{j\to i}(\sigma_j),\\
P_{ij}(\sigma_i,\sigma_j)&\propto e^{-E_{ij}(\sigma_i,\sigma_j)}\,\psi^{i\to j}(\sigma_i)\,\psi^{j\to i}(\sigma_j),
\end{align}
</math>
where the messages are determined self-consistently by the ''Belief Propagation'' equations (also called ''Sum-Product'' equations):
:<math>
\begin{align}
\psi_{i\to j}(\sigma_i)&=\frac{1}{Z_{i\to j}}\prod_{k\in\partial i\backslash j}\sum_{\sigma_k}e^{-\beta E_{ik}(\sigma_i,\sigma_k)}\psi_{k\to i}(\sigma_k),\\
Z_{i\to j}&=\sum_{\sigma_i}\prod_{k\in\partial i\backslash j}\sum_{\sigma_k}e^{-\beta E_{ik}(\sigma_i,\sigma_k)}\psi_{k\to i}(\sigma_k).
\end{align}
</math>
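A minimal numerical sketch of this fixed-point iteration, for a ±1 Ising model with <math>\textstyle E_{ij}(\sigma_i,\sigma_j)=-\sigma_i\sigma_j</math> and zero external field; the graph, <math>\beta</math>, and the convergence parameters are illustrative choices, not from the text.

```python
import numpy as np

# Sum-Product sketch for E_ij(σ_i, σ_j) = −σ_iσ_j with spins ±1. The message
# psi[(i, j)] is ψ_{i→j}(σ_i), a length-2 array over σ_i ∈ {−1, +1}.

def bp_ising(edges, n, beta=0.5, iters=100, tol=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    s = np.array([-1.0, 1.0])
    W = np.exp(beta * np.outer(s, s))            # W[σ, σ'] = e^{−βE(σ, σ')}
    psi = {}
    for i, j in edges:                           # random initial messages
        for a, b in ((i, j), (j, i)):
            m = rng.random(2) + 0.1
            psi[(a, b)] = m / m.sum()
    for _ in range(iters):
        delta = 0.0
        for (i, j), old in list(psi.items()):
            new = np.ones(2)                     # function of σ_i
            for k in nbrs[i]:
                if k != j:                       # Π_{k∈∂i\j} Σ_{σ_k} e^{−βE} ψ_{k→i}
                    new *= W @ psi[(k, i)]
            new /= new.sum()                     # normalization Z_{i→j}
            delta = max(delta, float(np.abs(new - old).max()))
            psi[(i, j)] = new
        if delta < tol:
            break
    return psi, nbrs, W

def bp_marginal(i, psi, nbrs, W):
    # P_i(σ_i) ∝ Π_{j∈∂i} Σ_{σ_j} e^{−βE_ij} ψ_{j→i}(σ_j)
    p = np.ones(2)
    for j in nbrs[i]:
        p *= W @ psi[(j, i)]
    return p / p.sum()
```

On this zero-field model the exact marginals are uniform by spin-flip symmetry, and the converged messages reproduce that.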


==Deriving Belief Propagation directly from conditional independence==
We can also obtain the marginals directly, by adding one extra spin ''i'' to a system of ''N'' spins. Below, <math>\textstyle\underline{\sigma_j}</math> denotes the configuration of the neighbors of ''i'' and <math>\textstyle\underline{\sigma_k}</math> that of the remaining spins; the factorization of <math>\textstyle P_N(\underline{\sigma_j})</math> over the neighbors assumes they are conditionally independent once spin ''i'' is removed (exact on a tree).
:<math>
\begin{align}
P_i(\sigma_i)&=\sum_{\{\underline{\sigma_j} \},\{\underline{\sigma_k}\}}P_{N+1}(\sigma_i;\underline{\sigma_j};\underline{\sigma_k})\\
&=\frac{\sum_{\{\underline{\sigma_j} \},\{\underline{\sigma_k} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j} \},\{\underline{\sigma_k} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} }\\
&=\frac{\sum_{\{\underline{\sigma_j} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} }\\
&=\frac{\sum_{\{\underline{\sigma_j} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\frac{\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})}}{\sum_{\{\underline{\sigma_j} \}}\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} } }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j} \}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\frac{\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})}}{\sum_{\{\underline{\sigma_j} \}}\sum_{\{\underline{\sigma_k} \}}e^{ -\beta \sum_{x,y:x,y\neq i} E_{x,y}(\underline{\sigma_j},\underline{\sigma_k})} } }\\
&=\frac{\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\sum_{\{\underline{\sigma_k}\}}P_N(\underline{\sigma_j},\underline{\sigma_k}) }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\sum_{\{\underline{\sigma_k}\}}P_N(\underline{\sigma_j},\underline{\sigma_k}) }\\
&=\frac{\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}P_N(\underline{\sigma_j}) }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}P_N(\underline{\sigma_j}) }\\
&=\frac{\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\prod_{j\in\partial i}P_N(\sigma_j) }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j}\}}e^{ -\beta \sum_j E_{i,j}(\sigma_i,\underline{\sigma_j})}\prod_{j\in\partial i}P_N(\sigma_j) }\\
&=\frac{\sum_{\{\underline{\sigma_j}\}}\prod_{j\in\partial i}e^{ -\beta E_{i,j}(\sigma_i,\sigma_j)}\psi_{j\to i}(\sigma_j) }{\sum_{\sigma_i}\sum_{\{\underline{\sigma_j}\}}\prod_{j\in\partial i}e^{ -\beta E_{i,j}(\sigma_i,\sigma_j)}\psi_{j\to i}(\sigma_j) }\\
&=\frac{\prod_{j\in\partial i}\sum_{\sigma_j}e^{ -\beta E_{i,j}(\sigma_i,\sigma_j)}\psi_{j\to i}(\sigma_j) }{\sum_{\sigma_i}\prod_{j\in\partial i}\sum_{\sigma_j}e^{ -\beta E_{i,j}(\sigma_i,\sigma_j)}\psi_{j\to i}(\sigma_j) }
\end{align}
</math>
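The last line can be checked by brute force on a small tree, computing the message <math>\textstyle\psi_{j\to i}(\sigma_j)</math> literally as the marginal of <math>\textstyle\sigma_j</math> in the system with spin ''i'' removed. In the sketch below the couplings are <math>\textstyle J=1</math>, and illustrative local fields <math>\textstyle h_i</math> are added (an assumption, only so that the marginals are not trivially uniform).

```python
import itertools

import numpy as np

# Brute-force check of the product formula for P_i(σ_i) on a small tree.
# The cavity message is the exact marginal in the system with spin i removed.
# Model (an illustrative assumption): E = −Σ σ_aσ_b − Σ h_v σ_v, spins ±1.

def exact_marginal(i, edges, h, beta, exclude=None):
    # marginal of spin i by enumeration, optionally with one spin removed
    n = len(h)
    nodes = [v for v in range(n) if v != exclude]
    active = [(a, b) for a, b in edges if exclude not in (a, b)]
    p = {-1: 0.0, 1: 0.0}
    for conf in itertools.product([-1, 1], repeat=len(nodes)):
        s = dict(zip(nodes, conf))
        energy = -sum(s[a] * s[b] for a, b in active) - sum(h[v] * s[v] for v in nodes)
        p[s[i]] += np.exp(-beta * energy)
    z = p[-1] + p[1]
    return np.array([p[-1] / z, p[1] / z])       # [P(−1), P(+1)]

def cavity_formula(i, edges, h, beta):
    # P_i(σ_i) ∝ e^{βh_iσ_i} Π_{j∈∂i} Σ_{σ_j} e^{βσ_iσ_j} ψ_{j→i}(σ_j)
    nbrs = [b if a == i else a for a, b in edges if i in (a, b)]
    out = []
    for si in (-1, 1):
        val = np.exp(beta * h[i] * si)
        for j in nbrs:
            psi = exact_marginal(j, edges, h, beta, exclude=i)  # ψ_{j→i}
            val *= sum(np.exp(beta * si * sj) * psi[(sj + 1) // 2]
                       for sj in (-1, 1))
        out.append(val)
    out = np.array(out)
    return out / out.sum()
```

On a tree the two routes agree exactly; on a loopy graph the factorization step is only an approximation.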


==Computing macro observables==
The Bethe free energy can be computed as
:<math>
\begin{align}
F&=\sum_{i}\Delta F_i-\sum_{\langle ij\rangle}\Delta F_{ij},\\
\Delta F_i&=-\frac{1}{\beta}\log Z_i,\qquad Z_i=\sum_{\sigma_i}\prod_{k\in\partial i}\sum_{\sigma_k}e^{-\beta E_{ik}(\sigma_i,\sigma_k)}\psi_{\sigma_k}^{k\to i},\\
\Delta F_{ij}&=-\frac{1}{\beta}\log Z_{ij},\qquad Z_{ij}=\sum_{\sigma_i,\sigma_j}e^{-\beta E_{ij}(\sigma_i,\sigma_j)}\psi_{\sigma_i}^{i\to j}\psi_{\sigma_j}^{j\to i}.
\end{align}
</math>
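On a tree this decomposition reproduces the exact free energy. A brute-force sketch, obtaining the messages as exact cavity marginals; the graph, <math>\beta</math>, and <math>\textstyle E_{ij}=-\sigma_i\sigma_j</math> (zero field) are illustrative choices.

```python
import itertools

import numpy as np

# Check F = Σ_i ΔF_i − Σ_(ij) ΔF_ij against the exact −(1/β) log Z on a tree,
# with ψ_{j→i}(σ_j) computed by enumeration as the cavity marginal of σ_j.

def log_z(edges, n, beta):
    z = 0.0
    for conf in itertools.product([-1, 1], repeat=n):
        z += np.exp(beta * sum(conf[a] * conf[b] for a, b in edges))
    return np.log(z)

def cavity(j, i, edges, n, beta):
    # ψ_{j→i}(σ_j): marginal of σ_j in the system with spin i removed
    nodes = [v for v in range(n) if v != i]
    active = [(a, b) for a, b in edges if i not in (a, b)]
    p = np.zeros(2)
    for conf in itertools.product([-1, 1], repeat=len(nodes)):
        s = dict(zip(nodes, conf))
        p[(s[j] + 1) // 2] += np.exp(beta * sum(s[a] * s[b] for a, b in active))
    return p / p.sum()

def bethe_free_energy(edges, n, beta):
    F = 0.0
    for i in range(n):                       # node terms ΔF_i
        zi = 0.0
        for si in (-1, 1):
            term = 1.0
            for a, b in edges:
                if i in (a, b):
                    k = b if a == i else a
                    psi = cavity(k, i, edges, n, beta)
                    term *= sum(np.exp(beta * si * sk) * psi[(sk + 1) // 2]
                                for sk in (-1, 1))
            zi += term
        F += -np.log(zi) / beta
    for i, j in edges:                       # edge terms ΔF_ij
        pij = cavity(i, j, edges, n, beta)   # ψ_{i→j}(σ_i)
        pji = cavity(j, i, edges, n, beta)   # ψ_{j→i}(σ_j)
        zij = sum(np.exp(beta * si * sj) * pij[(si + 1) // 2] * pji[(sj + 1) // 2]
                  for si in (-1, 1) for sj in (-1, 1))
        F -= -np.log(zij) / beta
    return F
```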

Then the average internal energy and the entropy can be computed as
:<math>
\begin{align}
E&=\sum_{i}\Delta E_i-\sum_{\langle ij\rangle}\Delta E_{ij}=\sum_{i}\frac{\partial(\beta\Delta F_i)}{\partial \beta}-\sum_{\langle ij\rangle}\frac{\partial (\beta\Delta F_{ij})}{\partial \beta},\\
S&=\beta(E-F).
\end{align}
</math>
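These identities can be checked by exact enumeration on a tiny chain, taking the <math>\beta</math>-derivative by a central finite difference; the system, <math>\beta</math>, and step size are illustrative choices.

```python
import itertools

import numpy as np

# Check of E = ∂(βF)/∂β and S = β(E − F) by exact enumeration on a 3-spin
# Ising chain with E(σ) = −σ0σ1 − σ1σ2.

def free_energy(beta, edges=((0, 1), (1, 2)), n=3):
    z = 0.0
    for conf in itertools.product([-1, 1], repeat=n):
        z += np.exp(beta * sum(conf[a] * conf[b] for a, b in edges))
    return -np.log(z) / beta

beta, eps = 0.7, 1e-5
F = free_energy(beta)
E = ((beta + eps) * free_energy(beta + eps)
     - (beta - eps) * free_energy(beta - eps)) / (2 * eps)   # E = ∂(βF)/∂β
S = beta * (E - F)                                           # S = β(E − F)
```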


==Taking the ensemble average using ''Population Dynamics''==
If we use <math>\textstyle \mathcal F_{\textrm{RS}}(\{ P_{\sigma_k}^{k\to i} \})</math> to denote the Replica Symmetric (Belief Propagation) update written in the previous sections, then the distribution of messages on the directed edges of graphs sampled from a random ensemble satisfies
:<math>
\mathcal P(P^{i\to j}_{\sigma_i})=\int \prod_{k\in\partial i\backslash j}dP_{\sigma_k}^{k\to i}\,\mathcal P(P_{\sigma_k}^{k\to i})\;\delta\left [ P^{i\to j}_{\sigma_i} - \mathcal F_{\textrm{RS}}(\{ P_{\sigma_k}^{k\to i} \})\right ].
</math>

Macroscopic observables such as the energy and the entropy can then be computed as averages over this distribution of messages.
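A minimal Population Dynamics sketch for a ±1 Ising model with <math>\textstyle E_{ij}=-\sigma_i\sigma_j</math> on a random <math>d</math>-regular ensemble; the population size, <math>d</math>, <math>\beta</math>, and number of sweeps are illustrative choices. In the paramagnetic phase (<math>\textstyle (d-1)\tanh\beta<1</math>) the population should concentrate on the uniform message.

```python
import numpy as np

# Population Dynamics: keep a population of cavity messages and repeatedly
# replace a random member by the RS update applied to d−1 randomly drawn
# members, emulating the distributional fixed-point equation.

rng = np.random.default_rng(0)

def f_rs(msgs, beta):
    # RS/BP update: ψ_new(σ) ∝ Π_k Σ_{σ'} e^{βσσ'} ψ_k(σ'),  σ, σ' ∈ {−1, +1}
    s = np.array([-1.0, 1.0])
    W = np.exp(beta * np.outer(s, s))
    new = np.ones(2)
    for m in msgs:
        new *= W @ m
    return new / new.sum()

def population_dynamics(d=3, beta=0.2, pop_size=500, sweeps=50):
    pop = rng.dirichlet(np.ones(2), size=pop_size)   # random initial messages
    for _ in range(sweeps * pop_size):
        parents = pop[rng.integers(0, pop_size, size=d - 1)]
        pop[rng.integers(0, pop_size)] = f_rs(parents, beta)
    return pop
```

At <math>\textstyle d=3</math>, <math>\textstyle\beta=0.2</math> is well below <math>\textstyle\beta_c=\mathrm{atanh}(1/2)\approx 0.55</math>, so all messages flow to the uniform fixed point.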

