NIS+

Revision as of 23:13, 6 August 2024 (Tuesday)


=== Universal approximation theorem for the encoder ===

First, we extend the definition of the basic encoder by introducing a new operation <math>\eta_{p,s}: \mathcal{R}^p\rightarrow \mathcal{R}^s</math>, which represents self-replication of the original variables:

<math>\eta_{p,s}(\boldsymbol{x})=\boldsymbol{x}\bigoplus \boldsymbol{x}_{s-p}</math>

The vector <math>\boldsymbol{x}_{s-p}</math> has <math>s-p</math> dimensions, each of which repeats a particular dimension of <math>\boldsymbol{x}</math>. For example, if <math>\boldsymbol{x}=(0.1,0.2,0.3)</math>, then <math>\eta_{3,5}(\boldsymbol{x})=(0.1,0.2,0.3,0.1,0.2)</math>.
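The expansion operation can be sketched in a few lines of NumPy. This is only an illustration: the function name <code>eta</code> is hypothetical, and the choice to repeat entries cyclically from the first component is an assumption that happens to match the three-dimensional example; the definition itself only requires that every extra entry copies some entry of the input.

```python
import numpy as np

def eta(x, s):
    """Expand a p-dimensional vector to s dimensions by repeating its entries.

    Assumption: the s - p extra entries copy x cyclically, starting from its
    first component. The text only requires that each extra entry repeats
    some entry of x.
    """
    x = np.asarray(x, dtype=float)
    p = x.shape[0]
    if s < p:
        raise ValueError("s must be at least p")
    extra = x[np.arange(s - p) % p]      # x_{s-p}: repeated entries of x
    return np.concatenate([x, extra])    # x (+) x_{s-p}

expanded = eta([0.1, 0.2, 0.3], 5)
print(expanded)
```

Because the extra entries are exact copies, the expansion adds no information and can always be undone by keeping the first <math>p</math> components.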

Universal approximation theorems have been established for general feedforward neural networks<ref name=":2">Shalizi C and Moore C. What is a macrostate? subjective observations and objective dynamics. arXiv: cond-mat/0303625.</ref><ref name=":3">Fisch D, Jänicke M and Sick B et al. Quantitative emergence–a refined approach based on divergence measures. Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems, 2010.</ref> and for invertible neural networks<ref>Mnif M and Müller-Schloer C. Quantitative emergence. In: Müller-Schloer C, Schmeck H and Ungerer T (eds.). Organic Computing—A Paradigm Shift for Complex Systems. Berlin: Springer, 2011, 39-52.</ref><ref>Fisch D, Jänicke M and Müller-Schloer C et al. Divergence measures as a generalised approach to quantitative emergence. In: Müller-Schloer C, Schmeck H and Ungerer T (eds.). Organic Computing—A Paradigm Shift for Complex Systems. Berlin: Springer, 2011, 53-66.</ref>. Using these results as a bridge, one can show that any feedforward neural network can be simulated by a sequence of bijective mappings (ψ), projections (χ), and vector expansions (η). The basic encoder extended with vector expansion can be written as:

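<math>\phi= Proj_q \circ\psi_{s} \circ \eta_{p,s}\circ \psi_{p}</math>

Here the functions <math>\psi_s: \mathcal{R}^s\rightarrow \mathcal{R}^s</math> and <math>\psi_p: \mathcal{R}^p\rightarrow \mathcal{R}^p</math> are two invertible mappings, and <math>Proj_q</math> is the projection that retains <math>q</math> dimensions. In general the retained dimension <math>q</math> may exceed the initial dimension <math>p</math>; when φ is used as a dimension-reduction operator, <math>q<p</math>.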
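The composition above can be sketched as follows. This is a minimal illustration, not the article's implementation: the invertible maps are stood in for by additive coupling layers (one common way to build invertible networks, chosen here as an assumption), the projection is assumed to keep the first <math>q</math> components, and all names (<code>make_coupling</code>, <code>eta</code>, <code>proj</code>, <code>phi</code>) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_coupling(dim):
    """Return (forward, inverse) of an additive coupling layer on R^dim.

    Assumption: this stands in for an invertible network psi; it requires
    dim >= 2. The first k entries pass through unchanged and shift the rest.
    """
    k = dim // 2
    A = rng.normal(size=(dim - k, k))    # fixed random weights of the shift

    def forward(x):
        a, b = x[:k], x[k:]
        return np.concatenate([a, b + np.tanh(A @ a)])

    def inverse(y):
        a, b = y[:k], y[k:]
        return np.concatenate([a, b - np.tanh(A @ a)])

    return forward, inverse

def eta(x, s):
    """Vector expansion: x followed by cyclic copies of its entries (assumed)."""
    return x[np.arange(s) % x.shape[0]]

def proj(x, q):
    """Projection: keep the first q components (assumed convention)."""
    return x[:q]

p, s, q = 3, 5, 2
psi_p, psi_p_inv = make_coupling(p)
psi_s, psi_s_inv = make_coupling(s)

def phi(x):
    """Extended basic encoder: Proj_q . psi_s . eta_{p,s} . psi_p."""
    return proj(psi_s(eta(psi_p(x), s)), q)

x = rng.normal(size=p)
y = phi(x)
print(y.shape)
```

Only the projection discards information; the two coupling layers and the expansion can each be inverted exactly.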

According to the universal approximation theorem<ref name=":2" /><ref name=":3" />, for any continuous function <math>f: K\rightarrow \mathcal{R}^q</math>, where <math>K\subset \mathcal{R}^p</math> is a compact set and <math>p>q\in \mathcal{Z}^+</math>, there exist an integer <math>s</math> and <math>W\in\mathcal{R}^{s\times p}, W'\in\mathcal{R}^{q\times s}, b\in\mathcal{R}^{s}</math> such that:

<math>W'\cdot \sigma(W\cdot+b)\simeq f</math>

where <math>\sigma(\boldsymbol{x})=1/(1+\exp(-\boldsymbol{x}))</math> is the sigmoid function applied elementwise to a vector.
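A small numerical experiment can make the one-hidden-layer form concrete. The sketch below is illustrative only: the target function, the hidden width, the random choice of <math>W</math> and <math>b</math>, and the least-squares fit of <math>W'</math> are all assumptions made for the example, and the error is measured only on the sampled points.

```python
import numpy as np

rng = np.random.default_rng(0)
p, s, q = 2, 200, 1                      # input dim, hidden width, output dim

def sigma(z):
    """Elementwise sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

def f(X):
    """Hypothetical continuous target on K = [-1, 1]^2, output shape (n, q)."""
    return (np.sin(3.0 * X[:, 0]) * X[:, 1])[:, None]

X = rng.uniform(-1.0, 1.0, size=(2000, p))   # samples from the compact set K
W = rng.normal(scale=3.0, size=(s, p))       # random hidden weights (assumed)
b = rng.normal(scale=2.0, size=s)            # random hidden biases (assumed)

H = sigma(X @ W.T + b)                       # hidden activations, shape (n, s)
coef, *_ = np.linalg.lstsq(H, f(X), rcond=None)
W_prime = coef.T                             # output weights W', shape (q, s)

approx = H @ W_prime.T                       # W' . sigma(W x + b) for each sample
max_err = np.max(np.abs(approx - f(X)))
print(max_err)
```

Increasing the hidden width <math>s</math> generally lowers the attainable error, which is the practical content of the theorem: the width needed depends on <math>f</math> and on the accuracy required.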

By Lemma 4, and because the translation <math>+b</math> and <math>\sigma(\cdot)</math> are both invertible operators, there exist invertible neural networks <math>\psi_{q},\psi_{s}',\psi_{s},\psi_{p}</math> and two integers <math>s_1,s_2</math> (the ranks of the matrices <math>W</math> and <math>W'</math>, respectively) such that:

<math>(\psi_{q}\circ\eta_{s_2,q}\circ\chi_{s,s_2}\circ\psi_{s}')\circ(\psi_{s}\circ\eta_{s_1,s}\circ\chi_{p,s_1}\circ\psi_{p})\simeq W'\cdot\sigma(W\cdot+b)</math>

where <math>\psi_{s}\circ\eta_{s_1,s}\circ\chi_{p,s_1}\circ\psi_{p}</math> approximates (simulates) the function <math>\sigma(W\cdot+b)</math>, and <math>\psi_{q}\circ\eta_{s_2,q}\circ\chi_{s,s_2}\circ\psi_{s}'</math> approximates (simulates) the function <math>W'\cdot</math>.
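The invertibility of the bias shift and of the sigmoid, which this step relies on, is easy to check numerically. In the sketch below the sample values and the helper name <code>sigma_inv</code> are illustrative assumptions; the inverse of the sigmoid is the logit function, defined on the open interval (0, 1).

```python
import numpy as np

def sigma(x):
    """Elementwise sigmoid, mapping R onto (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigma_inv(y):
    """Logit, the inverse of the sigmoid on (0, 1)."""
    return np.log(y / (1.0 - y))

b = np.array([0.5, -1.0, 2.0])           # hypothetical bias vector
x = np.array([-2.0, 0.0, 3.0])           # hypothetical pre-activation input

y = sigma(x + b)                         # forward: shift, then squash
x_rec = sigma_inv(y) - b                 # backward: un-squash, then un-shift
print(np.allclose(x_rec, x))
```

The linear maps <math>W</math> and <math>W'</math> are the only parts that are not invertible in general, which is why they need the projection and expansion steps.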

Therefore, if we let <math>\phi_{p,s,q}=(\psi_{q}\circ\eta_{s_2,q}\circ\chi_{s,s_2}\circ\psi_{s}')\circ(\psi_{s}\circ\eta_{s_1,s}\circ\chi_{p,s_1}\circ\psi_{p})</math>, then <math>\phi_{p,s,q}\simeq f</math>.

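In practice, although the basic encoder and its extended version do not include the expansion operator, we always expand the input vector before feeding it to the encoder. It is therefore reasonable to believe that the theorem still holds for stacked encoders.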

In summary, the universal approximation theorem for the encoder states:

For any continuous function <math>f: K\rightarrow \mathcal{R}^q</math>, where <math>K\subset \mathcal{R}^p</math> is a compact set and <math>p>q\in \mathcal{Z}^+</math>, there exist an integer <math>s</math> and an extended stacked encoder <math>\phi_{p,s,q}: \mathcal{R}^p\rightarrow \mathcal{R}^q</math> (with <math>s</math> hidden units) that uses the expansion operation <math>\eta_{p,s}</math>, such that:

<math>\phi_{p,s,q}\simeq f</math>

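Hence the extended stacked encoder has the universal approximation property: it can approximate (simulate) any continuous coarse-graining function from a compact subset of <math>\mathcal{R}^p</math> to <math>\mathcal{R}^q</math>.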

'''Lemma 4''':

For any vector <math>X\in \mathcal{R}^p</math> and matrix <math>W\in \mathcal{R}^{s\times p}</math>, where <math>s,p\in \mathcal{N}</math>, there exist an integer <math>s_1\leq \min(s,p)</math> and two basic encoder units, <math>\psi_{s}\circ\eta_{s_1,s}</math> and <math>\chi_{p,s_1}\circ \psi_{p}</math>, such that:

<math>W\cdot X\simeq(\psi_{s}\circ\eta_{s_1,s})\circ(\chi_{p,s_1}\circ \psi_{p})(X)</math>

where <math>\simeq</math> denotes approximation or simulation.
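One way to see why such a decomposition exists is to build it explicitly from the singular value decomposition of <math>W</math>. The sketch below is an illustration under stated assumptions, not the proof used in the source: it assumes the invertible networks can represent invertible linear maps (given here as the matrices <code>P</code> and <code>M</code>, both hypothetical names), that the projection keeps the first <math>s_1</math> components, and that the expansion repeats entries cyclically. Under these assumptions the factorisation is exact rather than approximate.

```python
import numpy as np

rng = np.random.default_rng(0)
s, p = 5, 3
W = rng.normal(size=(s, 2)) @ rng.normal(size=(2, p))   # example matrix of rank 2
s1 = int(np.linalg.matrix_rank(W))                      # s1 <= min(s, p)

# Full SVD: W = U[:, :s1] @ diag(S[:s1]) @ Vt[:s1]
U, S, Vt = np.linalg.svd(W, full_matrices=True)

# psi_p: invertible linear map on R^p whose first s1 outputs are Sigma_r V_r^T x.
P = np.vstack([S[:s1, None] * Vt[:s1], Vt[s1:]])

# eta_{s1,s} written as an s x s1 matrix that repeats entries cyclically.
E = np.eye(s1)[np.arange(s) % s1]

# psi_s: invertible linear map on R^s sending the columns of E to U[:, :s1].
U_E, _, _ = np.linalg.svd(E, full_matrices=True)
B_E = np.hstack([E, U_E[:, s1:]])        # E completed to a basis of R^s
M = U @ np.linalg.inv(B_E)

def encoder(x):
    z = (P @ x)[:s1]                     # chi_{p,s1} . psi_p
    return M @ (E @ z)                   # psi_s . eta_{s1,s}

x = rng.normal(size=p)
print(np.allclose(encoder(x), W @ x))
```

The integer <math>s_1</math> is the rank of <math>W</math>: the projection removes exactly the directions that <math>W</math> maps to zero, and the expansion restores the output dimension without adding information.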

== Machine learning algorithms ==
