Changes

Artificial neural network (view source)

Revision as of 23:33, 27 August 2018 (Mon)

518 bytes added, 23:33, 27 August 2018 (Mon)

Line 83: Line 83:

==Model==

−

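An "artificial neural network" is a network of simple elements called [https://en.wikipedia.org/wiki/Artificial_neurons artificial neurons], which receive input, change their internal state ("activation") according to that input, and produce output depending on the input and the activation. Connecting the outputs of certain neurons to the inputs of other neurons forms a "network" that is a [https://en.wikipedia.org/wiki/Directed_graph directed], [https://en.wikipedia.org/wiki/Weighted_graph weighted graph]~~. The weights and~~ [https://en.wikipedia.org/wiki/Activation_function ~~the functions that compute the activation~~] can be modified by a process called "learning", which is governed by a [https://en.wikipedia.org/wiki/Learning_rule learning rule].<ref name=Zell1994ch5.2>{{cite book |last=Zell |first=Andreas |year=1994 |title=Simulation Neuronaler Netze |trans-title=Simulation of Neural Networks |language=German |edition=1st |publisher=Addison-Wesley |chapter=chapter 5.2}}</ref>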

+

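An "artificial neural network" is a network of simple elements called [https://en.wikipedia.org/wiki/Artificial_neurons artificial neurons], which receive input, change their internal state ("activation") according to that input, and produce output depending on the input and the activation. Connecting the outputs of certain neurons to the inputs of other neurons forms a "network" that is a [https://en.wikipedia.org/wiki/Directed_graph directed], [https://en.wikipedia.org/wiki/Weighted_graph weighted graph]. The weights and the functions that compute the [https://en.wikipedia.org/wiki/Activation_function activation] can be modified by a process called "learning", which is governed by a [https://en.wikipedia.org/wiki/Learning_rule learning rule].<ref name=Zell1994ch5.2>{{cite book |last=Zell |first=Andreas |year=1994 |title=Simulation Neuronaler Netze |trans-title=Simulation of Neural Networks |language=German |edition=1st |publisher=Addison-Wesley |chapter=chapter 5.2}}</ref>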
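The graph view above can be illustrated with a minimal sketch. The neuron names, the weight values, and the weighted-sum rule for combining predecessor outputs are all assumptions made for illustration; the text itself only states that the network is a directed, weighted graph.

```python
# Hypothetical three-neuron network stored as a directed, weighted graph.
# Keys are (source, target) pairs; values are connection weights.
weights = {
    ("in1", "out"): 0.5,
    ("in2", "out"): -1.0,
}


def predecessors(graph, target):
    """Return the neurons that have an edge pointing at `target`."""
    return sorted(src for (src, dst) in graph if dst == target)


def net_input(graph, outputs, target):
    """Weighted sum of predecessor outputs (an assumed propagation rule)."""
    return sum(graph[(src, target)] * outputs[src]
               for src in predecessors(graph, target))


# Assumed outputs of the two input neurons.
outputs = {"in1": 2.0, "in2": 0.5}
p_out = net_input(weights, outputs, "out")  # 0.5*2.0 + (-1.0)*0.5
```

Storing the edges in a dictionary keyed by (source, target) keeps both the direction and the weight of each connection explicit.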

===Components of an artificial neural network===

====Neurons====

−

A neuron with label <math>{j}</math> ~~receives an input~~ <math>{p_j}(t)</math> ~~from predecessor neurons and consists of the following components:~~<ref name=Zell1994ch5.2 />

+

A neuron with label <math>{j}</math> receives an input <math>{p_j}(t)</math> from the neurons of the previous layer and consists of the following components:<ref name=Zell1994ch5.2 />

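* an ''activation'' <math>{{a_j}(t)}</math>, which depends on a discrete time parameter,

Line 94: Line 94:

* an ''activation function'' <math>f</math> that computes the new activation at a given time <math>{t+1}</math> from <math>{{a_j}(t)}</math>, the threshold <math>\theta_j</math> and the net input <math>{p_j}(t)</math>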

: <math> {a_j}(t+1) = f({a_j}(t), {p_j}(t), \theta_j) </math>,

−

* and an ''output function'' <math>f_{out}</math> ~~computing the output from the activation~~

+

* and an ''output function'' <math>f_{out}</math> that computes the output from the activation given by the activation function

: <math> {o_j}(t) = f_{out}({a_j}(t)) </math>.

Often the output function is simply the [https://en.wikipedia.org/wiki/Identity_function identity function].
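The two relations above can be sketched in a few lines. The logistic form of <math>f</math> is an assumption chosen for illustration (the text does not fix a particular activation function), and the function names are hypothetical; the identity output function follows the remark above.

```python
import math


def activation_function(a_prev, p, theta):
    """Assumed f: logistic function of the net input minus the threshold.

    a_prev is accepted to match f(a_j(t), p_j(t), theta_j) but is not
    used by this particular choice of f.
    """
    return 1.0 / (1.0 + math.exp(-(p - theta)))


def output_function(a):
    """f_out chosen as the identity function."""
    return a


def step(a_prev, p, theta):
    """Compute the activation at time t+1 and the output derived from it."""
    a_next = activation_function(a_prev, p, theta)
    return a_next, output_function(a_next)


# When the net input equals the threshold, the logistic activation is 0.5.
a1, o1 = step(a_prev=0.0, p=1.0, theta=1.0)
```

With the identity as output function, the output of the neuron is simply its activation.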

−

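~~An "input neuron" has no predecessor but serves as the input interface for the whole network. Similarly, an "output neuron" has no successor and thus serves as the output interface of the whole network.~~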

+

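An "input neuron" has no preceding layer but serves as the input interface for the whole network. Similarly, an "output neuron" has no following layer and thus serves as the output interface of the whole network.

====Connections and weights====

Line 129: Line 129:

===Learning===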

−

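~~The possibility of learning has attracted the most interest in neural networks. Given a specific "task" to solve and a class of functions~~ <math>\textstyle F</math>, learning means using a set of observations to find <math>{f^{*}} \in F</math>, which solves the task in some optimal sense.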

+

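The ability of neural networks to learn has attracted the most interest. Given a specific "task" to solve and a class of functions <math>\textstyle F</math>, learning means using a set of observations to find <math>{f^{*}} \in F</math>, which solves the task in some optimal sense.

This entails defining a loss function <math>{C} : {F} \rightarrow {\mathbb{R}}</math> such that, for the optimal solution <math>{f^{*}}</math>, <math>{C}(f^{*}) \leq C(f)</math> <math>\forall {f} \in {F}</math>; that is, no solution has a smaller loss than the optimal solution.

The loss function <math>{C}</math> is an important concept in learning, as it measures how far a particular solution is from an optimal solution to the problem being solved. Learning algorithms search the solution space for a function with the smallest possible loss.

Line 138: Line 138:

====Choosing a loss function====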
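The search for a minimal-loss function can be illustrated on a toy problem. Everything concrete here is an assumption for illustration: the observations, the finite function class (lines through the origin), and the choice of mean squared error as the loss.

```python
# Observations assumed for illustration: pairs (x, y) generated by y = 2x.
observations = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# A tiny, hypothetical function class F: lines through the origin,
# indexed by their slope.
candidate_slopes = [0.0, 1.0, 2.0, 3.0]


def loss(slope):
    """C(f): mean squared error of f(x) = slope * x on the observations."""
    return sum((slope * x - y) ** 2 for x, y in observations) / len(observations)


# Exhaustive search of the (finite) solution space for the minimal loss.
best_slope = min(candidate_slopes, key=loss)
```

Practical learning algorithms cannot enumerate the solution space like this; the exhaustive search only makes the definition of the optimal solution concrete.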

−

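While it is possible to define a [https://en.wikipedia.org/wiki/Ad_hoc ~~special~~] loss function, a particular loss function is usually used, either because it has desirable properties (such as [https://en.wikipedia.org/wiki/Convex_function convexity]) or because it arises naturally from a particular formulation of the problem (for example, in a probabilistic formulation the [https://en.wikipedia.org/wiki/Posterior_probability posterior probability] of the model can be used as an inverse loss). Ultimately, the loss function depends on the task.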

+

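While it is possible to define an [https://en.wikipedia.org/wiki/Ad_hoc ad hoc] loss function, a particular loss function is usually used, either because it has desirable properties (such as [https://en.wikipedia.org/wiki/Convex_function convexity]) or because it arises naturally from a particular formulation of the problem (for example, in a probabilistic formulation the [https://en.wikipedia.org/wiki/Posterior_probability posterior probability] of the model can be used as an inverse loss). Ultimately, the loss function depends on the task.

====Backpropagation====

Line 155: Line 155:

===Learning paradigms===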
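
As a sketch of the probabilistic case, assume <math>D</math> denotes the observed data (a symbol not used in the text above). One common way to turn the posterior into a loss is to take its negative logarithm:

: <math> C(f) = -\log P(f \mid D) </math>

Because the negative logarithm is decreasing, <math>C(f^{*}) \leq C(f)</math> holds exactly when <math>P(f^{*} \mid D) \geq P(f \mid D)</math>, so the minimal-loss function is the most probable model given the data.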

−

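~~The three major learning paradigms each correspond to a particular learning task. They are: 【supervised learning】, 【unsupervised learning】 and 【reinforcement learning】~~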

+

The three major learning paradigms each correspond to a particular learning task. They are: [http://wiki.swarma.net/index.php/%E4%BA%BA%E5%B7%A5%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C#.E7.9B.91.E7.9D.A3.E5.AD.A6.E4.B9.A0.EF.BC.88Supervised_learning.EF.BC.89 supervised learning], [http://wiki.swarma.net/index.php/%E4%BA%BA%E5%B7%A5%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C#.E6.97.A0.E7.9B.91.E7.9D.A3.E5.AD.A6.E4.B9.A0.EF.BC.88Unsupervised_learning.EF.BC.89 unsupervised learning] and [http://wiki.swarma.net/index.php/%E4%BA%BA%E5%B7%A5%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C#.E5.BC.BA.E5.8C.96.E5.AD.A6.E4.B9.A0.EF.BC.88Reinforcement_learning.EF.BC.89 reinforcement learning]

Anonymous user

Leo