McCulloch & Pitts Neural Network


An artificial neuron is a mathematical function conceived as a model of biological neurons in a neural network. Artificial neurons are elementary units in an artificial neural network.[1] The artificial neuron receives one or more inputs (representing excitatory postsynaptic potentials and inhibitory postsynaptic potentials at neural dendrites) and sums them to produce an output (or activation, representing a neuron's action potential which is transmitted along its axon). Usually each input is separately weighted, and the sum is passed through a non-linear function known as an activation function or transfer function. The transfer functions usually have a sigmoid shape, but they may also take the form of other non-linear functions, piecewise linear functions, or step functions. They are also often monotonically increasing, continuous, differentiable and bounded. Non-monotonic, unbounded and oscillating activation functions with multiple zeros, which outperform sigmoidal and ReLU-like activation functions on many tasks, have also been explored recently.[2][3] Oscillating activation functions can improve gradient flow in neural networks and allow functions to be learnt with fewer neurons; for example, the bipolar-encoded XOR function can be learnt with a single GCU neuron.[2] The thresholding function has inspired the construction of logic gates referred to as threshold logic, which are applicable to building logic circuits resembling brain processing. For example, new devices such as memristors have been used extensively in recent times to develop such logic.[4]



The artificial neuron transfer function should not be confused with a linear system's transfer function.


Basic structure

For a given artificial neuron k, let there be m + 1 inputs with signals x0 through xm and weights wk0 through wkm. Usually, the x0 input is assigned the value +1, which makes it a bias input with wk0 = bk. This leaves only m actual inputs to the neuron: from x1 to xm.


The output of the kth neuron is:


[math]\displaystyle{ y_k = \varphi \left( \sum_{j=0}^m w_{kj} x_j \right) }[/math]

Where [math]\displaystyle{ \varphi }[/math] (phi) is the transfer function (commonly a threshold function).

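To make the weighted sum and activation concrete, here is a minimal Python sketch of a single artificial neuron following the formula above. The bias input x0 = +1, the threshold activation, and the particular weight values are illustrative assumptions, not taken from the source.

def neuron_output(weights, inputs, activation):
    # Compute y_k = phi(sum_j w_kj * x_j), with x_0 = +1 serving as the bias input.
    x = [1.0] + list(inputs)                          # prepend the bias input x_0 = +1
    u = sum(w * xi for w, xi in zip(weights, x))      # weighted sum over j = 0..m
    return activation(u)

step = lambda u: 1.0 if u >= 0.0 else 0.0             # Heaviside-style transfer function (assumption)
weights = [-1.5, 1.0, 1.0]                            # w_k0 = b_k = -1.5, w_k1 = w_k2 = 1.0

print(neuron_output(weights, [1.0, 1.0], step))       # 1.0: the weighted sum 0.5 clears the threshold
print(neuron_output(weights, [1.0, 0.0], step))       # 0.0: the weighted sum -0.5 does not

With a hard threshold the neuron acts as a linear classifier on its inputs; replacing the step with a smooth activation gives the differentiable neurons discussed later.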

File: Artificial neuron.png

The output is analogous to the axon of a biological neuron, and its value propagates to the input of the next layer, through a synapse. It may also exit the system, possibly as part of an output vector.


It has no learning process as such. Its transfer function weights are calculated and its threshold value is predetermined.


Types

Depending on the specific model used, they may be called a semi-linear unit, Nv neuron, binary neuron, linear threshold function, or McCulloch–Pitts (MCP) neuron.


Simple artificial neurons, such as the McCulloch–Pitts model, are sometimes described as "caricature models", since they are intended to reflect one or more neurophysiological observations, but without regard to realism.[5]



Biological models

File: Neuron3.svg
Neuron and myelinated axon, with signal flow from inputs at dendrites to outputs at axon terminals

Artificial neurons are designed to mimic aspects of their biological counterparts. However, a significant performance gap exists between biological and artificial neural networks. In particular, single biological neurons in the human brain with oscillating activation functions capable of learning the XOR function have been discovered.[6] However, single artificial neurons with popular sigmoidal and ReLU-like activation functions cannot learn the XOR function.[2]

  • Dendrites – In a biological neuron, the dendrites act as the input vector. These dendrites allow the cell to receive signals from a large (>1000) number of neighboring neurons. As in the above mathematical treatment, each dendrite is able to perform "multiplication" by that dendrite's "weight value." The multiplication is accomplished by increasing or decreasing the ratio of synaptic neurotransmitters to signal chemicals introduced into the dendrite in response to the synaptic neurotransmitter. A negative multiplication effect can be achieved by transmitting signal inhibitors (i.e. oppositely charged ions) along the dendrite in response to the reception of synaptic neurotransmitters.
  • Soma – In a biological neuron, the soma acts as the summation function, seen in the above mathematical description. As positive and negative signals (exciting and inhibiting, respectively) arrive in the soma from the dendrites, the positive and negative ions are effectively added in summation, by simple virtue of being mixed together in the solution inside the cell's body.
  • Axon – The axon gets its signal from the summation behavior which occurs inside the soma. The opening to the axon essentially samples the electrical potential of the solution inside the soma. Once the soma reaches a certain potential, the axon will transmit an all-in signal pulse down its length. In this regard, the axon provides the connection from this artificial neuron to other artificial neurons.


Unlike most artificial neurons, however, biological neurons fire in discrete pulses. Each time the electrical potential inside the soma reaches a certain threshold, a pulse is transmitted down the axon. This pulsing can be translated into continuous values. The rate (activations per second, etc.) at which an axon fires converts directly into the rate at which neighboring cells get signal ions introduced into them. The faster a biological neuron fires, the faster nearby neurons accumulate electrical potential (or lose electrical potential, depending on the "weighting" of the dendrite that connects to the neuron that fired). It is this conversion that allows computer scientists and mathematicians to simulate biological neural networks using artificial neurons which can output distinct values (often from −1 to 1).


Encoding

Research has shown that unary coding is used in the neural circuits responsible for birdsong production.[7][8] The use of unary in biological networks is presumably due to the inherent simplicity of the coding. Another contributing factor could be that unary coding provides a certain degree of error correction.[9]

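The error-tolerance point can be made concrete with a toy example. The Python sketch below encodes a small count in unary (a run of ones padded with zeros, an illustrative convention not taken from the cited studies) and decodes by counting ones, so a single flipped bit changes the decoded value by at most one.

def unary_encode(n, width):
    # Encode a non-negative integer as n ones followed by zero padding (illustrative convention).
    return [1] * n + [0] * (width - n)

def unary_decode(bits):
    # Decode by counting ones; one corrupted bit perturbs the decoded value by at most 1.
    return sum(bits)

print(unary_encode(3, 8))                      # [1, 1, 1, 0, 0, 0, 0, 0]
print(unary_decode([1, 1, 0, 0, 0, 0, 0, 0]))  # 2: a single corrupted bit shifts the value by only one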

History

The first artificial neuron was the Threshold Logic Unit (TLU), or Linear Threshold Unit,[10] first proposed by Warren McCulloch and Walter Pitts in 1943. The model was specifically targeted as a computational model of the "nerve net" in the brain.[11] As a transfer function, it employed a threshold, equivalent to using the Heaviside step function. Initially, only a simple model was considered, with binary inputs and outputs, some restrictions on the possible weights, and a more flexible threshold value. From the beginning it was already noticed that any boolean function could be implemented by networks of such devices, which is easily seen from the fact that one can implement the AND and OR functions and use them in the disjunctive or the conjunctive normal form. Researchers also soon realized that cyclic networks, with feedback through neurons, could define dynamical systems with memory, but most of the research concentrated (and still does) on strictly feed-forward networks because of the smaller difficulty they present.

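As a minimal sketch of the point about Boolean functions, the Python snippet below realizes AND and OR with single threshold units (a Heaviside step applied to a weighted sum of binary inputs); the specific weights and thresholds are illustrative choices, not values from McCulloch and Pitts' paper.

def tlu(weights, threshold, inputs):
    # McCulloch-Pitts style unit: fire (1) iff the weighted sum of binary inputs reaches the threshold.
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

AND = lambda a, b: tlu([1, 1], 2, [a, b])   # both inputs must be active to reach the threshold
OR = lambda a, b: tlu([1, 1], 1, [a, b])    # a single active input is enough

print([AND(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
print([OR(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 1]

Combining such gates in disjunctive or conjunctive normal form then yields any Boolean function, as noted above.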

One important and pioneering artificial neural network that used the linear threshold function was the perceptron, developed by Frank Rosenblatt. This model already considered more flexible weight values in the neurons, and was used in machines with adaptive capabilities. The representation of the threshold values as a bias term was introduced by Bernard Widrow in 1960 – see ADALINE.


In the late 1980s, when research on neural networks regained strength, neurons with more continuous shapes started to be considered. The possibility of differentiating the activation function allows the direct use of the gradient descent and other optimization algorithms for the adjustment of the weights. Neural networks also started to be used as a general function approximation model. The best known training algorithm called backpropagation has been rediscovered several times but its first development goes back to the work of Paul Werbos.[12][13]


Types of transfer functions

The transfer function (activation function) of a neuron is chosen to have a number of properties which either enhance or simplify the network containing the neuron. Crucially, for instance, any multilayer perceptron using a linear transfer function has an equivalent single-layer network; a non-linear function is therefore necessary to gain the advantages of a multi-layer network.[citation needed]


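The claim that a linear transfer function gains nothing from extra layers can be checked numerically. The NumPy sketch below (with randomly generated weights as an illustrative assumption) composes two linear layers and shows that the result equals a single layer whose weight matrix is the product of the two.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # input vector
W1 = rng.normal(size=(4, 3))      # first linear layer
W2 = rng.normal(size=(2, 4))      # second linear layer

two_layer = W2 @ (W1 @ x)         # two layers with the identity (linear) transfer function
one_layer = (W2 @ W1) @ x         # the equivalent single-layer network

print(np.allclose(two_layer, one_layer))  # True: no expressive power is gained without a non-linearity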

Below, u refers in all cases to the weighted sum of all the inputs to the neuron, i.e. for n inputs,


[math]\displaystyle{ u = \sum_{i = 1}^n w_i x_i }[/math]


where w is a vector of synaptic weights and x is a vector of inputs.


Step function

The output y of this transfer function is binary, depending on whether the input meets a specified threshold, θ. The "signal" is sent, i.e. the output is set to one, if the activation meets the threshold.


[math]\displaystyle{ y = \begin{cases} 1 & \text{if }u \ge \theta \\ 0 & \text{if }u \lt \theta \end{cases} }[/math]

This function is used in perceptrons, and it often shows up in many other models. It performs a division of the space of inputs by a hyperplane. It is especially useful in the last layer of a network intended to perform binary classification of the inputs. It can be approximated from other sigmoidal functions by assigning large values to the weights.

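The last remark can be illustrated in a few lines: a logistic sigmoid whose argument is scaled by a large gain closely tracks the step. The threshold and gain below are arbitrary values chosen for the demonstration.

import math

theta = 0.5                                       # threshold (illustrative value)

def step(u):
    return 1.0 if u >= theta else 0.0

def steep_sigmoid(u, k=50.0):
    # A logistic function with large gain k approaches the step function as k grows.
    return 1.0 / (1.0 + math.exp(-k * (u - theta)))

for u in (0.0, 0.4, 0.6, 1.0):
    print(u, step(u), round(steep_sigmoid(u), 4))
# 0.0: 0.0 vs ~0.0, 0.4: 0.0 vs ~0.0067, 0.6: 1.0 vs ~0.9933, 1.0: 1.0 vs ~1.0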

Linear combination

In this case, the output unit is simply the weighted sum of its inputs plus a bias term. A number of such linear neurons perform a linear transformation of the input vector. This is usually more useful in the first layers of a network. A number of analysis tools exist based on linear models, such as harmonic analysis, and they can all be used in neural networks with this linear neuron. The bias term allows us to make affine transformations to the data.

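Concretely, a layer of such linear neurons computes an affine map y = Wx + b. The NumPy sketch below uses arbitrary illustrative values to show the bias term shifting the purely linear part.

import numpy as np

W = np.array([[1.0, 0.0], [0.5, -0.5]])   # synaptic weights of two linear neurons
b = np.array([0.1, -0.2])                 # bias terms
x = np.array([2.0, 4.0])                  # input vector

print(W @ x)        # [ 2. -1.]   linear transformation alone
print(W @ x + b)    # [ 2.1 -1.2] affine transformation once the bias is included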

See: Linear transformation, Harmonic analysis, Linear filter, Wavelet, Principal component analysis, Independent component analysis, Deconvolution.


Sigmoid

A sigmoid function, such as the logistic function, is a fairly simple non-linear function that also has an easily calculated derivative, which can be important when calculating the weight updates in the network. It thus makes the network more easily manipulable mathematically, and was attractive to early computer scientists who needed to minimize the computational load of their simulations. It was previously commonly seen in multilayer perceptrons. However, recent work has shown sigmoid neurons to be less effective than rectified linear neurons. The reason is that the gradients computed by the backpropagation algorithm tend to diminish towards zero as activations propagate through layers of sigmoidal neurons, making it difficult to optimize neural networks using multiple layers of sigmoidal neurons.


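Both the easily calculated derivative and the vanishing-gradient problem can be seen in a short sketch. The logistic function's derivative reuses the forward value, sigma'(u) = sigma(u)(1 - sigma(u)), and it shrinks toward zero for large |u|; the sample points below are arbitrary.

import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def sigmoid_grad(u):
    # The derivative reuses the forward value: sigma'(u) = sigma(u) * (1 - sigma(u)).
    s = sigmoid(u)
    return s * (1.0 - s)

for u in (0.0, 2.0, 5.0, 10.0):
    print(u, round(sigmoid_grad(u), 6))
# 0.0: 0.25, 2.0: 0.104994, 5.0: 0.006648, 10.0: 4.5e-05 (the gradient shrinks toward zero)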

Rectifier

In the context of artificial neural networks, the rectifier is an activation function defined as the positive part of its argument:



[math]\displaystyle{ f(x) = x^+ = \max(0, x), }[/math]

where x is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering. This activation function was first introduced to a dynamical network by Hahnloser et al. in a 2000 paper in Nature[14] with strong biological motivations and mathematical justifications.[15] It has been demonstrated for the first time in 2011 to enable better training of deeper networks,[16] compared to the widely used activation functions prior to 2011, i.e., the logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical[17] counterpart, the hyperbolic tangent.

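A one-line rectifier and its effect on a few sample inputs, as a minimal sketch (the sample values are arbitrary):

def relu(x):
    # Ramp function: pass positive inputs through unchanged, clamp negative inputs to zero.
    return max(0.0, x)

print([relu(x) for x in (-2.0, -0.5, 0.0, 0.5, 2.0)])  # [0.0, 0.0, 0.0, 0.5, 2.0]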

Pseudocode algorithm


The following is a simple pseudocode implementation of a single TLU which takes boolean inputs (true or false), and returns a single boolean output when activated. An object-oriented model is used. No method of training is defined, since several exist. If a purely functional model were used, the class TLU below would be replaced with a function TLU with input parameters threshold, weights, and inputs that returned a boolean value.


class TLU defined as:
    data member threshold : number
    data member weights : list of numbers of size X

    function member fire(inputs : list of booleans of size X) : boolean defined as:
        variable T : number
        T ← 0
        for each i in 1 to X do
            if inputs(i) is true then
                T ← T + weights(i)
            end if
        end for each
        if T > threshold then
            return true
        else:
            return false
        end if
    end function
end class

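For readers who prefer running code, here is a minimal Python counterpart of the pseudocode above. It mirrors the same fields and fire method and, as in the pseudocode, defines no training procedure; the example weights and threshold are illustrative.

class TLU:
    # Threshold Logic Unit with fixed, predetermined weights and threshold.

    def __init__(self, threshold, weights):
        self.threshold = threshold        # data member threshold : number
        self.weights = list(weights)      # data member weights : list of numbers

    def fire(self, inputs):
        # Return True iff the summed weights of the active (True) inputs exceed the threshold.
        total = sum(w for w, active in zip(self.weights, inputs) if active)
        return total > self.threshold

# Example usage (illustrative values): weights 1, 1 with threshold 1.5 realize boolean AND.
and_gate = TLU(1.5, [1.0, 1.0])
print(and_gate.fire([True, True]))    # True
print(and_gate.fire([True, False]))   # False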

See also

  • Binding neuron
  • Connectionism


References

  1. "Neuromorphic Circuits With Neural Modulation Enhancing the Information Content of Neural Signaling | International Conference on Neuromorphic Systems 2020" (in English). doi:10.1145/3407197.3407204. Unknown parameter |s2cid= ignored (help); Cite journal requires |journal= (help)
  2. 2.0 2.1 2.2 Noel, Mathew Mithra; L, Arunkumar; Trivedi, Advait; Dutta, Praneet (2021-09-04). "Growing Cosine Unit: A Novel Oscillatory Activation Function That Can Speedup Training and Reduce Parameters in Convolutional Neural Networks". arXiv:2108.12943 [cs.LG].
  3. Noel, Matthew Mithra; Bharadwaj, Shubham; Muthiah-Nakarajan, Venkataraman; Dutta, Praneet; Amali, Geraldine Bessie (2021-11-07). "Biologically Inspired Oscillating Activation Functions Can Bridge the Performance Gap between Biological and Artificial Neurons". arXiv:2111.04020 [cs.NE].
  4. Maan, A. K.; Jayadevi, D. A.; James, A. P. (1 January 2016). "A Survey of Memristive Threshold Logic Circuits". IEEE Transactions on Neural Networks and Learning Systems. PP (99): 1734–1746. arXiv:1604.07121. Bibcode:2016arXiv160407121M. doi:10.1109/TNNLS.2016.2547842. ISSN 2162-237X. PMID 27164608. Unknown parameter |s2cid= ignored (help)
  5. F. C. Hoppensteadt and E. M. Izhikevich (1997). Weakly connected neural networks. Springer. p. 4. ISBN 978-0-387-94948-2. 
  6. Gidon, Albert; Zolnik, Timothy Adam; Fidzinski, Pawel; Bolduan, Felix; Papoutsi, Athanasia; Poirazi, Panayiota; Holtkamp, Martin; Vida, Imre; Larkum, Matthew Evan (2020-01-03). "Dendritic action potentials and computation in human layer 2/3 cortical neurons". Science. 367 (6473): 83–87. Bibcode:2020Sci...367...83G. doi:10.1126/science.aax6239. PMID 31896716. Unknown parameter |s2cid= ignored (help)
  7. Squire, L.; Albright, T.; Bloom, F. et al., eds. (October 2007). Neural network models of birdsong production, learning, and coding. New Encyclopedia of Neuroscience: Elservier. https://clm.utexas.edu/fietelab/Papers/birdsong_review_topost.pdf. 
  8. Moore, J.M.; et al. (2011). "Motor pathway convergence predicts syllable repertoire size in oscine birds". Proc. Natl. Acad. Sci. USA. 108 (39): 16440–16445. Bibcode:2011PNAS..10816440M. doi:10.1073/pnas.1102077108. PMC 3182746. PMID 21918109.
  9. Potluri, Pushpa Sree (26 November 2014). "Error Correction Capacity of Unary Coding". arXiv:1411.7406 [cs.IT].
  10. Martin Anthony (January 2001). Discrete Mathematics of Neural Networks: Selected Topics. SIAM. pp. 3–. ISBN 978-0-89871-480-7. https://books.google.com/books?id=qOy4yLBqhFcC&pg=PA3. 
  11. Charu C. Aggarwal (25 July 2014). Data Classification: Algorithms and Applications. CRC Press. pp. 209–. ISBN 978-1-4665-8674-1. https://books.google.com/books?id=gJhBBAAAQBAJ&pg=PA209. 
  12. Paul Werbos, Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University, 1974
  13. Werbos, P.J. (1990). "Backpropagation through time: what it does and how to do it". Proceedings of the IEEE. 78 (10): 1550–1560. doi:10.1109/5.58337. ISSN 0018-9219.
  14. Hahnloser, Richard H. R.; Sarpeshkar, Rahul; Mahowald, Misha A.; Douglas, Rodney J.; Seung, H. Sebastian (2000). "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit". Nature. 405 (6789): 947–951. Bibcode:2000Natur.405..947H. doi:10.1038/35016072. ISSN 0028-0836. PMID 10879535. Unknown parameter |s2cid= ignored (help)
  15. R Hahnloser, H.S. Seung (2001). Permitted and Forbidden Sets in Symmetric Threshold-Linear Networks. NIPS 2001.CS1 maint: uses authors parameter (link)
  16. Xavier Glorot, Antoine Bordes and Yoshua Bengio (2011). Deep sparse rectifier neural networks (PDF). AISTATS.CS1 maint: uses authors parameter (link)
  17. Yann LeCun, Leon Bottou, Genevieve B. Orr and Klaus-Robert Müller (1998). "Efficient BackProp" (PDF). In G. Orr; K. Müller (eds.). Neural Networks: Tricks of the Trade. Springer.CS1 maint: uses authors parameter (link)

Further reading




External links

  • neuron mimics function of human cells
  • McCulloch-Pitts Neurons (Overview)


Category:Artificial neural networks Category:American inventions



This page was moved from wikipedia:en:Artificial neuron. Its edit history can be viewed at McCulloch & Pitts神经网络/edithistory