'''<font color=#ff8000>缺省逻辑 Default Logics</font>'''、'''<font color=#ff8000>非单调逻辑 Non-monotonic Logics</font>'''、'''<font color=#ff8000>限制逻辑 Circumscription</font>'''和'''<font color=#ff8000>模态逻辑 Modal Logics</font>'''<ref name="Default reasoning and non-monotonic logic"/>都是为处理缺省推理和限定问题而设计的逻辑形式。一些逻辑扩展被用于处理特定的知识领域,例如:'''<font color=#ff8000>描述逻辑 Description Logics</font>'''<ref name="Representing categories and relations"/>、情景演算、事件演算、'''<font color=#ff8000>流态演算 Fluent Calculus</font>'''(用于表示事件和时间)<ref name="Representing time"/>、因果演算<ref name="Representing causation"/>、信念演算(信念修正)<ref>"The Belief Calculus and Uncertain Reasoning", Yen-Teh Hsia</ref>和模态逻辑<ref name="Representing knowledge about knowledge"/>。人们还设计了一些逻辑来对多主体系统中出现的矛盾或不一致陈述进行建模,例如次协调逻辑。
 
AI中的许多问题(在推理、规划、学习、感知和机器人技术方面)要求主体在信息不完整或不确定的情况下进行操作。AI研究人员从概率论和经济学的角度设计了许多强大的工具来解决这些问题。
 
   --[[用户:Thingamabob|Thingamabob]]([[用户讨论:Thingamabob|讨论]]) 分类器(“ if shiny then diamond”)和控制器(“ if shiny then pick up”) 一句不能准确翻译
 
--[[用户:Qige96|Ricky]]([[用户讨论:Qige96|讨论]])已解决。这里主要是突出分类器的工作是下判断,而控制器的工作是做动作。
A classifier can be trained in various ways; there are many statistical and [[machine learning]] approaches. The [[decision tree learning|decision tree]]<ref name="Decision tree"/> is perhaps the most widely used machine learning algorithm.{{sfn|Domingos|2015|p=88}} Other widely used classifiers are the [[Artificial neural network|neural network]],<ref name="Neural networks"/> [[Gaussian mixture model]],<ref name="Gaussian mixture model"/> and the extremely popular [[naive Bayes classifier]].{{efn|Naive Bayes is reportedly the "most widely used learner" at Google, due in part to its scalability.{{sfn|Domingos|2015|p=152}}}}<ref name="Naive Bayes classifier"/> Classifier performance depends greatly on the characteristics of the data to be classified, such as the dataset size, distribution of samples across classes, the dimensionality, and the level of noise. Model-based classifiers perform well if the assumed model is an extremely good fit for the actual data. Otherwise, if no matching model is available, and if accuracy (rather than speed or scalability) is the sole concern, conventional wisdom is that discriminative classifiers (especially SVM) tend to be more accurate than model-based classifiers such as "naive Bayes" on most practical data sets.<ref name="Classifier performance"/>{{sfn|Russell|Norvig|2009|loc=18.12: Learning from Examples: Summary}}
 
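下面给出一个极简的示意代码(假设可以使用 scikit-learn 及其自带的玩具数据集;数据与参数均为说明用的假设,并非原文内容),展示上文所说的“用多种方式训练分类器”:分别训练一个决策树和一个朴素贝叶斯分类器,并在保留数据上比较准确率。

<syntaxhighlight lang="python">
# 示意:训练并比较两种常用分类器(决策树、朴素贝叶斯)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)            # 玩具数据集,仅作演示
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=3)),
                  ("naive Bayes", GaussianNB())]:
    clf.fit(X_train, y_train)                            # 用训练数据拟合分类器
    acc = accuracy_score(y_test, clf.predict(X_test))    # 在保留数据上评估
    print(f"{name}: accuracy = {acc:.2f}")
</syntaxhighlight>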
Neural networks were inspired by the architecture of neurons in the human brain. A simple "neuron" N accepts input from other neurons, each of which, when activated (or "fired"), cast a weighted "vote" for or against whether neuron N should itself activate. Learning requires an algorithm to adjust these weights based on the training data; one simple algorithm (dubbed "fire together, wire together") is to increase the weight between two connected neurons when the activation of one triggers the successful activation of another. The neural network forms "concepts" that are distributed among a subnetwork of shared neurons that tend to fire together; a concept meaning "leg" might be coupled with a subnetwork meaning "foot" that includes the sound for "foot". Neurons have a continuous spectrum of activation; in addition, neurons can process inputs in a nonlinear way rather than weighing straightforward votes. Modern neural networks can learn both continuous functions and, surprisingly, digital logical operations. Neural networks' early successes included predicting the stock market and (in 1995) a mostly self-driving car. In the 2010s, advances in neural networks using deep learning thrust AI into widespread public consciousness and contributed to an enormous upshift in corporate AI spending; for example, AI-related M&A in 2017 was over 25 times as large as in 2015.
 
神经网络的诞生受到人脑神经元结构的启发。一个简单的“神经元”''N'' 接受来自其他神经元的输入,后者在被激活(或者说“放电”)时,会就''N''自身是否应当激活投出一张带权重的“赞成”或“反对”票。学习需要一种根据训练数据调整这些权重的算法:一个被称为“相互放电,彼此联系”的简单算法,会在一个神经元的激活成功触发另一个神经元的激活时,增大这两个相连神经元之间的权重。神经网络形成的“概念”分布在由一组倾向于共同放电的神经元构成的共享子网络中;表示“腿”的概念可能与表示“脚”的子网络相耦合,后者还包含“脚”这个词的发音。神经元具有连续的激活谱;此外,神经元还可以用非线性的方式处理输入,而不是简单地加权投票。现代神经网络既能学习连续函数,甚至还能学习数字逻辑运算。神经网络早期的成功包括预测股票市场和一辆基本上自动驾驶的汽车(1995年)。{{efn|Steering for the 1995 "[[History of autonomous cars#1990s|No Hands Across America]]" required "only a few human assists".}}2010年代,使用深度学习的神经网络取得巨大进步,将AI推入公众视野,并促使企业对AI的投资急剧增加;例如,2017年与AI相关的并购交易规模是2015年的25倍多。<ref>{{cite news|title=Why Deep Learning Is Suddenly Changing Your Life|url=http://fortune.com/ai-artificial-intelligence-deep-machine-learning/|accessdate=12 March 2018|work=Fortune|date=2016}}</ref><ref>{{cite news|title=Google leads in the race to dominate artificial intelligence|url=https://www.economist.com/news/business/21732125-tech-giants-are-investing-billions-transformative-technology-google-leads-race|accessdate=12 March 2018|work=The Economist|date=2017|language=en}}</ref>
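下面用一小段示意代码说明上文描述的“加权投票”与“相互放电,彼此联系”的权重更新(阈值、学习率等数值均为说明用的假设):

<syntaxhighlight lang="python">
import numpy as np

def fires(inputs, weights, threshold=0.8):
    """神经元 N 对各输入的激活按权重“投票”求和,超过阈值则放电。"""
    return float(np.dot(inputs, weights) > threshold)

def hebbian_update(inputs, weights, fired, lr=0.1):
    """“相互放电,彼此联系”:若 N 放电,则增大它与同时激活的输入之间的权重。"""
    return weights + lr * fired * inputs

weights = np.array([0.5, 0.5, 0.2])
inputs = np.array([1.0, 1.0, 0.0])      # 前两个输入神经元处于激活状态
fired = fires(inputs, weights)
weights = hebbian_update(inputs, weights, fired)
print(fired, weights)
</syntaxhighlight>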
The study of non-learning artificial neural networks began in the decade before the field of AI research was founded, in the work of Walter Pitts and Warren McCulloch. Frank Rosenblatt invented the perceptron, a learning network with a single layer, similar to the old concept of linear regression. Early pioneers also include Alexey Grigorevich Ivakhnenko, Teuvo Kohonen, Stephen Grossberg, Kunihiko Fukushima, Christoph von der Malsburg, David Willshaw, Shun-Ichi Amari, Bernard Widrow, John Hopfield, Eduardo R. Caianiello, and others.
 
对非学习型人工神经网络<ref name="Neural networks"/>的研究早在AI研究领域创立之前的十年间就已展开,源自沃尔特·皮茨和沃伦·麦克卢奇的工作。弗兰克·罗森布拉特发明了'''<font color=#ff8000>感知机 Perceptron</font>''',这是一种单层的学习网络,与古老的线性回归概念类似。早期的先驱者还包括 Alexey Grigorevich Ivakhnenko、Teuvo Kohonen、Stephen Grossberg、Kunihiko Fukushima、Christoph von der Malsburg、David Willshaw、Shun-Ichi Amari、Bernard Widrow、John Hopfield、Eduardo R. Caianiello 等人。
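作为示意,下面给出经典感知机学习规则的一个极简实现(数据为线性可分的逻辑与,学习率、轮数均为假设的示例值),并非对原始论文的复现:

<syntaxhighlight lang="python">
import numpy as np

def train_perceptron(X, y, epochs=10, lr=0.1):
    """单层感知机:以 w·x + b 的符号作预测,预测出错时按感知机规则修正权重。"""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):                 # yi 取值为 -1 或 +1
            pred = 1 if np.dot(w, xi) + b > 0 else -1
            if pred != yi:                        # 只有分类错误时才更新
                w += lr * yi * xi
                b += lr * yi
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])                     # 逻辑与(AND),线性可分
print(train_perceptron(X, y))
</syntaxhighlight>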
 
The main categories of networks are acyclic or feedforward neural networks (where the signal passes in only one direction) and recurrent neural networks (which allow feedback and short-term memories of previous input events). Among the most popular feedforward networks are perceptrons, multi-layer perceptrons and radial basis networks. Neural networks can be applied to the problem of intelligent control (for robotics) or learning, using such techniques as Hebbian learning ("fire together, wire together"), GMDH or competitive learning.
 
网络主要分为'''<font color=#ff8000>非循环或前馈神经网络 Acyclic or Feedforward Neural Networks</font>'''(信号只向一个方向传递)和'''<font color=#ff8000>循环神经网络 Recurrent Neural Networks</font>'''(允许反馈,并对先前的输入事件保留短期记忆)。最常用的前馈网络<ref name="Feedforward neural networks"/>包括感知机、'''<font color=#ff8000>多层感知机 Multi-layer Perceptrons</font>'''和'''<font color=#ff8000>径向基网络 Radial Basis Networks</font>'''。神经网络可以运用'''<font color=#ff8000>赫布型学习 Hebbian Learning</font>'''(“相互放电,彼此联系”)、GMDH 或竞争学习等技术,应用于智能控制(面向机器人)或学习问题。
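下面的小例子对比了前馈与循环两类网络的一次计算(权重随机生成,结构与激活函数均为说明用的假设):前者的信号只向一个方向流动,后者用隐藏状态保留对先前输入事件的短期记忆。

<syntaxhighlight lang="python">
import numpy as np

def feedforward(x, W1, W2):
    """前馈网络:信号只沿一个方向传播,一次输入对应一次输出。"""
    return np.tanh(W2 @ np.tanh(W1 @ x))

def recurrent_step(x_t, h_prev, Wx, Wh):
    """循环网络:隐藏状态 h 把先前输入事件的“短期记忆”反馈到下一步。"""
    return np.tanh(Wx @ x_t + Wh @ h_prev)

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
Wx, Wh = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))

print(feedforward(rng.normal(size=3), W1, W2))   # 前馈:单次前向传播

h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):              # 循环:依次处理长度为 5 的序列
    h = recurrent_step(x_t, h, Wx, Wh)
print(h)
</syntaxhighlight>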
Today, neural networks are often trained by the backpropagation algorithm, which had been around since 1970 as the reverse mode of automatic differentiation published by Seppo Linnainmaa, and was introduced to neural networks by Paul Werbos.
 
如今的神经网络通常用'''<font color=#ff8000>反向传播算法 Backpropagation</font>'''来训练。反向传播算法早在1970年就已出现,即 Seppo Linnainmaa 提出的自动微分的反向模式<ref name="lin1970">[[Seppo Linnainmaa]] (1970). The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master's Thesis (in Finnish), Univ. Helsinki, 6–7.</ref><ref name="grie2012">Griewank, Andreas (2012). Who Invented the Reverse Mode of Differentiation?. Optimization Stories, Documenta Matematica, Extra Volume ISMP (2012), 389–400.</ref>,后来由保罗·韦伯斯引入神经网络。<ref name="WERBOS1974">[[Paul Werbos]], "Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences", ''PhD thesis, Harvard University'', 1974.</ref><ref name="werbos1982">[[Paul Werbos]] (1982). Applications of advances in nonlinear sensitivity analysis. In System modeling and optimization (pp. 762–770). Springer Berlin Heidelberg. [http://werbos.com/Neural/SensitivityIFIPSeptember1981.pdf Online] {{webarchive|url=https://web.archive.org/web/20160414055503/http://werbos.com/Neural/SensitivityIFIPSeptember1981.pdf |date=14 April 2016 }}</ref><ref name="Backpropagation"/>
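下面是反向传播的一个极简示意(两层网络、均方误差、手工推导的梯度;网络结构与数据均为假设的玩具例子):误差按链式法则从输出层逐层传回,这正是自动微分反向模式的思想。

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                  # 玩具输入
y = rng.normal(size=(8, 1))                  # 玩具目标
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 1))

for step in range(200):
    # 前向传播
    h = np.tanh(X @ W1)
    pred = h @ W2
    loss = np.mean((pred - y) ** 2)
    # 反向传播:按链式法则把误差逐层传回
    d_pred = 2 * (pred - y) / len(X)
    d_W2 = h.T @ d_pred
    d_h = d_pred @ W2.T
    d_W1 = X.T @ (d_h * (1 - h ** 2))        # tanh 的导数为 1 - tanh^2
    # 梯度下降更新权重
    W1 -= 0.05 * d_W1
    W2 -= 0.05 * d_W2

print(round(float(loss), 4))
</syntaxhighlight>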
Hierarchical temporal memory is an approach that models some of the structural and algorithmic properties of the neocortex.
 
'''<font color=#ff8000>层级时序记忆 Hierarchical Temporal Memory</font>'''是一种对大脑新皮层的部分结构特性和算法特性进行建模的方法。<ref name="Hierarchical temporal memory"/>
 
To summarize, most neural networks use some form of gradient descent on a hand-created neural topology. However, some research groups, such as Uber, argue that simple neuroevolution to mutate new neural network topologies and weights may be competitive with sophisticated gradient descent approaches. One advantage of neuroevolution is that it may be less prone to get caught in "dead ends".
 
总之,大多数神经网络都会在人工设计的网络拓扑结构上使用某种形式的'''<font color=#ff8000>梯度下降法 Gradient Descent</font>'''。然而,一些研究团队(例如 Uber 的团队)认为,通过简单的神经进化来变异出新的网络拓扑结构和权重,可能与复杂的梯度下降方法不相上下{{citation needed|date=July 2019}}。神经进化的一个优势是,它可能不那么容易陷入“死胡同”。<ref>{{cite news|title=Artificial intelligence can 'evolve' to solve problems|url=http://www.sciencemag.org/news/2018/01/artificial-intelligence-can-evolve-solve-problems|accessdate=7 February 2018|work=Science {{!}} AAAS|date=10 January 2018|language=en}}</ref>
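作为对照,下面是“简单神经进化”的一个极简示意(只变异权重、不变异拓扑,任务与变异幅度均为假设的玩具设定,并非 Uber 的具体做法):反复对当前最优个体做随机扰动,保留适应度不下降的后代,全程不需要计算梯度。

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))
y = (X.sum(axis=1) > 0).astype(float)        # 玩具任务:判断各维之和是否为正

def fitness(w):
    pred = (X @ w > 0).astype(float)         # 一个极简“网络”:线性阈值单元
    return (pred == y).mean()                # 适应度 = 分类准确率

best = rng.normal(size=3)
for generation in range(100):
    child = best + 0.1 * rng.normal(size=3)  # 变异:给权重加上小的随机扰动
    if fitness(child) >= fitness(best):      # 选择:保留不变差的后代
        best = child

print(fitness(best))
</syntaxhighlight>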
Deep learning is any artificial neural network that can learn a long chain of causal links. For example, a feedforward network with six hidden layers can learn a seven-link causal chain (six hidden layers + output layer) and has a "credit assignment path" (CAP) depth of seven. Many deep learning systems need to be able to learn chains ten or more causal links in length.
 
深度学习指任何能够学习长因果链的人工神经网络。例如,一个具有六个隐藏层的前馈网络可以学习七个环节的因果链(六个隐藏层 + 输出层),其“'''<font color=#ff8000>信用分配路径 Credit Assignment Path,CAP</font>'''”的深度为7。许多深度学习系统需要能够学习长度在十以上的因果链。<ref name="goodfellow2016">Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016). Deep Learning. MIT Press. [http://www.deeplearningbook.org Online] {{webarchive|url=https://web.archive.org/web/20160416111010/http://www.deeplearningbook.org/ |date=16 April 2016 }}</ref><ref name="HintonDengYu2012">{{cite journal | last1 = Hinton | first1 = G. | last2 = Deng | first2 = L. | last3 = Yu | first3 = D. | last4 = Dahl | first4 = G. | last5 = Mohamed | first5 = A. | last6 = Jaitly | first6 = N. | last7 = Senior | first7 = A. | last8 = Vanhoucke | first8 = V. | last9 = Nguyen | first9 = P. | last10 = Sainath | first10 = T. | last11 = Kingsbury | first11 = B. | year = 2012 | title = Deep Neural Networks for Acoustic Modeling in Speech Recognition – The shared views of four research groups | url = | journal = IEEE Signal Processing Magazine | volume = 29 | issue = 6| pages = 82–97 | doi=10.1109/msp.2012.2205597}}</ref><ref name="schmidhuber2015">{{cite journal |last=Schmidhuber |first=J. |year=2015 |title=Deep Learning in Neural Networks: An Overview |journal=Neural Networks |volume=61 |pages=85–117 |arxiv=1404.7828 |doi=10.1016/j.neunet.2014.09.003|pmid=25462637 }}</ref>
According to one overview,<ref name="scholarpedia">{{cite journal | last1 = Schmidhuber | first1 = Jürgen | authorlink = Jürgen Schmidhuber | year = 2015 | title = Deep Learning | journal = Scholarpedia | volume = 10 | issue = 11 | page = 32832 | doi = 10.4249/scholarpedia.32832 | df = dmy-all | bibcode = 2015SchpJ..1032832S | doi-access = free }}</ref> the expression "Deep Learning" was introduced to the [[machine learning]] community by [[Rina Dechter]] in 1986<ref name="dechter1986">[[Rina Dechter]] (1986). Learning while searching in constraint-satisfaction problems. University of California, Computer Science Department, Cognitive Systems Laboratory.[https://www.researchgate.net/publication/221605378_Learning_While_Searching_in_Constraint-Satisfaction-Problems Online] {{webarchive|url=https://web.archive.org/web/20160419054654/https://www.researchgate.net/publication/221605378_Learning_While_Searching_in_Constraint-Satisfaction-Problems |date=19 April 2016 }}</ref> and gained traction after Igor Aizenberg and colleagues introduced it to [[artificial neural network]]s in 2000.
根据一篇综述<ref name="scholarpedia"/>,“深度学习”这一表述是里纳·德克特在1986年引入机器学习领域的<ref name="dechter1986"/>,并在2000年伊戈尔·艾森贝格及其同事将其引入人工神经网络之后获得了关注。<ref name="aizenberg2000">Igor Aizenberg, Naum N. Aizenberg, Joos P.L. Vandewalle (2000). Multi-Valued and Universal Binary Neurons: Theory, Learning and Applications. Springer Science & Business Media.</ref>
The first functional Deep Learning networks were published by [[Alexey Grigorevich Ivakhnenko]] and V. G. Lapa in 1965.<ref>{{Cite book|title=Cybernetic Predicting Devices|last=Ivakhnenko|first=Alexey|publisher=Naukova Dumka|year=1965|isbn=|location=Kiev|pages=}}</ref> These networks are trained one layer at a time. Ivakhnenko's 1971 paper<ref name="ivak1971">{{Cite journal |doi = 10.1109/TSMC.1971.4308320|title = Polynomial Theory of Complex Systems|journal = IEEE Transactions on Systems, Man, and Cybernetics|issue = 4|pages = 364–378|year = 1971|last1 = Ivakhnenko|first1 = A. G.|url = https://semanticscholar.org/paper/b7efb6b6f7e9ffa017e970a098665f76d4dfeca2}}</ref> describes the learning of a deep feedforward multilayer perceptron with eight layers, already much deeper than many later networks. In 2006, a publication by [[Geoffrey Hinton]] and Ruslan Salakhutdinov introduced another way of pre-training many-layered [[feedforward neural network]]s (FNNs) one layer at a time, treating each layer in turn as an [[unsupervised learning|unsupervised]] [[restricted Boltzmann machine]], then using [[supervised learning|supervised]] [[backpropagation]] for fine-tuning.{{sfn|Hinton|2007}} Similar to shallow artificial neural networks, deep neural networks can model complex non-linear relationships. Over the last few years, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.<ref>{{cite web|last1=Research|first1=AI|title=Deep Neural Networks for Acoustic Modeling in Speech Recognition|url=http://airesearch.com/ai-research-papers/deep-neural-networks-for-acoustic-modeling-in-speech-recognition/|website=airesearch.com|accessdate=23 October 2015|date=23 October 2015}}</ref>
第一批具备实际功能的深度学习网络由 A. G. 伊瓦赫年科和 V. G. 拉帕在1965年发表。这些网络每次只训练一层。伊瓦赫年科1971年的论文描述了一个8层的深度前馈多层感知机的学习过程,它已经比后来的许多网络深得多<ref name="ivak1971">{{Cite journal |doi = 10.1109/TSMC.1971.4308320|title = Polynomial Theory of Complex Systems|journal = IEEE Transactions on Systems, Man, and Cybernetics|issue = 4|pages = 364–378|year = 1971|last1 = Ivakhnenko|first1 = A. G.|url = https://semanticscholar.org/paper/b7efb6b6f7e9ffa017e970a098665f76d4dfeca2}}</ref>。2006年,杰弗里·辛顿和鲁斯兰·萨拉赫丁诺夫的一篇文章介绍了另一种逐层预训练'''<font color=#ff8000>多层前馈神经网络 Many-layered Feedforward Neural Networks, FNNs</font>'''的方法:一次训练一层,把每一层都先当作无监督的[[受限玻尔兹曼机]]来训练,再用监督式反向传播进行微调。与浅层人工神经网络类似,深层神经网络可以对复杂的非线性关系建模。近年来,机器学习算法和计算机硬件的进步带来了更高效的方法,用于训练包含许多层非线性隐藏单元和一个非常大的输出层的深层神经网络。<ref>{{cite web|last1=Research|first1=AI|title=Deep Neural Networks for Acoustic Modeling in Speech Recognition|url=http://airesearch.com/ai-research-papers/deep-neural-networks-for-acoustic-modeling-in-speech-recognition/|website=airesearch.com|accessdate=23 October 2015|date=23 October 2015}}</ref>
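下面给出逐层贪心预训练的一个极简示意(每层用对比散度 CD-1 训练一个省略了偏置项的受限玻尔兹曼机;数据与超参数均为假设的玩具设定,并非辛顿原文的具体实现;监督式反向传播微调从略):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=20, lr=0.05):
    """用对比散度(CD-1)训练一个无偏置的受限玻尔兹曼机,作为一层的无监督预训练。"""
    W = 0.01 * rng.normal(size=(data.shape[1], n_hidden))
    for _ in range(epochs):
        v0 = data
        h0 = sigmoid(v0 @ W)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = sigmoid(h0_sample @ W.T)                  # 重构可见层
        h1 = sigmoid(v1 @ W)
        W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)  # CD-1 的权重更新
    return W

data = (rng.random(size=(64, 12)) > 0.5).astype(float)  # 玩具二值数据
weights, layer_input = [], data
for n_hidden in [8, 4]:                    # 逐层贪心:下一层以上一层的隐含表示为输入
    W = train_rbm(layer_input, n_hidden)
    weights.append(W)
    layer_input = sigmoid(layer_input @ W)
# 此后可把 weights 作为深层网络的初始化,再用监督式反向传播做微调(此处从略)
print([W.shape for W in weights])
</syntaxhighlight>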
Deep learning often uses convolutional neural networks (CNNs), whose origins can be traced back to the Neocognitron introduced by Kunihiko Fukushima in 1980. In 1989, Yann LeCun and colleagues applied backpropagation to such an architecture. In the early 2000s, in an industrial application, CNNs already processed an estimated 10% to 20% of all the checks written in the US.
 
深度学习通常使用'''<font color=#ff8000>卷积神经网络 Convolutional Neural Networks, CNNs</font>''',其起源可以追溯到福岛邦彦在1980年提出的新认知机。1989年,扬·勒丘恩(Yann LeCun)和他的同事将反向传播算法应用于这种架构。在21世纪初的一项工业应用中,CNNs 已经处理了全美大约10%到20%的签发支票。
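卷积层所做的事情可以用下面的极简示意来理解(单通道、“valid”卷积,图像与卷积核均为假设的玩具数据):一个小的卷积核在图像上滑动,在每个位置求加权和。

<syntaxhighlight lang="python">
import numpy as np

def conv2d(image, kernel):
    """对单通道图像做“valid”二维卷积:卷积核逐位置滑动并求加权和。"""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # 玩具“图像”
edge_kernel = np.array([[1.0, 0.0, -1.0]] * 3)     # 一个简单的竖直边缘检测核
print(conv2d(image, edge_kernel))
</syntaxhighlight>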
Since 2011, fast implementations of CNNs on GPUs have won many visual pattern recognition competitions.<ref name="schmidhuber2015"/>
    +
    +
自2011年以来,在 GPU 上的快速 CNN 实现赢得了许多视觉模式识别比赛。<ref name="schmidhuber2015"/>
CNNs with 12 convolutional layers were used in conjunction with reinforcement learning by Deepmind's "AlphaGo Lee", the program that beat a top Go champion in 2016.
 
2016年Deepmind 的“AlphaGo Lee”使用了有12个卷积层的 CNNs 和强化学习,击败了一个顶级围棋冠军。<ref name="Nature2017">{{cite journal |first1=David |last1=Silver|author-link1=David Silver (programmer)|first2= Julian|last2= Schrittwieser|first3= Karen|last3= Simonyan|first4= Ioannis|last4= Antonoglou|first5= Aja|last5= Huang|author-link5=Aja Huang|first6=Arthur|last6= Guez|first7= Thomas|last7= Hubert|first8= Lucas|last8= Baker|first9= Matthew|last9= Lai|first10= Adrian|last10= Bolton|first11= Yutian|last11= Chen|author-link11=Chen Yutian|first12= Timothy|last12= Lillicrap|first13=Hui|last13= Fan|author-link13=Fan Hui|first14= Laurent|last14= Sifre|first15= George van den|last15= Driessche|first16= Thore|last16= Graepel|first17= Demis|last17= Hassabis |author-link17=Demis Hassabis|title=Mastering the game of Go without human knowledge|journal=[[Nature (journal)|Nature]]|issn= 0028-0836|pages=354–359|volume =550|issue =7676|doi =10.1038/nature24270|pmid=29052630|date=19 October 2017|quote=AlphaGo Lee... 12 convolutional layers|bibcode=2017Natur.550..354S|url=http://discovery.ucl.ac.uk/10045895/1/agz_unformatted_nature.pdf}}{{closed access}}</ref>
 
====深层循环(递归)神经网络 Deep Recurrent Neural Networks====
    
{{Main|Recurrent neural networks}}
 
Early on, deep learning was also applied to sequence learning with [[recurrent neural network]]s (RNNs)<ref name="Recurrent neural networks"/> which are in theory Turing complete<ref>{{cite journal|last1=Hyötyniemi|first1=Heikki|title=Turing machines are recurrent neural networks|journal=Proceedings of STeP '96/Publications of the Finnish Artificial Intelligence Society|pages=13–24|date=1996}}</ref> and can run arbitrary programs to process arbitrary sequences of inputs. The depth of an RNN is unlimited and depends on the length of its input sequence; thus, an RNN is an example of deep learning.<ref name="schmidhuber2015"/> RNNs can be trained by [[gradient descent]]<ref>P. J. Werbos. Generalization of backpropagation with application to a recurrent gas market model" ''Neural Networks'' 1, 1988.</ref><ref>A. J. Robinson and F. Fallside. The utility driven dynamic error propagation network. Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.</ref><ref>R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications. Hillsdale, NJ: Erlbaum, 1994.</ref> but suffer from the [[vanishing gradient problem]].<ref name="goodfellow2016"/><ref name="hochreiter1991">[[Sepp Hochreiter]] (1991), [http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf Untersuchungen zu dynamischen neuronalen Netzen] {{webarchive|url=https://web.archive.org/web/20150306075401/http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf |date=6 March 2015 }}, Diploma thesis. Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber.</ref> In 1992, it was shown that unsupervised pre-training of a stack of [[recurrent neural network]]s can speed up subsequent supervised learning of deep sequential problems.<ref name="SCHMID1992">{{cite journal | last1 = Schmidhuber | first1 = J. | year = 1992 | title = Learning complex, extended sequences using the principle of history compression | url = | journal = Neural Computation | volume = 4 | issue = 2| pages = 234–242 | doi=10.1162/neco.1992.4.2.234| citeseerx = 10.1.1.49.3934}}</ref>
 
早期,深度学习也被应用于使用'''<font color=#ff8000>循环神经网络 Recurrent Neural Networks,RNNs</font>'''<ref name="Recurrent neural networks"/>的序列学习。循环神经网络在理论上是图灵完备的<ref>{{cite journal|last1=Hyötyniemi|first1=Heikki|title=Turing machines are recurrent neural networks|journal=Proceedings of STeP '96/Publications of the Finnish Artificial Intelligence Society|pages=13–24|date=1996}}</ref>,可以运行任意程序来处理任意的输入序列。循环神经网络的深度没有限制,取决于输入序列的长度;因此,循环神经网络是深度学习的一个例子<ref name="schmidhuber2015"/>。循环神经网络可以用梯度下降法来训练<ref>P. J. Werbos. Generalization of backpropagation with application to a recurrent gas market model" ''Neural Networks'' 1, 1988.</ref><ref>A. J. Robinson and F. Fallside. The utility driven dynamic error propagation network. Technical Report CUED/F-INFENG/TR.1, Cambridge University Engineering Department, 1987.</ref><ref>R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures and Applications. Hillsdale, NJ: Erlbaum, 1994.</ref>,但存在梯度消失问题<ref name="goodfellow2016"/><ref name="hochreiter1991">[[Sepp Hochreiter]] (1991), [http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf Untersuchungen zu dynamischen neuronalen Netzen] {{webarchive|url=https://web.archive.org/web/20150306075401/http://people.idsia.ch/~juergen/SeppHochreiter1991ThesisAdvisorSchmidhuber.pdf |date=6 March 2015 }}, Diploma thesis. Institut f. Informatik, Technische Univ. Munich. Advisor: J. Schmidhuber.</ref>。1992年的一项研究表明,对一组堆叠的循环神经网络进行无监督预训练,可以加速后续深度序列问题的监督式学习。<ref name="SCHMID1992">{{cite journal | last1 = Schmidhuber | first1 = J. | year = 1992 | title = Learning complex, extended sequences using the principle of history compression | url = | journal = Neural Computation | volume = 4 | issue = 2| pages = 234–242 | doi=10.1162/neco.1992.4.2.234| citeseerx = 10.1.1.49.3934}}</ref>
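下面的小实验示意了梯度消失问题(循环权重、步数等均为假设的示例值):把循环网络沿时间展开后,梯度要反复乘上同一个循环权重矩阵,当其谱半径小于 1 时,梯度范数会随步数指数级衰减。

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
Wh = 0.2 * rng.normal(size=(8, 8))          # 循环权重,谱半径小于 1

h = np.zeros(8)
grad = np.eye(8)                            # 近似 d h_t / d h_0(忽略 tanh 处的缩放,它只会让梯度更小)
for t in range(50):                         # 沿时间展开 50 步
    h = np.tanh(Wh @ h + rng.normal(size=8))
    grad = Wh.T @ grad                      # 每一步都乘上同一个雅可比因子
    if t in (0, 9, 49):
        print(t + 1, np.linalg.norm(grad))  # 梯度范数随步数指数级缩小:梯度消失
</syntaxhighlight>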
Numerous researchers now use variants of a deep learning recurrent NN called the long short-term memory (LSTM) network published by Hochreiter & Schmidhuber in 1997. LSTM is often trained by Connectionist Temporal Classification (CTC). At Google, Microsoft and Baidu this approach has revolutionized speech recognition.<ref name="hannun2014"/>
 
许多研究人员现在使用的是一种被称为'''<font color=#ff8000>长短期记忆 Long Short-term Memory, LSTM</font>'''网络的深度学习循环神经网络变体,它由霍克赖特和施米德胡贝在1997年提出。LSTM 通常用'''<font color=#ff8000>连接时序分类 Connectionist Temporal Classification, CTC</font>'''来训练<ref name="graves2006">Alex Graves, Santiago Fernandez, Faustino Gomez, and [[Jürgen Schmidhuber]] (2006). Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural nets. Proceedings of ICML'06, pp. 369–376.</ref>。在谷歌、微软和百度,这种方法已经彻底改变了语音识别。例如,2015年谷歌的语音识别性能大幅提升了49%,如今数十亿智能手机用户都可以通过谷歌语音使用这项技术。谷歌还使用 LSTM 来改进机器翻译、语言建模和多语言语言处理。LSTM 与 CNNs 结合使用还改进了自动图像字幕等众多应用。
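LSTM 单元的核心是用输入门、遗忘门和输出门控制一个细胞状态的写入、保留与读出,从而缓解梯度消失。下面是单个时间步的极简示意(把四组门的权重拼成一个矩阵,维度与数据均为假设的玩具设定;CTC 训练不在此示意范围内):

<syntaxhighlight lang="python">
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """LSTM 单步:输入门 i、遗忘门 f、输出门 o 控制细胞状态 c 的写入、保留与读出。"""
    z = np.concatenate([h_prev, x_t])
    i, f, o, g = np.split(W @ z + b, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g                  # 细胞状态:承载较长期的记忆
    h = o * np.tanh(c)                      # 隐藏状态:当前步的输出
    return h, c

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = rng.normal(size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(6, n_in)):      # 依次处理长度为 6 的输入序列
    h, c = lstm_step(x_t, h, c, W, b)
print(h)
</syntaxhighlight>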
       
===评估进度 Evaluating progress ===
 
{{Further|Progress in artificial intelligence|Competitions and prizes in artificial intelligence}}
 
AI, like electricity or the steam engine, is a general purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at.<ref>{{cite news|last1=Brynjolfsson|first1=Erik|last2=Mitchell|first2=Tom|title=What can machine learning do? Workforce implications|url=http://science.sciencemag.org/content/358/6370/1530|accessdate=7 May 2018|work=Science|date=22 December 2017|pages=1530–1534|language=en|doi=10.1126/science.aap8062|bibcode=2017Sci...358.1530B}}</ref> While projects such as [[AlphaZero]] have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets.<ref>{{cite news|last1=Sample|first1=Ian|title='It's able to create knowledge itself': Google unveils AI that learns on its own|url=https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own|accessdate=7 May 2018|work=the Guardian|date=18 October 2017|language=en}}</ref><ref>{{cite news|title=The AI revolution in science|url=http://www.sciencemag.org/news/2017/07/ai-revolution-science|accessdate=7 May 2018|work=Science {{!}} AAAS|date=5 July 2017|language=en}}</ref> Researcher [[Andrew Ng]] has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI."<ref>{{cite news|title=Will your job still exist in 10 years when the robots arrive?|url=http://www.scmp.com/tech/innovation/article/2098164/robots-are-coming-here-are-some-jobs-wont-exist-10-years|accessdate=7 May 2018|work=[[South China Morning Post]]|date=2017|language=en}}</ref> [[Moravec's paradox]] suggests that AI lags humans at many tasks that the human brain has specifically evolved to perform well.<ref name="The Economist"/>
 
AI和电或蒸汽机一样,是一种通用技术。对于AI究竟擅长哪类任务,目前尚无共识<ref>{{cite news|last1=Brynjolfsson|first1=Erik|last2=Mitchell|first2=Tom|title=What can machine learning do? Workforce implications|url=http://science.sciencemag.org/content/358/6370/1530|accessdate=7 May 2018|work=Science|date=22 December 2017|pages=1530–1534|language=en|doi=10.1126/science.aap8062|bibcode=2017Sci...358.1530B}}</ref>。虽然像 AlphaZero 这样的项目已经能够从零开始产生知识,但许多其他的机器学习项目仍需要大量的训练数据集<ref>{{cite news|last1=Sample|first1=Ian|title='It's able to create knowledge itself': Google unveils AI that learns on its own|url=https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own|accessdate=7 May 2018|work=the Guardian|date=18 October 2017|language=en}}</ref><ref>{{cite news|title=The AI revolution in science|url=http://www.sciencemag.org/news/2017/07/ai-revolution-science|accessdate=7 May 2018|work=Science {{!}} AAAS|date=5 July 2017|language=en}}</ref>。研究人员吴恩达提出了一条“极不完美的经验法则”:“几乎任何普通人只需不到一秒钟的思考就能完成的事情,我们现在或在不久的将来大概都可以用AI将其自动化。”<ref>{{cite news|title=Will your job still exist in 10 years when the robots arrive?|url=http://www.scmp.com/tech/innovation/article/2098164/robots-are-coming-here-are-some-jobs-wont-exist-10-years|accessdate=7 May 2018|work=[[South China Morning Post]]|date=2017|language=en}}</ref>莫拉维克悖论则表明,在许多人类大脑经过专门演化而擅长的任务上,AI的表现仍落后于人类。<ref name="The Economist"/>
 
Games provide a well-publicized benchmark for assessing rates of progress. AlphaGo around 2016 brought the era of classical board-game benchmarks to a close. Games of imperfect knowledge provide new challenges to AI in the area of game theory. E-sports such as StarCraft continue to provide additional public benchmarks. There are many competitions and prizes, such as the Imagenet Challenge, to promote research in artificial intelligence. The most common areas of competition include general machine intelligence, conversational behavior, data-mining, robotic cars, and robot soccer as well as conventional games.
 
游戏为评估AI的进展速度提供了一个广为人知的基准。2016年前后,AlphaGo 为经典棋类基准的时代拉下了帷幕。不完全信息的游戏在博弈论领域给AI提出了新的挑战。星际争霸等电子竞技则继续提供额外的公开基准。人们设立了许多竞赛和奖项(例如 ImageNet 挑战赛)来促进AI研究。最常见的竞赛领域包括通用机器智能、对话行为、数据挖掘、机器人汽车、机器人足球以及传统游戏。
 
The "imitation game" (an interpretation of the 1950 Turing test that assesses whether a computer can imitate a human) is nowadays considered too exploitable to be a meaningful benchmark. A derivative of the Turing test is the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA). As the name implies, this helps to determine that a user is an actual person and not a computer posing as a human. In contrast to the standard Turing test, CAPTCHA is administered by a machine and targeted to a human as opposed to being administered by a human and targeted to a machine. A computer asks a user to complete a simple test then generates a grade for that test. Computers are unable to solve the problem, so correct solutions are deemed to be the result of a person taking the test. A common type of CAPTCHA is the test that requires the typing of distorted letters, numbers or symbols that appear in an image undecipherable by a computer.
 
The "imitation game" (an interpretation of the 1950 Turing test that assesses whether a computer can imitate a human) is nowadays considered too exploitable to be a meaningful benchmark. A derivative of the Turing test is the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA). As the name implies, this helps to determine that a user is an actual person and not a computer posing as a human. In contrast to the standard Turing test, CAPTCHA is administered by a machine and targeted to a human as opposed to being administered by a human and targeted to a machine. A computer asks a user to complete a simple test then generates a grade for that test. Computers are unable to solve the problem, so correct solutions are deemed to be the result of a person taking the test. A common type of CAPTCHA is the test that requires the typing of distorted letters, numbers or symbols that appear in an image undecipherable by a computer.
   −
“模仿游戏”(对1950年图灵测试的一种解释,用来评估计算机是否可以模仿人类)如今被认为是一个过于灵活而不能成为有意义的基准。图灵测试衍生出了'''<font color=#ff8000>验证码 Completely Automated Public Turing test to tell Computers and Humans Apart,CAPTCHA</font>'''(即全自动区分计算机和人类的图灵测试)。顾名思义,这有助于确定用户是一个真实的人,而不是一台伪装成人的计算机。与标准的图灵测试不同,CAPTCHA 是由机器控制,面向人测试,而不是由人控制的,面向机器测试的。计算机要求用户完成一个简单的测试,然后给测试评出一个等级。计算机无法解决这个问题,所以一般认为只有人参加测试才能得出正确答案。验证码的一个常见类型是要求输入一幅计算机无法破译的图中扭曲的字母,数字或符号测试。  
+
“模仿游戏”(对1950年图灵测试的一种解释,用来评估计算机能否模仿人类)如今被认为太容易被钻空子,已算不上一个有意义的基准。图灵测试的一个衍生品是'''<font color=#ff8000>验证码 Completely Automated Public Turing test to tell Computers and Humans Apart,CAPTCHA</font>'''(即全自动区分计算机和人类的图灵测试)。顾名思义,它有助于确定用户是一个真实的人,而不是一台伪装成人的计算机。与标准的图灵测试不同,CAPTCHA 由机器主持、以人为测试对象,而不是由人主持、以机器为测试对象。计算机要求用户完成一个简单的测试,然后为该测试打分。计算机自己无法解决这个问题,所以给出正确答案就被认为是有人在参加测试。一种常见的验证码要求用户输入一幅计算机无法破译的图片中扭曲的字母、数字或符号。
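下面是验证码这一流程的极简示意(假设可以使用 Pillow 图像库;字符数、扭曲方式等均为说明用的假设):机器生成一串随机字符并渲染成略加扭曲的图片,再由机器为用户的作答打分。

<syntaxhighlight lang="python">
import random
import string
from PIL import Image, ImageDraw, ImageFont   # 假设已安装 Pillow

def make_captcha(length=5):
    """生成随机字符,渲染成略有旋转的图片,返回(图片, 正确答案)。"""
    answer = "".join(random.choices(string.ascii_uppercase + string.digits, k=length))
    img = Image.new("L", (120, 40), color=255)
    ImageDraw.Draw(img).text((10, 12), answer, fill=0, font=ImageFont.load_default())
    img = img.rotate(random.uniform(-15, 15), fillcolor=255)   # 简单的“扭曲”
    return img, answer

def grade(answer, user_input):
    """由机器打分:答对视为人类,答错则不通过。"""
    return user_input.strip().upper() == answer

img, answer = make_captcha()
img.save("captcha.png")                        # 把图片呈现给用户
print(grade(answer, input("请输入图片中的字符: ")))
</syntaxhighlight>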
 
Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; unfortunately, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels.
 
Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; unfortunately, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels.
   −
“通用智能”测试旨在比较机器、人类甚至非人类动物在尽可能通用的问题集上的表现。在极端情况下,测试集可以包含所有可能出现的问题,由柯氏复杂性赋权重; 可是这些问题集往往是用有限的模式匹配练习完成的,在这些练习中,优化过的AI可以轻易地超过人类。
+
有人提出的“通用智能”测试,旨在比较机器、人类乃至非人类动物在尽可能通用的问题集上的表现。在极端情况下,测试集可以包含所有可能的问题,并按柯尔莫哥洛夫复杂度赋予权重;遗憾的是,这类问题集往往被贫乏的模式匹配练习所主导,而经过调优的AI在这些练习上可以轻易超过人类的水平。<ref name="Mathematical definitions of intelligence"/><ref>{{cite journal|last1=Hernández-Orallo|first1=José|last2=Dowe|first2=David L.|last3=Hernández-Lloreda|first3=M.Victoria|title=Universal psychometrics: Measuring cognitive abilities in the machine kingdom|journal=Cognitive Systems Research|date=March 2014|volume=27|pages=50–74|doi=10.1016/j.cogsys.2013.06.001|hdl=10251/50244|hdl-access=free}}</ref>
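柯尔莫哥洛夫复杂度本身不可计算,实践中常用压缩后的长度作为粗略代理。下面的示意按这一思路给问题赋权(以 zlib 压缩长度近似复杂度,问题串为假设的例子):越容易被压缩(越“简单”)的问题权重越大。

<syntaxhighlight lang="python">
import zlib

def complexity_bits(problem: str) -> int:
    """用压缩后长度(比特数)粗略近似柯尔莫哥洛夫复杂度(后者不可计算)。"""
    return 8 * len(zlib.compress(problem.encode()))

problems = ["0101010101010101", "the quick brown fox jumps over the lazy dog"]
weights = [2.0 ** -complexity_bits(p) for p in problems]   # 权重 ≈ 2^(-K)
total = sum(weights)
for p, w in zip(problems, weights):
    print(f"{w / total:.3g}  {p}")            # 归一化后的权重:可压缩的问题占比更大
</syntaxhighlight>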
    
== 应用 Applications{{anchor|Goals}} ==
 