Changes

Machine Learning (view source)

Revision as of 22:23, 3 August 2021 (Tue)

27,246 bytes removed, 22:23, 3 August 2021 (Tue)

→Methods

Line 268: Line 268:

==Methods==

−

~~:''Main article: [https://en.wikipedia.org/wiki/List_of_machine_learning_algorithms List of machine learning algorithms]''~~

+

=== Types of learning algorithms ===

−

=== 学习算法的分类 ~~Types of learning algorithms~~ ===

−

~~The types of machine learning algorithms differ in their approach, the type of data they input and output, and the type of task or problem that they are intended to solve.~~

−

~~The types of machine learning algorithms differ in their approach, the type of data they input and output, and the type of task or problem that they are intended to solve.~~

−

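Machine learning algorithms differ in their approach, in the type of data they take as input and produce as output, and in the type of task or problem they are intended to solve.

+

==== Supervised learning ====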

−

~~==== 监督学习 Supervised learning ====~~

+

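[[File:Svm max sep hyperplane with margin.png|thumb|A support vector machine is a supervised learning model that divides the data into regions separated by a linear boundary. Here, the linear boundary separates the black circles from the white ones.]]

−

+

Supervised learning algorithms build a mathematical model of a data set that contains both the inputs and the desired outputs.<ref>{{cite book |last1=Russell |first1=Stuart J. |last2=Norvig |first2=Peter |title=Artificial Intelligence: A Modern Approach |date=2010 |publisher=Prentice Hall |isbn=9780136042594 |edition=Third|title-link=Artificial Intelligence: A Modern Approach }}</ref> The data is known as training data and consists of a set of training examples. Each training example has one or more inputs and a desired output, also known as a supervisory signal. In the mathematical model, each training example is represented by an array or vector, sometimes called a '''feature vector''', and the training data is represented by a matrix. Through iterative optimization of an objective function, supervised learning algorithms learn a function that can be used to predict the output associated with new inputs.<ref>{{cite book |last1=Mohri |first1=Mehryar |last2=Rostamizadeh |first2=Afshin |last3=Talwalkar |first3=Ameet |title=Foundations of Machine Learning |date=2012 |publisher=The MIT Press |isbn=9780262018258}}</ref> An optimal function allows the algorithm to correctly determine the output for inputs that were not part of the training data, that is, the model generalizes well. An algorithm that improves the accuracy of its outputs or predictions over time is said to have learned to perform that task.<ref name="Mitchell-1997" />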
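The iterative optimization of an objective function described above can be illustrated with a minimal sketch. It assumes a linear model, a mean-squared-error objective and plain gradient descent; none of these choices comes from the article, and the data, the learning rate and the names <code>fit</code> and <code>predict</code> are hypothetical.

```python
# Training data as a matrix: each row of X is one feature vector,
# and y holds the desired outputs (the supervisory signal).
# The values are illustrative and follow y = 2*x + 1.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [3.0, 5.0, 7.0, 9.0]


def predict(weights, bias, features):
    """Linear model: weighted sum of the features plus a bias."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def fit(X, y, learning_rate=0.05, steps=5000):
    """Minimise the mean squared error by gradient descent (an assumed objective)."""
    n = len(X)
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        errors = [predict(weights, bias, xi) - yi for xi, yi in zip(X, y)]
        # Gradient of the mean squared error with respect to each parameter.
        grad_w = [2.0 / n * sum(e * xi[j] for e, xi in zip(errors, X))
                  for j in range(n_features)]
        grad_b = 2.0 / n * sum(errors)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


weights, bias = fit(X, y)
# Predict the output for an input that was not part of the training data.
prediction = predict(weights, bias, [5.0])
```

Because the toy data is exactly linear, the learned function also fits the unseen input; on real data, generalization would have to be checked on held-out examples.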

−

~~{{Main|Supervised learning}}~~

−

[[File:Svm max sep hyperplane with margin.png|thumb|A [[support vector machine]] is a supervised learning model that divides the data into regions separated by a [[linear classifier|linear boundary]]. Here, the linear boundary divides the black circles from the white.]]

−

~~A [[support vector machine is a supervised learning model that divides the data into regions separated by a linear boundary. Here, the linear boundary divides the black circles from the white.~~]]

−

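~~A support vector machine is a supervised learning model that divides the data into regions separated by a linear boundary. Here, the linear boundary separates the black circles from the white ones.]~~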

−

~~Supervised learning algorithms build a mathematical model of a set of data that contains both the inputs and the desired outputs.~~<ref>{{cite book |last1=Russell |first1=Stuart J. |last2=Norvig |first2=Peter |title=Artificial Intelligence: A Modern Approach |date=2010 |publisher=Prentice Hall |isbn=9780136042594 |edition=Third|title-link=Artificial Intelligence: A Modern Approach }}</ref> The data is known as [[training data]], and consists of a set of training examples. Each training example has one or more inputs and the desired output, also known as a supervisory signal. In the mathematical model, each training example is represented by an [[array data structure|array]] or vector, sometimes called a feature vector, and the training data is represented by a [[Matrix (mathematics)|matrix]]. Through iterative optimization of an [[Loss function|objective function]], supervised learning algorithms learn a function that can be used to predict the output associated with new inputs.<ref>{{cite book |last1=Mohri |first1=Mehryar |last2=Rostamizadeh |first2=Afshin |last3=Talwalkar |first3=Ameet |title=Foundations of Machine Learning |date=2012 |publisher=The MIT Press |isbn=9780262018258}}</ref> An optimal function will allow the algorithm to correctly determine the output for inputs that were not a part of the training data. An algorithm that improves the accuracy of its outputs or predictions over time is said to have learned to perform that task.<ref name="Mitchell-1997" />

−

Supervised learning algorithms build a mathematical model of a set of data that contains both the inputs and the desired outputs. The data is known as training data, and consists of a set of training examples. Each training example has one or more inputs and the desired output, also known as a supervisory signal. In the mathematical model, each training example is represented by an array or vector, sometimes called a feature vector, and the training data is represented by a matrix. Through iterative optimization of an objective function, supervised learning algorithms learn a function that can be used to predict the output associated with new inputs. An optimal function will allow the algorithm to correctly determine the output for inputs that were not a part of the training data. An algorithm that improves the accuracy of its outputs or predictions over time is said to have learned to perform that task.

−

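Supervised learning algorithms build a mathematical model of a data set that contains both the inputs and the desired outputs. The data is known as training data and consists of a set of training examples. Each training example has one or more inputs and a desired output, also known as a supervisory signal. In the mathematical model, each training example is represented by an array or vector, sometimes called a '''~~feature vector~~''', and the training data is represented by a matrix. Through iterative optimization of an objective function, supervised learning algorithms learn a function that can be used to predict the output associated with new inputs. An optimal function allows the algorithm to correctly determine the output for inputs that were not part of the training data, that is, the model generalizes well. An algorithm that improves the accuracy of its outputs or predictions over time is said to have learned to perform that task.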

+

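Types of supervised learning algorithms include '''active learning''', '''classification''' and '''regression'''.<ref>{{cite book|last=Alpaydin|first=Ethem|title=Introduction to Machine Learning|date=2010|publisher=MIT Press|isbn=978-0-262-01243-0|page=9|url=https://books.google.com/books?id=7f5bBAAAQBAJ&printsec=frontcover#v=onepage&q=classification&f=false}}</ref> Classification algorithms are used when the outputs are restricted to a limited set of values, and regression algorithms are used when the outputs may take any numerical value within a range. For example, for a classification algorithm that filters emails, the input would be an incoming email and the output would be the name of the folder in which to file it.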
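The difference between the two output types can be sketched with a k-nearest-neighbours predictor, used here only as an illustrative technique that the article does not name. The email features (number of links, number of exclamation marks), the folder names and the flat-size and price figures are all hypothetical.

```python
from collections import Counter


def nearest_targets(train, query, k):
    """Return the targets of the k training examples closest to the query."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return [target for _, target in by_distance[:k]]


def classify(train, query, k=3):
    """Classification: the output is one value from a limited set (majority vote)."""
    return Counter(nearest_targets(train, query, k)).most_common(1)[0][0]


def regress(train, query, k=3):
    """Regression: the output is a number within a range (mean of neighbours)."""
    targets = nearest_targets(train, query, k)
    return sum(targets) / len(targets)


# Hypothetical email features: (number of links, number of exclamation marks).
emails = [((8, 5), "spam"), ((7, 6), "spam"), ((9, 4), "spam"),
          ((0, 0), "inbox"), ((1, 1), "inbox"), ((0, 2), "inbox")]
folder = classify(emails, (6, 5))

# Hypothetical regression data: flat size in square metres -> price.
flats = [((50,), 100.0), ((60,), 120.0), ((70,), 140.0), ((200,), 400.0)]
price = regress(flats, (60,))
```

The same neighbour search serves both tasks; only the way the neighbours' targets are combined changes with the output type.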

−

Types of supervised learning algorithms include [[Active learning (machine learning)|Active learning]] , [[Statistical classification|classification]] and [[Regression analysis|regression]].<ref>{{cite book|last=Alpaydin|first=Ethem|title=Introduction to Machine Learning|date=2010|publisher=MIT Press|isbn=978-0-262-01243-0|page=9|url=https://books.google.com/books?id=7f5bBAAAQBAJ&printsec=frontcover#v=onepage&q=classification&f=false}}</ref> Classification algorithms are used when the outputs are restricted to a limited set of values, and regression algorithms are used when the outputs may have any numerical value within a range. As an example, for a classification algorithm that filters emails, the input would be an incoming email, and the output would be the name of the folder in which to file the email.

−

Types of supervised learning algorithms include Active learning , classification and regression. Classification algorithms are used when the outputs are restricted to a limited set of values, and regression algorithms are used when the outputs may have any numerical value within a range. As an example, for a classification algorithm that filters emails, the input would be an incoming email, and the output would be the name of the folder in which to file the email.

−

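Types of supervised learning algorithms include '''active learning''', '''classification''' and '''regression'''. Classification algorithms are used when the outputs are restricted to a limited set of values, and regression algorithms are used when the outputs may take any numerical value within a range. For example, for a classification algorithm that filters emails, the input would be an incoming email and the output would be the name of the folder in which to file it.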

−

[[Similarity learning]] is an area of supervised machine learning closely related to regression and classification, but the goal is to learn from examples using a similarity function that measures how similar or related two objects are. It has applications in [[ranking]], [[recommendation systems]], visual identity tracking, face verification, and speaker verification.

−

Similarity learning is an area of supervised machine learning closely related to regression and classification, but the goal is to learn from examples using a similarity function that measures how similar or related two objects are. It has applications in ranking, recommendation systems, visual identity tracking, face verification, and speaker verification.

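'''Similarity learning''' is an area of supervised machine learning closely related to regression and classification, but its goal is to learn from examples a similarity function that measures how similar or related two objects are. It has applications in ranking, recommendation systems, visual identity tracking, face verification, and '''speaker verification'''.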

−

~~==== 无监督学习 Unsupervised learning ====~~

−

Unsupervised learning algorithms take a set of data that contains only inputs, and find structure in the data, like grouping or clustering of data points. The algorithms, therefore, learn from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning algorithms identify commonalities in the data and react based on the presence or absence of such commonalities in each new piece of data. A central application of unsupervised learning is in the field of [[density estimation]] in [[statistics]], such as finding the [[probability density function]].<ref name="JordanBishop2004">{{cite book |first1=Michael I. |last1=Jordan |first2=Christopher M. |last2=Bishop |chapter=Neural Networks |editor=Allen B. Tucker |title=Computer Science Handbook, Second Edition (Section VII: Intelligent Systems) |location=Boca Raton, Florida |publisher=Chapman & Hall/CRC Press LLC |year=2004 |isbn=978-1-58488-360-9 }}</ref> ~~Though unsupervised learning encompasses other domains involving summarizing and explaining data features.~~

+

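==== Unsupervised learning ====

+

'''Unsupervised learning''' algorithms take a set of data that contains only inputs and find structure in the data, such as grouping or clustering of data points. The algorithms therefore learn from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning algorithms identify commonalities in the data and react based on the presence or absence of such commonalities in each new piece of data. A central application of unsupervised learning is in the field of density estimation in statistics, such as finding the probability density function.<ref name="JordanBishop2004">{{cite book |first1=Michael I. |last1=Jordan |first2=Christopher M. |last2=Bishop |chapter=Neural Networks |editor=Allen B. Tucker |title=Computer Science Handbook, Second Edition (Section VII: Intelligent Systems) |location=Boca Raton, Florida |publisher=Chapman & Hall/CRC Press LLC |year=2004 |isbn=978-1-58488-360-9 }}</ref> Unsupervised learning also encompasses other domains, such as summarizing and explaining data features.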

−

Unsupervised learning algorithms take a set of data that contains only inputs, and find structure in the data, like grouping or clustering of data points. The algorithms, therefore, learn from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning algorithms identify commonalities in the data and react based on the presence or absence of such commonalities in each new piece of data. A central application of unsupervised learning is in the field of density estimation in statistics, such as finding the probability density function. Though unsupervised learning encompasses other domains involving summarizing and explaining data features.

−

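'''Unsupervised learning''' algorithms take a set of data that contains only inputs and find structure in the data, such as grouping or clustering of data points. The algorithms therefore learn from test data that has not been labeled, classified or categorized. Instead of responding to feedback, unsupervised learning algorithms identify commonalities in the data and react based on the presence or absence of such commonalities in each new piece of data. A central application of unsupervised learning is in the field of density estimation in statistics, such as finding the probability density function. Unsupervised learning also encompasses other domains, such as summarizing and explaining data features.

=====Clustering=====

−

~~:''Main article: [https://en.wikipedia.org/wiki/Cluster_analysis Cluster analysis]''~~

+

Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to one or more predesignated criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions about the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness (the similarity between members of the same cluster) and by the separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of [[无监督学习|unsupervised learning]] and a common technique for statistical [[数据分析|data analysis]].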

−

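Cluster analysis is the assignment of a set of observations into subsets (called clusters) so that observations within the same cluster are similar according to one or more predesignated criteria, while observations drawn from different clusters are dissimilar. Different clustering techniques make different assumptions about the structure of the data, often defined by some similarity metric and evaluated, for example, by internal compactness (the similarity between members of the same cluster) and by the separation between different clusters. Other methods are based on estimated density and graph connectivity. Clustering is a method of [~~https://en.wikipedia.org/wiki/Unsupervised_learning~~ unsupervised learning] ~~and a common technique for~~ [~~https://en.wikipedia.org/wiki/Statistics statistical]~~ [~~https://en.wikipedia.org/wiki/Data_analysis~~ data analysis].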
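As a rough illustration of grouping observations by a similarity metric, the following sketch runs k-means with squared Euclidean distance. The article does not single out k-means; it is used here as one common example, and the points, the number of clusters and the naive initialization from the first k points are assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance, used as the (dis)similarity measure."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100):
    # Assumption: initialise the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assign every point to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
labels, centroids = k_means(points, k=2)
```

Each point ends up with the label of the compact group it belongs to; no labels were supplied, which is what makes the procedure unsupervised.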

−

~~==== 半监督学习 Semi-supervised learning ====~~

−

~~:''Main article: [[半监督学习|Semi-supervised learning]]''~~

−

Semi-supervised learning falls between [[unsupervised learning]] (without any labeled training data) and [[supervised learning]] (with completely labeled training data). Some of the training examples are missing training labels, yet many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce a considerable improvement in learning accuracy.

−

Semi-supervised learning falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data). Some of the training examples are missing training labels, yet many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce a considerable improvement in learning accuracy.

+

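==== Semi-supervised learning ====

'''Semi-supervised learning''' falls between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data). Some of the training examples are missing training labels, yet many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce a considerable improvement in learning accuracy.

+

In '''weak supervision''', the training labels are noisy, limited, or imprecise; however, these labels are often cheaper to obtain, resulting in larger effective training sets.<ref>{{Cite web|url=https://hazyresearch.github.io/snorkel/blog/ws_blog_post.html|title=Weak Supervision: The New Programming Paradigm for Machine Learning|author1=Alex Ratner |author2=Stephen Bach |author3=Paroma Varma |author4=Chris |others= referencing work by many other members of Hazy Research|website=hazyresearch.github.io|access-date=2019-06-06}}</ref>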

−

In [[Weak supervision|weakly supervised learning]], the training labels are noisy, limited, or imprecise; however, these labels are often cheaper to obtain, resulting in larger effective training sets.<ref>{{Cite web|url=https://hazyresearch.github.io/snorkel/blog/ws_blog_post.html|title=Weak Supervision: The New Programming Paradigm for Machine Learning|author1=Alex Ratner |author2=Stephen Bach |author3=Paroma Varma |author4=Chris |others= referencing work by many other members of Hazy Research|website=hazyresearch.github.io|access-date=2019-06-06}}</ref>

−

~~In weakly supervised learning, the training labels are noisy, limited, or imprecise; however, these labels are often cheaper to obtain, resulting in larger effective training sets.~~

−

In '''weak supervision''', the training labels are noisy, limited, or imprecise; however, these labels are often cheaper to obtain, resulting in larger effective training sets.

−

~~==== 强化学习 Reinforcement learning ====~~

−

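~~:''Main article: [[强化学习|Reinforcement learning]]''~~

−

Reinforcement learning concerns how an ''agent'' ought to take ''actions'' in an ''environment'' so as to maximize some notion of long-term ''reward''. Reinforcement learning algorithms attempt to find a ''policy'' that maps ''states'' of the world to the actions the agent ought to take in those states. Reinforcement learning differs from the [https://en.wikipedia.org/wiki/Supervised_learning supervised learning] problem in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.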

−

Reinforcement learning is an area of machine learning concerned with how [[software agent]]s ought to take [[Action selection|actions]] in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as [[game theory]], [[control theory]], [[operations research]], [[information theory]], [[simulation-based optimization]], [[multi-agent system]]s, [[swarm intelligence]], [[statistics]] and [[genetic algorithm]]s. In machine learning, the environment is typically represented as a [[Markov Decision Process]] (MDP). Many reinforcement learning algorithms use [[dynamic programming]] techniques.<ref>{{Cite book|title=Reinforcement learning and markov decision processes|author1=van Otterlo, M.|author2=Wiering, M.|journal=Reinforcement Learning |volume=12|pages=3–42 |year=2012 |doi=10.1007/978-3-642-27645-3_1|series=Adaptation, Learning, and Optimization|isbn=978-3-642-27644-6}}</ref> Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP, and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.

−

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms. In machine learning, the environment is typically represented as a Markov Decision Process (MDP). Many reinforcement learning algorithms use dynamic programming techniques. Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP, and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.

−

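Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as '''game theory''', '''control theory''', '''operations research''', '''information theory''', '''simulation-based optimization''', '''multi-agent systems''', '''swarm intelligence''', '''statistics''' and '''genetic algorithms'''. In machine learning, the environment is typically represented as a '''Markov decision process (MDP)'''. Many reinforcement learning algorithms use dynamic programming techniques. Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.

−

==== ~~Self-learning~~ ====

+

==== Reinforcement learning ====

+

Reinforcement learning concerns how an ''agent'' ought to take ''actions'' in an ''environment'' so as to maximize some notion of long-term ''reward''. Reinforcement learning algorithms attempt to find a ''policy'' that maps ''states'' of the world to the actions the agent ought to take in those states. Reinforcement learning differs from the [[监督学习|supervised learning]] problem in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.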

−

Self-learning as machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning named Crossbar Adaptive Array (CAA). <ref> Bozinovski, S. (1982). "A self-learning system using secondary reinforcement" . In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402. {{ISBN|978-0-444-86488-8}}.</ref> It is a learning with no external rewards and no external teacher advices. The CAA self-learning algorithm computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion. <ref>Bozinovski, Stevo (2014) "Modeling mechanisms of cognition-emotion interaction in artificial neural networks, since 1981." Procedia Computer Science p. 255-263 </ref>

−

~~Self~~-~~learning as machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning named Crossbar Adaptive Array~~ (~~CAA~~)~~. It is a~~ learning ~~with no external rewards~~ and ~~no external teacher advices~~. ~~The CAA self~~-~~learning algorithm computes~~, ~~in a crossbar fashion~~, ~~both decisions about actions~~ and ~~emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion.~~

+

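Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as [[博弈论|game theory]], [[控制理论|control theory]], [[运筹学|operations research]], [[信息论|information theory]], '''simulation-based optimization''', [[多主体系统|multi-agent systems]], '''swarm intelligence''', statistics and [[遗传算法|genetic algorithms]]. In machine learning, the environment is typically represented as a '''Markov decision process (MDP)'''. Many reinforcement learning algorithms use dynamic programming techniques.<ref>{{Cite book|title=Reinforcement learning and markov decision processes|author1=van Otterlo, M.|author2=Wiering, M.|journal=Reinforcement Learning |volume=12|pages=3–42 |year=2012 |doi=10.1007/978-3-642-27645-3_1|series=Adaptation, Learning, and Optimization|isbn=978-3-642-27644-6}}</ref> Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.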
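The dynamic programming building block mentioned above can be sketched with value iteration on a tiny MDP. Unlike the reinforcement learning setting the paragraph describes, this sketch assumes the transition model is fully known, so it shows only the planning step. The four-state corridor, the reward of 1 for reaching the goal and the discount factor are hypothetical.

```python
GAMMA = 0.9                    # discount factor (assumed)
STATES = [0, 1, 2, 3]          # state 3 is a terminal goal state
ACTIONS = ["left", "right"]


def step(state, action):
    """Known deterministic model of the MDP: returns (next_state, reward)."""
    next_state = max(state - 1, 0) if action == "left" else min(state + 1, 3)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward


def value_iteration(tolerance=1e-9):
    """Repeatedly apply the Bellman optimality update until values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES[:-1]:  # the terminal state keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tolerance:
            return values


def greedy_policy(values):
    """Map each non-terminal state to the action with the best one-step lookahead."""
    policy = {}
    for s in STATES[:-1]:
        policy[s] = max(
            ACTIONS,
            key=lambda a: step(s, a)[1] + GAMMA * values[step(s, a)[0]],
        )
    return policy


values = value_iteration()
policy = greedy_policy(values)
```

The resulting policy maps every state to the action that maximizes cumulative discounted reward; model-free methods aim at the same mapping using sampled experience instead of the <code>step</code> model.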

−

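Self-learning as a machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning named the '''crossbar adaptive array (CAA)'''. It is learning with no external rewards and no external teacher advice. The CAA self-learning algorithm computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion.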

−

~~The~~ self-learning ~~algorithm updates a memory matrix W =||w~~(a,s)~~|| such that~~ in ~~each iteration executes the following machine learning routine:~~

+

==== Self-learning ====

+

Self-learning as a machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning named the '''crossbar adaptive array (CAA)'''.<ref>Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402.</ref> It is learning with no external rewards and no external teacher advice. The CAA self-learning algorithm computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion.<ref>Bozinovski, Stevo (2014) "Modeling mechanisms of cognition-emotion interaction in artificial neural networks, since 1981." Procedia Computer Science p. 255-263 </ref>

−

~~The self-learning algorithm updates a memory matrix W =||w(a,s)|| such that in each iteration executes the following machine learning routine:~~

−

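~~The self-learning algorithm updates a memory matrix W = ||w(a,s)|| such that in each iteration it executes the following machine learning routine:~~

+

The self-learning algorithm updates a memory matrix W = ||w(a,s)|| such that in each iteration it executes the following machine learning routine: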

−

~~In situation s perform action a;~~

−

~~In situation s perform action a;~~

In situation s, perform action a;

−

~~Receive consequence situation s’;~~

−

~~Receive consequence situation s’;~~

Receive the consequence situation s’;

−

~~Compute emotion of being in consequence situation v(s’);~~

−

~~Compute emotion of being in consequence situation v(s’);~~

Compute the emotion of being in the consequence situation, v(s’);

−

~~Update crossbar memory w’(a,s) = w(a,s) + v(s’).~~

−

~~Update crossbar memory w’(a,s) = w(a,s) + v(s’).~~

Update the crossbar memory: w’(a,s) = w(a,s) + v(s’).
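The four steps of the routine can be sketched as a loop over a small memory matrix. This is only an illustration under stated assumptions, not Bozinovski's implementation: the three-cell corridor, the fixed emotion table <code>EMOTION</code> standing in for v(s’), and the greedy action choice over the crossbar column are all hypothetical.

```python
ACTIONS = [0, 1]          # hypothetical: 0 = move left, 1 = move right
SITUATIONS = [0, 1, 2]    # hypothetical three-cell corridor

# Assumption: a fixed emotion toward each situation stands in for v(s').
# Cell 2 is desirable, cell 0 is undesirable.
EMOTION = {0: -1.0, 1: 0.0, 2: 1.0}


def environment(situation, action):
    """Toy behavioural environment: returns the consequence situation s'."""
    return max(situation - 1, 0) if action == 0 else min(situation + 1, 2)


def choose_action(w, situation):
    # Assumption: act greedily on the crossbar column of the current situation.
    return max(ACTIONS, key=lambda a: w[a][situation])


def caa_step(w, situation):
    action = choose_action(w, situation)          # in situation s perform action a
    consequence = environment(situation, action)  # receive consequence situation s'
    emotion = EMOTION[consequence]                # compute emotion v(s')
    w[action][situation] += emotion               # w'(a,s) = w(a,s) + v(s')
    return consequence


# Memory matrix W = ||w(a,s)||, indexed as w[action][situation].
w = [[0.0 for _ in SITUATIONS] for _ in ACTIONS]
situation = 1
for _ in range(5):
    situation = caa_step(w, situation)
```

After a few iterations the entry for moving right from the middle cell has grown while the entries for moving left have shrunk, so the greedy choice drifts toward the desirable situation without any separate reward input.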

+

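It is a system with only one input, situation s, and only one output, action (or behavior) a. There is neither a separate reinforcement input nor an advice input from the environment. The backpropagated value (secondary reinforcement) is the emotion toward the consequence situation. The CAA exists in two environments: one is the behavioral environment where it behaves, and the other is the genetic environment, from which it receives, initially and only once, initial emotions about situations to be encountered in the behavioral environment. After receiving the genome (species) vector from the genetic environment, the CAA learns a goal-seeking behavior in an environment that contains both desirable and undesirable situations.<ref> Bozinovski, S. (2001) "Self-learning agents: A connectionist theory of emotion based on crossbar value judgment." Cybernetics and Systems 32(6) 637-667. </ref>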

−

It is a system with only one input, situation s, and only one output, action (or behavior) a. There is neither a separate reinforcement input nor an advice input from the environment. The backpropagated value (secondary reinforcement) is the emotion toward the consequence situation. The CAA exists in two environments, one is behavioral environment where it behaves, and the other is genetic environment, wherefrom it initially and only once receives initial emotions about situations to be encountered in the behavioral environment. After receiving the genome (species) vector from the genetic environment, the CAA learns a goal seeking behavior, in an environment that contains both desirable and undesirable situations. <ref> Bozinovski, S. (2001) "Self-learning agents: A connectionist theory of emotion based on crossbar value judgment." Cybernetics and Systems 32(6) 637-667. </ref>

−

It is a system with only one input, situation s, and only one output, action (or behavior) a. There is neither a separate reinforcement input nor an advice input from the environment. The backpropagated value (secondary reinforcement) is the emotion toward the consequence situation. The CAA exists in two environments, one is behavioral environment where it behaves, and the other is genetic environment, wherefrom it initially and only once receives initial emotions about situations to be encountered in the behavioral environment. After receiving the genome (species) vector from the genetic environment, the CAA learns a goal seeking behavior, in an environment that contains both desirable and undesirable situations.

−

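It is a system with only one input, situation s, and only one output, action (or behavior) a. There is neither a separate reinforcement input nor an advice input from the environment. The backpropagated value (secondary reinforcement) is the emotion toward the consequence situation. The CAA exists in two environments: one is the behavioral environment where it behaves, and the other is the genetic environment, from which it receives, initially and only once, initial emotions about situations to be encountered in the behavioral environment. After receiving the genome (species) vector from the genetic environment, the CAA learns a goal-seeking behavior in an environment that contains both desirable and undesirable situations.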

−

==== 特征学习 ~~Feature learning~~ ====

+

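==== Feature learning ====

−

~~:''Main article: [https://en.wikipedia.org/wiki/Representation_learning Representation learning]''~~

+

Several learning algorithms, mostly [[无监督学习|unsupervised learning]] algorithms, aim at discovering better representations of the inputs provided during training. Classic examples include [[主成分分析|principal component analysis]] and [[聚类分析|cluster analysis]]. Representation learning algorithms often attempt to preserve the information in their input while transforming it in a way that makes it useful, often as a pre-processing step before performing classification or prediction. This allows reconstruction of inputs coming from the unknown data-generating distribution, while not necessarily being faithful to configurations that are implausible under that distribution.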

−

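~~Several learning algorithms, mostly~~ [~~https://en.wikipedia.org/wiki/Unsupervised_learning~~ unsupervised learning] algorithms, aim at discovering better representations of the inputs provided during training. Classic examples include [~~https://en.wikipedia.org/wiki/Principal_component_analysis~~ principal component analysis] and [~~https://en.wikipedia.org/wiki/Cluster_analysis~~ cluster analysis]. Representation learning algorithms often attempt to preserve the information in their input while transforming it in a way that makes it useful, often as a pre-processing step before performing classification or prediction. This allows reconstruction of inputs coming from the unknown data-generating distribution, while not necessarily being faithful to configurations that are implausible under that distribution.

[https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Manifold_learning_algorithms Manifold learning] algorithms attempt to do so under the constraint that the learned representation is low-dimensional. [https://en.wikipedia.org/wiki/Neural_coding#Sparse_coding Sparse coding] algorithms attempt to do so under the constraint that the learned representation is sparse (has many zeros). [https://en.wikipedia.org/wiki/Multilinear_subspace_learning Multilinear subspace learning] algorithms aim to learn low-dimensional representations directly from [https://en.wikipedia.org/wiki/Tensor tensor] representations of multidimensional data, without reshaping them into (high-dimensional) vectors.<ref>{{cite journal |first1=Haiping |last1=Lu |first2=K.N. |last2=Plataniotis |first3=A.N. |last3=Venetsanopoulos |url=http://www.dsp.utoronto.ca/~haiping/Publication/SurveyMSL_PR2011.pdf |title=A Survey of Multilinear Subspace Learning for Tensor Data |journal=Pattern Recognition |volume=44 |number=7 |pages=1540–1551 |year=2011 }}</ref> Deep learning algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.<ref>{{cite book

Line 417: Line 343:

−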

Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labeled input data. Examples include [[artificial neural network]]s, [[multilayer perceptron]]s, and supervised [[dictionary learning]]. In unsupervised feature learning, features are learned with unlabeled input data. Examples include dictionary learning, [[independent component analysis]], [[autoencoder]]s, [[matrix decomposition|matrix factorization]]<ref>{{cite conference |author1=Nathan Srebro |author2=Jason D. M. Rennie |author3=Tommi S. Jaakkola |title=Maximum-Margin Matrix Factorization |conference=[[Conference on Neural Information Processing Systems|NIPS]] |year=2004}}</ref> and various forms of [[Cluster analysis|clustering]].<ref name="coates2011">{{cite conference

+

Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labeled input data. Examples include [[artificial neural network]]s, [[multilayer perceptron]]s, and supervised [[dictionary learning]]. In unsupervised feature learning, features are learned with unlabeled input data. Examples include dictionary learning, [[independent component analysis]], [[autoencoder]]s, [[matrix decomposition|matrix factorization]]<ref>{{cite conference |author1=Nathan Srebro |author2=Jason D. M. Rennie |author3=Tommi S. Jaakkola |title=Maximum-Margin Matrix Factorization |conference=[[Conference on Neural Information Processing Systems|NIPS]] |year=2004}}</ref> and various forms of [[Cluster analysis|clustering]].<ref name="coates2011">{{cite conference|last1 = Coates|first1 = Adam

−

+

|last2 = Lee|first2 = Honglak|last3 = Ng|first3 = Andrew Y.|title = An analysis of single-layer networks in unsupervised feature learning|conference = Int'l Conf. on AI and Statistics (AISTATS)|year = 2011|url = http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS2011_CoatesNL11.pdf|access-date = 2018-11-25|archive-url = https://web.archive.org/web/20170813153615/http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS2011_CoatesNL11.pdf|archive-date = 2017-08-13

−

~~Feature learning can be either supervised or~~ unsupervised~~. In supervised~~ feature learning~~, features are learned using labeled input data~~. ~~Examples include artificial neural networks, multilayer perceptrons,~~ and ~~supervised dictionary learning~~. ~~In unsupervised feature learning, features are learned~~ with ~~unlabeled input data~~. ~~Examples include dictionary learning, independent component analysis, autoencoders, matrix factorization and various forms of clustering~~.<ref name="~~coates2011~~">{{cite ~~conference~~

+

|url-status = dead}}</ref><ref>{{cite conference |last1 = Csurka |first1 = Gabriella|last2 = Dance |first2 = Christopher C.|last3 = Fan |first3 = Lixin|last4 = Willamowski |first4 = Jutta|last5 = Bray |first5 = Cédric|title = Visual categorization with bags of keypoints|conference = ECCV Workshop on Statistical Learning in Computer Vision|year = 2004|url = https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/csurka-eccv-04.pdf}}</ref><ref name="jurafsky">{{cite book |title=Speech and Language Processing |author1=Daniel Jurafsky |author2=James H. Martin |publisher=Pearson Education International |year=2009 |pages=145–146}}</ref>

−

~~|last1 = Coates~~

−

~~|first1 = Adam~~

−

~~|last2 = Lee~~

−

~~|first2 = Honglak~~

−

~~|last3 = Ng~~

−

~~|first3 = Andrew Y.~~

−

~~|title = An analysis of single-layer networks in unsupervised feature learning~~

−

~~|conference = Int'l Conf. on AI and Statistics (AISTATS)~~

−

~~|year = 2011~~

−

~~|url = http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS2011_CoatesNL11.pdf~~

−

~~|access-date = 2018-11-25~~

−

~~|archive-url = https://web.archive.org/web/20170813153615/http://machinelearning.wustl.edu/mlpapers/paper_files/AISTATS2011_CoatesNL11.pdf~~

−

~~|archive-date = 2017-08-13~~

−

~~|url-status = dead~~

−

}}</ref><ref>{{cite conference |last1 = Csurka |first1 = Gabriella|last2 = Dance |first2 = Christopher C.|last3 = Fan |first3 = Lixin|last4 = Willamowski |first4 = Jutta|last5 = Bray |first5 = Cédric|title = Visual categorization with bags of keypoints|conference = ECCV Workshop on Statistical Learning in Computer Vision|year = 2004|url = https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/csurka-eccv-04.pdf}}</ref><ref name="jurafsky">{{cite book |title=Speech and Language Processing |author1=Daniel Jurafsky |author2=James H. Martin |publisher=Pearson Education International |year=2009 |pages=145–146}}</ref>

−

~~}}</ref>~~


Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labeled input data. Examples include '''artificial neural networks (ANN)''', '''multilayer perceptrons (MLP)''' and supervised dictionary learning. In unsupervised feature learning, features are learned with unlabeled input data. Examples include '''dictionary learning''', '''independent component analysis''', '''autoencoders''', '''matrix factorization''' and various forms of clustering.

−

[[Manifold ~~learning]] algorithms attempt to do so under the constraint that the learned representation is low-dimensional. [[~~Sparse ~~coding]] algorithms attempt to do so under the constraint that the learned representation is sparse, meaning that the mathematical model has many zeros. [[~~Multilinear ~~subspace learning]] algorithms aim to learn low-dimensional representations directly from [[tensor]] representations for multidimensional data, without reshaping them into higher-dimensional vectors.~~<ref>{{cite journal |first1=Haiping |last1=Lu |first2=K.N. |last2=Plataniotis |first3=A.N. |last3=Venetsanopoulos |url=http://www.dsp.utoronto.ca/~haiping/Publication/SurveyMSL_PR2011.pdf |title=A Survey of Multilinear Subspace Learning for Tensor Data |journal=Pattern Recognition |volume=44 |number=7 |pages=1540–1551 |year=2011 |doi=10.1016/j.patcog.2011.01.004}}</ref> [[Deep learning]] algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.<ref>{{cite book | title = Learning Deep Architectures for AI | author = Yoshua Bengio | publisher = Now Publishers Inc. | year = 2009 | isbn = 978-1-60198-294-0 | pages = 1–3 | url = https://books.google.com/books?id=cq5ewg7FniMC&pg=PA3| author-link = Yoshua Bengio }}</ref>

+

'''Manifold learning''' algorithms attempt to learn a representation under the constraint that it is low-dimensional. '''Sparse coding''' algorithms attempt to do so under the constraint that the learned representation is sparse, meaning that the mathematical model has many '''zeros'''. '''Multilinear subspace learning''' algorithms aim to learn low-dimensional representations directly from tensor representations of multidimensional data, without reshaping them into higher-dimensional vectors.<ref>{{cite journal |first1=Haiping |last1=Lu |first2=K.N. |last2=Plataniotis |first3=A.N. |last3=Venetsanopoulos |url=http://www.dsp.utoronto.ca/~haiping/Publication/SurveyMSL_PR2011.pdf |title=A Survey of Multilinear Subspace Learning for Tensor Data |journal=Pattern Recognition |volume=44 |number=7 |pages=1540–1551 |year=2011 |doi=10.1016/j.patcog.2011.01.004}}</ref> '''Deep learning''' algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.<ref>{{cite book | title = Learning Deep Architectures for AI | author = Yoshua Bengio | publisher = Now Publishers Inc. | year = 2009 | isbn = 978-1-60198-294-0 | pages = 1–3 | url = https://books.google.com/books?id=cq5ewg7FniMC&pg=PA3| author-link = Yoshua Bengio }}</ref>

−

Manifold learning algorithms attempt to do so under the constraint that the learned representation is low-dimensional. Sparse coding algorithms attempt to do so under the constraint that the learned representation is sparse, meaning that the mathematical model has many zeros. Multilinear subspace learning algorithms aim to learn low-dimensional representations directly from tensor representations for multidimensional data, without reshaping them into higher-dimensional vectors. Deep learning algorithms discover multiple levels of representation, or a hierarchy of features, with higher-level, more abstract features defined in terms of (or generating) lower-level features. It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data.

−

'''流形学习 Manifold Learning'''算法试图在学习表示为低维的约束条件下进行流形学习。'''稀疏编码算法 Sparse Coding Algorithms'''试图在学习表示为稀疏的约束条件下进行编码，这意味着数学模型有许多'''零点 Zeros'''。'''多线性子空间学习算法 Multilinear Subspace Learning Algorithms'''旨在直接从多维数据的张量表示中学习低维的表示，而不是将它们重塑为高维向量。'''深度学习算法 Deep Learning Algorithms'''发现了多层次的表示，或者是一个特征层次结构，具有更高层次、更抽象的特征，这些特征定义为（或可以生成）低层次的特征。有人认为，一个智能机器的表现是可以学习到一种表示的方法，并能够解释数据观测值变化背后的机理或潜在影响。

+

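Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process. However, real-world data such as images, video, and sensory data have not yielded to attempts to define specific features algorithmically. An alternative is to discover such features or representations by examining the data, without relying on explicit algorithms.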

+

===== Sparse dictionary learning =====

−

~~Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process~~. ~~However, real-world data such as images, video, and sensory data has not yielded to attempts to algorithmically define specific features~~. ~~An alternative is to discover such features or representations through examination~~, ~~without relying on explicit algorithms.~~

+

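Sparse dictionary learning is a feature learning method in which a datum is represented as a linear combination of [https://en.wikipedia.org/wiki/Basis_function basis functions], with the coefficients assumed to be sparse. Let <math>x</math> be a <math>d</math>-dimensional datum and <math>D</math> a <math>d \times n</math> matrix in which each column represents a basis function, and let <math>r</math> be the coefficient vector that represents <math>x</math> in terms of <math>D</math>. Mathematically, sparse dictionary learning means solving <math>x \approx Dr</math> where <math>r</math> is sparse. In general, <math>n</math> is assumed to be larger than <math>d</math> so that a sparse representation is possible.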
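
The relation <math>x \approx Dr</math> with a sparse <math>r</math> can be illustrated with a minimal sketch. It assumes the dictionary is already fixed and has unit-norm columns, and it uses greedy matching pursuit to pick the coefficients; this is only the sparse coding step, not dictionary learning itself, and the function name and toy data are hypothetical.

```python
import math


def matching_pursuit(x, atoms, n_nonzero):
    """Greedily approximate x with at most n_nonzero atoms.

    atoms is a list of n unit-norm vectors of length d (the columns of D).
    Returns the coefficient vector r of length n.
    """
    residual = list(x)
    r = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlation of every atom with what is still unexplained.
        scores = [sum(a_i * e_i for a_i, e_i in zip(atom, residual))
                  for atom in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        if abs(scores[k]) < 1e-12:
            break  # nothing left to explain
        r[k] += scores[k]
        residual = [e_i - scores[k] * a_i
                    for e_i, a_i in zip(residual, atoms[k])]
    return r


# Toy overcomplete dictionary: d = 2, n = 3 (assumed data, for illustration).
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]
x = [3.0, 3.0]

r = matching_pursuit(x, atoms, n_nonzero=1)
# Reconstruct D r from the chosen coefficients.
approx = [sum(r[j] * atoms[j][i] for j in range(len(atoms)))
          for i in range(len(x))]
```

Because <math>n > d</math>, the diagonal atom can represent this <math>x</math> with a single non-zero coefficient, whereas the two axis-aligned atoms would need two.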

−

Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process. However, real-world data such as images, video, and sensory data has not yielded to attempts to algorithmically define specific features. An alternative is to discover such features or representations through examination, without relying on explicit algorithms.

−

特征学习的动力来自于机器学习任务，如分类中，通常需要数学上和计算上方便处理的输入。然而，真实世界的数据，如图像、视频和感官数据，并没有那么简单就可以用通过算法定义特定特征。另一种方法是通过检查发现这些特征或表示，而不依赖于显式算法。

+

Learning a dictionary together with sparse representations is [https://en.wikipedia.org/wiki/Strongly_NP-hard strongly NP-hard] and is also difficult to solve approximately.<ref>{{cite journal |first=A. M. |last=Tillmann |title=On the Computational Intractability of Exact and Approximate Dictionary Learning |journal=IEEE Signal Processing Letters |volume=22 |issue=1 |year=2015 |pages=45–49 |bibcode=2015ISPL...22...45T |arxiv=1405.6664 }}</ref> A popular heuristic method for sparse dictionary learning is [https://en.wikipedia.org/wiki/K-SVD K-SVD].

−

~~===== 稀疏字典学习 Sparse dictionary learning =====~~

−

~~:''主文章：[https://en.wikipedia.org/wiki/Sparse_dictionary_learning 稀疏字典学习]''~~

−

稀疏词典学习是一种特征学习方法，在这种方法中，将数据表示为[https://en.wikipedia.org/wiki/Basis_function 基函数]的线性组合，并假定系数是稀疏的。设x是d维数据，D是d乘n矩阵，其中D的每一列代表一个基函数，r是用D表示x的系数。数学上，稀疏字典学习意味着求解 <math> x\approx Dr ,r </math>是稀疏的。一般说来，假设n大于d，以便稀疏表示。

−

学习字典和稀疏表示是[https://en.wikipedia.org/wiki/Strongly_NP-hard 强NP难解]的，也很难近似求解<ref>{{cite journal |first=A. M. |last=Tillmann |title=On the Computational Intractability of Exact and Approximate Dictionary Learning |journal=IEEE Signal Processing Letters |volume=22 |issue=1 |year=2015 |pages=45–49 |bibcode;2015ISPL...22...45T |arxiv:1405.6664 }}</ref> 。稀疏字典学习的一种流行的启发式方法是[https://en.wikipedia.org/wiki/K-SVD K-SVD]。

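Sparse dictionary learning has been applied in several contexts. In classification, the problem is to determine which classes a previously unseen datum belongs to. Suppose a dictionary has already been built for each class. A new datum is then assigned to the class whose dictionary gives it the best sparse representation. Sparse dictionary learning has also been applied to [https://en.wikipedia.org/wiki/Image_de-noising image de-noising]. The key idea is that a clean image patch can be sparsely represented by an image dictionary, but the noise cannot.<ref>Aharon, M, M Elad, and A Bruckstein. 2006. "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation." Signal Processing, IEEE Transactions on 54 (11): 4311–4322</ref>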

−

~~==== 异常检测 Anomaly detection ====~~

−

+

==== Anomaly detection ====

+

In data mining, '''anomaly detection''', also known as '''outlier detection''', is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data.<ref name=":0">{{Citation|last=Zimek|first=Arthur|title=Outlier Detection|date=2017|encyclopedia=Encyclopedia of Database Systems|pages=1–5|publisher=Springer New York|language=en|doi=10.1007/978-1-4899-7993-3_80719-1|isbn=9781489979933|last2=Schubert|first2=Erich}}</ref> Typically, the anomalous items represent an issue such as bank fraud, a structural defect, medical problems or errors in a text. Anomalies are also referred to as ''outliers'', ''novelties'', ''noise'', ''deviations'' and ''exceptions''.<ref>{{cite journal | last1 = Hodge | first1 = V. J. | last2 = Austin | first2 = J. | doi = 10.1007/s10462-004-4304-y | title = A Survey of Outlier Detection Methodologies | journal = Artificial Intelligence Review| volume = 22 | issue = 2 | pages = 85–126 | year = 2004 | url = http://eprints.whiterose.ac.uk/767/1/hodgevj4.pdf| pmid = | pmc = | citeseerx = 10.1.1.318.4023 }}</ref>

−

In [[data mining]], anomaly detection, also known as outlier detection, is the identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data.<ref name=":0">{{Citation|last=Zimek|first=Arthur|title=Outlier Detection|date=2017|encyclopedia=Encyclopedia of Database Systems|pages=1–5|publisher=Springer New York|language=en|doi=10.1007/978-1-4899-7993-3_80719-1|isbn=9781489979933|last2=Schubert|first2=Erich}}</ref> Typically, the anomalous items represent an issue such as [[bank fraud]], a structural defect, medical problems or errors in a text. Anomalies are referred to as [[outlier]]s, novelties, noise, deviations and exceptions.<ref>{{cite journal | last1 = Hodge | first1 = V. J. | last2 = Austin | first2 = J. | doi = 10.1007/s10462-004-4304-y | title = A Survey of Outlier Detection Methodologies | journal = Artificial Intelligence Review| volume = 22 | issue = 2 | pages = 85–126 | year = 2004 | url = http://eprints.whiterose.ac.uk/767/1/hodgevj4.pdf| pmid = | pmc = | citeseerx = 10.1.1.318.4023 }}</ref>

−

~~In data mining, anomaly detection, also known as outlier detection, is the identification~~ of ~~rare items, events or observations which raise suspicions by differing significantly from the majority of the data~~. ~~Typically, the anomalous items represent an issue such as bank fraud, a structural defect, medical problems or errors in a text~~. ~~Anomalies are referred to as outliers, novelties, noise, deviations and exceptions~~.

+

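In particular, in the context of abuse and network intrusion detection, the interesting objects are often not rare objects but unexpected bursts of activity. This pattern does not adhere to the common statistical definition of an outlier as a rare object, and many outlier detection methods (in particular, unsupervised algorithms) will fail on such data unless it has been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the micro-clusters formed by these patterns.<ref>{{cite journal | last1 = Hodge | first1 = V. J. | last2 = Austin | first2 = J. | doi = 10.1007/s10462-004-4304-y | title = A Survey of Outlier Detection Methodologies | journal = Artificial Intelligence Review| volume = 22 | issue = 2 | pages = 85–126 | year = 2004 | url = http://eprints.whiterose.ac.uk/767/1/hodgevj4.pdf| pmid = | pmc = | citeseerx = 10.1.1.318.4023 }}</ref>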

−

在数据挖掘中，'''异常检测 Anomaly / Outlier detection'''是指识别那些引起怀疑的稀有项目、事件或者观测结果，它们与其他大多数数据有很大的不同。一般来说，这些不正常的项目都可以反映出数据背后的一个问题，如银行欺诈、结构缺陷、医疗问题或文本中的错误。异常也被称为''异常值 Outliers''、''奇异值 Novelties''、''噪音 Noise''、''偏差 Deviations''和''异常 Exceptions''。

+

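Three broad categories of anomaly detection techniques exist.<ref name="ChandolaSurvey">{{cite journal |last1=Chandola |first1=V. |last2=Banerjee |first2=A. |last3=Kumar |first3=V. |year=2009 |title=Anomaly detection: A survey|journal=[[ACM Computing Surveys]]|volume=41|issue=3|pages=1–58|doi=10.1145/1541880.1541882|url=https://www.semanticscholar.org/paper/71d1ac92ad36b62a04f32ed75a10ad3259a7218d }}</ref> Unsupervised anomaly detection techniques detect anomalies in an unlabeled test data set, under the assumption that the majority of the instances in the data set are normal, by looking for the instances that fit the remainder of the data set least well. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involve training a classifier (the key difference from many other statistical classification problems is the inherently unbalanced nature of outlier detection). Semi-supervised anomaly detection techniques construct a model representing normal behavior from a given normal training data set, and then test the likelihood that a test instance was generated by the model.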
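
The unsupervised case can be sketched as follows. This is one illustrative scoring rule among many, not a method prescribed by the survey cited above: each point is scored by the distance to its k-th nearest neighbour, so points that fit the rest of the data least well receive the largest scores. The function name, the choice of k and the toy data are assumptions.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour.

    Larger scores mean the point lies further from the bulk of the data.
    Requires len(points) > k.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q)
                       for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Unlabeled toy data: four points close together and one far away.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = knn_outlier_scores(data, k=2)
most_anomalous = max(range(len(data)), key=scores.__getitem__)
```

A score of this kind only works under the stated assumption that most instances are normal. A burst of similar anomalous points would sit close to one another and receive low scores.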

−

In particular, in the context of abuse and network intrusion detection, the interesting objects are often not rare objects, but unexpected bursts in activity. This pattern does not adhere to the common statistical definition of an outlier as a rare object, and many outlier detection methods (in particular, unsupervised algorithms) will fail on such data, unless it has been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the micro-clusters formed by these patterns.<ref>{{cite journal| first=Paul | last=Dokas | first2=Levent |last2=Ertoz |first3=Vipin |last3=Kumar |first4=Aleksandar |last4=Lazarevic |first5=Jaideep |last5=Srivastava |first6=Pang-Ning |last6=Tan | title=Data mining for network intrusion detection | year=2002 | journal=Proceedings NSF Workshop on Next Generation Data Mining | url=http://www.csee.umbc.edu/~kolari1/Mining/ngdm/dokas.pdf}}</ref>

+

==== Robot learning ====

−

In particular, in the context of abuse and network intrusion detection, the interesting objects are often not rare objects, but unexpected bursts in activity. This pattern does not adhere to the common statistical definition of an outlier as a rare object, and many outlier detection methods (in particular, unsupervised algorithms) will fail on such data, unless it has been aggregated appropriately. Instead, a cluster analysis algorithm may be able to detect the micro-clusters formed by these patterns.

+

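In '''developmental robotics''', robot learning algorithms generate their own sequences of learning experiences, also known as a curriculum, to cumulatively acquire new skills through self-guided exploration and social interaction with humans. These robots use guidance mechanisms such as active learning, maturation, motor synergies and imitation.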

−

特别是在滥用和网络入侵检测的背景下，人们感兴趣的往往不是罕见的对象，而是突发性的活动。这种模式并不符合异常值作为稀有对象的通用统计学定义，而且许多异常检测方法（特别是无监督的算法）将无法处理这类数据，除非它已经被适当地聚合处理。相反地，数据聚类算法可以检测到这些模式形成的微团簇。

−

Three broad categories of anomaly detection techniques exist.<ref name="ChandolaSurvey">{{cite journal |last1=Chandola |first1=V. |last2=Banerjee |first2=A. |last3=Kumar |first3=V. |year=2009 |title=Anomaly detection: A survey|journal=[[ACM Computing Surveys]]|volume=41|issue=3|pages=1–58|doi=10.1145/1541880.1541882|url=https://www.semanticscholar.org/paper/71d1ac92ad36b62a04f32ed75a10ad3259a7218d }}</ref> Unsupervised anomaly detection techniques detect anomalies in an unlabeled test data set under the assumption that the majority of the instances in the data set are normal, by looking for instances that seem to fit least to the remainder of the data set. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involves training a classifier (the key difference to many other statistical classification problems is the inherently unbalanced nature of outlier detection). Semi-supervised anomaly detection techniques construct a model representing normal behavior from a given normal training data set and then test the likelihood of a test instance to be generated by the model.

−

Three broad categories of anomaly detection techniques exist. Unsupervised anomaly detection techniques detect anomalies in an unlabeled test data set under the assumption that the majority of the instances in the data set are normal, by looking for instances that seem to fit least to the remainder of the data set. Supervised anomaly detection techniques require a data set that has been labeled as "normal" and "abnormal" and involves training a classifier (the key difference to many other statistical classification problems is the inherently unbalanced nature of outlier detection). Semi-supervised anomaly detection techniques construct a model representing normal behavior from a given normal training data set and then test the likelihood of a test instance to be generated by the model.

−

异常检测技术有3大类。无监督的异常检测 / 测试技术在假设数据集中大多数实例都是正常的情况下，通过是来寻找数据集中最违和的实例，从而实现检测未被标记的测试数据集中的异常。监督式的异常检测分析技术需要一个被标记为“正常”和“异常”的数据集，还需要训练一个分类器（和许多其他分类分析问题的关键区别在于异常检测本身的不平衡性）。半监督的异常检测技术从给定的正常训练数据集构建一个表示正常行为的模型，然后测试由该模型生成的测试实例的可能性。

−

~~==== 机器人学习 Robot learning====~~

−

In [[developmental robotics]], [[robot learning]] algorithms generate their own sequences of learning experiences, also known as a curriculum, to cumulatively acquire new skills through self-guided exploration and social interaction with humans. These robots use guidance mechanisms such as active learning, maturation, [[Motor_coordination#Muscle_synergies|motor synergies]] and imitation.

−

In developmental robotics, robot learning algorithms generate their own sequences of learning experiences, also known as a curriculum, to cumulatively acquire new skills through self-guided exploration and social interaction with humans. These robots use guidance mechanisms such as active learning, maturation, motor synergies and imitation.

−

在'''发展型机器人 Developmental robotics'''学习中，机器人学习算法能够产生自己的学习经验序列，也称为课程，通过自我引导的探索来与人类社会进行互动，累积获得新技能。这些机器人在学习的过程中会使用诸如主动学习、成熟、协同运动和模仿等引导机制。

==== Rule-based machine learning ====

第575行：第387行：

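[https://en.wikipedia.org/wiki/Rule-based_machine_learning Rule-based machine learning] is a general term for any machine learning method that identifies, learns, or evolves "rules" to store, manipulate or apply knowledge. The defining characteristic of a rule-based machine learning algorithm is the identification and use of a set of relational rules that collectively represent the knowledge captured by the system. This contrasts with other machine learning algorithms, which commonly identify a single model that can be applied universally to any instance in order to make a prediction.<ref>{{Cite journal|last=Bassel|first=George W.|last2=Glaab|first2=Enrico|last3=Marquez|first3=Julietta|last4=Holdsworth|first4=Michael J.|last5=Bacardit|first5=Jaume|date=2011-09-01|title=Functional Network Construction in Arabidopsis Using Rule-Based Machine Learning on Large-Scale Data Sets|url=http://www.plantcell.org/content/23/9/3101|journal=The Plant Cell|language=en|volume=23|issue=9|pages=3101–3116|doi=10.1105/tpc.111.088153|issn=1532-298X|pmc=3203449|pmid=21896882}}</ref> Rule-based machine learning approaches include [https://en.wikipedia.org/wiki/Learning_classifier_system learning classifier systems], [https://en.wikipedia.org/wiki/Association_rule_learning association rule learning], and [https://en.wikipedia.org/wiki/Artificial_immune_system artificial immune systems].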

−

===== 关联规则 ~~Association rules~~ =====

+

===== Association rules =====

−

:''Main article: [https://en.wikipedia.org/wiki/Association_rule_learning Association rule learning]''

+

'''Association rule learning''' is a '''rule-based machine learning''' method for discovering relationships between variables in large databases. It is intended to identify strong rules discovered in databases using some measure of "interestingness".<ref name="piatetsky">Piatetsky-Shapiro, Gregory (1991), ''Discovery, analysis, and presentation of strong rules'', in Piatetsky-Shapiro, Gregory; and Frawley, William J.; eds., ''Knowledge Discovery in Databases'', AAAI/MIT Press, Cambridge, MA.</ref>

−

Association rule learning is a [[rule-based machine learning]] method for discovering relationships between variables in large databases. It is intended to identify strong rules discovered in databases using some measure of "interestingness".<ref name="piatetsky">Piatetsky-Shapiro, Gregory (1991), ''Discovery, analysis, and presentation of strong rules'', in Piatetsky-Shapiro, Gregory; and Frawley, William J.; eds., ''Knowledge Discovery in Databases'', AAAI/MIT Press, Cambridge, MA.</ref>

−

~~Association rule learning is a rule-based machine learning method for discovering relationships~~ between ~~variables~~ in large databases. ~~It is intended to identify strong rules discovered in databases using some measure of "interestingness"~~.

+

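Based on the concept of strong rules, Rakesh Agrawal, Tomasz Imieliński and Arun Swami introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets.<ref name="mining">{{Cite book | last1 = Agrawal | first1 = R. | last2 = Imieliński | first2 = T. | last3 = Swami | first3 = A. | doi = 10.1145/170035.170072 | chapter = Mining association rules between sets of items in large databases | title = Proceedings of the 1993 ACM SIGMOD international conference on Management of data - SIGMOD '93 | pages = 207 | year = 1993 | isbn = 978-0897915922 | pmid = | pmc = | citeseerx = 10.1.1.40.6984 }}</ref> For example, the rule <math>\{\mathrm{onions, potatoes}\} \Rightarrow \{\mathrm{burger}\}</math> found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placements. In addition to market basket analysis, association rules are employed in application areas including '''Web usage mining''', '''intrusion detection''', continuous production, and '''bioinformatics'''. In contrast with sequence mining, association rule learning typically does not consider the order of items either within a transaction or across transactions.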
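
Two commonly used interestingness measures for a rule such as <math>\{\mathrm{onions, potatoes}\} \Rightarrow \{\mathrm{burger}\}</math> are support and confidence. The sketch below computes both on a small set of transactions; the transactions are invented for illustration and the function names are hypothetical. Each transaction is treated as an unordered set, consistent with the remark that item order is not considered.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions.

    Assumes the antecedent occurs at least once; otherwise the
    division below raises ZeroDivisionError.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Invented toy baskets, each stored as an unordered set of items.
transactions = [frozenset(t) for t in (
    {"onions", "potatoes", "burger"},
    {"onions", "potatoes", "burger", "milk"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "milk"},
)]

rule_support = support(transactions, {"onions", "potatoes", "burger"})
rule_confidence = confidence(transactions, {"onions", "potatoes"}, {"burger"})
```

Here the rule is supported by two of the five baskets, and two of the three baskets that contain both onions and potatoes also contain burger. A rule miner would typically keep only rules whose support and confidence exceed chosen thresholds.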

−

'''关联规则学习 Association Rule Learning'''是一种'''基于规则的机器学习 Rule-based machine learning'''方法，用于发现大型数据库中变量之间的关系。它旨在利用某种“有趣度”的度量，识别在数据库中新发现的强大规则。

−

Based on the concept of strong rules, [[Rakesh Agrawal (computer scientist)|Rakesh Agrawal]], [[Tomasz Imieliński]] and Arun Swami introduced association rules for discovering regularities between products in large-scale transaction data recorded by [[point-of-sale]] (POS) systems in supermarkets.<ref name="mining">{{Cite book | last1 = Agrawal | first1 = R. | last2 = Imieliński | first2 = T. | last3 = Swami | first3 = A. | doi = 10.1145/170035.170072 | chapter = Mining association rules between sets of items in large databases | title = Proceedings of the 1993 ACM SIGMOD international conference on Management of data - SIGMOD '93 | pages = 207 | year = 1993 | isbn = 978-0897915922 | pmid = | pmc = | citeseerx = 10.1.1.40.6984 }}</ref> For example, the rule <math>\{\mathrm{onions, potatoes}\} \Rightarrow \{\mathrm{burger}\}</math> found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities such as promotional [[pricing]] or [[product placement]]s. In addition to [[market basket analysis]], association rules are employed today in application areas including [[Web usage mining]], [[intrusion detection]], [[continuous production]], and [[bioinformatics]]. In contrast with [[sequence mining]], association rule learning typically does not consider the order of items either within a transaction or across transactions.

+

===== Learning classifier systems =====

−

Based on the concept of strong rules, Rakesh Agrawal, Tomasz Imieliński and Arun Swami introduced association rules for discovering regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets. For example, the rule <math>\{\mathrm{onions, potatoes}\} \Rightarrow \{\mathrm{burger}\}</math> found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat. Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placements. In addition to market basket analysis, association rules are employed today in application areas including Web usage mining, intrusion detection, continuous production, and bioinformatics. In contrast with sequence mining, association rule learning typically does not consider the order of items either within a transaction or across transactions.

+

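'''Learning classifier systems''' (LCS) are a family of [https://en.wikipedia.org/wiki/Rule-based_machine_learning rule-based machine learning] algorithms that combine a discovery component (typically a [[遗传算法|genetic algorithm]]) with a learning component (performing [[监督学习|supervised learning]], [[强化学习|reinforcement learning]], or [[无监督学习|unsupervised learning]]). They seek to identify a set of context-dependent rules that collectively store and apply knowledge in a piecewise manner in order to make predictions.<ref>{{Cite journal|last=Urbanowicz|first=Ryan J.|last2=Moore|first2=Jason H.|date=2009-09-22|title=Learning Classifier Systems: A Complete Introduction, Review, and Roadmap|url=http://www.hindawi.com/archive/2009/736398/|journal=Journal of Artificial Evolution and Applications|language=en|volume=2009|pages=1–25|issn=1687-6229}}</ref>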

−

在基于强规则的原理中，Rakesh Agrawal、 Tomasz imieli ski 和 Arun Swami 引入了关联规则这一概念，用于在超市销售点（POS）系统记录的大规模交易数据中发现产品之间的规则。例如，在超市的销售数据中发现的规则表明，如果某位顾客同时购买洋葱和土豆，那么他也很可能会购买汉堡肉。这些信息可以作为市场决策的依据，如促销价格或产品植入。除了市场篮子分析之外，关联规则还应用于 '''Web 使用挖掘 Web Usage Mining'''、'''入侵检测 Intrusion Detection'''、连续生产和'''生物信息学 Bioinformatics'''等应用领域。与序列挖掘相比，关联规则学习通常不考虑事务内或事务之间的先后顺序。

−

~~=====学习分类器=====~~

−

~~:''主文章：[https://en.wikipedia.org/wiki/Learning_classifier_system 学习分类器系统]''~~

−

'''学习分类器系统 Learning Classifier Systems，LCS'''是一组[https://en.wikipedia.org/wiki/Rule-based_machine_learning 基于规则的机器学习]算法，它将发现组件(通常是[https://en.wikipedia.org/wiki/Genetic_algorithm 遗传算法])与学习组件(执行有[https://en.wikipedia.org/wiki/Supervised_learning 监督学习]、[https://en.wikipedia.org/wiki/Reinforcement_learning 强化学习]或[https://en.wikipedia.org/wiki/Unsupervised_learning 无监督学习])结合起来。他们试图找出一套与情境相关的规则，这些规则以一种[https://en.wikipedia.org/wiki/Piecewise 分段]的方式，集体存储和应用知识，以便进行预测<ref>{{Cite journal|last=Urbanowicz|first=Ryan J.|last2=Moore|first2=Jason H.|date=2009-09-22|title=Learning Classifier Systems: A Complete Introduction, Review, and Roadmap|url=http://www.hindawi.com/archive/2009/736398/|journal=Journal of Artificial Evolution and Applications|language=en|volume=2009|pages=1–25|issn:1687-6229}}</ref>。

===== Inductive logic programming =====

+

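'''Inductive logic programming''' (ILP) is an approach to rule learning that uses [https://en.wikipedia.org/wiki/Logic_programming logic programming] as a uniform representation for input examples, background knowledge, and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized logic program that entails all positive and no negative examples. [https://en.wikipedia.org/wiki/Inductive_programming Inductive programming] is a related field that considers any kind of programming language for representing hypotheses (not only logic programming), such as [https://en.wikipedia.org/wiki/Functional_programming functional programs].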

−

Inductive logic programming (ILP) is an approach to rule-learning using [[logic programming]] as a uniform representation for input examples, background knowledge, and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized logic program that [[Entailment|entails]] all positive and no negative examples. [[Inductive programming]] is a related field that considers any kind of programming languages for representing hypotheses (and not only logic programming), such as [[Functional programming|functional programs]].

−

Inductive logic programming (ILP) is an approach to rule-learning using logic programming as a uniform representation for input examples, background knowledge, and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesized logic program that entails all positive and no negative examples. Inductive programming is a related field that considers any kind of programming languages for representing hypotheses (and not only logic programming), such as functional programs.

−

'''归纳逻辑规划 Inductive Logic Programming，ILP'''是一种用逻辑规划作为输入示例、背景知识和假设的统一表示的规则学习方法。如果将已知的背景知识进行编码，并将一组示例表示为事实的逻辑数据库，ILP 系统将推导出一个假设的逻辑程序，其中包含所有正面和负面的样例。归纳编程是一个与其相关的领域，它考虑用任何一种编程语言来表示假设（不仅仅是逻辑编程），比如'''函数编程 Functional programs'''。

−

~~:''主文章：[https://en.wikipedia.org/wiki/Inductive_logic_programming 归纳逻辑编程]''~~


−

Inductive logic programming is particularly useful in [[bioinformatics]] and [[natural language processing]]. [[Gordon Plotkin]] and [[Ehud Shapiro]] laid the initial theoretical foundation for inductive machine learning in a logical setting.<ref>Plotkin G.D. [https://www.era.lib.ed.ac.uk/bitstream/handle/1842/6656/Plotkin1972.pdf;sequence=1 Automatic Methods of Inductive Inference], PhD thesis, University of Edinburgh, 1970.</ref><ref>Shapiro, Ehud Y. [http://ftp.cs.yale.edu/publications/techreports/tr192.pdf Inductive inference of theories from facts], Research Report 192, Yale University, Department of Computer Science, 1981. Reprinted in J.-L. Lassez, G. Plotkin (Eds.), Computational Logic, The MIT Press, Cambridge, MA, 1991, pp. 199–254.</ref><ref>Shapiro, Ehud Y. (1983). ''Algorithmic program debugging''. Cambridge, Mass: MIT Press. {{ISBN|0-262-19218-7}}</ref> Shapiro built their first implementation (Model Inference System) in 1981: a Prolog program that inductively inferred logic programs from positive and negative examples.<ref>Shapiro, Ehud Y. "[http://dl.acm.org/citation.cfm?id=1623364 The model inference system]." Proceedings of the 7th international joint conference on Artificial intelligence-Volume 2. Morgan Kaufmann Publishers Inc., 1981.</ref> The term ''inductive'' here refers to [[Inductive reasoning|philosophical]] induction, suggesting a theory to explain observed facts, rather than [[mathematical induction|mathematical]] induction, proving a property for all members of a well-ordered set.

+

Inductive logic programming is particularly useful in bioinformatics and '''natural language processing'''. Gordon Plotkin and Ehud Shapiro laid the initial theoretical foundation for inductive machine learning in a logical setting.<ref>Plotkin G.D. [https://www.era.lib.ed.ac.uk/bitstream/handle/1842/6656/Plotkin1972.pdf;sequence=1 Automatic Methods of Inductive Inference], PhD thesis, University of Edinburgh, 1970.</ref><ref>Shapiro, Ehud Y. [http://ftp.cs.yale.edu/publications/techreports/tr192.pdf Inductive inference of theories from facts], Research Report 192, Yale University, Department of Computer Science, 1981. Reprinted in J.-L. Lassez, G. Plotkin (Eds.), Computational Logic, The MIT Press, Cambridge, MA, 1991, pp. 199–254.</ref><ref>Shapiro, Ehud Y. (1983). ''Algorithmic program debugging''. Cambridge, Mass: MIT Press. {{ISBN|0-262-19218-7}}</ref> Shapiro built their first implementation (the Model Inference System) in 1981: a Prolog program that inductively inferred logic programs from positive and negative examples.<ref>Shapiro, Ehud Y. "[http://dl.acm.org/citation.cfm?id=1623364 The model inference system]." Proceedings of the 7th international joint conference on Artificial intelligence-Volume 2. Morgan Kaufmann Publishers Inc., 1981.</ref> The term ''inductive'' here refers to philosophical induction, which suggests a theory to explain observed facts, rather than mathematical induction, which proves a property for all members of a well-ordered set.

−

Inductive logic programming is particularly useful in bioinformatics and natural language processing. Gordon Plotkin and Ehud Shapiro laid the initial theoretical foundation for inductive machine learning in a logical setting. Shapiro built their first implementation (Model Inference System) in 1981: a Prolog program that inductively inferred logic programs from positive and negative examples. The term inductive here refers to philosophical induction, suggesting a theory to explain observed facts, rather than mathematical induction, proving a property for all members of a well-ordered set.

−

它在生物信息学和'''自然语言处理 Natural Language Processing'''中特别有用。戈登 · 普洛特金 Gordon Plotkin和埃胡德 · 夏皮罗 Ehud Shapiro为归纳机器学习在逻辑上奠定了最初的理论基础。夏皮罗 Shapiro在1981年实现了他们的第一个模型推理系统: 一个从正反例中归纳推断逻辑程序的 Prolog 程序。这里的”归纳“指的是哲学上的归纳，通过提出一个理论来解释观察到的事实，而不是数学归纳法证明了一个有序集合的所有成员的性质。

+

==== Similarity and metric learning ====

+

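In this problem, the learning machine is given pairs of examples that are considered similar and pairs of less similar objects. It then needs to learn a similarity function (or a distance metric function) that can predict whether new objects are similar. It is sometimes used in recommendation systems.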

−

~~====相似性与度量学习====~~

−

~~:''主文章：[https://en.wikipedia.org/wiki/Similarity_learning 相似性学习]''~~

−

在这个问题中，学习机器给出了一对被认为相似的对象和一对不太相似的对象。然后，它需要学习一个相似函数(或距离度量函数)，该函数可以预测新对象是否相似。该算法有时用于[https://en.wikipedia.org/wiki/Recommendation_systems 推荐系统]。

=== Models ===

薄荷

7,129

个编辑
