</ref><ref name = "Alpaydin2020"/>
=== Relation to data mining ===
Machine learning and [[data mining]] often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on ''known'' properties learned from the training data, [[data mining]] focuses on the [[discovery (observation)|discovery]] of (previously) ''unknown'' properties in the data (this is the analysis step of [[knowledge discovery]] in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, [[ECML PKDD]] being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to ''reproduce known'' knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously ''unknown'' knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data.
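As a concrete illustration of the preprocessing use mentioned above, the following is a minimal sketch, assuming scikit-learn and synthetic data, of an unsupervised data-mining-style method (k-means clustering) feeding a supervised learner; the dataset, cluster count, and feature construction are all invented for the example.

<syntaxhighlight lang="python">
# Minimal sketch: unsupervised clustering as a preprocessing step for a
# supervised learner (assumes scikit-learn; all data here is synthetic).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))                # synthetic features
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # synthetic labels for evaluation

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unsupervised step: discover structure in the inputs without using labels.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X_train)

def with_cluster_feature(A, model):
    # Append each point's cluster id as an extra feature column.
    return np.hstack([A, model.predict(A).reshape(-1, 1)])

clf = LogisticRegression(max_iter=1000)
clf.fit(with_cluster_feature(X_train, kmeans), y_train)
print("accuracy:", clf.score(with_cluster_feature(X_test, kmeans), y_test))
</syntaxhighlight>

Appending the raw cluster id is the simplest choice; one-hot encodings or distances to cluster centers are common refinements.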
==== Semi-supervised learning ====
:''Main article: [[Semi-supervised learning]]''
In '''weak supervision''', the training labels are noisy, limited, or imprecise; however, such labels are often much cheaper to obtain, making it easier to assemble larger effective training sets.
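One common way to exploit such cheap, imperfect labels is self-training: a model fit on the few clean labels pseudo-labels the unlabeled pool, and its confident guesses are recycled as noisy training data. The sketch below assumes scikit-learn and synthetic data; the confidence threshold and round count are invented for illustration, and this is one approach among many, not a canonical weak-supervision method.

<syntaxhighlight lang="python">
# Illustrative self-training loop: grow a small labeled set with confident,
# noisy pseudo-labels (assumes scikit-learn; data is synthetic).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) > 0).astype(int)

labeled = np.arange(30)            # small, expensive "clean" labels
unlabeled = np.arange(30, 1000)    # large, cheap unlabeled pool

clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
for _ in range(5):                                 # a few self-training rounds
    proba = clf.predict_proba(X[unlabeled])
    confident = proba.max(axis=1) > 0.95           # trust only confident guesses
    pseudo_y = proba.argmax(axis=1)[confident]     # noisy pseudo-labels
    X_aug = np.vstack([X[labeled], X[unlabeled][confident]])
    y_aug = np.concatenate([y[labeled], pseudo_y])
    clf = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
</syntaxhighlight>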
==== Reinforcement learning ====
:''Main article: [[Reinforcement learning]]''
Reinforcement learning is concerned with how an ''agent'' ought to take ''actions'' in an ''environment'' so as to maximize some notion of long-term ''reward''. Reinforcement learning algorithms attempt to find a ''policy'' that maps ''states'' of the world to the actions the agent should take in those states. Reinforcement learning differs from the [https://en.wikipedia.org/wiki/Supervised_learning supervised learning] problem in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected.

Reinforcement learning is an area of machine learning concerned with how [[software agent]]s ought to take [[Action selection|actions]] in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as [[game theory]], [[control theory]], [[operations research]], [[information theory]], [[simulation-based optimization]], [[multi-agent system]]s, [[swarm intelligence]], [[statistics]] and [[genetic algorithm]]s. In machine learning, the environment is typically represented as a [[Markov Decision Process]] (MDP). Many reinforcement learning algorithms use [[dynamic programming]] techniques.<ref>{{Cite book|title=Reinforcement learning and markov decision processes|author1=van Otterlo, M.|author2=Wiering, M.|journal=Reinforcement Learning |volume=12|pages=3–42 |year=2012 |doi=10.1007/978-3-642-27645-3_1|series=Adaptation, Learning, and Optimization|isbn=978-3-642-27644-6}}</ref> Reinforcement learning algorithms do not assume knowledge of an exact mathematical model of the MDP, and are used when exact models are infeasible. Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game against a human opponent.
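To make the MDP framing concrete, here is a minimal tabular Q-learning sketch on a toy chain environment. Q-learning is one standard model-free algorithm of the kind described; the environment, hyperparameters, and sizes below are invented for the example.

<syntaxhighlight lang="python">
# Minimal tabular Q-learning on a toy 5-state chain MDP (all values invented
# for illustration; only numpy is assumed).
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = move left, 1 = move right
alpha, gamma, eps = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(s, a):
    # Deterministic toy dynamics: reward 1 only on reaching the right end.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1)

for episode in range(500):
    s = 0
    for _ in range(20):
        # Epsilon-greedy action selection.
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        s2, r = step(s, a)
        # Q-learning update: bootstrap from the best next-state value.
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))   # greedy policy; should prefer "right" in every state
</syntaxhighlight>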
==== Self-learning ====
Self-learning as a machine learning paradigm was introduced in 1982 along with a neural network capable of self-learning, named Crossbar Adaptive Array (CAA).<ref>Bozinovski, S. (1982). "A self-learning system using secondary reinforcement". In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402. {{ISBN|978-0-444-86488-8}}.</ref> It is learning with no external rewards and no external teacher advice. The CAA self-learning algorithm computes, in a crossbar fashion, both decisions about actions and emotions (feelings) about consequence situations. The system is driven by the interaction between cognition and emotion.<ref>Bozinovski, Stevo (2014). "Modeling mechanisms of cognition-emotion interaction in artificial neural networks, since 1981". Procedia Computer Science, pp. 255–263.</ref>
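The cited papers define CAA precisely; purely as a loose, hypothetical sketch of the crossbar idea, a single memory might be read column-wise to choose an action and then updated by the innate "feeling" of the consequence situation, with no external reward signal. Everything below (the environment dynamics, the genome vector of innate feelings, the exploration step, and the update rule) is an assumption for illustration, not the published algorithm.

<syntaxhighlight lang="python">
# Loose, hypothetical sketch of a crossbar-style self-learning loop.
# NOT the published CAA algorithm: dynamics, genome, and update are invented.
import numpy as np

n_situations, n_actions = 4, 3
W = np.zeros((n_actions, n_situations))   # crossbar memory: actions x situations
genome = np.array([0.0, 0.0, -1.0, 1.0])  # assumed innate "feeling" per situation
rng = np.random.default_rng(0)

def consequence(s, a):
    # Toy deterministic environment, invented for illustration.
    return (s + a + 1) % n_situations

for _ in range(200):
    s = int(rng.integers(n_situations))
    # Decision: best action for situation s, with occasional exploration.
    a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(W[:, s].argmax())
    s2 = consequence(s, a)
    v = genome[s2]        # "emotion" felt in the consequence situation
    W[a, s] += v          # learning driven by that internal feeling alone
</syntaxhighlight>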