== History and relationships to other fields ==
:''See also: [https://en.wikipedia.org/wiki/Timeline_of_machine_learning Timeline of machine learning]''

The term "machine learning" was coined in 1959 by [https://en.wikipedia.org/wiki/Arthur_Samuel Arthur Samuel], an American pioneer in the fields of [https://en.wikipedia.org/wiki/PC_game computer gaming] and [[artificial intelligence]], while at [https://en.wikipedia.org/wiki/IBM IBM].<ref>R. Kohavi and F. Provost, "Glossary of terms," Machine Learning, vol. 30, no. 2–3, pp. 271–274, 1998.</ref> As a scientific endeavor, machine learning grew out of the quest for artificial intelligence.
Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches inherited from AI and toward methods and models drawn from statistics and [https://en.wikipedia.org/wiki/Probability_theory probability theory],<ref name="changing" /> while also benefiting from the increasing availability of digitized information and the ability to distribute it via the Internet.
=== Relationship to statistics ===

[https://en.wikipedia.org/wiki/Leo_Breiman Leo Breiman] distinguished two statistical modeling paradigms: the data model and the algorithmic model,<ref>Cornell University Library. [http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726 "Breiman: Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author)"]. Retrieved 8 August 2015.</ref> wherein "algorithmic model" means more or less the machine learning algorithms like [https://en.wikipedia.org/wiki/Random_forest random forest].

Some statisticians have adopted methods from machine learning, leading to a combined field that they call ''statistical learning''.<ref name="islr">{{cite book |author1=Gareth James |author2=Daniela Witten |author3=Trevor Hastie |author4=Robert Tibshirani |title=An Introduction to Statistical Learning |publisher=Springer |year=2013 |url=http://www-bcf.usc.edu/~gareth/ISL/ |page=vii}}</ref>

=== Relationship to artificial intelligence ===

As a scientific endeavor, machine learning grew out of the quest for artificial intelligence. In the early days of AI as an [[Discipline (academia)|academic discipline]], some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed "[[neural network]]s"; these were mostly [[perceptron]]s and [[ADALINE|other models]] that were later found to be reinventions of the [[generalized linear model]]s of statistics.<ref>{{cite citeseerx |last1=Sarle |first1=Warren |title=Neural Networks and statistical models |citeseerx=10.1.1.27.699 |year=1994}}</ref> [[Probability theory|Probabilistic]] reasoning was also employed, especially in automated [[medical diagnosis]].<ref name="aima">{{cite AIMA|edition=2}}</ref>{{rp|488}}
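
The remark that these early "neural networks" were mostly perceptrons can be made concrete with a minimal sketch (the tiny AND dataset below is invented for illustration, not from the source). The perceptron's decision function, sign(w·x + b), has the same linear form as a generalized linear model; only the error-driven fitting procedure differs, which is why these models were later recognized as statistical reinventions.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    # Classic Rosenblatt perceptron: on each misclassified sample,
    # nudge the separating hyperplane toward that sample.
    w, b = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 or -1
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Hypothetical toy data: logical AND, which is linearly separable,
# so the perceptron convergence theorem guarantees a perfect fit.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w, b = train_perceptron(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1 for x in X]
```

On this separable toy data the learned predictions `preds` match `y` exactly; on non-separable data (e.g. XOR) the update rule never settles, which is one limitation the later statistical framing made explicit.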
However, an increasing emphasis on the [[GOFAI|logical, knowledge-based approach]] caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and representation.<ref name="aima" />{{rp|488}} By 1980, [[expert system]]s had come to dominate AI, and statistics was out of favor.<ref name="changing">{{Cite journal | last1 = Langley | first1 = Pat | title = The changing science of machine learning | doi = 10.1007/s10994-011-5242-y | journal = [[Machine Learning (journal)|Machine Learning]] | volume = 82 | issue = 3 | pages = 275–279 | year = 2011 | doi-access = free }}</ref> Work on symbolic/knowledge-based learning did continue within AI, leading to [[inductive logic programming]], but the more statistical line of research was now outside the field of AI proper, in [[pattern recognition]] and [[information retrieval]].<ref name="aima" />{{rp|708–710; 755}} Neural networks research had been abandoned by AI and [[computer science]] around the same time. This line, too, was continued outside the AI/CS field, as "[[connectionism]]", by researchers from other disciplines including [[John Hopfield|Hopfield]], [[David Rumelhart|Rumelhart]] and [[Geoff Hinton|Hinton]]. (Editor's note: connectionism, also known as the bionic or physiological school, centers on neural networks and the connection mechanisms and learning algorithms among them.) Their main success came in the mid-1980s with the reinvention of [[backpropagation]].<ref name="aima" />{{rp|25}}
Machine learning, reorganized as a separate field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the [[symbolic artificial intelligence|symbolic approaches]] it had inherited from AI, and toward methods and models borrowed from statistics and [[probability theory]].<ref name="changing" /> As of 2019, many sources continue to assert that machine learning remains a subfield of AI. Yet some practitioners, for example Dr [[Daniel J. Hulme|Daniel Hulme]], who both teaches AI and runs a company operating in the field, argue that machine learning and AI are separate.<ref name="elements">{{cite web |url= https://course.elementsofai.com/ |title= The Elements of AI |publisher= [[University of Helsinki]] |date= Dec 2019 |accessdate= 7 April 2020}}</ref><ref>{{cite web |url= https://www.techworld.com/tech-innovation/satalia-ceo-no-one-is-doing-ai-optimisation-can-change-that-3775689/ |title= Satalia CEO Daniel Hulme has a plan to overcome the limitations of machine learning |publisher= [[Techworld]] |date= October 2019 |accessdate= 7 April 2020}}</ref><ref name="Alpaydin2020"/>

=== Relationship to data mining ===

Machine learning and [[data mining]] often employ the same methods and overlap significantly, but while machine learning focuses on prediction, based on ''known'' properties learned from the training data, data mining focuses on the [[discovery (observation)|discovery]] of (previously) ''unknown'' properties in the data (this is the analysis step of [[knowledge discovery]] in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities (which do often have separate conferences and separate journals, [[ECML PKDD]] being a major exception) comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to ''reproduce known'' knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously ''unknown'' knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data.
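
The contrast above can be sketched on a single toy dataset (the points and labels below are invented for illustration): the supervised, machine-learning view scores a predictor against known labels, while the unsupervised, KDD-style view groups the same points without ever consulting the labels.

```python
points = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9),   # one tight group
          (5.0, 5.2), (5.1, 4.9), (4.9, 5.0)]   # another tight group
labels = ["a", "a", "a", "b", "b", "b"]         # known only to the supervised task

def dist2(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

# Supervised view: 1-nearest-neighbor prediction, evaluated by its ability
# to reproduce the *known* labels.
def predict(p):
    others = [(q, l) for q, l in zip(points, labels) if q != p]
    return min(others, key=lambda ql: dist2(p, ql[0]))[1]

accuracy = sum(predict(p) == l for p, l in zip(points, labels)) / len(points)

# Unsupervised (KDD-style) view: 2-means clustering discovers the grouping
# as a previously *unknown* property, using no labels at all.
centers = [points[0], points[3]]  # naive initialization
for _ in range(10):
    groups = [min(range(2), key=lambda k: dist2(p, centers[k])) for p in points]
    for k in range(2):
        members = [p for p, g in zip(points, groups) if g == k]
        centers[k] = (sum(p[0] for p in members) / len(members),
                      sum(p[1] for p in members) / len(members))
```

Here the supervised score only makes sense because labels exist; in a typical KDD task no such labels are available, and the clustering step is the whole analysis.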
=== Relationship to optimization ===

Machine learning also has intimate ties to [[Mathematical optimization|optimization]]: many learning problems are formulated as minimization of some [[loss function]] on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the pre-assigned labels of a set of examples). The difference between the two fields arises from the goal of generalization: while optimization algorithms can minimize the loss on a training set, machine learning is concerned with minimizing the loss on unseen samples.<ref>{{cite encyclopedia |last1=Le Roux |first1=Nicolas |first2=Yoshua |last2=Bengio |first3=Andrew |last3=Fitzgibbon |title=Improving First and Second-Order Methods by Modeling Uncertainty |encyclopedia=Optimization for Machine Learning |year=2012 |page=404 |editor1-last=Sra |editor1-first=Suvrit |editor2-first=Sebastian |editor2-last=Nowozin |editor3-first=Stephen J. |editor3-last=Wright |publisher=MIT Press |url=https://books.google.com/?id=JPQx7s2L1A8C&pg=PA403&dq="Improving+First+and+Second-Order+Methods+by+Modeling+Uncertainty |isbn=9780262016469}}</ref>
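
This distinction can be sketched with invented numbers: a model that simply memorizes the training set attains exactly zero training loss, achieving the pure optimization goal, yet an ordinary fitted line generalizes better to unseen samples.

```python
# Hypothetical data roughly following y = x, with a little noise.
train = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8)]
unseen = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]

def mse(model, data):
    # Mean squared error: the loss function used throughout this sketch.
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def memorizer(x):
    # "Pure optimization" view: return the training label of the nearest
    # training input, driving the training loss to exactly zero.
    return min(train, key=lambda xy: abs(xy[0] - x))[1]

# Ordinary least-squares line: nonzero training loss, but a smoother hypothesis.
n = len(train)
mean_x = sum(x for x, _ in train) / n
mean_y = sum(y for _, y in train) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def line(x):
    return slope * x + intercept
```

On this data `mse(memorizer, train)` is 0.0 while `mse(line, train)` is not, yet the line's loss on the unseen points is far lower than the memorizer's: minimizing training loss alone is the optimizer's goal, not the learner's.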
== Theory ==