Changes

No change in size, 10:21, 26 September 2020 (Saturday)
No edit summary
Line 3: Line 3:  
|description=data science, data mining, formal science
 
 
}}
 
}}
Data mining is the process of discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.<ref name="acm">{{cite web |url=http://www.kdd.org/curriculum/index.html |title=Data Mining Curriculum |publisher=[[Association for Computing Machinery|ACM]] [[SIGKDD]] |date=2006-04-30 |accessdate=2014-01-27 }}</ref><ref name="brittanica">{{cite web |last=Clifton |first=Christopher |title=Encyclopædia Britannica: Definition of Data Mining |year=2010 |url=http://www.britannica.com/EBchecked/topic/1056150/data-mining |accessdate=2010-12-09 }}</ref><ref name="elements">{{cite web|last1=Hastie|first1=Trevor|authorlink1=Trevor Hastie|last2=Tibshirani|first2=Robert|authorlink2=Robert Tibshirani|last3=Friedman|first3=Jerome|authorlink3=Jerome H. Friedman|title=The Elements of Statistical Learning: Data Mining, Inference, and Prediction|year=2009|url=http://www-stat.stanford.edu/~tibs/ElemStatLearn/|accessdate=2012-08-07|archive-url=https://web.archive.org/web/20091110212529/http://www-stat.stanford.edu/~tibs/ElemStatLearn/|archive-date=2009-11-10|url-status=dead}}</ref><ref>{{cite book|last1=Han, Kamber, Pei|first1=Jaiwei, Micheline, Jian|title=Data Mining: Concepts and Techniques|date=June 9, 2011|publisher=Morgan Kaufmann|isbn=978-0-12-381479-1|edition=3rd}}</ref> Data mining is the analysis step of the "knowledge discovery (knowledge discovery in databases, KDD)" process. Beyond the raw analysis step, it also involves database and data management aspects, including "data pre-processing, '''model''' and inference '''considerations''', interestingness metrics, '''complexity considerations, post-processing of discovered structures''', visualization, and online updating."
+
Data mining is the process of discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.<ref name="acm">{{cite web |url=http://www.kdd.org/curriculum/index.html |title=Data Mining Curriculum |publisher=[[Association for Computing Machinery|ACM]] [[SIGKDD]] |date=2006-04-30 |accessdate=2014-01-27 }}</ref><ref name="brittanica">{{cite web |last=Clifton |first=Christopher |title=Encyclopædia Britannica: Definition of Data Mining |year=2010 |url=http://www.britannica.com/EBchecked/topic/1056150/data-mining |accessdate=2010-12-09 }}</ref><ref name="elements">{{cite web|last1=Hastie|first1=Trevor|authorlink1=Trevor Hastie|last2=Tibshirani|first2=Robert|authorlink2=Robert Tibshirani|last3=Friedman|first3=Jerome|authorlink3=Jerome H. Friedman|title=The Elements of Statistical Learning: Data Mining, Inference, and Prediction|year=2009|url=http://www-stat.stanford.edu/~tibs/ElemStatLearn/|accessdate=2012-08-07|archive-url=https://web.archive.org/web/20091110212529/http://www-stat.stanford.edu/~tibs/ElemStatLearn/|archive-date=2009-11-10|url-status=dead}}</ref><ref>{{cite book|last1=Han, Kamber, Pei|first1=Jaiwei, Micheline, Jian|title=Data Mining: Concepts and Techniques|date=June 9, 2011|publisher=Morgan Kaufmann|isbn=978-0-12-381479-1|edition=3rd}}</ref> Data mining is the analysis step of the "knowledge discovery in databases (KDD)" process. Beyond the raw analysis step, it also involves database and data management aspects, including "data pre-processing, '''model''' and inference '''considerations''', interestingness metrics, '''complexity considerations, post-processing of discovered structures''', visualization, and online updating."
    
The term "data mining" is '''not quite''' apt, because the goal is to extract patterns and knowledge from large amounts of data, not to extract (mine) the data itself.<ref name="han-kamber">{{cite book|title=Data mining: concepts and techniques|last1=Han|first1=Jiawei|last2=Kamber|first2=Micheline|date=2001|publisher=[[Morgan Kaufmann]]|isbn=978-1-55860-489-6|page=5|quote=Thus, data mining should have been more appropriately named "knowledge mining from data," which is unfortunately somewhat long|authorlink1=Jiawei Han}}</ref> "It is also a buzzword frequently applied to any form of large-scale data or information processing (collection, extraction, storage, analysis, and statistics), as well as to '''<font color="#ff8000">computer decision support systems (DSS)</font>''', including artificial intelligence (e.g., machine learning) and business intelligence."<ref>[http://www.okairp.org/documents/2005%20Fall/F05_ROMEDataQualityETC.pdf OKAIRP 2005 Fall Conference, Arizona State University] {{Webarchive|url=https://web.archive.org/web/20140201170452/http://www.okairp.org/documents/2005%20Fall/F05_ROMEDataQualityETC.pdf|date=2014-02-01}}</ref> The book ''Data Mining: Practical Machine Learning Tools and Techniques with Java''<ref name="witten">{{cite book|title=Data Mining: Practical Machine Learning Tools and Techniques|last1=Witten|first1=Ian H.|last2=Frank|first2=Eibe|last3=Hall|first3=Mark A.|date=30 January 2011|publisher=Elsevier|isbn=978-0-12-374856-0|edition=3|authorlink1=Ian H. Witten}}</ref> (which mostly covers machine learning material) was originally to be titled ''Practical Machine Learning''; the term "data mining" was added primarily for marketing reasons.<ref>{{Cite journal|author1=Bouckaert, Remco R.|author2=Frank, Eibe|author3=Hall, Mark A.|author4=Holmes, Geoffrey|author5=Pfahringer, Bernhard|author6=Reutemann, Peter|author7=Witten, Ian H.|authorlink7=Ian H. Witten|year=2010|title=WEKA Experiences with a Java open-source project|journal=Journal of Machine Learning Research|volume=11|pages=2533–2541|quote=the original title, "Practical machine learning", was changed&nbsp;... The term "data mining" was [added] primarily for marketing reasons.|postscript={{inconsistent citations}}}}</ref> Often the more general term (large-scale) data analysis, or, when referring to the actual methods, artificial intelligence and machine learning, is more appropriate.
 