Changes

17 bytes added, 3 June 2024 (Monday)

No edit summary

Line 3: Line 3:

|description=Data science, data mining, formal sciences

}}

−

~~Data mining [数据挖掘] is the process of discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.~~<ref name="acm">{{cite web |url=http://www.kdd.org/curriculum/index.html |title=Data Mining Curriculum |publisher=Association for Computing Machinery| SIGKDD |date=2006-04-30 |accessdate=2014-01-27 }}</ref><ref name="brittanica">{{cite web |last=Clifton |first=Christopher |title=Encyclopædia Britannica: Definition of Data Mining |year=2010 |url=http://www.britannica.com/EBchecked/topic/1056150/data-mining |accessdate=2010-12-09 }}</ref><ref name="elements">{{cite web|last1=Hastie|first1=Trevor|last2=Tibshirani|first2=Robert|last3=Friedman|first3=Jerome|title=The Elements of Statistical Learning: Data Mining, Inference, and Prediction|year=2009|url=http://www-stat.stanford.edu/~tibs/ElemStatLearn/|accessdate=2012-08-07|archive-url=https://web.archive.org/web/20091110212529/http://www-stat.stanford.edu/~tibs/ElemStatLearn/|archive-date=2009-11-10|url-status=dead}}</ref><ref>{{cite book|last1=Han, Kamber, Pei|first1=Jaiwei, Micheline, Jian|title=Data Mining: Concepts and Techniques|date=June 9, 2011|publisher=Morgan Kaufmann|isbn=978-0-12-381479-1|edition=3rd}}</ref>Data mining is the analysis step of the "'''knowledge discovery in databases (KDD)'''" process. Beyond the conventional analysis step, it also involves database and data management aspects, including "data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating."

+

Data mining [数据挖掘（Data mining）] is the process of discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.<ref name="acm">{{cite web |url=http://www.kdd.org/curriculum/index.html |title=Data Mining Curriculum |publisher=Association for Computing Machinery| SIGKDD |date=2006-04-30 |accessdate=2014-01-27 }}</ref><ref name="brittanica">{{cite web |last=Clifton |first=Christopher |title=Encyclopædia Britannica: Definition of Data Mining |year=2010 |url=http://www.britannica.com/EBchecked/topic/1056150/data-mining |accessdate=2010-12-09 }}</ref><ref name="elements">{{cite web|last1=Hastie|first1=Trevor|last2=Tibshirani|first2=Robert|last3=Friedman|first3=Jerome|title=The Elements of Statistical Learning: Data Mining, Inference, and Prediction|year=2009|url=http://www-stat.stanford.edu/~tibs/ElemStatLearn/|accessdate=2012-08-07|archive-url=https://web.archive.org/web/20091110212529/http://www-stat.stanford.edu/~tibs/ElemStatLearn/|archive-date=2009-11-10|url-status=dead}}</ref><ref>{{cite book|last1=Han, Kamber, Pei|first1=Jaiwei, Micheline, Jian|title=Data Mining: Concepts and Techniques|date=June 9, 2011|publisher=Morgan Kaufmann|isbn=978-0-12-381479-1|edition=3rd}}</ref>Data mining is the analysis step of the "'''knowledge discovery in databases (KDD)'''" process. Beyond the conventional analysis step, it also involves database and data management aspects, including "data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating."

The term "data mining" is something of a misnomer, because the goal is to extract patterns and knowledge from large amounts of data, not to extract (mine) the data itself.<ref name="han-kamber">{{cite book|title=Data mining: concepts and techniques|last1=Han|first1=Jiawei|last2=Kamber|first2=Micheline|date=2001|publisher=Morgan Kaufmann|isbn=978-1-55860-489-6|page=5|quote=Thus, data mining should have been more appropriately named "knowledge mining from data," which is unfortunately somewhat long}}</ref> "It is a buzzword frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics), as well as to applications such as '''<font color="#ff8000">computer decision support systems (DSS)</font>''', including artificial intelligence (e.g., machine learning) and business intelligence."<ref>[http://www.okairp.org/documents/2005%20Fall/F05_ROMEDataQualityETC.pdf OKAIRP 2005 Fall Conference, Arizona State University] {{Webarchive|url=https://web.archive.org/web/20140201170452/http://www.okairp.org/documents/2005%20Fall/F05_ROMEDataQualityETC.pdf|date=2014-02-01}}</ref> The book ''Data Mining: Practical Machine Learning Tools and Techniques with Java''<ref name="witten">{{cite book|title=Data Mining: Practical Machine Learning Tools and Techniques|last1=Witten|first1=Ian H.|last2=Frank|first2=Eibe|last3=Hall|first3=Mark A.|date=30 January 2011|publisher=Elsevier|isbn=978-0-12-374856-0|edition=3}}</ref> (which covers mostly machine learning material) was originally to be titled ''Practical Machine Learning''; the term "data mining" was added primarily for marketing reasons.<ref>{{Cite journal|author1=Bouckaert, Remco R.|author2=Frank, Eibe|author3=Hall, Mark A.|author4=Holmes, Geoffrey|author5=Pfahringer, Bernhard|author6=Reutemann, Peter|author7=Witten, Ian H.|year=2010|title=WEKA Experiences with a Java open-source project|journal=Journal of Machine Learning Research|volume=11|pages=2533–2541|quote=the original title, "Practical machine learning", was changed ... The term "data mining" was [added] primarily for marketing reasons.|postscript={{inconsistent citations}}}}</ref> Often, more general terms such as (large-scale) data analysis, or the names of the actual methods, such as artificial intelligence and machine learning, are more appropriate.

Gyt666 (123 edits)


Data mining

Revision as of 14:46, 3 June 2024 (Mon)