Changes

21 bytes added, 14:27, 24 October 2020 (Sat)
Line 81: Line 81:
{} / ref
 
   −
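Analysis refers to breaking a whole into its separate components for individual examination. Data analysis is the process of obtaining raw data and converting it into information useful for decision-making by users. Data are collected and analyzed in order to answer questions, test hypotheses, or disprove theories.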
+
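Analysis refers to breaking a whole into its separate components for individual examination. Data analysis is the process of obtaining raw data and converting it into information useful for decision-making by users. Questions are answered, hypotheses tested, and theories disproved by collecting and analyzing data.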
    
Statistician [[John Tukey]] defined data analysis in 1961 as: "Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."<ref>[http://projecteuclid.org/download/pdf_1/euclid.aoms/1177704711 John Tukey-The Future of Data Analysis-July 1961]</ref>
 
Line 87: Line 87:
Statistician John Tukey defined data analysis in 1961 as: "Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."
 
   −
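Statistician John Tukey defined data analysis in 1961 as "procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make the analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."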
+
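Statistician John Tukey defined data analysis in 1961 as "techniques for analyzing data and interpreting the results of such procedures, together with ways of planning the gathering of data to make the analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."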
Line 124: Line 124:
| isbn = 978-1-449-35865-5}}</ref> The CRISP framework used in data mining has similar steps.
 
   −
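Several phases can be distinguished, as described below. The phases are <font color = '#ff8000'>iterative</font>, in that feedback from later phases may result in additional work in earlier phases.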
+
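The development of data analysis can be divided into the phases described below. The phases are <font color = '#ff8000'>iterative</font>, in that feedback from later phases may lead to additional work that repeats what was done in earlier phases.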
 
The CRISP framework used in data mining has similar steps.
Line 135: Line 135:
The data are necessary as inputs to the analysis, which is specified based upon the requirements of those directing the analysis or customers (who will use the finished product of the analysis). The general type of entity upon which the data will be collected is referred to as an experimental unit (e.g., a person or population of people). Specific variables regarding a population (e.g., age and income) may be specified and obtained.  Data may be numerical or categorical (i.e., a text label for numbers).
 
   −
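These data, as inputs to the analysis, are necessary, because the analysis is specified based on the requirements of those directing the analysis or of customers (who will use the finished product of the analysis). The general type of entity on which data are collected is called an experimental unit (e.g., a person or a population of people). Specific variables regarding a <font color = '#ff8000'>population</font> (e.g., age and income) may be specified and obtained. Data may be numerical or categorical (i.e., a text label for numbers).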
+
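Data, as the input to the analysis, are necessary, because the analysis is specified based on the requirements of those directing the analysis or of customers (who will use the finished product of the analysis). The general type of entity on which data are collected is called an experimental unit (e.g., a person or a population of people). Specific variables regarding a <font color = '#ff8000'>population</font> (e.g., age and income) may be specified and obtained. Data may be numerical or categorical (i.e., a text label for numbers).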
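
The distinction between numerical and categorical variables in the passage above can be illustrated with a small Python sketch. It is not part of the article; the records, the `region` variable, and the helper names `variable_kind` and `summarize` are hypothetical and chosen only for illustration. Each record stands for one experimental unit (a person), and each variable is summarized according to its kind.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: each dict is one experimental unit (a person).
people = [
    {"age": 34, "income": 52000.0, "region": "north"},
    {"age": 51, "income": 61000.0, "region": "south"},
    {"age": 27, "income": 38000.0, "region": "north"},
]

def variable_kind(values):
    """Return 'numerical' if every value is an int or float, else 'categorical'."""
    is_number = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "numerical" if is_number else "categorical"

def summarize(records):
    """Summarize each variable: mean for numerical, most common label for categorical."""
    summary = {}
    for name in records[0]:
        values = [record[name] for record in records]
        kind = variable_kind(values)
        if kind == "numerical":
            summary[name] = (kind, mean(values))
        else:
            summary[name] = (kind, Counter(values).most_common(1)[0][0])
    return summary

summary = summarize(people)
for name, (kind, value) in summary.items():
    print(f"{name}: {kind}, summary = {value}")
```

The choice of summary (a mean for numbers, the most frequent label for categories) is one simple convention among several; a real analysis would pick summaries to match its stated requirements.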
     