Changes

数据分析 (view source)

Revision as of 15:56, 30 August 2020 (Sunday)

1,290 bytes added, 15:56, 30 August 2020 (Sunday)

→‎Techniques for analyzing quantitative data

Line 314: Line 314:

−

==Techniques for analyzing quantitative data==

+

==Techniques for analyzing quantitative data 分析定量数据的技术==

Line 322: Line 322:

Author Jonathan Koomey has recommended a series of best practices for understanding quantitative data. These include:

−

~~作者乔纳森 · 库米推荐了一系列理解定量数据的最佳实践。其中包括~~:

+

作者Jonathan Koomey推荐了一系列理解定量数据的最佳方法。其中包括:

*Check raw data for anomalies prior to performing an analysis;

+

* 在实施数据分析之前检查原始数据中的异常值；

+

*Re-perform important calculations, such as verifying columns of data that are formula driven;

+

* 重新执行重要的计算，例如验证'''公式驱动formula driven'''的数据列；

*Confirm main totals are the sum of subtotals;

+

* 确认总计是小计的和；

*Check relationships between numbers that should be related in a predictable way, such as ratios over time;

+

* 检查那些应当以可预测的方式相关联的数字之间的关系，例如随时间变化的比率；

*Normalize numbers to make comparisons easier, such as analyzing amounts per person or relative to GDP or as an index value relative to a base year;

+

* 对数字进行标准化以便于比较，例如分析人均数量、相对于 GDP 的数量，或相对于基准年的指数值；

*Break problems into component parts by analyzing factors that led to the results, such as [[DuPont analysis]] of return on equity.<ref name="Koomey1"/>

−

+

* 通过分析导致结果的因素将问题整体分解为几个部分，如'''净资产收益率return on equity'''的'''杜邦分析DuPont analysis'''<ref name="Koomey1"/>。
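
Several of the checks in the list above can be sketched in a few lines of Python. This is a minimal illustration, not Koomey's own procedure: the function names (`totals_match`, `per_capita`, `index_to_base`) and all figures are hypothetical and chosen only to show the idea.

```python
import math

def totals_match(total, subtotals, rel_tol=1e-9):
    """Return True if the stated total equals the sum of its subtotals."""
    return math.isclose(total, math.fsum(subtotals), rel_tol=rel_tol)

def per_capita(amount, population):
    """Normalize an amount by population so regions of different size compare."""
    return amount / population

def index_to_base(series, base_year):
    """Express each year's value as an index where base_year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}

# Hypothetical figures, for illustration only.
subtotals = [120.0, 80.5, 49.5]
stated_total = 250.0
spending = {2018: 200.0, 2019: 220.0, 2020: 250.0}

print(totals_match(stated_total, subtotals))   # does the total reconcile?
print(per_capita(spending[2018], 50))          # amount per person
print(index_to_base(spending, 2018))           # index relative to 2018
```

Comparing with a tolerance rather than `==` is a deliberate choice here, because totals derived from floating-point subtotals may differ in the last digits.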

Line 342: Line 353:

For the variables under examination, analysts typically obtain descriptive statistics for them, such as the mean (average), median, and standard deviation. They may also analyze the distribution of the key variables to see how the individual values cluster around the mean.

−

对于被调查的变量，分析师通常会得到它们的描述统计学，比如平均值、中位数和标准差。他们还可以分析关键变量的分布情况，看看各个值是如何围绕平均值聚集的。

+

对于所考察的变量，分析师通常会计算它们的描述性统计量，比如平均数、中位数和标准差。他们还可以分析关键变量的分布情况，来看各个值是如何围绕平均数聚集的。
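
As a minimal sketch of the descriptive statistics mentioned above, the following uses Python's standard `statistics` module on a small hypothetical sample. The data and the "within one standard deviation" summary are assumptions made for illustration; the population standard deviation (`pstdev`) is used, and `stdev` would be the choice for a sample estimate.

```python
import statistics

# Hypothetical observations of one variable.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
median = statistics.median(values)
spread = statistics.pstdev(values)  # population standard deviation

# One simple way to see how values cluster around the mean:
# the share of observations within one standard deviation of it.
within_one_sd = sum(1 for v in values if abs(v - mean) <= spread) / len(values)

print(mean, median, spread, within_one_sd)
```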

[[File:US_Employment_Statistics_-_March_2015.png|thumb|250px|right|An illustration of the [[MECE principle]] used for data analysis.]] The consultants at [[McKinsey and Company]] named a technique for breaking a quantitative problem down into its component parts called the [[MECE principle]]. Each layer can be broken down into its components; each of the sub-components must be [[Mutually exclusive events|mutually exclusive]] of each other and [[Collectively exhaustive events|collectively]] add up to the layer above them. The relationship is referred to as "Mutually Exclusive and Collectively Exhaustive" or MECE. For example, profit by definition can be broken down into total revenue and total cost. In turn, total revenue can be analyzed by its components, such as revenue of divisions A, B, and C (which are mutually exclusive of each other) and should add to the total revenue (collectively exhaustive).

Line 348: Line 359:

An illustration of the [[MECE principle]] used for data analysis. The consultants at McKinsey and Company named a technique for breaking a quantitative problem down into its component parts called the MECE principle. Each layer can be broken down into its components; each of the sub-components must be mutually exclusive of each other and collectively add up to the layer above them. The relationship is referred to as "Mutually Exclusive and Collectively Exhaustive" or MECE. For example, profit by definition can be broken down into total revenue and total cost. In turn, total revenue can be analyzed by its components, such as revenue of divisions A, B, and C (which are mutually exclusive of each other) and should add to the total revenue (collectively exhaustive).

−

~~对[[用于数据分析的~~ MECE ~~原理]的说明麦肯锡咨询公司的顾问们提出了一种将定量问题分解为其组成部分的技术，称为~~ MECE ~~原理。每一层都可以分解成它的组件; 每一个子组件必须相互排斥，共同构成它们上面的层。这种关系被称为“相互排斥和集体详尽”或~~ MECE。例如，根据定义，利润可以分为总收入和总成本。反过来，总收入可以通过其组成部分进行分析，如部门 ~~a、 b~~ 和 ~~c 的收入(它们相互排斥) ，并且应该增加总收入(总体上详尽无遗)。~~

+

一个用于数据分析的'''MECE 原理MECE principle'''的说明。

+

麦肯锡咨询公司的顾问们提出了一种将定量问题分解为其组成部分的技术，称为 MECE 原理。每一层都可以分解成它的组成部分；每一个子组成部分必须相互排斥，共同构成它们上一级的层次。这种关系被称为“'''相互排斥且集体穷尽Mutually Exclusive and Collectively Exhaustive'''”或 MECE。例如，根据定义，利润可以分为总收入和总成本。反过来，总收入可以通过其组成部分进行分析，如部门 A、 B 和 C 的收入（它们相互排斥），并且它们的总和应该是总收入（总体上穷尽）。
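
The MECE condition described above can be checked mechanically. The sketch below is an illustration under stated assumptions: `is_mece` is a hypothetical helper, and the order IDs and revenue figures for divisions A, B, and C are invented.

```python
import math

def is_mece(parent, components):
    """Check that the component sets are mutually exclusive (no overlap)
    and collectively exhaustive (their union equals the parent set)."""
    seen = set()
    for items in components.values():
        if seen & items:      # an item appears in two components
            return False
        seen |= items
    return seen == parent     # nothing missing, nothing extra

# Hypothetical order IDs assigned to divisions A, B, and C.
all_orders = {101, 102, 103, 104, 105}
by_division = {"A": {101, 102}, "B": {103}, "C": {104, 105}}

# Hypothetical revenue figures: the parts should add up to the total.
revenue_by_division = {"A": 40.0, "B": 25.0, "C": 35.0}
total_revenue = 100.0

mece_ok = is_mece(all_orders, by_division)
revenue_ok = math.isclose(sum(revenue_by_division.values()), total_revenue)
print(mece_ok, revenue_ok)
```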

Line 356: Line 368:

Analysts may use robust statistical measurements to solve certain analytical problems. Hypothesis testing is used when a particular hypothesis about the true state of affairs is made by the analyst and data is gathered to determine whether that state of affairs is true or false. For example, the hypothesis might be that "Unemployment has no effect on inflation", which relates to an economics concept called the Phillips Curve. Hypothesis testing involves considering the likelihood of Type I and type II errors, which relate to whether the data supports accepting or rejecting the hypothesis.

−

分析师可能会使用强有力的统计测量来解决某些分析问题。当分析师对事件的真实状态做出特定假设并收集数据以确定事件的真实状态时，就使用假设检验。例如，假设可能是“失业对通货膨胀没有影响” ，这与一个叫做菲利普斯曲线的经济学概念有关。假设检验包括考虑第一类和第二类错误的可能性，这些错误与数据是否支持接受或拒绝假设有关。

+

分析师可能会使用稳健的统计测量来解决特定的分析问题。当分析师对事物的真实状态做出特定假设，并收集数据以判断该假设是否成立时，就会使用'''假设检验Hypothesis testing'''。例如，一个假设可能是“失业对通货膨胀没有影响”，这个假设与一个叫做'''菲利普斯曲线Phillips Curve'''的经济学概念有关。假设检验需要考虑犯第一类错误和第二类错误的可能性，这两类错误关系到数据是支持接受还是拒绝该假设。
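
One way to make the hypothesis-testing idea concrete is an exact permutation test, sketched below. This is a swapped-in technique chosen for illustration and is not described in the text above. The unemployment and inflation figures are hypothetical, and `alpha` (the tolerated Type I error rate) is an assumed threshold. The null hypothesis is that unemployment has no linear association with inflation.

```python
import math
from itertools import permutations

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, tol=1e-12):
    """Two-sided exact p-value: the share of all orderings of ys whose
    |r| is at least as large as the observed |r|."""
    observed = abs(pearson_r(xs, ys))
    count = total = 0
    for perm in permutations(ys):
        total += 1
        if abs(pearson_r(xs, perm)) >= observed - tol:
            count += 1
    return count / total

# Hypothetical data, for illustration only.
unemployment = [3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
inflation = [4.0, 3.5, 3.0, 2.5, 2.0, 1.5]

alpha = 0.05  # assumed tolerated probability of a Type I error
p_value = permutation_p_value(unemployment, inflation)
reject_null = p_value < alpha
print(p_value, reject_null)
```

Enumerating every permutation is only practical for very small samples (six values give 720 orderings); larger samples would typically use random resampling or a parametric test instead.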

Line 364: Line 376:

Regression analysis may be used when the analyst is trying to determine the extent to which independent variable X affects dependent variable Y (e.g., "To what extent do changes in the unemployment rate (X) affect the inflation rate (Y)?"). This is an attempt to model or fit an equation line or curve to the data, such that Y is a function of X.

−

当分析师试图确定自变量 x 对因变量 y 的影响程度时，可以使用'''~~回归分析~~'''~~分析法~~(~~例如，“失业率(x~~)的变化对通货膨胀率(~~y)的影响程度”~~) ~~.这是一个试图模型或拟合一个方程线或曲线的数据，这样 y 是一个函数的 x。~~

+

当分析师试图确定自变量 X 对因变量 Y 的影响程度时，可以使用'''回归分析Regression analysis'''的方法（例如，“失业率(X)的变化对通货膨胀率(Y)的影响程度如何？”）。这是一种用方程直线或曲线对数据进行建模或拟合的尝试，使得 Y 是 X 的函数。
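
Fitting a straight line so that Y is a function of X can be sketched with ordinary least squares. The helper `fit_line` and the data points are hypothetical; only the simplest case (one X, a straight line) is shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical observations: X could be an unemployment rate,
# Y an inflation rate.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = intercept + slope * 5.0  # model's estimate of Y at X = 5
print(slope, intercept, predicted)
```

The slope estimates how much Y changes per unit change in X, which is the "extent" the paragraph above refers to.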

Line 372: Line 384:

[https://www.erim.eur.nl/centres/necessary-condition-analysis/ Necessary condition analysis] (NCA) may be used when the analyst is trying to determine the extent to which independent variable X allows variable Y (e.g., "To what extent is a certain unemployment rate (X) necessary for a certain inflation rate (Y)?"). Whereas (multiple) regression analysis uses additive logic where each X-variable can produce the outcome and the X's can compensate for each other (they are sufficient but not necessary), necessary condition analysis (NCA) uses necessity logic, where one or more X-variables allow the outcome to exist, but may not produce it (they are necessary but not sufficient). Each single necessary condition must be present and compensation is not possible.

−

当分析师试图确定自变量 x 在多大程度上允许变量 ~~y 时，可以使用~~ https://www.erim.eur.nl/centres/Necessary-condition-analysis/ ~~必要条件分析~~(NCA)(~~例如，“在多大程度上某个失业率(x~~)~~对某个通货膨胀率~~(y)是必要的? ”) .然而(多重)回归分析分析使用附加逻辑，其中每个 x 变量可以产生结果，x 可以相互补偿(他们是充分的，但不是必要的) ，必要条件分析(NCA)使用必要逻辑，其中一个或多个 x 变量允许结果存在，但可能不产生它(他们是必要的，但不是充分的)。每一个单一的必要条件必须存在，补偿是不可能的。

+

当分析师试图确定自变量 X 在多大程度上允许变量 Y 的出现时，可以使用 [https://www.erim.eur.nl/centres/necessary-condition-analysis/ '''必要条件分析Necessary condition analysis''']（NCA）（例如，“某个失业率(X)在多大程度上对某个通货膨胀率(Y)是必要的？”）。（多重）回归分析使用'''加法逻辑additive logic'''，其中每个 X 变量都可以产生结果，且各个 X 之间可以相互补偿（它们是充分的，但不是必要的）；而必要条件分析使用'''必要逻辑necessity logic'''，其中一个或多个 X 变量允许结果的存在，但未必能产生这个结果（它们是必要的，但不是充分的）。每个单一的必要条件都必须存在，变量之间不允许补偿。
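
The contrast between additive logic and necessity logic can be shown with a toy model. This is a deliberately simplified illustration and not the NCA procedure itself: both functions and the input levels are assumptions. In the additive model a high value of one X compensates for a low value of another; in the necessity model the lowest X caps the outcome.

```python
def additive_outcome(xs):
    """Additive logic: each X contributes, so the Xs can compensate
    for one another."""
    return sum(xs)

def necessity_ceiling(xs):
    """Necessity logic (toy version): the outcome cannot exceed the level
    allowed by the weakest condition, so no compensation is possible."""
    return min(xs)

# Hypothetical condition levels: the first condition is absent (0),
# the second is strongly present (10).
conditions = [0, 10]

print(additive_outcome(conditions))   # the second X compensates for the first
print(necessity_ceiling(conditions))  # the missing first X blocks the outcome
```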

−

==Analytical activities of data users==

嘉树 (259 edits)
