Difference between revisions of "Rubin causal model"
Revision as of 15:18, 25 May 2021 (Tuesday)
The Rubin causal model (RCM), also known as the Neyman–Rubin causal model,[1] is an approach to the statistical analysis of cause and effect based on the framework of potential outcomes, named after Donald Rubin. The name "Rubin causal model" was first coined by Paul W. Holland.[2] The potential outcomes framework was first proposed by Jerzy Neyman in his 1923 Master's thesis,[3] though he discussed it only in the context of completely randomized experiments.[4] Rubin extended it into a general framework for thinking about causation in both observational and experimental studies.[1]
Introduction
The Rubin causal model is based on the idea of potential outcomes. For example, a person would have a particular income at age 40 if he had attended college, whereas he would have a different income at age 40 if he had not attended college. To measure the causal effect of going to college for this person, we need to compare the outcome for the same individual in both alternative futures. Since it is impossible to see both potential outcomes at once, one of the potential outcomes is always missing. This dilemma is the "fundamental problem of causal inference".
Because of the fundamental problem of causal inference, unit-level causal effects cannot be directly observed. However, randomized experiments allow for the estimation of population-level causal effects.[5] A randomized experiment assigns people randomly to treatments: college or no college. Because of this random assignment, the groups are (on average) equivalent, and the difference in income at age 40 can be attributed to the college assignment since that was the only difference between the groups. An estimate of the average causal effect (also referred to as the average treatment effect) can then be obtained by computing the difference in means between the treated (college-attending) and control (not-college-attending) samples.
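The difference-in-means estimate described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the income distribution, the constant effect of 5,000, and the function name `simulate_randomized_experiment` are all hypothetical and chosen for illustration, not taken from any study.

```python
import random
import statistics


def simulate_randomized_experiment(n_units=10_000, true_effect=5_000.0, seed=0):
    """Simulate a completely randomized experiment; return the difference in means.

    All numbers are hypothetical. Every unit has two potential outcomes,
    but only the one matching its assignment is ever recorded.
    """
    rng = random.Random(seed)
    treated_outcomes, control_outcomes = [], []
    for _ in range(n_units):
        y_control = rng.gauss(50_000.0, 10_000.0)  # income at 40 without college
        y_treated = y_control + true_effect         # income at 40 with college
        # Random assignment: a fair coin flip, independent of the potential outcomes.
        if rng.random() < 0.5:
            treated_outcomes.append(y_treated)      # y_control stays unobserved
        else:
            control_outcomes.append(y_control)      # y_treated stays unobserved
    return statistics.fmean(treated_outcomes) - statistics.fmean(control_outcomes)


estimate = simulate_randomized_experiment()
print(f"estimated average treatment effect: {estimate:.0f}")
```

Because assignment is independent of the potential outcomes, the estimate should land close to the assumed effect of 5,000, even though no single unit's causal effect is ever observed.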
In many circumstances, however, randomized experiments are not possible due to ethical or practical concerns. In such scenarios there is a non-random assignment mechanism. This is the case for the example of college attendance: people are not randomly assigned to attend college. Rather, people may choose to attend college based on their financial situation, parents' education, and so on. Many statistical methods have been developed for causal inference, such as propensity score matching. These methods attempt to correct for the assignment mechanism by finding control units similar to treatment units.
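To see why a non-random assignment mechanism matters, the sketch below simulates a case where a single observed covariate (a hypothetical "wealthy family" indicator) influences both college attendance and income. It then compares the naive difference in means with an estimate that compares treated and control units only within groups sharing the same covariate value. This stratification is a deliberately simplified stand-in for propensity score matching, not an implementation of it, and every number and name here is an assumption made for illustration.

```python
import random
import statistics


def simulate_observational_study(n_units=20_000, true_effect=5_000.0, seed=1):
    """Return rows of (wealthy, treated, observed_outcome) with confounded assignment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_units):
        wealthy = rng.random() < 0.5
        # The covariate shifts baseline income ...
        y_control = rng.gauss(60_000.0 if wealthy else 40_000.0, 5_000.0)
        y_treated = y_control + true_effect
        # ... and also shifts the probability of attending college.
        treated = rng.random() < (0.8 if wealthy else 0.2)
        rows.append((wealthy, treated, y_treated if treated else y_control))
    return rows


def difference_in_means(rows):
    """Mean observed outcome of treated units minus that of control units."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return statistics.fmean(treated) - statistics.fmean(control)


def stratified_estimate(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        total += len(stratum) / len(rows) * difference_in_means(stratum)
    return total


rows = simulate_observational_study()
naive = difference_in_means(rows)
adjusted = stratified_estimate(rows)
print(f"naive: {naive:.0f}, adjusted: {adjusted:.0f}")
```

In this setup the naive comparison should be far larger than the assumed effect of 5,000, because college attendees are disproportionately wealthy, while the stratified estimate should come out close to it. The correction works here only because the one confounder is observed; with unobserved confounders it would not.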
For more on the connections between the Rubin causal model, structural equation modeling, and other statistical methods for causal inference, see Morgan and Winship (2007).
An extended example
Rubin defines a causal effect:
"Intuitively, the causal effect of one treatment, E, over another, C, for a particular unit and an interval of time from [math]\displaystyle{ t_1 }[/math] to [math]\displaystyle{ t_2 }[/math] is the difference between what would have happened at time [math]\displaystyle{ t_2 }[/math] if the unit had been exposed to E initiated at [math]\displaystyle{ t_1 }[/math] and what would have happened at [math]\displaystyle{ t_2 }[/math] if the unit had been exposed to C initiated at [math]\displaystyle{ t_1 }[/math]: 'If an hour ago I had taken two aspirins instead of just a glass of water, my headache would now be gone,' or 'because an hour ago I took two aspirins instead of just a glass of water, my headache is now gone.' Our definition of the causal effect of the E versus C treatment will reflect this intuitive meaning."[5]
According to the RCM, the causal effect of your taking or not taking aspirin one hour ago is the difference between how your head would have felt in case 1 (taking the aspirin) and case 2 (not taking the aspirin). If your headache would remain without aspirin but disappear if you took aspirin, then the causal effect of taking aspirin is headache relief. In most circumstances, we are interested in comparing two futures, one generally termed "treatment" and the other "control". These labels are somewhat arbitrary.
Potential outcomes
Suppose that Joe is participating in an FDA test for a new hypertension drug. If we were omniscient, we would know the outcomes for Joe under both treatment (the new drug) and control (either no treatment or the current standard treatment). The causal effect, or treatment effect, is the difference between these two potential outcomes.
subject | [math]\displaystyle{ Y_t(u) }[/math] | [math]\displaystyle{ Y_c(u) }[/math] | [math]\displaystyle{ Y_t(u) - Y_c(u) }[/math]
Joe | 130 | 135 | −5
[math]\displaystyle{ Y_t(u) }[/math] is Joe's blood pressure if he takes the new pill. In general, this notation expresses the potential outcome which results from a treatment, t, on a unit, u. Similarly, [math]\displaystyle{ Y_c(u) }[/math] is the potential outcome that results from a different treatment, c or control, on a unit, u. In this case, [math]\displaystyle{ Y_c(u) }[/math] is Joe's blood pressure if he doesn't take the pill. [math]\displaystyle{ Y_t(u) - Y_c(u) }[/math] is the causal effect of taking the new drug: for Joe, 130 − 135 = −5.
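The notation can be made concrete with a short sketch. The `Unit` class and its method names are hypothetical, introduced only for illustration; the numbers are Joe's two potential outcomes as given in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A unit u with both potential outcomes (known only to an omniscient observer)."""
    name: str
    y_treatment: float  # Y_t(u): outcome if the unit receives the treatment
    y_control: float    # Y_c(u): outcome if the unit receives the control

    def causal_effect(self) -> float:
        """Unit-level causal effect Y_t(u) - Y_c(u)."""
        return self.y_treatment - self.y_control

    def observed_outcome(self, treated: bool) -> float:
        """The single outcome that is actually seen, given the assignment."""
        return self.y_treatment if treated else self.y_control


joe = Unit("Joe", 130, 135)
print(joe.causal_effect())  # -5
```

In practice only `observed_outcome` is available for any given unit, so `causal_effect` cannot be computed from real data. That gap is the fundamental problem of causal inference.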
This page was moved from wikipedia:en:Rubin causal model. Its edit history can be viewed at 鲁宾因果框架/edithistory
- [1] Sekhon, Jasjeet (2007). "The Neyman–Rubin Model of Causal Inference and Estimation via Matching Methods". The Oxford Handbook of Political Methodology. http://sekhon.berkeley.edu/papers/SekhonOxfordHandbook.pdf.
- [2] Holland, Paul W. (1986). "Statistics and Causal Inference". J. Amer. Statist. Assoc. 81 (396): 945–960. doi:10.1080/01621459.1986.10478354. JSTOR 2289064.
- [3] Neyman, Jerzy. Sur les applications de la theorie des probabilites aux experiences agricoles: Essai des principes. Master's Thesis (1923). Excerpts reprinted in English, Statistical Science, Vol. 5, pp. 463–472. (D. M. Dabrowska, and T. P. Speed, Translators.)
- [4] Rubin, Donald (2005). "Causal Inference Using Potential Outcomes". J. Amer. Statist. Assoc. 100 (469): 322–331. doi:10.1198/016214504000001880.
- [5] Rubin, Donald (1974). "Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies". J. Educ. Psychol. 66 (5): 688–701 [p. 689]. doi:10.1037/h0037350.