Simpson's paradox
Simpson's paradox is a paradox from statistics. It is named after Edward H. Simpson, a British statistician who first described it in 1951.[1] The statistician Karl Pearson described a very similar effect in 1899,[2] and Udny Yule's description dates from 1903.[3] It is therefore sometimes called the Yule–Simpson effect. The paradox is that a statistical measure computed for several groups, such as a success rate, can change depending on whether the groups are examined one by one or combined into a larger group. This often occurs in social science and medical statistics.[4] It can confuse people when frequency data are used to explain a causal relationship.[5] Other names for the paradox include reversal paradox and amalgamation paradox.[6]
Example: kidney stone treatment
This is a real-life example from a medical study[7] comparing the success rates of two treatments for kidney stones.[8]
The table shows the success rates and numbers of patients for treatments of both small and large kidney stones, where Treatment A includes all open procedures and Treatment B is percutaneous nephrolithotomy:
| Stone size | Treatment | Group | Successes (patients) | Failures (patients) | Success rate | Failure rate |
|---|---|---|---|---|---|---|
| Small stones | A | Group 1 | 81 | 6 | 93% | 7% |
| Small stones | B | Group 2 | 234 | 36 | 87% | 13% |
| Large stones | A | Group 3 | 192 | 71 | 73% | 27% |
| Large stones | B | Group 4 | 55 | 25 | 69% | 31% |
| Both | A | Group 1+3 | 273 | 77 | 78% | 22% |
| Both | B | Group 2+4 | 289 | 61 | 83% | 17% |
The paradoxical conclusion is that treatment A is more effective on small stones, and also on large stones, yet treatment B appears more effective when both sizes are considered together. In this example, it was not known that the size of the kidney stone influenced the result. A variable of this kind is called a hidden variable (or lurking variable) in statistics.
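Written out with the counts from the table, the reversal amounts to three inequalities between success ratios. The fractions are taken directly from the table; the decimal values are rounded.

$$
\begin{aligned}
\text{Small stones:}\quad & \frac{81}{87} \approx 0.93 \;>\; \frac{234}{270} \approx 0.87 \\
\text{Large stones:}\quad & \frac{192}{263} \approx 0.73 \;>\; \frac{55}{80} \approx 0.69 \\
\text{Both sizes:}\quad & \frac{81+192}{87+263} = \frac{273}{350} = 0.78 \;<\; \frac{234+55}{270+80} = \frac{289}{350} \approx 0.83
\end{aligned}
$$

In each row the left-hand side is treatment A and the right-hand side is treatment B. Adding numerators and denominators separately is not the same as averaging the two ratios, which is why the third inequality can point the other way.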
Which treatment is considered better is determined by an inequality between two ratios (successes/total). The reversal of the inequality between the ratios, which creates Simpson's paradox, happens because two effects occur together:
- The sizes of the groups, which are combined when the lurking variable is ignored, are very different. Doctors tend to give the severe cases (large stones) the better treatment (A), and the milder cases (small stones) the inferior treatment (B). Therefore, the totals are dominated by groups 3 and 2, and not by the two much smaller groups 1 and 4.
- The lurking variable has a large effect on the ratios, i.e. the success rate is more strongly influenced by the severity of the case than by the choice of treatment. Therefore, the group of patients with large stones using treatment A (group 3) does worse than the group with small stones, even though the latter used the inferior treatment B (group 2).
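Both effects can be checked numerically. The following is a minimal sketch that uses only the counts from the table; the names `groups`, `success_rate`, `pooled_rate` and `large_stone_share` are illustrative choices, not part of the original study.

```python
from fractions import Fraction

# (successes, failures) per group, copied from the table
groups = {
    ("A", "small"): (81, 6),     # Group 1
    ("B", "small"): (234, 36),   # Group 2
    ("A", "large"): (192, 71),   # Group 3
    ("B", "large"): (55, 25),    # Group 4
}


def success_rate(successes, failures):
    """Exact success ratio: successes / total."""
    return Fraction(successes, successes + failures)


def pooled_rate(treatment):
    """Success ratio for a treatment when stone size is ignored."""
    successes = sum(s for (t, _), (s, _f) in groups.items() if t == treatment)
    failures = sum(f for (t, _), (_s, f) in groups.items() if t == treatment)
    return success_rate(successes, failures)


def large_stone_share(treatment):
    """Fraction of a treatment's patients who had large stones."""
    large = sum(groups[(treatment, "large")])
    total = sum(s + f for (t, _), (s, f) in groups.items() if t == treatment)
    return Fraction(large, total)


# Within each stone size, compare A against B.
for size in ("small", "large"):
    a = success_rate(*groups[("A", size)])
    b = success_rate(*groups[("B", size)])
    print(f"{size} stones: A {float(a):.1%} vs B {float(b):.1%}")

# Ignoring stone size, compare the pooled ratios and the case mix.
for treatment in ("A", "B"):
    print(
        f"treatment {treatment}: pooled {float(pooled_rate(treatment)):.1%}, "
        f"large-stone share {float(large_stone_share(treatment)):.1%}"
    )
```

Treatment A wins within each stone size, but about 75% of its patients (263 of 350) had large stones, compared with about 23% (80 of 350) for treatment B. The pooled ratio for A is therefore weighted toward the harder cases, which is enough to reverse the comparison.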
References
- [1] Simpson, Edward H. (1951). "The Interpretation of Interaction in Contingency Tables". Journal of the Royal Statistical Society, Ser. B. 13: 238–241.
- [2] Pearson, Karl; Lee, A.; Bramley-Moore, L. (1899). "Genetic (reproductive) selection: Inheritance of fertility in man". Philosophical Transactions of the Royal Society, Ser. A. 173: 534–539.
- [3] G. U. Yule (1903). "Notes on the Theory of Association of Attributes in Statistics". Biometrika. 2 (2): 121–134. doi:10.1093/biomet/2.2.121.
- [4] Clifford H. Wagner (February 1982). "Simpson's Paradox in Real Life". The American Statistician. 36 (1): 46–48. doi:10.2307/2684093. JSTOR 2684093.
- [5] Judea Pearl. Causality: Models, Reasoning, and Inference, Cambridge University Press (2000, 2nd edition 2009). ISBN 0-521-77362-8.
- [6] I. J. Good, Y. Mittal (June 1987). "The Amalgamation and Geometry of Two-by-Two Contingency Tables". The Annals of Statistics. 15 (2): 694–711. doi:10.1214/aos/1176350369. ISSN 0090-5364. JSTOR 2241334.
- [7] C. R. Charig; D. R. Webb; S. R. Payne; J. E. Wickham (29 March 1986). "Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy". Br Med J (Clin Res Ed). 292 (6524): 879–882. doi:10.1136/bmj.292.6524.879. PMC 1339981. PMID 3083922.
- [8] Steven A. Julious and Mark A. Mullee (3 December 1994). "Confounding and Simpson's paradox". BMJ. 309 (6967): 1480–1481. doi:10.1136/bmj.309.6967.1480. PMC 2541623. PMID 7804052.