Difference between revisions of "Matching"

From 集智百科 - Complex Systems | Artificial Intelligence | Complexity Science | Complex Networks | Self-Organization

Revision as of 10:03, 27 June 2021 (Sunday)

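As a statistical technique, matching evaluates the effect of a treatment by comparing treated and untreated units in an observational study or quasi-experiment (that is, when treatment is not randomly assigned). The goal is to find, for every treated unit, one or more untreated units with similar observable characteristics against which its outcome can be compared. By pairing treated units with similar untreated units, matching allows the outcomes of the two groups to be compared in order to estimate the treatment effect while reducing bias due to confounding.[1][2][3] Propensity score matching, an early matching technique, was developed as part of the Rubin causal model,[4] but it has been argued to increase model dependence, bias, and inefficiency, and it is no longer recommended over other matching methods.[5]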
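The idea of pairing each treated unit with a similar untreated unit can be sketched as follows. This is a minimal illustration only: it assumes greedy 1:1 nearest-neighbor matching without replacement under Euclidean distance, which is one of many possible matching rules. The function name `nearest_neighbor_match` and the toy data are hypothetical and do not come from the cited literature.

```python
import math

def nearest_neighbor_match(treated, controls):
    """Greedy 1:1 matching without replacement.

    treated, controls: lists of (covariates, outcome) tuples, where
    covariates is a tuple of numbers. Returns a list of
    (treated_index, control_index) pairs.
    """
    available = set(range(len(controls)))
    pairs = []
    for i, (x_t, _) in enumerate(treated):
        if not available:
            break  # no untreated units left to match
        # Pick the closest still-unused control; ties go to the lower index.
        j = min(available, key=lambda k: (math.dist(x_t, controls[k][0]), k))
        pairs.append((i, j))
        available.remove(j)
    return pairs

# Hypothetical toy data: covariates are (age, prior income in thousands).
treated = [((25, 10.0), 14.0), ((40, 30.0), 33.0)]
controls = [((41, 29.0), 30.0), ((60, 50.0), 52.0), ((24, 11.0), 12.0)]

pairs = nearest_neighbor_match(treated, controls)
# Average outcome difference across the matched pairs.
effect = sum(treated[i][1] - controls[j][1] for i, j in pairs) / len(pairs)
print(pairs, effect)
```

In practice the covariates would usually be rescaled before computing distances, and the result of a greedy procedure depends on the order in which treated units are processed.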


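Matching was promoted by Donald Rubin.[4] In economics it was prominently criticized by LaLonde (1986),[6] who compared treatment-effect estimates from an experiment with comparable estimates produced by matching methods and showed that the matching methods were biased. Dehejia and Wahba (1999) re-evaluated LaLonde's critique and argued that matching is a good solution.[7] Similar critiques have been raised in political science[8] and sociology[9] journals.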


Analysis

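When the outcome of interest is binary, the most common tool for analyzing matched data is conditional logistic regression, because it handles strata of arbitrary size and continuous or binary treatments (predictors) and can control for covariates. In particular cases, simpler tests such as the paired difference test, McNemar's test, and the Cochran–Mantel–Haenszel test are available.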
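For 1:1 matched pairs with a binary outcome, McNemar's test uses only the discordant pairs. The sketch below assumes the exact (binomial) two-sided version of the test. The function name `mcnemar_exact` and the pair counts are hypothetical and chosen purely for illustration.

```python
from math import comb

def mcnemar_exact(pairs):
    """Exact two-sided McNemar test for matched pairs with binary outcomes.

    pairs: list of (treated_outcome, control_outcome), each 0 or 1.
    Returns (b, c, p_value), where b counts pairs with outcomes (1, 0)
    and c counts pairs with outcomes (0, 1).
    """
    b = sum(1 for t, u in pairs if t == 1 and u == 0)
    c = sum(1 for t, u in pairs if t == 0 and u == 1)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence either way
    # Under the null hypothesis b ~ Binomial(n, 1/2); double the smaller tail.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Hypothetical matched pairs: 8 and 2 discordant, 10 concordant.
pairs = [(1, 0)] * 8 + [(0, 1)] * 2 + [(1, 1)] * 5 + [(0, 0)] * 5
b, c, p_value = mcnemar_exact(pairs)
print(b, c, p_value)
```

Concordant pairs do not enter the test statistic, which is why only `b` and `c` are counted.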


When the outcome of interest is continuous, the average treatment effect is estimated.
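With a continuous outcome and 1:1 matched pairs, one simple estimate is the mean of the within-pair outcome differences, with a paired standard error. The sketch below assumes that the pairs have already been formed and that each treated unit has exactly one matched control. Under that design the quantity is usually read as an effect among the treated units rather than for the whole population. The function name `paired_effect` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_effect(treated_outcomes, control_outcomes):
    """Mean within-pair difference and its standard error.

    The two lists are aligned: position i holds the outcomes of the
    treated unit and of its matched control in pair i.
    """
    diffs = [t - c for t, c in zip(treated_outcomes, control_outcomes)]
    estimate = mean(diffs)
    # Sample standard deviation of the differences over sqrt(n).
    se = stdev(diffs) / sqrt(len(diffs))
    return estimate, se

# Hypothetical outcomes for four matched pairs.
treated_y = [14.0, 33.0, 21.0, 27.0]
control_y = [12.0, 30.0, 20.0, 25.0]

estimate, se = paired_effect(treated_y, control_y)
print(estimate, se)
```

Dividing `estimate` by `se` gives the statistic of a paired difference test.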


Matching can also be used to "preprocess" a sample before it is analyzed with another technique, such as regression analysis.[10]


Overmatching

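Overmatching is matching on an apparent mediator that is in fact a result of the exposure. If the mediator itself is stratified, the relation between exposure and disease is very likely to be obscured.[11] Overmatching thus causes statistical bias.[11]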


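For example, matching the control group by gestation length and/or the number of multiple births when estimating perinatal mortality and birthweight after in vitro fertilization (IVF) is overmatching, since IVF itself increases the risk of premature birth and multiple birth.[12]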


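It may be regarded as a sampling bias that reduces the external validity of a study, because the controls become more similar to the cases with respect to exposure than the general population is.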
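The way overmatching can hide an effect can be illustrated with a small calculation. All counts below are invented for illustration and are not taken from the IVF study cited above. The toy setup assumes that exposure changes the outcome only by making preterm birth more likely, so that comparing exposed and unexposed births within the same gestation stratum removes the effect entirely.

```python
# Hypothetical counts, keyed by (exposed, preterm):
# value = (number of births, number of low-birthweight births).
counts = {
    (1, 1): (40, 20),
    (1, 0): (60, 6),
    (0, 1): (10, 5),
    (0, 0): (90, 9),
}

def risk(exposed, preterm=None):
    """Risk of low birthweight, optionally within one gestation stratum."""
    cells = [
        v for (e, m), v in counts.items()
        if e == exposed and (preterm is None or m == preterm)
    ]
    births = sum(n for n, _ in cells)
    events = sum(k for _, k in cells)
    return events / births

# Unmatched comparison: exposed versus unexposed overall.
crude_rd = risk(1) - risk(0)
# Comparison within strata of the mediator, as matching on it would do.
stratum_rd = {m: risk(1, m) - risk(0, m) for m in (0, 1)}
print(crude_rd, stratum_rd)
```

In this toy table the overall risk difference is about 0.12, while the difference within each gestation stratum is zero, so an analysis matched on gestation would report no effect of the exposure.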


See also


References

  1. Rubin, Donald B. (1973). "Matching to Remove Bias in Observational Studies". Biometrics. 29 (1): 159–183. doi:10.2307/2529684. JSTOR 2529684.
  2. Anderson, Dallas W.; Kish, Leslie; Cornell, Richard G. (1980). "On Stratification, Grouping and Matching". Scandinavian Journal of Statistics. 7 (2): 61–66. JSTOR 4615774.
  3. Kupper, Lawrence L.; Karon, John M.; Kleinbaum, David G.; Morgenstern, Hal; Lewis, Donald K. (1981). "Matching in Epidemiologic Studies: Validity and Efficiency Considerations". Biometrics. 37 (2): 271–291. CiteSeerX 10.1.1.154.1197. doi:10.2307/2530417. JSTOR 2530417. PMID 7272415.
  4. Rosenbaum, Paul R.; Rubin, Donald B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects". Biometrika. 70 (1): 41–55. doi:10.1093/biomet/70.1.41.
  5. King, Gary; Nielsen, Richard (October 2019). "Why Propensity Scores Should Not Be Used for Matching". Political Analysis. 27 (4): 435–454. doi:10.1017/pan.2019.11. ISSN 1047-1987.
  6. LaLonde, Robert J. (1986). "Evaluating the Econometric Evaluations of Training Programs with Experimental Data". American Economic Review. 76 (4): 604–620. JSTOR 1806062.
  7. Dehejia, R. H.; Wahba, S. (1999). "Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs" (PDF). Journal of the American Statistical Association. 94 (448): 1053–1062. doi:10.1080/01621459.1999.10473858.
  8. Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2006). "Comparing Experimental and Matching Methods Using a Large-Scale Field Experiment on Voter Mobilization". Political Analysis. 14 (1): 37–62. doi:10.1093/pan/mpj001.
  9. Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2010). "A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark". Sociological Methods & Research. 39 (2): 256–282. doi:10.1177/0049124110378098.
  10. Ho, Daniel E.; Imai, Kosuke; King, Gary; Stuart, Elizabeth A. (2007). "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference". Political Analysis. 15 (3): 199–236. doi:10.1093/pan/mpl013.
  11. Marsh, J. L.; Hutton, J. L.; Binks, K. (2002). "Removal of radiation dose response effects: an example of over-matching". British Medical Journal. 325 (7359): 327–330. doi:10.1136/bmj.325.7359.327. PMC 1123834. PMID 12169512.
  12. Gissler, M.; Hemminki, E. (1996). "The danger of overmatching in studies of the perinatal mortality and birthweight of infants born after assisted conception". Eur J Obstet Gynecol Reprod Biol. 69 (2): 73–75. doi:10.1016/0301-2115(95)02517-0. PMID 8902436.


Further reading

  • Angrist, Joshua D.; Pischke, Jörn-Steffen (2009). "Regression Meets Matching". Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. pp. 69–80. ISBN 978-0-691-12034-8. 



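This Chinese entry was compiled with contributions from user Sikongpop, reviewed by LFZ, and edited by 薄荷; comments are welcome on the discussion page.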


The content of this entry is drawn from Wikipedia and public sources and is released under the CC 3.0 license.