Changes

16 bytes removed, 10:58, 3 July 2022 (Sunday)
Line 1: Line 1: −
'''Inverse probability weighting''' is a statistical technique for calculating statistics standardized to a pseudo-population different from that in which the data was collected. Study designs with a disparate sampling population and population of target inference (target population) are common in application<ref>Robins, JM; Rotnitzky, A; Zhao, LP (1994). "Estimation of regression coefficients when some regressors are not always observed". Journal of the American Statistical Association. 89 (427): 846–866. doi:10.1080/01621459.1994.10476818</ref>. There may be prohibitive factors barring researchers from directly sampling from the target population such as cost, time, or ethical concerns<ref>Breslow, NE; Lumley, T; et al. (2009). "Using the Whole Cohort in the Analysis of Case-Cohort Data". Am J Epidemiol. 169 (11): 1398–1405. doi:10.1093/aje/kwp055. PMC 2768499. <nowiki>PMID 19357328</nowiki>.</ref>. A solution to this problem is to use an alternate design strategy, e.g. stratified sampling. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.
+
'''Inverse probability weighting''' is a statistical technique for calculating statistics standardized to a pseudo-population different from that in which the data was collected. Study designs with a disparate sampling population and population of target inference (target population) are common in application<ref name=":1">Robins, JM; Rotnitzky, A; Zhao, LP (1994). "Estimation of regression coefficients when some regressors are not always observed". Journal of the American Statistical Association. 89 (427): 846–866. doi:10.1080/01621459.1994.10476818</ref>. There may be prohibitive factors barring researchers from directly sampling from the target population such as cost, time, or ethical concerns<ref name=":2">Breslow, NE; Lumley, T; et al. (2009). "Using the Whole Cohort in the Analysis of Case-Cohort Data". Am J Epidemiol. 169 (11): 1398–1405. doi:10.1093/aje/kwp055. PMC 2768499. <nowiki>PMID 19357328</nowiki>.</ref>. A solution to this problem is to use an alternate design strategy, e.g. stratified sampling. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.
   −
One of the earliest weighted estimators is the Horvitz–Thompson estimator of the mean<ref>Horvitz, D. G.; Thompson, D. J. (1952). "A generalization of sampling without replacement from a finite universe". ''Journal of the American Statistical Association''. '''47''' (260): 663–685. doi:10.1080/01621459.1952.10483446</ref>. When the probability with which each unit of the target population is drawn into the sample is known, the inverse of this probability is used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are weighted likelihoods, weighted estimating equations, and weighted probability densities, from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as marginal structural models, the standardized mortality ratio, and the EM algorithm for coarsened or aggregate data.
+
One of the earliest weighted estimators is the Horvitz–Thompson estimator of the mean<ref name=":3">Horvitz, D. G.; Thompson, D. J. (1952). "A generalization of sampling without replacement from a finite universe". ''Journal of the American Statistical Association''. '''47''' (260): 663–685. doi:10.1080/01621459.1952.10483446</ref>. When the probability with which each unit of the target population is drawn into the sample is known, the inverse of this probability is used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are weighted likelihoods, weighted estimating equations, and weighted probability densities, from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as marginal structural models, the standardized mortality ratio, and the EM algorithm for coarsened or aggregate data.
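As a rough illustration of the weighting idea in the paragraph above, the following Python sketch computes a Horvitz–Thompson estimate of a population mean. The function name, the toy stratified sample, and the assumption that the population size and every inclusion probability are known exactly are illustrative choices, not taken from the cited paper.

```python
def horvitz_thompson_mean(values, inclusion_probs, population_size):
    """Estimate a population mean as (1/N) * sum(y_i / pi_i) over sampled units.

    Assumes (hypothetically) that each sampled unit's inclusion probability
    pi_i and the population size N are known by design.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    return weighted_total / population_size


# Toy population of N = 6: four units with y = 10 and two units with y = 40,
# so the true mean is 20. One of the four low units is sampled (pi = 0.25)
# and both high units are sampled (pi = 1.0).
sample_values = [10.0, 40.0, 40.0]
sample_probs = [0.25, 1.0, 1.0]

ht_estimate = horvitz_thompson_mean(sample_values, sample_probs, population_size=6)
unweighted_mean = sum(sample_values) / len(sample_values)
print(ht_estimate, unweighted_mean)
```

In this toy design the unweighted sample mean (30) over-represents the fully sampled stratum, while the weighted estimate equals the population mean of 20.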
   −
Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis<ref>Hernan, MA; Robins, JM (2006). "Estimating Causal Effects From Epidemiological Data". ''J Epidemiol Community Health''. '''60''' (7): 578–596. CiteSeerX 10.1.1.157.9366. doi:10.1136/jech.2004.029496. PMC 2652882. <nowiki>PMID 16790829</nowiki></ref>. With an estimate of the sampling probability, or the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weight for subjects who are under-represented due to a large degree of missing data.
+
Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis<ref name=":4">Hernan, MA; Robins, JM (2006). "Estimating Causal Effects From Epidemiological Data". ''J Epidemiol Community Health''. '''60''' (7): 578–596. CiteSeerX 10.1.1.157.9366. doi:10.1136/jech.2004.029496. PMC 2652882. <nowiki>PMID 16790829</nowiki></ref>. With an estimate of the sampling probability, or the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weight for subjects who are under-represented due to a large degree of missing data.
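The missing-data use described above can be sketched in Python as follows. This is a minimal example under stated assumptions: the record layout, the stratum labels, and the choice to estimate each subject's probability of being observed by the observed fraction within its stratum are all hypothetical, and the sketch presumes that outcomes are missing at random within a stratum.

```python
from collections import Counter


def ipw_mean_with_missing(records):
    """Weighted complete-case mean from (stratum, y) pairs; y is None if missing.

    Assumption for this sketch: within a stratum, outcomes are missing at
    random, so the observed fraction estimates the probability of observation.
    """
    n_total = Counter(stratum for stratum, _ in records)
    n_observed = Counter(stratum for stratum, y in records if y is not None)
    if not n_observed:
        raise ValueError("no observed outcomes")
    weighted_sum = 0.0
    weight_total = 0.0
    for stratum, y in records:
        if y is None:
            continue  # subjects with missing outcomes are excluded
        prob_observed = n_observed[stratum] / n_total[stratum]
        weight = 1.0 / prob_observed  # inflate under-represented strata
        weighted_sum += weight * y
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical data: half of the "young" outcomes are missing.
records = [
    ("young", 2.0), ("young", 2.0), ("young", None), ("young", None),
    ("old", 6.0), ("old", 6.0), ("old", 6.0), ("old", 6.0),
]
observed = [y for _, y in records if y is not None]
complete_case_mean = sum(observed) / len(observed)
ipw_mean = ipw_mean_with_missing(records)
print(complete_case_mean, ipw_mean)
```

Here each observed "young" subject receives weight 2, so the weighted mean is 4, whereas the unweighted complete-case mean is pulled toward the fully observed "old" stratum.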
   −
【翻译】逆概率加权是一种统计技术,用于计算标准化与收集数据的伪总体不同的统计量。研究设计与不同的抽样总体和总体的目标推断(目标总体)是常见的应用。可能存在阻碍研究人员直接从目标人群中采样的因素,如成本、时间或伦理问题。解决这个问题的方法是使用另一种设计策略,例如分层抽样。如果正确使用加权,可能会提高效率,减少未加权估计量的偏差。
+
【翻译】逆概率加权是一种统计技术,用于计算标准化与收集数据的伪总体不同的统计量。研究设计与不同的抽样总体和总体的目标推断(目标总体)是常见的应用<ref name=":1" />。可能存在阻碍研究人员直接从目标人群中采样的因素,如成本、时间或伦理问题<ref name=":2" />。解决这个问题的方法是使用另一种设计策略,例如分层抽样。如果正确使用加权,可能会提高效率,减少未加权估计量的偏差。
   −
一个非常早期的加权估计是霍维茨汤姆森估计量的均值估计。当抽样概率已知时,从目标总体中抽取抽样总体,然后利用这个概率的逆值对观测值进行加权。这种方法已经在各种框架下推广到统计学的许多方面。特别是,有加权的可能性,加权的估计方程,和加权的概率密度,其中大多数统计数据都是从中导出的。这些应用编纂了其他统计和估计理论,例如边际结构模型、标准死亡率和用于粗化或汇总数据的 EM 算法。
+
一个非常早期的加权估计是霍维茨汤姆森估计量的均值估计<ref name=":3" />。当抽样概率已知时,从目标总体中抽取抽样总体,然后利用这个概率的逆值对观测值进行加权。这种方法已经在各种框架下推广到统计学的许多方面。特别是,有加权的可能性,加权的估计方程,和加权的概率密度,其中大多数统计数据都是从中导出的。这些应用编纂了其他统计和估计理论,例如边际结构模型、标准死亡率和用于粗化或汇总数据的 EM 算法。
   −
当缺失数据的受试者不能被包括在主要分析中时,逆概率加权也被用来解释缺失数据。通过对抽样概率的估计,或该因子在另一测量中被测量的概率,逆概率加权可以用来为由于大量数据缺失而代表性不足的受试者增加权重。
+
当缺失数据的受试者不能被包括在主要分析中时,逆概率加权也被用来解释缺失数据<ref name=":4" />。通过对抽样概率的估计,或该因子在另一测量中被测量的概率,逆概率加权可以用来为由于大量数据缺失而代表性不足的受试者增加权重。
 
== Inverse Probability Weighted Estimator (IPWE) ==
 
== Inverse Probability Weighted Estimator (IPWE) ==
 
The inverse probability weighting estimator can be used to estimate causal effects when the researcher cannot conduct a controlled experiment but has observational data to model. Because the treatment is assumed not to be randomly assigned, the goal is to estimate the counterfactual, or potential, outcome that would result if all subjects in the population were assigned either treatment.

The inverse probability weighting estimator can be used to estimate causal effects when the researcher cannot conduct a controlled experiment but has observational data to model. Because the treatment is assumed not to be randomly assigned, the goal is to estimate the counterfactual, or potential, outcome that would result if all subjects in the population were assigned either treatment.
   −
Suppose observed data are [[文件:Wiki-IPWE-Figure1.png]]  drawn i.i.d (independent and identically distributed) from unknown distribution P, where<syntaxhighlight lang="mathematica">
+
Suppose observed data are [[文件:Wiki-IPWE-Figure1.png]]  drawn i.i.d (independent and identically distributed) from unknown distribution P, where
<math>\begin{align}
  −
\mathbb{E}\left[ Y^{*}(a) \right]
  −
        =  \mathbb{E}_{(X,Y)}\left[ Y(X,a)\right]  =  \mathbb{E}_{(X,A,Y)}\left[ \frac{  Y \mathbf{1}(A=a) }{ P(A=a|X)} \right]. \qquad \cdots \cdots (*)
  −
\end{align}</math>
  −
</syntaxhighlight>
   
* [[文件:Wiki-IPWE-Figure2.png]] covariates
 
* [[文件:Wiki-IPWE-Figure2.png]] covariates
 
* [[文件:Wiki-IPWE-Figure3.png]] are the two possible treatments.
 
* [[文件:Wiki-IPWE-Figure3.png]] are the two possible treatments.
Line 26: Line 21:
The goal is to estimate the potential outcome, [[文件:Wiki-IPWE-Figure5.png]], that would be observed if the subject were assigned treatment <math>a</math>, and then to compare the mean outcome if all patients in the population were assigned either treatment: [[文件:Wiki-IPWE-Figure6.png]]. We want to estimate <math>\mathbb{E}\left[ Y^{*}(a) \right]</math> using the observed data [[文件:Wiki-IPWE-Figure1.png]].

The goal is to estimate the potential outcome, [[文件:Wiki-IPWE-Figure5.png]], that would be observed if the subject were assigned treatment <math>a</math>, and then to compare the mean outcome if all patients in the population were assigned either treatment: [[文件:Wiki-IPWE-Figure6.png]]. We want to estimate <math>\mathbb{E}\left[ Y^{*}(a) \right]</math> using the observed data [[文件:Wiki-IPWE-Figure1.png]].
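The estimation goal above can be illustrated with a small Python sketch of the sample analogue of the weighted mean, (1/n) Σ Y_i 1(A_i = a) / p(a | X_i). Everything specific here is an assumption made for illustration: the data layout as (x, a, y) tuples, a single discrete covariate, and a propensity estimated by the treatment frequency within each covariate level. The sketch also presumes positivity, meaning every covariate level contains subjects on treatment a.

```python
from collections import Counter


def ipw_mean_outcome(data, a):
    """Estimate E[Y*(a)] from (x, a_i, y_i) tuples with a discrete covariate x.

    The propensity P(A = a | X = x) is estimated by the observed frequency of
    treatment a within each level of x (an assumption of this sketch).
    """
    n = len(data)
    n_x = Counter(x for x, _, _ in data)
    n_xa = Counter(x for x, a_i, _ in data if a_i == a)
    total = 0.0
    for x, a_i, y in data:
        if a_i == a:
            propensity = n_xa[x] / n_x[x]
            total += y / propensity  # weight by the inverse propensity
    return total / n


# Hypothetical confounded data: y = 1 + 2*a + 3*x, and subjects with x = 1
# are more likely to be treated, so the true effect of treatment is 2.
data = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
]

treated = [y for _, a_i, y in data if a_i == 1]
control = [y for _, a_i, y in data if a_i == 0]
naive_effect = sum(treated) / len(treated) - sum(control) / len(control)
ipw_effect = ipw_mean_outcome(data, 1) - ipw_mean_outcome(data, 0)
print(naive_effect, ipw_effect)
```

In this toy data set the naive difference in group means is 3.5, while the weighted contrast is 2 (up to floating-point rounding), which matches the effect built into the data.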
   −
【翻译】当研究人员不能进行对照实验而只能通过观测数据建立模型时,逆概率加权估计可以用来证明因果关系。因为假设治疗不是随机分配的,目标是估计反事实或潜在的结果,如果所有受试者在人口中被分配任何一种治疗。
+
【翻译】
 +
 
 +
当研究人员不能进行对照实验而只能通过观测数据建立模型时,逆概率加权估计可以用来证明因果关系。
 +
 
 +
因为假设治疗不是随机分配的,目标是估计反事实或潜在的结果,如果所有受试者在人口中被分配任何一种治疗。
   −
假设观测数据是从未知分布 P 中提取 i.i.d (独立且同分布) ,其中协变量是两种可能的处理方法。
+
假设观测数据是[[文件:Wiki-IPWE-Figure1.png]]从未知分布 P 中提取 i.i.d (独立且同分布) ,其中协变量是两种可能的处理方法。
    
* [[文件:Wiki-IPWE-Figure2.png]] 协变量
 
* [[文件:Wiki-IPWE-Figure2.png]] 协变量
Line 114: Line 113:
== Augmented Inverse Probability Weighted Estimator (AIPWE) ==
 
== Augmented Inverse Probability Weighted Estimator (AIPWE) ==
An alternative estimator, the augmented inverse probability weighted estimator (AIPWE), combines the properties of the regression-based estimator and the inverse probability weighted estimator. It is therefore a 'doubly robust' method in that it requires only one of the propensity model and the outcome model to be correctly specified, not both. This method augments the IPWE to reduce variability and improve the efficiency of the estimate. It relies on the same assumptions as the Inverse Probability Weighted Estimator (IPWE)<ref>Cao, Weihua; Tsiatis, Anastasios A.; Davidian, Marie (2009). "Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data". ''Biometrika''. '''96''' (3): 723–734. doi:10.1093/biomet/asp033. ISSN 0006-3444. PMC 2798744. <nowiki>PMID 20161511</nowiki></ref>.
+
An alternative estimator, the augmented inverse probability weighted estimator (AIPWE), combines the properties of the regression-based estimator and the inverse probability weighted estimator. It is therefore a 'doubly robust' method in that it requires only one of the propensity model and the outcome model to be correctly specified, not both. This method augments the IPWE to reduce variability and improve the efficiency of the estimate. It relies on the same assumptions as the Inverse Probability Weighted Estimator (IPWE)<ref name=":5">Cao, Weihua; Tsiatis, Anastasios A.; Davidian, Marie (2009). "Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data". ''Biometrika''. '''96''' (3): 723–734. doi:10.1093/biomet/asp033. ISSN 0006-3444. PMC 2798744. <nowiki>PMID 20161511</nowiki></ref>.
    
【翻译】
 
【翻译】
   −
增广逆概率加权估计(AIPWE)是一种结合了基于回归估计和逆概率加权估计性质的替代估计。因此,它是一种“双稳健”方法,因为它只需要正确指定倾向或结果模型,而不是两者兼而有之。该方法增强了 IPWE,降低了变异性,提高了估计效率。这个模型拥有与反概率加权估计(IPWE)相同的假设[5]
+
增广逆概率加权估计(AIPWE)是一种结合了基于回归估计和逆概率加权估计性质的替代估计。因此,它是一种“双稳健”方法,因为它只需要正确指定倾向或结果模型,而不是两者兼而有之。该方法增强了 IPWE,降低了变异性,提高了估计效率。这个模型拥有与反概率加权估计(IPWE)相同的假设<ref name=":5" />
    
=== Estimator Formula ===
 
=== Estimator Formula ===
Line 144: Line 143:
The later rearrangement of the formula helps reveal the underlying idea: our estimator is based on the average predicted outcome using the model (i.e., [[文件:Wiki-AIPWE-6.png]]). However, if the model is biased, then its residuals (in the full treatment group a) will not be centered around 0. We can correct this potential bias by adding an extra term: the average of the residuals of the model (Q) from the true value of the outcome (Y) (i.e., [[文件:Wiki-AIPWE-7.png]]). Because some values of Y are missing, we weight each residual to inflate its relative importance (these weights are based on the inverse propensity, a.k.a. probability, of observing each subject) (see page 10 in<ref name=":0">Kang, Joseph DY, and Joseph L. Schafer. "Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data." Statistical science 22.4 (2007): 523-539.</ref>).

The later rearrangement of the formula helps reveal the underlying idea: our estimator is based on the average predicted outcome using the model (i.e., [[文件:Wiki-AIPWE-6.png]]). However, if the model is biased, then its residuals (in the full treatment group a) will not be centered around 0. We can correct this potential bias by adding an extra term: the average of the residuals of the model (Q) from the true value of the outcome (Y) (i.e., [[文件:Wiki-AIPWE-7.png]]). Because some values of Y are missing, we weight each residual to inflate its relative importance (these weights are based on the inverse propensity, a.k.a. probability, of observing each subject) (see page 10 in<ref name=":0">Kang, Joseph DY, and Joseph L. Schafer. "Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data." Statistical science 22.4 (2007): 523-539.</ref>).
   −
The "doubly robust" benefit of such an estimator comes from the fact that it is sufficient for one of the two models to be correctly specified for the estimator to be unbiased (either [[文件:Wiki-AIPWE-3.png]] or [[文件:Wiki-AIPWE-4.png]], or both). This is because if the outcome model is well specified, then its residuals will be around 0 (regardless of the weights each residual gets), while if the outcome model is biased but the weighting model is well specified, then the bias will be well estimated (and corrected for) by the weighted average residuals<ref name=":0" /><ref>Kim, Jae Kwang, and David Haziza. "Doubly robust inference with missing data in survey sampling." Statistica Sinica 24.1 (2014): 375-394.</ref><ref>Seaman, Shaun R., and Stijn Vansteelandt. "Introduction to double robust methods for incomplete data." Statistical science: a review journal of the Institute of Mathematical Statistics 33.2 (2018): 184.</ref>.
+
The "doubly robust" benefit of such an estimator comes from the fact that it is sufficient for one of the two models to be correctly specified for the estimator to be unbiased (either [[文件:Wiki-AIPWE-3.png]] or [[文件:Wiki-AIPWE-4.png]], or both). This is because if the outcome model is well specified, then its residuals will be around 0 (regardless of the weights each residual gets), while if the outcome model is biased but the weighting model is well specified, then the bias will be well estimated (and corrected for) by the weighted average residuals<ref name=":0" /><ref name=":6">Kim, Jae Kwang, and David Haziza. "Doubly robust inference with missing data in survey sampling." Statistica Sinica 24.1 (2014): 375-394.</ref><ref name=":7">Seaman, Shaun R., and Stijn Vansteelandt. "Introduction to double robust methods for incomplete data." Statistical science: a review journal of the Institute of Mathematical Statistics 33.2 (2018): 184.</ref>.
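The double robustness argument above can be checked numerically with a short Python sketch. It uses the common form of the estimator, the average of the model prediction plus the inverse-propensity-weighted residual for subjects who received treatment a. The data, the function names, and the deliberately wrong models are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def aipw_mean_outcome(data, a, outcome_model, propensity_model):
    """Average of Q(x, a) + 1(A = a) * (y - Q(x, a)) / p(a | x) over all subjects.

    outcome_model(x, a) and propensity_model(x, a) are user-supplied callables
    (hypothetical stand-ins for fitted models).
    """
    total = 0.0
    for x, a_i, y in data:
        predicted = outcome_model(x, a)
        correction = 0.0
        if a_i == a:
            # Weighted residual: corrects bias in the outcome model.
            correction = (y - predicted) / propensity_model(x, a)
        total += predicted + correction
    return total / len(data)


# Hypothetical data with y = 1 + 2*a + 3*x; the treated fraction is 0.25
# when x = 0 and 0.75 when x = 1, so the target E[Y*(1)] is 4.5.
data = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
]


def true_outcome(x, a):
    return 1.0 + 2.0 * a + 3.0 * x


def wrong_outcome(x, a):
    return 0.0  # deliberately misspecified


def true_propensity(x, a):
    p_treated = 0.75 if x == 1 else 0.25
    return p_treated if a == 1 else 1.0 - p_treated


def wrong_propensity(x, a):
    return 0.5  # deliberately misspecified


outcome_only = aipw_mean_outcome(data, 1, true_outcome, wrong_propensity)
propensity_only = aipw_mean_outcome(data, 1, wrong_outcome, true_propensity)
both_wrong = aipw_mean_outcome(data, 1, wrong_outcome, wrong_propensity)
print(outcome_only, propensity_only, both_wrong)
```

With either model correct the estimate is 4.5, matching the target, while with both models misspecified it is 5.25.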
   −
The bias of doubly robust estimators is called a '''second-order bias''', and it depends on the product of the difference [[文件:Wiki-AIPWE-8.png]] and the difference [[文件:Wiki-AIPWE-9.png]]. Given a large enough sample size, this property allows us to lower the overall bias of doubly robust estimators by using machine learning estimators (instead of parametric models).<ref>Hernán, Miguel A., and James M. Robins. "Causal inference." (2010): 2. link to the book - page 170</ref>
+
The bias of doubly robust estimators is called a '''second-order bias''', and it depends on the product of the difference [[文件:Wiki-AIPWE-8.png]] and the difference [[文件:Wiki-AIPWE-9.png]]. Given a large enough sample size, this property allows us to lower the overall bias of doubly robust estimators by using machine learning estimators (instead of parametric models).<ref name=":8">Hernán, Miguel A., and James M. Robins. "Causal inference." (2010): 2. link to the book - page 170</ref>
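One way to make the product structure concrete is the short derivation below, offered as a sketch. It assumes that the two pictured differences are the error of the outcome model and the error of the propensity model. The notation is introduced here for illustration: <math>Q(X,a)</math> and <math>p(a|X)</math> denote the true outcome regression and propensity, and <math>Q^{\dagger}</math> and <math>p^{\dagger}</math> denote the large-sample limits of the fitted models. Taking the conditional expectation of the weighted residual given <math>X</math> yields

<math>\begin{align}
\mathbb{E}\left[ Q^{\dagger}(X,a) + \frac{\mathbf{1}(A=a)\left( Y - Q^{\dagger}(X,a) \right)}{p^{\dagger}(a|X)} \right] - \mathbb{E}\left[ Q(X,a) \right]
&= \mathbb{E}\left[ \left( Q^{\dagger}(X,a) - Q(X,a) \right)\left( 1 - \frac{p(a|X)}{p^{\dagger}(a|X)} \right) \right] \\
&= \mathbb{E}\left[ \frac{\left( Q^{\dagger}(X,a) - Q(X,a) \right)\left( p^{\dagger}(a|X) - p(a|X) \right)}{p^{\dagger}(a|X)} \right].
\end{align}</math>

The right-hand side is zero whenever either factor is zero, and it is small when both model errors are small, which is why the bias is described as second-order.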
    
【翻译】
 
【翻译】
   −
后来公式的重新排列有助于揭示潜在的想法: 我们的估计是基于使用模型(即:[[文件:Wiki-AIPWE-6.png]])的平均预测结果。然而,如果模型是有偏差的,那么模型的残差将不会(在完整的治疗组 a)在0左右。我们可以通过从结果(Y)的真实值(即:[[文件:Wiki-AIPWE-7.png]])中加入模型(Q)的平均残差的额外项来纠正这种潜在的偏差。因为我们缺少 Y 的值,所以我们给出权重来夸大每个残差的相对重要性(这些权重是基于看到每个受试者观察结果的逆倾向,也就是概率)(参见[6]第10页)。
+
后来公式的重新排列有助于揭示潜在的想法: 我们的估计是基于使用模型(即:[[文件:Wiki-AIPWE-6.png]])的平均预测结果。然而,如果模型是有偏差的,那么模型的残差将不会(在完整的治疗组 a)在0左右。我们可以通过从结果(Y)的真实值(即:[[文件:Wiki-AIPWE-7.png]])中加入模型(Q)的平均残差的额外项来纠正这种潜在的偏差。因为我们缺少 Y 的值,所以我们给出权重来夸大每个残差的相对重要性(这些权重是基于看到每个受试者观察结果的逆倾向,也就是概率)(参见<ref name=":0" />第10页)。
   −
这种估计器的“双重稳健”好处来自于这样一个事实,即它足以正确指定两个模型中的一个,使估计器是无偏的([[文件:Wiki-AIPWE-3.png]]或者[[文件:Wiki-AIPWE-4.png]],或者两者兼而有之)。这是因为如果结果模型被很好地指定,那么它的残差将在0左右(不管每个残差将得到的权重如何)。如果模型是有偏差的,但加权模型是明确规定的,那么偏差将由加权平均数残差[6][7][8]得到很好的估计(和校正)
+
这种估计器的“双重稳健”好处来自于这样一个事实,即它足以正确指定两个模型中的一个,使估计器是无偏的([[文件:Wiki-AIPWE-3.png]]或者[[文件:Wiki-AIPWE-4.png]],或者两者兼而有之)。这是因为如果结果模型被很好地指定,那么它的残差将在0左右(不管每个残差将得到的权重如何)。如果模型是有偏差的,但加权模型是明确规定的,那么偏差将由加权平均数残差<ref name=":0" /><ref name=":6" /><ref name=":7" />得到很好的估计(和校正)
   −
双稳健估计量的偏差称为二阶偏差,它取决于差[[文件:Wiki-AIPWE-8.png]]和差[[文件:Wiki-AIPWE-9.png]]的乘积。这个性质允许我们,当有一个“足够大”的样本量,以降低整体偏差的双鲁棒估计使用机器学习估计(而不是参数模型)。[9]
+
双稳健估计量的偏差称为二阶偏差,它取决于差[[文件:Wiki-AIPWE-8.png]]和差[[文件:Wiki-AIPWE-9.png]]的乘积。这个性质允许我们,当有一个“足够大”的样本量,以降低整体偏差的双鲁棒估计使用机器学习估计(而不是参数模型)。<ref name=":8" />
    
== 参考文献 ==
 
== 参考文献 ==
 
<references />
 
<references />