Inverse probability weighting


Inverse probability weighting is a statistical technique for computing statistics standardized to a pseudo-population that differs from the population in which the data were collected. Study designs in which the sampled population differs from the target population of inference are common in applications.[1] Prohibitive factors such as cost, time, or ethical concerns may bar researchers from sampling directly from the target population.[2] One solution is to use an alternative design strategy, e.g. stratified sampling. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.


One very early weighted estimator is the Horvitz–Thompson estimator of the mean.[3] When the probability with which the sampled population is drawn from the target population is known, the inverse of this probability is used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are weighted likelihoods, weighted estimating equations, and weighted probability densities from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as marginal structural models, the standardized mortality ratio, and the EM algorithm for coarsened or aggregate data.
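As a concrete illustration, the following is a minimal sketch of an inverse-probability-weighted (Horvitz–Thompson style) estimate of a target-population mean. It assumes NumPy; the outcomes and sampling probabilities are made up, and normalizing by the sum of the weights (the Hájek form) is an implementation choice rather than something prescribed by the text.

```python
import numpy as np

def ipw_mean(y, sampling_prob):
    """Inverse-probability-weighted estimate of the target-population mean.

    y             : outcomes observed in the sample
    sampling_prob : known probability with which each observed unit was
                    sampled from the target population
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sampling_prob, dtype=float)  # inverse probability weights
    # Hajek form: normalize by the sum of weights instead of the population size.
    return np.sum(w * y) / np.sum(w)

# Toy example: units from a rare stratum were sampled with probability 0.1,
# units from a common stratum with probability 0.8.
print(ipw_mean([3.0, 2.5, 4.0, 1.0], [0.1, 0.8, 0.1, 0.8]))
```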


Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis.[4] With an estimate of the sampling probability, or the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weight for subjects who are under-represented due to a large degree of missing data.
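A minimal sketch of that use, assuming NumPy and scikit-learn, on a simulated data set in which the outcome is missing at random given an observed covariate; all variable names and parameter values are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                          # covariate observed for everyone
y = 2.0 + 1.5 * x + rng.normal(size=n)          # outcome of interest

# Subjects with large x are less likely to have y recorded, so the
# complete cases under-represent them.
p_obs_true = 1.0 / (1.0 + np.exp(-(1.0 - 1.2 * x)))
observed = rng.random(n) < p_obs_true

# Estimate each subject's probability of being observed from the covariate,
# then weight the complete cases by the inverse of that probability.
obs_model = LogisticRegression().fit(x.reshape(-1, 1), observed)
p_obs_hat = obs_model.predict_proba(x.reshape(-1, 1))[:, 1]
w = 1.0 / p_obs_hat[observed]

naive_mean = y[observed].mean()
ipw_mean = np.sum(w * y[observed]) / np.sum(w)
print(naive_mean, ipw_mean, y.mean())   # the IPW mean should be closer to the full-data mean
```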


Inverse Probability Weighted Estimator (IPWE)

The inverse probability weighting estimator can be used to demonstrate causality when the researcher cannot conduct a controlled experiment but has observational data to model. Because the treatment is not assumed to be randomly assigned, the goal is to estimate the counterfactual or potential outcome that would be observed if all subjects in the population were assigned either treatment.


Suppose observed data are [math]\displaystyle{ \{\bigl(X_i,A_i,Y_i\bigr)\}^{n}_{i=1} }[/math] drawn i.i.d. (independent and identically distributed) from an unknown distribution P, where

  • [math]\displaystyle{ X \in \mathbb{R}^{p} }[/math] are the covariates.
  • [math]\displaystyle{ A \in \{0, 1\} }[/math] indicates the two possible treatments.
  • [math]\displaystyle{ Y \in \mathbb{R} }[/math] is the response.
  • We do not assume treatment is randomly assigned.

The goal is to estimate the potential outcome, [math]\displaystyle{ Y^{*}\bigl(a\bigr) }[/math], that would be observed if the subject were assigned treatment [math]\displaystyle{ a }[/math]. Then compare the mean outcome if all patients in the population were assigned either treatment: [math]\displaystyle{ \mu_{a} = \mathbb{E}Y^{*}(a) }[/math]. We want to estimate [math]\displaystyle{ \mu_a }[/math] using observed data [math]\displaystyle{ \{\bigl(X_i,A_i,Y_i\bigr)\}^{n}_{i=1} }[/math].


Estimator Formula

[math]\displaystyle{ \hat{\mu}^{IPWE}_{a,n} = \frac{1}{n}\sum^{n}_{i=1}Y_{i} \frac{\mathbf 1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})} }[/math]


Constructing the IPWE

  1. [math]\displaystyle{ \mu_{a} = \mathbb{E}\frac{\mathbf{1}_{A=a} Y}{p(A|X)} }[/math] where [math]\displaystyle{ p(a|x) = \frac{P(A=a,X=x)}{P(X=x)} }[/math]
  2. construct [math]\displaystyle{ \hat{p}_{n}(a|x) }[/math] or [math]\displaystyle{ p(a|x) }[/math] using any propensity model (often a logistic regression model)
  3. [math]\displaystyle{ \hat{\mu}^{IPWE}_{a,n} = \sum^{n}_{i=1}\frac{Y_{i} 1_{A_{i}=a}}{n\hat{p}_{n}(A_{i}|X_{i})} }[/math]

With the mean of each treatment group computed, a t-test or ANOVA can be used to judge the difference between the group means and determine the statistical significance of the treatment effect (see the sketch below).
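A minimal end-to-end sketch of these steps, assuming NumPy, SciPy, and scikit-learn, on simulated observational data with a known treatment effect. The data-generating process, sample size, and the use of a one-sample t-test on the per-subject weighted contrasts (which ignores the uncertainty from estimating the propensity score) are illustrative choices, not part of the method as stated above.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 2))                                      # covariates X
p_treat = 1.0 / (1.0 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1])))
a = (rng.random(n) < p_treat).astype(int)                        # confounded treatment A
y = 1.0 + 2.0 * a + 1.5 * x[:, 0] + rng.normal(size=n)           # outcome Y, true effect = 2

# Step 2: propensity model for p(A | X), here a logistic regression.
ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]       # p_hat(A=1 | X_i)
p_a = np.where(a == 1, ps, 1.0 - ps)                             # p_hat(A_i | X_i)

# Step 3: IPWE of the mean potential outcome under each treatment arm.
mu1 = np.mean((a == 1) * y / p_a)
mu0 = np.mean((a == 0) * y / p_a)
print("IPWE means:", mu1, mu0, "difference:", mu1 - mu0)

# Rough inference: one-sample t-test on the per-subject weighted contrasts.
d = (a == 1) * y / p_a - (a == 0) * y / p_a
print(stats.ttest_1samp(d, 0.0))
```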


Assumptions

  1. Consistency: [math]\displaystyle{ Y = Y^{*}(A) }[/math]
  2. No unmeasured confounders: [math]\displaystyle{ \{Y^{*}(0), Y^{*}(1)\} \perp A|X }[/math]
    • Treatment assignment is based solely on covariate data and independent of potential outcomes.
  3. Positivity: [math]\displaystyle{ P(A=a|X=x)\gt 0 }[/math] for all [math]\displaystyle{ a }[/math] and [math]\displaystyle{ x }[/math]
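Taken together, these assumptions justify the identity [math]\displaystyle{ \mu_{a} = \mathbb{E}\frac{\mathbf{1}_{A=a} Y}{p(A|X)} }[/math] used in step 1 of the construction above. An informal sketch of the argument, conditioning on the covariates:

[math]\displaystyle{ \begin{align} \mathbb{E}\frac{\mathbf{1}_{A=a} Y}{p(A|X)} &= \mathbb{E}\frac{\mathbf{1}_{A=a}\, Y^{*}(a)}{p(a|X)} && \text{(consistency)} \\ &= \mathbb{E}\left[\frac{\mathbb{E}[\mathbf{1}_{A=a}\mid X]\;\mathbb{E}[Y^{*}(a)\mid X]}{p(a|X)}\right] && \text{(no unmeasured confounders)} \\ &= \mathbb{E}\,\mathbb{E}[Y^{*}(a)\mid X] = \mathbb{E}Y^{*}(a) = \mu_{a}, \end{align} }[/math]

since [math]\displaystyle{ \mathbb{E}[\mathbf{1}_{A=a}\mid X] = p(a|X) }[/math]; positivity guarantees that the denominator is never zero.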

Limitations

The inverse probability weighted estimator (IPWE) can be unstable if the estimated propensities are small. If the probability of either treatment assignment is small, then the logistic regression model can become unstable around the tails, causing the IPWE to also be less stable.
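A small simulated illustration of this instability, assuming NumPy; the data-generating process is made up, and truncating (clipping) extreme propensities before weighting is shown only as one commonly used ad hoc mitigation, not something prescribed by the text above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
x = rng.normal(size=n)
# Strong confounding: some subjects have a near-zero chance of treatment.
p1 = 1.0 / (1.0 + np.exp(-4.0 * x))
a = (rng.random(n) < p1).astype(int)
y = x + 2.0 * a + rng.normal(size=n)          # true effect = 2

w = np.where(a == 1, 1.0 / p1, 1.0 / (1.0 - p1))
print("largest weights:", np.sort(w)[-3:])    # a handful of huge weights dominate

mu1 = np.mean(a * y / p1)                     # IPWE for the treated arm, high variance
# Clipping the propensities trades a little bias for a large reduction in variance.
mu1_clipped = np.mean(a * y / np.clip(p1, 0.01, 0.99))
print(mu1, mu1_clipped)
```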


Augmented Inverse Probability Weighted Estimator (AIPWE)

An alternative estimator, the augmented inverse probability weighted estimator (AIPWE), combines the properties of the regression-based estimator and the inverse probability weighted estimator. It is therefore a 'doubly robust' method, in that it only requires either the propensity model or the outcome model to be correctly specified, not both. This method augments the IPWE to reduce variability and improve estimation efficiency. The AIPWE relies on the same assumptions as the IPWE above.[5]


Estimator Formula

[math]\displaystyle{ \begin{align} \hat{\mu}^{AIPWE}_{a,n} &= \frac{1}{n} \sum_{i=1}^n\Biggl(\frac{Y_{i}1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})} - \frac{1_{A_{i}=a}-\hat{p}_n(A_i|X_i)}{\hat{p}_n(A_i|X_i)}\hat{Q}_n(X_i,a)\Biggr) \\ &= \frac{1}{n} \sum_{i=1}^n\Biggl(\frac{1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})}Y_{i} + \Biggl(1-\frac{1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})}\Biggr)\hat{Q}_n(X_i,a)\Biggr) \\ &= \frac{1}{n}\sum_{i=1}^n\hat{Q}_n(X_i,a) + \frac{1}{n}\sum_{i=1}^n\frac{1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})}\Biggl(Y_{i} - \hat{Q}_n(X_i,a)\Biggr) \end{align} }[/math]

With the following notations:

  1. [math]\displaystyle{ 1_{A_{i}=a} }[/math] is an indicator function equal to 1 if subject i belongs to treatment group a and 0 otherwise.
  2. Construct a regression estimator [math]\displaystyle{ \hat{Q}_n(x,a) }[/math] to predict the outcome [math]\displaystyle{ Y }[/math] from the covariates [math]\displaystyle{ X }[/math] and treatment [math]\displaystyle{ A }[/math] for each subject i, for example using ordinary least squares regression.
  3. Construct a propensity (probability) estimate [math]\displaystyle{ \hat{p}_n(A_i|X_i) }[/math], for example using logistic regression.
  4. Combine them in the AIPWE formula to obtain [math]\displaystyle{ \hat{\mu}^{AIPWE}_{a,n} }[/math] (see the sketch below).
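A minimal sketch of these steps, assuming NumPy and scikit-learn, on the same kind of simulated observational data as in the IPWE example above. The models (ordinary least squares for [math]\displaystyle{ \hat{Q}_n }[/math], logistic regression for [math]\displaystyle{ \hat{p}_n }[/math]) follow the suggestions in the list, while the data-generating process and parameter values are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=(n, 2))
p1 = 1.0 / (1.0 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1])))
a = (rng.random(n) < p1).astype(int)
y = 1.0 + 2.0 * a + 1.5 * x[:, 0] + rng.normal(size=n)     # true effect = 2

# 2. Outcome (regression) model Q_hat(x, a) fit by ordinary least squares on (X, A).
q_model = LinearRegression().fit(np.column_stack([x, a]), y)
q1 = q_model.predict(np.column_stack([x, np.ones(n)]))     # Q_hat(X_i, 1)
q0 = q_model.predict(np.column_stack([x, np.zeros(n)]))    # Q_hat(X_i, 0)

# 3. Propensity model p_hat(A=1 | X) fit by logistic regression.
ps = LogisticRegression().fit(x, a).predict_proba(x)[:, 1]

# 4. Combine: mean outcome-model prediction plus the weighted average residual.
def aipwe(in_group, q_hat, p_hat):
    return np.mean(q_hat + in_group / p_hat * (y - q_hat))

mu1 = aipwe(a == 1, q1, ps)
mu0 = aipwe(a == 0, q0, 1.0 - ps)
print("AIPWE effect estimate:", mu1 - mu0)
```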


Interpretation and "double robustness"


The last rearrangement of the formula helps reveal the underlying idea: our estimator is based on the average predicted outcome using the model (i.e.: [math]\displaystyle{ \frac{1}{n}\sum_{i=1}^n\Biggl(\hat{Q}_n(X_i,a)\Biggr) }[/math]). However, if the model is biased, then the residuals of the model will not be centered around 0 within the full treatment group a. We can correct this potential bias by adding an extra term: the weighted average of the residuals of the model (Q) against the true value of the outcome (Y) (i.e.: [math]\displaystyle{ \frac{1}{n}\sum_{i=1}^n\frac{1_{A_{i}=a}}{\hat{p}_{n}(A_{i}|X_{i})}\Biggl(Y_{i} - \hat{Q}_n(X_i,a)\Biggr) }[/math]). Because we have missing values of Y, we use weights to inflate the relative importance of each residual (these weights are based on the inverse propensity, i.e. probability, of observing each subject) (see page 10 in [6]).


The "doubly robust" benefit of such an estimator comes from the fact that it's sufficient for one of the two models to be correctly specified, for the estimator to be unbiased (either [math]\displaystyle{ \hat{Q}_n(X_i,a) }[/math] or [math]\displaystyle{ \hat{p}_{n}(A_{i}|X_{i}) }[/math], or both). This is because if the outcome model is well specified then its residuals will be around 0 (regardless of the weights each residual will get). While if the model is biased, but the weighting model is well specified, then the bias will be well estimated (And corrected for) by the weighted average residuals.[6][7][8]

The "doubly robust" benefit of such an estimator comes from the fact that it's sufficient for one of the two models to be correctly specified, for the estimator to be unbiased (either \hat{Q}_n(X_i,a) or \hat{p}_{n}(A_{i}|X_{i}), or both). This is because if the outcome model is well specified then its residuals will be around 0 (regardless of the weights each residual will get). While if the model is biased, but the weighting model is well specified, then the bias will be well estimated (And corrected for) by the weighted average residuals.Kim, Jae Kwang, and David Haziza. "Doubly robust inference with missing data in survey sampling." Statistica Sinica 24.1 (2014): 375-394. link to the paperSeaman, Shaun R., and Stijn Vansteelandt. "Introduction to double robust methods for incomplete data." Statistical science: a review journal of the Institute of Mathematical Statistics 33.2 (2018): 184. link to the paper

这种估计器的“双重稳健”效益来自这样一个事实,即两个模型中的一个已经被正确指定,估计器是无偏的(hat { q } _ n (xi,a)或 hat { p } _ { n }(a _ { i } | x _ { i }) ,或者两者都是)。这是因为如果结果模型被很好地指定,那么它的残差将大约为0(不管每个残差将得到多少权重)。如果模型是有偏差的,但是加权模型是很好地指定的,那么偏差将被加权平均数残差很好地估计(并修正)。和 David Haziza。调查抽样中缺失数据的双重稳健推断24.1(2014) : 375-394. link to the paperSeaman,Shaun r. ,and Stijn Vansteelandt.“不完整数据的双重稳健方法介绍”统计科学: 数理统计研究所的评论杂志33.2(2018) : 184. 链接到论文

The bias of the doubly robust estimators is called a second-order bias, and it depends on the product of the difference [math]\displaystyle{ \frac{1}{\hat{p}_{n}(A_{i}|X_{i})} - \frac{1}{p(A_{i}|X_{i})} }[/math] and the difference [math]\displaystyle{ \hat{Q}_n(X_i,a) - Q(X_i,a) }[/math] (estimated minus true propensity and outcome models). This property allows us, when the sample size is large enough, to lower the overall bias of doubly robust estimators by using machine learning estimators (instead of parametric models).[9]
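Stated a little more explicitly (and informally, treating the fitted nuisance functions as fixed, e.g. estimated on an independent split of the data), the population-level bias of the AIPWE is an expectation of exactly this product, and since [math]\displaystyle{ p(a|X) \le 1 }[/math] the Cauchy–Schwarz inequality gives

[math]\displaystyle{ \bigl|\mathrm{bias}\bigr| = \left|\mathbb{E}\left[p(a|X)\Bigl(\tfrac{1}{\hat{p}_{n}(a|X)} - \tfrac{1}{p(a|X)}\Bigr)\Bigl(Q(X,a) - \hat{Q}_n(X,a)\Bigr)\right]\right| \le \Bigl\|\tfrac{1}{\hat{p}_{n}} - \tfrac{1}{p}\Bigr\|_{2}\,\bigl\|\hat{Q}_n - Q\bigr\|_{2}, }[/math]

so if the two nuisance estimators converge in mean square at rates whose product is [math]\displaystyle{ o(n^{-1/2}) }[/math] (for example, each at [math]\displaystyle{ o(n^{-1/4}) }[/math]), the bias is negligible relative to the sampling error, even for flexible machine-learning fits.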



References

  1. Robins, JM; Rotnitzky, A; Zhao, LP (1994). "Estimation of regression coefficients when some regressors are not always observed". Journal of the American Statistical Association. 89 (427): 846–866. doi:10.1080/01621459.1994.10476818.
  2. Breslow, NE; Lumley, T; et al. (2009). "Using the Whole Cohort in the Analysis of Case-Cohort Data". Am J Epidemiol. 169 (11): 1398–1405. doi:10.1093/aje/kwp055. PMC 2768499. PMID 19357328.
  3. Horvitz, D. G.; Thompson, D. J. (1952). "A generalization of sampling without replacement from a finite universe". Journal of the American Statistical Association. 47 (260): 663–685. doi:10.1080/01621459.1952.10483446.
  4. Hernan, MA; Robins, JM (2006). "Estimating Causal Effects From Epidemiological Data". J Epidemiol Community Health. 60 (7): 578–596. CiteSeerX 10.1.1.157.9366. doi:10.1136/jech.2004.029496. PMC 2652882. PMID 16790829.
  5. Cao, Weihua; Tsiatis, Anastasios A.; Davidian, Marie (2009). "Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data". Biometrika. 96 (3): 723–734. doi:10.1093/biomet/asp033. ISSN 0006-3444. PMC 2798744. PMID 20161511.
  6. Kang, Joseph D. Y.; Schafer, Joseph L. (2007). "Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data". Statistical Science. 22 (4): 523–539.
  7. Kim, Jae Kwang; Haziza, David (2014). "Doubly robust inference with missing data in survey sampling". Statistica Sinica. 24 (1): 375–394.
  8. Seaman, Shaun R.; Vansteelandt, Stijn (2018). "Introduction to double robust methods for incomplete data". Statistical Science. 33 (2): 184.
  9. Hernán, Miguel A.; Robins, James M. (2010). Causal Inference. p. 179.