Line 1: |
Line 1: |
− | '''Inverse probability weighting''' is a statistical technique for calculating statistics standardized to a [[pseudo-population]] different from that in which the data was collected. Study designs with a disparate sampling population and population of target inference (target population) are common in application.<ref name="refname2" /> There may be prohibitive factors barring researchers from directly sampling from the target population such as cost, time, or ethical concerns.<ref name="refname3" /> A solution to this problem is to use an alternate design strategy, e.g. [[stratified sampling]]. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.
| |
| | | |
− | Inverse probability weighting is a statistical technique for calculating statistics standardized to a pseudo-population different from that in which the data was collected. Study designs with a disparate sampling population and population of target inference (target population) are common in application. There may be prohibitive factors barring researchers from directly sampling from the target population such as cost, time, or ethical concerns. A solution to this problem is to use an alternate design strategy, e.g. stratified sampling. Weighting, when correctly applied, can potentially improve the efficiency and reduce the bias of unweighted estimators.
| |
| | | |
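− | Inverse probability weighting is a statistical technique for computing standardized statistics for a pseudo-population different from the one in which the data were collected. Study designs with a different sampling population and target of inference (target population) are common in application. There may be factors that prevent researchers from sampling directly from the target population, such as cost, time, or ethical concerns. One solution to this problem is to use an alternative design strategy, for example stratified sampling. Correctly applied weighting can improve efficiency and reduce the bias of unweighted estimators.
| + | '''Inverse probability weighting''' is a statistical technique for computing statistics standardized to a [[pseudo-population]] that differs from the population in which the data were collected. In practice, study designs in which the sampling population and the population of target inference (the target population) differ are common.<ref name="refname2" /> Prohibitive factors such as cost, time, or ethical concerns may prevent researchers from sampling directly from the target population.<ref name="refname3" /> One solution to this problem is to use an alternative design strategy, such as [[stratified sampling]]. When applied correctly, weighting can improve efficiency and reduce the bias of unweighted estimators. |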
| | | |
− | One very early weighted estimator is the [[Horvitz–Thompson estimator]] of the mean.<ref>{{cite journal | first1 = D. G. |last1 = Horvitz | first2 = D. J. |last2 = Thompson | title = A generalization of sampling without replacement from a finite universe | journal = [[Journal of the American Statistical Association]] | volume = 47 | pages = 663–685 | year = 1952 |issue = 260 | doi=10.1080/01621459.1952.10483446}}</ref> When the [[sampling probability]] is known, from which the sampling population is drawn from the target population, then the inverse of this probability is used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are [[likelihood function|weighted likelihoods]], [[generalized estimating equations|weighted estimating equations]], and [[probability density function|weighted probability densities]] from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as [[marginal structural models]], the [[standardized mortality ratio]], and the [[EM algorithm]] for coarsened or aggregate data.
| |
| | | |
− | One very early weighted estimator is the Horvitz–Thompson estimator of the mean. When the sampling probability is known, from which the sampling population is drawn from the target population, then the inverse of this probability is used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are weighted likelihoods, weighted estimating equations, and weighted probability densities from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as marginal structural models, the standardized mortality ratio, and the EM algorithm for coarsened or aggregate data.
| + | One very early weighted estimator is the [[Horvitz–Thompson estimator]] of the mean.<ref>{{cite journal | first1 = D. G. |last1 = Horvitz | first2 = D. J. |last2 = Thompson | title = A generalization of sampling without replacement from a finite universe | journal = [[Journal of the American Statistical Association]] | volume = 47 | pages = 663–685 | year = 1952 |issue = 260 | doi=10.1080/01621459.1952.10483446}}</ref> When the sampling probability, with which the sampling population is drawn from the target population, is known, the inverse of this probability is used to weight the observations. This approach has been generalized to many areas of statistics under various frameworks. In particular, there are [[likelihood function|weighted likelihoods]], [[generalized estimating equations|weighted estimating equations]], and [[probability density function|weighted probability densities]], from which a majority of statistics are derived. These applications codified the theory of other statistics and estimators such as [[marginal structural models]], the [[standardized mortality ratio]], and the [[EM algorithm]] for coarsened or aggregate data. |
| | | |
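− | One very early weighted estimator is the Horvitz-Thompson estimator of the mean. When the sampling probability is known, the sampling population is drawn from the target population, and the inverse of that probability is then used to weight the observations. This approach has been generalized to many aspects of statistics under various frameworks. In particular, there are weighted likelihoods, weighted estimating equations, and weighted probability densities, from which most statistics are derived. These applications codified other statistical theory and estimators, such as marginal structural models, the standardized mortality ratio, and the EM algorithm for coarsened or aggregate data.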
| |
| | | |
− | Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis.<ref name="refname1"/>
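| + | Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis.<ref name="refname1" /> With an estimate of the sampling probability, or of the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weights of subjects who are under-represented because of a large degree of missing data. |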
− | With an estimate of the sampling probability, or the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weight for subjects who are under-represented due to a large degree of [[missing data]].
| |
− | | |
− | Inverse probability weighting is also used to account for missing data when subjects with missing data cannot be included in the primary analysis.
| |
− | With an estimate of the sampling probability, or the probability that the factor would be measured in another measurement, inverse probability weighting can be used to inflate the weight for subjects who are under-represented due to a large degree of missing data.
| |
− | | |
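− | Inverse probability weighting can also be used to account for missing data when the missing data cannot be included in the primary analysis. Given an estimate of the sampling probability, or of the probability that the factor is measured in another measurement, inverse probability weighting can be used to inflate the weights of subjects who are under-represented because of a large amount of missing data.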
| |
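As an illustration of weighting observations by the inverse of a known sampling probability, the following is a minimal sketch of a Horvitz–Thompson style estimate of a population mean. It is not taken from the article: the function name `horvitz_thompson_mean`, its parameters, and the toy two-stratum data are assumptions made for this example, and the sketch assumes the target population size and every sampled unit's inclusion probability are known.

```python
import numpy as np


def horvitz_thompson_mean(y, pi, population_size):
    """Estimate a population mean as (1/N) * sum(y_i / pi_i) over sampled units.

    y: observed values for the sampled units.
    pi: known inclusion (sampling) probability of each sampled unit.
    population_size: N, the size of the target population (assumed known).
    """
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    if y.shape != pi.shape:
        raise ValueError("y and pi must have the same shape")
    if np.any((pi <= 0) | (pi > 1)):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit stands in for 1 / pi_i units of the target population.
    return float(np.sum(y / pi) / population_size)


# Hypothetical target population of 100 units in two strata:
#   stratum A: 80 units, all with value 10, sampled with probability 0.1
#   stratum B: 20 units, all with value 20, sampled with probability 0.5
y_sample = np.array([10.0] * 8 + [20.0] * 10)
pi_sample = np.array([0.1] * 8 + [0.5] * 10)

unweighted_mean = float(y_sample.mean())
weighted_mean = horvitz_thompson_mean(y_sample, pi_sample, population_size=100)
```

In this toy setup the true population mean is 12. The unweighted sample mean is pulled toward the over-sampled stratum B, while the weighted estimate recovers the target value up to floating-point error.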
| | | |
| == Inverse Probability Weighted Estimator (IPWE) == | | == Inverse Probability Weighted Estimator (IPWE) == |