Changes

2 bytes removed, 4 June 2021 (Fri) 20:20
No edit summary
Line 6: Line 6:
In [[statistics]], '''ignorability''' is a feature of an [[experiment design]] whereby the method of data collection (and the nature of missing data) do not depend on the missing data.  A missing data mechanism such as a treatment assignment or survey sampling strategy is "ignorable" if the missing data matrix, which indicates which variables are observed or missing, is independent of the missing data conditional on the observed data.
 
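−
In [[统计学|statistics]], '''ignorability''' is a feature of [[实验设计|experimental design]] whereby the method of data collection (and the nature of the missing data) does not depend on the missing data. A missing data mechanism (such as a treatment assignment or survey sampling strategy) is called "ignorable" if the missing data matrix, which shows which variables are observed or missing, and the missing data conditioned on the observed data are mutually independent.
+
In [[统计学|statistics]], '''ignorability''' is a feature of [[实验设计|experimental design]] whereby the way data are collected (and the nature of the missing data) does not depend on the missing data. A missing data mechanism (such as a treatment assignment or survey sampling strategy) is called "ignorable" if, conditional on the observed data, the missing data indicator matrix, which records which variables are observed or missing, is independent of the missing data.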
       
This idea is part of the [[Rubin Causal Model|Rubin Causal Inference Model]], developed by [[Donald Rubin]] in collaboration with [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. The exact definition differs between their articles in that period. In a 1978 article, Rubin discusses ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant for the data analysis, given everything that is recorded about that individual. Later, in 1983,<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> Rubin and Rosenbaum instead define ''strongly ignorable treatment assignment'', which is a stronger condition, mathematically formulated as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment <math>t</math>, <math>v</math> denotes the covariates and <math>z</math> is the actual treatment.
 
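−
This idea is part of the [[鲁宾因果推理模型|Rubin causal inference model]], proposed by [[Donald Rubin]] in collaboration with [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. At that time, however, the exact definition of ignorability differed between their articles. In a 1978 article, Rubin discussed ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant to the data analysis, given everything recorded about that individual. Later, in 1983, Rubin and Rosenbaum more precisely defined ''strongly ignorable treatment assignment'',<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> which is a stronger assumption, with the mathematical formula <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment status <math>t</math>, <math>v</math> denotes the covariates, and <math>z</math> is the actual treatment status.
+
This idea is part of the [[鲁宾因果推理模型|Rubin causal inference model]], proposed by [[Donald Rubin]] in collaboration with [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. At that time, however, the exact definition of ignorability differed between their articles. In a 1978 article, Rubin discussed ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant to the data analysis, given everything recorded about that individual. Later, in 1983, Rubin and Rosenbaum more precisely defined ''strongly ignorable treatment assignment'',<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> which is a stronger assumption, expressed mathematically as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment status <math>t</math>, <math>v</math> denotes the covariates, and <math>z</math> is the actual treatment status.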
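The overlap part of the strong ignorability condition, that every covariate value has a treatment probability strictly between 0 and 1, can be probed in data; the independence part cannot be tested from observed outcomes alone. The following is a minimal sketch, not anything taken from the cited papers: the function names and the toy records are hypothetical, and the sample share of treated units is only an estimate of the treatment probability within each covariate stratum.

```python
from collections import defaultdict

def overlap_by_stratum(records):
    """Return the observed share of treated units for each covariate value v.

    records is an iterable of (v, z) pairs with z in {0, 1}.
    The share is only a sample estimate of pr(z = 1 | v).
    """
    counts = defaultdict(lambda: [0, 0])  # v -> [number of units, number treated]
    for v, z in records:
        counts[v][0] += 1
        counts[v][1] += z
    return {v: treated / n for v, (n, treated) in counts.items()}

def violates_overlap(records):
    """List covariate values whose observed treated share is exactly 0 or 1."""
    shares = overlap_by_stratum(records)
    return sorted(v for v, share in shares.items() if share in (0.0, 1.0))

# Hypothetical toy data: every "old" unit is treated, so overlap fails there.
records = [("young", 1), ("young", 0), ("old", 1), ("old", 1)]
print(overlap_by_stratum(records))  # {'young': 0.5, 'old': 1.0}
print(violates_overlap(records))    # ['old']
```

A stratum flagged this way has no comparison group in the sample, so no treatment effect can be estimated for it without further assumptions.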
       
Pearl [2000] devised a simple graphical criterion, called ''back-door'', that entails ignorability and identifies sets of covariates that achieve this condition.
 
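−
Pearl [2000] devised a simple graphical criterion, called "back-door", that entails ignorability and identifies the sets of covariates that achieve this condition.
+
Pearl [2000] devised a simple graphical criterion, called "back-door", that entails ignorability and can identify the sets of covariates that satisfy the conditions of the back-door criterion.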
       
Ignorability (better called exogeneity) simply means we can ignore how one ended up in one vs. the other group (‘treated’ Tx = 1, or ‘control’ Tx = 0) when it comes to the potential outcome (say Y). It is also called unconfoundedness, selection on the observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref>
 
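−
Ignorability (better called exogeneity) simply means that, when it comes to the potential outcome (Y), how a person ended up in one group rather than the other ("treated" Tx = 1, or "control" Tx = 0) is something we can ignore. It is also called non-confounding, selection on observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref>
+
Ignorability (better called exogeneity) simply means that, when it comes to the potential outcome (Y), we can ignore how a person ended up in one group rather than the other ("treated" Tx = 1, or "control" Tx = 0). It is also called unconfoundedness, selection on observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref>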
Line 30: Line 30:
So: Y<sub>1</sub><sup>1</sup>/*Y<sub>0</sub><sup>1</sup> are potential Y outcomes had the person been treated (superscript <sup>1</sup>), when in reality they have actually been (Y<sub>1</sub><sup>1</sup>, subscript <sub>1</sub>), or not (*Y<sub>0</sub><sup>1</sup>: the * signals this quantity can never be realized or observed, or is ''fully'' contrary-to-fact or counterfactual, CF).
 
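−
So, if the individual is treated (superscript <sup>1</sup>), the corresponding potential outcomes Y are Y<sub>1</sub><sup>1</sup>/*Y<sub>0</sub><sup>1</sup>; the one that can actually be observed is Y<sub>1</sub><sup>1</sup> (subscript also <sub>1</sub>), not *Y<sub>0</sub><sup>1</sup>. Note: the * indicates that this value can never be obtained or observed, that is, it is ''fully contrary to fact'', or counterfactual (CF).
+
So, if the individual receives treatment (superscript <sup>1</sup>), the corresponding potential outcomes Y are Y<sub>1</sub><sup>1</sup>/*Y<sub>0</sub><sup>1</sup>; the one that can actually be observed is Y<sub>1</sub><sup>1</sup> (subscript also <sub>1</sub>), not *Y<sub>0</sub><sup>1</sup>. Note: the * indicates that this value can never be obtained or observed, that is, it is ''fully contrary to fact'', or counterfactual (CF).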
       
Similarly, *Y<sub>1</sub><sup>0</sup>/Y<sub>0</sub><sup>0</sup> are potential Y outcomes had the person not been treated (superscript <sup>0</sup>), when in reality they have been (*Y<sub>1</sub><sup>0</sup>, subscript <sub>1</sub>), or not actually (Y<sub>0</sub><sup>0</sup>).
 
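−
Similarly, when in reality they have been treated (*Y<sub>1</sub><sup>0</sup>, subscript <sub>1</sub>), or actually have not (Y<sub>0</sub><sup>0</sup>), the potential outcomes Y for an individual who is not treated (superscript <sup>0</sup>) are *Y<sub>1</sub><sup>0</sup>/Y<sub>0</sub><sup>0</sup>.
+
Similarly, if the individual does not receive treatment (superscript <sup>0</sup>), the corresponding potential outcomes Y are *Y<sub>1</sub><sup>0</sup>/Y<sub>0</sub><sup>0</sup>. The one observed in reality is Y<sub>0</sub><sup>0</sup>, not *Y<sub>1</sub><sup>0</sup>.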
 
      
 
Only one of each potential outcome (PO) can be realized, the other cannot, for the same assignment to condition, so when we try to estimate treatment effects, we need something to replace the fully contrary-to-fact ones with observables (or estimate them). When ignorability/exogeneity holds, like when people are randomized to be treated or not, we can ‘replace’ *''Y''<sub>0</sub><sup>1</sup> with its observable counterpart Y<sub>1</sub><sup>1</sup>, and *Y<sub>1</sub><sup>0</sup> with its observable counterpart ''Y''<sub>0</sub><sup>0</sup>, not at the individual level Y<sub>i</sub>’s, but when it comes to averages like E[''Y''<sub>''i''</sub><sup>1</sup> – ''Y''<sub>''i''</sub><sup>0</sup>], which is exactly the causal treatment effect (TE) one tries to recover.
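The replacement argument above can be checked numerically. The following is a minimal simulation sketch under assumed values (a true average effect of 2.0, Gaussian noise, and a fair coin for assignment, none of which come from the article): both potential outcomes are generated for every unit, only one is "observed" according to a randomized assignment, and the difference in observed group means is compared with the true average of Y<sup>1</sup> – Y<sup>0</sup>.

```python
import random

def simulate_randomized_trial(n=20000, seed=0):
    """Simulate potential outcomes under a randomized (ignorable) assignment."""
    rng = random.Random(seed)
    true_effects, treated_obs, control_obs = [], [], []
    for _ in range(n):
        baseline = rng.gauss(0.0, 1.0)
        y0 = baseline                               # potential outcome if untreated (Y^0)
        y1 = baseline + 2.0 + rng.gauss(0.0, 0.5)   # potential outcome if treated (Y^1)
        true_effects.append(y1 - y0)
        tx = rng.random() < 0.5                     # coin flip, independent of (y0, y1)
        if tx:
            treated_obs.append(y1)                  # only Y^1 is observed for treated units
        else:
            control_obs.append(y0)                  # only Y^0 is observed for control units
    true_te = sum(true_effects) / n
    estimate = (sum(treated_obs) / len(treated_obs)
                - sum(control_obs) / len(control_obs))
    return true_te, estimate

true_te, estimate = simulate_randomized_trial()
print(f"true TE = {true_te:.3f}, difference in observed means = {estimate:.3f}")
```

No individual-level effect is recovered here; only the average is, which matches the point that the substitution works for expectations rather than for single Y<sub>i</sub>'s.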
Line 45: Line 44:
Because of the ‘consistency rule’, the potential outcomes are the values actually realized, so we can write Y<sub>i</sub><sup>0</sup> = Y<sub>i0</sub><sup>0</sup> and Y<sub>i</sub><sup>1</sup> = Y<sub>i1</sub><sup>1</sup> (“the consistency rule states that an individual’s potential outcome under a hypothetical condition that happened to materialize is precisely the outcome experienced by that individual”,<ref>{{cite journal|last1=Pearl|first1=Judea|title=On the consistency rule in causal inference: axiom, definition, assumption, or theorem?|journal=Epidemiology|date=2010|volume=21|issue=6|pages=872–875|doi=10.1097/EDE.0b013e3181f5d3fd|pmid=20864888}}</ref> p.&nbsp;872). Hence TE = E[Y<sub>i</sub><sup>1</sup> – Y<sub>i</sub><sup>0</sup>] = E[Y<sub>i1</sub><sup>1</sup> – Y<sub>i0</sub><sup>0</sup>].
 
−
Because of the "consistency rule", the potential outcomes can be expressed by the values actually observed: Y<sub>i</sub><sup>0</sup> = Y<sub>i0</sub><sup>0</sup> ; Y<sub>i</sub><sup>1</sup> = Y<sub>i1</sub><sup>1</sup> (the consistency rule states that, when the hypothetical condition holds, an individual's potential outcome is precisely the outcome that individual actually experiences,<ref>{{cite journal|last1=Pearl|first1=Judea|title=On the consistency rule in causal inference: axiom, definition, assumption, or theorem?|journal=Epidemiology|date=2010|volume=21|issue=6|pages=872–875|doi=10.1097/EDE.0b013e3181f5d3fd|pmid=20864888}}</ref> p.&nbsp;872). Hence TE = E[Y<sub>i</sub><sup>1</sup> – Y<sub>i</sub><sup>0</sup>] = E[Y<sub>i1</sub><sup>1</sup> – Y<sub>i0</sub><sup>0</sup>].
+
Because of the "consistency rule", the potential outcomes can be expressed by the values actually observed: Y<sub>i</sub><sup>0</sup> = Y<sub>i0</sub><sup>0</sup> ; Y<sub>i</sub><sup>1</sup> = Y<sub>i1</sub><sup>1</sup> (the consistency rule states that an individual's potential outcome is precisely the outcome that individual actually experiences,<ref>{{cite journal|last1=Pearl|first1=Judea|title=On the consistency rule in causal inference: axiom, definition, assumption, or theorem?|journal=Epidemiology|date=2010|volume=21|issue=6|pages=872–875|doi=10.1097/EDE.0b013e3181f5d3fd|pmid=20864888}}</ref> p.&nbsp;872). Hence TE = E[Y<sub>i</sub><sup>1</sup> – Y<sub>i</sub><sup>0</sup>] = E[Y<sub>i1</sub><sup>1</sup> – Y<sub>i0</sub><sup>0</sup>].
Line 61: Line 60:
Ignorability, either plain or conditional on some other variables, implies that such selection bias can be ignored, so one can recover (or estimate) the causal effect.
 
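−
Ignorability, whether plain or conditional, means that this selection bias can be ignored or eliminated, so one can obtain (or estimate) the causal effect.
+
Ignorability, whether plain or conditional on some given variables, means that this selection bias can be ignored or eliminated, so one can obtain (or estimate) the causal effect.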
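The conditional case can be illustrated with a small simulation. This is a sketch under assumed values rather than a method from the article: a single observed binary covariate `v` raises both the outcome and the chance of treatment (0.8 versus 0.2, chosen arbitrarily), the true effect is fixed at 1.0, and all function names are hypothetical. Assignment depends only on `v`, so ignorability holds conditional on `v` but not unconditionally.

```python
import random

def simulate_confounded_study(n=40000, seed=1):
    """Generate (v, tx, y) rows where treatment depends only on the covariate v."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        v = 1 if rng.random() < 0.5 else 0      # observed binary covariate
        y0 = 3.0 * v + rng.gauss(0.0, 1.0)      # untreated potential outcome
        y1 = y0 + 1.0                           # treated potential outcome, true effect 1.0
        p_treat = 0.8 if v == 1 else 0.2        # assignment probability depends on v only
        tx = 1 if rng.random() < p_treat else 0
        rows.append((v, tx, y1 if tx == 1 else y0))
    return rows

def mean(values):
    return sum(values) / len(values)

def naive_difference(rows):
    """Treated mean minus control mean, ignoring the covariate."""
    return (mean([y for _, tx, y in rows if tx == 1])
            - mean([y for _, tx, y in rows if tx == 0]))

def stratified_difference(rows):
    """Within-stratum differences, averaged with weights equal to stratum shares."""
    total = 0.0
    for level in (0, 1):
        stratum = [row for row in rows if row[0] == level]
        total += naive_difference(stratum) * len(stratum) / len(rows)
    return total

rows = simulate_confounded_study()
print(f"naive difference      = {naive_difference(rows):.2f}")
print(f"stratified difference = {stratified_difference(rows):.2f}")
```

Under these assumptions the naive comparison mixes the treatment effect with the effect of `v` (selection bias), while comparing within levels of `v` and then averaging recovers a value close to the true effect.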
Line 82: Line 81:
----
 
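This Chinese entry was compiled with the participation of user [[用户:shlay|shlay]], reviewed by [[用户: | ]], and edited by [[用户:思无涯咿呀咿呀|思无涯咿呀咿呀]]; comments are welcome on the discussion page.
+
This Chinese entry was compiled with the participation of user [[用户:shlay|shlay]], reviewed by [[用户:PengWu|PengWu]], and edited by [[用户:思无涯咿呀咿呀|思无涯咿呀咿呀]]; comments are welcome on the discussion page.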
       
'''The content of this entry is drawn from Wikipedia and publicly available sources and is released under the CC 3.0 license.'''
 