Line 3: |
Line 3: |
| In [[statistics]], '''ignorability''' is a feature of an [[experiment design]] whereby the method of data collection (and the nature of missing data) does not depend on the missing data. A missing data mechanism such as a treatment assignment or survey sampling strategy is "ignorable" if the missing data matrix, which indicates which variables are observed or missing, is independent of the missing data conditional on the observed data. | | In [[statistics]], '''ignorability''' is a feature of an [[experiment design]] whereby the method of data collection (and the nature of missing data) does not depend on the missing data. A missing data mechanism such as a treatment assignment or survey sampling strategy is "ignorable" if the missing data matrix, which indicates which variables are observed or missing, is independent of the missing data conditional on the observed data. |
| | | |
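− | In [[statistics]], '''ignorability''' is a feature of [[experiment design]], namely that the method of data collection (and the nature of the missing data) does not depend on the missing data. A missing data mechanism, such as treatment assignment or a survey sampling strategy, is "ignorable" if the missing data matrix, which indicates which variables are observed or missing, is independent of the missing data conditional on the observed data. | + | In [[statistics]], '''ignorability''' is an [[experiment design]] feature whereby the data collection method (and the nature of the missing data) does not depend on the missing data. A missing data mechanism (for example, treatment assignment or a survey sampling strategy) is said to be "ignorable" if the missing data matrix, which shows which variables are observed or missing, is independent of the missing data conditional on the observed data. |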
− | | |
| | | |
| | | |
| This idea is part of the [[Rubin Causal Model|Rubin Causal Inference Model]], developed by [[Donald Rubin]] in collaboration with [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. The exact definition differs between their articles in that period. In a 1978 article, Rubin discusses ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant for the data analysis, given everything that is recorded about each individual. Later, in 1983,<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> Rubin and Rosenbaum instead define ''strongly ignorable treatment assignment'', which is a stronger condition, mathematically formulated as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment <math>t</math>, <math>v</math> is a vector of covariates and <math>z</math> is the actual treatment. | | This idea is part of the [[Rubin Causal Model|Rubin Causal Inference Model]], developed by [[Donald Rubin]] in collaboration with [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. The exact definition differs between their articles in that period. 
In a 1978 article, Rubin discusses ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant for the data analysis, given everything that is recorded about each individual. Later, in 1983,<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> Rubin and Rosenbaum instead define ''strongly ignorable treatment assignment'', which is a stronger condition, mathematically formulated as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment <math>t</math>, <math>v</math> is a vector of covariates and <math>z</math> is the actual treatment. |
| | | |
− | | + | This idea is part of the [[Rubin causal inference model]], proposed jointly by [[Donald Rubin]] and [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. At that time, however, the exact definition of ignorability differed between their articles. In a 1978 article, Rubin discussed ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant for the data analysis, given everything that is recorded about each individual. Later, in 1983, Rubin and Rosenbaum defined "strongly ignorable treatment assignment" more precisely,<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> which is a stronger assumption, mathematically formulated as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment <math>t</math>, <math>v</math> denotes the covariates, and <math>z</math> is the actual treatment. |
− | This idea is part of the [[Rubin causal inference model]], developed jointly by [[Donald Rubin]] and [[Paul R. Rosenbaum|Paul Rosenbaum]] in the early 1970s. The exact definition differed between their articles in that period. In a 1978 article, Rubin discussed ''ignorable assignment mechanisms'',<ref name="rubin78">{{cite journal |last1=Rubin |first1=Donald |title=Bayesian Inference for Causal Effects: The Role of Randomization |journal=The Annals of Statistics |date=1978 |volume=6 |issue=1 |pages=34–58|doi=10.1214/aos/1176344064 |doi-access=free }}</ref> which can be understood to mean that the way individuals are assigned to treatment groups is irrelevant for the data analysis, given everything that is recorded about each individual. Later, in 1983, Rubin and Rosenbaum instead defined ''strongly ignorable treatment assignment'',<ref>{{cite journal |last1=Rubin |first1=Donald B. |last2=Rosenbaum |first2=Paul R. |title=The Central Role of the Propensity Score in Observational Studies for Causal Effects |journal=Biometrika |date=1983 |volume=70 |issue=1 |pages=41–55 |doi=10.2307/2335942 |jstor=2335942 |doi-access=free }}</ref> which is a stronger condition, mathematically formulated as <math>(r_1,r_0) \perp \!\!\!\perp z \mid v ,\quad 0<\operatorname{pr}(z=1 \mid v)<1 \quad \forall v</math>, where <math>r_t</math> is the potential outcome under treatment <math>t</math>, <math>v</math> is a vector of covariates, and <math>z</math> is the actual treatment.
| |
− | | |
| | | |
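A brief derivation sketch may help show why strong ignorability is useful. It reuses the symbols of the definition above and introduces one extra symbol as an assumption for illustration: <math>r = z r_1 + (1-z) r_0</math>, the response actually observed. Under these assumptions the average treatment effect can be written in terms of observable quantities:

<math display="block">E(r_1 - r_0) = E_v\{E(r_1 \mid v) - E(r_0 \mid v)\} = E_v\{E(r_1 \mid z=1, v) - E(r_0 \mid z=0, v)\} = E_v\{E(r \mid z=1, v) - E(r \mid z=0, v)\}</math>

The second equality uses the conditional independence of <math>(r_1,r_0)</math> and <math>z</math> given <math>v</math>; the requirement that the treatment probability lies strictly between 0 and 1 for every <math>v</math> ensures that both conditional expectations are defined.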
| | | |
| Pearl [2000] devised a simple graphical criterion, called ''back-door'', that entails ignorability and identifies sets of covariates that achieve this condition. | | Pearl [2000] devised a simple graphical criterion, called ''back-door'', that entails ignorability and identifies sets of covariates that achieve this condition. |
| | | |
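− | + | Pearl [2000] devised a simple graphical criterion, called "back-door", that entails ignorability and determines the sets of covariates that achieve this condition. |
− | Pearl [2000] devised a simple graphical criterion, called "back-door", that entails ignorability and identifies the sets of covariates that achieve this condition. | |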
− | | |
| | | |
| | | |
| Ignorability (better called exogeneity) simply means we can ignore how one ended up in one vs. the other group (‘treated’ Tx = 1, or ‘control’ Tx = 0) when it comes to the potential outcome (say Y). It is also called unconfoundedness, selection on the observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref> | | Ignorability (better called exogeneity) simply means we can ignore how one ended up in one vs. the other group (‘treated’ Tx = 1, or ‘control’ Tx = 0) when it comes to the potential outcome (say Y). It is also called unconfoundedness, selection on the observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref> |
| | | |
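− | Ignorability (better called exogeneity) simply means that, when it comes to the potential outcome, we can ignore how a person ended up in one group rather than the other ("treated" Tx = 1, or "control" Tx = 0). It is also called unconfoundedness, selection on the observables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref> | + | Ignorability (better called exogeneity) simply means that, when it comes to the potential outcome (Y), how a person ended up in one group rather than the other ("treated" Tx = 1, or "control" Tx = 0) can be ignored. It is also called unconfoundedness, selection on observable variables, or no omitted variable bias.<ref>{{cite journal|last1=Yamamoto|first1=Teppei|title=Understanding the Past: Statistical Analysis of Causal Attribution|journal=American Journal of Political Science|date=2012|volume=56|issue=1|pages=237–256|doi=10.1111/j.1540-5907.2011.00539.x|hdl=1721.1/85887}}</ref> |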
− | | |
| | | |
| | | |
| Formally it has been written as [Y<sub>i</sub>1, Y<sub>i</sub>0] ⊥ Tx<sub>i</sub>, or in words: the potential Y outcome of person ''i'', had they been treated or not, does not depend on whether they have actually (observably) been treated or not. In other words, we can ignore how people ended up in one vs. the other condition, and treat their potential outcomes as exchangeable. While this seems dense, it becomes clear if we add subscripts for the ‘realized’ and superscripts for the ‘ideal’ (potential) worlds (notation suggested by [https://www.cambridge.org/core/books/statistical-models-and-causal-inference/7CE8D4957FF6E9615AAAC4128FA8246E David Freedman]; a visual can help here: [https://drive.google.com/open?id=1nLHHH0il225LIy33nRiH3ZfgoX1_-_V9 potential outcomes simplified]). | | Formally it has been written as [Y<sub>i</sub>1, Y<sub>i</sub>0] ⊥ Tx<sub>i</sub>, or in words: the potential Y outcome of person ''i'', had they been treated or not, does not depend on whether they have actually (observably) been treated or not. In other words, we can ignore how people ended up in one vs. the other condition, and treat their potential outcomes as exchangeable. While this seems dense, it becomes clear if we add subscripts for the ‘realized’ and superscripts for the ‘ideal’ (potential) worlds (notation suggested by [https://www.cambridge.org/core/books/statistical-models-and-causal-inference/7CE8D4957FF6E9615AAAC4128FA8246E David Freedman]; a visual can help here: [https://drive.google.com/open?id=1nLHHH0il225LIy33nRiH3ZfgoX1_-_V9 potential outcomes simplified]). |
− |
| |
− |
| |
| | | |
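| Formally, it is written as [Y<sub>i</sub>1, Y<sub>i</sub>0] ⊥ Tx<sub>i</sub>, or in words: the potential Y outcome of person ''i'', had they been treated or not, does not depend on whether they have actually (observably) been treated. In other words, we can ignore how people ended up in one condition rather than the other, and treat their potential outcomes as exchangeable. While this seems dense, it becomes clear if we add subscripts for the "realized" and superscripts for the "ideal" (potential) worlds (notation suggested by [https://www.cambridge.org/core/books/statistical-models-and-causal-inference/7CE8D4957FF6E9615AAAC4128FA8246E David Freedman]; a visual can help here: [https://drive.google.com/open?id=1nLHHH0il225LIy33nRiH3ZfgoX1_-_V9 potential outcomes simplified]). | | Formally, it is written as [Y<sub>i</sub>1, Y<sub>i</sub>0] ⊥ Tx<sub>i</sub>, or in words: the potential Y outcome of person ''i'', had they been treated or not, does not depend on whether they have actually (observably) been treated. In other words, we can ignore how people ended up in one condition rather than the other, and treat their potential outcomes as exchangeable. While this seems dense, it becomes clear if we add subscripts for the "realized" and superscripts for the "ideal" (potential) worlds (notation suggested by [https://www.cambridge.org/core/books/statistical-models-and-causal-inference/7CE8D4957FF6E9615AAAC4128FA8246E David Freedman]; a visual can help here: [https://drive.google.com/open?id=1nLHHH0il225LIy33nRiH3ZfgoX1_-_V9 potential outcomes simplified]). |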
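The exchangeability idea above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a method taken from the article: the function name <code>naive_difference</code>, the unobserved trait <code>u</code>, the constant treatment effect of 2 and the assignment probabilities are all hypothetical choices made for illustration. When assignment ignores <code>u</code>, the simple treated-minus-control difference in observed outcomes should land close to the true effect; when assignment depends on <code>u</code>, ignorability fails and the same difference is biased.

```python
import random
from statistics import mean


def naive_difference(n, confounded, seed=0):
    """Treated-minus-control difference in observed outcomes.

    All modelling choices here are illustrative assumptions:
    a constant treatment effect of 2.0 and an unobserved trait u
    that shifts both potential outcomes.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)        # unobserved trait (hypothetical)
        y0 = u + rng.gauss(0.0, 1.0)   # potential outcome if untreated
        y1 = y0 + 2.0                  # potential outcome if treated
        if confounded:
            # Assignment depends on u, so (y1, y0) is not independent of tx.
            p_treat = 0.8 if u > 0 else 0.2
        else:
            # Coin-flip assignment: ignorable by construction.
            p_treat = 0.5
        tx = 1 if rng.random() < p_treat else 0
        # Only one potential outcome is ever observed per person.
        if tx == 1:
            treated.append(y1)
        else:
            control.append(y0)
    return mean(treated) - mean(control)


if __name__ == "__main__":
    print("randomized:", round(naive_difference(20000, confounded=False), 2))
    print("confounded:", round(naive_difference(20000, confounded=True), 2))
```

In the confounded case the gap between the printed value and 2 reflects how much of the difference is driven by who ended up treated rather than by the treatment itself.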
Line 42: |
Line 34: |
| | | |
| | | |
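− | Similarly, *Y<sub>1</sub><sup>0</sup>/Y<sub>0</sub><sup>0</sup> are the potential Y outcomes had the individual not been treated (superscript <sup>0</sup>), when in reality they have been (*Y<sub>1</sub><sup>0</sup>, subscript <sub>1</sub>), or have not actually been (Y<sub>0</sub><sup>0</sup>). | + | Similarly, *Y<sub>1</sub><sup>0</sup>/Y<sub>0</sub><sup>0</sup> are the potential Y outcomes had the individual not been treated (superscript 0), when in reality they have been (*Y<sub>1</sub><sup>0</sup>, subscript <sub>1</sub>), or have not actually been (Y<sub>0</sub><sup>0</sup>). |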
| | | |
| | | |