更改

添加26字节 、 2021年6月4日 (五) 00:09
无编辑摘要
第26行: 第26行:  
In order to define formally the ATE, we define two potential outcomes : <math>y_{0}(i)</math> is the value of the outcome variable for individual <math>i</math> if they are not treated, <math>y_{1}(i)</math> is the value of the outcome variable for individual <math>i</math> if they are treated. For example, <math>y_{0}(i)</math>  is the health status of the individual if they are not administered the drug under study and <math>y_{1}(i)</math> is the health status if they are administered the drug.
 
   −
为了正式定义平均处理效应,我们定义了两个潜在的结果: <math>y_{0}(i)</math > 是个体 <math> i </math> 没有被处理的结果变量的取值,<math> y _ {1}(i) </math> 是个体 <math> i </math> 被处理的结果变量的取值。例如,<math>y_{0}(i)</math > 是个体 <math> i </math> 没有被注射研究药物的健康状态,<math>y_{1}(i)</math > 是个体 <math> i </math> 被注射药物的健康状态。
+
为了形式化定义平均处理效应,我们定义了两个潜在的结果: <math>y_{0}(i)</math > 是个体 <math> i </math> 没有被处理时的结果变量的取值,<math> y _ {1}(i) </math> 是个体 <math> i </math> 被处理时的结果变量的取值。例如,<math>y_{0}(i)</math > 是个体 <math> i </math> 没有被注射研究药物时的健康状态,<math>y_{1}(i)</math > 是个体 <math> i </math> 被注射药物时的健康状态。
    
The treatment effect for individual <math>i</math> is given by <math>y_{1}(i)-y_{0}(i)=\beta(i)</math>. In the general case, there is no reason to expect this effect to be constant across individuals. The average treatment effect is given by  
 
第34行: 第34行:  
:<math>\text{ATE} = \frac{1}{N}\sum_i (y_{1}(i)-y_{0}(i))</math>
 
   −
这里对总体中所有N数量个体进行了求和。
+
这里对总体中全部 <math>N</math> 个个体的处理效应求和并取平均。
 
         
If we could observe, for each individual, <math>y_{1}(i)</math> and <math>y_{0}(i)</math> among a large representative sample of the population, we could estimate the ATE simply by taking the average value of <math>y_{1}(i)-y_{0}(i)</math> across the sample. However, we can not observe both <math>y_{1}(i)</math> and <math>y_{0}(i)</math> for each individual since an individual cannot be both treated and not treated. For example, in the drug example, we can only observe  <math>y_{1}(i)</math> for individuals who have received the drug and <math>y_{0}(i)</math> for those who did not receive it. This is the main problem faced by scientists in the evaluation of treatment effects and has triggered a large body of estimation techniques.
 
   −
如果我们能观察到一个大型代表性样本中每个个体的<math> y _ {1}(i) </math> 和 <math> y _ {0}(i) </math> ,我们可以简单地通过取样本中 <math> y _ {1}(i)-y _ {0}(i) </math> 的平均值来估计平均治疗效果。然而,我们不能同时观察每个个体的<math> y _ {1}(i)、y _ {0}(i) </math>,因为每个个体不能同时被处理和不被处理。例如,在药物例子中,我们只能观察到个体接受过药物治疗的<math> y _ {1}(i) </math> 和个体未接受药物的 <math> y _ {0}(i) </math> 。这是研究学者在评估治疗效果时面临的主要问题,并因此引发了大量估计技术的研究。
+
如果我们能观察到一个大型代表性样本中每个个体的 <math> y _ {1}(i) </math> 和 <math> y _ {0}(i) </math>,我们可以简单地通过取样本中 <math> y _ {1}(i)-y _ {0}(i) </math> 的平均值来估计平均处理效应。然而,我们无法同时观察到每个个体的 <math> y _ {1}(i) </math> 和 <math> y _ {0}(i) </math>,因为同一个体不能既被处理又不被处理。例如,在药物的例子中,我们只能观察到接受药物的个体的 <math> y _ {1}(i) </math> 和未接受药物的个体的 <math> y _ {0}(i) </math>。这是研究者们在评估处理效应时面临的主要问题,并因此催生了大量与估计方法相关的研究。
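The observability problem described in this paragraph can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not part of the original article: the population, the outcome distribution, the effect sizes, and the function names `simulate_population`, `true_ate`, and `difference_in_means` are all hypothetical. It assumes treatment is assigned by a fair coin flip, so the difference in observed group means should land close to the ATE even though each individual reveals only one potential outcome.

```python
import random
from statistics import mean


def simulate_population(n, seed=0):
    """Return (y0, y1): hypothetical potential outcomes for n individuals."""
    rng = random.Random(seed)
    # y0(i): outcome without treatment (assumed distribution)
    y0 = [rng.gauss(50.0, 10.0) for _ in range(n)]
    # beta(i): individual treatment effect, allowed to vary across individuals
    beta = [rng.gauss(5.0, 2.0) for _ in range(n)]
    # y1(i) = y0(i) + beta(i): outcome with treatment
    y1 = [a + b for a, b in zip(y0, beta)]
    return y0, y1


def true_ate(y0, y1):
    """ATE = (1/N) * sum_i (y1(i) - y0(i)); needs BOTH potential outcomes."""
    return mean(b - a for a, b in zip(y0, y1))


def difference_in_means(y0, y1, seed=1):
    """Estimate the ATE from observed data under random assignment.

    Each individual reveals only one potential outcome:
    y1(i) if treated, y0(i) otherwise.
    """
    rng = random.Random(seed)
    treated = [rng.random() < 0.5 for _ in y0]
    observed_treated = [b for b, t in zip(y1, treated) if t]
    observed_control = [a for a, t in zip(y0, treated) if not t]
    return mean(observed_treated) - mean(observed_control)


y0, y1 = simulate_population(20000)
ate = true_ate(y0, y1)
estimate = difference_in_means(y0, y1)
print(f"true ATE: {ate:.3f}, difference-in-means estimate: {estimate:.3f}")
```

Outside a simulation, `true_ate` cannot be computed, because only one of the two lists is ever observed for a given individual. The estimate here is reliable only because assignment is random.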
    
== 估计 Estimation ==
 
第46行: 第45行:  
Depending on the data and its underlying circumstances, many methods can be used to estimate the ATE. The most common ones are:
 
   −
根据数据及其潜在环境,可以使用许多方法来估计平均处理效应<math> \text{ATE} </math>。最常见方法是:
+
根据数据及其背景情况的不同,我们可以使用许多方法来估计平均处理效应 <math> \text{ATE} </math>。最常见的方法包括:
    
* 自然实验 Natural Experiment  
 
第62行: 第61行:  
Consider an example where all units are unemployed individuals, and some experience a policy intervention (the treatment group), while others do not (the control group). The causal effect of interest is the impact a job search monitoring policy (the treatment) has on the length of an unemployment spell: On average, how much shorter would one's unemployment be if they experienced the intervention? The ATE, in this case, is the difference in expected values (means) of the treatment and control groups' length of unemployment.
 
   −
考虑一个失业群体,对一些个体给与政策干预(处理组),其余的不做任何处理(控制组) 。现需要计算求职监控政策(干预)对失业期长短的影响: 平均来说,如果对个体进行求职监控(给与干预),失业期会缩短多少?在这种情况下,平均处理效应是处理组和对照组的失业时间长度的期望值(平均值)差异。
+
考虑一个失业群体,对其中一些个体给予政策干预(处理组),其余的不做任何处理(对照组)。现需要计算求职监控政策(处理)对失业期长短的影响:平均来说,如果个体接受了这项干预,其失业期会缩短多少?在这种情况下,平均处理效应是处理组与对照组失业时间长度的期望值(平均值)之差。
       
A positive ATE, in this example, would suggest that the job policy increased the length of unemployment. A negative ATE would suggest that the job policy decreased the length of unemployment. An ATE estimate equal to zero would suggest that there was no advantage or disadvantage to providing the treatment in terms of the length of unemployment. Determining whether an ATE estimate is distinguishable from zero (either positively or negatively) requires statistical inference.
 
   −
在这个例子中,正值平均处理效应意味着就业政策延长了失业期,负值平均处理效应表明就业政策缩短了失业期。零值平均处理效应表明提供就业政策对失业期长短并没有任何利处或不利,判断一个平均处理效应估计值是否可以区分为零需要进行统计推断。
+
在这个例子中,平均处理效应为正值意味着就业政策延长了失业期,为负值则表明就业政策缩短了失业期。平均处理效应的估计值等于零表明,就失业期长短而言,提供该处理既无益处也无坏处。要判断平均处理效应的估计值是否显著不同于零(无论为正或为负),需要进行统计推断。
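The unemployment example can be sketched numerically. The spell lengths below are made-up illustrative data, and `ate_estimate_with_ci` is a hypothetical helper name. It computes the difference in group means and an approximate 95% confidence interval from the unpooled standard error with a normal critical value of 1.96. That approximation is an assumption of this sketch and is rough for samples this small; a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import mean, variance


def ate_estimate_with_ci(treated, control, z=1.96):
    """Difference in means with an approximate confidence interval.

    Uses the unpooled standard error sqrt(s1^2/n1 + s0^2/n0) and a normal
    critical value z (an approximation, assumed here for simplicity).
    """
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, (diff - z * se, diff + z * se)


# Hypothetical unemployment spell lengths in weeks (illustrative only).
treated_weeks = [12, 15, 9, 14, 11, 10, 13, 8, 12, 11]    # monitored job seekers
control_weeks = [16, 14, 18, 15, 17, 13, 19, 16, 15, 17]  # not monitored

diff, (low, high) = ate_estimate_with_ci(treated_weeks, control_weeks)
print(f"estimated ATE: {diff:.2f} weeks, approx. 95% CI: ({low:.2f}, {high:.2f})")
```

In this made-up sample the estimate is negative and the interval excludes zero, which would be read as the policy shortening unemployment spells. An interval that contains zero would mean the estimate cannot be distinguished from no effect.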
第74行: 第73行:
   −
因为平均处理效应是对处理的平均效果估计,正值或者负值平均处理效应并不表明处理对任意特定个体是有益的或者有害的。因此,平均处理效应忽略了治疗效果分布。即使平均效应是正值,群体的部分个体也可能因为这种处理或者干预而使得情况变得更糟。
+
因为平均处理效应是对处理的平均效果估计,正值或者负值平均处理效应并不表明处理对任意特定个体是有益的或者有害的。因此,平均处理效应忽略了处理效应的分布。即使平均处理效应是正值,群体的部分个体也可能因为这种处理或者干预而使得情况变得更糟。
         −
== Heterogenous treatment effects ==
+
== 异质处理效应 Heterogenous treatment effects ==
    
Some researchers call a treatment effect "heterogenous" if it affects different individuals differently (heterogeneously).  For example, perhaps the above treatment of a job search monitoring policy affected men and women differently, or people who live in different states differently.
 
      −
一些研究人员将处理效果依赖于个体的情况称之为“异质性”。例如,上面提到的求职监控政策依赖于性别(男、女)或者是区域。
+
如果一种处理对不同个体的影响不同,一些研究人员就称其处理效应具有“异质性”。例如,上面提到的求职监控政策对男性和女性的影响可能不同,对居住在不同州的人的影响也可能不同。
第90行: 第89行:
   −
一种异质处理效应的研究方法是将研究数据进行分组(例如,按照男、女性别,或者按区域) ,比较平均治疗效果在子组内的效应差异。每个子组的平均处理效应被称为“条件平均治疗效应”(Cnditional Average Treatment Effect,CATE) ,也就是说,每个子组的 平均处理效应被称为条件平均治疗效应,以子组内的成员为条件。
+
一种研究异质处理效应的方法是将研究数据分组(例如,按男、女性别,或者按地区),再比较各子组的平均处理效应。每个子组的平均处理效应被称为'''<font color="#ff8000">“条件平均处理效应”(Conditional Average Treatment Effect,CATE)</font>''',即以个体所属的子组为条件的平均处理效应。
第97行: 第96行:
   −
这种研究方法存在的一个问题是,子组的数据可能比未分组的数据要少得多,所以如果这项研究在没有进行分组分析的情况下就能检测出主要的影响,可能没有足够的数据来正确判断在子组上的影响 (感觉逻辑不对,个人建议删除这句话)。
+
这种研究方法存在的一个问题是,每个子组的数据量可能远少于整体数据量;因此,即使研究的样本量足以在不分组的情况下检测出主效应,也可能没有足够的数据来可靠地判断各子组上的效应。
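The subgroup approach can be sketched as a per-group difference in means. The records and the helper name `cate_by_group` are hypothetical, and the sketch assumes treatment was randomly assigned within each subgroup. It also returns the number of observations per subgroup, since small subgroups are exactly where the data limitation noted above applies.

```python
from collections import defaultdict
from statistics import mean


def cate_by_group(records):
    """Return ({group: CATE estimate}, {group: sample size}).

    records: iterable of (group, treated, outcome) tuples.
    The CATE estimate is the treated-minus-control mean within each group;
    groups missing either arm are skipped.
    """
    arms = defaultdict(lambda: {True: [], False: []})
    for group, treated, outcome in records:
        arms[group][treated].append(outcome)

    estimates, sizes = {}, {}
    for group, by_arm in arms.items():
        sizes[group] = len(by_arm[True]) + len(by_arm[False])
        if by_arm[True] and by_arm[False]:
            estimates[group] = mean(by_arm[True]) - mean(by_arm[False])
    return estimates, sizes


# Hypothetical unemployment spell lengths in weeks (illustrative only).
records = [
    ("women", True, 10), ("women", True, 12),
    ("women", False, 16), ("women", False, 18),
    ("men", True, 14), ("men", True, 16),
    ("men", False, 15), ("men", False, 17),
]

cate, sizes = cate_by_group(records)
for group in cate:
    print(f"{group}: CATE estimate {cate[group]} weeks from {sizes[group]} observations")
```

With only four observations per subgroup, these estimates would be far too noisy to support conclusions in practice; the sketch shows the computation, not a credible analysis.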
      第105行: 第104行:  
There is some work on detecting heterogenous treatment effects using random forests.
 
   −
有一些利用随机森林检测异质处理效果相关工作。
+
也有一些利用随机森林检测异质处理效应的相关工作[3][4]。
     