Randomized controlled trial

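From 集智百科 - Complex Systems | Artificial Intelligence | Complexity Science | Complex Networks | Self-Organization
Revision as of 13:09, 15 July 2021 (Thursday) by 薄荷 (talk | contribs)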
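Figure: Flowchart, modified from the CONSORT (Consolidated Standards of Reporting Trials) 2010 Statement, of the phases of a two-group parallel randomized trial: enrollment, allocation, intervention, follow-up, and data analysis. In a controlled trial, one of the interventions serves as the control.[1]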


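A randomized controlled trial[2] (RCT) is a type of scientific experiment (for example, a clinical trial) or intervention study (as opposed to an observational study) that aims to reduce certain sources of bias when testing the effectiveness of new treatments. Subjects are randomly allocated to two or more groups, the groups are treated differently, and the groups are then compared on a measured response. One or more groups (the experimental groups) receive the intervention being assessed, while another (usually called the control group) receives an alternative treatment, such as a placebo or no intervention. The groups are monitored under the conditions of the trial design to determine the effectiveness of the experimental intervention, and efficacy is assessed in comparison with the control.[3] There may be more than one treatment group or more than one control group.

The trial may be blinded, meaning that information that could influence the participants is withheld until the experiment is complete. A blind can be imposed on any participant in an experiment, including subjects, researchers, technicians, data analysts, and evaluators. Effective blinding may reduce or eliminate some sources of experimental bias.

Subjects are assigned to groups at random when treatments are allocated. This randomization reduces selection bias and allocation bias and balances both known and unknown prognostic factors.[4] Blinding reduces other forms of experimenter and subject bias.

A well-blinded RCT is often considered the gold standard for clinical trials. Blinded RCTs are commonly used to test the efficacy of medical interventions and may also provide information about adverse effects, such as drug reactions. A randomized controlled trial can provide compelling evidence that the study treatment causes an effect on human health.[5]

The terms "RCT" and "randomized trial" are sometimes used synonymously, but the latter term omits mention of controls and can therefore describe studies that compare multiple treatment groups with each other in the absence of a control group.[6] Similarly, the scientific literature often uses "randomized clinical trial" or "randomized comparative trial", which leads to ambiguity.[7][8] Not all randomized clinical trials are randomized controlled trials (and some could never be, because instituting controls would be impractical or unethical). The term randomized controlled clinical trial is an alternative used in clinical research;[9] however, RCTs are also employed in other research areas, including many of the social sciences.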


History

The first reported clinical trial was conducted by James Lind in 1747 to identify treatment for scurvy.[10] The first blind experiment was conducted by the French Royal Commission on Animal Magnetism in 1784 to investigate the claims of mesmerism. An early essay advocating the blinding of researchers came from Claude Bernard in the latter half of the 19th century. Bernard recommended that the observer of an experiment should not have knowledge of the hypothesis being tested. This suggestion contrasted starkly with the prevalent Enlightenment-era attitude that scientific observation can only be objectively valid when undertaken by a well-educated, informed scientist.[11] The first study recorded to have a blinded researcher was conducted in 1907 by W. H. R. Rivers and H. N. Webber to investigate the effects of caffeine.[12]

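In the 1880s, Charles Sanders Peirce and Joseph Jastrow introduced randomized experiments in psychology[10] and in education.[11][12][13]

In the early 20th century, Jerzy Neyman[14] and Ronald A. Fisher introduced randomized experiments into agricultural research. Fisher's experimental research and his writings popularized randomized experiments.[15]

The first published RCT in medicine appeared in the 1948 paper entitled "Streptomycin treatment of pulmonary tuberculosis", which described a Medical Research Council investigation.[16][17][18] One of the authors of that paper, Austin Bradford Hill, is credited with having conceived the modern RCT.[19]

Trial design was further influenced by the large-scale ISIS trials of heart attack treatments conducted in the 1980s.[20]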

By the late 20th century, RCTs were recognized as the standard method for "rational therapeutics" in medicine.[24] As of 2004, more than 150,000 RCTs were in the Cochrane Library.[22] To improve the reporting of RCTs in the medical literature, an international group of scientists and editors published Consolidated Standards of Reporting Trials (CONSORT) Statements in 1996, 2001 and 2010, and these have become widely accepted.[1][4] Randomization is the process of assigning trial subjects to treatment or control groups using an element of chance to determine the assignments in order to reduce the bias.

Although the principle of clinical equipoise ("genuine uncertainty within the expert medical community... about the preferred treatment") common to clinical trials[25] has been applied to RCTs, the ethics of RCTs have special considerations. For one, it has been argued that equipoise itself is insufficient to justify RCTs.[26] For another, "collective equipoise" can conflict with a lack of personal equipoise (e.g., a personal belief that an intervention is effective).[27] Finally, Zelen's design, which has been used for some RCTs, randomizes subjects before they provide informed consent, which may be ethical for RCTs of screening and selected therapies, but is likely unethical "for most therapeutic trials."[28][29]

Although subjects almost always provide informed consent for their participation in an RCT, studies since 1982 have documented that RCT subjects may believe that they are certain to receive treatment that is best for them personally; that is, they do not understand the difference between research and treatment.[21][22] Further research is necessary to determine the prevalence of and ways to address this "therapeutic misconception".[22]

The RCT method variations may also create cultural effects that have not been well understood.[32] For example, patients with terminal illness may join trials in the hope of being cured, even when treatments are unlikely to be successful.

Trial registration

In 2004, the International Committee of Medical Journal Editors (ICMJE) announced that all trials starting enrolment after July 1, 2005 must be registered prior to consideration for publication in one of the 12 member journals of the committee.[23] However, trial registration may still occur late or not at all.[24][25] Medical journals have been slow to adopt policies requiring clinical trial registration as a prerequisite for publication.[26]

Classifications

By study design

One way to classify RCTs is by study design. From most to least common in the healthcare literature, the major categories of RCT study designs are:[27]

  • Parallel-group – each participant is randomly assigned to a group, and all the participants in the group receive (or do not receive) an intervention.[28][29]
  • Crossover – over time, each participant receives (or does not receive) an intervention in a random sequence.[30][31]
  • Cluster – pre-existing groups of participants (e.g., villages, schools) are randomly selected to receive (or not receive) an intervention.[32][33]
  • Factorial – each participant is randomly assigned to a group that receives a particular combination of interventions or non-interventions (e.g., group 1 receives vitamin X and vitamin Y, group 2 receives vitamin X and placebo Y, group 3 receives placebo X and vitamin Y, and group 4 receives placebo X and placebo Y).

An analysis of the 616 RCTs indexed in PubMed during December 2006 found that 78% were parallel-group trials, 16% were crossover, 2% were split-body, 2% were cluster, and 2% were factorial.[27]

By outcome of interest (efficacy vs. effectiveness)

RCTs can be classified as "explanatory" or "pragmatic."[34] Explanatory RCTs test efficacy in a research setting with highly selected participants and under highly controlled conditions.[34] In contrast, pragmatic RCTs (pRCTs) test effectiveness in everyday practice with relatively unselected participants and under flexible conditions; in this way, pragmatic RCTs can "inform decisions about practice."[34]

By hypothesis (superiority vs. noninferiority vs. equivalence)

Another classification of RCTs categorizes them as "superiority trials", "noninferiority trials", and "equivalence trials", which differ in methodology and reporting.[35] Most RCTs are superiority trials, in which one intervention is hypothesized to be superior to another in a statistically significant way.[35] Some RCTs are noninferiority trials "to determine whether a new treatment is no worse than a reference treatment."[35] Other RCTs are equivalence trials in which the hypothesis is that two interventions are indistinguishable from each other.[35]
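
The three hypotheses are often distinguished by comparing a confidence interval for the treatment difference with a prespecified margin. The following is a minimal sketch of that comparison, not a method taken from the cited sources. It assumes that the difference is computed as new minus reference, that higher outcome values are better, and that `margin` is a hypothetical, clinically chosen positive margin; the function name `interpret_ci` is illustrative only.

```python
def interpret_ci(ci_low, ci_high, margin):
    """Interpret a confidence interval for (new - reference), higher = better.

    `margin` is a hypothetical prespecified positive margin (an assumption
    of this sketch, not a value from the article).
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    return {
        # Superiority: the whole interval lies above zero.
        "superior": ci_low > 0,
        # Noninferiority: the interval rules out a deficit larger than the margin.
        "noninferior": ci_low > -margin,
        # Equivalence: the whole interval lies strictly inside (-margin, +margin).
        "equivalent": -margin < ci_low and ci_high < margin,
    }


if __name__ == "__main__":
    print(interpret_ci(-1.0, 2.0, margin=3.0))
```

Under these assumptions, an interval of (-1, 2) with a margin of 3 supports noninferiority and equivalence but not superiority, because it still includes zero.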

Randomization

The advantages of proper randomization in RCTs include:[36]

  • It eliminates bias in treatment assignment, specifically selection bias and confounding.
  • "It facilitates blinding (masking) of the identity of treatments from investigators, participants, and assessors."
  • "It permits the use of probability theory to express the likelihood that any difference in outcome between treatment groups merely indicates chance."

There are two processes involved in randomizing patients to different interventions. First is choosing a randomization procedure to generate an unpredictable sequence of allocations; this may be a simple random assignment of patients to any of the groups with equal probability, may be "restricted", or may be "adaptive." A second and more practical issue is allocation concealment, which refers to the stringent precautions taken to ensure that the group assignment of patients is not revealed before they are definitively allocated to their respective groups. Non-random "systematic" methods of group assignment, such as alternating subjects between one group and the other, can cause "limitless contamination possibilities" and can cause a breach of allocation concealment.

However, empirical evidence that adequate randomization changes outcomes relative to inadequate randomization has been difficult to detect.[37]

Procedures

The treatment allocation is the desired proportion of patients in each treatment arm.

An ideal randomization procedure would achieve the following goals:[38]

  • Maximize statistical power, especially in subgroup analyses. Generally, equal group sizes maximize statistical power; however, unequal group sizes may be more powerful for some analyses (e.g., multiple comparisons of placebo versus several doses using Dunnett's procedure[39]), and are sometimes desired for non-analytic reasons (e.g., patients may be more motivated to enroll if there is a higher chance of getting the test treatment, or regulatory agencies may require a minimum number of patients exposed to treatment).[40]
  • Minimize selection bias. This may occur if investigators can consciously or unconsciously preferentially enroll patients between treatment arms. A good randomization procedure will be unpredictable so that investigators cannot guess the next subject's group assignment based on prior treatment assignments. The risk of selection bias is highest when previous treatment assignments are known (as in unblinded studies) or can be guessed (perhaps if a drug has distinctive side effects).
  • Minimize allocation bias (or confounding). This may occur when covariates that affect the outcome are not equally distributed between treatment groups, and the treatment effect is confounded with the effect of the covariates (i.e., an "accidental bias"[36][41]). If the randomization procedure causes an imbalance in covariates related to the outcome across groups, estimates of effect may be biased if not adjusted for the covariates (which may be unmeasured and therefore impossible to adjust for).

However, no single randomization procedure meets those goals in every circumstance, so researchers must select a procedure for a given study based on its advantages and disadvantages.

Simple

This is a commonly used and intuitive procedure, similar to "repeated fair coin-tossing."[36] Also known as "complete" or "unrestricted" randomization, it is robust against both selection and accidental biases. However, its main drawback is the possibility of imbalanced group sizes in small RCTs. It is therefore recommended only for RCTs with over 200 subjects.[42]
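
The drawback for small trials can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than a procedure from the cited sources: it assumes two groups and a fair coin, and the function names and the 60% imbalance threshold are hypothetical choices made for the example.

```python
import random
from math import comb


def simple_randomize(n_subjects, rng):
    """Assign each subject to 'A' or 'B' by an independent fair coin toss."""
    return [rng.choice("AB") for _ in range(n_subjects)]


def prob_imbalance(n_subjects, max_percent=60):
    """Exact probability that the larger group exceeds max_percent of subjects.

    Uses the binomial distribution with p = 0.5; integer arithmetic avoids
    floating-point issues at the threshold.
    """
    extreme = sum(
        comb(n_subjects, k)
        for k in range(n_subjects + 1)
        if 100 * max(k, n_subjects - k) > max_percent * n_subjects
    )
    return extreme / 2 ** n_subjects


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed only to make the demo repeatable
    print(simple_randomize(10, rng))
    for n in (20, 200):
        print(n, round(prob_imbalance(n), 4))
```

With 20 subjects, the chance that one group ends up with more than 60% of participants is roughly 0.26, whereas with 200 subjects it falls well below 0.01, which is consistent with reserving simple randomization for larger trials.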

Restricted

To balance group sizes in smaller RCTs, some form of "restricted" randomization is recommended.[42] The major types of restricted randomization used in RCTs are:

  • Permuted-block randomization or blocked randomization: a "block size" and "allocation ratio" (number of subjects in one group versus the other group) are specified, and subjects are allocated randomly within each block.[43] For example, a block size of 6 and an allocation ratio of 2:1 would lead to random assignment of 4 subjects to one group and 2 to the other. This type of randomization can be combined with "stratified randomization", for example by center in a multicenter trial, to "ensure good balance of participant characteristics in each group."[4] A special case of permuted-block randomization is random allocation, in which the entire sample is treated as one block.[43] The major disadvantage of permuted-block randomization is that even if the block sizes are large and randomly varied, the procedure can lead to selection bias.[38] Another disadvantage is that "proper" analysis of data from permuted-block-randomized RCTs requires stratification by blocks.[42]
  • Adaptive biased-coin randomization methods (of which urn randomization is the most widely known type): In these relatively uncommon methods, the probability of being assigned to a group decreases if the group is overrepresented and increases if the group is underrepresented.[43] The methods are thought to be less affected by selection bias than permuted-block randomization.[42]
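
Permuted-block randomization as described above can be sketched as follows. This is a minimal illustration, assuming two groups labelled "A" and "B" and a fixed block size; the function name and parameters are hypothetical, and real trials often vary the block size at random.

```python
import random


def permuted_block_sequence(n_blocks, block_size, ratio, rng):
    """Build an allocation sequence from independently shuffled blocks.

    ratio is a pair such as (2, 1), meaning 2 subjects in group A for every
    1 subject in group B within each block.
    """
    a, b = ratio
    if block_size % (a + b) != 0:
        raise ValueError("block_size must be a multiple of the ratio total")
    units = block_size // (a + b)
    template = ["A"] * (a * units) + ["B"] * (b * units)
    sequence = []
    for _ in range(n_blocks):
        block = template[:]   # copy so each block is shuffled independently
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


if __name__ == "__main__":
    # Block size 6 with a 2:1 ratio gives 4 'A' and 2 'B' in every block.
    print(permuted_block_sequence(3, 6, (2, 1), random.Random(7)))
```

Because every block is balanced, group sizes can never drift further apart than one block allows, which is also why the last assignments in a block become predictable if earlier ones are known.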

Adaptive

At least two types of "adaptive" randomization procedures have been used in RCTs, but much less frequently than simple or restricted randomization:

  • Covariate-adaptive randomization, of which one type is minimization: The probability of being assigned to a group varies in order to minimize "covariate imbalance."[42] Minimization is reported to have "supporters and detractors";[43] because only the first subject's group assignment is truly chosen at random, the method does not necessarily eliminate bias on unknown factors.[4]
  • Response-adaptive randomization, also known as outcome-adaptive randomization: The probability of being assigned to a group increases if the responses of the prior patients in the group were favorable.[42] Although arguments have been made that this approach is more ethical than other types of randomization when the probability that a treatment is effective or ineffective increases during the course of an RCT, ethicists have not yet studied the approach in detail.[44]
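
One simple deterministic variant of minimization can be sketched as follows. It is an illustrative simplification, not the algorithm of any trial cited here: it assumes equal weighting of covariates, counts only subjects who share a covariate level with the new subject, and breaks ties at random. All names (`minimization_assign`, the covariate keys) are hypothetical.

```python
import random


def minimization_assign(new_subject, assigned, groups, rng):
    """Pick the group that keeps covariate totals most balanced.

    new_subject: dict mapping covariate name -> level for the incoming subject.
    assigned: list of (covariates_dict, group) pairs for earlier subjects.
    """
    score = {}
    for g in groups:
        total = 0
        for factor, level in new_subject.items():
            # Earlier subjects in group g who share this level with the newcomer.
            total += sum(
                1 for cov, grp in assigned
                if grp == g and cov.get(factor) == level
            )
        score[g] = total
    lowest = min(score.values())
    candidates = [g for g in groups if score[g] == lowest]
    # Random choice only matters when groups are tied (e.g., the first subject).
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(3)
    history = [({"sex": "F", "age": "<60"}, "A")]
    print(minimization_assign({"sex": "F", "age": ">=60"}, history, ["A", "B"], rng))
```

In this sketch the first subject is always a tie and is therefore assigned at random, while later assignments are largely determined by the existing imbalance, which mirrors the concern about predictability noted above.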

Allocation concealment

"Allocation concealment" (defined as "the procedure for protecting the randomization process so that the treatment to be allocated is not known before the patient is entered into the study") is important in RCTs.[45] In practice, clinical investigators in RCTs often find it difficult to maintain impartiality. Stories abound of investigators holding up sealed envelopes to lights or ransacking offices to determine group assignments in order to dictate the assignment of their next patient.[43] Such practices introduce selection bias and confounders (both of which should be minimized by randomization), possibly distorting the results of the study.[43] Adequate allocation concealment should prevent patients and investigators from discovering treatment allocation once a study is underway and after the study has concluded. Treatment-related side effects or adverse events may be specific enough to reveal allocation to investigators or patients, thereby introducing bias or influencing any subjective parameters collected by investigators or requested from subjects.

Some standard methods of ensuring allocation concealment include sequentially numbered, opaque, sealed envelopes (SNOSE); sequentially numbered containers; pharmacy controlled randomization; and central randomization. It is recommended that allocation concealment methods be included in an RCT's protocol, and that the allocation concealment methods should be reported in detail in a publication of an RCT's results; however, a 2005 study determined that most RCTs have unclear allocation concealment in their protocols, in their publications, or both. On the other hand, a 2008 study of 146 meta-analyses concluded that the results of RCTs with inadequate or unclear allocation concealment tended to be biased toward beneficial effects only if the RCTs' outcomes were subjective as opposed to objective.

Sample size

The number of treatment units (subjects or groups of subjects) assigned to the control and treatment groups affects an RCT's reliability. If the effect of the treatment is small, the number of treatment units in either group may be insufficient for rejecting the null hypothesis in the respective statistical test. Failure to reject the null hypothesis would imply that the treatment shows no statistically significant effect on the treated in a given test. But as the sample size increases, the same RCT may be able to demonstrate a significant effect of the treatment, even if this effect is small.
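
The dependence of the required sample size on effect size can be made concrete with the standard normal-approximation formula for comparing two means, n = 2((z_{1-α/2} + z_{1-β})σ/δ)² per group. The sketch below assumes a two-sided test, equal group sizes, and a common known standard deviation; the function and parameter names are illustrative and do not come from the article.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference delta.

    Normal approximation, two-sided test, equal allocation, common
    standard deviation sigma (all assumptions of this sketch).
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    for effect in (0.5, 0.2):
        print(effect, n_per_group(delta=effect, sigma=1.0))
```

Under these assumptions, detecting a difference of half a standard deviation needs about 63 subjects per group, while a difference of one fifth of a standard deviation needs about 393, so smaller effects demand much larger trials.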

Blinding

An RCT may be blinded (also called "masked") by "procedures that prevent study participants, caregivers, or outcome assessors from knowing which intervention was received."[46] Unlike allocation concealment, blinding is sometimes inappropriate or impossible to perform in an RCT; for example, if an RCT involves a treatment in which active participation of the patient is necessary (e.g., physical therapy), participants cannot be blinded to the intervention.

Traditionally, blinded RCTs have been classified as "single-blind", "double-blind", or "triple-blind"; however, in 2001 and 2006 two studies showed that these terms have different meanings for different people.[47][48] The 2010 CONSORT Statement specifies that authors and editors should not use the terms "single-blind", "double-blind", and "triple-blind"; instead, reports of blinded RCT should discuss "If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how."[4]

RCTs without blinding are referred to as "unblinded",[49] "open",[50] or (if the intervention is a medication) "open-label".[51] In 2008 a study concluded that the results of unblinded RCTs tended to be biased toward beneficial effects only if the RCTs' outcomes were subjective as opposed to objective;[46] for example, in an RCT of treatments for multiple sclerosis, unblinded neurologists (but not the blinded neurologists) felt that the treatments were beneficial.[52] In pragmatic RCTs, although the participants and providers are often unblinded, it is "still desirable and often possible to blind the assessor or obtain an objective source of data for evaluation of outcomes."[34]

Analysis of data

The types of statistical methods used in RCTs depend on the characteristics of the data and include:

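  • For dichotomous (binary) outcome data, logistic regression (e.g., to predict sustained virological response after receipt of peginterferon alfa-2a for hepatitis C) and other methods can be used.
  • For continuous outcome data, analysis of covariance (e.g., for changes in blood lipid levels after receipt of atorvastatin following acute coronary syndrome) tests the effects of predictor variables.
  • For time-to-event outcome data that may be censored, survival analysis (e.g., Kaplan-Meier estimators and Cox proportional hazards models for time to coronary heart disease after receipt of hormone replacement therapy in menopause) is appropriate.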
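
As an illustration of how censored time-to-event data are handled, the following is a minimal pure-Python sketch of the Kaplan-Meier estimator. It assumes right censoring only and follows the common convention that subjects censored at an event time are still counted as at risk at that time; the function name and data are hypothetical, and a real analysis would normally rely on a statistics package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times: follow-up time for each subject.
    events: True if the event was observed, False if the subject was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    print(kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]))
```

Censored subjects contribute to the risk set up to their last observed time but never cause a drop in the curve, which is how the method uses partial follow-up without treating it as an event.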

Regardless of the statistical methods used, important considerations in the analysis of RCT data include:

  • Whether an RCT should be stopped early due to interim results. For example, RCTs may be stopped early if an intervention produces "larger than expected benefit or harm", or if "investigators find evidence of no important difference between experimental and control interventions."[4]
  • The extent to which the groups can be analyzed exactly as they existed upon randomization (i.e., whether a so-called "intention-to-treat analysis" is used). A "pure" intention-to-treat analysis is "possible only when complete outcome data are available" for all randomized subjects;[56] when some outcome data are missing, options include analyzing only cases with known outcomes and using imputed data.[4] Nevertheless, the more that analyses can include all participants in the groups to which they were randomized, the less bias that an RCT will be subject to.[4]
  • Whether subgroup analysis should be performed. These are "often discouraged" because multiple comparisons may produce false positive findings that cannot be confirmed by other studies.
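
The multiple-comparisons concern behind the caution on subgroup analysis can be quantified. The sketch below assumes that the subgroup tests are independent and that each is run at the same significance level, which is a simplification; the function names are illustrative. The Bonferroni correction shown is one standard, conservative adjustment.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, round(familywise_error_rate(k), 3), bonferroni_alpha(k))
```

With ten independent subgroup tests at the 0.05 level, the chance of at least one spurious "significant" finding is about 0.40 even when the treatment has no effect in any subgroup.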

Reporting of results

The CONSORT 2010 Statement is "an evidence-based, minimum set of recommendations for reporting RCTs." The CONSORT 2010 checklist contains 25 items (many with sub-items) focusing on "individually randomised, two group, parallel trials", which are the most common type of RCT. For other RCT study designs, "CONSORT extensions" have been published; some examples are:

  • CONSORT 2010 Statement: Extension to Cluster Randomised Trials
  • CONSORT 2010 Statement: Non-Pharmacologic Treatment Interventions

Relative importance and observational studies

Two studies published in The New England Journal of Medicine in 2000 found that observational studies and RCTs overall produced similar results. The authors of the 2000 findings questioned the belief that "observational studies should not be used for defining evidence-based medical care" and that RCTs' results are "evidence of the highest grade." However, a 2001 study published in Journal of the American Medical Association concluded that "discrepancies beyond chance do occur and differences in estimated magnitude of treatment effect are very common" between observational studies and RCTs.

Two other lines of reasoning question RCTs' contribution to scientific knowledge beyond other types of studies:

  • If study designs are ranked by their potential for new discoveries, then anecdotal evidence would be at the top of the list, followed by observational studies, followed by RCTs.
  • RCTs may be unnecessary for treatments that have dramatic and rapid effects relative to the expected stable or progressively worse natural course of the condition treated. One example is combination chemotherapy including cisplatin for metastatic testicular cancer, which increased the cure rate from 5% to 60% in a 1977 non-randomized study.

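Interpretation of statistical results

Like all statistical methods, RCTs are subject to both type I ("false positive") and type II ("false negative") statistical errors. Regarding type I errors, a typical RCT will use 0.05 (i.e., 1 in 20) as the probability that the RCT will falsely find two equally effective treatments significantly different. Regarding type II errors, despite the publication of a 1978 paper noting that the sample sizes of many "negative" RCTs were too small to support definitive conclusions about the negative results, by 2005-2006 a sizeable proportion of RCTs still had inaccurate or incompletely reported sample size calculations.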

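Peer review

Peer review of results is an important part of the scientific method. Reviewers examine the study results for potential design problems that could lead to unreliable results (for example, by creating a systematic bias), evaluate the study in the context of related studies and other evidence, and assess whether the study can reasonably be considered to have proven its conclusions. To underscore the need for peer review and the danger of overgeneralizing conclusions, two Boston-area medical researchers performed a randomized controlled trial in which they randomly assigned either a parachute or an empty backpack to 23 volunteers who jumped from either a biplane or a helicopter. The study was able to report accurately that parachutes fail to reduce injury compared with empty backpacks. The key context that limited the general applicability of this conclusion was that the aircraft were parked on the ground, and participants had jumped only about two feet.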

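Advantages

RCTs are considered to be the most reliable form of scientific evidence in the hierarchy of evidence that influences healthcare policy and practice because RCTs reduce spurious causality and bias. Results of RCTs may be combined in systematic reviews, which are increasingly being used in the conduct of evidence-based practice. Examples of scientific organizations that consider RCTs or systematic reviews of RCTs to be the highest-quality evidence available include:

  • As of 1998, the National Health and Medical Research Council of Australia designated "Level I" evidence as that "obtained from a systematic review of all relevant randomised controlled trials" and "Level II" evidence as that "obtained from at least one properly designed randomised controlled trial."
  • Since at least 2001, in making clinical practice guideline recommendations, the United States Preventive Services Task Force has considered both a study's design and its internal validity as indicators of its quality. It has recognized "evidence obtained from at least one properly randomized controlled trial" with good internal validity (i.e., a rating of "good") as the highest-quality evidence available to it.
  • The GRADE Working Group concluded in 2008 that "randomised trials without important limitations constitute high quality evidence."
  • For issues involving "Therapy/Prevention, Aetiology/Harm", the Oxford Centre for Evidence-based Medicine as of 2011 defined "Level 1a" evidence as a systematic review of RCTs that are consistent with each other, and "Level 1b" evidence as an "individual RCT (with narrow Confidence Interval)."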


    Notable RCTs with unexpected results that contributed to changes in clinical practice include:

    • After Food and Drug Administration approval, the antiarrhythmic agents flecainide and encainide came to market in 1986 and 1987 respectively.[61] The non-randomized studies concerning the drugs were characterized as "glowing",[62] and their sales increased to a combined total of approximately 165,000 prescriptions per month in early 1989.[61] In that year, however, a preliminary report of an RCT concluded that the two drugs increased mortality.[63] Sales of the drugs then decreased.[61]
    • Prior to 2002, based on observational studies, it was routine for physicians to prescribe hormone replacement therapy for post-menopausal women to prevent myocardial infarction.[62] In 2002 and 2004, however, published RCTs from the Women's Health Initiative claimed that women taking hormone replacement therapy with estrogen plus progestin had a higher rate of myocardial infarctions than women on a placebo, and that estrogen-only hormone replacement therapy caused no reduction in the incidence of coronary heart disease.[55][64] Possible explanations for the discrepancy between the observational studies and the RCTs involved differences in methodology, in the hormone regimens used, and in the populations studied.[65][66] The use of hormone replacement therapy decreased after publication of the RCTs.[67]

    Disadvantages

    Many papers discuss the disadvantages of RCTs.[68][69][70] Among the most frequently cited drawbacks are:

    Time and costs

    RCTs can be expensive; one study found 28 Phase III RCTs funded by the National Institute of Neurological Disorders and Stroke prior to 2000 with a total cost of US$335 million, for a mean cost of US$12 million per RCT. Nevertheless, the return on investment of RCTs may be high, in that the same study projected that the 28 RCTs produced a "net benefit to society at 10-years" of 46 times the cost of the trials program, based on evaluating a quality-adjusted life year as equal to the prevailing mean per capita gross domestic product.

    An RCT typically takes several years from conduct to publication; the data are therefore unavailable to the medical community for a long time and may be less relevant by the time they are published.

    It is costly to maintain RCTs for the years or decades that would be ideal for evaluating some interventions.

    Interventions to prevent events that occur only infrequently (e.g., sudden infant death syndrome) and uncommon adverse outcomes (e.g., a rare side effect of a drug) would require RCTs with extremely large sample sizes and may, therefore, best be assessed by observational studies.
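
    The sample-size problem for rare events can be made concrete with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions with equal group sizes; the event rates, significance level, and power are illustrative assumptions, not figures from any trial discussed here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect p_control vs p_treated.

    Normal-approximation formula for a two-sided test of two proportions
    with equal allocation. All inputs are hypothetical planning values.
    """
    if p_control == p_treated:
        raise ValueError("the two event rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_treated) / 2            # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Assumed scenario 1: a rare event (1 in 1,000) whose rate the intervention halves.
rare = sample_size_per_group(0.001, 0.0005)
# Assumed scenario 2: a common event (30%) whose rate the intervention halves.
common = sample_size_per_group(0.30, 0.15)
print(f"rare event:   {rare} per group")
print(f"common event: {common} per group")
```

    Under these assumed inputs, halving an event rate of 1 in 1,000 calls for tens of thousands of participants per arm, while halving a 30% event rate needs only on the order of a hundred.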

    Because RCTs are costly to run, they usually examine only one or a few variables and rarely reflect the full picture of a complicated medical situation; a case report, by contrast, can detail many aspects of the patient's medical situation (e.g. patient history, physical examination, diagnosis, psychosocial aspects, follow-up).

    Conflict of interest dangers

    A 2011 study examining the disclosure of conflicts of interest in the research underlying medical meta-analyses reviewed 29 meta-analyses and found that conflicts of interest in the underlying studies were rarely disclosed. The 29 meta-analyses included 11 from general medicine journals, 15 from specialty medicine journals, and 3 from the Cochrane Database of Systematic Reviews. Together they reviewed 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) industry funded. Author conflict of interest disclosures were reported in 132 of the 509 RCTs, with 91 studies (69%) disclosing industry financial ties with one or more authors. This information was, however, seldom reflected in the meta-analyses: only two (7%) reported RCT funding sources and none reported RCT author-industry ties. The authors concluded "without acknowledgment of COI due to industry funding or author industry financial ties from RCTs included in meta-analyses, readers' understanding and appraisal of the evidence from the meta-analysis may be compromised."
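
    The percentages quoted above follow directly from the reported counts. A quick check, using only the numbers given in the paragraph (the helper name `pct` is an illustrative choice):

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# RCTs that reported a funding source and were industry funded.
industry_funded = pct(219, 318)
# RCTs with author COI disclosures that showed industry financial ties.
author_ties = pct(91, 132)
# Meta-analyses (out of 29) that reported the funding sources of their RCTs.
meta_reporting = pct(2, 29)

print(industry_funded, author_ties, meta_reporting)  # prints: 69 69 7
```

    Note that the two 69% figures use different denominators (318 and 132), not the full set of 509 RCTs.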

    Some RCTs are fully or partly funded by the health care industry (e.g., the pharmaceutical industry) as opposed to government, nonprofit, or other sources. A systematic review published in 2003 found four 1986–2002 articles comparing industry-sponsored and nonindustry-sponsored RCTs, and in all the articles there was a correlation of industry sponsorship and positive study outcome. A 2004 study of 1999–2001 RCTs published in leading medical and surgical journals determined that industry-funded RCTs "are more likely to be associated with statistically significant pro-industry findings." These results have been mirrored in trials in surgery, where industry funding did not affect the rate of trial discontinuation but was associated with lower odds of publication for completed trials. One possible reason for the pro-industry results in industry-funded published RCTs is publication bias. Other authors have cited the differing goals of academic and industry-sponsored research as contributing to the difference. Commercial sponsors may be more focused on performing trials of drugs that have already shown promise in early-stage trials, and on replicating previous positive results to fulfill regulatory requirements for drug approval.

    Ethics

    If a disruptive innovation in medical technology is developed, it may be difficult to test it ethically in an RCT if it becomes "obvious" that the control subjects have poorer outcomes, either from earlier testing or within the initial phase of the RCT itself. Ethically it may be necessary to abort the RCT prematurely, and getting ethics approval (and patient agreement) to withhold the innovation from the control group in future RCTs may not be feasible.

    Historical control trials (HCT) exploit the data of previous RCTs to reduce the sample size; however, these approaches are controversial in the scientific community and must be handled with care.

    In social science

    Because RCTs have emerged in social science only recently, their use there is contested. Some writers from a medical or health background have argued that existing research in a range of social science disciplines lacks rigour and should be improved by greater use of randomized controlled trials.

    Transport science

    Researchers in transport science argue that public spending on programmes such as school travel plans could not be justified unless their efficacy is demonstrated by randomized controlled trials. Graham-Rowe and colleagues reviewed 77 evaluations of transport interventions found in the literature, categorising them into 5 "quality levels". They concluded that most of the studies were of low quality and advocated the use of randomized controlled trials wherever possible in future transport research.

    Dr. Steve Melia took issue with these conclusions, arguing that claims about the advantages of RCTs, in establishing causality and avoiding bias, have been exaggerated. He proposed the following eight criteria for the use of RCTs in contexts where interventions must change human behaviour to be effective:

    The intervention:

    1. Has not been applied to all members of a unique group of people (e.g. the population of a whole country, all employees of a unique organisation etc.)
    2. Is applied in a context or setting similar to that which applies to the control group
    3. Can be isolated from other activities—and the purpose of the study is to assess this isolated effect
    4. Has a short timescale between its implementation and maturity of its effects

    And the causal mechanisms:

    1. Are either known to the researchers, or else all possible alternatives can be tested
    2. Do not involve significant feedback mechanisms between the intervention group and external environments
    3. Have a stable and predictable relationship to exogenous factors
    4. Would act in the same way if the control group and intervention group were reversed

    Criminology

    A 2005 review found 83 randomized experiments in criminology published in 1982–2004, compared with only 35 published in 1957–1981. The authors classified the studies they found into five categories: "policing", "prevention", "corrections", "court", and "community". Focusing only on offending behavior programs, Hollin (2008) argued that RCTs may be difficult to implement (e.g., if an RCT required "passing sentences that would randomly assign offenders to programmes") and therefore that experiments with quasi-experimental design are still necessary.

    Education

    RCTs have been used to evaluate a number of educational interventions. Between 1980 and 2016, over 1,000 reports of RCTs were published. For example, a 2009 study randomized 260 elementary school teachers' classrooms to receive or not receive a program of behavioral screening, classroom intervention, and parent training, and then measured the behavioral and academic performance of their students. Another 2009 study randomized classrooms for 678 first-grade children to receive a classroom-centered intervention, a parent-centered intervention, or no intervention, and then followed their academic outcomes through age 19.
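
    Randomizing whole classrooms rather than individual pupils is a form of cluster randomization. A minimal sketch of such an allocation is shown below; the classroom identifiers, arm labels, and seed are hypothetical, and this is not the procedure reported by either 2009 study.

```python
import random
from collections import Counter


def randomize_clusters(cluster_ids, arms, seed):
    """Assign each cluster (e.g. a classroom) to one arm, balancing arm sizes.

    The clusters are shuffled with a seeded generator and then dealt out to
    the arms in turn, so arm sizes differ by at most one cluster.
    """
    rng = random.Random(seed)      # seeded so the allocation is reproducible
    shuffled = list(cluster_ids)   # copy, leaving the caller's list untouched
    rng.shuffle(shuffled)
    return {cid: arms[i % len(arms)] for i, cid in enumerate(shuffled)}


# Hypothetical example: 260 classrooms split between two arms.
classrooms = [f"classroom-{i:03d}" for i in range(260)]
allocation = randomize_clusters(classrooms, ["program", "no program"], seed=2009)
print(Counter(allocation.values()))
```

    Every pupil in a classroom receives that classroom's arm, so the analysis has to account for the correlation among pupils within the same cluster.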

    Criticism

    A 2018 review of the 10 most cited randomised controlled trials noted poor distribution of background traits, difficulties with blinding, and discussed other assumptions and biases inherent in randomised controlled trials. These include the "unique time period assessment bias", the "background traits remain constant assumption", the "average treatment effects limitation", the "simple treatment at the individual level limitation", the "all preconditions are fully met assumption", the "quantitative variable limitation" and the "placebo only or conventional treatment only limitation".

    1. Schulz KF, Altman DG, Moher D; for the CONSORT Group (2010). "CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials". Br Med J. 340: c332. doi:10.1136/bmj.c332. PMC 2844940. PMID 20332509.
    2. Chalmers TC, Smith H Jr, Blackburn B, Silverman B, Schroeder B, Reitman D, Ambroz A (1981). "A method for assessing the quality of a randomized control trial". Controlled Clinical Trials. 2 (1): 31–49. doi:10.1016/0197-2456(81)90056-8. PMID 7261638.
    3. "Randomised controlled trial". National Institute for Health and Care Excellence, London, UK. 2019. Retrieved 3 June 2019.
    4. Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG (2010). "CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials". Br Med J. 340: c869. doi:10.1136/bmj.c869. PMC 2844943. PMID 20332511.
    5. Hannan EL (June 2008). "Randomized clinical trials and observational studies: guidelines for assessing respective strengths and limitations". JACC. Cardiovascular Interventions. 1 (3): 211–7. doi:10.1016/j.jcin.2008.01.008. PMID 19463302.
    6. Ranjith G (2005). "Interferon-α-induced depression: when a randomized trial is not a randomized controlled trial". Psychother Psychosom. 74 (6): 387, author reply 387–8. doi:10.1159/000087787. PMID 16244516. S2CID 143644933.
    7. Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, Mantel N, McPherson K, Peto J, Smith PG (1976). "Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I. Introduction and design". Br J Cancer. 34 (6): 585–612. doi:10.1038/bjc.1976.220. PMC 2025229. PMID 795448.
    8. Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, Mantel N, McPherson K, Peto J, Smith PG (1977). "Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples". Br J Cancer. 35 (1): 1–39. doi:10.1038/bjc.1977.1. PMC 2025310. PMID 831755.
    9. Wollert KC, Meyer GP, Lotz J, Ringes-Lichtenberg S, Lippolt P, Breidenbach C, Fichtner S, Korte T, Hornig B, Messinger D, Arseniev L, Hertenstein B, Ganser A, Drexler H (2004). "Intracoronary autologous bone-marrow cell transfer after myocardial infarction: the BOOST randomised controlled clinical trial". Lancet. 364 (9429): 141–8. doi:10.1016/S0140-6736(04)16626-9. PMID 15246726. S2CID 24361586.
    10. Charles Sanders Peirce and Joseph Jastrow (1885). "On Small Differences in Sensation". Memoirs of the National Academy of Sciences. 3: 73–83. http://psychclassics.yorku.ca/Peirce/small-diffs.htm
    11. Hacking, Ian (September 1988). "Telepathy: Origins of Randomization in Experimental Design". Isis. A Special Issue on Artifact and Experiment. 79 (3): 427–451. doi:10.1086/354775. JSTOR 234674. MR 1013489. S2CID 52201011.
    12. Stephen M. Stigler (November 1992). "A Historical View of Statistical Concepts in Psychology and Educational Research". American Journal of Education. 101 (1): 60–70. doi:10.1086/444032. S2CID 143685203.
    13. Trudy Dehue (December 1997). "Deception, Efficiency, and Random Groups: Psychology and the Gradual Origination of the Random Group Design" (PDF). Isis. 88 (4): 653–673. doi:10.1086/383850. PMID 9519574. S2CID 23526321.
    14. Neyman, Jerzy. 1923 [1990]. "On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9." Statistical Science 5 (4): 465–472. Trans. Dorota M. Dabrowska and Terence P. Speed.
    15. Citation error: invalid <ref> tag; no text was provided for the reference named "Conniffe".
    16. Streptomycin in Tuberculosis Trials Committee (1948). "Streptomycin treatment of pulmonary tuberculosis. A Medical Research Council investigation". Br Med J. 2 (4582): 769–82. doi:10.1136/bmj.2.4582.769. PMC 2091872. PMID 18890300.
    17. Brown D (1998-11-02). "Landmark study made research resistant to bias". Washington Post.
    18. Shikata S, Nakayama T, Noguchi Y, Taji Y, Yamagishi H (2006). "Comparison of effects in randomized controlled trials with observational studies in digestive surgery". Ann Surg. 244 (5): 668–76. doi:10.1097/01.sla.0000225356.04304.bc. PMC 1856609. PMID 17060757.
    19. Stolberg HO, Norman G, Trop I (2004). "Randomized controlled trials". Am J Roentgenol. 183 (6): 1539–44. doi:10.2214/ajr.183.6.01831539. PMID 15547188.
    20. Georgina Ferry (2 November 2020). "Peter Sleight Obituary". The Guardian. Retrieved 3 November 2020.
    21. Appelbaum PS, Roth LH, Lidz C (1982). "The therapeutic misconception: informed consent in psychiatric research". Int J Law Psychiatry. 5 (3–4): 319–29. doi:10.1016/0160-2527(82)90026-7. PMID 6135666.
    22. Henderson GE, Churchill LR, Davis AM, Easter MM, Grady C, Joffe S, Kass N, King NM, Lidz CW, Miller FG, Nelson DK, Peppercorn J, Rothschild BB, Sankar P, Wilfond BS, Zimmer CR (2007). "Clinical trials and medical care: defining the therapeutic misconception". PLoS Med. 4 (11): e324. doi:10.1371/journal.pmed.0040324. PMC 2082641. PMID 18044980.
    23. De Angelis C, Drazen JM, Frizelle FA, et al. (September 2004). "Clinical trial registration: a statement from the International Committee of Medical Journal Editors". The New England Journal of Medicine. 351 (12): 1250–1. doi:10.1056/NEJMe048225. PMID 15356289.
    24. Law MR, Kawasumi Y, Morgan SG (2011). "Despite law, fewer than one in eight completed studies of drugs and biologics are reported on time on ClinicalTrials.gov". Health Aff (Millwood). 30 (12): 2338–45. doi:10.1377/hlthaff.2011.0172. PMID 22147862.
    25. Mathieu S, Boutron I, Moher D, Altman DG, Ravaud P (2009). "Comparison of registered and published primary outcomes in randomized controlled trials". JAMA. 302 (9): 977–84. doi:10.1001/jama.2009.1242. PMID 19724045.
    26. Bhaumik, S (Mar 2013). "Editorial policies of MEDLINE indexed Indian journals on clinical trial registration". Indian Pediatr. 50 (3): 339–40. doi:10.1007/s13312-013-0092-2. PMID 23680610. S2CID 40317464.
    27. Hopewell S, Dutton S, Yu LM, Chan AW, Altman DG (2010). "The quality of reports of randomised trials in 2000 and 2006: comparative study of articles indexed in PubMed". BMJ. 340: c723. doi:10.1136/bmj.c723. PMC 2844941. PMID 20332510.
    28. Kaiser, Joerg; Niesen, Willem; Probst, Pascal; Bruckner, Thomas; Doerr-Harim, Colette; Strobel, Oliver; Knebel, Phillip; Diener, Markus K.; Mihaljevic, André L.; Büchler, Markus W.; Hackert, Thilo (7 June 2019). "Abdominal drainage versus no drainage after distal pancreatectomy: study protocol for a randomized controlled trial". Trials. 20 (1): 332. doi:10.1186/s13063-019-3442-0. PMC 6555976. PMID 31174583.
    29. Farag, Sara M.; Mohammed, Manal O.; EL-Sobky, Tamer A.; ElKadery, Nadia A.; ElZohiery, Abeer K. (March 2020). "Botulinum Toxin A Injection in Treatment of Upper Limb Spasticity in Children with Cerebral Palsy: A Systematic Review of Randomized Controlled Trials". JBJS Reviews. 8 (3): e0119. doi:10.2106/JBJS.RVW.19.00119. PMC 7161716. PMID 32224633.
    30. Jones, Byron; Kenward, Michael G. (2003). Design and Analysis of Cross-Over Trials (Second ed.). London: Chapman and Hall. 
    31. Vonesh, Edward F.; Chinchilli, Vernon G. (1997). "Crossover Experiments". Linear and Nonlinear Models for the Analysis of Repeated Measurements. London: Chapman and Hall. pp. 111–202. 
    32. Gall, Stefanie; Adams, Larissa; Joubert, Nandi; Ludyga, Sebastian; Müller, Ivan; Nqweniso, Siphesihle; Pühse, Uwe; du Randt, Rosa; Seelig, Harald; Smith, Danielle; Steinmann, Peter; Utzinger, Jürg; Walter, Cheryl; Gerber, Markus; van Wouwe, Jacobus P. (8 November 2018). "Effect of a 20-week physical activity intervention on selective attention and academic performance in children living in disadvantaged neighborhoods: A cluster randomized control trial". PLOS ONE. 13 (11): e0206908. Bibcode:2018PLoSO..1306908G. doi:10.1371/journal.pone.0206908. PMC 6224098. PMID 30408073.
    33. Gladstone, Melissa J.; Chandna, Jaya; Kandawasvika, Gwendoline; Ntozini, Robert; Majo, Florence D.; Tavengwa, Naume V.; Mbuya, Mduduzi N. N.; Mangwadu, Goldberg T.; Chigumira, Ancikaria; Chasokela, Cynthia M.; Moulton, Lawrence H.; Stoltzfus, Rebecca J.; Humphrey, Jean H.; Prendergast, Andrew J.; Tumwine, James K. (21 March 2019). "Independent and combined effects of improved water, sanitation, and hygiene (WASH) and improved complementary feeding on early neurodevelopment among children born to HIV-negative mothers in rural Zimbabwe: Substudy of a cluster-randomized trial". PLOS Medicine. 16 (3): e1002766. doi:10.1371/journal.pmed.1002766. PMC 6428259. PMID 30897095.
    34. Zwarenstein M, Treweek S, Gagnier JJ, Altman DG, Tunis S, Haynes B, Oxman AD, Moher D; CONSORT group; Pragmatic Trials in Healthcare (Practihc) group (2008). "Improving the reporting of pragmatic trials: an extension of the CONSORT statement". BMJ. 337: a2390. doi:10.1136/bmj.a2390. PMC 3266844. PMID 19001484.
    35. Piaggio G, Elbourne DR, Altman DG, Pocock SJ, Evans SJ; CONSORT Group (2006). "Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement" (PDF). JAMA. 295 (10): 1152–60. doi:10.1001/jama.295.10.1152. PMID 16522836.
    36. Schulz KF, Grimes DA (2002). "Generation of allocation sequences in randomised trials: chance, not choice" (PDF). Lancet. 359 (9305): 515–9. doi:10.1016/S0140-6736(02)07683-3. PMID 11853818. S2CID 291300.
    37. Howick J, Mebius A (2014). "In search of justification for the unpredictability paradox". Trials. 15: 480. doi:10.1186/1745-6215-15-480. PMC 4295227. PMID 25490908.
    38. Lachin JM (1988). "Statistical properties of randomization in clinical trials". Controlled Clinical Trials. 9 (4): 289–311. doi:10.1016/0197-2456(88)90045-1. PMID 3060315.
    39. Rosenberger, James. "STAT 503 - Design of Experiments". Pennsylvania State University. Retrieved 24 September 2012.
    40. Avins, A L (1998). "Can unequal be more fair? Ethics, subject allocation, and randomized clinical trials". J Med Ethics. 24 (6): 401–408. doi:10.1136/jme.24.6.401. PMC 479141. PMID 9873981.
    41. Buyse ME (1989). "Analysis of clinical trial outcomes: some comments on subgroup analyses". Controlled Clinical Trials. 10 (4 Suppl): 187S–194S. doi:10.1016/0197-2456(89)90057-3. PMID 2605967.
    42. Lachin JM, Matts JP, Wei LJ (1988). "Randomization in clinical trials: conclusions and recommendations" (PDF). Controlled Clinical Trials. 9 (4): 365–74. doi:10.1016/0197-2456(88)90049-9. hdl:2027.42/27041. PMID 3203526.
    43. Schulz KF, Grimes DA (2002). "Allocation concealment in randomised trials: defending against deciphering" (PDF). Lancet. 359 (9306): 614–8. doi:10.1016/S0140-6736(02)07750-4. PMID 11867132. S2CID 12902486.
    44. Rosenberger WF, Lachin JM (1993). "The use of response-adaptive designs in clinical trials". Controlled Clinical Trials. 14 (6): 471–84. doi:10.1016/0197-2456(93)90028-C. PMID 8119063.
    45. Forder PM, Gebski VJ, Keech AC (2005). "Allocation concealment and blinding: when ignorance is bliss". Med J Aust. 182 (2): 87–9. doi:10.5694/j.1326-5377.2005.tb06584.x. PMID 15651970. S2CID 202149.
    46. Wood L, Egger M, Gluud LL, Schulz KF, Jüni P, Altman DG, Gluud C, Martin RM, Wood AJ, Sterne JA (2008). "Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study". BMJ. 336 (7644): 601–5. doi:10.1136/bmj.39465.451748.AD. PMC 2267990. PMID 18316340.
    47. Devereaux PJ, Manns BJ, Ghali WA, Quan H, Lacchetti C, Montori VM, Bhandari M, Guyatt GH (2001). "Physician interpretations and textbook definitions of blinding terminology in randomized controlled trials". J Am Med Assoc. 285 (15): 2000–3. doi:10.1001/jama.285.15.2000. PMID 11308438.
    48. Haahr MT, Hróbjartsson A (2006). "Who is blinded in randomized clinical trials? A study of 200 trials and a survey of authors". Clin Trials. 3 (4): 360–5. doi:10.1177/1740774506069153. PMID 17060210. S2CID 23818514.
    49. Marson AG, Al-Kharusi AM, Alwaidh M, Appleton R, Baker GA, Chadwick DW, et al. (2007). "The SANAD study of effectiveness of valproate, lamotrigine, or topiramate for generalised and unclassifiable epilepsy: an unblinded randomised controlled trial". Lancet. 369 (9566): 1016–26. doi:10.1016/S0140-6736(07)60461-9. PMC 2039891. PMID 17382828.
    50. Chan R, Hemeryck L, O'Regan M, Clancy L, Feely J (1995). "Oral versus intravenous antibiotics for community acquired lower respiratory tract infection in a general hospital: open, randomised controlled trial". BMJ. 310 (6991): 1360–2. doi:10.1136/bmj.310.6991.1360. PMC 2549744. PMID 7787537.
    51. Fukase K, Kato M, Kikuchi S, Inoue K, Uemura N, Okamoto S, Terao S, Amagai K, Hayashi S, Asaka M; Japan Gast Study Group (2008). "Effect of eradication of Helicobacter pylori on incidence of metachronous gastric carcinoma after endoscopic resection of early gastric cancer: an open-label, randomised controlled trial" (PDF). Lancet. 372 (9636): 392–7. doi:10.1016/S0140-6736(08)61159-9. hdl:2115/34681. PMID 18675689. S2CID 13741892.
    52. Noseworthy JH, Ebers GC, Vandervoort MK, Farquhar RE, Yetisir E, Roberts R (1994). "The impact of blinding on the results of a randomized, placebo-controlled multiple sclerosis clinical trial". Neurology. 44 (1): 16–20. doi:10.1212/wnl.44.1.16. PMID 8290055. S2CID 2663997.
    53. Manns MP, McHutchison JG, Gordon SC, Rustgi VK, Shiffman M, Reindollar R, Goodman ZD, Koury K, Ling M, Albrecht JK (2001). "Peginterferon alfa-2b plus ribavirin compared with interferon alfa-2b plus ribavirin for initial treatment of chronic hepatitis C: a randomised trial". Lancet. 358 (9286): 958–65. doi:10.1016/S0140-6736(01)06102-5. PMID 11583749. S2CID 14583372.
    54. Schwartz GG, Olsson AG, Ezekowitz MD, Ganz P, Oliver MF, Waters D, Zeiher A, Chaitman BR, Leslie S, Stern T; Myocardial Ischemia Reduction with Aggressive Cholesterol Lowering (MIRACL) Study Investigators (2001). "Effects of atorvastatin on early recurrent ischemic events in acute coronary syndromes: the MIRACL study: a randomized controlled trial". J Am Med Assoc. 285 (13): 1711–8. doi:10.1001/jama.285.13.1711. PMID 11277825.
    55. Rossouw JE, Anderson GL, Prentice RL, LaCroix AZ, Kooperberg C, Stefanick ML, Jackson RD, Beresford SA, Howard BV, Johnson KC, Kotchen JM, Ockene J; Writing Group for the Women's Health Initiative Investigators (2002). "Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial" (PDF). J Am Med Assoc. 288 (3): 321–33. doi:10.1001/jama.288.3.321. PMID 12117397. S2CID 20149703.
    56. Hollis S, Campbell F (1999). "What is meant by intention to treat analysis? Survey of published randomised controlled trials". Br Med J. 319 (7211): 670–4. doi:10.1136/bmj.319.7211.670. PMC 28218. PMID 10480822.
    57. National Health and Medical Research Council (1998-11-16). A guide to the development, implementation and evaluation of clinical practice guidelines. Canberra: Commonwealth of Australia. p. 56. ISBN 978-1-86496-048-8. http://www.nhmrc.gov.au/_files_nhmrc/file/publications/synopses/cp30.pdf. 
    58. Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, Atkins D; Methods Work Group, Third US Preventive Services Task Force (2001). "Current methods of the US Preventive Services Task Force: a review of the process" (PDF). Am J Prev Med. 20 (3 Suppl): 21–35. doi:10.1016/S0749-3797(01)00261-6. PMID 11306229.
    59. Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schünemann HJ; GRADE Working Group (2008). "What is "quality of evidence" and why is it important to clinicians?". BMJ. 336 (7651): 995–8. doi:10.1136/bmj.39490.551019.BE. PMC 2364804. PMID 18456631.
    60. Oxford Centre for Evidence-based Medicine (2011-09-16). "Levels of evidence". Retrieved 2012-02-15.
    61. Anderson JL, Pratt CM, Waldo AL, Karagounis LA (1997). "Impact of the Food and Drug Administration approval of flecainide and encainide on coronary artery disease mortality: putting "Deadly Medicine" to the test". Am J Cardiol. 79 (1): 43–7. doi:10.1016/S0002-9149(96)00673-X. PMID 9024734.
    62. Rubin R (2006-10-16). "In medicine, evidence can be confusing - deluged with studies, doctors try to sort out what works, what doesn't". USA Today. Retrieved 2010-03-22.
    63. Cardiac Arrhythmia Suppression Trial (CAST) Investigators (1989). "Preliminary report: effect of encainide and flecainide on mortality in a randomized trial of arrhythmia suppression after myocardial infarction. The Cardiac Arrhythmia Suppression Trial (CAST) Investigators". N Engl J Med. 321 (6): 406–12. doi:10.1056/NEJM198908103210629. PMID 2473403.
    64. Anderson GL, Limacher M, Assaf AR, Bassford T, Beresford SA, Black H, et al. (2004). "Effects of conjugated equine estrogen in postmenopausal women with hysterectomy: the Women's Health Initiative randomized controlled trial". JAMA. 291 (14): 1701–12. doi:10.1001/jama.291.14.1701. PMID 15082697.
    65. Grodstein F, Clarkson TB, Manson JE (2003). "Understanding the divergent data on postmenopausal hormone therapy". N Engl J Med. 348 (7): 645–50. doi:10.1056/NEJMsb022365. PMID 12584376.
    66. Vandenbroucke JP (2009). "The HRT controversy: observational studies and RCTs fall in line". Lancet. 373 (9671): 1233–5. doi:10.1016/S0140-6736(09)60708-X. PMID 19362661. S2CID 44991220.
    67. Hsu A, Card A, Lin SX, Mota S, Carrasquillo O, Moran A (2009). "Changes in postmenopausal hormone replacement therapy use among women with high cardiovascular risk". Am J Public Health. 99 (12): 2184–7. doi:10.2105/AJPH.2009.159889. PMC 2775780. PMID 19833984.
    68. Black N (1996). "Why we need observational studies to evaluate the effectiveness of health care". BMJ. 312 (7040): 1215–8. doi:10.1136/bmj.312.7040.1215. PMC 2350940. PMID 8634569.
    69. Bell, S.H., & Peck, L.R. (2012). "Obstacles to and limitations of social experiments: 15 false alarms". Abt Thought Leadership Paper Series.
    70. Sanson-Fisher RW, Bonevski B, Green LW, D'Este C (2007). "Limitations of the randomized controlled trial in evaluating population-based health interventions". Am J Prev Med. 33 (2): 155–61. doi:10.1016/j.amepre.2007.04.007. PMID 17673104.