Changes

Added 1,097 bytes · 00:25, 29 November 2020
No edit summary
第23行: 第23行:  
{{Probability distribution
 
{{Probability distribution
   −
<font color="#ff8000">Probability distribution</font>
+
<font color="#ff8000">Probability distribution</font>
    
   | name      = Binomial distribution
 
   | name      = Binomial distribution
第58行: 第58行:  
   | parameters = n \in \{0, 1, 2, \ldots\} &ndash; number of trials<br />p \in [0,1] &ndash; success probability for each trial<br />q = 1 - p
 
   | parameters = n \in \{0, 1, 2, \ldots\} &ndash; number of trials<br />p \in [0,1] &ndash; success probability for each trial<br />q = 1 - p
   −
| parameters = <br /><math>n \in \{0, 1, 2, \ldots\}</math> &ndash; number of trials; <br /><math>p \in [0,1]</math> &ndash; success probability for each trial; <br /><math>q = 1 - p</math>
+
| parameters = <br /><math>n \in \{0, 1, 2, \ldots\}</math> &ndash; number of trials; <br /><math>p \in [0,1]</math> &ndash; success probability for each trial; <br /><math>q = 1 - p</math>
    
   | support    = <math>k \in \{0, 1, \ldots, n\}</math> &ndash; number of successes
 
   | support    = <math>k \in \{0, 1, \ldots, n\}</math> &ndash; number of successes
第64行: 第64行:  
   | support    = k \in \{0, 1, \ldots, n\} &ndash; number of successes
 
   | support    = k \in \{0, 1, \ldots, n\} &ndash; number of successes
   −
| support = <br /><math>k \in \{0, 1, \ldots, n\}</math> &ndash; number of successes
+
| support = <br /><math>k \in \{0, 1, \ldots, n\}</math> &ndash; number of successes
    
   | pdf        = <math>\binom{n}{k} p^k q^{n-k}</math>
 
   | pdf        = <math>\binom{n}{k} p^k q^{n-k}</math>
第118行: 第118行:  
   | entropy    = \frac{1}{2} \log_2 (2\pi enpq) + O \left( \frac{1}{n} \right)<br /> in shannons. For nats, use the natural log in the log.
 
   | entropy    = \frac{1}{2} \log_2 (2\pi enpq) + O \left( \frac{1}{n} \right)<br /> in shannons. For nats, use the natural log in the log.
   −
|  <font color="#ff8000">entropy</font> = <math>\frac{1}{2} \log_2 (2\pi enpq) + O \left( \frac{1}{n} \right)</math><br /> in <font color="#ff8000">shannons</font>; for nats, use the natural logarithm in the log.
+
|  <font color="#ff8000">entropy</font> = <math>\frac{1}{2} \log_2 (2\pi enpq) + O \left( \frac{1}{n} \right)</math><br /> in <font color="#ff8000">shannons</font>; for nats, use the natural logarithm in the log.
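The leading term of this entropy formula can be checked against the exact Shannon entropy computed from the pmf. A minimal Python sketch; the values n = 200, p = 0.5 are illustrative, not from the text:

```python
from math import comb, log2, pi, e

# illustrative parameters, chosen large enough that the O(1/n) term is small
n, p = 200, 0.5
q = 1 - p

# exact entropy of B(n, p) in shannons (bits), from the full pmf
pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
H = -sum(pk * log2(pk) for pk in pmf)

# leading term (1/2) log2(2*pi*e*n*p*q); for nats, replace log2 with log
approx = 0.5 * log2(2 * pi * e * n * p * q)
assert abs(H - approx) < 0.02
```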
    
   | mgf        = <math>(q + pe^t)^n</math>
 
   | mgf        = <math>(q + pe^t)^n</math>
第154行: 第154行:  
Binomial distribution for p=0.5<br />with n and k as in [[Pascal's triangle<br /><br />The probability that a ball in a Galton box with 8 layers (n&nbsp;=&nbsp;8) ends up in the central bin (k&nbsp;=&nbsp;4) is 70/256.]]
 
Binomial distribution for p=0.5<br />with n and k as in [[Pascal's triangle<br /><br />The probability that a ball in a Galton box with 8 layers (n&nbsp;=&nbsp;8) ends up in the central bin (k&nbsp;=&nbsp;4) is 70/256.]]
   −
[[File:Pascal's triangle; binomial distribution.svg|Binomial distribution for <math>p=0.5</math><br />with ''n'' and ''k'' as in Pascal's triangle.<br /><br />The probability that a ball in a Galton box with 8 layers (''n''&nbsp;=&nbsp;8) ends up in the central bin (''k''&nbsp;=&nbsp;4) is <math>70/256</math>.]]
+
[[File:Pascal's triangle; binomial distribution.svg|Binomial distribution for <math>p=0.5</math><br />with ''n'' and ''k'' as in Pascal's triangle.<br /><br />The probability that a ball in a Galton box with 8 layers (''n''&nbsp;=&nbsp;8) ends up in the central bin (''k''&nbsp;=&nbsp;4) is <math>70/256</math>.]]
    
In [[probability theory]] and [[statistics]], the '''binomial distribution''' with parameters ''n'' and ''p'' is the [[discrete probability distribution]] of the number of successes in a sequence of ''n'' [[statistical independence|independent]] [[experiment (probability theory)|experiment]]s, each asking a [[yes–no question]], and each with its own [[boolean-valued function|boolean]]-valued [[outcome (probability)|outcome]]: [[wikt:success|success]]/[[yes and no|yes]]/[[truth value|true]]/[[one]] (with [[probability]] ''p'') or [[failure]]/[[yes and no|no]]/[[false (logic)|false]]/[[zero]] (with [[probability]] ''q''&nbsp;=&nbsp;1&nbsp;−&nbsp;''p'').  
 
In [[probability theory]] and [[statistics]], the '''binomial distribution''' with parameters ''n'' and ''p'' is the [[discrete probability distribution]] of the number of successes in a sequence of ''n'' [[statistical independence|independent]] [[experiment (probability theory)|experiment]]s, each asking a [[yes–no question]], and each with its own [[boolean-valued function|boolean]]-valued [[outcome (probability)|outcome]]: [[wikt:success|success]]/[[yes and no|yes]]/[[truth value|true]]/[[one]] (with [[probability]] ''p'') or [[failure]]/[[yes and no|no]]/[[false (logic)|false]]/[[zero]] (with [[probability]] ''q''&nbsp;=&nbsp;1&nbsp;−&nbsp;''p'').  
第275行: 第275行:  
f(k,&nbsp;n,&nbsp;p) is monotone increasing for k&nbsp;<&nbsp;M and monotone decreasing for k&nbsp;>&nbsp;M, with the exception of the case where (n&nbsp;+&nbsp;1)p is an integer. In this case, there are two values for which f is maximal: (n&nbsp;+&nbsp;1)p and (n&nbsp;+&nbsp;1)p&nbsp;−&nbsp;1. M is the most probable outcome (that is, the most likely, although this can still be unlikely overall) of the Bernoulli trials and is called the mode.
 
f(k,&nbsp;n,&nbsp;p) is monotone increasing for k&nbsp;<&nbsp;M and monotone decreasing for k&nbsp;>&nbsp;M, with the exception of the case where (n&nbsp;+&nbsp;1)p is an integer. In this case, there are two values for which f is maximal: (n&nbsp;+&nbsp;1)p and (n&nbsp;+&nbsp;1)p&nbsp;−&nbsp;1. M is the most probable outcome (that is, the most likely, although this can still be unlikely overall) of the Bernoulli trials and is called the mode.
   −
''f''(''k'',&nbsp;''n'',&nbsp;''p'') is monotone increasing for ''k''&nbsp;<&nbsp;''M'' and monotone decreasing for ''k''&nbsp;>&nbsp;''M'', except when (''n''&nbsp;+&nbsp;1)''p'' is an integer; in that case there are two values for which ''f'' is maximal: (''n''&nbsp;+&nbsp;1)''p'' and (''n''&nbsp;+&nbsp;1)''p''&nbsp;−&nbsp;1. ''M'' is the most probable outcome of the Bernoulli trials (that is, the most likely, although this can still be unlikely overall) and is called the mode.
+
''f''(''k'',&nbsp;''n'',&nbsp;''p'') is monotone increasing for ''k''&nbsp;<&nbsp;''M'' and monotone decreasing for ''k''&nbsp;>&nbsp;''M'', except when (''n''&nbsp;+&nbsp;1)''p'' is an integer; in that case there are two values for which ''f'' is maximal: (''n''&nbsp;+&nbsp;1)''p'' and (''n''&nbsp;+&nbsp;1)''p''&nbsp;−&nbsp;1. ''M'' is the most probable outcome of the Bernoulli trials (that is, the most likely, although this can still be unlikely overall) and is called the mode.
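The mode rule above is easy to check numerically. A minimal Python sketch (the values ''n'' = 10, ''p'' = 0.3 are illustrative, chosen so that (''n''+1)''p'' is not an integer):

```python
from math import comb

def binom_pmf(k, n, p):
    """f(k; n, p) = C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3                       # (n+1)p = 3.3 is not an integer
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

M = int((n + 1) * p)                 # mode: floor((n+1)p) = 3
assert pmf.index(max(pmf)) == M

# f is monotone increasing for k < M and monotone decreasing for k > M
assert all(pmf[k] < pmf[k + 1] for k in range(M))
assert all(pmf[k] > pmf[k + 1] for k in range(M, n))
```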
      第575行: 第575行:     
* A median ''m'' cannot lie too far away from the mean: {{nowrap|&#124;''m'' − ''np''&#124; ≤ min{ ln 2, max{''p'', 1 − ''p''} }}}.<ref name="Hamza">{{Cite journal
 
* A median ''m'' cannot lie too far away from the mean: {{nowrap|&#124;''m'' − ''np''&#124; ≤ min{ ln 2, max{''p'', 1 − ''p''} }}}.<ref name="Hamza">{{Cite journal
 +
| last1 = Hamza | first1 = K.
 +
 +
| doi = 10.1016/0167-7152(94)00090-U
 +
 +
| title = The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions
 +
 +
<math>F(k;n,p) \leq \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right)</math>
 +
 +
| journal = Statistics & Probability Letters
 +
 +
| volume = 23
 +
 +
where D(a || p) is the relative entropy between an a-coin and a p-coin (i.e. between the Bernoulli(a) and Bernoulli(p) distribution):
 +
 +
| pages = 21–25
 +
 +
| year = 1995
 +
 +
<math>D(a\parallel p)=a\log\frac{a}{p}+(1-a)\log\frac{1-a}{1-p}.</math>
 +
 +
| pmid = 
 +
 +
| pmc =
 +
 +
Asymptotically, this bound is reasonably tight; see
 +
 +
}}</ref>
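The tail bound and the relative-entropy expression quoted above can be verified numerically. A Python sketch of the bound F(k;n,p) ≤ exp(−n D(k/n ‖ p)) on the lower tail (the parameters n = 30, p = 0.5 are illustrative):

```python
from math import comb, exp, log

def binom_cdf(k, n, p):
    """F(k; n, p): probability of at most k successes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def D(a, p):
    """Relative entropy between Bernoulli(a) and Bernoulli(p)."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))

n, p = 30, 0.5
for k in range(1, n // 2):           # lower tail, k/n < p
    assert binom_cdf(k, n, p) <= exp(-n * D(k / n, p))
```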
 +
 +
    
*A median ''m'' cannot lie too far away from the mean: {{nowrap|&#124;''m'' − ''np''&#124; ≤ min{ ln 2, max{''p'', 1 − ''p''} }}}.<ref name="Hamza">{{Cite journal

*A median ''m'' cannot lie too far away from the mean: {{nowrap|&#124;''m'' − ''np''&#124; ≤ min{ ln 2, max{''p'', 1 − ''p''} }}}.<ref name="Hamza">{{Cite journal
第614行: 第643行:  
* The median is unique and equal to ''m''&nbsp;=&nbsp;[[Rounding|round]](''np'') when |''m''&nbsp;−&nbsp;''np''|&nbsp;≤&nbsp;min{''p'',&nbsp;1&nbsp;−&nbsp;''p''} (except for the case when ''p''&nbsp;=&nbsp;{{sfrac|1|2}} and ''n'' is odd).<ref name="KaasBuhrman"/>
 
* The median is unique and equal to ''m''&nbsp;=&nbsp;[[Rounding|round]](''np'') when |''m''&nbsp;−&nbsp;''np''|&nbsp;≤&nbsp;min{''p'',&nbsp;1&nbsp;−&nbsp;''p''} (except for the case when ''p''&nbsp;=&nbsp;{{sfrac|1|2}} and ''n'' is odd).<ref name="KaasBuhrman"/>
   −
*The median is unique and equal to ''m''&nbsp;=&nbsp;[[Rounding|round]](''np'') when |''m''&nbsp;−&nbsp;''np''|&nbsp;≤&nbsp;min{''p'',&nbsp;1&nbsp;−&nbsp;''p''} (except for the case when ''p''&nbsp;=&nbsp;{{sfrac|1|2}} and ''n'' is odd).<ref name="KaasBuhrman"/>
+
*The median is unique and equal to ''m''&nbsp;=&nbsp;[[Rounding|round]](''np'') when |''m''&nbsp;−&nbsp;''np''|&nbsp;≤&nbsp;min{''p'',&nbsp;1&nbsp;−&nbsp;''p''} (except for the case when ''p''&nbsp;=&nbsp;{{sfrac|1|2}} and ''n'' is odd).<ref name="KaasBuhrman"/>
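The uniqueness statement can be checked by computing a median directly from the CDF. A minimal Python sketch (the values n = 20, p = 0.3 are illustrative and satisfy the condition):

```python
from math import comb

def binom_cdf(k, n, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def binom_median(n, p):
    """Smallest m with F(m; n, p) >= 1/2 (a median of the distribution)."""
    return next(m for m in range(n + 1) if binom_cdf(m, n, p) >= 0.5)

n, p = 20, 0.3                       # np = 6; |m - np| <= min(p, 1-p) holds
m = binom_median(n, p)
assert m == round(n * p) == 6
```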
    
which implies the simpler but looser bound
 
which implies the simpler but looser bound
第628行: 第657行:  
For p = 1/2 and k ≥ 3n/8 for even n, it is possible to make the denominator constant:
 
For p = 1/2 and k ≥ 3n/8 for even n, it is possible to make the denominator constant:
   −
For ''p''&nbsp;=&nbsp;1/2 and odd ''n'', any ''m'' with {{sfrac|1|2}}(''n''&nbsp;−&nbsp;1)&nbsp;≤&nbsp;''m''&nbsp;≤&nbsp;{{sfrac|1|2}}(''n''&nbsp;+&nbsp;1) is a median of the binomial distribution. If ''p''&nbsp;=&nbsp;1/2 and ''n'' is even, then ''m''&nbsp;=&nbsp;''n''/2 is the unique median:
+
For ''p''&nbsp;=&nbsp;1/2 and odd ''n'', any ''m'' with {{sfrac|1|2}}(''n''&nbsp;−&nbsp;1)&nbsp;≤&nbsp;''m''&nbsp;≤&nbsp;{{sfrac|1|2}}(''n''&nbsp;+&nbsp;1) is a median of the binomial distribution. If ''p''&nbsp;=&nbsp;1/2 and ''n'' is even, then ''m''&nbsp;=&nbsp;''n''/2 is the unique median:
F(k;n,p) \geq \frac1{\sqrt{2n}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right)
+
 
 +
<math>F(k;n,p) \geq \frac1{\sqrt{2n}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right);</math>
    
For ''p''&nbsp;=&nbsp;1/2 and even ''n'', when ''k''&nbsp;≥&nbsp;3''n''/8 it is possible to make the denominator constant:

For ''p''&nbsp;=&nbsp;1/2 and even ''n'', when ''k''&nbsp;≥&nbsp;3''n''/8 it is possible to make the denominator constant:
第643行: 第673行:  
  F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \!
 
  F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \!
   −
F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \!
+
<math>F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \!</math>
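This constant-denominator lower bound can be sanity-checked numerically. A Python sketch (n = 16 is illustrative, chosen even with 3n/8 = 6):

```python
from math import comb, exp

def F(k, n, p):
    """Binomial cumulative distribution function."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 16                               # even n; 3n/8 = 6
for k in range(6, n // 2 + 1):       # k >= 3n/8
    bound = exp(-16 * n * (0.5 - k / n) ** 2) / 15
    assert F(k, n, 0.5) >= bound
```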
      第662行: 第692行:  
When n is known, the parameter p can be estimated using the proportion of successes:  \widehat{p} = \frac{x}{n}. This estimator is found using maximum likelihood estimator and also the method of moments. This estimator is unbiased and uniformly with minimum variance, proven using Lehmann–Scheffé theorem, since it is based on a minimal sufficient and complete statistic (i.e.: x). It is also consistent both in probability and in MSE.
 
When n is known, the parameter p can be estimated using the proportion of successes:  \widehat{p} = \frac{x}{n}. This estimator is found using maximum likelihood estimator and also the method of moments. This estimator is unbiased and uniformly with minimum variance, proven using Lehmann–Scheffé theorem, since it is based on a minimal sufficient and complete statistic (i.e.: x). It is also consistent both in probability and in MSE.
   −
When ''n'' is known, the parameter ''p'' can be estimated using the proportion of successes: <math> \widehat{p} = \frac{x}{n}</math>. This estimator is found using the <font color="#ff8000">maximum likelihood estimator</font> and also the <font color="#ff8000">method of moments</font>. It is unbiased and uniformly of minimum variance, proven using the <font color="#ff8000">Lehmann–Scheffé theorem</font>, since it is based on a minimal <font color="#ff8000">sufficient and complete statistic</font> (i.e. ''x''). It is also consistent both in probability and in <font color="#ff8000">MSE</font>.
+
When ''n'' is known, the parameter ''p'' can be estimated using the proportion of successes: <math> \widehat{p} = \frac{x}{n}</math>. This estimator is found using the <font color="#ff8000">maximum likelihood estimator</font> and also the <font color="#ff8000">method of moments</font>. It is unbiased and uniformly of minimum variance, proven using the <font color="#ff8000">Lehmann–Scheffé theorem</font>, since it is based on a minimal <font color="#ff8000">sufficient and complete statistic</font> (i.e. ''x''). It is also consistent both in probability and in <font color="#ff8000">MSE</font>.
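The proportion-of-successes estimator can be illustrated with a small simulation. A Python sketch (n and the true p are illustrative values, not from the text):

```python
import random

random.seed(1)
n, p_true = 10_000, 0.37             # illustrative values

# one binomial observation x, simulated as n Bernoulli(p) trials
x = sum(random.random() < p_true for _ in range(n))

p_hat = x / n                        # MLE / method-of-moments estimate
assert abs(p_hat - p_true) < 0.02
```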
      第672行: 第702行:  
A closed form Bayes estimator for p also exists when using the Beta distribution as a conjugate prior distribution. When using a general \operatorname{Beta}(\alpha, \beta) as a prior, the posterior mean estimator is:  \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}. The Bayes estimator is asymptotically efficient and as the sample size approaches infinity (n → ∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors),  admissible and consistent in probability.
 
A closed form Bayes estimator for p also exists when using the Beta distribution as a conjugate prior distribution. When using a general \operatorname{Beta}(\alpha, \beta) as a prior, the posterior mean estimator is:  \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}. The Bayes estimator is asymptotically efficient and as the sample size approaches infinity (n → ∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors),  admissible and consistent in probability.
   −
A closed-form <font color="#ff8000">Bayes estimator</font> for ''p'' also exists when using the Beta distribution as a <font color="#ff8000">conjugate prior distribution</font>. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the posterior mean estimator is <math>\widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is asymptotically efficient and, as the sample size approaches infinity (''n''&nbsp;→&nbsp;∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors), admissible, and consistent in probability.
+
A closed-form <font color="#ff8000">Bayes estimator</font> for ''p'' also exists when using the Beta distribution as a <font color="#ff8000">conjugate prior distribution</font>. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the posterior mean estimator is <math>\widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is asymptotically efficient and, as the sample size approaches infinity (''n''&nbsp;→&nbsp;∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors), admissible, and consistent in probability.
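The posterior mean formula is one line of code. A Python sketch (the sample values x = 3, n = 10 and the prior parameters are illustrative):

```python
def bayes_estimate(x, n, a=1.0, b=1.0):
    """Posterior mean of p under a Beta(a, b) prior."""
    return (x + a) / (n + a + b)

# with the uniform prior Beta(1, 1) this is Laplace's rule of succession
assert bayes_estimate(3, 10) == 4 / 12
assert bayes_estimate(3, 10, a=2, b=5) == 5 / 17
```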
      第680行: 第710行:  
For the special case of using the standard uniform distribution as a non-informative prior (\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)), the posterior mean estimator becomes  \widehat{p_b} = \frac{x+1}{n+2} (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
 
For the special case of using the standard uniform distribution as a non-informative prior (\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)), the posterior mean estimator becomes  \widehat{p_b} = \frac{x+1}{n+2} (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
   −
For the special case of using the standard uniform distribution as a non-informative prior (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math>\widehat{p_b} = \frac{x+1}{n+2}</math> (a posterior mode should just lead to the standard estimator). This method is called the <font color="#ff8000">rule of succession</font>, which was introduced in the 18th century by Pierre-Simon Laplace.
+
For the special case of using the standard uniform distribution as a non-informative prior (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math>\widehat{p_b} = \frac{x+1}{n+2}</math> (a posterior mode should just lead to the standard estimator). This method is called the <font color="#ff8000">rule of succession</font>, which was introduced in the 18th century by Pierre-Simon Laplace.
      第690行: 第720行:  
When estimating p with very rare events and a small n (e.g.: if x=0), then using the standard estimator leads to  \widehat{p} = 0, which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators. One way is to use the Bayes estimator, leading to:  \widehat{p_b} = \frac{1}{n+2}). Another method is to use the upper bound of the confidence interval obtained using the rule of three:  \widehat{p_{\text{rule of 3}}} = \frac{3}{n})
 
When estimating p with very rare events and a small n (e.g.: if x=0), then using the standard estimator leads to  \widehat{p} = 0, which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators. One way is to use the Bayes estimator, leading to:  \widehat{p_b} = \frac{1}{n+2}). Another method is to use the upper bound of the confidence interval obtained using the rule of three:  \widehat{p_{\text{rule of 3}}} = \frac{3}{n})
   −
When estimating ''p'' with very rare events and a small ''n'' (e.g. if ''x''&nbsp;=&nbsp;0), using the standard estimator leads to <math>\widehat{p} = 0</math>, which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators. One way is to use the Bayes estimator, leading to <math>\widehat{p_b} = \frac{1}{n+2}</math>. Another method is to use the upper bound of the confidence interval obtained using the <font color="#ff8000">rule of three</font>: <math>\widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>.
+
When estimating ''p'' with very rare events and a small ''n'' (e.g. if ''x''&nbsp;=&nbsp;0), using the standard estimator leads to <math>\widehat{p} = 0</math>, which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators. One way is to use the Bayes estimator, leading to <math>\widehat{p_b} = \frac{1}{n+2}</math>. Another method is to use the upper bound of the confidence interval obtained using the <font color="#ff8000">rule of three</font>: <math>\widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>.
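Both zero-success alternatives reduce to simple formulas. A Python sketch (n = 100 is an illustrative sample size):

```python
def bayes_zero(n):
    """Bayes posterior-mean estimate (uniform prior) when x = 0."""
    return 1 / (n + 2)

def rule_of_three(n):
    """Upper bound of the 95% confidence interval when x = 0."""
    return 3 / n

n = 100
assert bayes_zero(n) == 1 / 102
assert rule_of_three(n) == 0.03
```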
      第734行: 第764行:  
   \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } .
 
   \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } .
   −
\widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } }
+
<math>\widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } }</math>
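This normal-approximation (Wald) interval is straightforward to compute. A Python sketch (the observation of 40 successes in 100 trials and z = 1.96 are illustrative):

```python
from math import sqrt

def wald_interval(x, n, z=1.96):
    """Wald (normal-approximation) confidence interval for p."""
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

lo, hi = wald_interval(40, 100)      # 40 successes in 100 trials
assert abs(lo - 0.30398) < 1e-4 and abs(hi - 0.49602) < 1e-4
```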
      第763行: 第793行:  
   \tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } } .
 
   \tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } } .
   −
\tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } }.
+
<math>\tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } }.</math>
    
A closed form [[Bayes estimator]] for ''p'' also exists when using the [[Beta distribution]] as a [[Conjugate prior|conjugate]] [[prior distribution]]. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the [[Bayes estimator#Posterior mean|posterior mean]] estimator is: <math> \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is [[Asymptotic efficiency (Bayes)|asymptotically efficient]] and as the sample size approaches infinity (''n'' → ∞), it approaches the [[Maximum likelihood estimation|MLE]] solution. The Bayes estimator is [[Bias of an estimator|biased]] (how much depends on the priors),  [[Bayes estimator#Admissibility|admissible]] and [[Consistent estimator|consistent]] in probability.
 
A closed form [[Bayes estimator]] for ''p'' also exists when using the [[Beta distribution]] as a [[Conjugate prior|conjugate]] [[prior distribution]]. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the [[Bayes estimator#Posterior mean|posterior mean]] estimator is: <math> \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is [[Asymptotic efficiency (Bayes)|asymptotically efficient]] and as the sample size approaches infinity (''n'' → ∞), it approaches the [[Maximum likelihood estimation|MLE]] solution. The Bayes estimator is [[Bias of an estimator|biased]] (how much depends on the priors),  [[Bayes estimator#Admissibility|admissible]] and [[Consistent estimator|consistent]] in probability.
   −
A closed-form Bayes estimator for ''p'' also exists when using the Beta distribution as a conjugate prior distribution. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the posterior mean estimator is <math>\widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is asymptotically efficient and, as the sample size approaches infinity (''n''&nbsp;→&nbsp;∞), it approaches the maximum likelihood (MLE) solution. The Bayes estimator is biased (how much depends on the priors), admissible, and consistent in probability.
+
A closed-form Bayes estimator for ''p'' also exists when using the Beta distribution as a conjugate prior distribution. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the posterior mean estimator is <math>\widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is asymptotically efficient and, as the sample size approaches infinity (''n''&nbsp;→&nbsp;∞), it approaches the maximum likelihood (MLE) solution. The Bayes estimator is biased (how much depends on the priors), admissible, and consistent in probability.
      第778行: 第808行:  
For the special case of using the [[Standard uniform distribution|standard uniform distribution]] as a [[non-informative prior]] (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math> \widehat{p_b} = \frac{x+1}{n+2}</math> (a [[Bayes estimator#Posterior mode|posterior mode]] should just lead to the standard estimator). This method is called the [[rule of succession]], which was introduced in the 18th century by [[Pierre-Simon Laplace]].
 
For the special case of using the [[Standard uniform distribution|standard uniform distribution]] as a [[non-informative prior]] (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math> \widehat{p_b} = \frac{x+1}{n+2}</math> (a [[Bayes estimator#Posterior mode|posterior mode]] should just lead to the standard estimator). This method is called the [[rule of succession]], which was introduced in the 18th century by [[Pierre-Simon Laplace]].
   −
For the special case of using the standard uniform distribution as a non-informative prior (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math>\widehat{p_b} = \frac{x+1}{n+2}</math> (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
+
For the special case of using the standard uniform distribution as a non-informative prior (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math>\widehat{p_b} = \frac{x+1}{n+2}</math> (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
       
   \tilde{p}= \frac{ n_1 + \frac{1}{2} z^2}{ n + z^2 }  
 
   \tilde{p}= \frac{ n_1 + \frac{1}{2} z^2}{ n + z^2 }  
   −
\tilde{p}= \frac{ n_1 + \frac{1}{2} z^2}{ n + z^2 }
+
<math>\tilde{p}= \frac{ n_1 + \frac{1}{2} z^2}{ n + z^2 }</math>
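The recentered estimate above, combined with the interval quoted earlier, gives the Agresti–Coull procedure. A Python sketch (x = 0 successes in n = 20 trials is an illustrative edge case where the Wald interval would collapse):

```python
from math import sqrt

def agresti_coull(x, n, z=1.96):
    """Agresti-Coull interval: recenter at p~ = (x + z^2/2) / (n + z^2)."""
    n_adj = n + z**2
    p_tilde = (x + z**2 / 2) / n_adj
    half = z * sqrt(p_tilde * (1 - p_tilde) / n_adj)
    return p_tilde - half, p_tilde + half

# unlike the Wald interval, x = 0 does not collapse it to zero width
lo, hi = agresti_coull(0, 20)
assert hi - lo > 0.1
```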
    
When estimating ''p'' with very rare events and a small ''n'' (e.g.: if x=0), then using the standard estimator leads to <math> \widehat{p} = 0,</math> which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators.<ref>{{cite journal |last=Razzaghi |first=Mehdi |title=On the estimation of binomial success probability with zero occurrence in sample |journal=Journal of Modern Applied Statistical Methods |volume=1 |issue=2 |year=2002 |pages=326–332 |doi=10.22237/jmasm/1036110000 |doi-access=free }}</ref> One way is to use the Bayes estimator, leading to: <math> \widehat{p_b} = \frac{1}{n+2}</math>). Another method is to use the upper bound of the [[confidence interval]] obtained using the [[Rule of three (statistics)|rule of three]]: <math> \widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>)
 
When estimating ''p'' with very rare events and a small ''n'' (e.g.: if x=0), then using the standard estimator leads to <math> \widehat{p} = 0,</math> which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators.<ref>{{cite journal |last=Razzaghi |first=Mehdi |title=On the estimation of binomial success probability with zero occurrence in sample |journal=Journal of Modern Applied Statistical Methods |volume=1 |issue=2 |year=2002 |pages=326–332 |doi=10.22237/jmasm/1036110000 |doi-access=free }}</ref> One way is to use the Bayes estimator, leading to: <math> \widehat{p_b} = \frac{1}{n+2}</math>). Another method is to use the upper bound of the [[confidence interval]] obtained using the [[Rule of three (statistics)|rule of three]]: <math> \widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>)
第800行: 第830行:  
  \sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right).
 
  \sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right).
   −
:\sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right).
+
:<math>\sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right).</math>
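The arcsine-transform interval above can be sketched directly from the formula. A Python illustration (40 successes in 100 trials, z = 1.96, both illustrative):

```python
from math import asin, sin, sqrt

def arcsine_interval(x, n, z=1.96):
    """Variance-stabilizing arcsine-transform interval for p."""
    center = asin(sqrt(x / n))
    half = z / (2 * sqrt(n))
    return sin(center - half) ** 2, sin(center + half) ** 2

lo, hi = arcsine_interval(40, 100)
assert lo < 0.4 < hi
```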
    
Even for quite large values of ''n'', the actual distribution of the mean is significantly nonnormal.<ref name=Brown2001>{{Citation |first1=Lawrence D. |last1=Brown |first2=T. Tony |last2=Cai |first3=Anirban |last3=DasGupta |year=2001 |title = Interval Estimation for a Binomial Proportion |url=http://www-stat.wharton.upenn.edu/~tcai/paper/html/Binomial-StatSci.html |journal=Statistical Science |volume=16 |issue=2 |pages=101–133 |access-date = 2015-01-05 |doi=10.1214/ss/1009213286|citeseerx=10.1.1.323.7752 }}</ref> Because of this problem several methods to estimate confidence intervals have been proposed.
 
Even for quite large values of ''n'', the actual distribution of the mean is significantly nonnormal.<ref name=Brown2001>{{Citation |first1=Lawrence D. |last1=Brown |first2=T. Tony |last2=Cai |first3=Anirban |last3=DasGupta |year=2001 |title = Interval Estimation for a Binomial Proportion |url=http://www-stat.wharton.upenn.edu/~tcai/paper/html/Binomial-StatSci.html |journal=Statistical Science |volume=16 |issue=2 |pages=101–133 |access-date = 2015-01-05 |doi=10.1214/ss/1009213286|citeseerx=10.1.1.323.7752 }}</ref> Because of this problem several methods to estimate confidence intervals have been proposed.
第834行: 第864行:  
<font color="#ff8000">Wald method</font>

<font color="#ff8000">Wald method</font>
   −
:: <math> \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } .</math>
+
: <math> \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } .</math>
    
  <math>\frac{
 
  <math>\frac{
      −
     \widehat{p\,} + \frac{z^2}{2n} + z
+
     \widehat{p\,} + \frac{z^2}{2n} + z
    
: A [[continuity correction]] of 0.5/''n'' may be added.{{clarify|date=July 2012}}
 
: A [[continuity correction]] of 0.5/''n'' may be added.{{clarify|date=July 2012}}
   
     \sqrt{
 
     \sqrt{
   第861行: 第890行:  
:: <math> \tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } } .</math>
 
:: <math> \tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } } .</math>
   −
    1 + \frac{z^2}{n}
+
<math>1 + \frac{z^2}{n}</math>
 
  −
 
  −
 
  −
}</math>
      
: Here the estimate of ''p'' is modified to
 
: Here the estimate of ''p'' is modified to
第913行: 第938行:  
For example, imagine throwing n balls to a basket UX and taking the balls that hit and throwing them to another basket UY. If p is the probability to hit UX then X&nbsp;~&nbsp;B(n,&nbsp;p) is the number of balls that hit UX. If q is the probability to hit UY then the number of balls that hit UY is Y&nbsp;~&nbsp;B(X,&nbsp;q) and therefore Y&nbsp;~&nbsp;B(n,&nbsp;pq).
 
For example, imagine throwing n balls to a basket UX and taking the balls that hit and throwing them to another basket UY. If p is the probability to hit UX then X&nbsp;~&nbsp;B(n,&nbsp;p) is the number of balls that hit UX. If q is the probability to hit UY then the number of balls that hit UY is Y&nbsp;~&nbsp;B(X,&nbsp;q) and therefore Y&nbsp;~&nbsp;B(n,&nbsp;pq).
   −
For example, imagine throwing ''n'' balls to a basket ''UX'' and taking the balls that hit and throwing them to another basket ''UY''. If ''p'' is the probability to hit ''UX'' then ''X''&nbsp;~&nbsp;B(''n'',&nbsp;''p'') is the number of balls that hit ''UX''. If ''q'' is the probability to hit ''UY'' then the number of balls that hit ''UY'' is ''Y''&nbsp;~&nbsp;B(''X'',&nbsp;''q'') and therefore ''Y''&nbsp;~&nbsp;B(''n'',&nbsp;''pq'').
+
For example, imagine throwing ''n'' balls to a basket ''UX'' and taking the balls that hit and throwing them to another basket ''UY''. If ''p'' is the probability to hit ''UX'' then ''X''&nbsp;~&nbsp;B(''n'',&nbsp;''p'') is the number of balls that hit ''UX''. If ''q'' is the probability to hit ''UY'' then the number of balls that hit ''UY'' is ''Y''&nbsp;~&nbsp;B(''X'',&nbsp;''q'') and therefore ''Y''&nbsp;~&nbsp;B(''n'',&nbsp;''pq'').
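The two-basket composition can be checked by simulation: drawing Y ~ B(X, q) with X ~ B(n, p) should behave like a single draw from B(n, pq). A Python sketch (n, p, q and the number of trials are illustrative):

```python
import random

random.seed(0)
n, p, q = 50, 0.6, 0.5
trials = 10_000

ys = []
for _ in range(trials):
    x = sum(random.random() < p for _ in range(n))   # X ~ B(n, p)
    y = sum(random.random() < q for _ in range(x))   # Y ~ B(X, q)
    ys.append(y)

# Y should behave like B(n, pq); its mean is n*p*q = 15
mean_y = sum(ys) / trials
assert abs(mean_y - n * p * q) < 0.2
```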
    
The notation in the formula below differs from the previous formulas in two respects:<ref name="Wilson1927">{{Citation |last = Wilson |first=Edwin B. |date = June 1927 |title = Probable inference, the law of succession, and statistical inference |url = http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |journal = Journal of the American Statistical Association |volume=22 |issue=158 |pages=209–212 |access-date= 2015-01-05 |doi = 10.2307/2276774 |url-status=dead |archive-url = https://web.archive.org/web/20150113082307/http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |archive-date = 2015-01-13 |jstor = 2276774 }}</ref>
 
The notation in the formula below differs from the previous formulas in two respects:<ref name="Wilson1927">{{Citation |last = Wilson |first=Edwin B. |date = June 1927 |title = Probable inference, the law of succession, and statistical inference |url = http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |journal = Journal of the American Statistical Association |volume=22 |issue=158 |pages=209–212 |access-date= 2015-01-05 |doi = 10.2307/2276774 |url-status=dead |archive-url = https://web.archive.org/web/20150113082307/http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |archive-date = 2015-01-13 |jstor = 2276774 }}</ref>
第931行: 第956行:  
Since  X \sim B(n, p)  and  Y \sim B(X, q) , by the law of total probability,
 
Since  X \sim B(n, p)  and  Y \sim B(X, q) , by the law of total probability,
   −
Since <math>X \sim B(n, p)</math> and <math>Y \sim B(X, q)</math>, by the <font color="#ff8000">law of total probability</font>,
+
Since <math>X \sim B(n, p)</math> and <math>Y \sim B(X, q)</math>, by the <font color="#ff8000">law of total probability</font>,
      第937行: 第962行:  
<math>\begin{align}
 
<math>\begin{align}
   −
:: <math>\frac{
+
:: <math>\frac{
    
   \Pr[Y = m] &= \sum_{k = m}^{n} \Pr[Y = m \mid X = k] \Pr[X = k] \\[2pt]
 
   \Pr[Y = m] &= \sum_{k = m}^{n} \Pr[Y = m \mid X = k] \Pr[X = k] \\[2pt]
第949行: 第974行:  
  \end{align}</math>
 
  \end{align}</math>
   −
        \frac{\widehat{p\,}(1 - \widehat{p\,})}{n} +
+
<math>\frac{\widehat{p\,}(1 - \widehat{p\,})}{n} </math>
    
Since \tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}, the equation above can be expressed as
 
Since \tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}, the equation above can be expressed as
   −
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}</math>, the equation above can be expressed as
+
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}</math>, the equation above can be expressed as
   −
        \frac{z^2}{4 n^2}
+
<math>\frac{z^2}{4 n^2}</math>
   −
  \Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} 
+
  <math>\Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m}</math>
 
  −
    }
      
Factoring  p^k = p^m p^{k-m}  and pulling all the terms that don't depend on  k  out of the sum now yields
 
Factoring  p^k = p^m p^{k-m}  and pulling all the terms that don't depend on  k  out of the sum now yields
   −
Factoring <math>p^k = p^m p^{k-m}</math> and pulling all the terms that don't depend on <math>k</math> out of the sum now yields
+
Factoring <math>p^k = p^m p^{k-m}</math> and pulling all the terms that don't depend on <math>k</math> out of the sum now yields
    
}{
 
}{
    
<math>\begin{align}
 
<math>\begin{align}
  −
      
     1 + \frac{z^2}{n}
 
     1 + \frac{z^2}{n}
第975行: 第996行:  
   \Pr[Y = m] &= \binom{n}{m} p^m q^m \left( \sum_{k=m}^n \binom{n-m}{k-m} p^{k-m} (1-p)^{n-k} (1-q)^{k-m} \right) \\[2pt]
 
   \Pr[Y = m] &= \binom{n}{m} p^m q^m \left( \sum_{k=m}^n \binom{n-m}{k-m} p^{k-m} (1-p)^{n-k} (1-q)^{k-m} \right) \\[2pt]
   −
}</math><ref>{{cite book
+
}</math><ref>{{cite book}}</ref>
    
   &= \binom{n}{m} (pq)^m \left( \sum_{k=m}^n \binom{n-m}{k-m} \left(p(1-q)\right)^{k-m} (1-p)^{n-k}  \right)
 
   &= \binom{n}{m} (pq)^m \left( \sum_{k=m}^n \binom{n-m}{k-m} \left(p(1-q)\right)^{k-m} (1-p)^{n-k}  \right)
第1,082行: 第1,103行:  
  \mathcal{N}(np,\,np(1-p)),
 
  \mathcal{N}(np,\,np(1-p)),
   −
\mathcal{N}(np,\,np(1-p))
+
<math>\mathcal{N}(np,\,np(1-p))</math>
      第1,111行: 第1,132行:  
For example, suppose one randomly samples n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of n people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation \sigma = \sqrt{\frac{p(1-p)}{n}}
 
For example, suppose one randomly samples n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of n people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation \sigma = \sqrt{\frac{p(1-p)}{n}}
   −
For example, suppose one randomly samples ''n'' people out of a large population and asks them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of ''n'' people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion ''p'' of agreement in the population and with standard deviation <math>\sigma = \sqrt{\frac{p(1-p)}{n}}</math>.
+
For example, suppose one randomly samples ''n'' people out of a large population and asks them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of ''n'' people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion ''p'' of agreement in the population and with standard deviation <math>\sigma = \sqrt{\frac{p(1-p)}{n}}</math>.
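The predicted mean and spread of the sample proportion can be checked by repeated sampling. A Python sketch (the survey size n = 400, p = 0.25 and the number of repetitions are illustrative):

```python
import random
from math import sqrt

random.seed(3)
n, p, reps = 400, 0.25, 5_000

# repeatedly sample n people and record the proportion who "agree"
props = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

mean = sum(props) / reps
sd = sqrt(sum((v - mean) ** 2 for v in props) / reps)

sigma = sqrt(p * (1 - p) / n)        # predicted sd of the sample proportion
assert abs(mean - p) < 0.002
assert abs(sd - sigma) < 0.002
```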
    
Then log(''T'') is approximately normally distributed with mean log(''p''<sub>1</sub>/''p''<sub>2</sub>) and variance ((1/''p''<sub>1</sub>)&nbsp;−&nbsp;1)/''n''&nbsp;+&nbsp;((1/''p''<sub>2</sub>)&nbsp;−&nbsp;1)/''m''.
 
Then log(''T'') is approximately normally distributed with mean log(''p''<sub>1</sub>/''p''<sub>2</sub>) and variance ((1/''p''<sub>1</sub>)&nbsp;−&nbsp;1)/''n''&nbsp;+&nbsp;((1/''p''<sub>2</sub>)&nbsp;−&nbsp;1)/''m''.
第1,131行: 第1,152行:  
The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed or at least p tends to zero. Therefore, the Poisson distribution with parameter λ = np can be used as an approximation to B(n, p) of the binomial distribution if n is sufficiently large and p is sufficiently small.  According to two rules of thumb, this approximation is good if n&nbsp;≥&nbsp;20 and p&nbsp;≤&nbsp;0.05, or if n&nbsp;≥&nbsp;100 and np&nbsp;≤&nbsp;10.
 
The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed or at least p tends to zero. Therefore, the Poisson distribution with parameter λ = np can be used as an approximation to B(n, p) of the binomial distribution if n is sufficiently large and p is sufficiently small.  According to two rules of thumb, this approximation is good if n&nbsp;≥&nbsp;20 and p&nbsp;≤&nbsp;0.05, or if n&nbsp;≥&nbsp;100 and np&nbsp;≤&nbsp;10.
   −
The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product ''np'' remains fixed or at least ''p'' tends to zero. Therefore, the Poisson distribution with parameter λ&nbsp;=&nbsp;''np'' can be used as an approximation to B(''n'',&nbsp;''p'') if ''n'' is sufficiently large and ''p'' is sufficiently small. According to two rules of thumb, this approximation is good if ''n''&nbsp;≥&nbsp;20 and ''p''&nbsp;≤&nbsp;0.05, or if ''n''&nbsp;≥&nbsp;100 and ''np''&nbsp;≤&nbsp;10.
+
The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product ''np'' remains fixed or at least ''p'' tends to zero. Therefore, the Poisson distribution with parameter λ&nbsp;=&nbsp;''np'' can be used as an approximation to B(''n'',&nbsp;''p'') if ''n'' is sufficiently large and ''p'' is sufficiently small. According to two rules of thumb, this approximation is good if ''n''&nbsp;≥&nbsp;20 and ''p''&nbsp;≤&nbsp;0.05, or if ''n''&nbsp;≥&nbsp;100 and ''np''&nbsp;≤&nbsp;10.
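The second rule of thumb can be checked pointwise against the Poisson pmf. A Python sketch (n = 1000, p = 0.005 are illustrative values satisfying n ≥ 100 and np ≤ 10):

```python
from math import comb, exp, factorial

n, p = 1000, 0.005                   # n >= 100 and np = 5 <= 10
lam = n * p

# compare the binomial and Poisson pmfs over the bulk of the support
for k in range(15):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = exp(-lam) * lam**k / factorial(k)
    assert abs(binom - poisson) < 2e-3
```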
      第1,170行: 第1,191行:  
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> the equation above can be expressed as
 
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> the equation above can be expressed as
   −
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> the equation above can be expressed as
+
Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> the equation above can be expressed as
    
:<math> \Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} </math>
 
:<math> \Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} </math>
第1,229行: 第1,250行:  
The [[Bernoulli distribution]] is a special case of the binomial distribution, where ''n''&nbsp;=&nbsp;1. Symbolically, ''X''&nbsp;~&nbsp;B(1,&nbsp;''p'') has the same meaning as ''X''&nbsp;~&nbsp;Bernoulli(''p''). Conversely, any binomial distribution, B(''n'',&nbsp;''p''), is the distribution of the sum of ''n'' [[Bernoulli trials]], Bernoulli(''p''), each with the same probability ''p''.<ref>{{cite web|last1=Taboga|first1=Marco|title=Lectures on Probability Theory and Mathematical Statistics|url=https://www.statlect.com/probability-distributions/binomial-distribution#hid3|website=statlect.com|accessdate=18 December 2017}}</ref>
 
The [[Bernoulli distribution]] is a special case of the binomial distribution, where ''n''&nbsp;=&nbsp;1. Symbolically, ''X''&nbsp;~&nbsp;B(1,&nbsp;''p'') has the same meaning as ''X''&nbsp;~&nbsp;Bernoulli(''p''). Conversely, any binomial distribution, B(''n'',&nbsp;''p''), is the distribution of the sum of ''n'' [[Bernoulli trials]], Bernoulli(''p''), each with the same probability ''p''.<ref>{{cite web|last1=Taboga|first1=Marco|title=Lectures on Probability Theory and Mathematical Statistics|url=https://www.statlect.com/probability-distributions/binomial-distribution#hid3|website=statlect.com|accessdate=18 December 2017}}</ref>
   −
The Bernoulli distribution is a special case of the binomial distribution, where ''n''&nbsp;=&nbsp;1. Symbolically, ''X''&nbsp;~&nbsp;B(1,&nbsp;''p'') has the same meaning as ''X''&nbsp;~&nbsp;Bernoulli(''p''). Conversely, any binomial distribution B(''n'',&nbsp;''p'') is the distribution of the sum of ''n'' Bernoulli trials, Bernoulli(''p''), each with the same probability ''p''.
+
The Bernoulli distribution is a special case of the binomial distribution, where ''n''&nbsp;=&nbsp;1. Symbolically, ''X''&nbsp;~&nbsp;B(1,&nbsp;''p'') has the same meaning as ''X''&nbsp;~&nbsp;Bernoulli(''p''). Conversely, any binomial distribution B(''n'',&nbsp;''p'') is the distribution of the sum of ''n'' Bernoulli trials, Bernoulli(''p''), each with the same probability ''p''.<ref>{{cite web|last1=Taboga|first1=Marco|title=Lectures on Probability Theory and Mathematical Statistics|url=https://www.statlect.com/probability-distributions/binomial-distribution#hid3|website=statlect.com|accessdate=18 December 2017}}</ref>
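The sum-of-Bernoulli-trials view connects back to the Galton-box caption earlier, where the central bin has probability 70/256. A Python sketch (the number of simulated trials is illustrative):

```python
import random

random.seed(42)
n, p, trials = 8, 0.5, 20_000

# B(n, p) as a sum of n independent Bernoulli(p) variables
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

# Galton box with 8 layers: P(k = 4 | n = 8, p = 1/2) = C(8,4)/2^8 = 70/256
freq4 = samples.count(4) / trials
assert abs(freq4 - 70 / 256) < 0.01
```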
     