更改

二项分布 (查看源代码)

2020年10月4日 (日) 11:35的版本

添加6,734字节、 2020年10月4日 (日) 11:35

无编辑摘要

第1行：第1行： −

~~此词条由南风翻译~~

+

此词条暂由南风翻译

第59行：第59行：

| parameters = <math>n \in \{0, 1, 2, \ldots\}</math> – number of trials<br /><math>p \in [0,1]</math> – success probability for each trial<br /><math>q = 1 - p</math>

−

| ~~parameters~~ = n in {0,1,2，ldots } -- 试验次数 p in [0,1] -- 每个试验的成功概率 q = 1-p

+

| 参数 = <math>n \in \{0, 1, 2, \ldots\}</math> – 试验次数<br /><math>p \in [0,1]</math> – 每次试验的成功概率<br /><math>q = 1 - p</math>

| support = <math>k \in \{0, 1, \ldots, n\}</math> – number of successes

第95行：第95行：

| mode = <math>\lfloor (n + 1)p \rfloor</math> or <math>\lceil (n + 1)p \rceil - 1</math>

−

| ~~mode~~ = lfloor (n + 1) p rfloor 或 lceil (n + 1) p rceil-1

+

| 众数 = <math>\lfloor (n + 1)p \rfloor</math> 或 <math>\lceil (n + 1)p \rceil - 1</math>

| variance = <math>npq</math>

第183行：第183行：

===Probability mass function===

+

概率质量函数

第190行：第192行：

In general, if the random variable X follows the binomial distribution with parameters n ∈ ℕ and p ∈ [0,1], we write X ~ B(n, p). The probability of getting exactly k successes in n independent Bernoulli trials is given by the probability mass function:

−

一般来说，如果随机变量 x 服从参数 n ∈ N 和 p ∈[0,1]的二项分布，我们写作 x ~ b (n，p)。在 n 个独立的伯努利试验中获得 k ~~成功的概率是由概率质量函数给出的~~:

+

一般来说，如果随机变量 ''X'' 服从参数为 ''n'' ∈ ℕ 和 ''p'' ∈ [0,1] 的二项分布，记作 ''X'' ~ B(''n'', ''p'')。在 ''n'' 次独立的伯努利试验中恰好获得 ''k'' 次成功的概率由'''概率质量函数'''给出:
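The mass function itself is not reproduced in this excerpt. As a minimal sketch, assuming the standard form Pr(X = k) = C(n, k) p^k (1 − p)^(n − k), it can be evaluated as follows; the function name `binomial_pmf` is illustrative only.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    if not 0 <= k <= n:
        return 0.0  # outside the support {0, 1, ..., n}
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The probabilities over the whole support should sum to 1 (up to rounding).
total = sum(binomial_pmf(k, 6, 0.5) for k in range(7))
```

With ''n'' = 6 and ''p'' = 0.5 (the parameters used in the figure caption at the end of this page), `binomial_pmf(3, 6, 0.5)` is 20/64.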

第299行：第301行：

===Cumulative distribution function===

+

累积分布函数

第322行：第326行：

where \lfloor k\rfloor is the "floor" under k, i.e. the greatest integer less than or equal to k.

−

~~楼层是~~ k 下面的”楼层” ，也就是。小于或等于 k 的最大整数。

+

其中 <math>\lfloor k\rfloor</math> 是 ''k'' 的“下取整”，即小于或等于 ''k'' 的最大整数。
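The role of the floor in the cumulative distribution function can be sketched as below. This assumes the usual definition F(k; n, p) = Pr(X ≤ k), summed over the mass function up to ⌊k⌋; the name `binomial_cdf` is hypothetical.

```python
from math import comb, floor


def binomial_cdf(k: float, n: int, p: float) -> float:
    """F(k; n, p) = Pr(X <= k), summing the mass function for i = 0..floor(k)."""
    top = min(floor(k), n)  # nothing beyond n contributes
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(top + 1))
```

Because of the floor, a non-integer argument such as 2.7 gives the same value as 2.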

第370行：第374行：

which is equivalent to the cumulative distribution function of the F-distribution:

−

~~这相当于-分布的累积分布函数~~:

+

这相当于 F-分布的'''累积分布函数''':

第386行：第390行：

Some closed-form bounds for the cumulative distribution function are given below.

−

~~下面给出了累积分布函数的一些闭式界。~~

+

下面给出了'''累积分布函数'''的一些闭式界。

第393行：第397行：

===Expected value and variance===

+

期望值和方差

第449行：第455行：

===Higher moments===

+

高阶矩

The first 6 central moments are given by

第454行：第462行：

The first 6 central moments are given by

−

~~前6个中心矩阵由~~

+

前6个中心矩为

:<math>\begin{align}

第507行：第515行：

===Mode===

+

众数

第514行：第524行：

Usually the mode of a binomial B(n, p) distribution is equal to \lfloor (n+1)p\rfloor, where \lfloor\cdot\rfloor is the floor function. However, when (n + 1)p is an integer and p is neither 0 nor 1, then the distribution has two modes: (n + 1)p and (n + 1)p − 1. When p is equal to 0 or 1, the mode will be 0 and n correspondingly. These cases can be summarized as follows:

−

通常二项式 b (n，p)~~分布的模式等于~~ lfloor (n + 1) p 楼层，其中 lfloor cdot ~~楼层是阶梯函数。然而，当~~(n + 1) p 是整数且 p ~~既不是0也不是1时，分布有两种模式~~: (n + 1) p 和(n + 1) p-1。当 p ~~等于0或1时，模式将相应地为0和~~ n。这些情况可概述如下:

+

通常，二项分布 B(''n'', ''p'') 的众数等于 <math>\lfloor (n+1)p\rfloor</math>，其中 <math>\lfloor\cdot\rfloor</math> 是下取整函数。然而，当 (''n'' + 1)''p'' 是整数且 ''p'' 既不是 0 也不是 1 时，分布有两个众数: (''n'' + 1)''p'' 和 (''n'' + 1)''p'' − 1。当 ''p'' 等于 0 或 1 时，众数分别为 0 和 ''n''。这些情况可概述如下:
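The case analysis for the mode can be written out directly. The sketch below uses exact rational arithmetic so that the test "(''n'' + 1)''p'' is an integer" is not disturbed by floating-point rounding; the function name `binomial_modes` and the choice of `Fraction` are assumptions made for illustration.

```python
from fractions import Fraction
from math import floor


def binomial_modes(n: int, p) -> list[int]:
    """Return the mode(s) of B(n, p), following the cases described above."""
    p = Fraction(p)  # exact arithmetic; pass Fraction(1, 2) rather than 0.5 where possible
    if p == 0:
        return [0]
    if p == 1:
        return [n]
    t = (n + 1) * p
    if t.denominator == 1:  # (n + 1)p is an integer: two adjacent modes
        return [int(t) - 1, int(t)]
    return [floor(t)]
```

For example, B(5, 1/2) has (''n'' + 1)''p'' = 3 and therefore the two modes 2 and 3, while B(6, 1/2) has the single mode 3.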

−

~~==[[用户:南风|南风]]（[[用户讨论:南风|讨论]]）[翻译]存疑“the floor function”~~

: <math>\text{mode} =

第560行：第569行：

Proof: Let

−

证据: 让

+

证明: 令

第645行：第654行：

===Median===

+

中位数

In general, there is no single formula to find the [[median]] for a binomial distribution, and it may even be non-unique. However several special results have been established:

第653行：第664行：

* If ''np'' is an integer, then the mean, median, and mode coincide and equal ''np''.<ref>{{cite journal|last=Neumann|first=P.|year=1966|title=Über den Median der Binomial- and Poissonverteilung|journal=Wissenschaftliche Zeitschrift der Technischen Universität Dresden|volume=19|pages=29–33|language=German}}</ref><ref>Lord, Nick. (July 2010). "Binomial averages when the mean is an integer", [[The Mathematical Gazette]] 94, 331-332.</ref>

+

* 如果 ''np'' 是整数，则均值、中位数和众数相同，且都等于 ''np''。

* Any median ''m'' must lie within the interval ⌊''np''⌋ ≤ ''m'' ≤ ⌈''np''⌉.<ref name="KaasBuhrman">{{cite journal|first1=R.|last1=Kaas|first2=J.M.|last2=Buhrman|title=Mean, Median and Mode in Binomial Distributions|journal=Statistica Neerlandica|year=1980|volume=34|issue=1|pages=13–18|doi=10.1111/j.1467-9574.1980.tb00681.x}}</ref>

+

* 任何中位数 ''m'' 都必须位于区间 ⌊''np''⌋ ≤ ''m'' ≤ ⌈''np''⌉ 内。

* A median ''m'' cannot lie too far away from the mean: {{nowrap||''m'' − ''np''| ≤ min{ ln 2, max{''p'', 1 − ''p''} }}}.<ref name="Hamza">{{Cite journal

+

* 中位数 ''m'' 不能离均值太远: |''m'' − ''np''| ≤ min{ ln 2, max{''p'', 1 − ''p''} }。

| last1 = Hamza | first1 = K.

第674行：第691行：

where D(a || p) is the relative entropy between an a-coin and a p-coin (i.e. between the Bernoulli(a) and Bernoulli(p) distribution):

−

其中 d (a | | p)是a-coin，p-~~coin 相对熵之间的距离。在伯努利~~(a)和伯努利(p)分布之间:

+

其中 D(a || p) 是 a-硬币与 p-硬币之间（即伯努利(a)分布与伯努利(p)分布之间）的相对熵:

==~~~[翻译]存疑“a-coin，p-coin”

第691行：第708行：

Asymptotically, this bound is reasonably tight; see

−

~~从渐近的角度来看，这个约束是相当严格~~; 参见

+

渐近地看，这个界相当紧; 参见

}}</ref>

第700行：第717行：

* The median is unique and equal to ''m'' = [[Rounding|round]](''np'') when |''m'' − ''np''| ≤ min{''p'', 1 − ''p''} (except for the case when ''p'' = {{sfrac|1|2}} and ''n'' is odd).<ref name="KaasBuhrman"/>

+

* 当 |''m'' − ''np''| ≤ min{''p'', 1 − ''p''} 时，中位数唯一且等于 ''m'' = round(''np'')（''p'' = 1/2 且 ''n'' 为奇数的情形除外）。

which implies the simpler but looser bound

−

~~这意味着更简单但更松散的约束~~

+

这意味着更简单但更松散的界限

* When ''p'' = 1/2 and ''n'' is odd, any number ''m'' in the interval {{sfrac|1|2}}(''n'' − 1) ≤ ''m'' ≤ {{sfrac|1|2}}(''n'' + 1) is a median of the binomial distribution. If ''p'' = 1/2 and ''n'' is even, then ''m'' = ''n''/2 is the unique median.

第720行：第739行：

===Tail bounds===

+

尾部边界

For ''k'' ≤ ''np'', upper bounds can be derived for the lower tail of the cumulative distribution function <math>F(k;n,p) = \Pr(X \le k)</math>, the probability that there are at most ''k'' successes. Since <math>\Pr(X \ge k) = F(n-k;n,1-p) </math>, these bounds can also be seen as bounds for the upper tail of the cumulative distribution function for ''k'' ≥ ''np''.

+

对于 ''k'' ≤ ''np''，可以推导出累积分布函数下尾 <math>F(k;n,p) = \Pr(X \le k)</math>（即至多有 ''k'' 次成功的概率）的上界。由于 <math>\Pr(X \ge k) = F(n-k;n,1-p) </math>，这些界也可以看作是 ''k'' ≥ ''np'' 时累积分布函数上尾的界。

F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \!

第731行：第754行：

[[Hoeffding's inequality]] yields the simple bound

+

[[Hoeffding's不等式]]得到简单的边界

第738行：第762行：

which is however not very tight. In particular, for ''p'' = 1, we have that ''F''(''k'';''n'',''p'') = 0 (for fixed ''k'', ''n'' with ''k'' < ''n''), but Hoeffding's bound evaluates to a positive constant.

+

然而，这个界并不是很紧。特别是，对于 ''p'' = 1，我们有 ''F''(''k'';''n'',''p'') = 0（对于固定的 ''k''、''n''，且 ''k'' < ''n''），但 Hoeffding 界的值却是一个正的常数。
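The Hoeffding bound itself falls outside the lines shown in this diff. As a sketch, assuming its standard form F(k; n, p) ≤ exp(−2n(p − k/n)²) for ''k'' ≤ ''np'', the looseness described above can be checked numerically; both function names are illustrative.

```python
from math import comb, exp


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact lower tail Pr(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def hoeffding_bound(k: int, n: int, p: float) -> float:
    """Assumed standard Hoeffding upper bound on F(k; n, p), for k <= n * p."""
    return exp(-2 * n * (p - k / n) ** 2)


# With p = 1 and k < n the exact tail is 0, but the bound stays positive.
exact_at_one = binomial_cdf(5, 10, 1.0)
bound_at_one = hoeffding_bound(5, 10, 1.0)
```

Here `exact_at_one` is 0 while `bound_at_one` equals exp(−5), a small but strictly positive constant.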

When n is known, the parameter p can be estimated using the proportion of successes: \widehat{p} = \frac{x}{n}. This estimator is found using maximum likelihood estimator and also the method of moments. This estimator is unbiased and uniformly with minimum variance, proven using Lehmann–Scheffé theorem, since it is based on a minimal sufficient and complete statistic (i.e.: x). It is also consistent both in probability and in MSE.

−

当 n 已知时，参数 p 可以使用成功的比例来估计: widehat { p } = frac { x }{ n }~~。利用'''极大似然估计'''和'''矩方法'''求出了该估计量。利用~~ '''Lehmann-scheffé 定理'''~~证明了该估计量的'''无偏一致最小方差'''，因为该估计量是基于一个极小充分完全统计量~~(即:。: x).它在概率和均方误差方面也是一致的。

+

当 ''n'' 已知时，参数 ''p'' 可以用成功的比例来估计: <math>\widehat{p} = \frac{x}{n}</math>。该估计量可由极大似然估计和矩方法求得。由 '''Lehmann–Scheffé 定理'''可以证明该估计量是无偏的且具有一致最小方差，因为它基于一个极小充分完全统计量（即 ''x''）。它在依概率意义和均方误差意义下也都是相合的。

A sharper bound can be obtained from the [[Chernoff bound]]:<ref name="ag">{{cite journal |first1=R. |last1=Arratia |first2=L. |last2=Gordon |title=Tutorial on large deviations for the binomial distribution |journal=Bulletin of Mathematical Biology |volume=51 |issue=1 |year=1989 |pages=125–131 |doi=10.1007/BF02458840 |pmid=2706397 |s2cid=189884382 }}</ref>

+

可以从[[Chernoff界]]中得到一个更紧的界:

A closed form Bayes estimator for p also exists when using the Beta distribution as a conjugate prior distribution. When using a general \operatorname{Beta}(\alpha, \beta) as a prior, the posterior mean estimator is: \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}. The Bayes estimator is asymptotically efficient and as the sample size approaches infinity (n → ∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors), admissible and consistent in probability.

第757行：第785行：

For the special case of using the standard uniform distribution as a non-informative prior (\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)), the posterior mean estimator becomes \widehat{p_b} = \frac{x+1}{n+2} (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.

−

对于使用标准均匀分布作为先验概率的特殊情况(操作者名{ Beta }(alpha = 1，Beta = 1) = u (0,1)) ，后验均值估计变为广义{ p _ b } = frac { x + 1}{ n + 2}(后验模式应该只导致标准估计)~~。这种方法被称为继承法则，它是在18世纪由皮埃尔~~-西蒙·拉普拉斯引进的。

+

对于使用标准均匀分布作为无信息先验的特殊情况（<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>），后验均值估计量变为 <math>\widehat{p_b} = \frac{x+1}{n+2}</math>（后验众数则恰好给出标准估计量）。这种方法被称为'''继承法则'''，由皮埃尔-西蒙·拉普拉斯在18世纪提出。

第774行：第802行：

Asymptotically, this bound is reasonably tight; see <ref name="ag"/> for details.

+

渐近地讲，这个界相当紧；详见<ref name="ag"/>。

第782行：第812行：

One can also obtain ''lower'' bounds on the tail <math>F(k;n,p) </math>, known as anti-concentration bounds. By approximating the binomial coefficient with Stirling's formula it can be shown that<ref>{{cite book |author1=Robert B. Ash |title=Information Theory |url=https://archive.org/details/informationtheor00ashr |url-access=limited |date=1990 |publisher=Dover Publications |page=[https://archive.org/details/informationtheor00ashr/page/n81 115]}}</ref>

+

我们还可以得到尾部 <math>F(k;n,p) </math> 的''下''界，称为'''反集中界'''。通过用斯特林公式近似二项式系数，可以证明

:<math> F(k;n,p) \geq \frac{1}{\sqrt{8n\tfrac{k}{n}(1-\tfrac{k}{n})}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right),</math>

第790行：第822行：

which implies the simpler but looser bound

+

这意味着一个更简单但更松的界

:<math> F(k;n,p) \geq \frac1{\sqrt{2n}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right).</math>
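The two lower bounds can be compared against the exact tail numerically. This sketch assumes the usual Bernoulli relative entropy D(a ∥ p) = a ln(a/p) + (1 − a) ln((1 − a)/(1 − p)) and requires 0 < ''k'' < ''n''; the helper names are hypothetical.

```python
from math import comb, exp, log, sqrt


def relative_entropy(a: float, p: float) -> float:
    """D(a || p) between Bernoulli(a) and Bernoulli(p), for 0 < a < 1 and 0 < p < 1."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))


def lower_bounds(k: int, n: int, p: float) -> tuple[float, float]:
    """Return (sharper, looser) anti-concentration lower bounds on F(k; n, p)."""
    a = k / n
    decay = exp(-n * relative_entropy(a, p))
    sharper = decay / sqrt(8 * n * a * (1 - a))
    looser = decay / sqrt(2 * n)
    return sharper, looser


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact lower tail Pr(X <= k), used only for comparison."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


sharper, looser = lower_bounds(5, 20, 0.5)
exact = binomial_cdf(5, 20, 0.5)
```

The looser bound follows from the sharper one because (''k''/''n'')(1 − ''k''/''n'') ≤ 1/4, so 8''n''(''k''/''n'')(1 − ''k''/''n'') ≤ 2''n''.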

第808行：第842行：

== Statistical Inference ==

+

统计推断

A continuity correction of 0.5/n may be added.

第814行：第850行：

=== Estimation of parameters ===

+

参数估计

+

beta分布贝叶斯推断

When ''n'' is known, the parameter ''p'' can be estimated using the proportion of successes: <math> \widehat{p} = \frac{x}{n}.</math> This estimator is found using [[maximum likelihood estimator]] and also the [[method of moments (statistics)|method of moments]]. This estimator is [[Bias of an estimator|unbiased]] and uniformly with [[Minimum-variance unbiased estimator|minimum variance]], proven using [[Lehmann–Scheffé theorem]], since it is based on a [[Minimal sufficient|minimal sufficient]] and [[Completeness (statistics)|complete]] statistic (i.e.: ''x''). It is also [[Consistent estimator|consistent]] both in probability and in [[Mean squared error|MSE]].

+

当 ''n'' 已知时，参数 ''p'' 可以用成功的比例来估计: <math> \widehat{p} = \frac{x}{n}.</math> 这个估计量可由[[最大似然估计法]]和[[矩量法（统计学）|矩量法]]得到。它是[[估计的偏倚|无偏]]的，并且具有一致[[最小方差无偏估计|最小方差]]，这可以用[[Lehmann-Scheffé定理]]证明，因为它基于一个[[最小充分性|最小充分]]且[[完全性（统计学）|完全]]的统计量（即 ''x''）。它在依概率意义和[[平均平方误差|MSE]]意义下也都是[[一致估计|相合]]的。

第828行：第870行：

A closed form [[Bayes estimator]] for ''p'' also exists when using the [[Beta distribution]] as a [[Conjugate prior|conjugate]] [[prior distribution]]. When using a general <math>\operatorname{Beta}(\alpha, \beta)</math> as a prior, the [[Bayes estimator#Posterior mean|posterior mean]] estimator is: <math> \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>. The Bayes estimator is [[Asymptotic efficiency (Bayes)|asymptotically efficient]] and as the sample size approaches infinity (''n'' → ∞), it approaches the [[Maximum likelihood estimation|MLE]] solution. The Bayes estimator is [[Bias of an estimator|biased]] (how much depends on the priors), [[Bayes estimator#Admissibility|admissible]] and [[Consistent estimator|consistent]] in probability.

+

当使用[[Beta分布]]作为[[共轭先验|共轭]][[先验分布]]时，''p'' 也存在闭合形式的[[贝叶斯估计]]。当使用一般的 <math>\operatorname{Beta}(\alpha, \beta)</math> 作为先验时，[[贝叶斯估计#后验均值|后验均值]]估计量为: <math> \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta}</math>。贝叶斯估计量是[[渐进效率(Bayes)|渐近有效]]的，当样本量趋于无穷大（''n'' → ∞）时，它趋近于[[最大似然估计|MLE]]解。贝叶斯估计量是[[估计的偏倚|有偏]]的（偏差大小取决于先验），是[[贝叶斯估计#可接受性|可容许]]的，并且依概率[[一致性估计|相合]]。

第837行：第881行：

For the special case of using the [[Standard uniform distribution|standard uniform distribution]] as a [[non-informative prior]] (<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>), the posterior mean estimator becomes <math> \widehat{p_b} = \frac{x+1}{n+2}</math> (a [[Bayes estimator#Posterior mode|posterior mode]] should just lead to the standard estimator). This method is called the [[rule of succession]], which was introduced in the 18th century by [[Pierre-Simon Laplace]].

+

对于使用[[标准均匀分布|标准均匀分布]]作为[[非信息先验]]的特殊情况（<math>\operatorname{Beta}(\alpha=1, \beta=1) = U(0,1)</math>），后验均值估计量变为 <math> \widehat{p_b} = \frac{x+1}{n+2}</math>（[[贝叶斯估计#后模|后验众数]]则恰好给出标准估计量）。这种方法被称为[[继承规则]]，由[[Pierre-Simon Laplace]]在18世纪引入。

第845行：第890行：

When estimating ''p'' with very rare events and a small ''n'' (e.g. if ''x'' = 0), then using the standard estimator leads to <math> \widehat{p} = 0,</math> which sometimes is unrealistic and undesirable. In such cases there are various alternative estimators.<ref>{{cite journal |last=Razzaghi |first=Mehdi |title=On the estimation of binomial success probability with zero occurrence in sample |journal=Journal of Modern Applied Statistical Methods |volume=1 |issue=2 |year=2002 |pages=326–332 |doi=10.22237/jmasm/1036110000 |doi-access=free }}</ref> One way is to use the Bayes estimator, leading to <math> \widehat{p_b} = \frac{1}{n+2}</math>. Another method is to use the upper bound of the [[confidence interval]] obtained using the [[Rule of three (statistics)|rule of three]]: <math> \widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>.

+

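当事件非常罕见且 ''n'' 很小时（例如 ''x'' = 0），用标准估计量估计 ''p'' 会得到 <math> \widehat{p} = 0,</math> 这有时是不现实的，也是不可取的。在这种情况下，有多种替代估计量。一种方法是使用贝叶斯估计量，得到 <math> \widehat{p_b} = \frac{1}{n+2}</math>。另一种方法是使用由三法则（rule of three）得到的置信区间上界: <math> \widehat{p_{\text{rule of 3}}} = \frac{3}{n}</math>。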
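The three point estimates discussed in this section can be placed side by side. The following is only a sketch: the function names are hypothetical, and the Beta prior parameters default to the uniform case α = β = 1.

```python
def mle(x: int, n: int) -> float:
    """Standard estimator: the observed proportion of successes."""
    return x / n


def posterior_mean(x: int, n: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior mean under a Beta(alpha, beta) prior; uniform prior by default."""
    return (x + alpha) / (n + alpha + beta)


def rule_of_three_upper(n: int) -> float:
    """Upper confidence bound 3/n, intended for the case of zero observed successes."""
    return 3 / n


# Zero successes in ten trials: the standard estimator collapses to 0.
x, n = 0, 10
estimates = (mle(x, n), posterior_mean(x, n), rule_of_three_upper(n))
```

For ''x'' = 0 and ''n'' = 10 the three values are 0, 1/12 and 3/10 respectively, which shows how the alternatives avoid an estimate of exactly zero.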

=== Confidence intervals ===

+

置信区间

第858行：第906行：

Even for quite large values of ''n'', the actual distribution of the mean is significantly nonnormal.<ref name=Brown2001>{{Citation |first1=Lawrence D. |last1=Brown |first2=T. Tony |last2=Cai |first3=Anirban |last3=DasGupta |year=2001 |title = Interval Estimation for a Binomial Proportion |url=http://www-stat.wharton.upenn.edu/~tcai/paper/html/Binomial-StatSci.html |journal=Statistical Science |volume=16 |issue=2 |pages=101–133 |access-date = 2015-01-05 |doi=10.1214/ss/1009213286|citeseerx=10.1.1.323.7752 }}</ref> Because of this problem several methods to estimate confidence intervals have been proposed.

+

即使对于相当大的''n''值，平均数的实际分布也是显著非正态的，由于这个问题，人们提出了几种估计置信区间的方法。

In the equations for confidence intervals below, the variables have the following meaning:

+

在下面的置信区间公式中，变量具有以下含义

* ''n''<sub>1</sub> is the number of successes out of ''n'', the total number of trials

+

* ''n''<sub>1</sub> 是 ''n'' 次试验中的成功次数，其中 ''n'' 为试验总次数。

* <math> \widehat{p\,} = \frac{n_1}{n}</math> is the proportion of successes

+

* <math> \widehat{p\,} = \frac{n_1}{n}</math> 是成功的比例。

The notation in the formula below differs from the previous formulas in two respects:

第876行：第932行：

==== Wald method ====

+

Wald法

:: <math> \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } .</math>
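As a sketch of the Wald formula, assuming ''z'' is the 1 − α/2 quantile of the standard normal distribution (the definition of ''z'' is not part of this excerpt), the interval can be computed with the standard library alone. The function name `wald_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(n1: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wald interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # assumed meaning of z
    p_hat = n1 / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


low, high = wald_interval(50, 100)
degenerate = wald_interval(0, 10)  # zero successes: the interval has zero width
```

The zero-width interval for ''n''<sub>1</sub> = 0 is one simple illustration of why alternatives to the Wald method have been proposed.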

第944行：第1,002行：

==== Arcsine method ====

+

反正弦法

Let X ~ B(n,p1) and Y ~ B(m,p2) be independent. Let T = (X/n)/(Y/m).

第955行：第1,015行：

Then log(T) is approximately normally distributed with mean log(p1/p2) and variance ((1/p1) − 1)/n + ((1/p2) − 1)/m.

−

~~然后对数~~(t)~~近似呈正态分布，其中平均对数~~(p1/p2)和方差((1/p1)-1)/n + ((1/p2)-1)/m。

+

则 log(T) 近似服从正态分布，均值为 log(p1/p2)，方差为 ((1/p1) − 1)/n + ((1/p2) − 1)/m。

: <math>\sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right).</math>
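The arcsine formula translates almost literally into code. In this sketch the default ''z'' = 1.96 corresponds to a 95% level, and the clamping of the angle to [0, π/2] is an added assumption (not stated in the text) to keep the endpoints ordered when the estimate is near 0 or 1.

```python
from math import asin, pi, sin, sqrt


def arcsine_interval(n1: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Interval sin^2(arcsin(sqrt(p_hat)) +/- z / (2 * sqrt(n)))."""
    center = asin(sqrt(n1 / n))
    half_width = z / (2 * sqrt(n))
    # Assumption: clamp the angles so that sin^2 stays monotone on the range used.
    lower_angle = max(0.0, center - half_width)
    upper_angle = min(pi / 2, center + half_width)
    return sin(lower_angle) ** 2, sin(upper_angle) ** 2


low, high = arcsine_interval(50, 100)
```

For a sample proportion of exactly 1/2 the interval is symmetric about 1/2, since sin²(π/4 ± h) = (1 ± sin 2h)/2.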

第962行：第1,022行：

==== Wilson (score) method ====

+

威尔逊法

If X ~ B(n, p) and Y | X ~ B(X, q) (the conditional distribution of Y, given X), then Y is a simple binomial random variable with distribution Y ~ B(n, pq).

第976行：第1,038行：

The notation in the formula below differs from the previous formulas in two respects:<ref name="Wilson1927">{{Citation |last = Wilson |first=Edwin B. |date = June 1927 |title = Probable inference, the law of succession, and statistical inference |url = http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |journal = Journal of the American Statistical Association |volume=22 |issue=158 |pages=209–212 |access-date= 2015-01-05 |doi = 10.2307/2276774 |url-status=dead |archive-url = https://web.archive.org/web/20150113082307/http://psych.stanford.edu/~jlm/pdfs/Wison27SingleProportion.pdf |archive-date = 2015-01-13 |jstor = 2276774 }}</ref>

+

下面的公式中的符号与前面的公式有两个不同之处

* Firstly, ''z''<sub>''x''</sub> has a slightly different interpretation in the formula below: it has its ordinary meaning of 'the ''x''th quantile of the standard normal distribution', rather than being a shorthand for 'the (1 − ''x'')-th quantile'.

+

* 首先，''z''<sub>''x''</sub> 在下式中的解释略有不同：它取其通常含义“标准正态分布的第 ''x'' 分位数”，而不是“第 (1 − ''x'') 分位数”的简写。

+

* Secondly, this formula does not use a plus-minus to define the two bounds. Instead, one may use <math>z = z_{\alpha / 2}</math> to get the lower bound, or use <math>z = z_{1 - \alpha/2}</math> to get the upper bound. For example: for a 95% confidence level the error <math>\alpha</math> = 0.05, so one gets the lower bound by using <math>z = z_{\alpha/2} = z_{0.025} = - 1.96</math>, and one gets the upper bound by using <math>z = z_{1 - \alpha/2} = z_{0.975} = 1.96</math>.

+

* 其次，这个公式没有使用加减号来定义两个界限。相反，可以使用 <math>z = z_{\alpha / 2}</math> 得到下限，或者使用 <math>z = z_{1 - \alpha/2}</math> 得到上限。例如：对于95%的置信水平，误差 <math>\alpha</math> = 0.05，所以用 <math>z = z_{\alpha/2} = z_{0.025} = - 1.96</math> 得到下限，用 <math>z = z_{1 - \alpha/2} = z_{0.975} = 1.96</math> 得到上限。

+

Since X \sim B(n, p) and Y \sim B(X, q) , by the law of total probability,

第1,104行：第1,174行：

The Wald method, although commonly recommended in textbooks, is the most biased.{{clarify|reason=what sense of bias is this|date=July 2012}}

+

Wald法虽然在教科书中常被推荐，但却是偏差最大的方法。

第1,112行：第1,184行：

==Related distributions==

+

相关分布

===Sums of binomials===

+

二项式之和

The binomial distribution is a special case of the Poisson binomial distribution, or general binomial distribution, which is the distribution of a sum of n independent non-identical Bernoulli trials B(p<sub>i</sub>).

第1,122行：第1,198行：

If ''X'' ~ B(''n'', ''p'') and ''Y'' ~ B(''m'', ''p'') are independent binomial variables with the same probability ''p'', then ''X'' + ''Y'' is again a binomial variable; its distribution is ''Z=X+Y'' ~ B(''n+m'', ''p''):

+

如果''X'' ~ B(''n'', ''p'')和''Y'' ~ B(''m'', ''p'')是独立的二项式变量，概率''p''相同，那么''X''  + ''Y''又是一个二项式变量，其分布是''Z=X+Y'' ~ B(''n+m'', ''p'')。
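The closure of the binomial family under sums with a common ''p'' can be checked by direct convolution. This is a numerical sanity check under illustrative parameter values, not a proof; the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ B(n, p); zero outside the support."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def sum_pmf(k: int, n: int, m: int, p: float) -> float:
    """Pr(X + Y = k) for independent X ~ B(n, p), Y ~ B(m, p), by convolution."""
    return sum(binomial_pmf(i, n, p) * binomial_pmf(k - i, m, p) for i in range(k + 1))


# Largest discrepancy between the convolution and B(n + m, p) over the support.
n, m, p = 4, 6, 0.3
max_gap = max(abs(sum_pmf(k, n, m, p) - binomial_pmf(k, n + m, p)) for k in range(n + m + 1))
```

The gap is at the level of floating-point rounding, consistent with ''X'' + ''Y'' ~ B(''n'' + ''m'', ''p'').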

+

第1,144行：第1,223行：

However, if ''X'' and ''Y'' do not have the same probability ''p'', then the variance of the sum will be [[Binomial sum variance inequality|smaller than the variance of a binomial variable]] distributed as <math>B(n+m, \bar{p}).\,</math>

+

但是，如果 ''X'' 和 ''Y'' 的概率 ''p'' 不相同，那么和的方差将[[二项式和方差不等式|小于二项式变量的方差]]，该二项式变量服从 <math>B(n+m, \bar{p})\,</math> 分布。

\mathcal{N}(np,\,np(1-p)),

第1,152行：第1,233行：

===Ratio of two binomial distributions===

+

两个二项分布的比值

and this basic approximation can be improved in a simple way by using a suitable continuity correction.

第1,164行：第1,247行：

This result was first derived by Katz and coauthors in 1978.<ref name=Katz1978>{{cite journal |last1=Katz |first1=D. |displayauthors=1 |first2=J. |last2=Baptista |first3=S. P. |last3=Azen |first4=M. C. |last4=Pike |year=1978 |title=Obtaining confidence intervals for the risk ratio in cohort studies |journal=Biometrics |volume=34 |issue=3 |pages=469–474 |doi=10.2307/2530610 |jstor=2530610 }}</ref>

+

这个结果最早是由Katz和合作者在1978年得出的。

第1,176行：第1,261行：

Then log(''T'') is approximately normally distributed with mean log(''p''<sub>1</sub>/''p''<sub>2</sub>) and variance ((1/''p''<sub>1</sub>) − 1)/''n'' + ((1/''p''<sub>2</sub>) − 1)/''m''.

+

则 log(''T'') 近似服从正态分布，均值为 log(''p''<sub>1</sub>/''p''<sub>2</sub>)，方差为 ((1/''p''<sub>1</sub>) − 1)/''n'' + ((1/''p''<sub>2</sub>) − 1)/''m''。

+

===Conditional binomials===

+

条件二项式

If ''X'' ~ B(''n'', ''p'') and ''Y'' | ''X'' ~ B(''X'', ''q'') (the conditional distribution of ''Y'', given ''X''), then ''Y'' is a simple binomial random variable with distribution ''Y'' ~ B(''n'', ''pq'').

+

如果 ''X'' ~ B(''n'', ''p'') 且 ''Y'' | ''X'' ~ B(''X'', ''q'')（给定 ''X'' 时 ''Y'' 的条件分布），则 ''Y'' 是一个简单的二项式随机变量，其分布为 ''Y'' ~ B(''n'', ''pq'')。

+

The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed or at least p tends to zero. Therefore, the Poisson distribution with parameter λ = np can be used as an approximation to B(n, p) of the binomial distribution if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.
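How well the Poisson approximation works can be gauged by summing the absolute differences between the two mass functions. This is a sketch with illustrative parameters: the first case satisfies the rule of thumb ''n'' ≥ 100 and ''np'' ≤ 10, the second does not.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Pr(N = k) for N ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def total_abs_difference(n: int, p: float) -> float:
    """Sum over k = 0..n of |binomial pmf - Poisson pmf| with lambda = n * p."""
    lam = n * p
    return sum(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))


good = total_abs_difference(100, 0.05)  # n >= 100 and np = 5 <= 10
poor = total_abs_difference(100, 0.5)   # p is far from small
```

The discrepancy is much smaller in the first case, in line with the requirement that ''p'' be small for the approximation to be useful.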

第1,190行：第1,283行：

For example, imagine throwing ''n'' balls to a basket ''UX'' and taking the balls that hit and throwing them to another basket ''UY''. If ''p'' is the probability to hit ''UX'' then ''X'' ~ B(''n'', ''p'') is the number of balls that hit ''UX''. If ''q'' is the probability to hit ''UY'' then the number of balls that hit ''UY'' is ''Y'' ~ B(''X'', ''q'') and therefore ''Y'' ~ B(''n'', ''pq'').

+

例如，想象将''n''个球扔到一个篮子里''UX''，然后把击中的球扔到另一个篮子里''UY''。如果''p''是击中''UX''的概率，那么''X'' ~ B(''n'', ''p'')就是击中''UX''的球数。如果''q''是击中''UY''的概率，那么击中''UY''的球数是''Y'' ~ B(''X'', ''q'')，因此''Y'' ~ B(''n'', ''pq'')。
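The two-basket example can be verified numerically by applying the law of total probability over the intermediate count ''X''. The sketch below uses illustrative values for ''n'', ''p'' and ''q''; the function names are hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ B(n, p); zero outside the support."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def marginal_pmf_y(m: int, n: int, p: float, q: float) -> float:
    """Pr(Y = m) = sum over k of Pr(X = k) * Pr(Y = m | X = k)."""
    return sum(binomial_pmf(k, n, p) * binomial_pmf(m, k, q) for k in range(m, n + 1))


# Compare the marginal of Y with B(n, pq) over the whole support.
n, p, q = 8, 0.6, 0.5
max_gap = max(abs(marginal_pmf_y(m, n, p, q) - binomial_pmf(m, n, p * q)) for m in range(n + 1))
```

The agreement up to rounding error matches the claim that ''Y'' ~ B(''n'', ''pq'').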

+

Concerning the accuracy of Poisson approximation, see Novak, ch. 4, and references therein.

第1,213行：第1,309行：

Given a uniform prior, the posterior distribution for the probability of success given independent events with observed successes is a beta distribution.

−

~~给定一个统一的先验，给定观察到成功结果的独立事件成功概率的后验分布是一个~~ beta分布。

+

给定均匀先验，在观察到若干次成功的独立事件之后，成功概率的后验分布是一个 beta 分布。

\end{align}</math>

Since <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> the equation above can be expressed as

+

由于 <math>\tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m},</math> 上式可表示为

:<math> \Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} </math>

Factoring <math> p^k = p^m p^{k-m} </math> and pulling all the terms that don't depend on <math> k </math> out of the sum now yields

+

将 <math> p^k = p^m p^{k-m} </math> 进行分解，并将所有不依赖于 <math> k </math> 的项从总和中抽出，即可得到

Methods for random number generation where the marginal distribution is a binomial distribution are well-established.

第1,240行：第1,340行：

After substituting <math> i = k - m </math> in the expression above, we get

+

将 <math> i = k - m </math> 代入上述表达式后，我们得到了

:<math> \Pr[Y = m] = \binom{n}{m} (pq)^m \left( \sum_{i=0}^{n-m} \binom{n-m}{i} (p - pq)^i (1-p)^{n-m - i} \right) </math>

第1,248行：第1,350行：

Notice that the sum (in the parentheses) above equals <math> (p - pq + 1 - p)^{n-m} </math> by the [[binomial theorem]]. Substituting this in finally yields

+

请注意，由[[二项式定理]]，上述括号内的和等于 <math> (p - pq + 1 - p)^{n-m} </math>。将此代入最终得到

+

:<math>\begin{align}

第1,268行：第1,373行：

The [[Bernoulli distribution]] is a special case of the binomial distribution, where ''n'' = 1. Symbolically, ''X'' ~ B(1, ''p'') has the same meaning as ''X'' ~ Bernoulli(''p''). Conversely, any binomial distribution, B(''n'', ''p''), is the distribution of the sum of ''n'' [[Bernoulli trials]], Bernoulli(''p''), each with the same probability ''p''.<ref>{{cite web|last1=Taboga|first1=Marco|title=Lectures on Probability Theory and Mathematical Statistics|url=https://www.statlect.com/probability-distributions/binomial-distribution#hid3|website=statlect.com|accessdate=18 December 2017}}</ref>

+

[[伯努利分布]]是二项分布的特例，其中 ''n'' = 1。从符号上看，''X'' ~ B(1, ''p'') 与 ''X'' ~ 伯努利(''p'') 具有相同的意义。反之，任何二项分布 B(''n'', ''p'') 都是 ''n'' 个成功概率同为 ''p'' 的[[伯努利试验]]伯努利(''p'') 之和的分布。

第1,276行：第1,383行：

The binomial distribution is a special case of the [[Poisson binomial distribution]], or [[general binomial distribution]], which is the distribution of a sum of ''n'' independent non-identical [[Bernoulli trials]] B(''p<sub>i</sub>'').<ref>

+

二项分布是[[泊松二项分布]]或[[广义二项分布]]的特例，后者是 ''n'' 个独立但不同分布的[[伯努利试验]] B(''p<sub>i</sub>'') 之和的分布。

+

{{Cite journal

第1,325行：第1,435行：

Category:Conjugate prior distributions

−

范畴: 共轭先验分布

+

类别: 共轭先验分布

[[File:Binomial Distribution.svg|right|250px|thumb|Binomial [[probability mass function]] and normal [[probability density function]] approximation for ''n'' = 6 and ''p'' = 0.5]]

南风

3

个编辑