Binomial distribution


This entry was provisionally translated by 南风 and has been reviewed by Smile.


{{Probability distribution
 | name       = Binomial distribution
 | type       = mass
 | pdf_image  = Probability mass function for the binomial distribution
 | cdf_image  = Cumulative distribution function for the binomial distribution
 | notation   = [math]\displaystyle{ B(n,p) }[/math]
 | parameters = [math]\displaystyle{ n \in \{0, 1, 2, \ldots\} }[/math] – number of trials
[math]\displaystyle{ p \in [0,1] }[/math] – success probability for each trial
[math]\displaystyle{ q = 1 - p }[/math]
 | support    = [math]\displaystyle{ k \in \{0, 1, \ldots, n\} }[/math] – number of successes
 | pdf        = [math]\displaystyle{ \binom{n}{k} p^k q^{n-k} }[/math]
 | cdf        = [math]\displaystyle{ I_{q}(n - k, 1 + k) }[/math]
 | mean       = [math]\displaystyle{ np }[/math]
 | median     = [math]\displaystyle{ \lfloor np \rfloor }[/math] or [math]\displaystyle{ \lceil np \rceil }[/math]
 | mode       = [math]\displaystyle{ \lfloor (n + 1)p \rfloor }[/math] or [math]\displaystyle{ \lceil (n + 1)p \rceil - 1 }[/math]
 | variance   = [math]\displaystyle{ npq }[/math]
 | skewness   = [math]\displaystyle{ \frac{q-p}{\sqrt{npq}} }[/math]
 | kurtosis   = [math]\displaystyle{ \frac{1-6pq}{npq} }[/math]
 | entropy    = [math]\displaystyle{ \frac{1}{2} \log_2 (2\pi enpq) + O \left( \frac{1}{n} \right) }[/math]
in shannons. For nats, use the natural log in the log.
 | mgf        = [math]\displaystyle{ (q + pe^t)^n }[/math]
 | char       = [math]\displaystyle{ (q + pe^{it})^n }[/math]
 | pgf        = [math]\displaystyle{ G(z) = [q + pz]^n }[/math]
 | fisher     = [math]\displaystyle{ g_n(p) = \frac{n}{pq} }[/math]
(for fixed [math]\displaystyle{ n }[/math])
}}

File:Pascal's triangle; binomial distribution.svg
Binomial distribution for [math]\displaystyle{ p=0.5 }[/math] with n and k as in Pascal's triangle.

The probability that a ball in a Galton box with 8 layers (n = 8) ends up in the central bin (k = 4) is [math]\displaystyle{ 70/256 }[/math].

In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own boolean-valued outcome: success/yes/true/one (with probability p) or failure/no/false/zero (with probability q = 1 − p).



A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.



The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution remains a good approximation, and is widely used.



Definitions

Probability mass function



In general, if the random variable X follows the binomial distribution with parameters n ∈ ℕ and p ∈ [0,1], we write X ~ B(n, p). The probability of getting exactly k successes in n independent Bernoulli trials is given by the probability mass function:


[math]\displaystyle{ f(k,n,p) = \Pr(k;n,p) = \Pr(X = k) = \binom{n}{k}p^k(1-p)^{n-k} }[/math]



for k = 0, 1, 2, ..., n, where



[math]\displaystyle{ \binom{n}{k} =\frac{n!}{k!(n-k)!} }[/math]



is the binomial coefficient, hence the name of the distribution. The formula can be understood as follows: k successes occur with probability [math]\displaystyle{ p^k }[/math] and n − k failures occur with probability [math]\displaystyle{ (1-p)^{n-k} }[/math]. However, the k successes can occur anywhere among the n trials, and there are [math]\displaystyle{ \binom{n}{k} }[/math] different ways of distributing k successes in a sequence of n trials.


In creating reference tables for binomial distribution probability, usually the table is filled in up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as



[math]\displaystyle{ f(k,n,p)=f(n-k,n,1-p). }[/math]



Looking at the expression f(k, n, p) as a function of k, there is a k value that maximizes it. This k value can be found by calculating

[math]\displaystyle{ \frac{f(k+1,n,p)}{f(k,n,p)}=\frac{(n-k)p}{(k+1)(1-p)} }[/math]

and comparing it to 1. There is always an integer M that satisfies[1]



[math]\displaystyle{ (n+1)p-1 \leq M \lt (n+1)p. }[/math]



f(k, n, p) is monotone increasing for k < M and monotone decreasing for k > M, with the exception of the case where (n + 1)p is an integer. In this case, there are two values for which f is maximal: (n + 1)p and (n + 1)p − 1. M is the most probable outcome (that is, the most likely, although this can still be unlikely overall) of the Bernoulli trials and is called the mode.


Example

Suppose a biased coin comes up heads with probability 0.3 when tossed. The probability of seeing exactly 4 heads in 6 tosses is



[math]\displaystyle{ f(4,6,0.3) = \binom{6}{4}0.3^4 (1-0.3)^{6-4}= 0.059535. }[/math]
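The calculation above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice, not something defined in the article.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    if not 0 <= k <= n:
        return 0.0  # outside the support {0, ..., n}
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Biased-coin example: exactly 4 heads in 6 tosses with p = 0.3.
print(round(binomial_pmf(4, 6, 0.3), 6))  # 0.059535

# Symmetry f(k, n, p) = f(n - k, n, 1 - p), used when building reference tables.
print(abs(binomial_pmf(4, 6, 0.3) - binomial_pmf(2, 6, 0.7)) < 1e-12)
```

The same function also gives the Galton box value from the figure caption: `binomial_pmf(4, 8, 0.5)` equals 70/256.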



Cumulative distribution function



The cumulative distribution function can be expressed as:



[math]\displaystyle{ F(k;n,p) = \Pr(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} {n\choose i}p^i(1-p)^{n-i}, }[/math]



where [math]\displaystyle{ \lfloor k\rfloor }[/math] is the "floor" under k, i.e. the greatest integer less than or equal to k.
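As a rough illustration, the sum defining F(k; n, p) can be evaluated directly. The sketch below assumes plain summation is acceptable (it is fine for moderate n, but it is not a numerically careful routine for very large n); the name `binomial_cdf` is hypothetical.

```python
from math import comb, floor


def binomial_cdf(k: float, n: int, p: float) -> float:
    """Pr(X <= k) for X ~ B(n, p), summing the mass function up to floor(k)."""
    upper = min(floor(k), n)  # terms beyond n contribute nothing
    return sum(
        (comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(upper + 1)),
        0.0,
    )


# At most 4 heads in 6 tosses of the biased coin with p = 0.3.
print(binomial_cdf(4, 6, 0.3))

# Non-integer arguments are floored: F(2.7) equals F(2).
print(binomial_cdf(2.7, 6, 0.3) == binomial_cdf(2, 6, 0.3))
```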



It can also be represented in terms of the regularized incomplete beta function, as follows:[3]



[math]\displaystyle{ \begin{align} F(k;n,p) & = \Pr(X \le k) \\ &= I_{1-p}(n-k, k+1) \\ & = (n-k) {n \choose k} \int_0^{1-p} t^{n-k-1} (1-t)^k \, dt. \end{align} }[/math]


which is equivalent to the cumulative distribution function of the F-distribution:[5]



[math]\displaystyle{ F(k;n,p) = F_{F\text{-distribution}}\left(x=\frac{1-p}{p}\frac{k+1}{n-k};d_1=2(n-k),d_2=2(k+1)\right). }[/math]



Some closed-form bounds for the cumulative distribution function are given below.



Properties

Expected value and variance



If X ~ B(n, p), that is, X is a binomially distributed random variable, n being the total number of experiments and p the probability of each experiment yielding a successful result, then the expected value of X is:[7]



[math]\displaystyle{ \operatorname{E}[X] = np. }[/math]


This follows from the linearity of the expected value along with the fact that X is the sum of n identically distributed Bernoulli random variables, each with expected value p. In other words, if [math]\displaystyle{ X_1, \ldots, X_n }[/math] are identically distributed (and independent) Bernoulli random variables with parameter p, then [math]\displaystyle{ X = X_1 + \cdots + X_n }[/math] and

[math]\displaystyle{ \operatorname{E}[X] = \operatorname{E}[X_1 + \cdots + X_n] = \operatorname{E}[X_1] + \cdots + \operatorname{E}[X_n] = p + \cdots + p = np. }[/math]



The variance is:


[math]\displaystyle{ \operatorname{Var}(X) = np(1 - p). }[/math]


This similarly follows from the fact that the variance of a sum of independent random variables is the sum of the variances.
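The two formulas can be sanity-checked by computing the mean and variance directly from the mass function. The following is only a numerical check under arbitrary illustrative parameters (n = 20, p = 0.3), not a proof; the helper names are hypothetical.

```python
from math import comb


def pmf(k: int, n: int, p: float) -> float:
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_and_variance(n: int, p: float) -> tuple[float, float]:
    """Mean and variance of B(n, p), summed over the whole support."""
    mean = sum(k * pmf(k, n, p) for k in range(n + 1))
    var = sum((k - mean) ** 2 * pmf(k, n, p) for k in range(n + 1))
    return mean, var


n, p = 20, 0.3  # arbitrary illustrative values
mean, var = mean_and_variance(n, p)
print(mean, n * p)            # both close to 6.0
print(var, n * p * (1 - p))   # both close to 4.2
```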



Higher moments

The first 6 central moments are given by

[math]\displaystyle{ \begin{align} \mu_1 &= 0, \\ \mu_2 &= np(1-p),\\ \mu_3 &= np(1-p)(1-2p),\\ \mu_4 &= np(1-p)(1+(3n-6)p(1-p)),\\ \mu_5 &= np(1-p)(1-2p)(1+(10n-12)p(1-p)),\\ \mu_6 &= np(1-p)(1-30p(1-p)(1-4p(1-p))+5np(1-p)(5-26p(1-p))+15n^2 p^2 (1-p)^2). \end{align} }[/math]
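The closed forms for the central moments can be compared against a direct summation over the support. The sketch below is a numerical cross-check only; the function names and the test parameters (n = 12, p = 0.3) are illustrative assumptions.

```python
from math import comb, isclose


def central_moment(r: int, n: int, p: float) -> float:
    """r-th central moment of B(n, p), computed directly from the mass function."""
    mean = n * p
    return sum(
        (k - mean) ** r * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def closed_form_moments(n: int, p: float) -> dict[int, float]:
    """Closed-form central moments mu_2 .. mu_6, transcribed from the list above."""
    v = p * (1 - p)
    return {
        2: n * v,
        3: n * v * (1 - 2 * p),
        4: n * v * (1 + (3 * n - 6) * v),
        5: n * v * (1 - 2 * p) * (1 + (10 * n - 12) * v),
        6: n * v * (1 - 30 * v * (1 - 4 * v)
                    + 5 * n * v * (5 - 26 * v)
                    + 15 * n**2 * v**2),
    }


n, p = 12, 0.3  # arbitrary illustrative values
for r, value in closed_form_moments(n, p).items():
    print(r, value, isclose(value, central_moment(r, n, p), rel_tol=1e-9))
```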


Mode


Usually the mode of a binomial B(n, p) distribution is equal to [math]\displaystyle{ \lfloor (n+1)p\rfloor }[/math], where [math]\displaystyle{ \lfloor\cdot\rfloor }[/math] is the floor function. However, when (n + 1)p is an integer and p is neither 0 nor 1, then the distribution has two modes: (n + 1)p and (n + 1)p − 1. When p is equal to 0 or 1, the mode will be 0 and n correspondingly. These cases can be summarized as follows:



[math]\displaystyle{ \text{mode} = \begin{cases} \lfloor (n+1)\,p\rfloor & \text{if }(n+1)p\text{ is 0 or a noninteger}, \\ (n+1)\,p\ \text{ and }\ (n+1)\,p - 1 &\text{if }(n+1)p\in\{1,\dots,n\}, \\ n & \text{if }(n+1)p = n + 1. \end{cases} }[/math]
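The case distinction above can be checked against a brute-force search for the maximum of the mass function. The sketch below uses exact rational arithmetic (`fractions.Fraction`) so that the test "(n + 1)p is an integer" is not disturbed by floating-point rounding; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, floor


def pmf(k: int, n: int, p: Fraction) -> Fraction:
    """Exact binomial probability mass function for rational p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def modes_by_search(n: int, p: Fraction) -> list[int]:
    """All k in {0, ..., n} at which the mass function attains its maximum."""
    values = [pmf(k, n, p) for k in range(n + 1)]
    best = max(values)
    return [k for k, v in enumerate(values) if v == best]


def modes_by_formula(n: int, p: Fraction) -> list[int]:
    """Modes according to the three cases summarized above."""
    t = (n + 1) * p
    if t == n + 1:                 # p = 1
        return [n]
    if t == 0 or t != floor(t):    # p = 0, or (n + 1)p is not an integer
        return [floor(t)]
    return [int(t) - 1, int(t)]    # (n + 1)p in {1, ..., n}: two modes


print(modes_by_formula(9, Fraction(1, 2)), modes_by_search(9, Fraction(1, 2)))
print(modes_by_formula(6, Fraction(3, 10)), modes_by_search(6, Fraction(3, 10)))
```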


Proof: Let



[math]\displaystyle{ f(k)=\binom nk p^k q^{n-k}. }[/math]



For [math]\displaystyle{ p=0 }[/math] only [math]\displaystyle{ f(0) }[/math] has a nonzero value with [math]\displaystyle{ f(0)=1 }[/math]. For [math]\displaystyle{ p=1 }[/math] we find [math]\displaystyle{ f(n)=1 }[/math] and [math]\displaystyle{ f(k)=0 }[/math] for [math]\displaystyle{ k\neq n }[/math]. This proves that the mode is 0 for [math]\displaystyle{ p=0 }[/math] and [math]\displaystyle{ n }[/math] for [math]\displaystyle{ p=1 }[/math].



Let [math]\displaystyle{ 0 \lt p \lt 1 }[/math]. We find



[math]\displaystyle{ \frac{f(k+1)}{f(k)} = \frac{(n-k)p}{(k+1)(1-p)} }[/math].



From this follows



[math]\displaystyle{ \begin{align} k \gt (n+1)p-1 \Rightarrow f(k+1) \lt f(k) \\ k = (n+1)p-1 \Rightarrow f(k+1) = f(k) \\ k \lt (n+1)p-1 \Rightarrow f(k+1) \gt f(k) \end{align} }[/math]


So when [math]\displaystyle{ (n+1)p-1 }[/math] is an integer, both [math]\displaystyle{ (n+1)p-1 }[/math] and [math]\displaystyle{ (n+1)p }[/math] are modes. In the case that [math]\displaystyle{ (n+1)p-1\notin \Z }[/math], only [math]\displaystyle{ \lfloor (n+1)p-1\rfloor+1=\lfloor (n+1)p\rfloor }[/math] is a mode.[8]


Median


In general, there is no single formula to find the median for a binomial distribution, and it may even be non-unique. However several special results have been established:


  • If np is an integer, then the mean, median, and mode coincide and equal np.[10][11]
  • Any median m must lie within the interval ⌊np⌋ ≤ m ≤ ⌈np⌉.[14]
  • A median m cannot lie too far away from the mean: |m − np| ≤ min{ ln 2, max{p, 1 − p} }.[15]
  • The median is unique and equal to m = round(np) when |m − np| ≤ min{p, 1 − p} (except for the case when p = 1/2 and n is odd).[14]
  • When p = 1/2 and n is odd, any number m in the interval (n − 1)/2 ≤ m ≤ (n + 1)/2 is a median of the binomial distribution. If p = 1/2 and n is even, then m = n/2 is the unique median.

Tail bounds


For k ≤ np, upper bounds can be derived for the lower tail of the cumulative distribution function [math]\displaystyle{ F(k;n,p) = \Pr(X \le k) }[/math], the probability that there are at most k successes. Since [math]\displaystyle{ \Pr(X \ge k) = F(n-k;n,1-p) }[/math], these bounds can also be seen as bounds for the upper tail of the cumulative distribution function for k ≥ np.


Hoeffding's inequality yields the simple bound



[math]\displaystyle{ F(k;n,p) \leq \exp\left(-2 n\left(p-\frac{k}{n}\right)^2\right), \! }[/math]


which is however not very tight. In particular, for p = 1, we have that F(k;n,p) = 0 (for fixed k, n with k < n), but Hoeffding's bound evaluates to a positive constant.



A sharper bound can be obtained from the Chernoff bound:[16]



[math]\displaystyle{ F(k;n,p) \leq \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right) }[/math]



where D(a || p) is the relative entropy between an a-coin and a p-coin (i.e. between the Bernoulli(a) and Bernoulli(p) distribution):



[math]\displaystyle{ D(a\parallel p)=(a)\log\frac{a}{p}+(1-a)\log\frac{1-a}{1-p}. \! }[/math]


Asymptotically, this bound is reasonably tight; see [16] for details.
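To get a feel for how the two upper bounds compare, the sketch below evaluates the exact lower tail together with the Hoeffding and Chernoff bounds for one illustrative choice of parameters (n = 100, p = 0.5, k = 30). It assumes 0 < p < 1 and uses natural logarithms in the relative entropy; all function names are hypothetical.

```python
from math import comb, exp, log


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact Pr(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def relative_entropy(a: float, p: float) -> float:
    """D(a || p) between Bernoulli(a) and Bernoulli(p), with 0 * log 0 taken as 0."""
    total = 0.0
    if a > 0:
        total += a * log(a / p)
    if a < 1:
        total += (1 - a) * log((1 - a) / (1 - p))
    return total


def hoeffding_bound(k: int, n: int, p: float) -> float:
    """Hoeffding upper bound on F(k; n, p), intended for k <= n * p."""
    return exp(-2 * n * (p - k / n) ** 2)


def chernoff_bound(k: int, n: int, p: float) -> float:
    """Chernoff upper bound on F(k; n, p), intended for k <= n * p."""
    return exp(-n * relative_entropy(k / n, p))


n, p, k = 100, 0.5, 30  # illustrative values with k <= n * p
print("exact    ", binomial_cdf(k, n, p))
print("Chernoff ", chernoff_bound(k, n, p))
print("Hoeffding", hoeffding_bound(k, n, p))
```

For these values the exact tail is below the Chernoff bound, which in turn is below the Hoeffding bound, consistent with the ordering described in the text.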


One can also obtain lower bounds on the tail [math]\displaystyle{ F(k;n,p) }[/math], known as anti-concentration bounds. By approximating the binomial coefficient with Stirling's formula it can be shown that[17]


[math]\displaystyle{ F(k;n,p) \geq \frac{1}{\sqrt{8n\tfrac{k}{n}(1-\tfrac{k}{n})}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right), }[/math]

which implies the simpler but looser bound

[math]\displaystyle{ F(k;n,p) \geq \frac1{\sqrt{2n}} \exp\left(-nD\left(\frac{k}{n}\parallel p\right)\right). }[/math]


For p = 1/2 and k ≥ 3n/8 for even n, it is possible to make the denominator constant:[19]


[math]\displaystyle{ F(k;n,\tfrac{1}{2}) \geq \frac{1}{15} \exp\left(- 16n \left(\frac{1}{2} -\frac{k}{n}\right)^2\right). \! }[/math]


Statistical Inference


Estimation of parameters

See also: Beta distribution, Bayesian inference


When n is known, the parameter p can be estimated using the proportion of successes: [math]\displaystyle{ \widehat{p} = \frac{x}{n}. }[/math] This estimator is found using maximum likelihood estimation and also the method of moments. It is unbiased and has uniformly minimum variance, as proven using the Lehmann–Scheffé theorem, since it is based on a minimal sufficient and complete statistic (i.e., x). It is also consistent both in probability and in MSE.



A closed form Bayes estimator for p also exists when using the Beta distribution as a conjugate prior distribution. When using a general [math]\displaystyle{ \operatorname{Beta}(\alpha, \beta) }[/math] as a prior, the posterior mean estimator is: [math]\displaystyle{ \widehat{p_b} = \frac{x+\alpha}{n+\alpha+\beta} }[/math]. The Bayes estimator is asymptotically efficient and as the sample size approaches infinity (n → ∞), it approaches the MLE solution. The Bayes estimator is biased (how much depends on the priors), admissible and consistent in probability.


For the special case of using the standard uniform distribution as a non-informative prior ([math]\displaystyle{ \operatorname{Beta}(\alpha=1, \beta=1) = U(0,1) }[/math]), the posterior mean estimator becomes [math]\displaystyle{ \widehat{p_b} = \frac{x+1}{n+2} }[/math] (a posterior mode should just lead to the standard estimator). This method is called the rule of succession, which was introduced in the 18th century by Pierre-Simon Laplace.
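The estimators described so far are simple enough to write down directly. The following is a minimal sketch; the function names are illustrative, and the sample counts (x successes out of n trials) are made-up values used only to show the behaviour.

```python
def mle_estimate(x: int, n: int) -> float:
    """Maximum likelihood (and method of moments) estimate: proportion of successes."""
    return x / n


def bayes_posterior_mean(x: int, n: int, alpha: float, beta: float) -> float:
    """Posterior mean of p under a Beta(alpha, beta) prior."""
    return (x + alpha) / (n + alpha + beta)


def rule_of_succession(x: int, n: int) -> float:
    """Posterior mean under the uniform prior Beta(1, 1)."""
    return bayes_posterior_mean(x, n, 1.0, 1.0)


# With no observed successes the MLE is 0, while the uniform-prior estimate is 1/(n + 2).
print(mle_estimate(0, 10), rule_of_succession(0, 10))

# As n grows with the success proportion fixed, the Bayes estimate approaches the MLE.
for n in (10, 100, 10_000):
    x = (7 * n) // 10
    print(n, mle_estimate(x, n), rule_of_succession(x, n))
```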

对于使用标准均匀分布作为非信息性的先验概率的特殊情况([math]\displaystyle{ \operatorname{Beta}(\alpha=1, \beta=1) = U(0,1) }[/math]),后验均值估计变为[math]\displaystyle{ \widehat{p_b} = \frac{x+1}{n+2} }[/math] (后验模式应只能得出标准估计量)。这种方法被称为继承法则,它是18世纪 Pierre-Simon Laplace提出的。



When estimating p with very rare events and a small n (e.g., if x = 0), the standard estimator gives [math]\displaystyle{ \widehat{p} = 0, }[/math] which is sometimes unrealistic and undesirable. In such cases there are various alternative estimators.[20] One way is to use the Bayes estimator with a uniform prior, which gives [math]\displaystyle{ \widehat{p_b} = \frac{1}{n+2} }[/math]. Another method is to use the upper bound of the confidence interval obtained from the rule of three: [math]\displaystyle{ \widehat{p_{\text{rule of 3}}} = \frac{3}{n} }[/math].
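The point estimators discussed above can be compared side by side with a short Python sketch. The function names and the example numbers are illustrative assumptions, not part of the article; only the three formulas come from the text.

```python
def mle_estimate(x, n):
    """Standard estimator: proportion of successes x / n."""
    return x / n


def bayes_posterior_mean(x, n, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior.

    The defaults alpha = beta = 1 correspond to the uniform prior,
    i.e. the rule of succession (x + 1) / (n + 2).
    """
    return (x + alpha) / (n + alpha + beta)


def rule_of_three_upper(n):
    """Upper bound 3 / n used when no successes were observed (x = 0)."""
    return 3.0 / n


if __name__ == "__main__":
    # Hypothetical data: no successes in 30 trials.
    x, n = 0, 30
    print("MLE:            ", mle_estimate(x, n))
    print("Bayes (uniform):", bayes_posterior_mean(x, n))
    print("Rule of three:  ", rule_of_three_upper(n))
```

With x = 0 the standard estimator returns exactly zero, while the other two return small positive values, which is the behaviour the paragraph above describes.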


Confidence intervals

Even for quite large values of n, the actual distribution of the mean is significantly non-normal.[22] Because of this problem, several methods to estimate confidence intervals have been proposed.


In the equations for confidence intervals below, the variables have the following meaning:

  • n1 is the number of successes out of n, the total number of trials
  • [math]\displaystyle{ \widehat{p\,} = \frac{n_1}{n} }[/math] is the proportion of successes
  • [math]\displaystyle{ z }[/math] is the [math]\displaystyle{ 1 - \tfrac{1}{2}\alpha }[/math] quantile of a standard normal distribution (i.e., probit) corresponding to the target error rate [math]\displaystyle{ \alpha }[/math]. For example, for a 95% confidence level the error [math]\displaystyle{ \alpha }[/math] = 0.05, so [math]\displaystyle{ 1 - \tfrac{1}{2}\alpha }[/math] = 0.975 and [math]\displaystyle{ z }[/math] = 1.96.


Wald method

[math]\displaystyle{ \widehat{p\,} \pm z \sqrt{ \frac{ \widehat{p\,} ( 1 -\widehat{p\,} )}{ n } } . }[/math]

A continuity correction of 0.5/n may be added.

Agresti–Coull method

[math]\displaystyle{ \tilde{p} \pm z \sqrt{ \frac{ \tilde{p} ( 1 - \tilde{p} )}{ n + z^2 } } . }[/math][23]

Here the estimate of p is modified to

[math]\displaystyle{ \tilde{p}= \frac{ n_1 + \frac{1}{2} z^2}{ n + z^2 } }[/math]


Arcsine method

[math]\displaystyle{ \sin^2 \left(\arcsin \left(\sqrt{\widehat{p\,}}\right) \pm \frac{z}{2\sqrt{n}} \right). }[/math][24]


Wilson (score) method


The notation in the formula below differs from the previous formulas in two respects:[25]

  • Firstly, [math]\displaystyle{ z_x }[/math] has a slightly different interpretation in the formula below: it has its ordinary meaning of 'the xth quantile of the standard normal distribution', rather than being a shorthand for 'the (1 − x)-th quantile'.
  • Secondly, this formula does not use a plus-minus to define the two bounds. Instead, one may use [math]\displaystyle{ z = z_{\alpha / 2} }[/math] to get the lower bound, or use [math]\displaystyle{ z = z_{1 - \alpha/2} }[/math] to get the upper bound. For example, for a 95% confidence level the error [math]\displaystyle{ \alpha }[/math] = 0.05, so one gets the lower bound by using [math]\displaystyle{ z = z_{\alpha/2} = z_{0.025} = - 1.96 }[/math], and one gets the upper bound by using [math]\displaystyle{ z = z_{1 - \alpha/2} = z_{0.975} = 1.96 }[/math].


[math]\displaystyle{ \frac{ \widehat{p\,} + \frac{z^2}{2n} + z \sqrt{ \frac{\widehat{p\,}(1 - \widehat{p\,})}{n} + \frac{z^2}{4 n^2} } }{ 1 + \frac{z^2}{n} } }[/math][26]

Comparison


The exact (Clopper–Pearson) method is the most conservative.[22]

The Wald method, although commonly recommended in textbooks, is the most biased.
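As an illustration, the Wald, Agresti–Coull, arcsine and Wilson intervals can be sketched in Python with the standard library alone. The function names are assumptions made for this example. The Wilson bound is written in its usual form, with a plus sign only and a signed z, as described in the notation notes above. The clamping of the angle in the arcsine method is an added safeguard for proportions near 0 or 1 and is not part of the stated formula.

```python
import math
from statistics import NormalDist


def z_quantile(x):
    """x-th quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(x)


def wald(n1, n, alpha=0.05):
    """p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = z_quantile(1 - alpha / 2)
    p_hat = n1 / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def agresti_coull(n1, n, alpha=0.05):
    """Wald-type interval around p_tilde = (n1 + z^2/2) / (n + z^2)."""
    z = z_quantile(1 - alpha / 2)
    p_tilde = (n1 + 0.5 * z * z) / (n + z * z)
    half = z * math.sqrt(p_tilde * (1 - p_tilde) / (n + z * z))
    return p_tilde - half, p_tilde + half


def arcsine(n1, n, alpha=0.05):
    """sin^2(arcsin(sqrt(p_hat)) +/- z / (2 * sqrt(n)))."""
    z = z_quantile(1 - alpha / 2)
    centre = math.asin(math.sqrt(n1 / n))
    half = z / (2 * math.sqrt(n))
    # Assumption: keep the angle inside [0, pi/2] so the bounds stay ordered.
    lo = max(0.0, centre - half)
    hi = min(math.pi / 2, centre + half)
    return math.sin(lo) ** 2, math.sin(hi) ** 2


def wilson_bound(n1, n, z):
    """One Wilson bound; pass a negative z for the lower bound."""
    p_hat = n1 / n
    root = math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (p_hat + z * z / (2 * n) + z * root) / (1 + z * z / n)


def wilson(n1, n, alpha=0.05):
    """Lower bound uses z_{alpha/2}, upper bound uses z_{1 - alpha/2}."""
    return (wilson_bound(n1, n, z_quantile(alpha / 2)),
            wilson_bound(n1, n, z_quantile(1 - alpha / 2)))


if __name__ == "__main__":
    # Hypothetical data: 7 successes in 50 trials, 95% confidence level.
    for name, method in [("Wald", wald), ("Agresti-Coull", agresti_coull),
                         ("Arcsine", arcsine), ("Wilson", wilson)]:
        lo, hi = method(7, 50)
        print(f"{name:14s} [{lo:.4f}, {hi:.4f}]")
```

Running the script on the same data with each method makes the differences between the intervals easy to inspect, especially for small n or for proportions close to 0 or 1.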



Related distributions

Sums of binomials

If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables with the same probability p, then X + Y is again a binomial variable; its distribution is Z = X + Y ~ B(n+m, p):

[math]\displaystyle{ \begin{align} \operatorname P(Z=k) &= \sum_{i=0}^k\left[\binom{n}i p^i (1-p)^{n-i}\right]\left[\binom{m}{k-i} p^{k-i} (1-p)^{m-k+i}\right]\\ &= \binom{n+m}k p^k (1-p)^{n+m-k} \end{align} }[/math]
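The convolution identity for the sum of two independent binomials with a common p can be checked numerically. The sketch below is a plain-Python illustration; the helper names and the parameter values n = 4, m = 6, p = 0.3 are chosen only for the example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_of_sum(k, n, m, p):
    """Pr[X + Y = k] by direct convolution, X ~ B(n, p), Y ~ B(m, p)."""
    # i successes come from X and k - i from Y; both counts must be feasible.
    return sum(binom_pmf(i, n, p) * binom_pmf(k - i, m, p)
               for i in range(max(0, k - m), min(n, k) + 1))


if __name__ == "__main__":
    n, m, p = 4, 6, 0.3
    for k in range(n + m + 1):
        print(k, pmf_of_sum(k, n, m, p), binom_pmf(k, n + m, p))
```

The two printed columns agree up to floating-point rounding, as the identity states.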


However, if X and Y do not have the same probability p, then the variance of the sum will be smaller than the variance of a binomial variable distributed as [math]\displaystyle{ B(n+m, \bar{p}).\, }[/math]

Ratio of two binomial distributions

This result was first derived by Katz and coauthors in 1978.[27]

Let X ~ B(n, p1) and Y ~ B(m, p2) be independent. Let T = (X/n)/(Y/m).

Then log(T) is approximately normally distributed with mean log(p1/p2) and variance ((1/p1) − 1)/n + ((1/p2) − 1)/m.

Normal approximation

If n is large enough, then the skew of the distribution is not too great. In this case a reasonable approximation to B(n, p) is given by the normal distribution

[math]\displaystyle{ \mathcal{N}(np,\,np(1-p)), }[/math]

and this basic approximation can be improved in a simple way by using a suitable continuity correction.

The basic approximation generally improves as n increases (at least 20) and is better when p is not near 0 or 1. Various rules of thumb may be used to decide whether n is large enough and p is far enough from the extremes of zero or one.

For example, suppose one randomly samples n people out of a large population and asks them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If groups of n people were sampled repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation [math]\displaystyle{ \sigma = \sqrt{\frac{p(1-p)}{n}} }[/math]
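To see the effect of a continuity correction on the normal approximation N(np, np(1 − p)) described above, one can compare the exact binomial CDF with the approximation evaluated at k and at k + 0.5. This is a minimal sketch using only the standard library; the function names and the values n = 20, p = 0.5, k = 8 are assumptions for the example.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact Pr[X <= k] for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p, continuity=True):
    """Pr[X <= k] approximated by N(np, np(1 - p)).

    With continuity=True the normal CDF is evaluated at k + 0.5.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    x = k + 0.5 if continuity else k
    return NormalDist(mu, sigma).cdf(x)


if __name__ == "__main__":
    n, p, k = 20, 0.5, 8
    print("exact:             ", binom_cdf(k, n, p))
    print("normal, corrected: ", normal_approx_cdf(k, n, p))
    print("normal, plain:     ", normal_approx_cdf(k, n, p, continuity=False))
```

For these values the corrected approximation is much closer to the exact probability than the uncorrected one.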



The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed, or at least p tends to zero. Therefore, the Poisson distribution with parameter λ = np can be used as an approximation to the binomial distribution B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.

Concerning the accuracy of Poisson approximation, see Novak, ch. 4, and references therein.

Conditional binomials

If X ~ B(n, p) and Y | X ~ B(X, q) (the conditional distribution of Y, given X), then Y is a simple binomial random variable with distribution Y ~ B(n, pq).

For example, imagine throwing n balls to a basket UX and taking the balls that hit and throwing them to another basket UY. If p is the probability to hit UX then X ~ B(n, p) is the number of balls that hit UX. If q is the probability to hit UY then the number of balls that hit UY is Y ~ B(X, q) and therefore Y ~ B(n, pq).
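The conditional-binomial statement Y ~ B(n, pq) can be verified numerically by summing over the possible values of X with the law of total probability. The following Python sketch does this exactly rather than by simulation; the function names and the values n = 8, p = 0.6, q = 0.25 are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pmf_y_total_probability(m, n, p, q):
    """Pr[Y = m] = sum over k of Pr[Y = m | X = k] * Pr[X = k].

    Here X ~ B(n, p) and, given X = k, Y ~ B(k, q); terms with k < m vanish.
    """
    return sum(binom_pmf(m, k, q) * binom_pmf(k, n, p) for k in range(m, n + 1))


if __name__ == "__main__":
    n, p, q = 8, 0.6, 0.25
    for m in range(n + 1):
        print(m, pmf_y_total_probability(m, n, p, q), binom_pmf(m, n, p * q))
```

Both columns agree up to rounding, which matches the two-basket example: a ball lands in UY only if it hits UX and then hits UY, so each of the n balls succeeds with probability pq.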


Proof that Y ~ B(n, pq) when X ~ B(n, p) and Y | X ~ B(X, q): since [math]\displaystyle{ X \sim B(n, p) }[/math] and [math]\displaystyle{ Y \sim B(X, q) }[/math], by the law of total probability,

[math]\displaystyle{ \begin{align} \Pr[Y = m] &= \sum_{k = m}^{n} \Pr[Y = m \mid X = k] \Pr[X = k] \\[2pt] &= \sum_{k=m}^n \binom{n}{k} \binom{k}{m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} \end{align} }[/math]

Since [math]\displaystyle{ \tbinom{n}{k} \tbinom{k}{m} = \tbinom{n}{m} \tbinom{n-m}{k-m}, }[/math] the equation above can be expressed as

[math]\displaystyle{ \Pr[Y = m] = \sum_{k=m}^{n} \binom{n}{m} \binom{n-m}{k-m} p^k q^m (1-p)^{n-k} (1-q)^{k-m} }[/math]

Factoring [math]\displaystyle{ p^k = p^m p^{k-m} }[/math] and pulling all the terms that don't depend on [math]\displaystyle{ k }[/math] out of the sum now yields

[math]\displaystyle{ \begin{align} \Pr[Y = m] &= \binom{n}{m} p^m q^m \left( \sum_{k=m}^n \binom{n-m}{k-m} p^{k-m} (1-p)^{n-k} (1-q)^{k-m} \right) \\[2pt] &= \binom{n}{m} (pq)^m \left( \sum_{k=m}^n \binom{n-m}{k-m} \left(p(1-q)\right)^{k-m} (1-p)^{n-k} \right) \end{align} }[/math]

After substituting [math]\displaystyle{ i = k - m }[/math] in the expression above, we get

[math]\displaystyle{ \Pr[Y = m] = \binom{n}{m} (pq)^m \left( \sum_{i=0}^{n-m} \binom{n-m}{i} (p - pq)^i (1-p)^{n-m - i} \right) }[/math]

Notice that the sum (in the parentheses) above equals [math]\displaystyle{ (p - pq + 1 - p)^{n-m} }[/math] by the binomial theorem. Substituting this in finally yields

[math]\displaystyle{ \begin{align} \Pr[Y=m] &= \binom{n}{m} (pq)^m (p - pq + 1 - p)^{n-m}\\[4pt] &= \binom{n}{m} (pq)^m (1-pq)^{n-m} \end{align} }[/math]

and thus [math]\displaystyle{ Y \sim B(n, pq) }[/math] as desired.

The density of the Beta distribution with parameters α and β is

[math]\displaystyle{ P(p;\alpha,\beta) =\frac{p^{\alpha-1}(1-p)^{\beta-1}}{\mathrm{B}(\alpha,\beta)}. }[/math]

Given a uniform prior, the posterior distribution for the probability of success p given n independent events with k observed successes is a beta distribution.

This distribution was derived by Jacob Bernoulli. He considered the case where p = r/(r + s), where p is the probability of success and r and s are positive integers. Blaise Pascal had earlier considered the case where p = 1/2.

Methods for random number generation where the marginal distribution is a binomial distribution are well-established.

One way to generate random samples from a binomial distribution is to use an inversion algorithm. To do so, one must calculate the probability Pr(X = k) for all values k from 0 through n. (These probabilities should sum to a value close to one, in order to encompass the entire sample space.) Then, by using a pseudorandom number generator to generate samples uniformly between 0 and 1, one can transform those uniform samples into discrete numbers by using the probabilities calculated in the first step.
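The inversion algorithm described above can be sketched in a few lines of Python. This is one straightforward reading of the description, not a reference implementation: the function names are assumptions, the cumulative table is built once and searched with bisection, and the final clamp to n guards against the cumulative sum ending slightly below 1 because of rounding.

```python
import random
from bisect import bisect_left
from itertools import accumulate
from math import comb


def binomial_cdf_table(n, p):
    """Cumulative probabilities Pr[X <= k] for k = 0, ..., n."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    return list(accumulate(pmf))


def sample_binomial(n, p, size, rng=None):
    """Draw `size` samples from B(n, p) by inverting the CDF."""
    rng = rng or random.Random()
    cdf = binomial_cdf_table(n, p)
    samples = []
    for _ in range(size):
        # 1 - random() lies in (0, 1], so u is never exactly zero.
        u = 1.0 - rng.random()
        # Smallest k whose cumulative probability is at least u.
        k = bisect_left(cdf, u)
        samples.append(min(k, n))
    return samples


if __name__ == "__main__":
    draws = sample_binomial(10, 0.3, 20000, random.Random(0))
    print("sample mean:", sum(draws) / len(draws), "expected:", 10 * 0.3)
```

Building the table costs O(n) and each draw costs O(log n), which is adequate for moderate n; for very large n, specialised algorithms are normally preferred.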


Bernoulli distribution

The Bernoulli distribution is a special case of the binomial distribution, where n = 1. Symbolically, X ~ B(1, p) has the same meaning as X ~ Bernoulli(p). Conversely, any binomial distribution, B(n, p), is the distribution of the sum of n independent Bernoulli trials, Bernoulli(p), each with the same probability p.[28]

Poisson binomial distribution

The binomial distribution is a special case of the Poisson binomial distribution, or general binomial distribution, which is the distribution of a sum of n independent non-identical Bernoulli trials B(p_i).[30]


Figure (Binomial Distribution.svg): Binomial probability mass function and normal probability density function approximation for n = 6 and p = 0.5


This page was moved from wikipedia:en:Binomial distribution. Its edit history can be viewed at 二项分布/edithistory

  1. Feller, W. (1968). An Introduction to Probability Theory and Its Applications (Third ed.). New York: Wiley. p. 151 (theorem in section VI.3). https://archive.org/details/introductiontopr01wfel. 
  2. Feller, W. (1968). An Introduction to Probability Theory and Its Applications (Third ed.). New York: Wiley. p. 151 (theorem in section VI.3). https://archive.org/details/introductiontopr01wfel. 
  3. Wadsworth, G. P. (1960). Introduction to Probability and Random Variables. New York: McGraw-Hill. p. 52. https://archive.org/details/introductiontopr0000wads. 
  4. Wadsworth, G. P. (1960). Introduction to Probability and Random Variables. New York: McGraw-Hill. p. 52. https://archive.org/details/introductiontopr0000wads. 
  5. Jowett, G. H. (1963). "The Relationship Between the Binomial and F Distributions". Journal of the Royal Statistical Society D. 13 (1): 55–57. doi:10.2307/2986663. JSTOR 2986663.
  6. Jowett, G. H. (1963). "The Relationship Between the Binomial and F Distributions". Journal of the Royal Statistical Society D. 13 (1): 55–57. doi:10.2307/2986663. JSTOR 2986663.
  7. See Proof Wiki
  8. See also Nicolas, André (January 7, 2019). "Finding mode in Binomial distribution". Stack Exchange.
  9. See also Nicolas, André (January 7, 2019). "Finding mode in Binomial distribution". Stack Exchange.
  10. Neumann, P. (1966). "Über den Median der Binomial- and Poissonverteilung". Wissenschaftliche Zeitschrift der Technischen Universität Dresden (in German). 19: 29–33.
  11. Lord, Nick. (July 2010). "Binomial averages when the mean is an integer", The Mathematical Gazette 94, 331-332.
  12. Neumann, P. (1966). "Über den Median der Binomial- and Poissonverteilung". Wissenschaftliche Zeitschrift der Technischen Universität Dresden (in German). 19: 29–33.
  13. Lord, Nick. (July 2010). "Binomial averages when the mean is an integer", The Mathematical Gazette 94, 331-332.
  14. 14.0 14.1 14.2 Kaas, R.; Buhrman, J.M. (1980). "Mean, Median and Mode in Binomial Distributions". Statistica Neerlandica. 34 (1): 13–18. doi:10.1111/j.1467-9574.1980.tb00681.x.
  15. 15.0 15.1 Hamza, K. (1995). "The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions". Statistics & Probability Letters. 23: 21–25. doi:10.1016/0167-7152(94)00090-U.
  16. 16.0 16.1 16.2 16.3 Arratia, R.; Gordon, L. (1989). "Tutorial on large deviations for the binomial distribution". Bulletin of Mathematical Biology. 51 (1): 125–131. doi:10.1007/BF02458840. PMID 2706397.
  17. Robert B. Ash (1990). Information Theory. Dover Publications. p. 115. https://archive.org/details/informationtheor00ashr. 
  18. Robert B. Ash (1990). Information Theory. Dover Publications. p. 115. https://archive.org/details/informationtheor00ashr. 
  19. Matoušek, J.; Vondrak, J. "The Probabilistic Method" (PDF). lecture notes.
  20. Razzaghi, Mehdi (2002). "On the estimation of binomial success probability with zero occurrence in sample". Journal of Modern Applied Statistical Methods. 1 (2): 326–332. doi:10.22237/jmasm/1036110000.
  21. Razzaghi, Mehdi (2002). "On the estimation of binomial success probability with zero occurrence in sample". Journal of Modern Applied Statistical Methods. 1 (2): 326–332. doi:10.22237/jmasm/1036110000.
  22. 22.0 22.1 22.2 22.3 Brown, Lawrence D.; Cai, T. Tony; DasGupta, Anirban (2001), "Interval Estimation for a Binomial Proportion", Statistical Science, 16 (2): 101–133, CiteSeerX 10.1.1.323.7752, doi:10.1214/ss/1009213286, retrieved 2015-01-05
  23. Agresti, Alan; Coull, Brent A. (May 1998), "Approximate is better than 'exact' for interval estimation of binomial proportions" (PDF), The American Statistician, 52 (2): 119–126, doi:10.2307/2685469, JSTOR 2685469, retrieved 2015-01-05
  24. Pires, M. A. (2002). "Confidence intervals for a binomial proportion: comparison of methods and software evaluation". In Klinke, S.; Ahrend, P.; Richter, L.. Proceedings of the Conference CompStat 2002. Short Communications and Posters. https://www.math.tecnico.ulisboa.pt/~apires/PDFs/AP_COMPSTAT02.pdf. 
  25. 25.0 25.1 Wilson, Edwin B. (June 1927), "Probable inference, the law of succession, and statistical inference" (PDF), Journal of the American Statistical Association, 22 (158): 209–212, doi:10.2307/2276774, JSTOR 2276774, archived from the original (PDF) on 2015-01-13, retrieved 2015-01-05
  26. "Confidence intervals". Engineering Statistics Handbook. NIST/Sematech. 2012. http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm. Retrieved 2017-07-23.
  27. 27.0 27.1 Katz, D.; et al. (1978). "Obtaining confidence intervals for the risk ratio in cohort studies". Biometrics. 34 (3): 469–474. doi:10.2307/2530610. JSTOR 2530610.
  28. Taboga, Marco. "Lectures on Probability Theory and Mathematical Statistics". statlect.com. Retrieved 18 December 2017.
  29. Taboga, Marco. "Lectures on Probability Theory and Mathematical Statistics". statlect.com. Retrieved 18 December 2017.
  30. Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312. Archived from the original (PDF) on 2016-03-03.
  31. Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312. Archived from the original (PDF) on 2016-03-03.