Changes

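31,833 bytes removed, 9 September 2020 (Wed) 13:21
Page blanked
Line 1: Line 1: −
This entry was machine-translated by 彩云小译 (Caiyun Xiaoyi) and has not yet been manually edited or reviewed; we apologize for any inconvenience to readers.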
     −
{{Short description|Probability that random variable X is less than or equal to x.}}
  −
  −
{{refimprove|date=March 2010}}
  −
  −
  −
  −
[[File:Exponential distribution cdf.png|thumb|300px|Cumulative distribution function for the [[exponential distribution]]]]
  −
  −
Cumulative distribution function for the [[exponential distribution]]
  −
  −
Cumulative distribution function for the exponential distribution
  −
  −
[[File:Normal Distribution CDF.svg|thumb|300px|Cumulative distribution function for the [[normal distribution]]]]
  −
  −
Cumulative distribution function for the [[normal distribution]]
  −
  −
Cumulative distribution function for the normal distribution
  −
  −
  −
  −
In [[probability theory]] and [[statistics]], the '''cumulative distribution function''' ('''CDF''') of a real-valued [[random variable]] <math>X</math>, or just '''distribution function''' of <math>X</math>, evaluated at <math>x</math>, is the [[probability]] that <math>X</math> will take a value less than or equal to <math>x</math>.<ref>{{Cite book|url=https://github.com/mml-book/mml-book.github.io|title=Mathematics for Machine Learning|last1=Deisenroth|first1=Marc Peter|last2=Faisal|first2=A. Aldo|last3=Ong|first3=Cheng Soon|publisher=Cambridge University Press|year=2020|isbn=9781108455145|location=|pages=181}}</ref>
  −
  −
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
  −
  −
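In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.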
  −
  −
  −
  −
In the case of a scalar [[continuous distribution]], it gives the area under the [[probability density function]] from minus infinity to <math>x</math>. Cumulative distribution functions are also used to specify the distribution of [[multivariate random variable]]s.
  −
  −
In the case of a scalar continuous distribution, it gives the area under the probability density function from minus infinity to x. Cumulative distribution functions are also used to specify the distribution of multivariate random variables.
  −
  −
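In the case of a scalar continuous distribution, it gives the area under the probability density function from minus infinity to x. Cumulative distribution functions are also used to specify the distribution of multivariate random variables.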
  −
  −
  −
  −
==Definition==
  −
  −
The cumulative distribution function of a real-valued [[random variable]] <math>X</math> is the function given by<ref name=KunIlPark>{{cite book | author=Park, Kun Il| title=Fundamentals of Probability and Stochastic Processes with Applications to Communications| publisher=Springer | year=2018 | isbn=978-3-319-68074-3}}</ref>{{rp|p. 77}}
  −
  −
The cumulative distribution function of a real-valued random variable X is the function given by F_X(x) = \operatorname{P}(X\leq x),
  −
  −
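The cumulative distribution function of a real-valued random variable X is the function given by F_X(x) = \operatorname{P}(X\leq x),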
  −
  −
  −
  −
{{Equation box 1
  −
  −
{{Equation box 1
  −
  −
{{Equation box 1
  −
  −
|indent =
  −
  −
|indent =
  −
  −
|indent =
  −
  −
|title=
  −
  −
|title=
  −
  −
|title=
  −
  −
|equation = {{NumBlk||<math>F_X(x) = \operatorname{P}(X\leq x)</math>|{{EquationRef|Eq.1}}}}
  −
  −
|equation = }}
  −
  −
| equation = }
  −
  −
|cellpadding= 6
  −
  −
|cellpadding= 6
  −
  −
|cellpadding= 6
  −
  −
|border
  −
  −
|border
  −
  −
|border
  −
  −
|border colour = #0073CF
  −
  −
|border colour = #0073CF
  −
  −
|border colour = #0073CF
  −
  −
|background colour=#F5FFFA}}
  −
  −
|background colour=#F5FFFA}}
  −
  −
|background colour=#F5FFFA}}
  −
  −
  −
  −
where the right-hand side represents the [[probability]] that the random variable <math>X</math> takes on a value less than or
  −
  −
where the right-hand side represents the probability that the random variable X takes on a value less than or
  −
  −
where the right-hand side represents the probability that the random variable X takes on a value less than or
  −
  −
equal to <math>x</math>. The probability that <math>X</math> lies in the semi-closed [[interval (mathematics)|interval]] <math>(a,b]</math>, where <math>a < b</math>, is therefore<ref name=KunIlPark/>{{rp|p. 84}}
  −
  −
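equal to x. The probability that X lies in the semi-closed interval (a,b], where a < b, is therefore P(a < X ≤ b) = F_X(b) − F_X(a). The probability density function of a continuous random variable can be determined from the cumulative distribution function by differentiating, using the Fundamental Theorem of Calculus; i.e. given F(x),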
  −
  −
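equal to x. The probability that X lies in the semi-closed interval (a,b], where a < b, is therefore P(a < X ≤ b) = F_X(b) − F_X(a). The probability density function of a continuous random variable can be determined from the cumulative distribution function by differentiating, using the Fundamental Theorem of Calculus; i.e. given F(x),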
  −
  −
  −
  −
{{Equation box 1
  −
  −
f(x) = {dF(x) \over dx}
  −
  −
f(x) = {dF(x) \over dx}
  −
  −
|indent =
  −
  −
|title=
  −
  −
as long as the derivative exists.
  −
  −
as long as the derivative exists.
  −
  −
|equation = {{NumBlk||<math>\operatorname{P}(a < X \le b)= F_X(b)-F_X(a)</math>|{{EquationRef|Eq.2}}}}
  −
  −
|cellpadding= 6
  −
  −
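The CDF of a continuous random variable X can be expressed as the integral of its probability density function f_X as follows: F_X(x) = \int_{-\infty}^x f_X(t)\,dt. One of the most popular applications of the cumulative distribution function is the standard normal table, also called the unit normal table or Z table, which lists values of the cumulative distribution function of the standard normal distribution. The Z table is useful not only for probabilities below a value, which is the original application of the cumulative distribution function, but also for probabilities above a value and/or between two values of the standard normal distribution, and by standardization it extends to any normal distribution.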
  −
  −
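The CDF of a continuous random variable X can be expressed as the integral of its probability density function f_X as follows: F_X(x) = \int_{-\infty}^x f_X(t)\,dt. The standard normal table, also called the Z table, lists values of the cumulative distribution function of the standard normal distribution. The Z table is useful not only for probabilities below a value, which is the original application of the cumulative distribution function, but also for probabilities above a value and/or between two values of the standard normal distribution, and by standardization it extends to any normal distribution.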
  −
  −
|border
  −
  −
|border colour = #0073CF
  −
  −
Properties
  −
  −
Properties
  −
  −
|background colour=#F5FFFA}}
  −
  −
  −
  −
\bar F_X(x) \leq \frac{\operatorname{E}(X)}{x} .
  −
  −
\bar F_X(x) \leq \frac{\operatorname{E}(X)}{x} .
  −
  −
In the definition above, the "less than or equal to" sign, "≤", is a convention, not a universally used one (e.g. Hungarian literature uses "<"), but the distinction is important for discrete distributions. The proper use of tables of the [[Binomial distribution|binomial]] and [[Poisson distribution]]s depends upon this convention. Moreover, important formulas like [[Paul Lévy (mathematician)|Paul Lévy]]'s inversion formula for the [[Characteristic function (probability theory)#Inversion formulae|characteristic function]] also rely on the "less than or equal" formulation.
  −
  −
  −
  −
Proof: Assuming X has a density function f_X, for any  c> 0
  −
  −
Proof: Assuming X has a density function f_X, for any c > 0
  −
  −
When treating several random variables <math>X,Y,\ldots</math>, the corresponding letters are used as subscripts, whereas if only one is treated the subscript is usually omitted. It is conventional to use a capital <math>F</math> for a cumulative distribution function, in contrast to the lower-case <math>f</math> used for [[probability density function]]s and [[probability mass function]]s. This applies when discussing general distributions: some specific distributions have their own conventional notation, for example the [[normal distribution]].
  −
  −
<math>
  −
  −
  −
  −
  −
  −
\operatorname{E}(X) = \int_0^\infty x f_X(x) \, dx \geq \int_0^c x f_X(x) \, dx + c\int_c^\infty f_X(x) \, dx
  −
  −
\operatorname{E}(X) = \int_0^\infty x f_X(x) \, dx \geq \int_0^c x f_X(x) \, dx + c\int_c^\infty f_X(x) \, dx
  −
  −
The probability density function of a continuous random variable can be determined from the cumulative distribution function by differentiating<ref>{{Cite book|title=Applied Statistics and Probability for Engineers|last1=Montgomery|first1=Douglas C.|last2=Runger|first2=George C.|publisher=John Wiley & Sons, Inc.|year=2003|isbn=0-471-20454-4|page=104|url=http://www.um.edu.ar/math/montgomery.pdf}}</ref> using the [[Fundamental Theorem of Calculus]]; i.e. given <math>F(x)</math>,
  −
  −
</math>
  −
  −
  −
  −
  −
  −
Then, on recognizing \bar F_X(c) = \int_c^\infty f_X(x) \, dx  and rearranging terms,
  −
  −
Then, on recognizing \bar F_X(c) = \int_c^\infty f_X(x) \, dx and rearranging terms,
  −
  −
: <math>f(x) = {dF(x) \over dx}</math>
  −
  −
<math>
  −
  −
  −
  −
  −
  −
0 \leq c\bar F_X(c) \leq \operatorname{E}(X) - \int_0^c x f_X(x) \, dx \to 0 \text{ as } c \to \infty
  −
  −
0 \leq c\bar F_X(c) \leq \operatorname{E}(X) - \int_0^c x f_X(x) \, dx \to 0 \text{ as } c \to \infty
  −
  −
as long as the derivative exists.
  −
  −
</math>
  −
  −
  −
  −
  −
  −
as claimed.
  −
  −
as claimed.
  −
  −
The CDF of a [[continuous random variable]] <math>X</math> can be expressed as the integral of its probability density function <math>f_X</math> as follows:<ref name="KunIlPark" />{{rp|p. 86}}
  −
  −
  −
  −
:<math>F_X(x) = \int_{-\infty}^x f_X(t)\,dt.</math>
  −
  −
Example of the folded cumulative distribution for a normal distribution function with an expected value of 0 and a standard deviation of 1.
  −
  −
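Example of the folded cumulative distribution for a normal distribution function with an expected value of 0 and a standard deviation of 1.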
  −
  −
  −
  −
While the plot of a cumulative distribution often has an S-like shape, an alternative illustration is the folded cumulative distribution or mountain plot, which folds the top half of the graph over,
  −
  −
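While the plot of a cumulative distribution often has an S-like shape, an alternative illustration is the folded cumulative distribution or mountain plot, which folds the top half of the graph over,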
  −
  −
In the case of a random variable <math>X</math> whose distribution has a discrete component at a value <math>b</math>,
  −
  −
thus using two scales, one for the upslope and another for the downslope. This form of illustration emphasises the median and dispersion (specifically, the mean absolute deviation from the median) of the distribution or of the empirical results.
  −
  −
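thus using two scales, one for the upslope and another for the downslope. This form of illustration emphasises the median and dispersion (specifically, the mean absolute deviation from the median) of the distribution or of the empirical results.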
  −
  −
  −
  −
:<math>\operatorname{P}(X=b) = F_X(b) - \lim_{x \to b^{-}} F_X(x).</math>
  −
  −
  −
  −
If <math>F_X</math> is continuous at <math>b</math>, this equals zero and there is no discrete component at <math>b</math>.
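The jump formula above can be illustrated with a small numerical sketch. It assumes a random variable that takes only the values 0 and 1 with equal probability; the function names and the small offset <code>eps</code> used to approximate the left limit are illustrative assumptions, not part of the article.

```python
def cdf_two_point(x: float) -> float:
    """CDF of a variable taking the values 0 and 1 with probability 1/2 each."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.5
    return 1.0


def point_mass(cdf, b: float, eps: float = 1e-9) -> float:
    """Approximate P(X = b) = F(b) - lim_{x -> b-} F(x) with a small left offset."""
    return cdf(b) - cdf(b - eps)


jump_at_zero = point_mass(cdf_two_point, 0.0)   # size of the jump at b = 0
jump_at_half = point_mass(cdf_two_point, 0.5)   # F is continuous at b = 0.5
```

At a point where the CDF is continuous the difference vanishes, which matches the statement that there is no discrete component there.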
  −
  −
If the CDF F is strictly increasing and continuous then  F^{-1}( p ), p \in [0,1],  is the unique real number  x  such that  F(x) = p . In such a case, this defines the inverse distribution function or quantile function.
  −
  −
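If the CDF F is strictly increasing and continuous then F^{-1}( p ), p \in [0,1], is the unique real number x such that F(x) = p. In such a case, this defines the inverse distribution function or quantile function.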
  −
  −
  −
  −
== Properties ==
  −
  −
Some distributions do not have a unique inverse (for example in the case where f_X(x)=0 for all a<x<b, causing F_X to be constant). This problem can be solved by defining, for  p \in [0,1] , the generalized inverse distribution function:
  −
  −
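Some distributions do not have a unique inverse (for example in the case where f_X(x)=0 for all a<x<b, causing F_X to be constant). This problem can be solved by defining, for p \in [0,1], the generalized inverse distribution function: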
  −
  −
[[File:Discrete probability distribution illustration.svg|right|thumb|From top to bottom, the cumulative distribution function of a discrete probability distribution, continuous probability distribution, and a distribution which has both a continuous part and a discrete part.]]
  −
  −
<math>
  −
  −
  −
  −
  −
  −
F^{-1}(p) = \inf \{x \in \mathbb{R}: F(x) \geq p \}.
  −
  −
F^{-1}(p) = \inf \{x \in \mathbb{R}: F(x) \geq p \}.
  −
  −
Every cumulative distribution function <math>F_X</math> is [[monotone increasing|non-decreasing]]<ref name=KunIlPark/>{{rp|p. 78}} and [[right-continuous]]<ref name=KunIlPark/>{{rp|p. 79}}, which makes it a [[càdlàg]] function. Furthermore,
  −
  −
</math>
  −
  −
  −
  −
:<math>\lim_{x\to -\infty}F_X(x)=0, \quad \lim_{x\to +\infty}F_X(x)=1.</math>
  −
  −
  −
  −
Every function with these four properties is a CDF, i.e., for every such function, a [[random variable]] can be defined such that the function is the cumulative distribution function of that random variable.
  −
  −
  −
  −
Some useful properties of the inverse cdf (which are also preserved in the definition of the generalized inverse distribution function) are:
  −
  −
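Some useful properties of the inverse cdf (which are also preserved in the definition of the generalized inverse distribution function) are: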
  −
  −
If <math>X</math> is a purely [[discrete random variable]], then it attains values <math>x_1,x_2,\ldots</math> with probability <math>p_i = p(x_i)</math>, and the CDF of <math>X</math> will be [[discontinuity (mathematics)|discontinuous]] at the points <math>x_i</math>:
  −
  −
  −
  −
F^{-1} is nondecreasing
  −
  −
F^{-1} is nondecreasing
  −
  −
:<math>F_X(x) = \operatorname{P}(X\leq x) = \sum_{x_i \leq x} \operatorname{P}(X = x_i) = \sum_{x_i \leq x} p(x_i).</math>
  −
  −
F^{-1}(F(x)) \leq x
  −
  −
F^{-1}(F(x)) \leq x
  −
  −
  −
  −
F(F^{-1}(p)) \geq p
  −
  −
F(F^{-1}(p)) \geq p
  −
  −
If the CDF <math>F_X</math> of a real valued random variable <math>X</math> is [[continuous function|continuous]], then <math>X</math> is a [[continuous random variable]]; if furthermore <math>F_X</math> is [[absolute continuity|absolutely continuous]], then there exists a [[Lebesgue integral|Lebesgue-integrable]] function <math>f_X(x)</math> such that
  −
  −
F^{-1}(p) \leq x if and only if p \leq F(x)
  −
  −
F^{-1}(p) \leq x if and only if p \leq F(x)
  −
  −
  −
  −
If Y has a U[0, 1] distribution then F^{-1}(Y) is distributed as F. This is used in random number generation via the inverse transform sampling method.
  −
  −
If Y has a U[0, 1] distribution then F^{-1}(Y) is distributed as F. This is used in random number generation via the inverse transform sampling method.
  −
  −
:<math>F_X(b)-F_X(a) = \operatorname{P}(a< X\leq b) = \int_a^b f_X(x)\,dx</math>
  −
  −
If \{X_\alpha\} is a collection of independent F-distributed random variables defined on the same sample space, then there exist random variables Y_\alpha such that Y_\alpha is distributed as U[0,1] and F^{-1}(Y_\alpha) = X_\alpha with probability 1 for all \alpha.
  −
  −
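If \{X_\alpha\} is a collection of independent F-distributed random variables defined on the same sample space, then there exist random variables Y_\alpha such that Y_\alpha is distributed as U[0,1] and F^{-1}(Y_\alpha) = X_\alpha with probability 1 for all \alpha.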
  −
  −
  −
  −
for all real numbers <math>a</math> and <math>b</math>. The function <math>f_X</math> is equal to the [[derivative]] of <math>F_X</math> [[almost everywhere]], and it is called the [[probability density function]] of the distribution of <math>X</math>.
  −
  −
The inverse of the cdf can be used to translate results obtained for the uniform distribution to other distributions.
  −
  −
The inverse of the cdf can be used to translate results obtained for the uniform distribution to other distributions.
  −
  −
  −
  −
== Examples ==
  −
  −
As an example, suppose <math>X</math> is [[Uniform distribution (continuous)|uniformly distributed]] on the unit interval <math>[0,1]</math>.
  −
  −
The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample. It converges with probability 1 to that underlying distribution. A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function.
  −
  −
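The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample. It converges with probability 1 to that underlying distribution. A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function.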
  −
  −
  −
  −
Then the CDF of <math>X</math> is given by
  −
  −
  −
  −
: <math>F_X(x) = \begin{cases}
  −
  −
When dealing simultaneously with more than one random variable the joint cumulative distribution function can also be defined. For example, for a pair of random variables X,Y, the joint CDF F_{XY} is given by F_{XY}(x,y) = \operatorname{P}(X\leq x, Y\leq y).
  −
  −
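When dealing simultaneously with more than one random variable the joint cumulative distribution function can also be defined. For example, for a pair of random variables X,Y, the joint CDF F_{XY} is given by F_{XY}(x,y) = \operatorname{P}(X\leq x, Y\leq y).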
  −
  −
  −
  −
0 &:\ x < 0\\
  −
  −
Given the joint probability mass function in tabular form, determine the joint cumulative distribution function.
  −
  −
Given the joint probability mass function in tabular form, determine the joint cumulative distribution function.
  −
  −
  −
  −
{| class="wikitable"
  −
  −
{| class="wikitable"
  −
  −
x &:\ 0 \le x \le 1\\
  −
  −
|
  −
  −
|
  −
  −
  −
  −
|Y = 2
  −
  −
|Y = 2
  −
  −
1 &:\ x > 1
  −
  −
|Y = 4
  −
  −
|Y = 4
  −
  −
  −
  −
|Y = 6
  −
  −
|Y = 6
  −
  −
\end{cases}</math>
  −
  −
|Y = 8
  −
  −
|Y = 8
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
Suppose instead that <math>X</math> takes only the discrete values 0 and 1, with equal probability.
  −
  −
|X = 1
  −
  −
|X = 1
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
Then the CDF of <math>X</math> is given by
  −
  −
|0.1
  −
  −
|0.1
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
: <math>F_X(x) = \begin{cases}
  −
  −
|0.1
  −
  −
|0.1
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
0 &:\ x < 0\\
  −
  −
|X = 3
  −
  −
|X = 3
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
1/2 &:\ 0 \le x < 1\\
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0.2
  −
  −
|0.2
  −
  −
1 &:\ x \ge 1
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
\end{cases}</math>
  −
  −
|X = 5
  −
  −
|X = 5
  −
  −
  −
  −
|0.3
  −
  −
|0.3
  −
  −
Suppose <math>X</math> is [[Exponential distribution|exponentially distributed]]. Then the CDF of <math>X</math> is given by
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
: <math>F_X(x;\lambda) = \begin{cases}
  −
  −
|0.15
  −
  −
|0.15
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
1-e^{-\lambda x} & x \ge 0, \\
  −
  −
|X = 7
  −
  −
|X = 7
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
0 & x < 0.
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0.15
  −
  −
|0.15
  −
  −
\end{cases}</math>
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|}
  −
  −
|}
  −
  −
Here λ > 0 is the parameter of the distribution, often called the rate parameter.
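As a minimal sketch of the exponential example, the piecewise formula can be evaluated directly, and the difference of two CDF values then gives the probability of a semi-closed interval. The function names below are illustrative assumptions.

```python
import math


def exponential_cdf(x: float, rate: float) -> float:
    """F(x; rate) = 1 - exp(-rate * x) for x >= 0, and 0 for x < 0."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def interval_probability(a: float, b: float, rate: float) -> float:
    """P(a < X <= b) = F(b) - F(a) for an exponentially distributed X."""
    return exponential_cdf(b, rate) - exponential_cdf(a, rate)


# With rate 1, the interval (0, ln 2] carries half of the probability mass.
half = interval_probability(0.0, math.log(2.0), rate=1.0)
```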
  −
  −
Solution: using the given table of probabilities for each potential range of X and Y, the joint cumulative distribution function may be constructed in tabular form:
  −
  −
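Solution: using the given table of probabilities for each potential range of X and Y, the joint cumulative distribution function may be constructed in tabular form: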
  −
  −
  −
  −
{| class="wikitable"
  −
  −
{| class="wikitable"
  −
  −
Suppose <math>X</math> is [[Normal distribution|normally distributed]]. Then the CDF of <math>X</math> is given by
  −
  −
|
  −
  −
|
  −
  −
  −
  −
|Y < 2
  −
  −
|Y < 2
  −
  −
: <math>
  −
  −
|2 ≤ Y < 4
  −
  −
|2 ≤ Y < 4
  −
  −
  −
  −
|4 ≤ Y < 6
  −
  −
|4 ≤ Y < 6
  −
  −
F(x;\mu,\sigma)
  −
  −
|6 ≤ Y < 8
  −
  −
|6 ≤ Y < 8
  −
  −
  −
  −
|Y ≥ 8
  −
  −
|Y ≥ 8
  −
  −
=
  −
  −
|-
  −
  −
|-
  −
  −
  −
  −
|X < 1
  −
  −
|X < 1
  −
  −
\frac{1}{\sigma\sqrt{2\pi}}
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
\int_{-\infty}^x
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
\exp
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
\left( -\frac{(t - \mu)^2}{2\sigma^2}
  −
  −
|1 ≤ X < 3
  −
  −
|1 ≤ X < 3
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
\ \right)\, dt.
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0.1
  −
  −
|0.1
  −
  −
</math>
  −
  −
|0.1
  −
  −
|0.1
  −
  −
  −
  −
|0.2
  −
  −
|0.2
  −
  −
Here the parameter <math>\mu</math>  is the mean or expectation of the distribution; and <math>\sigma</math>  is its standard deviation.
  −
  −
|-
  −
  −
|-
  −
  −
  −
  −
|3 ≤ X < 5
  −
  −
|3 ≤ X < 5
  −
  −
Suppose <math>X</math> is [[Binomial distribution|binomially distributed]]. Then the CDF of <math>X</math> is given by
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0
  −
  −
|0
  −
  −
: <math>F(k;n,p)=\Pr(X\leq k)=\sum _{i=0}^{\lfloor k\rfloor }{n \choose i}p^{i}(1-p)^{n-i}</math>
  −
  −
|0.1
  −
  −
|0.1
  −
  −
  −
  −
|0.3
  −
  −
|0.3
  −
  −
Here <math>p</math> is the probability of success and the function denotes the discrete probability distribution of the number of successes in a sequence of <math>n</math> independent experiments, and <math>\lfloor k\rfloor\,</math> is the "floor" under <math>k</math>, i.e. the [[greatest integer]] less than or equal to <math>k</math>.
  −
  −
|0.4
  −
  −
|0.4
  −
  −
<br />
  −
  −
|-
  −
  −
|-
  −
  −
  −
  −
|5 ≤ X < 7
  −
  −
|5 ≤ X < 7
  −
  −
==Derived functions==
  −
  −
|0
  −
  −
|0
  −
  −
===Complementary cumulative distribution function (tail distribution)===<!-- This section is linked from [[Power law]], [[Stretched exponential function]] and [[Weibull distribution]] -->
  −
  −
|0.3
  −
  −
|0.3
  −
  −
Sometimes, it is useful to study the opposite question and ask how often the random variable is ''above'' a particular level. This is called the '''complementary cumulative distribution function''' ('''ccdf''') or simply the '''tail distribution''' or '''exceedance''', and is defined as
  −
  −
|0.4
  −
  −
|0.4
  −
  −
  −
  −
|0.6
  −
  −
|0.6
  −
  −
:<math>\bar F_X(x) = \operatorname{P}(X > x) = 1 - F_X(x).</math>
  −
  −
|0.85
  −
  −
|0.85
  −
  −
  −
  −
|-
  −
  −
|-
  −
  −
This has applications in [[statistics|statistical]] [[hypothesis test]]ing, for example, because the one-sided [[p-value]] is the probability of observing a test statistic ''at least'' as extreme as the one observed. Thus, provided that the [[test statistic]], ''T'', has a continuous distribution, the one-sided [[p-value]] is simply given by the ccdf: for an observed value <math>t</math> of the test statistic
  −
  −
|X ≥ 7
  −
  −
|X ≥ 7
  −
  −
:<math>p= \operatorname{P}(T \ge t) = \operatorname{P}(T > t) =1 - F_T(t).</math>
  −
  −
|0
  −
  −
|0
  −
  −
  −
  −
|0.3
  −
  −
|0.3
  −
  −
In [[survival analysis]], <math>\bar F_X(x)</math> is called the '''[[survival function]]''' and denoted <math> S(x) </math>, while the term ''reliability function'' is common in [[engineering]].
  −
  −
|0.4
  −
  −
|0.4
  −
  −
  −
  −
|0.75
  −
  −
|0.75
  −
  −
Z-table:
  −
  −
|1
  −
  −
|1
  −
  −
  −
  −
|}
  −
  −
|}
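The tabular solution can be reproduced with a short sketch that sums the probabilities in the given table over all points with X ≤ x and Y ≤ y. The variable and function names are illustrative assumptions; the probabilities are those of the table in the problem statement.

```python
X_VALUES = [1, 3, 5, 7]
Y_VALUES = [2, 4, 6, 8]

# PMF[i][j] = P(X = X_VALUES[i], Y = Y_VALUES[j]), copied from the given table.
PMF = [
    [0.0, 0.1, 0.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
    [0.3, 0.0, 0.0, 0.15],
    [0.0, 0.0, 0.15, 0.0],
]


def joint_cdf(x: float, y: float) -> float:
    """F_XY(x, y) = P(X <= x, Y <= y), summed over the table entries."""
    total = 0.0
    for i, x_value in enumerate(X_VALUES):
        for j, y_value in enumerate(Y_VALUES):
            if x_value <= x and y_value <= y:
                total += PMF[i][j]
    return total


# Any point with 5 <= x < 7 and 6 <= y < 8 falls in the same cell of the solution table.
cell_value = joint_cdf(5, 6)
```

Evaluating <code>joint_cdf</code> at one point from each range of X and Y fills in the solution table cell by cell.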
  −
  −
One of the most popular applications of the cumulative distribution function is the [[standard normal table]], also called the '''unit normal table''' or '''Z table''',<ref>{{Cite web|url=https://www.ztable.net/|title=Z Table|last=|first=|date=|website=Z Table|language=en-US|access-date=2019-12-11}}</ref> which lists values of the cumulative distribution function of the standard normal distribution. The Z table is useful not only for probabilities below a value, which is the original application of the cumulative distribution function, but also for probabilities above a value and/or between two values of the standard normal distribution, and by standardization it extends to any normal distribution.
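The three kinds of table lookup described here (below a value, above a value, between two values) can be sketched without a printed table by using the standard identity Φ(z) = (1 + erf(z/√2))/2. The function names are illustrative assumptions, and the example values of z, μ and σ are arbitrary.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Phi(z), the CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a normal distribution with mean mu and standard deviation sigma > 0."""
    return standard_normal_cdf((x - mu) / sigma)


below = standard_normal_cdf(1.0)                                   # P(Z <= 1)
above = 1.0 - standard_normal_cdf(1.0)                             # P(Z > 1)
between = standard_normal_cdf(1.0) - standard_normal_cdf(-1.0)     # P(-1 < Z <= 1)
```

The helper <code>normal_cdf</code> shows the extension to any normal distribution: the argument is standardized first and then looked up as a standard normal value.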
  −
  −
<br />
  −
  −
< br/>
  −
  −
  −
  −
;Properties
  −
  −
* For a non-negative continuous random variable having an expectation, [[Markov's inequality]] states that<ref name="ZK">{{cite book| last1 = Zwillinger| first1 = Daniel| last2 = Kokoska| first2 = Stephen| title = CRC Standard Probability and Statistics Tables and Formulae| year = 2010| publisher = CRC Press| isbn = 978-1-58488-059-2| page = 49 }}</ref>
  −
  −
For N random variables X_1,\ldots,X_N, the joint CDF F_{X_1,\ldots,X_N} is given by F_{X_1,\ldots,X_N}(x_1,\ldots,x_N) = \operatorname{P}(X_1 \leq x_1,\ldots,X_N \leq x_N).
  −
  −
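For N random variables X_1,\ldots,X_N, the joint CDF F_{X_1,\ldots,X_N} is given by F_{X_1,\ldots,X_N}(x_1,\ldots,x_N) = \operatorname{P}(X_1 \leq x_1,\ldots,X_N \leq x_N).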
  −
  −
:: <math>\bar F_X(x) \leq \frac{\operatorname{E}(X)}{x} .</math>
  −
  −
* As <math> x \to \infty, \bar F_X(x) \to 0 \ </math>, and in fact <math> \bar F_X(x) = o(1/x) </math> provided that <math>\operatorname{E}(X)</math> is finite.
  −
  −
{{Equation box 1
  −
  −
{{Equation box 1
  −
  −
:Proof:{{citation needed|date=April 2012}} Assuming <math>X</math> has a density function <math>f_X</math>, for any <math> c> 0 </math>
  −
  −
|indent =
  −
  −
|indent =
  −
  −
::<math>
  −
  −
|title=
  −
  −
|title=
  −
  −
\operatorname{E}(X) = \int_0^\infty x f_X(x) \, dx \geq \int_0^c x f_X(x) \, dx + c\int_c^\infty f_X(x) \, dx
  −
  −
|equation = }}
  −
  −
| equation = }
  −
  −
</math>
  −
  −
|cellpadding= 6
  −
  −
|cellpadding= 6
  −
  −
:Then, on recognizing <math>\bar F_X(c) = \int_c^\infty f_X(x) \, dx </math> and rearranging terms,
  −
  −
|border
  −
  −
|border
  −
  −
::<math>
  −
  −
|border colour = #0073CF
  −
  −
|border colour = #0073CF
  −
  −
0 \leq c\bar F_X(c) \leq \operatorname{E}(X) - \int_0^c x f_X(x) \, dx \to 0 \text{ as } c \to \infty
  −
  −
|background colour=#F5FFFA}}
  −
  −
|background colour=#F5FFFA}}
  −
  −
</math>
  −
  −
:as claimed.
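A quick numerical check of both properties, offered only as a sketch: it assumes an exponential distribution with rate 2 (so that E(X) = 1/2 and the tail is exp(−2x)), and the evaluation points are arbitrary choices.

```python
import math

RATE = 2.0           # assumed rate parameter of the example distribution
MEAN = 1.0 / RATE    # E(X) for an exponential distribution


def tail(x: float) -> float:
    """Complementary CDF P(X > x) of the assumed exponential distribution."""
    return math.exp(-RATE * x) if x >= 0 else 1.0


# Markov's inequality: the tail never exceeds E(X) / x for x > 0.
markov_holds = all(tail(x) <= MEAN / x for x in (0.1, 0.5, 1.0, 5.0, 20.0))

# The products c * tail(c) shrink towards 0 as c grows, i.e. the tail is o(1/c).
scaled_tails = [c * tail(c) for c in (1.0, 10.0, 50.0)]
```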
  −
  −
Interpreting the N random variables as a random vector \mathbf{X} = (X_1,\ldots,X_N)^T yields a shorter notation:
  −
  −
Interpreting the N random variables as a random vector \mathbf{X} = (X_1,\ldots,X_N)^T yields a shorter notation:
  −
  −
  −
  −
===Folded cumulative distribution===
  −
  −
F_{\mathbf{X}}(\mathbf{x}) = \operatorname{P}(X_1 \leq x_1,\ldots,X_N \leq x_N)
  −
  −
F_{\mathbf{X}}(\mathbf{x}) = \operatorname{P}(X_1 \leq x_1,\ldots,X_N \leq x_N)
  −
  −
[[Image:Folded-cumulative-distribution-function.svg|thumb|right|Example of the folded cumulative distribution for a [[normal distribution]] function with an [[expected value]] of 0 and a [[standard deviation]] of 1.]]
  −
  −
While the plot of a cumulative distribution often has an S-like shape, an alternative illustration is the '''folded cumulative distribution''' or '''mountain plot''', which folds the top half of the graph over,<ref name="Gentle">{{cite book| author = Gentle, J.E.| title = Computational Statistics| url = https://books.google.com/?id=m4r-KVxpLsAC&pg=PA348| accessdate = 2010-08-06| year = 2009| publisher = [[Springer Science+Business Media|Springer]]| isbn = 978-0-387-98145-1 }}{{Page needed|date=June 2011}}</ref><ref name="Monti">
  −
  −
{{cite journal|last=Monti|first=K. L.|authorlink= Katherine Monti |pages=342–345|year=1995|title=Folded Empirical Distribution Function Curves (Mountain Plots) |journal=The American Statistician|volume=49|issue=4|jstor=2684570|doi=10.2307/2684570}}</ref>
  −
  −
Every multivariate CDF is:
  −
  −
Every multivariate CDF is:
  −
  −
thus using two scales, one for the upslope and another for the downslope. This form of illustration emphasises the [[median (statistics)|median]] and [[dispersion (statistics)|dispersion]] (specifically, the [[mean absolute deviation]] from the median<ref>{{Cite journal
  −
  −
Monotonically non-decreasing for each of its variables,
  −
  −
Monotonically non-decreasing for each of its variables,
  −
  −
| last1 = Xue | first1 = J. H.
  −
  −
Right-continuous in each of its variables,
  −
  −
Right-continuous in each of its variables,
  −
  −
| last2 = Titterington | first2 = D. M.
  −
  −
0\leq F_{X_1 \ldots X_n}(x_1,\ldots,x_n)\leq 1,
  −
  −
0\leq F_{X_1 \ldots X_n}(x_1,\ldots,x_n)\leq 1,
  −
  −
| doi = 10.1016/j.spl.2011.03.014
  −
  −
\lim_{x_1,\ldots,x_n \rightarrow+\infty}F_{X_1 \ldots X_n}(x_1,\ldots,x_n)=1 \text{ and } \lim_{x_i\rightarrow-\infty}F_{X_1 \ldots X_n}(x_1,\ldots,x_n)=0, \text{for all } i.
  −
  −
\lim_{x_1,\ldots,x_n \rightarrow+\infty}F_{X_1 \ldots X_n}(x_1,\ldots,x_n)=1 \text{ and } \lim_{x_i\rightarrow-\infty}F_{X_1 \ldots X_n}(x_1,\ldots,x_n)=0, \text{for all } i.
  −
  −
| title = The p-folded cumulative distribution function and the mean absolute deviation from the p-quantile
  −
  −
| journal = Statistics & Probability Letters
  −
  −
The probability that a point belongs to a hyperrectangle is analogous to the 1-dimensional case:
  −
  −
The probability that a point belongs to a hyperrectangle is analogous to the 1-dimensional case:
  −
  −
| volume = 81 | issue = 8 | pages = 1179–1182
  −
  −
F_{X_1,X_2}(a, c) + F_{X_1,X_2}(b, d) - F_{X_1,X_2}(a, d) - F_{X_1,X_2}(b, c) = \operatorname{P}(a < X_1 \leq b, c < X_2 \leq d) = \int ...
  −
  −
F_{X_1,X_2}(a, c) + F_{X_1,X_2}(b, d) - F_{X_1,X_2}(a, d) - F_{X_1,X_2}(b, c) = \operatorname{P}(a < X_1 \leq b, c < X_2 \leq d) = \int ...
  −
  −
| year = 2011
  −
  −
| pmid = | pmc = | url = https://hal.archives-ouvertes.fr/hal-00753950/file/PEER_stage2_10.1016%252Fj.spl.2011.03.014.pdf
  −
  −
}}</ref>) of the distribution or of the empirical results.
  −
  −
  −
  −
The generalization of the cumulative distribution function from real to complex random variables is not obvious because expressions of the form  P(Z \leq 1+2i)  make no sense. However, expressions of the form  P(\Re{(Z)} \leq 1, \Im{(Z)} \leq 3)  make sense. Therefore, we define the cumulative distribution of a complex random variable via the joint distribution of its real and imaginary parts:
  −
  −
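The generalization of the cumulative distribution function from real to complex random variables is not obvious because expressions of the form P(Z \leq 1+2i) make no sense. However, expressions of the form P(\Re{(Z)} \leq 1, \Im{(Z)} \leq 3) make sense. Therefore, we define the cumulative distribution of a complex random variable via the joint distribution of its real and imaginary parts: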
  −
  −
===Inverse distribution function (quantile function)===
  −
  −
F_Z(z)=F_{\Re{(Z)},\Im{(Z)}}(\Re{(z)},\Im{(z)})=P(\Re{(Z)} \leq \Re{(z)} , \Im{(Z)} \leq \Im{(z)}) .
  −
  −
F_Z(z)=F_{\Re{(Z)},\Im{(Z)}}(\Re{(z)},\Im{(z)})=P(\Re{(Z)} \leq \Re{(z)} , \Im{(Z)} \leq \Im{(z)}) .
  −
  −
{{main|Quantile function}}
  −
  −
If the CDF ''F'' is strictly increasing and continuous then <math> F^{-1}( p ), p \in [0,1], </math> is the unique real number <math> x </math> such that <math> F(x) = p </math>. In such a case, this defines the '''inverse distribution function''' or [[quantile function]].
  −
  −
  −
  −
Generalization of  yields
  −
  −
Generalization of  yields
  −
  −
Some distributions do not have a unique inverse (for example in the case where <math>f_X(x)=0</math> for all <math>a<x<b</math>, causing <math>F_X</math> to be constant). This problem can be solved by defining, for <math> p \in [0,1] </math>, the '''generalized inverse distribution function''':
  −
  −
F_{\mathbf{Z}}(\mathbf{z}) = F_{\Re{(Z_1)},\Im{(Z_1)}, \ldots, \Re{(Z_n)},\Im{(Z_n)}}(\Re{(z_1)}, \Im{(z_1)},\ldots,\Re{(z_n)}, \Im{(z_n)}) = \operatorname{P}(\Re{(Z_1)} \leq \Re{(z_1)},\Im{(Z_1)} \leq \Im{(z_1)},\ldots,\Re{(Z_n)} \leq \Re{(z_n)},\Im{(Z_n)} \leq \Im{(z_n)})
  −
  −
  −
  −
:<math>
  −
  −
as the definition of the CDF of a complex random vector \mathbf{Z} = (Z_1,\ldots,Z_N)^T.
  −
  −
as the definition of the CDF of a complex random vector \mathbf{Z} = (Z_1,\ldots,Z_N)^T.
  −
  −
F^{-1}(p) = \inf \{x \in \mathbb{R}: F(x) \geq p \}.
  −
  −
</math>
  −
  −
* Example 1: The median is <math>F^{-1}( 0.5 )</math>.
  −
  −
The concept of the cumulative distribution function makes an explicit appearance in statistical analysis in two (similar) ways. Cumulative frequency analysis is the analysis of the frequency of occurrence of values of a phenomenon less than a reference value. The empirical distribution function is a formal direct estimate of the cumulative distribution function for which simple statistical properties can be derived and which can form the basis of various statistical hypothesis tests. Such tests can assess whether there is evidence against a sample of data having arisen from a given distribution, or evidence against two samples of data having arisen from the same (unknown) population distribution.
  −
  −
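The concept of the cumulative distribution function makes an explicit appearance in statistical analysis in two (similar) ways. Cumulative frequency analysis is the analysis of the frequency of occurrence of values of a phenomenon less than a reference value. The empirical distribution function is a formal direct estimate of the cumulative distribution function for which simple statistical properties can be derived and which can form the basis of various statistical hypothesis tests. Such tests can assess whether there is evidence against a sample of data having arisen from a given distribution, or evidence against two samples of data having arisen from the same (unknown) population distribution.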
  −
  −
* Example 2: Put <math> \tau = F^{-1}( 0.95 ) </math>. Then we call <math> \tau </math> the 95th percentile.
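For a discrete distribution the generalized inverse can be computed by accumulating probabilities until the level p is reached. The sketch below assumes a finite, sorted support and restricts p to (0, 1], since the infimum is −∞ at p = 0; the function name and the floating-point tolerance <code>tol</code> are illustrative assumptions.

```python
def generalized_inverse(support, probs, p, tol=1e-12):
    """Return inf{x : F(x) >= p} for a discrete distribution on a sorted support."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    cumulative = 0.0
    for x, prob in zip(support, probs):
        cumulative += prob
        # The tolerance guards against rounding error in the running sum.
        if cumulative >= p - tol:
            return x
    return support[-1]


# A variable taking the values 0 and 1 with equal probability.
support = [0, 1]
probs = [0.5, 0.5]
median = generalized_inverse(support, probs, 0.5)
percentile_95 = generalized_inverse(support, probs, 0.95)
```

Here the CDF is flat between 0 and 1, so an ordinary inverse does not exist, yet the generalized inverse still returns a single well-defined value for each p.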
  −
  −
  −
  −
Some useful properties of the inverse cdf (which are also preserved in the definition of the generalized inverse distribution function) are:
  −
  −
The Kolmogorov–Smirnov test is based on cumulative distribution functions and can be used to test whether two empirical distributions are different or whether an empirical distribution is different from an ideal distribution. The closely related Kuiper's test is useful if the domain of the distribution is cyclic, as in day of the week. For instance, Kuiper's test might be used to see if the number of tornadoes varies during the year or if sales of a product vary by day of the week or day of the month.
  −
  −
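The Kolmogorov–Smirnov test is based on cumulative distribution functions and can be used to test whether two empirical distributions are different or whether an empirical distribution is different from an ideal distribution. The closely related Kuiper's test is useful if the domain of the distribution is cyclic, as in day of the week. For instance, Kuiper's test might be used to see if the number of tornadoes varies during the year or if sales of a product vary by day of the week or day of the month.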
  −
  −
  −
  −
# <math>F^{-1}</math> is nondecreasing
  −
  −
# <math>F^{-1}(F(x)) \leq x</math>
  −
  −
# <math>F(F^{-1}(p)) \geq p</math>
  −
  −
# <math>F^{-1}(p) \leq x</math> if and only if <math>p \leq F(x)</math>
  −
  −
# If <math>Y</math> has a <math>U[0, 1]</math> distribution then <math>F^{-1}(Y)</math> is distributed as <math>F</math>. This is used in [[random number generation]] via the [[inverse transform sampling]] method.
  −
  −
# If <math>\{X_\alpha\}</math> is a collection of independent <math>F</math>-distributed random variables defined on the same sample space, then there exist random variables <math>Y_\alpha</math> such that <math>Y_\alpha</math> is distributed as <math>U[0,1]</math> and <math>F^{-1}(Y_\alpha) = X_\alpha</math> with probability 1 for all <math>\alpha</math>.
  −
  −
  −
  −
The inverse of the cdf can be used to translate results obtained for the uniform distribution to other distributions.
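Inverse transform sampling can be sketched for the exponential distribution, whose CDF 1 − exp(−λx) inverts in closed form to −ln(1 − p)/λ. The function names, the seed, the sample size and the rate value are illustrative assumptions.

```python
import math
import random


def exponential_quantile(p: float, rate: float) -> float:
    """Inverse CDF of the exponential distribution: -ln(1 - p) / rate."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p) / rate


def sample_exponential(n: int, rate: float, seed: int = 0) -> list:
    """Draw n values by applying the inverse CDF to uniform random numbers."""
    rng = random.Random(seed)
    # rng.random() returns values in [0, 1), so 1 - p is never zero.
    return [exponential_quantile(rng.random(), rate) for _ in range(n)]


samples = sample_exponential(100_000, rate=2.0)
sample_mean = sum(samples) / len(samples)   # should be close to 1 / rate
```

The same pattern applies to any distribution whose quantile function can be evaluated, either in closed form or numerically.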
  −
  −
  −
  −
=== Empirical distribution function ===
  −
  −
The [[empirical distribution function]] is an estimate of the cumulative distribution function that generated the points in the sample. It converges with probability 1 to that underlying distribution. A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function{{Citation needed|date=January 2020}}.
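A minimal sketch of the empirical distribution function: for a sample of size n it returns the fraction of observations less than or equal to x. The function name and the example sample are illustrative assumptions.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function x -> (number of observations <= x) / n."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def cdf(x: float) -> float:
        # bisect_right counts the observations that are <= x.
        return bisect_right(ordered, x) / n

    return cdf


F_hat = empirical_cdf([3.0, 1.0, 2.0, 2.0])
value_at_two = F_hat(2.0)   # three of the four observations are <= 2
```

The result is a right-continuous step function that jumps by k/n at each value observed k times.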
  −
  −
  −
  −
==Multivariate case==
  −
  −
===Definition for two random variables===
  −
  −
When dealing simultaneously with more than one random variable the '''joint cumulative distribution function''' can also be defined. For example, for a pair of random variables <math>X,Y</math>, the joint CDF <math>F_{XY}</math> is given by<ref name=KunIlPark/>{{rp|p. 89}}
  −
  −
Category:Functions related to probability distributions
  −
  −
Category: Functions related to probability distributions
  −
  −
<noinclude>
  −
  −
<small>This page was moved from [[wikipedia:en:Cumulative distribution function]]. Its edit history can be viewed at [[概率密度函数/edithistory]]</small></noinclude>
  −
  −
[[Category:待整理页面]]
 