Probability density function

From 集智百科 (Swarma Wiki)

File: Boxplot vs PDF.svg
Boxplot and probability density function of a normal distribution N(0, σ²).

File: Visualisation mode median mean.svg
Geometric visualisation of the mode, median and mean of an arbitrary probability density function.[1]


In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) can be interpreted as providing a relative likelihood that the value of the random variable would equal that sample.[2] In other words, while the absolute likelihood for a continuous random variable to take on any particular value is 0 (since there are an infinite set of possible values to begin with), the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would equal one sample compared to the other sample.

In a more precise sense, the PDF is used to specify the probability of the random variable falling within a particular range of values, as opposed to taking on any one value. This probability is given by the integral of this variable's PDF over that range—that is, it is given by the area under the density function but above the horizontal axis and between the lowest and greatest values of the range. The probability density function is nonnegative everywhere, and its integral over the entire space is equal to 1.

The terms "probability distribution function"[3] and "probability function"[4] have also sometimes been used to denote the probability density function. However, this use is not standard among probabilists and statisticians. In other sources, "probability distribution function" may be used when the probability distribution is defined as a function over general sets of values or it may refer to the cumulative distribution function, or it may be a probability mass function (PMF) rather than the density. "Density function" itself is also used for the probability mass function, leading to further confusion.[5] In general though, the PMF is used in the context of discrete random variables (random variables that take values on a countable set), while the PDF is used in the context of continuous random variables.

Example

Suppose bacteria of a certain species typically live 4 to 6 hours. The probability that a bacterium lives exactly 5 hours is equal to zero. A lot of bacteria live for approximately 5 hours, but there is no chance that any given bacterium dies at exactly 5.0000000000... hours. However, the probability that the bacterium dies between 5 hours and 5.01 hours is quantifiable. Suppose the answer is 0.02 (i.e., 2%). Then, the probability that the bacterium dies between 5 hours and 5.001 hours should be about 0.002, since this time interval is one-tenth as long as the previous. The probability that the bacterium dies between 5 hours and 5.0001 hours should be about 0.0002, and so on.

In these three examples, the ratio (probability of dying during an interval) / (duration of the interval) is approximately constant, and equal to 2 per hour (or 2 hour⁻¹). For example, there is 0.02 probability of dying in the 0.01-hour interval between 5 and 5.01 hours, and (0.02 probability / 0.01 hours) = 2 hour⁻¹. This quantity 2 hour⁻¹ is called the probability density for dying at around 5 hours. Therefore, the probability that the bacterium dies at around 5 hours can be written as (2 hour⁻¹) dt. This is the probability that the bacterium dies within an infinitesimal window of time around 5 hours, where dt is the duration of this window. For example, the probability that it lives longer than 5 hours, but shorter than (5 hours + 1 nanosecond), is (2 hour⁻¹) × (1 nanosecond) ≈ 6×10⁻¹³ (using the unit conversion 3.6×10¹² nanoseconds = 1 hour).

There is a probability density function f with f(5 hours) = 2 hour⁻¹. The integral of f over any window of time (not only infinitesimal windows but also large windows) is the probability that the bacterium dies in that window.
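
This proportionality can be checked numerically. The sketch below uses a hypothetical triangular lifetime density (an assumption for illustration; the example only pins down f(5 hours) = 2 hour⁻¹, so the density is chosen to match that value): integrating it over shrinking intervals shows probability/length approaching the density value.

```python
# Hypothetical lifetime density: triangular on [4.5, 5.5] hours, chosen
# (as an illustrative assumption) so that f(5 hours) = 2 hour^-1, the
# density quoted in the text.
def f(t):
    if 4.5 <= t <= 5.0:
        return 4.0 * (t - 4.5)
    if 5.0 < t <= 5.5:
        return 4.0 * (5.5 - t)
    return 0.0

def prob(a, b, n=100_000):
    """P(a <= T <= b) as a midpoint Riemann sum of the density."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Total probability is 1, and the ratio P(interval)/length approaches
# f(5) = 2 per hour as the interval around 5 hours shrinks.
print(prob(4.5, 5.5))                    # ≈ 1.0
for width in (0.01, 0.001, 0.0001):
    print(width, prob(5.0, 5.0 + width) / width)
```

The midpoint Riemann sum stands in for the integral; any quadrature rule would do.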

Absolutely continuous univariate distributions

A probability density function is most commonly associated with absolutely continuous univariate distributions. A random variable [math]\displaystyle{ X }[/math] has density [math]\displaystyle{ f_X }[/math], where [math]\displaystyle{ f_X }[/math] is a non-negative Lebesgue-integrable function, if:

[math]\displaystyle{ \Pr [a \le X \le b] = \int_a^b f_X(x) \, dx . }[/math]

Hence, if [math]\displaystyle{ F_X }[/math] is the cumulative distribution function of [math]\displaystyle{ X }[/math], then:

[math]\displaystyle{ F_X(x) = \int_{-\infty}^x f_X(u) \, du , }[/math]

and (if [math]\displaystyle{ f_X }[/math] is continuous at [math]\displaystyle{ x }[/math])

[math]\displaystyle{ f_X(x) = \frac{d}{dx} F_X(x) . }[/math]

Intuitively, one can think of [math]\displaystyle{ f_X(x) \, dx }[/math] as being the probability of [math]\displaystyle{ X }[/math] falling within the infinitesimal interval [math]\displaystyle{ [x,x+dx] }[/math].
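
As a concrete check of these identities, the sketch below uses the standard normal distribution (an illustrative choice, not mandated by the text): the probability of an interval is a CDF difference, and a finite difference of the CDF recovers the density.

```python
import math

def Phi(x):
    """CDF of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """PDF of the standard normal."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Pr[a <= X <= b] = F_X(b) - F_X(a) = integral of the density over [a, b]
a, b = -1.0, 1.0
print(Phi(b) - Phi(a))        # ≈ 0.6827

# f_X(x) = dF_X/dx: a central finite difference of the CDF recovers the
# density wherever the density is continuous.
x, h = 0.7, 1e-6
print((Phi(x + h) - Phi(x - h)) / (2.0 * h), phi(x))
```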

Formal definition

(This definition may be extended to any probability distribution using the measure-theoretic definition of probability.)

A random variable [math]\displaystyle{ X }[/math] with values in a measurable space [math]\displaystyle{ (\mathcal{X}, \mathcal{A}) }[/math] (usually [math]\displaystyle{ \mathbb{R}^n }[/math] with the Borel sets as measurable subsets) has as probability distribution the pushforward measure [math]\displaystyle{ X_*P }[/math] on [math]\displaystyle{ (\mathcal{X}, \mathcal{A}) }[/math]: the density of [math]\displaystyle{ X }[/math] with respect to a reference measure [math]\displaystyle{ \mu }[/math] on [math]\displaystyle{ (\mathcal{X}, \mathcal{A}) }[/math] is the Radon–Nikodym derivative:

[math]\displaystyle{ f = \frac{d X_*P}{d \mu} . }[/math]

That is, f is any measurable function with the property that:

[math]\displaystyle{ \Pr [X \in A ] = \int_{X^{-1}A} \, d P = \int_A f \, d \mu }[/math]

for any measurable set [math]\displaystyle{ A \in \mathcal{A}. }[/math]

Discussion

In the continuous univariate case above, the reference measure is the Lebesgue measure. The probability mass function of a discrete random variable is the density with respect to the counting measure over the sample space (usually the set of integers, or some subset thereof).

It is not possible to define a density with reference to an arbitrary measure (e.g. one can't choose the counting measure as a reference for a continuous random variable). Furthermore, when it does exist, the density is almost everywhere unique.

Further details

Unlike a probability, a probability density function can take on values greater than one; for example, the uniform distribution on the interval [0, ½] has probability density f(x) = 2 for 0 ≤ x ≤ ½ and f(x) = 0 elsewhere.
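
A short numeric check of this point, with the same uniform density on [0, ½]:

```python
# Uniform density on [0, 1/2]: f takes the value 2, which is fine for a
# density (it is a probability *per unit length*, not a probability),
# as long as the total area under f is 1.
def f(x):
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

n = 200_000
h = 2.0 / n                                   # midpoint rule over [-1, 1]
area = sum(f(-1.0 + (i + 0.5) * h) for i in range(n)) * h
print(f(0.25), area)                          # density 2.0, total area ≈ 1.0
```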

The standard normal distribution has probability density

[math]\displaystyle{ f(x) = \frac{1}{\sqrt{2\pi}}\; e^{-x^2/2}. }[/math]


If a random variable X is given and its distribution admits a probability density function f, then the expected value of X (if the expected value exists) can be calculated as
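
A minimal sketch of this formula, assuming the standard normal density as f (any density with a finite mean would do):

```python
import math

def phi(x):
    """Standard normal density (an illustrative choice of f)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def expectation(f, lo=-8.0, hi=8.0, n=200_000):
    """E[X] = integral of x*f(x) dx, truncated where the tails are negligible."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += x * f(x)
    return total * h

print(expectation(phi))                       # ≈ 0
print(expectation(lambda x: phi(x - 3.0)))    # ≈ 3: density shifted to mean 3
```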

[math]\displaystyle{ \operatorname{E}[X] = \int_{-\infty}^\infty x\,f(x)\,dx. }[/math]


Not every probability distribution has a density function: the distributions of discrete random variables do not; nor does the Cantor distribution, even though it has no discrete component, i.e., does not assign positive probability to any individual point.

A distribution has a density function if and only if its cumulative distribution function F(x) is absolutely continuous. In this case: F is almost everywhere differentiable, and its derivative can be used as probability density:

[math]\displaystyle{ \frac{d}{dx}F(x) = f(x). }[/math]


If a probability distribution admits a density, then the probability of every one-point set {a} is zero; the same holds for finite and countable sets.

Two probability densities f and g represent the same probability distribution precisely if they differ only on a set of Lebesgue measure zero.

In the field of statistical physics, a non-formal reformulation of the relation above between the derivative of the cumulative distribution function and the probability density function is generally used as the definition of the probability density function. This alternate definition is the following:

If dt is an infinitely small number, the probability that X is included within the interval (t, t + dt) is equal to f(t) dt, or:

[math]\displaystyle{ \Pr(t\lt X\lt t+dt) = f(t)\,dt. }[/math]


Link between discrete and continuous distributions

It is possible to represent certain discrete random variables as well as random variables involving both a continuous and a discrete part with a generalized probability density function, by using the Dirac delta function. (This is not possible with a probability density function in the sense defined above; it may be done with a distribution.) For example, consider a binary discrete random variable having the Rademacher distribution—that is, taking −1 or 1 for values, with probability ½ each. The density of probability associated with this variable is:

[math]\displaystyle{ f(t) = \frac{1}{2}(\delta(t+1)+\delta(t-1)). }[/math]

More generally, if a discrete variable can take n different values among real numbers, then the associated probability density function is:

[math]\displaystyle{ f(t) = \sum_{i=1}^np_i\, \delta(t-x_i), }[/math]

where [math]\displaystyle{ x_1,\ldots,x_n }[/math] are the discrete values accessible to the variable and [math]\displaystyle{ p_1,\ldots,p_n }[/math] are the probabilities associated with these values.

This substantially unifies the treatment of discrete and continuous probability distributions. For instance, the above expression allows for determining statistical characteristics of such a discrete variable (such as its mean, its variance and its kurtosis), starting from the formulas given for a continuous distribution of the probability.
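
A small sketch of this unification for the Rademacher variable above: integrating against the delta-function density turns the continuous formulas for the mean and variance into weighted sums.

```python
# A discrete variable viewed through its generalized density
# f(t) = sum_i p_i * delta(t - x_i): integrals against f reduce to
# weighted sums, so the continuous-distribution formulas apply directly.
xs = [-1.0, 1.0]          # Rademacher values
ps = [0.5, 0.5]           # their probabilities

mean = sum(p * x for p, x in zip(ps, xs))                 # E[X] = sum p_i x_i
var = sum(p * (x - mean) ** 2 for p, x in zip(ps, xs))    # Var[X]
print(mean, var)          # 0.0 and 1.0
```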

Families of densities

It is common for probability density functions (and probability mass functions) to be parametrized—that is, to be characterized by unspecified parameters. For example, the normal distribution is parametrized in terms of the mean and the variance, denoted by [math]\displaystyle{ \mu }[/math] and [math]\displaystyle{ \sigma^2 }[/math] respectively, giving the family of densities


[math]\displaystyle{ f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} e^{ -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2 }. }[/math]

It is important to keep in mind the difference between the domain of a family of densities and the parameters of the family. Different values of the parameters describe different distributions of different random variables on the same sample space (the same set of all possible values of the variable); this sample space is the domain of the family of random variables that this family of distributions describes. A given set of parameters describes a single distribution within the family sharing the functional form of the density. From the perspective of a given distribution, the parameters are constants, and terms in a density function that contain only parameters, but not variables, are part of the normalization factor of a distribution (the multiplicative factor that ensures that the area under the density—the probability of something in the domain occurring— equals 1). This normalization factor is outside the kernel of the distribution.
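
As a sketch, the family can be coded directly; the parameter names below (`mu`, `sigma2`) are illustrative.

```python
import math

# One functional form, many densities: the parameters (mu, sigma2) pick a
# member of the normal family. The factor 1/(sigma*sqrt(2*pi)) is the
# normalization factor; the exponential is the kernel.
def normal_pdf(x, mu, sigma2):
    sigma = math.sqrt(sigma2)
    kernel = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return kernel / (sigma * math.sqrt(2.0 * math.pi))

print(normal_pdf(0.0, 0.0, 1.0))   # standard normal peak, ≈ 0.3989
print(normal_pdf(2.0, 2.0, 4.0))   # peak of N(2, 4): half as tall, ≈ 0.1995
```

Reparametrizing, as the next paragraph notes, is just substituting new values of mu and sigma2 into the same formula.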

Since the parameters are constants, reparametrizing a density in terms of different parameters, to give a characterization of a different random variable in the family, means simply substituting the new parameter values into the formula in place of the old ones. Changing the domain of a probability density, however, is trickier and requires more work: see the section below on change of variables.

Densities associated with multiple variables

For continuous random variables [math]\displaystyle{ X_1, \ldots, X_n }[/math], it is also possible to define a probability density function associated to the set as a whole, often called joint probability density function. This density function is defined as a function of the n variables, such that, for any domain D in the n-dimensional space of the values of the variables [math]\displaystyle{ X_1, \ldots, X_n }[/math], the probability that a realisation of the set variables falls inside the domain D is

[math]\displaystyle{ \Pr \left( X_1,\ldots,X_n \isin D \right) = \int_D f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)\,dx_1 \cdots dx_n. }[/math]
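
This definition can be checked numerically for a simple domain. The sketch below assumes a joint density of two independent standard normals (an illustrative choice) and integrates it over a box approximating the positive quadrant.

```python
import math

def phi2(x, y):
    """Joint density of two independent standard normals (an assumption
    made for illustration)."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def prob_box(x0, x1, y0, y1, n=400):
    """Pr((X, Y) in [x0,x1] x [y0,y1]) by a 2-D midpoint Riemann sum."""
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            total += phi2(x, y)
    return total * hx * hy

# Positive quadrant (truncated at 8, where the tails are negligible):
# by symmetry the probability is 1/4.
print(prob_box(0.0, 8.0, 0.0, 8.0))
```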


If [math]\displaystyle{ F(x_1, \ldots, x_n) = \Pr(X_1 \le x_1, \ldots, X_n \le x_n) }[/math] is the cumulative distribution function of the vector [math]\displaystyle{ (X_1, \ldots, X_n) }[/math], then the joint probability density function can be computed as a partial derivative

[math]\displaystyle{ f(x) = \frac{\partial^n F}{\partial x_1 \cdots \partial x_n} \bigg|_x }[/math]


Marginal densities

For i = 1, 2, ..., n, let [math]\displaystyle{ f_{X_i}(x_i) }[/math] be the probability density function associated with variable [math]\displaystyle{ X_i }[/math] alone. This is called the marginal density function, and can be deduced from the probability density associated with the random variables [math]\displaystyle{ X_1, \ldots, X_n }[/math] by integrating over all values of the other n − 1 variables:

[math]\displaystyle{ f_{X_i}(x_i) = \int f(x_1,\ldots,x_n)\, dx_1 \cdots dx_{i-1}\,dx_{i+1}\cdots dx_n . }[/math]
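
A numeric sketch of marginalization, again assuming (for illustration) a joint density of two independent standard normals; integrating the second variable out recovers the standard normal marginal.

```python
import math

def joint(x, y):
    """Assumed joint density: two independent standard normal coordinates."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def marginal_x(x, lo=-8.0, hi=8.0, n=20_000):
    """f_X(x) = integral of f(x, y) dy: integrate the other variable out."""
    h = (hi - lo) / n
    return sum(joint(x, lo + (j + 0.5) * h) for j in range(n)) * h

# The marginal should match the standard normal density at x = 1.
target = math.exp(-0.5) / math.sqrt(2.0 * math.pi)
print(marginal_x(1.0), target)
```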

Independence

Continuous random variables [math]\displaystyle{ X_1, \ldots, X_n }[/math] admitting a joint density are all independent from each other if and only if

[math]\displaystyle{ f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = f_{X_1}(x_1)\cdots f_{X_n}(x_n). }[/math]
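
The criterion can be tested numerically at a point: compute both marginals from the joint by integration and compare their product with the joint. The joint density below is an assumed example (independent standard normals), so the identity should hold.

```python
import math

def joint(x, y):
    """Assumed joint density: two independent standard normal coordinates."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def integrate(g, lo=-8.0, hi=8.0, n=20_000):
    """Midpoint Riemann sum of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (j + 0.5) * h) for j in range(n)) * h

x, y = 0.5, -0.3
fx = integrate(lambda t: joint(x, t))   # marginal f_X(x): integrate y out
fy = integrate(lambda t: joint(t, y))   # marginal f_Y(y): integrate x out

# Independence criterion: the joint equals the product of the marginals.
print(joint(x, y), fx * fy)
```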

Corollary

If the joint probability density function of a vector of n random variables can be factored into a product of n functions of one variable



[math]\displaystyle{ f_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = f_1(x_1)\cdots f_n(x_n), }[/math]



(where each fi is not necessarily a density) then the n variables in the set are all independent from each other, and the marginal probability density function of each of them is given by



[math]\displaystyle{ f_{X_i}(x_i) = \frac{f_i(x_i)}{\int f_i(x)\,dx}. }[/math]

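A sketch of this normalization (assuming Python with SciPy), taking the unnormalized factor f_i(x) = e^(−x²): dividing by its integral yields a proper marginal density:

```python
import math
from scipy.integrate import quad

def f_i(x):
    # an unnormalized factor, proportional to a normal density
    return math.exp(-x**2)

normalizer, _ = quad(f_i, -math.inf, math.inf)  # equals sqrt(pi)
assert abs(normalizer - math.sqrt(math.pi)) < 1e-8

def marginal(x):
    # f_{X_i}(x_i) = f_i(x_i) divided by the integral of f_i
    return f_i(x) / normalizer

# the normalized factor integrates to 1, as a density must
total, _ = quad(marginal, -math.inf, math.inf)
assert abs(total - 1.0) < 1e-8
```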


Example

This elementary example illustrates the above definition of multidimensional probability density functions in the simple case of a function of a set of two variables. Let us call [math]\displaystyle{ \vec R }[/math] a 2-dimensional random vector of coordinates (X, Y): the probability to obtain [math]\displaystyle{ \vec R }[/math] in the quarter plane of positive x and y is



[math]\displaystyle{ \Pr \left( X \gt 0, Y \gt 0 \right) = \int_0^\infty \int_0^\infty f_{X,Y}(x,y)\,dx\,dy. }[/math]
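The quarter-plane probability can be checked numerically (a sketch assuming Python with SciPy and, for concreteness, two independent standard normals, for which symmetry gives exactly 1/4):

```python
import math
from scipy.integrate import dblquad

def joint(y, x):
    # dblquad integrates the first argument innermost
    return math.exp(-(x**2 + y**2) / 2) / (2 * math.pi)

# Pr(X > 0, Y > 0): integrate over the positive quadrant
prob, _ = dblquad(joint, 0, math.inf, lambda x: 0, lambda x: math.inf)
assert abs(prob - 0.25) < 1e-6
```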


Function of random variables and change of variables in the probability density function

If the probability density function of a random variable (or vector) X is given as f_X(x), it is possible (but often not necessary; see below) to calculate the probability density function of some variable Y = g(X). This is also called a “change of variable” and is in practice used to generate a random variable of arbitrary shape f_{g(X)} = f_Y using a known (for instance, uniform) random number generator.



It is tempting to think that in order to find the expected value E(g(X)), one must first find the probability density f_{g(X)} of the new random variable Y = g(X). However, rather than computing



[math]\displaystyle{ \operatorname E\big(g(X)\big) = \int_{-\infty}^\infty y f_{g(X)}(y)\,dy, }[/math]


one may find instead



[math]\displaystyle{ \operatorname E\big(g(X)\big) = \int_{-\infty}^\infty g(x) f_X(x)\,dx. }[/math]


The values of the two integrals are the same in all cases in which both X and g(X) actually have probability density functions. It is not necessary that g be a one-to-one function. In some cases the latter integral is computed much more easily than the former. See Law of the unconscious statistician.

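A sketch of the two routes (assuming Python with SciPy), for X standard normal and g(x) = x²: the second integral needs only f_X, and here equals E[X²] = Var(X) = 1:

```python
import math
from scipy.integrate import quad

def f_X(x):
    # standard normal density
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def g(x):
    return x**2

# E[g(X)] computed directly against f_X, without deriving f_{g(X)}
expectation, _ = quad(lambda x: g(x) * f_X(x), -math.inf, math.inf)
assert abs(expectation - 1.0) < 1e-8  # E[X^2] = Var(X) = 1
```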


Scalar to scalar

Let [math]\displaystyle{ g:{\mathbb R} \rightarrow {\mathbb R} }[/math] be a monotonic function, then the resulting density function is



[math]\displaystyle{ f_Y(y) =f_X\big(g^{-1}(y)\big) \left| \frac{d}{dy} \big(g^{-1}(y)\big) \right|. }[/math]


Here g−1 denotes the inverse function.



This follows from the fact that the probability contained in a differential area must be invariant under change of variables. That is,



[math]\displaystyle{ \left| f_Y(y)\, dy \right| = \left| f_X(x)\, dx \right|, }[/math]


or

[math]\displaystyle{ f_Y(y) = \left| \frac{dx}{dy} \right| f_X(x) = \left| \frac{d}{dy} (x) \right| f_X(x) = \left| \frac{d}{dy} \big(g^{-1}(y)\big) \right| f_X\big(g^{-1}(y)\big) = {\big|\big(g^{-1}\big)'(y)\big|} \cdot f_X\big(g^{-1}(y)\big) . }[/math]
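To make the monotone case concrete (a sketch assuming Python with SciPy): for X standard normal and g(x) = eˣ, the inverse is g⁻¹(y) = log y with |d g⁻¹/dy| = 1/y, and the formula reproduces the standard log-normal density:

```python
import math
from scipy.stats import norm, lognorm

def f_Y(y):
    # f_Y(y) = f_X(g^{-1}(y)) * |d/dy g^{-1}(y)| with g(x) = exp(x)
    return norm.pdf(math.log(y)) / y

# matches the standard log-normal density (shape parameter s = 1)
assert abs(f_Y(2.0) - lognorm.pdf(2.0, s=1)) < 1e-12
```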


For functions that are not monotonic, the probability density function for y is



[math]\displaystyle{ \sum_{k=1}^{n(y)} \left| \frac{d}{dy} g^{-1}_{k}(y) \right| \cdot f_X\big(g^{-1}_{k}(y)\big), }[/math]


where n(y) is the number of solutions in x for the equation [math]\displaystyle{ g(x)=y }[/math], and [math]\displaystyle{ g_k^{-1}(y) }[/math] are these solutions.

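A sketch of the non-monotone case (assuming Python with SciPy): for X standard normal and g(x) = x², the equation g(x) = y has the n(y) = 2 solutions ±√y, and summing over both branches gives the chi-squared density with one degree of freedom:

```python
import math
from scipy.stats import norm, chi2

def f_Y(y):
    # two inverse branches g_k^{-1}(y) = ±sqrt(y), each with |d/dy| = 1/(2 sqrt(y))
    root = math.sqrt(y)
    return (norm.pdf(root) + norm.pdf(-root)) / (2 * root)

# the square of a standard normal is chi-squared with df = 1
assert abs(f_Y(1.5) - chi2.pdf(1.5, df=1)) < 1e-12
```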


Vector to vector

The above formulas can be generalized to variables (which we will again call y) depending on more than one other variable. f(x_1, ..., x_n) shall denote the probability density function of the variables that y depends on, and the dependence shall be y = g(x_1, ..., x_n). Then, the resulting density function is[citation needed]



[math]\displaystyle{ \int\limits_{y = g(x_1, \ldots, x_n)} \frac{f(x_1,\ldots, x_n)}{\sqrt{\sum_{j=1}^n \frac{\partial g}{\partial x_j}(x_1, \ldots, x_n)^2}} \,dV, }[/math]



where the integral is over the entire (n − 1)-dimensional solution of the subscripted equation and the symbolic dV must be replaced by a parametrization of this solution for a particular calculation; the variables x_1, ..., x_n are then of course functions of this parametrization.



This derives from the following, perhaps more intuitive representation: Suppose x is an n-dimensional random variable with joint density f. If y = H(x), where H is a bijective, differentiable function, then y has density g:



[math]\displaystyle{ g(\mathbf{y}) = f\Big(H^{-1}(\mathbf{y})\Big)\left\vert \det\left[\frac{dH^{-1}(\mathbf{z})}{d\mathbf{z}}\Bigg \vert_{\mathbf{z}=\mathbf{y}}\right]\right \vert }[/math]



with the differential regarded as the Jacobian of the inverse of H(.), evaluated at y.[6]



For example, in the 2-dimensional case x = (x_1, x_2), suppose the transform H is given as y_1 = H_1(x_1, x_2), y_2 = H_2(x_1, x_2) with inverses x_1 = H_1^{-1}(y_1, y_2), x_2 = H_2^{-1}(y_1, y_2). The joint distribution for y = (y_1, y_2) has density[7]



[math]\displaystyle{ g(y_1,y_2) = f_{X_1,X_2}\big(H_1^{-1}(y_1,y_2), H_2^{-1}(y_1,y_2)\big) \left\vert \frac{\partial H_1^{-1}}{\partial y_1} \frac{\partial H_2^{-1}}{\partial y_2} - \frac{\partial H_1^{-1}}{\partial y_2} \frac{\partial H_2^{-1}}{\partial y_1} \right\vert. }[/math]



Vector to scalar

Let [math]\displaystyle{ V:{\mathbb R}^n \rightarrow {\mathbb R} }[/math] be a differentiable function and [math]\displaystyle{ X }[/math] be a random vector taking values in [math]\displaystyle{ {\mathbb R}^n }[/math], [math]\displaystyle{ f_X(\cdot) }[/math] be the probability density function of [math]\displaystyle{ X }[/math] and [math]\displaystyle{ \delta(\cdot) }[/math] be the Dirac delta function. It is possible to use the formulas above to determine [math]\displaystyle{ f_Y(\cdot) }[/math], the probability density function of [math]\displaystyle{ Y=V(X) }[/math], which will be given by



[math]\displaystyle{ f_Y(y) = \int_{{\mathbb R}^n} f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \,d \mathbf{x}. }[/math]



This result leads to the Law of the unconscious statistician:



[math]\displaystyle{ \operatorname{E}_Y[Y]=\int_{{\mathbb R}} y f_Y(y) dy = \int_{{\mathbb R}} y \int_{{\mathbb R}^n} f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \,d \mathbf{x} dy = \int_{{\mathbb R}^n} \int_{{\mathbb R}} y f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) \, dy d \mathbf{x}= \int_{{\mathbb R}^n} V(\mathbf{x}) f_{X}(\mathbf{x}) d \mathbf{x}=\operatorname{E}_X[V(X)]. }[/math]



Proof:



Let [math]\displaystyle{ Z }[/math] be a collapsed random variable with probability density function [math]\displaystyle{ p_Z(z)=\delta(z) }[/math] (i.e. a constant equal to zero). Let the random vector [math]\displaystyle{ \tilde{X} }[/math] and the transform [math]\displaystyle{ H }[/math] be defined as


[math]\displaystyle{ H(Z,X)=\begin{bmatrix} Z+V(X)\\ X\end{bmatrix}=\begin{bmatrix} Y\\ \tilde{X}\end{bmatrix} }[/math].



It is clear that [math]\displaystyle{ H }[/math] is a bijective mapping, and the Jacobian of [math]\displaystyle{ H^{-1} }[/math] is given by:



[math]\displaystyle{ \frac{dH^{-1}(y,\tilde{\mathbf{x}})}{dy\,d\tilde{\mathbf{x}}}=\begin{bmatrix} 1 & -\frac{dV(\tilde{\mathbf{x}})}{d\tilde{\mathbf{x}}}\\ \mathbf{0}_{n\times1} & \mathbf{I}_{n\times n} \end{bmatrix} }[/math],


which is an upper triangular matrix with ones on the main diagonal, therefore its determinant is 1. Applying the change of variable theorem from the previous section we obtain that


[math]\displaystyle{ f_{Y,X}(y,x) = f_{X}(\mathbf{x}) \delta\big(y - V(\mathbf{x})\big) }[/math],



which if marginalized over [math]\displaystyle{ x }[/math] leads to the desired probability density function.



Sums of independent random variables


The probability density function of the sum of two independent random variables U and V, each of which has a probability density function, is the convolution of their separate density functions:



[math]\displaystyle{ f_{U+V}(x) = \int_{-\infty}^\infty f_U(y) f_V(x - y)\,dy = \left( f_{U} * f_{V} \right) (x) }[/math]
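A discrete sketch of this convolution (assuming Python with NumPy): summing two Uniform(0, 1) variables yields the triangular density on [0, 2], which a gridded convolution approximates:

```python
import numpy as np

dx = 0.001
x = np.arange(0, 1, dx)
f_U = np.ones_like(x)  # Uniform(0, 1) density on its support
f_V = np.ones_like(x)

# discrete convolution approximates f_{U+V} = f_U * f_V
f_sum = np.convolve(f_U, f_V) * dx
grid = np.arange(len(f_sum)) * dx

# the triangular density peaks at 1 when x = 1, and integrates to 1
assert abs(f_sum[np.searchsorted(grid, 1.0)] - 1.0) < 0.01
assert abs(f_sum.sum() * dx - 1.0) < 0.01
```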


It is possible to generalize the previous relation to a sum of N independent random variables U_1, ..., U_N, with densities f_{U_1}, ..., f_{U_N}:



[math]\displaystyle{ f_{U_1 + \cdots + U_N}(x) = \left( f_{U_1} * \cdots * f_{U_N} \right) (x) }[/math]


This can be derived from a two-way change of variables involving Y=U+V and Z=V, similarly to the example below for the quotient of independent random variables.



Products and quotients of independent random variables


Given two independent random variables U and V, each of which has a probability density function, the density of the product Y = UV and quotient Y=U/V can be computed by a change of variables.



Example: Quotient distribution

To compute the quotient Y = U/V of two independent random variables U and V, define the following transformation:



[math]\displaystyle{ Y=U/V }[/math]


[math]\displaystyle{ Z=V }[/math]



Then, the joint density p(y,z) can be computed by a change of variables from U,V to Y,Z, and Y can be derived by marginalizing out Z from the joint density.



The inverse transformation is



[math]\displaystyle{ U = YZ }[/math]


[math]\displaystyle{ V = Z }[/math]



The Jacobian matrix [math]\displaystyle{ J(U,V\mid Y,Z) }[/math] of this transformation is



[math]\displaystyle{ \begin{vmatrix}
\frac{\partial u}{\partial y} & \frac{\partial u}{\partial z} \\
\frac{\partial v}{\partial y} & \frac{\partial v}{\partial z}
\end{vmatrix}
=
\begin{vmatrix}
z & y \\
0 & 1
\end{vmatrix}
= |z| . }[/math]


Thus:



[math]\displaystyle{ p(y,z) = p(u,v)\,J(u,v\mid y,z) = p(u)\,p(v)\,J(u,v\mid y,z) = p_U(yz)\,p_V(z)\, |z| . }[/math]


And the distribution of Y can be computed by marginalizing out Z:



[math]\displaystyle{ p(y) = \int_{-\infty}^\infty p_U(yz)\,p_V(z)\, |z| \, dz }[/math]



This method crucially requires that the transformation from U,V to Y,Z be bijective. The above transformation meets this because Z can be mapped directly back to V, and for a given V the quotient U/V is monotonic. This is similarly the case for the sum U + V, difference U − V and product UV.



Exactly the same method can be used to compute the distribution of other functions of multiple independent random variables.



Example: Quotient of two standard normals

Given two standard normal variables U and V, the quotient can be computed as follows. First, the variables have the following density functions:



[math]\displaystyle{ p(u) = \frac{1}{\sqrt{2\pi}} e^{-\frac{u^2}{2}} }[/math]


[math]\displaystyle{ p(v) = \frac{1}{\sqrt{2\pi}} e^{-\frac{v^2}{2}} }[/math]



We transform as described above:



[math]\displaystyle{ Y=U/V }[/math]


[math]\displaystyle{ Z=V }[/math]



This leads to:



[math]\displaystyle{ \begin{align}
p(y) &= \int_{-\infty}^\infty p_U(yz)\,p_V(z)\, |z| \, dz \\[5pt]
&= \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} y^2 z^2} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2} |z| \, dz \\[5pt]
&= \int_{-\infty}^\infty \frac{1}{2\pi} e^{-\frac{1}{2}(y^2+1)z^2} |z| \, dz \\[5pt]
&= 2\int_0^\infty \frac{1}{2\pi} e^{-\frac{1}{2}(y^2+1)z^2} z \, dz \\[5pt]
&= \int_0^\infty \frac{1}{\pi} e^{-(y^2+1)u} \, du && u=\tfrac{1}{2}z^2\\[5pt]
&= \left. -\frac{1}{\pi(y^2+1)} e^{-(y^2+1)u}\right]_{u=0}^\infty \\[5pt]
&= \frac{1}{\pi(y^2+1)}
\end{align} }[/math]


This is the density of a standard Cauchy distribution.

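Both the marginalization integral and the final result can be verified numerically (a sketch assuming Python with SciPy):

```python
import math
from scipy.integrate import quad
from scipy.stats import cauchy

def phi(t):
    # standard normal density
    return math.exp(-t**2 / 2) / math.sqrt(2 * math.pi)

def p(y):
    # the density derived above for Y = U/V
    return 1 / (math.pi * (y**2 + 1))

# the marginalization integral reproduces p(y) ...
value, _ = quad(lambda z: phi(0.7 * z) * phi(z) * abs(z), -math.inf, math.inf)
assert abs(value - p(0.7)) < 1e-8

# ... and p(y) is exactly the standard Cauchy density, integrating to 1
assert abs(p(0.7) - cauchy.pdf(0.7)) < 1e-12
total, _ = quad(p, -math.inf, math.inf)
assert abs(total - 1.0) < 1e-6
```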



References

  1. "AP Statistics Review - Density Curves and the Normal Distributions". Archived from the original on 2 April 2015. Retrieved 16 March 2015.
  2. Grinstead, Charles M.; Snell, J. Laurie (2009). "Conditional Probability - Discrete Conditional". Grinstead & Snell's Introduction to Probability. Orange Grove Texts. ISBN 161610046X. https://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter4.pdf. Retrieved 2019-07-25. 
  3. Probability distribution function, PlanetMath. Archived from the original, 2011-08-07.
  4. Probability Function at MathWorld
  5. Ord, J.K. (1972) Families of Frequency Distributions, Griffin. (for example, Table 5.1 and Example 5.4)
  6. Devore, Jay L.; Berk, Kenneth N. (2007). Modern Mathematical Statistics with Applications. Cengage. p. 263. ISBN 0-534-40473-1. https://books.google.com/books?id=3X7Qca6CcfkC&pg=PA263. 
  7. David, Stirzaker (2007-01-01). Elementary Probability. Cambridge University Press. ISBN 0521534283. OCLC 851313783. 


Bibliography

  • Pierre Simon de Laplace (1812). Analytical Theory of Probability. The first major treatise blending calculus with probability theory, originally in French: Théorie Analytique des Probabilités.

  • Andrei Nikolajevich Kolmogorov (1950). Foundations of the Theory of Probability. https://archive.org/details/foundationsofthe00kolm. The modern measure-theoretic foundation of probability theory; the original German version (Grundbegriffe der Wahrscheinlichkeitsrechnung) appeared in 1933.

  • Patrick Billingsley (1979). Probability and Measure. New York, Toronto, London: John Wiley and Sons. ISBN 0-471-00710-2.

  • David Stirzaker (2003). Elementary Probability. ISBN 0-521-42028-8. https://archive.org/details/elementaryprobab0000stir. Chapters 7 to 9 are about continuous variables.





Category:Functions related to probability distributions


Category:Concepts in physics



This page was moved from wikipedia:en:Probability density function. Its edit history can be viewed at 概率密度函数/edithistory