Line 38: |
Line 38: |
| | | |
| == Kolmogorov–Smirnov statistic == | | == Kolmogorov–Smirnov statistic == |
| + | |
| + | The [[empirical distribution function]] ''F''<sub>''n''</sub> for ''n'' [[Independent and identically distributed random variables|independent and identically distributed]] (i.i.d.) ordered observations ''X<sub>i</sub>'' is defined as |
| | | |
| where I_{[-\infty,x]}(X_i) is the indicator function, equal to 1 if X_i \le x and equal to 0 otherwise. | | where I_{[-\infty,x]}(X_i) is the indicator function, equal to 1 if X_i \le x and equal to 0 otherwise. |
− |
| |
− | The [[empirical distribution function]] ''F''<sub>''n''</sub> for ''n'' [[Independent and identically distributed random variables|independent and identically distributed]] (i.i.d.) ordered observations ''X<sub>i</sub>'' is defined as
| |
| | | |
| The Kolmogorov–Smirnov statistic for a given cumulative distribution function F(x) is | | The Kolmogorov–Smirnov statistic for a given cumulative distribution function F(x) is |
Line 57: |
Line 57: |
| In practice, the statistic requires a relatively large number of data points (in comparison to other goodness of fit criteria such as the Anderson–Darling test statistic) to properly reject the null hypothesis. | | In practice, the statistic requires a relatively large number of data points (in comparison to other goodness of fit criteria such as the Anderson–Darling test statistic) to properly reject the null hypothesis. |
| | | |
− | :<math>D_n= \sup_x |F_n(x)-F(x)|</math>
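| + | The empirical distribution function ''F''<sub>''n''</sub> for ''n'' independent and identically distributed (i.i.d.) ordered observations ''X<sub>i</sub>'' is defined as |
− | + | :<math>F_n(x)={1 \over n}\sum_{i=1}^n I_{[-\infty,x]}(X_i)</math> |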
− | where sup<sub>''x''</sub> is the [[supremum]] of the set of distances. By the [[Glivenko–Cantelli theorem]], if the sample comes from distribution ''F''(''x''), then ''D''<sub>''n''</sub> converges to 0 [[almost surely]] in the limit when <math>n</math> goes to infinity. Kolmogorov strengthened this result, by effectively providing the rate of this convergence (see [[Kolmogorov-Smirnov test#Kolmogorov distribution|Kolmogorov distribution]]). [[Donsker's theorem]] provides a yet stronger result.
| |
− | | |
− | The Kolmogorov distribution is the distribution of the random variable
| |
− | | |
− | In practice, the statistic requires a relatively large number of data points (in comparison to other goodness of fit criteria such as the [[Anderson–Darling test]] statistic) to properly reject the null hypothesis.
| |
| | | |
− | K=\sup_{t\in[0,1]}|B(t)|
| + | where <math>I_{[-\infty,x]}(X_i)</math> is the indicator function, equal to 1 if <math>X_i \le x</math> and equal to 0 otherwise. |
| | | |
− | K=\sup_{t\in[0,1]}|B(t)|
| + | The Kolmogorov–Smirnov statistic for a given cumulative distribution function ''F''(''x'') is |
| + | :<math>D_n= \sup_x |F_n(x)-F(x)|</math> |
| | | |
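| + | where sup<sub>''x''</sub> is the supremum of the set of distances. By the Glivenko–Cantelli theorem, if the sample comes from distribution ''F''(''x''), then ''D''<sub>''n''</sub> converges to 0 almost surely as ''n'' goes to infinity. Kolmogorov strengthened this result by effectively providing the rate of this convergence (see Kolmogorov distribution). Donsker's theorem provides a yet stronger result. |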
| | | |
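| + | In practice, the statistic requires a relatively large number of data points (in comparison to other goodness-of-fit criteria such as the Anderson–Darling test statistic) to properly reject the null hypothesis. |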
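As an illustration of the definitions in this section, the following is a minimal sketch of how the statistic could be computed for a sample against a continuous reference CDF. The names `ks_statistic` and `uniform_cdf` are hypothetical and chosen only for this example. The sketch relies on the fact that the empirical distribution function is a step function, so the supremum of the distance is attained at (or just before) one of the sorted observations.

```python
def ks_statistic(sample, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous reference CDF.

    `sample` is any iterable of numbers; `cdf` is a callable F(x).
    Both names are illustrative assumptions, not part of the article.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")

    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # F_n jumps from (i - 1)/n to i/n at the i-th ordered observation,
        # so the largest gap can occur just below or exactly at the jump.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution, used here as an example F."""
    return min(1.0, max(0.0, x))


d3 = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

For the three observations 0.1, 0.4 and 0.7 compared with Uniform(0, 1), the largest gap occurs at the last observation, where ''F''<sub>''n''</sub> = 1 and ''F'' = 0.7, so ''D''<sub>3</sub> = 0.3 up to floating-point rounding.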
| | | |
| ==Kolmogorov distribution== | | ==Kolmogorov distribution== |