Changes

1,666 bytes removed, 11:06, 27 September 2020 (Sunday)
Line 200: Line 200:
where <math>n</math> and <math>m</math> are the sizes of the first and second samples, respectively. The value of <math>c({\alpha})</math> is given in the table below for the most common levels of <math>\alpha</math>
 
where <math>n</math> and <math>m</math> are the sizes of the first and second samples, respectively. The value of <math>c({\alpha})</math> is given in the table below for the most common levels of <math>\alpha</math>
  −
so that the condition reads
  −
  −
:<math>D_{n,m}>\frac{1}{\sqrt{n}}\cdot\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1 + \tfrac{n}{m}}{2}}.</math>
  −
  −
  −
Here, again, the larger the sample sizes, the more sensitive the minimal bound: For a given ratio of sample sizes (e.g. m=n), the minimal bound scales in the size of either of the samples according to its inverse square root.
  −
  −
Note that the two-sample test checks whether the two data samples come from the same distribution. This does not specify what that common distribution is (e.g. whether it's normal or not normal). Again, tables of critical values have been published. A shortcoming of the Kolmogorov–Smirnov test is that it is not very powerful because it is devised to be sensitive against all possible types of differences between two distribution functions. Evidence has been reported that the Cucconi test, originally proposed for simultaneously comparing location and scale, is much more powerful than the Kolmogorov–Smirnov test when comparing two distribution functions.
  −
  −
  −
While the Kolmogorov–Smirnov test is usually used to test whether a given F(x) is the underlying probability distribution of Fn(x), the procedure may be inverted to give confidence limits on F(x) itself. If one chooses a critical value of the test statistic Dα such that P(Dn&nbsp;>&nbsp;Dα) = α, then a band of width ±Dα around Fn(x) will entirely contain F(x) with probability 1&nbsp;−&nbsp;α.
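The band construction described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `ks_confidence_band` is hypothetical, and the critical value is taken from the large-sample approximation <math>D_\alpha \approx \sqrt{-\ln(\alpha/2)/(2n)}</math>, which is an assumption made here (for small samples, published tables of exact critical values would be the safer choice).

```python
import math
from bisect import bisect_right


def ks_confidence_band(sample, alpha=0.05):
    """Return (d_alpha, band) for a simultaneous confidence band around F_n.

    band is a list of (x, lower, upper) tuples, one per sorted observation.
    d_alpha uses the large-sample approximation sqrt(-ln(alpha/2) / (2n)),
    which is an assumption of this sketch, not an exact critical value.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")

    d_alpha = math.sqrt(-math.log(alpha / 2.0) / (2.0 * n))

    band = []
    for x in xs:
        # Empirical distribution function F_n(x): fraction of observations <= x.
        fn = bisect_right(xs, x) / n
        # Clip to [0, 1], since F(x) is a probability.
        band.append((x, max(0.0, fn - d_alpha), min(1.0, fn + d_alpha)))
    return d_alpha, band


if __name__ == "__main__":
    d, band = ks_confidence_band([2.1, 0.4, 1.7, 3.3, 0.9], alpha=0.05)
    print(f"half-width D_alpha = {d:.4f}")
    for x, lo, hi in band:
        print(f"x = {x:.1f}: F(x) in [{lo:.3f}, {hi:.3f}]")
```

The band is evaluated only at the observed points; between observations, F_n(x) is constant, so the same limits apply until the next observation.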
  −
  −
  −
{| class="wikitable"
  −
  −
|-
  −
  −
A distribution-free multivariate Kolmogorov–Smirnov goodness of fit test has been proposed by Justel, Peña and Zamar (1997).  The test uses a statistic which is built using Rosenblatt's transformation, and an algorithm is developed to compute it in the bivariate case.  An approximate test that can be easily computed in any dimension is also presented.
Line 229: Line 210:
|}
 
|}
    +
and in general by
    +
:<math>c\left(\alpha\right)=\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1}{2}},</math>
   −
(see also Gosset
     −
and in general<ref>Eq. (15) in Section 3.3.1 of Knuth, D.E., The Art of Computer Programming, Volume 2 (Seminumerical Algorithms), 3rd Edition, Addison Wesley, Reading Mass, 1998.</ref> by
+
so that the condition reads
   −
for a 3D version)
+
:<math>D_{n,m}>\frac{1}{\sqrt{n}}\cdot\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1 + \tfrac{n}{m}}{2}}.</math>
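As a rough numerical illustration of the condition above, the following Python sketch evaluates <math>c(\alpha)</math> and the resulting rejection threshold for <math>D_{n,m}</math>. The function names `c_alpha` and `ks_two_sample_threshold` are hypothetical, chosen only for this example, and the threshold is the large-sample approximation given by the formula rather than an exact critical value.

```python
import math


def c_alpha(alpha: float) -> float:
    """c(alpha) = sqrt(-ln(alpha / 2) * 1/2), as in the general formula."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.sqrt(-math.log(alpha / 2.0) / 2.0)


def ks_two_sample_threshold(n: int, m: int, alpha: float) -> float:
    """Threshold that D_{n,m} must exceed to reject at level alpha.

    Algebraically equal to (1/sqrt(n)) * sqrt(-ln(alpha/2) * (1 + n/m) / 2),
    rewritten as c(alpha) * sqrt((n + m) / (n * m)).
    """
    if n <= 0 or m <= 0:
        raise ValueError("sample sizes must be positive")
    return c_alpha(alpha) * math.sqrt((n + m) / (n * m))


if __name__ == "__main__":
    for size in (25, 100, 400):
        t = ks_two_sample_threshold(size, size, alpha=0.05)
        print(f"n = m = {size}: reject if D > {t:.4f}")
```

With n = m, quadrupling both sample sizes halves the threshold, which is the inverse-square-root scaling in the sample size.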
   −
and another to Fasano and Franceschini (see Lopes et al. for a comparison and computational details). Critical values for the test statistic can be obtained by simulations, but depend on the dependence structure in the joint distribution.
     −
:<math>c\left(\alpha\right)=\sqrt{-\ln\left(\tfrac{\alpha}{2}\right)\cdot \tfrac{1}{2}},</math>
+
Here, again, the larger the sample sizes, the more sensitive the minimal bound: For a given ratio of sample sizes (e.g. m=n), the minimal bound scales in the size of either of the samples according to its inverse square root.
   −
In one dimension, the Kolmogorov–Smirnov statistic is identical to the so-called star discrepancy D, so another native KS extension to higher dimensions would be simply to use D also for higher dimensions. Unfortunately, the star discrepancy is hard to calculate in high dimensions.
+
Note that the two-sample test checks whether the two data samples come from the same distribution. This does not specify what that common distribution is (e.g. whether it's normal or not normal). Again, tables of critical values have been published. A shortcoming of the Kolmogorov–Smirnov test is that it is not very powerful because it is devised to be sensitive against all possible types of differences between two distribution functions. Evidence has been reported that the Cucconi test, originally proposed for simultaneously comparing location and scale, is much more powerful than the Kolmogorov–Smirnov test when comparing two distribution functions.
 
  −
 
  −
The Kolmogorov–Smirnov test (in its one-sample or two-sample form, which verifies the equality of distributions) is implemented in many software programs:
      
==Setting confidence limits for the shape of a distribution function==
 
==Setting confidence limits for the shape of a distribution function==