*The [[Student's t-distribution|t-distribution]].
*The skew lognormal cascade distribution.<ref>{{cite web | author=Stephen Lihn | title=Skew Lognormal Cascade Distribution | year=2009 | url=http://www.skew-lognormal-cascade-distribution.org/ | access-date=2009-06-12 | archive-url=https://web.archive.org/web/20140407075213/http://www.skew-lognormal-cascade-distribution.org/ | archive-date=2014-04-07 | url-status=dead }}</ref>

== Relationship to fat-tailed distributions ==
A [[fat-tailed distribution]] is a distribution for which the probability density function, for large <math>x</math>, goes to zero as a power <math>x^{-a}</math>. Since such a power always eventually exceeds the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero more slowly than an exponential function (meaning they are heavy-tailed) but faster than any power (meaning they are not fat-tailed). An example is the [[log-normal distribution]]{{Contradict-inline|article=fat-tailed distribution|reason=Fat-tailed page says log-normals are in fact fat-tailed.|date=June 2019}}. Many other heavy-tailed distributions, such as the [[log-logistic distribution|log-logistic]] and [[Pareto distribution|Pareto]] distributions, are, however, also fat-tailed.
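This ordering of tails can be checked numerically. The following minimal Python sketch (the parameter choices are illustrative and not from any cited source) compares the survival functions of the three tail types at large arguments:

<syntaxhighlight lang="python">
# Illustrative comparison (parameter choices are arbitrary): survival
# functions P(X > x) of an exponential, a log-normal and a Pareto
# (power-law) distribution at increasingly large arguments.
import numpy as np
from scipy import stats

for x in [20.0, 50.0, 100.0]:
    exp_tail = stats.expon.sf(x)                # decays like e^{-x}
    lognorm_tail = stats.lognorm.sf(x, s=1.0)   # heavier than any exponential
    pareto_tail = stats.pareto.sf(x, b=2.0)     # decays like x^{-2} (fat tail)
    print(f"x={x:6.1f}  exponential={exp_tail:.3e}  "
          f"log-normal={lognorm_tail:.3e}  Pareto={pareto_tail:.3e}")
</syntaxhighlight>

At these arguments the exponential tail is by far the smallest and the Pareto tail the largest, with the log-normal in between, matching the classification above.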
== Estimating the tail-index{{definition|date=January 2018}} ==
There are parametric (see Embrechts et al.<ref name="Embrechts"/>) and non-parametric (see, e.g., Novak<ref name="Novak2011">{{cite book | author=Novak S.Y. | title=Extreme Value Methods with Applications to Finance | year=2011 | location=London | publisher=CRC | isbn=978-1-43983-574-6}}</ref>) approaches to the problem of tail-index estimation.
To estimate the tail-index using the parametric approach, some authors employ the [[GEV distribution]] or the [[Pareto distribution]]; they may apply the maximum-likelihood estimator (MLE).

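As one illustration of the parametric route, a minimal Python sketch (the threshold and all parameter choices below are our own assumptions, not taken from the cited sources) fits a [[generalized Pareto distribution]] to threshold excesses by maximum likelihood:

<syntaxhighlight lang="python">
# Sketch of the Pareto-type parametric route: fit a generalized Pareto
# distribution (GPD) by maximum likelihood to the excesses over a high
# threshold; the fitted shape parameter is the tail-index estimate.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0         # standard Pareto, xi = 0.5

u = np.quantile(sample, 0.95)                        # high threshold (our choice)
excesses = sample[sample > u] - u
shape, loc, scale = genpareto.fit(excesses, floc=0)  # MLE, location fixed at 0
print(shape)                                         # should be close to 0.5
</syntaxhighlight>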
=== Pickands' tail-index estimator ===
Let <math>(X_n , n \geq 1)</math> be a sequence of independent and identically distributed random variables with distribution function <math>F \in D(H(\xi))</math>, the maximum domain of attraction<ref name=Pickands>{{cite journal|last=Pickands III|first=James|title=Statistical Inference Using Extreme Order Statistics|journal=The Annals of Statistics|date=Jan 1975|volume=3|issue=1|pages=119–131|jstor=2958083|doi=10.1214/aos/1176343003|doi-access=free}}</ref> of the [[generalized extreme value distribution]] <math> H </math>, where <math>\xi \in \mathbb{R}</math>. If <math>\lim_{n\to\infty} k(n) = \infty</math> and <math>\lim_{n\to\infty} \frac{k(n)}{n}= 0</math>, then the ''Pickands'' tail-index estimator is<ref name="Embrechts"/><ref name="Pickands"/>
:<math>
\xi^\text{Pickands}_{(k(n),n)} = \frac{1}{\ln 2} \ln \left( \frac{X_{(n-k(n)+1,n)} - X_{(n-2k(n)+1,n)}}{X_{(n-2k(n)+1,n)} - X_{(n-4k(n)+1,n)}}\right)
</math>
where <math>X_{(i,n)}</math> denotes the <math>i</math>-th [[order statistic]] of <math>X_1,\ldots,X_n</math>. This estimator converges in probability to <math>\xi</math>.
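A minimal Python sketch of this estimator (the function name and the choice of <math>k</math> are ours, not from the cited sources) might look like:

<syntaxhighlight lang="python">
# Minimal sketch of the Pickands estimator given above.
import numpy as np

def pickands_estimator(x, k):
    """Pickands estimate of the extreme-value index xi (needs 4*k <= n)."""
    x = np.sort(np.asarray(x))        # ascending order statistics
    n = len(x)
    if 4 * k > n:
        raise ValueError("need 4*k <= n")
    # X_{(n-k+1,n)} is the k-th largest value, i.e. x[n-k] in 0-based indexing.
    a, b, c = x[n - k], x[n - 2 * k], x[n - 4 * k]
    return np.log((a - b) / (b - c)) / np.log(2)

rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0   # standard Pareto, xi = 0.5
print(pickands_estimator(sample, k=500))       # should be close to 0.5
</syntaxhighlight>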
=== Hill's tail-index estimator ===
Let <math>(X_t , t \geq 1)</math> be a sequence of independent and identically distributed random variables with distribution function <math>F \in D(H(\xi))</math>, the maximum domain of attraction of the [[generalized extreme value distribution]] <math> H </math>, where <math>\xi \in \mathbb{R}</math>. The sample path is <math>\{X_t : 1 \leq t \leq n\}</math>, where <math>n</math> is the sample size. If <math>\{k(n)\}</math> is an intermediate order sequence, i.e. <math>k(n) \in \{1,\ldots,n-1\}</math>, <math>k(n) \to \infty</math> and <math>k(n)/n \to 0</math>, then the Hill tail-index estimator is<ref>Hill B.M. (1975) A simple general approach to inference about the tail of a distribution. Ann. Stat., v. 3, 1163–1174.</ref>
: <math>
\xi^\text{Hill}_{(k(n),n)} = \frac 1 {k(n)} \sum_{i=n-k(n)+1}^n \left( \ln(X_{(i,n)}) - \ln (X_{(n-k(n),n)}) \right),
</math>

where <math>X_{(i,n)}</math> is the <math>i</math>-th [[order statistic]] of <math>X_1, \dots, X_n</math>.
This estimator converges in probability to <math>\xi</math>, and is asymptotically normal provided <math>k(n) \to \infty</math> is restricted based on a higher-order regular variation property.<ref>Hall, P. (1982) On some estimates of an exponent of regular variation. J. R. Stat. Soc. Ser. B., v. 44, 37–42.</ref><ref>Haeusler, E. and J. L. Teugels (1985) On asymptotic normality of Hill's estimator for the exponent of regular variation. Ann. Stat., v. 13, 743–756.</ref> Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences,<ref>Hsing, T. (1991) On tail index estimation using dependent data. Ann. Stat., v. 19, 1547–1569.</ref><ref>Hill, J. (2010) On tail index estimation for dependent, heterogeneous data. Econometric Th., v. 26, 1398–1436.</ref> irrespective of whether <math>X_t</math> is observed, or is a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with dependent errors.<ref>Resnick, S. and Starica, C. (1997). Asymptotic behavior of Hill's estimator for autoregressive data. Comm. Statist. Stochastic Models 13, 703–721.</ref><ref>Ling, S. and Peng, L. (2004). Hill's estimator for the tail index of an ARMA model. J. Statist. Plann. Inference 123, 279–293.</ref><ref>Hill, J. B. (2015). Tail index estimation for a filtered dependent time series. Stat. Sin. 25, 609–630.</ref>
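A minimal Python sketch of this estimator (the function name and the choice of <math>k</math> are ours, not from the cited sources) might look like:

<syntaxhighlight lang="python">
# Minimal sketch of the Hill estimator given above.
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of the extreme-value index xi (valid for xi > 0)."""
    x = np.sort(np.asarray(x))                 # ascending order statistics
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    top = x[n - k:]                            # the k largest observations
    return np.mean(np.log(top) - np.log(x[n - k - 1]))  # threshold X_{(n-k,n)}

rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0   # standard Pareto, xi = 0.5
print(hill_estimator(sample, k=2000))          # should be close to 0.5
</syntaxhighlight>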
=== Ratio estimator of the tail-index ===
The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith.<ref>Goldie C.M., Smith R.L. (1987) Slow variation with remainder: theory and applications. Quart. J. Math. Oxford, v. 38, 45–71.</ref> It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter".

A comparison of Hill-type and RE-type estimators can be found in Novak.<ref name="Novak2011"/>
===Software===
* [http://www.cs.bu.edu/~crovella/aest.html aest], a [[C (programming language)|C]] tool for estimating the heavy-tail index.<ref>{{Cite journal | last1 = Crovella | first1 = M. E. | last2 = Taqqu | first2 = M. S. | title = Estimating the Heavy Tail Index from Scaling Properties | journal = Methodology and Computing in Applied Probability | volume = 1 | pages = 55–79 | year = 1999 | doi = 10.1023/A:1010012224103 | url = http://www.cs.bu.edu/~crovella/paper-archive/aest.ps }}</ref>
==Estimation of heavy-tailed density==
Nonparametric approaches to estimating heavy- and superheavy-tailed probability density functions were given by Markovich.<ref name="Markovich2007">{{cite book | author=Markovich N.M. | title=Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice | year=2007 | location=Chichester | publisher=Wiley | isbn=978-0-470-72359-3}}</ref> These include approaches based on a variable bandwidth and long-tailed kernel estimators; on a preliminary transform of the data to a new random variable at finite or infinite intervals, which is more convenient for the estimation, followed by an inverse transform of the obtained density estimate; and a "piecing-together approach", which provides a parametric model for the tail of the density and a non-parametric model to approximate the mode of the density.

Nonparametric estimators require an appropriate selection of tuning (smoothing) parameters, such as the bandwidth of kernel estimators or the bin width of a histogram. Well-known data-driven methods for such selection are cross-validation and its modifications, and methods based on the minimization of the mean squared error (MSE), its asymptotic form, and their upper bounds.<ref name="WandJon1995">{{cite book | author=Wand M.P., Jones M.C. | title=Kernel Smoothing | year=1995 | location=New York | publisher=Chapman and Hall | isbn=978-0412552700}}</ref> A discrepancy method, which uses well-known nonparametric statistics such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics as a metric in the space of distribution functions (dfs), and quantiles of the latter statistics as a known uncertainty or discrepancy value, can be found in Markovich.<ref name="Markovich2007"/> The bootstrap is another tool to find smoothing parameters, using approximations of the unknown MSE by different schemes of re-sample selection; see, e.g., Hall.<ref name="Hall1992">{{cite book | author=Hall P. | title=The Bootstrap and Edgeworth Expansion | year=1992 | publisher=Springer | isbn=9780387945088}}</ref>
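As one illustration of the transform/retransform idea, a minimal Python sketch (a construction of our own, not Markovich's algorithm) estimates a heavy-tailed density by applying a kernel density estimate to the log-transformed data and mapping the result back:

<syntaxhighlight lang="python">
# Sketch of the transform/retransform idea: estimate a heavy-tailed density
# on (0, inf) by a Gaussian kernel estimate of y = log(x), then map the
# estimate back via the change-of-variables formula f_X(x) = f_Y(log x) / x.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=10_000) + 1.0   # standard Pareto, alpha = 2

kde_log = gaussian_kde(np.log(sample))        # bandwidth via Scott's rule

def density_estimate(x):
    x = np.asarray(x, dtype=float)
    return kde_log(np.log(x)) / x

xs = np.array([1.5, 5.0, 20.0])
print(density_estimate(xs))                   # estimated density
print(2.0 * xs ** -3.0)                       # true density alpha * x^-(alpha+1)
</syntaxhighlight>

The log transform maps the heavy right tail onto a much lighter-tailed variable for which a fixed-bandwidth kernel estimator behaves well.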
==See also==
*[[Leptokurtic distribution]]
*[[Generalized extreme value distribution]]
*[[Outlier]]
*[[Long tail]]
*[[Power law]]
*[[Seven states of randomness]]
*[[Fat-tailed distribution]]
**[[Taleb distribution]] and [[Holy grail distribution]]