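− | Although more sophisticated and robust methods have been proposed, the graphical methods most commonly used to test for a power-law probability distribution by means of a random sample are still [[帕累托双分位图 Pareto quantile-quantile plot(或帕累托Q-Q图)|Pareto quantile-quantile plots]] (Pareto Q-Q plots), mean residual life plots<ref>Beirlant, J., Teugels, J. L., Vynckier, P. (1996a) ''Practical Analysis of Extreme Values'', Leuven: Leuven University Press</ref><ref>Coles, S. (2001) ''An introduction to statistical modeling of extreme values''. Springer-Verlag, London.</ref>, and [https://en.wikipedia.org/wiki/Log%E2%80%93log_plot log-log plots]. Another, more robust graphical method uses bundles of residual quantile functions<ref name="Diaz">Diaz, F. J. (1999) [https://pattern.swarma.org/paper?id=e93579d4-6da7-11ea-8f36-0242ac1a0005 "Identifying Tail Behavior by Means of Residual Quantile Functions"]. Journal of Computational and Graphical Statistics. 8. (493--509)</ref>. (Note that power-law distributions are also called Pareto-type distributions.) Here it is assumed that a random sample has been drawn from a probability distribution and that we want to know whether the tail of the distribution follows a power law (in other words, whether the distribution has a "Pareto tail"). The random sample is referred to below as "the data".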
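| + | Although more sophisticated and robust methods have been proposed, the graphical methods most commonly used to test whether a random sample follows a power-law probability distribution are still [[帕累托双分位图 Pareto quantile-quantile plot(或帕累托Q-Q图)|Pareto quantile-quantile plots]] (Pareto Q-Q plots), mean residual life plots<ref>Beirlant, J., Teugels, J. L., Vynckier, P. (1996a) ''Practical Analysis of Extreme Values'', Leuven: Leuven University Press</ref><ref>Coles, S. (2001) ''An introduction to statistical modeling of extreme values''. Springer-Verlag, London.</ref>, and [https://en.wikipedia.org/wiki/Log%E2%80%93log_plot log-log plots]. Another, more robust graphical method uses bundles of residual quantile functions<ref name="Diaz">Diaz, F. J. (1999) [https://pattern.swarma.org/paper?id=e93579d4-6da7-11ea-8f36-0242ac1a0005 "Identifying Tail Behavior by Means of Residual Quantile Functions"]. Journal of Computational and Graphical Statistics. 8. (493--509)</ref>. (Note that power-law distributions are also called Pareto-type distributions.) Here it is assumed that a random sample has been drawn from a probability distribution and that we want to know whether the tail of the distribution follows a power law (in other words, whether the distribution has a "Pareto tail"). The random sample is referred to below as "the data". |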
| Log-log plots are another way of graphically examining the tail of a distribution using a random sample. The method must be used with caution, because a straight line on a log-log plot is a necessary but not a sufficient condition for a power-law probability distribution: many non-power-law distributions also appear as straight lines on a log-log plot<ref name="Three-Toed Sloth">Three-Toed Sloth (2018) [https://pattern.swarma.org/paper?id=932f3352-6da9-11ea-b3f5-0242ac1a0005 So You Think You Have a Power Law — Well Isn't That Special?].</ref><ref name="Aaron Clauset"></ref>. The method consists of plotting the logarithm of an estimator of the probability that a particular number of the distribution occurs versus the logarithm of that number. Usually, this estimator is the proportion of times that the number occurs in the data set. If the points in the plot tend to converge to a straight line for large values of x, one concludes that the distribution has a power-law tail. [https://doi.org/10.1038/35036627 Examples of the application] of these types of plot have been published<ref name="Jeong">Jeong, H.; Tombor, B.; Albert, R.; Oltvai, Z. N.; Barabási, A.-L. (2000) [https://pattern.swarma.org/paper?id=ac8e1582-6dab-11ea-b7e7-0242ac1a0005 "The large-scale organization of metabolic networks"]. Nature. 407. (651--654)</ref>. A limitation of the method is that it requires large amounts of data to give reliable results. In addition, it is appropriate only for discrete (or grouped) data. |
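As a rough illustration of the log-log construction described above, the following Python sketch computes the plotted points for discrete data. The function names `loglog_points` and `tail_slope` are hypothetical, chosen for this example. The least-squares slope is an assumption added here as a visual aid and is not a method the text prescribes; a straight line remains necessary but not sufficient for a power law.

```python
import math
from collections import Counter


def loglog_points(sample):
    """Return (log x, log p_hat(x)) pairs for positive discrete data.

    p_hat(x) is the proportion of times the value x occurs in the sample.
    """
    counts = Counter(sample)
    n = len(sample)
    return [(math.log(x), math.log(c / n)) for x, c in sorted(counts.items())]


def tail_slope(points, min_log_x=0.0):
    """Least-squares slope through the points with log x >= min_log_x.

    Hypothetical helper: it only summarises how straight the tail looks
    and should not be read as a reliable estimate of the exponent.
    """
    tail = [(u, v) for u, v in points if u >= min_log_x]
    if len(tail) < 2:
        raise ValueError("need at least two points in the tail")
    mean_u = sum(u for u, _ in tail) / len(tail)
    mean_v = sum(v for _, v in tail) / len(tail)
    sxx = sum((u - mean_u) ** 2 for u, _ in tail)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in tail)
    return sxy / sxx
```

The pairs returned by `loglog_points` can be passed to any plotting tool; raising `min_log_x` restricts the fit to larger values of x, where the text expects the points to approach a straight line.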
− | However, another graphical method for testing for power-law probability distributions using random samples has now been proposed. The method consists of plotting a bundle for the log-transformed sample. Originally proposed as a tool for exploring the existence of moments and of the moment generating function using random samples, the bundle methodology is based on [[残差分位函数 RQF|residual quantile functions (RQFs)]], also called residual percentile functions<ref>Arnold, B. C., Brockett, P. L. (1983) [https://pattern.swarma.org/paper?id=4ae90c8a-6dfe-11ea-9588-0242ac1a0005 "When does the βth percentile residual life function determine the distribution?"]. Operations Research. 31. (391--396)</ref><ref>Joe, H., Proschan, F. (1984) [https://pattern.swarma.org/paper?id=718f4dea-6dfe-11ea-a580-0242ac1a0005 "Percentile residual life functions"]. Operations Research. 32. (668--678)</ref><ref>Joe, H. (1985) [https://pattern.swarma.org/paper?id=ececa614-6dfd-11ea-9867-0242ac1a0005 "Characterizations of life distributions from percentile residual lifetimes"]. Annals of the Institute of Statistical Mathematics, Part A. 37. (165--172)</ref><ref name="Csorgo">Csorgo, S., Viharos, L. (1992) [https://pattern.swarma.org/paper?id=a8d9b326-6dfe-11ea-a263-0242ac1a0005 "Confidence bands for percentile residual lifetimes"]. Journal of Statistical Planning and Inference. 30. (327--337)</ref><ref>Schmittlein, D. C., Morrison, D. G. (1981) [https://pattern.swarma.org/paper?id=223f4cca-6e00-11ea-adc5-0242ac1a0005 "The median residual lifetime: A characterization theorem and an application"]. Operations Research. 29. (392--399)</ref><ref>Morrison, D. G., Schmittlein, D. C. (1980) [https://pattern.swarma.org/paper?id=8191c022-6e00-11ea-89f3-0242ac1a0005 "Jobs, strikes, and wars: Probability models for duration"]. Organizational Behavior and Human Performance. 25. (224--251)</ref><ref>Gerchak, Y. (1984) [https://pattern.swarma.org/paper?id=99b1a06e-6e00-11ea-981a-0242ac1a0005 "Decreasing failure rates and related issues in the social sciences"]. Operations Research. 32. (537--546)</ref><ref>The European Physical Journal. 58. (167--173)</ref>, which provide a full characterization of the tail behavior of many well-known probability distributions, including power-law distributions, distributions with other types of heavy tails, and even non-heavy-tailed distributions. Bundle plots do not have the disadvantages of the mean residual life plots, log-log plots and Pareto Q-Q plots mentioned above: they are robust to outliers, allow power laws with small values of <math>\alpha</math> to be identified visually, and do not require large amounts of data. In addition, other types of tail behavior can be identified with this method.
| + | However, another graphical method for testing for power-law probability distributions using random samples has now been found. The method consists of plotting a bundle for the log-transformed sample. Originally proposed as a tool for exploring the existence of moments and of the moment generating function using random samples, the bundle methodology is based on [[残差分位函数 RQF|residual quantile functions (RQFs)]], also called residual percentile functions<ref>Arnold, B. C., Brockett, P. L. (1983) [https://pattern.swarma.org/paper?id=4ae90c8a-6dfe-11ea-9588-0242ac1a0005 "When does the βth percentile residual life function determine the distribution?"]. Operations Research. 31. (391--396)</ref><ref>Joe, H., Proschan, F. (1984) [https://pattern.swarma.org/paper?id=718f4dea-6dfe-11ea-a580-0242ac1a0005 "Percentile residual life functions"]. Operations Research. 32. (668--678)</ref><ref>Joe, H. (1985) [https://pattern.swarma.org/paper?id=ececa614-6dfd-11ea-9867-0242ac1a0005 "Characterizations of life distributions from percentile residual lifetimes"]. Annals of the Institute of Statistical Mathematics, Part A. 37. (165--172)</ref><ref name="Csorgo">Csorgo, S., Viharos, L. (1992) [https://pattern.swarma.org/paper?id=a8d9b326-6dfe-11ea-a263-0242ac1a0005 "Confidence bands for percentile residual lifetimes"]. Journal of Statistical Planning and Inference. 30. (327--337)</ref><ref>Schmittlein, D. C., Morrison, D. G. (1981) [https://pattern.swarma.org/paper?id=223f4cca-6e00-11ea-adc5-0242ac1a0005 "The median residual lifetime: A characterization theorem and an application"]. Operations Research. 29. (392--399)</ref><ref>Morrison, D. G., Schmittlein, D. C. (1980) [https://pattern.swarma.org/paper?id=8191c022-6e00-11ea-89f3-0242ac1a0005 "Jobs, strikes, and wars: Probability models for duration"]. Organizational Behavior and Human Performance. 25. (224--251)</ref><ref>Gerchak, Y. (1984) [https://pattern.swarma.org/paper?id=99b1a06e-6e00-11ea-981a-0242ac1a0005 "Decreasing failure rates and related issues in the social sciences"]. Operations Research. 32. (537--546)</ref><ref>The European Physical Journal. 58. (167--173)</ref>, which provide a full characterization of the tail behavior of many well-known probability distributions, including power-law distributions, distributions with other types of heavy tails, and even non-heavy-tailed distributions. Bundle plots do not have the disadvantages of the mean residual life plots, log-log plots and Pareto Q-Q plots mentioned above: they are robust to outliers, allow power laws with small values of <math>\alpha</math> to be identified visually, and do not require large amounts of data. In addition, other types of tail behavior can be identified with this method. |
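To make the idea of a bundle for the log-transformed sample more concrete, here is a minimal Python sketch. It rests on assumptions not stated in the text: the residual quantile function at threshold t and level p is estimated as the empirical p-quantile of the excesses over t, and the names `residual_quantile` and `rqf_bundle` are hypothetical. It is not the exact procedure of the cited reference. The motivation is that the logarithm of a Pareto variable is exponentially distributed, so its residual quantiles do not depend on t; roughly flat curves at large t are therefore compatible with a power-law tail.

```python
import math


def residual_quantile(values, t, p):
    """Empirical p-quantile of the excesses (v - t) over values with v > t.

    Assumed estimator for the residual quantile function at threshold t.
    """
    excess = sorted(v - t for v in values if v > t)
    if not excess:
        raise ValueError("no observations above threshold t")
    k = min(len(excess) - 1, int(p * len(excess)))
    return excess[k]


def rqf_bundle(sample, thresholds, levels=(0.25, 0.5, 0.75)):
    """Return {p: [R_p(t) for t in thresholds]} for the log-transformed sample.

    Each level p yields one curve; the set of curves is the bundle that
    would be plotted against the thresholds.
    """
    logs = [math.log(x) for x in sample]  # sample values must be positive
    return {p: [residual_quantile(logs, t, p) for t in thresholds]
            for p in levels}
```

For data drawn from a Pareto distribution with exponent <math>\alpha</math>, the curve for level p should stay close to <math>-\ln(1-p)/\alpha</math> across thresholds, while curves that rise or fall with t suggest a different kind of tail.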