Changes

Time series analysis (view source)

Revision as of 11:35, 5 February 2023 (Sun)

5,246 bytes removed, 11:35, 5 February 2023 (Sun)

No edit summary

Line 1: Line 1:

−

−

In [[mathematics]], a '''time series''' is a series of [[data point]]s indexed (or listed or graphed) in time order. A time series is a [[sequence]] taken at successive equally spaced points in time. Thus it is a sequence of [[discrete-time]] data. Examples of time series are heights of ocean [[tides]], counts of [[sunspots]], and the daily closing value of the [[Dow Jones Industrial Average]].

−

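In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. A time series is a sequence taken at successive, equally spaced points in time, so it is a sequence of discrete-time data. Heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average are all examples of time series in practice.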

−

+

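Time series analysis is the analysis of a series of data points indexed (or listed or graphed) in time order. A time series is a sequence taken at successive, equally spaced points in time, so it is a sequence of discrete-time data. Heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average are all examples of time series in practice.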

−

~~A Time series is very frequently plotted via a [[run chart]]~~ (~~which is a temporal [[line chart]]~~). Time series are used in [[statistics]], [[signal processing]], [[pattern recognition]], [[econometrics]], [[mathematical finance]], [[weather forecasting]], [[earthquake prediction]], [[electroencephalography]], [[control engineering]], [[astronomy]], [[communications engineering]], and largely in any domain of applied [[Applied science|science]] and [[engineering]] which involves [[Time|temporal]] measurements.

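Time series are often visualized with a run chart (a temporal line chart). They are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and any domain of science and engineering that involves temporal measurements.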

−

+

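Time series analysis extracts meaningful statistics and other characteristics from time series data. It also covers time series forecasting, which uses a model to predict future values from previously observed values. Although regression analysis is often used to test relationships between one or more different time series, that type of analysis is not usually called "time series analysis". The term refers specifically to relationships between different points in time within a single series. It also covers interrupted time series, that is, the change in a series from before to after an intervention that may affect the underlying variable.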

−

'''Time series ''analysis''''' comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. '''Time series ''forecasting''''' is the use of a [[model (abstract)|model]] to predict future values based on previously observed values. While [[regression analysis]] is often employed in such a way as to test relationships between one or more different time series, this type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. [[Interrupted time series]] analysis is used to detect changes in the evolution of a time series from before to after some intervention which may affect the underlying variable.

−

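Time series analysis comprises methods for extracting meaningful statistics and other characteristics from time series data. Time series forecasting is the use of a model to predict future values based on previously observed values. Although regression analysis is often used to test relationships between one or more different time series, that type of analysis is not usually called "time series analysis". The term refers specifically to relationships between different points in time within a single series. Interrupted time series analysis is used to detect changes in a time series from before to after an intervention that may affect the underlying variable.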

−

Time series data have a natural temporal ordering. This makes time series analysis distinct from [[cross-sectional study|cross-sectional studies]], in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis is also distinct from [[spatial data analysis]] where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A [[stochastic]] model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see [[time reversibility]]).

−

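Time series data have a natural temporal ordering. This distinguishes time series analysis from cross-sectional studies, in which the observations have no natural ordering (for example, explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis also differs from spatial data analysis, where the observations typically relate to geographical locations (for example, accounting for house prices by location as well as by the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time are more closely related than observations further apart. In addition, time series models often make use of the natural one-way ordering of time, so that values for a given period are expressed as deriving in some way from past values rather than from future values (see time reversibility).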

−

Time series analysis can be applied to [[real number|real-valued]], continuous data, [[:wikt:discrete|discrete]] [[Data type#Numeric types|numeric]] data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the [[English language]]<ref name=":0">{{cite book |last1=Lin |first1=Jessica |last2=Keogh |first2=Eamonn |last3=Lonardi |first3=Stefano |last4=Chiu |first4=Bill |chapter=A symbolic representation of time series, with implications for streaming algorithms |title=Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery |pages=2–11 |year=2003 |location=New York |publisher=ACM Press |doi=10.1145/882082.882086|citeseerx=10.1.1.14.5597 |s2cid=6084733 }}</ref>).

+

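Time series data have a natural temporal ordering. This distinguishes time series analysis from cross-sectional studies, in which the observations have no natural ordering (for example, explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order). Time series analysis also differs from spatial data analysis, where the observations typically relate to geographical locations (for example, accounting for house prices by location as well as by the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time are more closely related than observations further apart. In addition, time series models often make use of the natural one-way ordering of time, so that values for a given period are expressed as deriving in some way from past values rather than from future values (see time reversibility).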

−

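~~Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language~~<ref name=":0" />).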

+

Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language<ref name=":0">{{cite book |last1=Lin |first1=Jessica |last2=Keogh |first2=Eamonn |last3=Lonardi |first3=Stefano |last4=Chiu |first4=Bill |chapter=A symbolic representation of time series, with implications for streaming algorithms |title=Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery |pages=2–11 |year=2003 |location=New York |publisher=ACM Press |doi=10.1145/882082.882086|citeseerx=10.1.1.14.5597 |s2cid=6084733 }}</ref>).

+

==Methods for analysis==

−

~~==Methods for analysis==~~

−

Methods for time series analysis may be divided into two classes: [[frequency-domain]] methods and [[time-domain]] methods. The former include [[frequency spectrum#Spectrum analysis|spectral analysis]] and [[wavelet analysis]]; the latter include [[auto-correlation]] and [[cross-correlation]] analysis. In the time domain, correlation and analysis can be made in a filter-like manner using [[scaled correlation]], thereby mitigating the need to operate in the frequency domain.

−

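Methods for time series analysis fall into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include autocorrelation and cross-correlation analysis. In the time domain, correlation and analysis can be carried out in a filter-like manner using scaled correlation, which reduces the need to operate in the frequency domain.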

+

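Methods for time series analysis fall into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include autocorrelation and cross-correlation analysis. In the time domain, correlation and analysis can be carried out in a filter-like manner using scaled correlation.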
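
As an illustration of a time-domain method, the sample autocorrelation of a single series can be sketched as below. This is a minimal sketch, not a reference implementation: the function name `autocorrelation` and the example data `signal` are assumptions made for illustration, and the estimator shown is the common biased one that normalizes by the lag-0 sum of squares.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    # Lag-0 sum of squared deviations, used to normalize every lag.
    variance = sum(d * d for d in deviations)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return covariance / variance


# Hypothetical data: a signal that repeats every 4 time steps.
signal = [0.0, 1.0, 0.0, -1.0] * 6
acf = [autocorrelation(signal, k) for k in range(5)]
```

For a series with period 4, the value at lag 4 is strongly positive and the value at lag 2 is strongly negative, which is how a periodic structure shows up in the time domain.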

+

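In addition, time series analysis techniques may be divided into parametric and non-parametric methods. Parametric approaches assume that the underlying stationary stochastic process has a certain structure that can be described with a small number of parameters (for example, an autoregressive or moving average model). In these approaches, the task of time series analysis is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure.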
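
The contrast between the two families can be sketched on simulated data. The example below is a hedged illustration under stated assumptions: the data come from a first-order autoregressive process with a hypothetical coefficient of 0.6, the parametric estimate is an ordinary least-squares fit of that single coefficient, and the non-parametric estimate is the sample autocovariance. All function names are made up for this sketch.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Simulate x[t] = phi * x[t-1] + noise with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def fit_ar1(x):
    """Parametric: least-squares estimate of phi, assuming an AR(1) structure."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


def sample_autocovariance(x, max_lag):
    """Non-parametric: estimate the autocovariance without assuming a model."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


series = simulate_ar1(phi=0.6, n=5000)          # assumed "true" coefficient
phi_hat = fit_ar1(series)                       # one number describes the process
gamma = sample_autocovariance(series, max_lag=3)  # one number per lag
```

The parametric fit summarizes the process with a single parameter, while the non-parametric estimate returns a separate value for each lag and makes no claim about how those values are related.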

−

Additionally, time series analysis techniques may be divided into [[Parametric estimation|parametric]] and [[Non-parametric statistics|non-parametric]] methods. The [[Parametric estimation|parametric approaches]] assume that the underlying [[stationary process|stationary stochastic process]] has a certain structure which can be described using a small number of parameters (for example, using an [[autoregressive]] or [[moving average model]]). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, [[Non-parametric statistics|non-parametric approaches]] explicitly estimate the [[covariance]] or the [[spectrum]] of the process without assuming that the process has any particular structure.

−

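In addition, time series analysis techniques may be divided into parametric and non-parametric methods. Parametric approaches assume that the underlying stationary stochastic process has a certain structure that can be described with a small number of parameters (for example, an autoregressive or moving average model). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure.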

−

Methods of time series analysis may also be divided into [[Linear regression|linear]] and [[Nonlinear regression|non-linear]], and [[Univariate analysis|univariate]] and [[Multivariate analysis|multivariate]].

−

~~Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate.~~

+

Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate.

−

==~~Panel data~~==

+

==Panel data==

A time series is one type of [[panel data]]. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a [[cross-sectional data]]set). A data set may exhibit characteristics of both panel data and time series data. One way to tell is to ask what makes one data record distinct from the other records. If the answer is the time data field, then this is a time series data set candidate. If determining a unique record requires a time data field and an additional identifier which is unrelated to time (e.g. student ID, stock symbol, country code), then it is a panel data candidate. If the differentiation lies in the non-time identifier, then the data set is a cross-sectional data set candidate.

−

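A time series is one type of panel data, which is the more general class. Panel data is a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional data set). A data set may exhibit characteristics of both panel data and time series data. One way to tell is to ask what makes one data record distinct from the others. If the answer is the time data field, the data set is a time series candidate. If identifying a unique record requires a time data field plus an additional identifier unrelated to time (e.g. student ID, stock symbol, country code), it is a panel data candidate. If the distinction rests on the non-time identifier, the data set is a cross-sectional candidate.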

+

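A time series is one type of panel data, which is the more general class. Panel data is a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional data set). A data set may exhibit characteristics of both panel data and time series data. One way to tell is to ask what makes one data record distinct from the others. If the answer is the time data field, the data set is a time series candidate. If identifying a unique record requires a time data field plus an additional identifier unrelated to time (e.g. student ID, stock symbol, country code), it is a panel data candidate. If the distinction rests on the non-time identifier, the data set is a cross-sectional candidate.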
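
The uniqueness test described above can be sketched as a small heuristic. This is only an illustration: the function `classify_dataset`, the field names `date` and `symbol`, and the sample records are all hypothetical, and the function assumes each record is a dictionary with one time field and one non-time identifier.

```python
def classify_dataset(records, time_field, id_field):
    """Guess the data set type from which field(s) make each record unique."""
    n = len(records)
    times = {r[time_field] for r in records}
    ids = {r[id_field] for r in records}
    pairs = {(r[id_field], r[time_field]) for r in records}
    if len(times) == n:
        # The time field alone identifies every record.
        return "time series candidate"
    if len(ids) == n:
        # The non-time identifier alone identifies every record.
        return "cross-sectional candidate"
    if len(pairs) == n:
        # Both the time field and the identifier are needed.
        return "panel data candidate"
    return "undetermined"


# Hypothetical records for illustration only.
daily_prices = [{"date": "2023-01-01", "symbol": "X"},
                {"date": "2023-01-02", "symbol": "X"}]
survey = [{"date": "2023-01-01", "symbol": "X"},
          {"date": "2023-01-01", "symbol": "Y"}]
panel = daily_prices + [{"date": "2023-01-01", "symbol": "Y"},
                        {"date": "2023-01-02", "symbol": "Y"}]

kinds = [classify_dataset(d, "date", "symbol")
         for d in (daily_prices, survey, panel)]
```

The checks run in the same order as the rule in the text, so a data set whose records are already unique by time is never reported as panel data.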

==Analysis==

TYcl20 (35 edits)