Changes

10,211 bytes removed, 21:20, 16 June 2022 (Thursday)
No edit summary
Line 1: Line 1: −
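This entry was provisionally machine-translated by Caiyun Xiaoyi (彩云小译), 3,681 characters in total, and has not been manually edited or proofread. We apologize for any inconvenience in reading.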
+
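This entry was translated and proofread by a volunteer (李柔静) of the Earth System Science Reading Group entry-curation team and has not been reviewed by experts. We apologize for any inconvenience in reading.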
    
{{Use American English|date = March 2019}}
 
{{Use American English|date = March 2019}}
 
{{short description|Sequence of data points over time}}
 
{{short description|Sequence of data points over time}}
[[File:Random-data-plus-trend-r2.png|thumb|250px|Time series: random data plus trend, with best-fit line and different applied filters|alt=|right]]
+
[[File:Random-data-plus-trend-r2.png|thumb|250px|Time series: random data plus trend, with best-fit line and different applied filters时间序列:随机数据加趋势,带有最佳拟合线和不同的过滤器|right|链接=Special:FilePath/Random-data-plus-trend-r2.png]]
 
In [[mathematics]], a '''time series''' is a series of [[data point]]s indexed (or listed or graphed) in time order.  Most commonly, a time series is a [[sequence]] taken at successive equally spaced points in time. Thus it is a sequence of [[discrete-time]] data. Examples of time series are heights of ocean [[tides]], counts of [[sunspots]], and the daily closing value of the [[Dow Jones Industrial Average]].
 
In [[mathematics]], a '''time series''' is a series of [[data point]]s indexed (or listed or graphed) in time order.  Most commonly, a time series is a [[sequence]] taken at successive equally spaced points in time. Thus it is a sequence of [[discrete-time]] data. Examples of time series are heights of ocean [[tides]], counts of [[sunspots]], and the daily closing value of the [[Dow Jones Industrial Average]].
    +
在数学Mathematics中,时间序列Time series是按时间顺序索引(或列出或绘制)的一系列数据点Data point。最常见的是,时间序列是在连续的等距时间点上的序列Sequence。因此,它是一个离散时间Discrete-time数据的序列。时间序列的例子有海洋潮汐Tides的高度、太阳黑子Sunspot的数量以及道琼斯工业平均指数Dow Jones Industrial Average的每日收盘价。
   −
  −
thumb|250px|Time series: random data plus trend, with best-fit line and different applied filters|alt=|right
  −
In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order.  Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.
  −
  −
拇指 | 250px | 时间序列: 随机数据加趋势,带有最佳拟合线和不同的应用过滤器 | alt = | right 在数学中,时间序列是按时间顺序索引(或列出或绘制)的一系列数据点。最常见的是,时间序列是在相继的等间隔时间点上拍摄的序列。因此,它是一个离散时间数据序列。时间序列的例子有海潮的高度、太阳黑子的数量以及道琼斯工业平均指数的每日收盘价。
      
A time series is very frequently plotted via a [[run chart]] (which is a temporal [[line chart]]). Time series are used in [[statistics]], [[signal processing]], [[pattern recognition]], [[econometrics]], [[mathematical finance]], [[weather forecasting]], [[earthquake prediction]], [[electroencephalography]], [[control engineering]], [[astronomy]], [[communications engineering]], and more generally in any domain of applied [[Applied science|science]] and [[engineering]] that involves [[Time|temporal]] measurements.
 
A time series is very frequently plotted via a [[run chart]] (which is a temporal [[line chart]]). Time series are used in [[statistics]], [[signal processing]], [[pattern recognition]], [[econometrics]], [[mathematical finance]], [[weather forecasting]], [[earthquake prediction]], [[electroencephalography]], [[control engineering]], [[astronomy]], [[communications engineering]], and more generally in any domain of applied [[Applied science|science]] and [[engineering]] that involves [[Time|temporal]] measurements.
   −
A time series is very frequently plotted via a run chart (which is a temporal line chart). Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance, weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy, communications engineering, and more generally in any domain of applied science and engineering that involves temporal measurements.
+
时间序列经常通过趋势图Run chart(时间线图Line chart)来绘制。时间序列被用于统计学Statistics、信号处理Signal processing、模式识别Pattern recognition、计量经济学Econometrics、数理金融学Mathematical finance、天气预报Weather forecasting、地震预测Earthquake prediction、脑电图Electroencephalography、控制工程Control engineering、天文学Astronomy、通信工程Communications engineering,以及涉及时序Temporal测量的任何应用科学Science和工程Engineering领域。
   −
时间序列通常是通过运行图(时间线图)绘制的。时间序列广泛应用于统计学、信号处理、模式识别、计量经济学、数学金融学、天气预报、地震预测、脑电图、控制工程、天文学、通信工程等领域,还广泛应用于涉及时间测量的任何应用科学和工程领域。
      
'''Time series ''analysis''''' comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. '''Time series ''forecasting''''' is the use of a [[model (abstract)|model]] to predict future values based on previously observed values. While [[regression analysis]] is often employed in such a way as to test relationships between one or more different time series, this type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. [[Interrupted time series]] analysis is used to detect changes in the evolution of a time series from before to after some intervention which may affect the underlying variable.
 
'''Time series ''analysis''''' comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. '''Time series ''forecasting''''' is the use of a [[model (abstract)|model]] to predict future values based on previously observed values. While [[regression analysis]] is often employed in such a way as to test relationships between one or more different time series, this type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. [[Interrupted time series]] analysis is used to detect changes in the evolution of a time series from before to after some intervention which may affect the underlying variable.
   −
Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series forecasting is the use of a model to predict future values based on previously observed values. While regression analysis is often employed in such a way as to test relationships between one or more different time series, this type of analysis is not usually called "time series analysis", which refers in particular to relationships between different points in time within a single series. Interrupted time series analysis is used to detect changes in the evolution of a time series from before to after some intervention which may affect the underlying variable.
+
时间序列分析Time series analysis包含了以提取时间序列数据中有意义的统计特征和数据的其他特征为目的的方法。时间序列预测Time series forecasting是基于先前观测到的值,使用模型Model来预测未来值的方法。虽然回归分析Regression analysis经常被用于分析一个或多个不同时间序列之间的关系,但这种类型的分析通常不被称为 "时间序列分析"。时间序列分析特指的是分析单一序列中不同时间点之间的关系。中断时间序列Interrupted time series分析是用来检测时间序列在接受干预前后的变化,这种干预可能会影响基础变量。
   −
时间序列分析包括分析时间序列数据的方法,以提取有意义的统计数据和数据的其他特征。时间序列预测是使用一个模型来预测未来的价值基于以前观察到的价值。虽然回归分析通常用于测试一个或多个不同时间序列之间的关系,但这种类型的分析通常不被称为“时间序列分析”,它特别指的是单个序列中不同时间点之间的关系。中断时间序列分析是用来检测一个时间序列从干预之前到干预之后的变化,这些变化可能会影响到基础变量。
      
Time series data have a natural temporal ordering.  This makes time series analysis distinct from [[cross-sectional study|cross-sectional studies]], in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order).  Time series analysis is also distinct from [[spatial data analysis]] where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A [[stochastic]] model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see [[time reversibility]]).
 
Time series data have a natural temporal ordering.  This makes time series analysis distinct from [[cross-sectional study|cross-sectional studies]], in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order).  Time series analysis is also distinct from [[spatial data analysis]] where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A [[stochastic]] model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see [[time reversibility]]).
   −
Time series data have a natural temporal ordering.  This makes time series analysis distinct from cross-sectional studies, in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their respective education levels, where the individuals' data could be entered in any order).  Time series analysis is also distinct from spatial data analysis where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). A stochastic model for a time series will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values for a given period will be expressed as deriving in some way from past values, rather than from future values (see time reversibility).
+
时间序列数据具有自然的时间排序。这使得时间序列分析有别于截面研究Cross-sectional studies,在截面研究中,观察结果没有自然排序(例如,通过参考各自的教育水平来解释人们的工资,其中个人的数据可以按任何顺序输入)。时间序列分析也有别于空间数据分析Spatial data analysis,后者的观测值通常与地理位置有关(例如,通过地点以及房屋的内在特征来说明房价)。时间序列的随机Stochastic模型通常会反映这样一个事实,即在时间上相距较近的观测值会比相距较远的观测值更密切相关。此外,时间序列模型通常会利用自然的单向时间顺序,以便将给定时间段的值表示为以某种方式从过去的值而不是从未来的值中得出(参见时间可逆性Time reversibility)。
   −
时间序列数据具有自然的时间序列。这使得时间序列分析不同于横断面研究,横断面研究中的观察没有自然的顺序(例如:。通过参照个人的教育程度来解释人们的工资,个人的数据可以按任意顺序输入)。时间序列分析也不同于空间数据分析,因为空间数据分析的观测通常与地理位置有关(例如:。根据房屋的位置和内在特征来计算房价)。一个时间序列的随机模型通常反映了这样一个事实,即在时间上紧密相连的观察比相距较远的观察更密切相关。此外,时间序列模型往往利用时间的自然单向排序,以便某一时期的数值以某种方式从过去的数值而不是从未来的数值得出(见时间可逆性)。
     −
Time series analysis can be applied to [[real number|real-valued]], continuous data, [[:wikt:discrete|discrete]] [[Data type#Numeric types|numeric]] data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the [[English language]]<ref>{{cite book |last1=Lin |first1=Jessica |last2=Keogh |first2=Eamonn |last3=Lonardi |first3=Stefano |last4=Chiu |first4=Bill |chapter=A symbolic representation of time series, with implications for streaming algorithms |title=Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery |pages=2–11 |year=2003 |location=New York |publisher=ACM Press |doi=10.1145/882082.882086|citeseerx=10.1.1.14.5597 |s2cid=6084733 }}</ref>).
+
Time series analysis can be applied to [[real number|real-valued]], continuous data, [[:wikt:discrete|discrete]] [[Data type#Numeric types|numeric]] data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the [[English language]]<ref name=":0">{{cite book |last1=Lin |first1=Jessica |last2=Keogh |first2=Eamonn |last3=Lonardi |first3=Stefano |last4=Chiu |first4=Bill |chapter=A symbolic representation of time series, with implications for streaming algorithms |title=Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery |pages=2–11 |year=2003 |location=New York |publisher=ACM Press |doi=10.1145/882082.882086|citeseerx=10.1.1.14.5597 |s2cid=6084733 }}</ref>).
    
Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language).
 
Time series analysis can be applied to real-valued, continuous data, discrete numeric data, or discrete symbolic data (i.e. sequences of characters, such as letters and words in the English language).
   −
时间序列分析可以应用于实值数据、连续数据、离散数值数据或离散符号数据(即字符序列,例如英语中的字母和单词)。
+
时间序列分析可以应用于实值real-valued、连续数据、离散Discrete数值Numeric数据或离散符号数据(即字符序列,如英语English language中的字母和单词<ref name=":0" />)。
 
  −
==Methods for analysis==
  −
 
  −
==Methods for analysis==
  −
 
  −
= = = 分析方法 = =  
      +
==Methods for analysis分析方法==
 
Methods for time series analysis may be divided into two classes: [[frequency-domain]] methods and [[time-domain]] methods. The former include [[frequency spectrum#Spectrum analysis|spectral analysis]] and [[wavelet analysis]]; the latter include [[auto-correlation]] and [[cross-correlation]] analysis. In the time domain, correlation and analysis can be made in a filter-like manner using [[scaled correlation]], thereby mitigating the need to operate in the frequency domain.
 
Methods for time series analysis may be divided into two classes: [[frequency-domain]] methods and [[time-domain]] methods. The former include [[frequency spectrum#Spectrum analysis|spectral analysis]] and [[wavelet analysis]]; the latter include [[auto-correlation]] and [[cross-correlation]] analysis. In the time domain, correlation and analysis can be made in a filter-like manner using [[scaled correlation]], thereby mitigating the need to operate in the frequency domain.
   −
Methods for time series analysis may be divided into two classes: frequency-domain methods and time-domain methods. The former include spectral analysis and wavelet analysis; the latter include auto-correlation and cross-correlation analysis. In the time domain, correlation and analysis can be made in a filter-like manner using scaled correlation, thereby mitigating the need to operate in the frequency domain.
+
时间序列分析的方法可分为两类:频域Frequency-domain方法和时域Time-domain方法。前者包括频谱分析Spectral analysis和小波分析Wavelet analysis;后者包括自相关Auto-correlation和交叉相关Cross-correlation analysis分析。在时域中,可以用类似于滤波器的方式使用标度相关性Scaled correlation来进行关联和分析,从而减轻了在频域中操作的需要。
   −
时间序列分析方法可分为两类: 频域方法和时域方法。前者包括谱分析和小波分析,后者包括自相关和互相关分析。在时域上,可以使用比例相关以类似滤波器的方式进行相关和分析,从而减少了在频域进行操作的需要。
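
The two classes of methods named above can be contrasted on a small example. The following is a minimal sketch using only the Python standard library: `autocorrelation` is a time-domain statistic and `periodogram` is a simple frequency-domain estimate computed with a direct discrete Fourier transform. The function names and the synthetic 12-sample cycle are illustrative assumptions, not part of the article, and scaled correlation is not shown.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at a non-negative integer lag (time domain)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def periodogram(x):
    """Return (frequency, power) pairs for k = 1 .. n//2 (frequency domain).

    Uses a direct O(n^2) DFT of the mean-removed series; frequency is in
    cycles per sample.
    """
    n = len(x)
    mean = sum(x) / n
    result = []
    for k in range(1, n // 2 + 1):
        re = sum((x[t] - mean) * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum((x[t] - mean) * math.sin(2 * math.pi * k * t / n) for t in range(n))
        result.append((k / n, (re * re + im * im) / n))
    return result


# Hypothetical data: a pure cycle that repeats every 12 samples.
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]

acf_full_cycle = autocorrelation(series, 12)  # strongly positive
acf_half_cycle = autocorrelation(series, 6)   # strongly negative
peak_frequency = max(periodogram(series), key=lambda fp: fp[1])[0]
peak_period = 1 / peak_frequency              # dominant cycle length in samples
```

Both views recover the same cycle: the autocorrelation is high at a lag of one full period and negative at half a period, while the periodogram peaks at the frequency whose reciprocal is the period.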
      
Additionally, time series analysis techniques may be divided into [[Parametric estimation|parametric]] and [[Non-parametric statistics|non-parametric]] methods. The [[Parametric estimation|parametric approaches]] assume that the underlying [[stationary process|stationary stochastic process]] has a certain structure which can be described using a small number of parameters (for example, using an [[autoregressive]] or [[moving average model]]). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, [[Non-parametric statistics|non-parametric approaches]] explicitly estimate the [[covariance]] or the [[spectrum]] of the process without assuming that the process has any particular structure.
 
Additionally, time series analysis techniques may be divided into [[Parametric estimation|parametric]] and [[Non-parametric statistics|non-parametric]] methods. The [[Parametric estimation|parametric approaches]] assume that the underlying [[stationary process|stationary stochastic process]] has a certain structure which can be described using a small number of parameters (for example, using an [[autoregressive]] or [[moving average model]]). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, [[Non-parametric statistics|non-parametric approaches]] explicitly estimate the [[covariance]] or the [[spectrum]] of the process without assuming that the process has any particular structure.
   −
Additionally, time series analysis techniques may be divided into parametric and non-parametric methods. The parametric approaches assume that the underlying stationary stochastic process has a certain structure which can be described using a small number of parameters (for example, using an autoregressive or moving average model). In these approaches, the task is to estimate the parameters of the model that describes the stochastic process. By contrast, non-parametric approaches explicitly estimate the covariance or the spectrum of the process without assuming that the process has any particular structure.
+
此外,时间序列分析技术可分为参数化Parametric和非参数化Non-parametric方法。参数方法Parametric approaches假定基础的平稳随机过程Stationary stochastic process具有某种结构,可以用少量的参数来描述(例如,使用自回归Autoregressive或移动平均模型Moving average model)。在这些方法中,任务是估计描述随机过程的模型的参数。相比之下,非参数方法Non-parametric approaches明确地估计过程的协方差Covariance或频谱Spectrum,而不假设过程有任何特定的结构。
   −
此外,时间序列分析技术可分为参数方法和非参数方法。参数方法假设潜在的平稳随机过程有一个特定的结构,这个结构可以用少量的参数来描述(例如,使用自回归或移动平均模型)。在这些方法中,任务是估计描述随机过程的模型的参数。相比之下,非参数方法明确地估计过程的协方差或谱,而不假设过程具有任何特定的结构。
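
As a rough illustration of the parametric versus non-parametric distinction, the sketch below simulates a first-order autoregressive process and then treats it both ways. The parametric route assumes the AR(1) structure and estimates its single coefficient with the Yule-Walker relation; the non-parametric route simply reports sample autocovariances without assuming any model. The names `fit_ar1` and `sample_autocovariance`, the coefficient 0.7, and the series length are assumptions chosen for the example.

```python
import random


def sample_autocovariance(x, lag):
    """Non-parametric estimate of the autocovariance at a given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def fit_ar1(x):
    """Parametric estimate: Yule-Walker coefficient phi for x[t] = phi * x[t-1] + noise."""
    return sample_autocovariance(x, 1) / sample_autocovariance(x, 0)


# Simulate a hypothetical stationary AR(1) process with known coefficient.
random.seed(0)
true_phi = 0.7
x = [0.0]
for _ in range(4999):
    x.append(true_phi * x[-1] + random.gauss(0.0, 1.0))

phi_hat = fit_ar1(x)                                     # one parameter summarizes the process
acov = [sample_autocovariance(x, k) for k in range(4)]   # model-free description
```

The parametric estimate compresses the dependence structure into one number, whereas the autocovariance list makes no structural assumption but needs one value per lag.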
      
Methods of time series analysis may also be divided into [[Linear regression|linear]] and [[Nonlinear regression|non-linear]], and [[Univariate analysis|univariate]] and [[Multivariate analysis|multivariate]].
 
Methods of time series analysis may also be divided into [[Linear regression|linear]] and [[Nonlinear regression|non-linear]], and [[Univariate analysis|univariate]] and [[Multivariate analysis|multivariate]].
   −
Methods of time series analysis may also be divided into linear and non-linear, and univariate and multivariate.
+
时间序列分析的方法也可以分为线性Linear 和非线性Non-linear,以及单变量Univariate 和多变量Multivariate。
 
  −
时间序列分析的方法也可分为线性和非线性,以及单变量和多变量。
  −
 
  −
==Panel data==
  −
 
  −
==Panel data==
  −
 
  −
= = = 面板数据 = =
      +
==Panel data面板数据==
 
A time series is one type of [[panel data]]. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a [[cross-sectional data]]set).  A data set may exhibit characteristics of both panel data and time series data.  One way to tell is to ask what makes one data record distinct from the other records.  If the answer is the time data field, then this is a time series data set candidate.  If determining a unique record requires a time data field and an additional identifier which is unrelated to time (e.g. student ID, stock symbol, country code), then it is a panel data candidate.  If the distinction lies in the non-time identifier, then the data set is a cross-sectional data set candidate.
 
A time series is one type of [[panel data]]. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a [[cross-sectional data]]set).  A data set may exhibit characteristics of both panel data and time series data.  One way to tell is to ask what makes one data record distinct from the other records.  If the answer is the time data field, then this is a time series data set candidate.  If determining a unique record requires a time data field and an additional identifier which is unrelated to time (e.g. student ID, stock symbol, country code), then it is a panel data candidate.  If the distinction lies in the non-time identifier, then the data set is a cross-sectional data set candidate.
   −
A time series is one type of panel data. Panel data is the general class, a multidimensional data set, whereas a time series data set is a one-dimensional panel (as is a cross-sectional dataset).  A data set may exhibit characteristics of both panel data and time series data.  One way to tell is to ask what makes one data record distinct from the other records.  If the answer is the time data field, then this is a time series data set candidate.  If determining a unique record requires a time data field and an additional identifier which is unrelated to time (e.g. student ID, stock symbol, country code), then it is a panel data candidate.  If the distinction lies in the non-time identifier, then the data set is a cross-sectional data set candidate.
+
时间序列是面板数据Panel data的一种类型,面板数据是更大的类别。面板数据是一个多维的数据集,而时间序列数据集是一个一维的面板(正如截面数据Cross-sectional data集一样)。一个数据集可能同时表现出面板数据和时间序列数据的特征。判断的方法之一是探究是什么使一条数据记录与其他记录不同。如果答案是时间数据字段,那么这就是一个时间序列数据集候选。如果确定一个独特的记录需要一个时间数据字段和一个与时间无关的额外标识符(如学生证、股票代码、国家代码),那么它就是面板数据的候选。如果区别在于非时间标识符,那么该数据集就是一个截面数据集候选。
 
  −
时间序列是面板数据的一种。Panel data 是一个通用类,是一个多维数据集,而时间序列数据集是一个一维面板(就像横截面数据集一样)。一个数据集可以同时显示面板数据和时间序列数据的特征。判断的一种方法是询问是什么使一个数据记录与其他记录相比是唯一的。如果答案是时间数据字段,那么这是一个候选的时间序列数据集。如果确定一个唯一的记录需要一个时间数据字段和一个与时间无关的附加标识符(例如:。学生证,股票代码,国家代码) ,然后是面板数据候选人。如果区分取决于非时间标识符,那么数据集是一个横断面数据集候选者。
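
The uniqueness test described above can be sketched as a small helper. This is only a heuristic under stated assumptions: records are dictionaries, `time_field` and `id_field` are hypothetical field names supplied by the caller, and the function `classify_dataset` is invented for illustration. It returns a candidate label, in keeping with the hedged wording of the text.

```python
def classify_dataset(records, time_field, id_field):
    """Guess the data set type from which field(s) make each record unique."""
    n = len(records)
    times = {r[time_field] for r in records}
    ids = {r[id_field] for r in records}
    pairs = {(r[id_field], r[time_field]) for r in records}
    if len(times) == n:
        # The time field alone distinguishes every record.
        return "time series candidate"
    if len(ids) == n:
        # The non-time identifier alone distinguishes every record.
        return "cross-sectional candidate"
    if len(pairs) == n:
        # Both the time field and the identifier are needed.
        return "panel data candidate"
    return "undetermined"


# Hypothetical records keyed by date and stock symbol.
one_stock = [
    {"date": "2022-06-14", "symbol": "AAA", "close": 10.0},
    {"date": "2022-06-15", "symbol": "AAA", "close": 10.5},
]
one_day = [
    {"date": "2022-06-15", "symbol": "AAA", "close": 10.5},
    {"date": "2022-06-15", "symbol": "BBB", "close": 20.0},
]
many_stocks_many_days = one_stock + [
    {"date": "2022-06-14", "symbol": "BBB", "close": 19.5},
    {"date": "2022-06-15", "symbol": "BBB", "close": 20.0},
]

labels = [
    classify_dataset(one_stock, "date", "symbol"),
    classify_dataset(one_day, "date", "symbol"),
    classify_dataset(many_stocks_many_days, "date", "symbol"),
]
```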
  −
 
  −
==Analysis==
  −
 
  −
==Analysis==
  −
 
  −
= = 分析 = =
      +
==Analysis分析==
 
Several types of motivation and data analysis are available for time series, each appropriate for different purposes.
 
Several types of motivation and data analysis are available for time series, each appropriate for different purposes.
   −
Several types of motivation and data analysis are available for time series, each appropriate for different purposes.
+
对于具有不同目的的时间序列,适用的动机和数据分析方法都不同。
 
  −
有几种类型的动机和数据分析可用于时间序列是适合不同的目的。
  −
 
  −
===Motivation===
  −
 
  −
===Motivation===
  −
 
  −
= = = 动机 = =
  −
 
  −
In the context of [[statistics]], [[econometrics]], [[quantitative finance]], [[seismology]], [[meteorology]], and [[geophysics]] the primary goal of time series analysis is [[forecasting]]. In the context of [[signal processing]], [[control engineering]] and [[communication engineering]] it is used for signal detection. Other applications are in [[data mining]], [[pattern recognition]] and [[machine learning]], where time series analysis can be used for [[cluster analysis|clustering]],<ref>{{cite journal | last1 = Liao | first1 = T. Warren | title = Clustering of time series data - a survey | journal = Pattern Recognition | volume = 38 | issue = 11 | pages = 1857–1874 | publisher = Elsevier | date = 2005 | language = en | doi = 10.1016/j.patcog.2005.01.025| bibcode = 2005PatRe..38.1857W }}{{subscription required|via=ScienceDirect }}</ref><ref>{{cite journal | last1 = Aghabozorgi | first1 = Saeed | last2 = Shirkhorshidi | first2 = Ali S. | last3 = Wah | first3 = Teh Y. | title = Time-series clustering – A decade review | journal = Information Systems | volume = 53 | pages = 16–38 | publisher = Elsevier | date = 2015 | language = en | doi = 10.1016/j.is.2015.04.007}}{{subscription required|via=ScienceDirect }}</ref> [[Statistical classification|classification]],<ref>{{cite journal | last1 = Keogh | first1 = Eamonn J. 
| title = On the need for time series data mining benchmarks | journal = Data Mining and Knowledge Discovery | volume = 7 | pages = 349–371 | publisher = Kluwer | date = 2003 | language = en | doi = 10.1145/775047.775062| isbn = 158113567X | s2cid = 41617550 }}{{subscription required|via=ACM Digital Library }}</ref> query by content,<ref>{{cite conference|last1=Agrawal|first1=Rakesh|last2=Faloutsos|first2=Christos|last3=Swami|first3=Arun|date=October 1993|title=Efficient Similarity Search In Sequence Databases|conference=International Conference on Foundations of Data Organization and Algorithms|volume=730|pages=69–84|book-title=Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms|doi=10.1007/3-540-57301-1_5}}{{Subscription required|via=SpringerLink}}</ref> [[anomaly detection]] as well as [[forecasting]].<ref>{{cite journal|last1=Chen|first1=Cathy W. S.|last2=Chiu|first2=L. M.|date=September 2021|title=Ordinal Time Series Forecasting of the Air Quality Index|journal=Entropy|language=en|volume=23|issue=9|pages=1167|doi=10.3390/e23091167|pmid=34573792|pmc=8469594|bibcode=2021Entrp..23.1167C|doi-access=free}}</ref>
     −
In the context of statistics, econometrics, quantitative finance, seismology, meteorology, and geophysics the primary goal of time series analysis is forecasting. In the context of signal processing, control engineering and communication engineering it is used for signal detection. Other applications are in data mining, pattern recognition and machine learning, where time series analysis can be used for clustering, classification, query by content, anomaly detection as well as forecasting.
+
===Motivation动机===
 +
In the context of [[statistics]], [[econometrics]], [[quantitative finance]], [[seismology]], [[meteorology]], and [[geophysics]] the primary goal of time series analysis is [[forecasting]]. In the context of [[signal processing]], [[control engineering]] and [[communication engineering]] it is used for signal detection. Other applications are in [[data mining]], [[pattern recognition]] and [[machine learning]], where time series analysis can be used for [[cluster analysis|clustering]],<ref name=":1">{{cite journal | last1 = Liao | first1 = T. Warren | title = Clustering of time series data - a survey | journal = Pattern Recognition | volume = 38 | issue = 11 | pages = 1857–1874 | publisher = Elsevier | date = 2005 | language = en | doi = 10.1016/j.patcog.2005.01.025| bibcode = 2005PatRe..38.1857W }}{{subscription required|via=ScienceDirect }}</ref><ref name=":2">{{cite journal | last1 = Aghabozorgi | first1 = Saeed | last2 = Shirkhorshidi | first2 = Ali S. | last3 = Wah | first3 = Teh Y. | title = Time-series clustering – A decade review | journal = Information Systems | volume = 53 | pages = 16–38 | publisher = Elsevier | date = 2015 | language = en | doi = 10.1016/j.is.2015.04.007}}{{subscription required|via=ScienceDirect }}</ref> [[Statistical classification|classification]],<ref name=":3">{{cite journal | last1 = Keogh | first1 = Eamonn J. 
| title = On the need for time series data mining benchmarks | journal = Data Mining and Knowledge Discovery | volume = 7 | pages = 349–371 | publisher = Kluwer | date = 2003 | language = en | doi = 10.1145/775047.775062| isbn = 158113567X | s2cid = 41617550 }}{{subscription required|via=ACM Digital Library }}</ref> query by content,<ref name=":4">{{cite conference|last1=Agrawal|first1=Rakesh|last2=Faloutsos|first2=Christos|last3=Swami|first3=Arun|date=October 1993|title=Efficient Similarity Search In Sequence Databases|conference=International Conference on Foundations of Data Organization and Algorithms|volume=730|pages=69–84|book-title=Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms|doi=10.1007/3-540-57301-1_5}}{{Subscription required|via=SpringerLink}}</ref> [[anomaly detection]] as well as [[forecasting]].<ref name=":5">{{cite journal|last1=Chen|first1=Cathy W. S.|last2=Chiu|first2=L. M.|date=September 2021|title=Ordinal Time Series Forecasting of the Air Quality Index|journal=Entropy|language=en|volume=23|issue=9|pages=1167|doi=10.3390/e23091167|pmid=34573792|pmc=8469594|bibcode=2021Entrp..23.1167C|doi-access=free}}</ref>
   −
在统计学、计量经济学、定量金融学、地震学、气象学和地球物理学的背景下,时间序列分析的主要目标是预测。在信号处理、控制工程和通信工程中,它被用于信号检测。其他应用还包括数据挖掘、模式识别和机器学习,时间序列分析可用于聚类、分类、内容查询、异常检测和预测。
+
在统计学Statistics、计量经济学Econometrics、定量金融学Quantitative finance、地震学Seismology、气象学Meteorology和地球物理学Geophysics方面,时间序列分析的主要目标是预测Forecasting。在信号处理Signal processing、控制工程Control engineering和通信工程Communication engineering方面,它被用于信号检测。在数据挖掘Data mining、模式识别Pattern recognition和机器学习Machine learning等其他应用中,时间序列分析可用于聚类Clustering<ref name=":1" /><ref name=":2" />、分类Classification<ref name=":3" />、按内容查询<ref name=":4" />、异常检测Anomaly detection以及预测Forecasting<ref name=":5" />。
   −
===Exploratory analysis===
+
===Exploratory analysis探索性分析===
[[File:Tuberculosis incidence US 1953-2009.png|thumb|Tuberculosis incidence US 1953-2009]]
+
[[File:Tuberculosis incidence US 1953-2009.png|thumb|Tuberculosis incidence US 1953-2009美国1953-2009年结核病发病率|链接=Special:FilePath/Tuberculosis_incidence_US_1953-2009.png]]
 
{{further|Exploratory analysis}}
 
{{further|Exploratory analysis}}
 
A straightforward way to examine a regular time series is manually with a [[line chart]]. An example chart is shown on the right for tuberculosis incidence in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as +/- 10%, with 'surges' in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic.
 
A straightforward way to examine a regular time series is manually with a [[line chart]]. An example chart is shown on the right for tuberculosis incidence in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as +/- 10%, with 'surges' in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic.
   −
thumb|Tuberculosis incidence US 1953-2009
+
绘制折线图Line chart是检查常规时间序列的直观方法。右侧显示了一个使用电子表格程序制作的美国结核病发病率示例图表。病例的数量被标准化为每10万人的比率,并计算出该比率每年的变化百分比。几乎稳定下降的线条表明,结核病发病率在大多数年份都在下降,但该比率的变化百分比高达+/-10%,在1975年和20世纪90年代初前后出现了 "激增"。图中应用了两个纵轴,使得可以在一个图表中比较两个时间序列。
 
  −
A straightforward way to examine a regular time series is manually with a line chart. An example chart is shown on the right for tuberculosis incidence in the United States, made with a spreadsheet program. The number of cases was standardized to a rate per 100,000 and the percent change per year in this rate was calculated. The nearly steadily dropping line shows that the TB incidence was decreasing in most years, but the percent change in this rate varied by as much as +/- 10%, with 'surges' in 1975 and around the early 1990s. The use of both vertical axes allows the comparison of two time series in one graphic.
     −
探索性分析结核病发病率美国1953-2009一个简单的方法来检查一个定期的时间序列是手工与线图。右图显示的是美国肺结核发病率的示例图,是用电子表格程序制作的。病例数量标准化为每100000例,并计算了这一比率每年的变化百分比。几乎稳步下降的曲线表明,结核病的发病率在大多数年份都在下降,但这一比率的变化率高达 +/-10% ,在1975年和90年代初期出现了“激增”。两个垂直轴的使用允许在一个图形中比较两个时间序列。
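
The two derived series in the chart (a rate per 100,000 and its year-over-year percent change) can be computed as in the sketch below. The case counts and population figures are made-up placeholders, not the actual tuberculosis data, and the helper names are assumptions for this example.

```python
def rate_per_100k(cases, population):
    """Standardize a case count to a rate per 100,000 population."""
    return cases * 100_000 / population


def percent_change(values):
    """Percent change of each value relative to the previous one."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(values, values[1:])]


# Hypothetical yearly figures (NOT real tuberculosis data).
cases = [500, 450, 468]
population = [1_000_000, 1_000_000, 1_040_000]

rates = [rate_per_100k(c, p) for c, p in zip(cases, population)]
changes = percent_change(rates)
```

Note that the third year has more cases than the second but the same rate, because the population grew; this is why the chart standardizes counts before computing the percent change.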
     −
A study of corporate data analysts found two challenges to exploratory time series analysis: discovering the shape of interesting patterns, and finding an explanation for these patterns.<ref>{{Cite journal|last=Sarkar|first=Advait|last2=Spott|first2=Martin|last3=Blackwell|first3=Alan F.|last4=Jamnik|first4=Mateja|date=2016|title=Visual discovery and model-driven explanation of time series patterns|url=https://doi.org/10.1109/VLHCC.2016.7739668|journal=2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)|publisher=IEEE|doi=10.1109/vlhcc.2016.7739668}}</ref> Visual tools that represent time series data as [[Heat map|heat map matrices]] can help overcome these challenges.
+
A study of corporate data analysts found two challenges to exploratory time series analysis: discovering the shape of interesting patterns, and finding an explanation for these patterns.<ref name=":6">{{Cite journal|last=Sarkar|first=Advait|last2=Spott|first2=Martin|last3=Blackwell|first3=Alan F.|last4=Jamnik|first4=Mateja|date=2016|title=Visual discovery and model-driven explanation of time series patterns|url=https://doi.org/10.1109/VLHCC.2016.7739668|journal=2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)|publisher=IEEE|doi=10.1109/vlhcc.2016.7739668}}</ref> Visual tools that represent time series data as [[Heat map|heat map matrices]] can help overcome these challenges.
   −
A study of corporate data analysts found two challenges to exploratory time series analysis: discovering the shape of interesting patterns, and finding an explanation for these patterns. Visual tools that represent time series data as heat map matrices can help overcome these challenges.
+
一项对企业数据分析师的研究发现,探索性时间序列分析有两个挑战:发现有趣模式,以及为这些模式找到解释<ref name=":6" />。将时间序列数据可视化为热力图矩阵Heat map matrices的工具可以帮助克服这些挑战。
   −
一项针对企业数据分析师的研究发现,探索性时间序列分析面临两个挑战: 发现有趣模式的形状,以及为这些模式找到解释。将时间序列数据表示为热图矩阵的可视化工具可以帮助克服这些挑战。
  −
  −
Other techniques include:
      
第124行: 第81行:  
* Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity: see [[trend estimation]] and [[decomposition of time series]]
 
   −
* Autocorrelation analysis to examine serial dependence
+
* 自相关分析检验序列相关性;
* Spectral analysis to examine cyclic behavior which need not be related to seasonality. For example, sunspot activity varies over 11 year cycles. Other common examples include celestial phenomena, weather patterns, neural activity, commodity prices, and economic activity.
+
* 频谱分析来检查与季节性无关的周期性行为。例如,太阳黑子活动在一个周期内(11年)的变化。其他常见的例子包括天体现象、天气模式、神经活动、商品价格和经济活动;
* Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity: see trend estimation and decomposition of time series
+
* 将序列分离为代表趋势、季节性、慢速和快速变化以及周期性不规则的成分:见趋势估计和时间序列的分解。
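
The autocorrelation and spectral analysis items in the list above can be illustrated with a minimal sketch. It assumes a synthetic sine series with an 11-step cycle as a stand-in for real data (it is not sunspot data), and it uses a plain discrete Fourier transform periodogram; the function names are hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation of x at a non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def periodogram(x):
    """Squared DFT magnitude at frequencies k/n cycles per step, k = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    power = {}
    for k in range(1, n // 2 + 1):
        re = sum((x[t] - mean) * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum((x[t] - mean) * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power[k] = (re * re + im * im) / n
    return power


# Synthetic series with an 11-step cycle (illustrative only).
n = 110
series = [math.sin(2 * math.pi * t / 11) for t in range(n)]

acf_11 = autocorrelation(series, 11)  # values one full cycle apart
power = periodogram(series)
peak_k = max(power, key=power.get)    # frequency index with the most power
dominant_period = n / peak_k          # period in time steps
```

For this series the autocorrelation at lag 11 is strongly positive and the periodogram peaks at the frequency whose period is 11 steps, which is how a cycle unrelated to any calendar season would show up.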
 
     −
* 自相关分析检验序列相关性
+
===Curve fitting曲线拟合===
* 谱分析检验不需要与季节性相关的循环行为。例如,太阳黑子的活动周期为11年。其他常见的例子包括天文现象、天气模式、神经活动、商品价格和经济活动。
  −
* 分离成分代表趋势,季节性,缓慢和快速变化,周期性不规则: 见趋势估计和时间序列分解
  −
 
  −
===Curve fitting===
   
{{main|Curve fitting}}
 
   −
Curve fitting<ref>Sandra Lach Arlinghaus, PHB Practical Handbook of Curve Fitting. CRC Press, 1994.</ref><ref>William M. Kolb. Curve Fitting for Programmable Calculators. Syntec, Incorporated, 1984.</ref> is the process of constructing a [[curve]], or [[function (mathematics)|mathematical function]], that has the best fit to a series of [[data]] points,<ref>S.S. Halli, K.V. Rao. 1992. Advanced Techniques of Population Analysis. {{isbn|0306439972}} Page 165 (''cf''. ... functions are fulfilled if we have a good to moderate fit for the observed data.)</ref> possibly subject to constraints.<ref>[https://archive.org/details/signalnoisewhymo00silv ''[[The Signal and the Noise]]]: Why So Many Predictions Fail-but Some Don't.'' By Nate Silver</ref><ref>[https://books.google.com/books?id=hhdVr9F-JfAC Data Preparation for Data Mining]: Text. By Dorian Pyle.</ref> Curve fitting can involve either [[interpolation]],<ref>Numerical Methods in Engineering with MATLAB®. By [[Jaan Kiusalaas]]. Page 24.</ref><ref>[https://books.google.com/books?id=YlkgAwAAQBAJ&printsec=frontcover#v=onepage&q=%22curve%20fitting%22&f=false Numerical Methods in Engineering with Python 3]. By Jaan Kiusalaas. Page 21.</ref> where an exact fit to the data is required, or [[smoothing]],<ref>[https://books.google.com/books?id=UjnB0FIWv_AC&printsec=frontcover#v=onepage&q&f=false Numerical Methods of Curve Fitting]. By P. G. Guest, Philip George Guest. Page 349.</ref><ref>See also: [[Mollifier]]</ref> in which a "smooth" function is constructed that approximately fits the data.  A related topic is [[regression analysis]],<ref>[http://www.facm.ucl.ac.be/intranet/books/statistics/Prism-Regression-Book.unlocked.pdf Fitting Models to Biological Data Using Linear and Nonlinear Regression]. By Harvey Motulsky, Arthur Christopoulos.</ref><ref>Regression Analysis By Rudolf J. Freund, William J. Wilson, Ping Sa. 
Page 269.</ref> which focuses more on questions of [[statistical inference]] such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization,<ref>Visual Informatics. Edited by Halimah Badioze Zaman, Peter Robinson, Maria Petrou, Patrick Olivier, Heiko Schröder. Page 689.</ref><ref>[https://books.google.com/books?id=rdJvXG1k3HsC&printsec=frontcover#v=onepage&q&f=false Numerical Methods for Nonlinear Engineering Models]. By John R. Hauser. Page 227.</ref> to infer values of a function where no data are available,<ref>Methods of Experimental Physics: Spectroscopy, Volume 13, Part 1. By Claire Marton. Page 150.</ref> and to summarize the relationships among two or more variables.<ref>Encyclopedia of Research Design, Volume 1. Edited by Neil J. Salkind. Page 266.</ref> [[Extrapolation]] refers to the use of a fitted curve beyond the [[range (statistics)|range]] of the observed data,<ref>[https://books.google.com/books?id=ba0hAQAAQBAJ&printsec=frontcover#v=onepage&q&f=false Community Analysis and Planning Techniques]. By Richard E. Klosterman. Page 1.</ref> and is subject to a [[Uncertainty|degree of uncertainty]]<ref>An Introduction to Risk and Uncertainty in the Evaluation of Environmental Investments. DIANE Publishing. [https://books.google.com/books?id=rJ23LWaZAqsC&pg=PA69 Pg 69]</ref> since it may reflect the method used to construct the curve as much as it reflects the observed data.
+
Curve fitting<ref name=":7">Sandra Lach Arlinghaus, PHB Practical Handbook of Curve Fitting. CRC Press, 1994.</ref><ref name=":8">William M. Kolb. Curve Fitting for Programmable Calculators. Syntec, Incorporated, 1984.</ref> is the process of constructing a [[curve]], or [[function (mathematics)|mathematical function]], that has the best fit to a series of [[data]] points,<ref name=":9">S.S. Halli, K.V. Rao. 1992. Advanced Techniques of Population Analysis. {{isbn|0306439972}} Page 165 (''cf''. ... functions are fulfilled if we have a good to moderate fit for the observed data.)</ref> possibly subject to constraints.<ref name=":10">[https://archive.org/details/signalnoisewhymo00silv]''[[The Signal and the Noise]]'': Why So Many Predictions Fail-but Some Don't.'' By Nate Silver''</ref><ref name=":11">[https://books.google.com/books?id=hhdVr9F-JfAC Data Preparation for Data Mining]: Text. By Dorian Pyle.</ref> Curve fitting can involve either [[interpolation]],<ref name=":12">Numerical Methods in Engineering with MATLAB®. By [[Jaan Kiusalaas]]. Page 24.</ref><ref name=":13">[https://books.google.com/books?id=YlkgAwAAQBAJ&printsec=frontcover#v=onepage&q=%22curve%20fitting%22&f=false Numerical Methods in Engineering with Python 3]. By Jaan Kiusalaas. Page 21.</ref> where an exact fit to the data is required, or [[smoothing]],<ref name=":14">[https://books.google.com/books?id=UjnB0FIWv_AC&printsec=frontcover#v=onepage&q&f=false Numerical Methods of Curve Fitting]. By P. G. Guest, Philip George Guest. Page 349.</ref><ref name=":15">See also: [[Mollifier]]</ref> in which a "smooth" function is constructed that approximately fits the data.  A related topic is [[regression analysis]],<ref name=":16">[http://www.facm.ucl.ac.be/intranet/books/statistics/Prism-Regression-Book.unlocked.pdf Fitting Models to Biological Data Using Linear and Nonlinear Regression]. By Harvey Motulsky, Arthur Christopoulos.</ref><ref name=":17">Regression Analysis By Rudolf J. Freund, William J. 
Wilson, Ping Sa. Page 269.</ref> which focuses more on questions of [[statistical inference]] such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization,<ref name=":18">Visual Informatics. Edited by Halimah Badioze Zaman, Peter Robinson, Maria Petrou, Patrick Olivier, Heiko Schröder. Page 689.</ref><ref name=":19">[https://books.google.com/books?id=rdJvXG1k3HsC&printsec=frontcover#v=onepage&q&f=false Numerical Methods for Nonlinear Engineering Models]. By John R. Hauser. Page 227.</ref> to infer values of a function where no data are available,<ref name=":20">Methods of Experimental Physics: Spectroscopy, Volume 13, Part 1. By Claire Marton. Page 150.</ref> and to summarize the relationships among two or more variables.<ref name=":21">Encyclopedia of Research Design, Volume 1. Edited by Neil J. Salkind. Page 266.</ref> [[Extrapolation]] refers to the use of a fitted curve beyond the [[range (statistics)|range]] of the observed data,<ref name=":22">[https://books.google.com/books?id=ba0hAQAAQBAJ&printsec=frontcover#v=onepage&q&f=false Community Analysis and Planning Techniques]. By Richard E. Klosterman. Page 1.</ref> and is subject to a [[Uncertainty|degree of uncertainty]]<ref name=":23">An Introduction to Risk and Uncertainty in the Evaluation of Environmental Investments. DIANE Publishing. [https://books.google.com/books?id=rJ23LWaZAqsC&pg=PA69 Pg 69]</ref> since it may reflect the method used to construct the curve as much as it reflects the observed data.
   −
Curve fittingSandra Lach Arlinghaus, PHB Practical Handbook of Curve Fitting. CRC Press, 1994.William M. Kolb. Curve Fitting for Programmable Calculators. Syntec, Incorporated, 1984. is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points,S.S. Halli, K.V. Rao. 1992. Advanced Techniques of Population Analysis.  Page 165 (cf. ... functions are fulfilled if we have a good to moderate fit for the observed data.) possibly subject to constraints.The Signal and the Noise: Why So Many Predictions Fail-but Some Don't. By Nate SilverData Preparation for Data Mining: Text. By Dorian Pyle. Curve fitting can involve either interpolation,Numerical Methods in Engineering with MATLAB®. By Jaan Kiusalaas. Page 24.Numerical Methods in Engineering with Python 3. By Jaan Kiusalaas. Page 21. where an exact fit to the data is required, or smoothing,Numerical Methods of Curve Fitting. By P. G. Guest, Philip George Guest. Page 349.See also: Mollifier in which a "smooth" function is constructed that approximately fits the data.  A related topic is regression analysis,Fitting Models to Biological Data Using Linear and Nonlinear Regression. By Harvey Motulsky, Arthur Christopoulos.Regression Analysis By Rudolf J. Freund, William J. Wilson, Ping Sa. Page 269. which focuses more on questions of statistical inference such as how much uncertainty is present in a curve that is fit to data observed with random errors. Fitted curves can be used as an aid for data visualization,Visual Informatics. Edited by Halimah Badioze Zaman, Peter Robinson, Maria Petrou, Patrick Olivier, Heiko Schröder. Page 689.Numerical Methods for Nonlinear Engineering Models. By John R. Hauser. Page 227. to infer values of a function where no data are available,Methods of Experimental Physics: Spectroscopy, Volume 13, Part 1. By Claire Marton. Page 150. and to summarize the relationships among two or more variables.Encyclopedia of Research Design, Volume 1. 
Edited by Neil J. Salkind. Page 266. Extrapolation refers to the use of a fitted curve beyond the range of the observed data,Community Analysis and Planning Techniques. By Richard E. Klosterman. Page 1. and is subject to a degree of uncertaintyAn Introduction to Risk and Uncertainty in the Evaluation of Environmental Investments. DIANE Publishing. Pg 69 since it may reflect the method used to construct the curve as much as it reflects the observed data.
+
正文:曲线拟合Curve fitting
   −
桑德拉 · 拉赫 · 阿林豪斯,PHB 曲线拟合实用手册。1994. William m. Kolb.可编程计算器的曲线拟合。公司,1984年。是构造一条曲线或数学函数的过程,这条曲线最适合一系列数据点。哈利,k.v。男名男子名。1992.人口分析的高级技术。第165页(参考英文版)。如果我们对观察到的数据有一个良好到中等的拟合,那么功能就完成了信号和噪音: 为什么这么多预测失败——但是有些没有。数据挖掘的准备: 文本。作者: Dorian Pyle。用 MATLAB 进行曲线拟合可以包括插值、工程中的数值方法。By Jaan Kiusalaas.第24页,Python 3在工程中的数值方法。By Jaan Kiusalaas.需要对数据进行精确拟合或平滑处理的曲线拟合的数值方法。作者: p · g · 盖斯特,菲利普 · 乔治 · 盖斯特。参见: Mollifier,其中构造了一个近似适合数据的“ smooth”函数。一个相关的话题是回归分析,利用线性和非线性回归来拟合生物数据的模型。作者: Harvey Motulsky,Arthur christoph,回归分析: Rudolf j. Freund,William j. Wilson,Ping Sa。第269页。它更多地关注推论统计学的问题,比如一条曲线在多大程度上存在不确定性,而这条曲线又与随机误差观测到的数据相吻合。拟合的曲线可以用来作为一个辅助的数据可视化,视觉信息学。Edited by Halimah Badioze Zaman, Peter Robinson, Maria Petrou, Patrick Olivier, Heiko Schröder.非线性工程模型的数值方法。作者: John r. Hauser。第227页。在没有数据可用的情况下推断函数的值,实验物理学方法: 光谱学,第13卷,第1部分。作者: Claire Marton。总结两个或两个以上变量之间的关系。研究设计百科全书,第一卷。编辑: Neil j. Salkind。第266页。外推法是指使用拟合曲线超出观测数据的范围,社区分析和规划技术。作者: Richard e. Klosterman。第一页。《环境投资评估中的风险与不确定性导论》。DIANE Publishing.第69页,因为它可能反映了用来构造曲线的方法,就像它反映了观测数据一样。
+
曲线拟合<ref name=":7" /><ref name=":8" /> 是构建一条曲线Curve或数学函数Mathematical function的过程,它对一系列的数据Data点具有最佳的拟合效果<ref name=":9" />,可能会受到一些限制<ref name=":10" /><ref name=":11" />。曲线拟合包括插值Interpolation<ref name=":12" /><ref name=":13" />(需要精确地拟合数据)与平滑Smoothing<ref name=":14" /><ref name=":15" />(构造一个 "平滑 "的函数来近似地拟合数据)。与曲线拟合相近的回归分析Regression analysis<ref name=":16" /><ref name=":17" />更侧重于统计推断Statistical inference的问题。例如,在拟合有随机误差的数据的曲线中,有多少不确定性存在。拟合曲线可以作为数据可视化的辅助工具<ref name=":18" /><ref name=":19" />,在没有数据的情况下推断函数的值<ref name=":20" />,并总结两个或多个变量之间的关系<ref name=":21" />。外推法Extrapolation是指在观测到的数据范围Range之外使用拟合曲线<ref name=":22" />,它有一定程度的不确定性Degree of uncertainty<ref name=":23" />,因为它既可能是反映观测数据,也可能是反映用于构建曲线的方法。
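
As a small illustration of fitting a curve that approximately, rather than exactly, matches the data, the sketch below fits a straight line by ordinary least squares. The observations are made up for the example, and a straight line is only one of many possible curve families.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up observations scattered around y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

intercept, slope = fit_line(xs, ys)

# Residuals are nonzero: the fitted line smooths the data instead of
# passing through every point, as an interpolant would.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

The size of the residuals is the starting point for the inference questions that regression analysis addresses, such as how uncertain the fitted slope is.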
   −
The construction of economic time series involves the estimation of some components for some dates by [[interpolation]] between values ("benchmarks") for earlier and later dates. Interpolation is the estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information ("reading between the lines").<ref>Hamming, Richard. Numerical methods for scientists and engineers. Courier Corporation, 2012.</ref> Interpolation is useful where the data surrounding the missing data is available and its trend, seasonality, and longer-term cycles are known. This is often done by using a related series known for all relevant dates.<ref>Friedman, Milton. "[http://www.nber.org/chapters/c2062.pdf The interpolation of time series by related series]." Journal of the American Statistical Association 57.300 (1962): 729–757.</ref> Alternatively, [[polynomial interpolation]] or [[spline interpolation]] is used where piecewise [[polynomial]] functions are fit into time intervals such that they fit smoothly together. A different problem which is closely related to interpolation is the approximation of a complicated function by a simple function (also called [[Polynomial regression|regression]]). The main difference between regression and interpolation is that polynomial regression gives a single polynomial that models the entire data set. Spline interpolation, however, yields a piecewise continuous function composed of many polynomials to model the data set.
     −
The construction of economic time series involves the estimation of some components for some dates by interpolation between values ("benchmarks") for earlier and later dates. Interpolation is the estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information ("reading between the lines").Hamming, Richard. Numerical methods for scientists and engineers. Courier Corporation, 2012. Interpolation is useful where the data surrounding the missing data is available and its trend, seasonality, and longer-term cycles are known. This is often done by using a related series known for all relevant dates.Friedman, Milton. "The interpolation of time series by related series." Journal of the American Statistical Association 57.300 (1962): 729–757. Alternatively, polynomial interpolation or spline interpolation is used where piecewise polynomial functions are fit into time intervals such that they fit smoothly together. A different problem which is closely related to interpolation is the approximation of a complicated function by a simple function (also called regression). The main difference between regression and interpolation is that polynomial regression gives a single polynomial that models the entire data set. Spline interpolation, however, yields a piecewise continuous function composed of many polynomials to model the data set.
+
The construction of economic time series involves the estimation of some components for some dates by [[interpolation]] between values ("benchmarks") for earlier and later dates. Interpolation is the estimation of an unknown quantity between two known quantities (historical data), or drawing conclusions about missing information from the available information ("reading between the lines").<ref name=":24">Hamming, Richard. Numerical methods for scientists and engineers. Courier Corporation, 2012.</ref> Interpolation is useful where the data surrounding the missing data is available and its trend, seasonality, and longer-term cycles are known. This is often done by using a related series known for all relevant dates.<ref name=":25">Friedman, Milton. "[http://www.nber.org/chapters/c2062.pdf The interpolation of time series by related series]." Journal of the American Statistical Association 57.300 (1962): 729–757.</ref> Alternatively, [[polynomial interpolation]] or [[spline interpolation]] is used where piecewise [[polynomial]] functions are fit into time intervals such that they fit smoothly together. A different problem which is closely related to interpolation is the approximation of a complicated function by a simple function (also called [[Polynomial regression|regression]]). The main difference between regression and interpolation is that polynomial regression gives a single polynomial that models the entire data set. Spline interpolation, however, yields a piecewise continuous function composed of many polynomials to model the data set.
   −
经济时间序列的构建涉及通过早期和晚期数据的值(”基准”)之间的内插来估计某些数据的某些组成部分。插值是对两个已知量(历史数据)之间的未知量的估计,或者从可用信息(“字里行间的读数”)中得出缺失信息的结论。汉明,理查德。科学家和工程师的数值方法。快递公司,2012。在缺失数据周围的数据可用,其趋势、季节性和长期周期已知的情况下,插值是有用的。这通常是通过使用一个相关的系列知道所有相关的日期。按相关序列对时间序列进行插值美国统计协会杂志57.300(1962) : 729-757。或者使用多项式插值或样条插值,在这种情况下,分段多项式函数适合于时间间隔,以便它们能够平滑地组合在一起。另一个与插值密切相关的问题是用一个简单函数(也称为回归)逼近一个复杂函数。回归和插值的主要区别是多项式回归给出一个单一的多项式模型的整个数据集。然而,样条插值可以产生一个由多项式组成的分段连续函数来为数据集建模。
+
经济时间序列的构建涉及通过在早期和晚期的值(“基准”)之间进行插值Interpolation来估计某些日期的某些组成部分。插值法是在两个已知量(历史数据)之间估计一个未知量,或从现有信息中得出关于缺失信息的结论("从字里行间阅读")<ref name=":24" />。如果围绕缺失数据的数据是可用的,并且其趋势、季节性和长期周期是已知的,那么插值法就很有用。插值法通常是通过使用已知所有相关日期的相关序列来实现的<ref name=":25" />。或者使用多项式插值Polynomial interpolation或样条插值Spline interpolation,将分段多项式Polynomial函数拟合到时间间隔中,使其平滑地拟合在一起。一个与插值密切相关的问题是用一个简单的函数来逼近一个复杂的函数(也称为回归Regression)。回归和插值的主要区别是,多项式回归给出一个单一的多项式来模拟整个数据集。而样条插值则产生一个由许多多项式组成的分段连续函数来模拟数据集。
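
Filling in values between benchmarks can be sketched with piecewise-linear interpolation, which is the simplest (degree-one) member of the spline family; a cubic spline would additionally make the pieces join smoothly. The benchmark years and values below are hypothetical, as is the function name.

```python
def interpolate_linear(known, t):
    """Estimate the value at time t from the two known points that bracket it.

    `known` is a list of (time, value) pairs sorted by time.
    """
    for (t0, v0), (t1, v1) in zip(known, known[1:]):
        if t0 <= t <= t1:
            weight = (t - t0) / (t1 - t0)
            return v0 + weight * (v1 - v0)
    raise ValueError("t lies outside the known range; that would be extrapolation")


# Hypothetical benchmark values observed only every fifth year.
benchmarks = [(2000, 100.0), (2005, 120.0), (2010, 110.0)]

# Estimate every year between the first and last benchmark.
filled = {year: interpolate_linear(benchmarks, year) for year in range(2000, 2011)}
```

Each segment here is its own polynomial, which contrasts with polynomial regression, where a single polynomial is fitted to the whole data set and need not pass through any benchmark exactly.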
    
[[Extrapolation]] is the process of estimating, beyond the original observation range, the value of a variable on the basis of its relationship with another variable. It is similar to [[interpolation]], which produces estimates between known observations, but extrapolation is subject to greater [[uncertainty]] and a higher risk of producing meaningless results.
 
   −
Extrapolation is the process of estimating, beyond the original observation range, the value of a variable on the basis of its relationship with another variable. It is similar to interpolation, which produces estimates between known observations, but extrapolation is subject to greater uncertainty and a higher risk of producing meaningless results.
+
外推法Extrapolation是指在原始观察范围之外,根据一个变量与另一个变量的关系来估计其数值的过程。它与插值Interpolation类似,插值在已知的观测值之间产生估计值,但外推法的不确定性Uncertainty更大,产生无意义结果的风险也更大。
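
The greater risk of extrapolation can be seen in a small experiment. The sketch below assumes an arbitrary "true" relationship, y = 1 / (1 + x²), chosen only for demonstration; it passes a polynomial exactly through five observations on [-2, 2] and compares the error inside and outside that range.

```python
def lagrange(points, x):
    """Evaluate the unique polynomial through `points` at x (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def truth(x):
    """Illustrative 'true' relationship, an assumption made for this example."""
    return 1.0 / (1.0 + x * x)


# Five exact observations on the interval [-2, 2].
observed = [(x, truth(x)) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

inside_error = abs(lagrange(observed, 0.5) - truth(0.5))   # within the observed range
outside_error = abs(lagrange(observed, 4.0) - truth(4.0))  # beyond the observed range
```

With these numbers the fitted polynomial is off by about 0.06 at x = 0.5 but by nearly 17 at x = 4, where the true value is below 0.06. The extrapolated value reflects the shape of the chosen polynomial far more than it reflects the data.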
 
  −
外推法是根据一个变量与另一个变量的关系,在原始观测范围之外估计该变量的值的过程。它类似于插值,在已知的观测值之间产生估计值,但是外推法存在更大的不确定性和产生无意义结果的更高风险。
      
===Function approximation===
 
