更改

添加11字节 、 2021年7月31日 (六) 17:24
无编辑摘要
第17行: 第17行:  
其他例子包括:对输入中具有重大不确定性的现象进行建模,如商业中的风险计算,以及在数学中对具有复杂边界条件的多维定积分进行评估。 在系统工程问题(空间、石油勘探、飞机设计等)的应用中,基于蒙特卡罗的故障预测、成本超支和进度超支通常比人类的直觉或其他的“软性”方法更有效。<ref name=":2" />  
 
其他例子包括:对输入中具有重大不确定性的现象进行建模,如商业中的风险计算,以及在数学中对具有复杂边界条件的多维定积分进行评估。 在系统工程问题(空间、石油勘探、飞机设计等)的应用中,基于蒙特卡罗的故障预测、成本超支和进度超支通常比人类的直觉或其他的“软性”方法更有效。<ref name=":2" />  
   −
In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the [[law of large numbers]], integrals described by the [[expected value]] of some random variable can be approximated by taking the [[Sample mean and sample covariance|empirical mean]] (a.k.a. the sample mean) of independent samples of the variable. When the [[probability distribution]] of the variable is parametrized, mathematicians often use a [[Markov chain Monte Carlo]] (MCMC) sampler.<ref>{{Cite journal|title = Equation of State Calculations by Fast Computing Machines|journal = The Journal of Chemical Physics|date = 1953-06-01|issn = 0021-9606|pages = 1087–1092|volume = 21|issue = 6|doi = 10.1063/1.1699114|first1 = Nicholas|last1 = Metropolis|first2 = Arianna W.|last2 = Rosenbluth|first3 = Marshall N.|last3 = Rosenbluth|first4 = Augusta H.|last4 = Teller|first5 = Edward|last5 = Teller|bibcode=1953JChPh..21.1087M|s2cid = 1046577|url = https://semanticscholar.org/paper/f6a13f116e270dde9d67848495f801cdb8efa25d}}</ref><ref>{{Cite journal|title = Monte Carlo sampling methods using Markov chains and their applications|journal = Biometrika|date = 1970-04-01|issn = 0006-3444|pages = 97–109|volume = 57|issue = 1|doi = 10.1093/biomet/57.1.97|first = W. K.|last = Hastings|bibcode = 1970Bimka..57...97H|s2cid = 21204149|url = https://semanticscholar.org/paper/143d2e02ab91ae6259576ac50b664b8647af8988}}</ref><ref>{{Cite journal|title = The Multiple-Try Method and Local Optimization in Metropolis Sampling|journal = Journal of the American Statistical Association|date = 2000-03-01|issn = 0162-1459|pages = 121–134|volume = 95|issue = 449|doi = 10.1080/01621459.2000.10473908|first1 = Jun S.|last1 = Liu|first2 = Faming|last2 = Liang|first3 = Wing Hung|last3 = Wong|s2cid = 123468109|url = https://semanticscholar.org/paper/ff17c129a8d32bb7dc7206230da612e94bd24b9f}}</ref> The central idea is to design a judicious [[Markov chain]] model with a prescribed [[stationary probability distribution]]. That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution.<ref>{{cite journal | last1 = Spall | first1 = J. C. | year = 2003 | title = Estimation via Markov Chain Monte Carlo | doi = 10.1109/MCS.2003.1188770 | journal = IEEE Control Systems Magazine | volume = 23 | issue = 2| pages = 34–45 }}</ref><ref>{{Cite journal |doi = 10.1109/MCS.2018.2876959|title = Stationarity and Convergence of the Metropolis-Hastings Algorithm: Insights into Theoretical Aspects|journal = IEEE Control Systems Magazine |volume = 39|pages = 56–67|year = 2019|last1 = Hill|first1 = Stacy D.|last2 = Spall|first2 = James C.|s2cid = 58672766}}</ref> By the [[ergodic theorem]], the stationary distribution is approximated by the [[empirical measure]]s of the random states of the MCMC sampler.
+
In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the [[law of large numbers]], integrals described by the [[expected value]] of some random variable can be approximated by taking the [[Sample mean and sample covariance|empirical mean]] (a.k.a. the sample mean) of independent samples of the variable. When the [[probability distribution]] of the variable is parametrized, mathematicians often use a [[Markov chain Monte Carlo]] (MCMC) sampler.<ref name=":3">{{Cite journal|title = Equation of State Calculations by Fast Computing Machines|journal = The Journal of Chemical Physics|date = 1953-06-01|issn = 0021-9606|pages = 1087–1092|volume = 21|issue = 6|doi = 10.1063/1.1699114|first1 = Nicholas|last1 = Metropolis|first2 = Arianna W.|last2 = Rosenbluth|first3 = Marshall N.|last3 = Rosenbluth|first4 = Augusta H.|last4 = Teller|first5 = Edward|last5 = Teller|bibcode=1953JChPh..21.1087M|s2cid = 1046577|url = https://semanticscholar.org/paper/f6a13f116e270dde9d67848495f801cdb8efa25d}}</ref><ref name=":4">{{Cite journal|title = Monte Carlo sampling methods using Markov chains and their applications|journal = Biometrika|date = 1970-04-01|issn = 0006-3444|pages = 97–109|volume = 57|issue = 1|doi = 10.1093/biomet/57.1.97|first = W. K.|last = Hastings|bibcode = 1970Bimka..57...97H|s2cid = 21204149|url = https://semanticscholar.org/paper/143d2e02ab91ae6259576ac50b664b8647af8988}}</ref><ref name=":5">{{Cite journal|title = The Multiple-Try Method and Local Optimization in Metropolis Sampling|journal = Journal of the American Statistical Association|date = 2000-03-01|issn = 0162-1459|pages = 121–134|volume = 95|issue = 449|doi = 10.1080/01621459.2000.10473908|first1 = Jun S.|last1 = Liu|first2 = Faming|last2 = Liang|first3 = Wing Hung|last3 = Wong|s2cid = 123468109|url = https://semanticscholar.org/paper/ff17c129a8d32bb7dc7206230da612e94bd24b9f}}</ref> The central idea is to design a judicious [[Markov chain]] model with a prescribed [[stationary probability distribution]]. That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution.<ref name=":6">{{cite journal | last1 = Spall | first1 = J. C. | year = 2003 | title = Estimation via Markov Chain Monte Carlo | doi = 10.1109/MCS.2003.1188770 | journal = IEEE Control Systems Magazine | volume = 23 | issue = 2| pages = 34–45 }}</ref><ref name=":7">{{Cite journal |doi = 10.1109/MCS.2018.2876959|title = Stationarity and Convergence of the Metropolis-Hastings Algorithm: Insights into Theoretical Aspects|journal = IEEE Control Systems Magazine |volume = 39|pages = 56–67|year = 2019|last1 = Hill|first1 = Stacy D.|last2 = Spall|first2 = James C.|s2cid = 58672766}}</ref> By the [[ergodic theorem]], the stationary distribution is approximated by the [[empirical measure]]s of the random states of the MCMC sampler.
   −
In principle, Monte Carlo methods can be used to solve any problem having a probabilistic interpretation. By the law of large numbers, integrals described by the expected value of some random variable can be approximated by taking the empirical mean (a.k.a. the sample mean) of independent samples of the variable. When the probability distribution of the variable is parametrized, mathematicians often use a Markov chain Monte Carlo (MCMC) sampler. The central idea is to design a judicious Markov chain model with a prescribed stationary probability distribution. That is, in the limit, the samples being generated by the MCMC method will be samples from the desired (target) distribution. By the ergodic theorem, the stationary distribution is approximated by the empirical measures of the random states of the MCMC sampler.
+
理论上,蒙特卡罗方法可以用来解决任何具有概率解释的问题。根据'''大数定律 Law of Large Numbers''',用某个随机变量的期望值描述的积分可以用该变量独立样本的经验均值 (即样本均值)来近似。 当变量的概率分布被参数化时,数学家们经常使用'''马尔可夫链蒙特卡罗 Markov chain Monte Carlo(MCMC)'''采样器。<ref name=":3" /><ref name=":4" /><ref name=":5" /> 其中心思想是设计一个具有给定'''稳态概率分布 Stationary Probability Distribution'''的有效马尔可夫链模型。也就是说,在极限情况下,马尔科夫链蒙特卡洛方法生成的样本将成为来自期望(目标)分布的样本。<ref name=":6" /><ref name=":7" /> 通过'''遍历定理 Ergodic Theorem''',稳态分布可以用马尔科夫链蒙特卡洛采样器随机状态的经验测度来近似。  
   −
理论上,蒙特卡罗方法可以用来解决任何具有概率解释的问题。根据'''大数定律 Law of Large Numbers''',用某个随机变量的期望值描述的积分可以用该变量独立样本的经验均值 (即样本均值)来近似。 当变量的概率分布被参数化时,数学家们经常使用'''马尔可夫链蒙特卡罗 Markov chain Monte Carlo(MCMC)'''采样器。其中心思想是设计一个具有给定'''稳态概率分布 Stationary Probability Distribution'''的有效马尔可夫链模型。也就是说,在极限情况下,马尔科夫链蒙特卡洛方法生成的样本将成为来自期望(目标)分布的样本。通过'''遍历定理 Ergodic Theorem''',稳态分布可以用马尔科夫链蒙特卡洛采样器随机状态的经验测度来近似。  
+
In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a [[Markov process]] whose transition probabilities depend on the distributions of the current random states (see [[McKean–Vlasov process]]es, [[particle filter|nonlinear filtering equation]]).<ref name="kol10">{{cite book|last = Kolokoltsov|first = Vassili|title = Nonlinear Markov processes|year = 2010|publisher = Cambridge Univ. Press|pages = 375}}</ref><ref name="dp13">{{cite book|last = Del Moral|first = Pierre|title = Mean field simulation for Monte Carlo integration|year = 2013|publisher = Chapman & Hall/CRC Press|quote = Monographs on Statistics & Applied Probability|url = http://www.crcpress.com/product/isbn/9781466504059|pages = 626}}</ref> In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain.<ref name="dp13" /><ref name=":8">{{Cite journal|title = Sequential Monte Carlo samplers | last1 = Del Moral | first1 = P | last2 = Doucet | first2 = A | last3 = Jasra | first3 = A | year = 2006 |doi=10.1111/j.1467-9868.2006.00553.x|volume=68| issue = 3 |journal=Journal of the Royal Statistical Society, Series B|pages=411–436|arxiv = cond-mat/0212648| s2cid = 12074789 }}</ref> A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled [[empirical measure]]s. In contrast with traditional Monte Carlo and MCMC methodologies these [[Mean field particle methods|mean field particle]] techniques rely on sequential interacting samples. The terminology ''mean field'' reflects the fact that each of the ''samples'' (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.
 
  −
In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a [[Markov process]] whose transition probabilities depend on the distributions of the current random states (see [[McKean–Vlasov process]]es, [[particle filter|nonlinear filtering equation]]).<ref name="kol10">{{cite book|last = Kolokoltsov|first = Vassili|title = Nonlinear Markov processes|year = 2010|publisher = Cambridge Univ. Press|pages = 375}}</ref><ref name="dp13">{{cite book|last = Del Moral|first = Pierre|title = Mean field simulation for Monte Carlo integration|year = 2013|publisher = Chapman & Hall/CRC Press|quote = Monographs on Statistics & Applied Probability|url = http://www.crcpress.com/product/isbn/9781466504059|pages = 626}}</ref> In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain.<ref name="dp13" /><ref>{{Cite journal|title = Sequential Monte Carlo samplers | last1 = Del Moral | first1 = P | last2 = Doucet | first2 = A | last3 = Jasra | first3 = A | year = 2006 |doi=10.1111/j.1467-9868.2006.00553.x|volume=68| issue = 3 |journal=Journal of the Royal Statistical Society, Series B|pages=411–436|arxiv = cond-mat/0212648| s2cid = 12074789 }}</ref> A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled [[empirical measure]]s. In contrast with traditional Monte Carlo and MCMC methodologies these [[Mean field particle methods|mean field particle]] techniques rely on sequential interacting samples. The terminology ''mean field'' reflects the fact that each of the ''samples'' (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.
      
In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depend on the distributions of the current random states (see McKean–Vlasov processes, nonlinear filtering equation). In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain. A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and MCMC methodologies these mean field particle techniques rely on sequential interacting samples. The terminology mean field reflects the fact that each of the samples (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.
 
In other problems, the objective is generating draws from a sequence of probability distributions satisfying a nonlinear evolution equation. These flows of probability distributions can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depend on the distributions of the current random states (see McKean–Vlasov processes, nonlinear filtering equation). In other instances we are given a flow of probability distributions with an increasing level of sampling complexity (path spaces models with an increasing time horizon, Boltzmann–Gibbs measures associated with decreasing temperature parameters, and many others). These models can also be seen as the evolution of the law of the random states of a nonlinear Markov chain. A natural way to simulate these sophisticated nonlinear Markov processes is to sample multiple copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and MCMC methodologies these mean field particle techniques rely on sequential interacting samples. The terminology mean field reflects the fact that each of the samples (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes.
   −
在其他问题中,目标是从满足非线性发展方程的概率分布序列生成图。这些概率分布流总是可以解释为马尔可夫过程的随机状态的分布,其转移概率依赖于当前随机状态的分布(见麦肯-弗拉索夫 McKean-Vlasov过程,非线性滤波方程)。在其他情况下,我们给出了采样复杂度不断增加的概率分布流(如时间范围不断增加的路径空间模型,与温度参数降低有联系的'''玻尔兹曼—吉布斯测度 Boltzmann-Gibbs Measures''',以及许多其他例子)。这些模型也可以看作是一个非线性马尔可夫链的随机状态规律的演化。模拟这些复杂非线性马尔可夫过程的一个自然的方法是对过程的多个副本进行抽样,用抽样的经验测度替代演化方程中未知的随机状态分布。与传统的蒙特卡罗和 MCMC 方法相比,这些平均场粒子技术依赖于连续的相互作用样本。平均场一词反映了每个样本(也就是粒子、个体、步行者、媒介、生物或表现型)与过程的经验测量相互作用的事实。当系统的大小趋近于无穷时,这些随机经验测度收敛于非线性马尔可夫链随机状态的确定性分布,从而使粒子之间的统计相互作用消失。
+
在其他问题中,要实现的目标则是从满足非线性演化方程的概率分布序列中生成图像。这些概率分布流总是可以解释为'''马尔可夫过程 Markov process'''的随机状态的分布,其转移概率依赖于当前随机状态的分布(见麦肯-弗拉索夫过程,'''非线性滤波方程 Nonlinear Filtering Equation''')。<ref name="kol10" /><ref name="dp13" /> 其他情况下,我们给出了采样复杂度不断增加的概率分布流(如时间范围不断增加的路径空间模型,与温度参数降低有联系的'''玻尔兹曼—吉布斯测度 Boltzmann-Gibbs Measures''',以及许多其他例子)。这些模型也可以看作是一个非线性马尔可夫链的随机状态规律的演化。<ref name="dp13" /><ref name=":8" /> 模拟这些复杂非线性马尔可夫过程的一个自然的方法是对过程的多个副本进行抽样,用抽样的经验测度替代演化方程中未知的随机状态分布。与传统的蒙特卡罗和马尔科夫链蒙特卡洛方法相比,这些'''平均场粒子 Mean Field Particle'''技术依赖于连续相互作用的样本。平均场一词反映了每个样本(也就是粒子、个体、步行者、媒介、生物或表现型)与过程的经验测量相互作用的事实。当系统大小趋近于无穷时,这些随机经验测度收敛于非线性马尔可夫链随机状态的确定性分布,从而使粒子之间的统计相互作用消失。
    
Despite its conceptual and algorithmic simplicity, the computational cost associated with a Monte Carlo simulation can be staggeringly high. In general the method requires many samples to get a good approximation, which may incurs an arbitrarily large total runtime if the processing time of a single sample is high. Although this is a severe limitation in very complex problems, the embarrassingly parallel nature of the algorithm allows this large cost to be reduced (perhaps to a feasible level) through parallel computing strategies in local processors, clusters, cloud computing, GPU, FPGA etc.
 
Despite its conceptual and algorithmic simplicity, the computational cost associated with a Monte Carlo simulation can be staggeringly high. In general the method requires many samples to get a good approximation, which may incurs an arbitrarily large total runtime if the processing time of a single sample is high. Although this is a severe limitation in very complex problems, the embarrassingly parallel nature of the algorithm allows this large cost to be reduced (perhaps to a feasible level) through parallel computing strategies in local processors, clusters, cloud computing, GPU, FPGA etc.
 +
 +
尽管其概念和算法简单,与蒙特卡罗模拟相关的计算成本可能是惊人的高。一般情况下,该方法需要大量的样本来获得良好的近似,如果单个样本的处理时间较长,可能会导致任意大的总运行时间。尽管在非常复杂的问题中,这是一个严重的限制,但该算法令人尴尬的并行性质允许通过本地处理器、集群、云计算、GPU、FPGA等的并行计算策略来降低这么大的成本(可能在可行的水平上)。
    
== Overview 概述 ==
 
== Overview 概述 ==
596

个编辑