更改

添加2,040字节 、 2020年5月17日 (日) 15:48
第350行: 第350行:  
Moscato和Fontanari通过观察源自他们的研究的类似于“阈值更新”退火的“比热”曲线得出结论:“模拟退火算法中的Metropolis更新的随机性在寻找近似最优最小值中没有发挥主要作用”。相反,他们提出“在高温下平滑成本函数景观和在冷却过程中逐渐定义最小值是模拟退火成功的基本要素。”
 
Moscato和Fontanari通过观察源自他们的研究的类似于“阈值更新”退火的“比热”曲线得出结论:“模拟退火算法中的Metropolis更新的随机性在寻找近似最优最小值中没有发挥主要作用”。相反,他们提出“在高温下平滑成本函数景观和在冷却过程中逐渐定义最小值是模拟退火成功的基本要素。”
 
这种方法后来被Dueck和Scheuer命名为“阈值接受”而推广开来。
 
这种方法后来被Dueck和Scheuer命名为“阈值接受”而推广开来。
在2001年,Franz, Hoffmann和Salamon证明了确定性更新策略确实是在模拟成本/能源景观上随机行走的大类算法中最优的。
+
在2001年,Franz, Hoffmann和Salamon证明了确定性更新策略确实是在模拟成本/能量趋势图上随机行走的大类算法中最优的。
   −
===Efficient candidate generation===
+
  ---~~energy landscape 能量绘景,能量图景,能源景观,能量景貌  译为能量图?能量趋势图
 +
 
 +
===有效的备选 Efficient candidate generation===
    
When choosing the candidate generator {{code|neighbour()}}, one must consider that after a few iterations of the simulated annealing algorithm, the current state is expected to have much lower energy than a random state. Therefore, as a general rule, one should skew the generator towards candidate moves where the energy of the destination state <math>s'</math> is likely to be similar to that of the current state. This [[heuristic]] (which is the main principle of the [[Metropolis–Hastings algorithm]]) tends to exclude "very good" candidate moves as well as "very bad" ones; however, the former are usually much less common than the latter, so the heuristic is generally quite effective.
 
When choosing the candidate generator {{code|neighbour()}}, one must consider that after a few iterations of the simulated annealing algorithm, the current state is expected to have much lower energy than a random state. Therefore, as a general rule, one should skew the generator towards candidate moves where the energy of the destination state <math>s'</math> is likely to be similar to that of the current state. This [[heuristic]] (which is the main principle of the [[Metropolis–Hastings algorithm]]) tends to exclude "very good" candidate moves as well as "very bad" ones; however, the former are usually much less common than the latter, so the heuristic is generally quite effective.
第358行: 第360行:  
When choosing the candidate generator , one must consider that after a few iterations of the simulated annealing algorithm, the current state is expected to have much lower energy than a random state. Therefore, as a general rule, one should skew the generator towards candidate moves where the energy of the destination state <math>s'</math> is likely to be similar to that of the current state. This heuristic (which is the main principle of the Metropolis–Hastings algorithm) tends to exclude "very good" candidate moves as well as "very bad" ones; however, the former are usually much less common than the latter, so the heuristic is generally quite effective.
 
When choosing the candidate generator , one must consider that after a few iterations of the simulated annealing algorithm, the current state is expected to have much lower energy than a random state. Therefore, as a general rule, one should skew the generator towards candidate moves where the energy of the destination state <math>s'</math> is likely to be similar to that of the current state. This heuristic (which is the main principle of the Metropolis–Hastings algorithm) tends to exclude "very good" candidate moves as well as "very bad" ones; however, the former are usually much less common than the latter, so the heuristic is generally quite effective.
   −
在选择候选生成器的时候,必须考虑到在模拟退火 / 状态算法的几次迭代之后,当前状态的能量要比随机状态低得多。因此,作为一般规则,应该将生成器倾斜到候选移动,其中目标状态数学的能量可能与当前状态的能量相似。这种启发式(Metropolis-Hastings 算法的主要原则)倾向于排除”非常好”的候选动作和”非常坏”的动作; 然而,前者通常不如后者常见,因此这种启发式一般是相当有效的。
+
在选择候选产生器时,必须考虑到经过模拟退火算法的多次迭代后,当前状态的能量比随机状态低得多。因此,通常的规则是,当目标状态<math>s'</math>的能量可能与当前状态类似时,应该使生成器向候选的地方移动倾斜。这种启发式(这是Metropolis-Hastings算法的主要原则)倾向于排除“非常好的”候选移动和“非常糟糕的”候选移动;
 
+
然而,前者通常比后者少得多,所以启发式通常是相当有效的。
      第366行: 第368行:  
In the traveling salesman problem above, for example, swapping two consecutive cities in a low-energy tour is expected to have a modest effect on its energy (length); whereas swapping two arbitrary cities is far more likely to increase its length than to decrease it. Thus, the consecutive-swap neighbour generator is expected to perform better than the arbitrary-swap one, even though the latter could provide a somewhat shorter path to the optimum (with <math>n-1</math> swaps, instead of <math>n(n-1)/2</math>).
 
In the traveling salesman problem above, for example, swapping two consecutive cities in a low-energy tour is expected to have a modest effect on its energy (length); whereas swapping two arbitrary cities is far more likely to increase its length than to decrease it. Thus, the consecutive-swap neighbour generator is expected to perform better than the arbitrary-swap one, even though the latter could provide a somewhat shorter path to the optimum (with <math>n-1</math> swaps, instead of <math>n(n-1)/2</math>).
   −
例如,在上面的旅行推销员问题中,在一个低能量旅游中交换两个连续的城市预计会对其能量(长度)产生适度的影响,而交换两个任意的城市更有可能增加而不是减少其长度。因此,预计连续交换邻居生成器的性能要优于任意交换生成器,尽管后者可以提供一条通往最优化的较短路径(使用数学 n-1 / math 交换,而不是数学 n (n-1) / 2 / math)。
+
例如,在上面的旅行商问题中,也就是TSP问题(Traveling Salesman Problem)。假设有一个旅行商人要拜访n个城市,他必须选择所要走的路径,路径的限制是每个城市只能拜访一次,而且最后要回到原来出发的城市。路径的选择目标是要求得的路径路程为所有路径之中的最小值。
 
+
在一个寻找最短路径的旅行中,对两个连续的城市所走的顺序进行交换,预计会对其能量(长度)产生适度的影响;而交换两个任意的城市的移动顺序更有可能增加而不是减少城市的长度。
 +
因此,预期连续交换邻接生成器的性能要优于任意交换生成器,尽管后者可以提供更短的路径来达到最佳状态(使用<math>n-1</math>交换,而不是<math>n(n-1)/2</math>)。
      第376行: 第379行:  
关于这种启发式的一个更精确的陈述是,人们应该尝试第一个候选状态的数学 s / math,对于这个状态的数学 p (e (s) ,e (s’) ,t) / math 是很大的。对于上面提到的“标准”接受函数数学 p / math,它意味着数学 e (s’)-e (s) / math 的顺序是数学 t / math 或更少。因此,在上面的旅行推销员的例子中,我们可以使用一个交换两个随机城市的函数,其中选择一个城市对的概率消失了,因为它们的距离增加超过了数学 t / math。
 
关于这种启发式的一个更精确的陈述是,人们应该尝试第一个候选状态的数学 s / math,对于这个状态的数学 p (e (s) ,e (s’) ,t) / math 是很大的。对于上面提到的“标准”接受函数数学 p / math,它意味着数学 e (s’)-e (s) / math 的顺序是数学 t / math 或更少。因此,在上面的旅行推销员的例子中,我们可以使用一个交换两个随机城市的函数,其中选择一个城市对的概率消失了,因为它们的距离增加超过了数学 t / math。
    +
一个更精确的启发式陈述是,我们应该尝试第一个候选状态<math>s'</math>,其中<math>P(E(s), E(s'), T)</math>很大。
 +
对于上面的“标准”接受函数<math>P</math>,这意味着<math>E(s') - E(s)</math>的长度是<math>T</math>或更少。
 +
因此,在上面的旅行商问题中,我们可以使用一个函数来交换两个随机城市,当它们的距离超过<math>T</math>时,就不再选择该城市对。
      −
===Barrier avoidance===
+
===避开障碍Barrier avoidance===
    
When choosing the candidate generator {{code|neighbour()}} one must also try to reduce the number of "deep" local minima—states (or sets of connected states) that have much lower energy than all its neighbouring states. Such "closed [[drainage basin|catchment]] basins" of the energy function may trap the simulated annealing algorithm with high probability (roughly proportional to the number of states in the basin) and for a very long time (roughly exponential on the energy difference between the surrounding states and the bottom of the basin).
 
When choosing the candidate generator {{code|neighbour()}} one must also try to reduce the number of "deep" local minima—states (or sets of connected states) that have much lower energy than all its neighbouring states. Such "closed [[drainage basin|catchment]] basins" of the energy function may trap the simulated annealing algorithm with high probability (roughly proportional to the number of states in the basin) and for a very long time (roughly exponential on the energy difference between the surrounding states and the bottom of the basin).
    
When choosing the candidate generator  one must also try to reduce the number of "deep" local minima—states (or sets of connected states) that have much lower energy than all its neighbouring states. Such "closed catchment basins" of the energy function may trap the simulated annealing algorithm with high probability (roughly proportional to the number of states in the basin) and for a very long time (roughly exponential on the energy difference between the surrounding states and the bottom of the basin).
 
When choosing the candidate generator  one must also try to reduce the number of "deep" local minima—states (or sets of connected states) that have much lower energy than all its neighbouring states. Such "closed catchment basins" of the energy function may trap the simulated annealing algorithm with high probability (roughly proportional to the number of states in the basin) and for a very long time (roughly exponential on the energy difference between the surrounding states and the bottom of the basin).
  −
在选择候选生成器时,还必须尽量减少“深”局部最小状态(或连通状态集合)的数量,这些状态的能量要比所有相邻状态的能量低得多。这种能量函数的“封闭集水盆地”可能会以高概率(大致与盆地的状态数量成正比)和很长一段时间(周围状态和盆地底部之间的能量差大致呈指数形式)困住模拟退火算法。
         +
在选择候选生成器时,还必须尝试减少具有比所有相邻状态低得多的能量的“深层”局部最小状态(或连接状态集)的数量。
 +
这种能量函数的“封闭集水盆地”可能以很高的概率(大致与盆地内的状态数成正比)并在很长一段时间内(大致以指数形式反映周围状态与盆地底部的能量差)困住模拟退火算法。
    +
  --~~这句话我有些不太理解 只能先直译一下,感觉有些像说 盆地是指较优点,会困住模拟退火算法
 +
 
As a rule, it is impossible to design a candidate generator that will satisfy this goal and also prioritize candidates with similar energy. On the other hand, one can often vastly improve the efficiency of simulated annealing by relatively simple changes to the generator. In the traveling salesman problem, for instance, it is not hard to exhibit two tours <math>A</math>, <math>B</math>, with nearly equal lengths, such that (1) <math>A</math> is optimal, (2) every sequence of city-pair swaps that converts <math>A</math> to <math>B</math> goes through tours that are much longer than both, and (3) <math>A</math> can be transformed into <math>B</math> by flipping (reversing the order of) a set of consecutive cities. In this example, <math>A</math> and <math>B</math> lie in different "deep basins" if the generator performs only random pair-swaps; but they will be in the same basin if the generator performs random segment-flips.
 
As a rule, it is impossible to design a candidate generator that will satisfy this goal and also prioritize candidates with similar energy. On the other hand, one can often vastly improve the efficiency of simulated annealing by relatively simple changes to the generator. In the traveling salesman problem, for instance, it is not hard to exhibit two tours <math>A</math>, <math>B</math>, with nearly equal lengths, such that (1) <math>A</math> is optimal, (2) every sequence of city-pair swaps that converts <math>A</math> to <math>B</math> goes through tours that are much longer than both, and (3) <math>A</math> can be transformed into <math>B</math> by flipping (reversing the order of) a set of consecutive cities. In this example, <math>A</math> and <math>B</math> lie in different "deep basins" if the generator performs only random pair-swaps; but they will be in the same basin if the generator performs random segment-flips.
   第393行: 第401行:     
一般来说,不可能设计一个候选生成器,既能满足这个目标,又能优先考虑具有类似能量的候选人。另一方面,人们通常可以通过对生成器进行相对简单的改变来大大提高模拟退火的效率。例如,在旅行推销员问题中,不难发现两个长度几乎相等的旅行数学 a / 数学 b / 数学,比如(1)数学 a / 数学是最优的; (2)每一个将数学 a / 数学转换为数学 b / 数学的城市对交换序列都要经过比两者都长得多的旅行; (3)数学 a / 数学可以通过翻转一组连续的城市转换为数学 b / 数学。在这个例子中,如果生成器只进行随机对偶交换,数学 a / math 和数学 b / math 位于不同的“深盆”中; 但是如果生成器执行随机分段翻转,它们将位于同一盆中。
 
一般来说,不可能设计一个候选生成器,既能满足这个目标,又能优先考虑具有类似能量的候选人。另一方面,人们通常可以通过对生成器进行相对简单的改变来大大提高模拟退火的效率。例如,在旅行推销员问题中,不难发现两个长度几乎相等的旅行数学 a / 数学 b / 数学,比如(1)数学 a / 数学是最优的; (2)每一个将数学 a / 数学转换为数学 b / 数学的城市对交换序列都要经过比两者都长得多的旅行; (3)数学 a / 数学可以通过翻转一组连续的城市转换为数学 b / 数学。在这个例子中,如果生成器只进行随机对偶交换,数学 a / math 和数学 b / math 位于不同的“深盆”中; 但是如果生成器执行随机分段翻转,它们将位于同一盆中。
 +
通常,不可能设计一个候选生成器来满足这个目标,并且优先考虑具有相同能量的候选。
 +
一般来说,不可能设计一个候选生成器,既能满足这个目标,又能优先考虑具有类似能量的候选移动。另一方面,通过对生成器进行相对简单的更改,可以极大地提高模拟退火的效率。例如在旅行商问题中,不难看出旅行<math>A</math>, <math>B</math>有着近似相等的长度
 +
(1)<math>A</math>处最优;
 +
(2)从<math>A</math>到<math>B</math>的每个城市对相互转换,经过更长的旅程
 +
(3) 通过翻转(颠倒顺序)一系列连续的城市<math>A</math>可以转换成<math>B</math>。
 +
在这个例子中,如果生成器只执行随机的对交换,<math>A</math>和<math>B</math>位于不同的“深盆”中;但是,如果生成器执行随机段翻转,它们将处于相同的池中。
         −
===Cooling schedule===
+
===冷却时刻表Cooling schedule===
    
The physical analogy that is used to justify simulated annealing assumes that the cooling rate is low enough for the probability distribution of the current state to be near [[thermodynamic equilibrium]] at all times.  Unfortunately, the ''relaxation time''—the time one must wait for the equilibrium to be restored after a change in temperature—strongly depends on the "topography" of the energy function and on the current temperature.  In the simulated annealing algorithm, the relaxation time also depends on the candidate generator, in a very complicated way.  Note that all these parameters are usually provided as [[procedural parameter|black box functions]] to the simulated annealing algorithm. Therefore, the ideal cooling rate cannot be determined beforehand, and should be empirically adjusted for each problem. [[Adaptive simulated annealing]] algorithms address this problem by connecting the cooling schedule to the search progress. Other adaptive approach as Thermodynamic Simulated Annealing<ref>{{cite journal |doi=10.1016/j.physleta.2003.08.070 |title=Placement by thermodynamic simulated annealing |year=2003 |last1=De Vicente |first1=Juan |last2=Lanchares |first2=Juan |last3=Hermida |first3=Román |journal=Physics Letters A |volume=317 |issue=5–6 |pages=415–423|bibcode=2003PhLA..317..415D }}</ref>, automatically adjusts the temperature at each step based on the energy difference between the two states, according to the laws of thermodynamics.
 
The physical analogy that is used to justify simulated annealing assumes that the cooling rate is low enough for the probability distribution of the current state to be near [[thermodynamic equilibrium]] at all times.  Unfortunately, the ''relaxation time''—the time one must wait for the equilibrium to be restored after a change in temperature—strongly depends on the "topography" of the energy function and on the current temperature.  In the simulated annealing algorithm, the relaxation time also depends on the candidate generator, in a very complicated way.  Note that all these parameters are usually provided as [[procedural parameter|black box functions]] to the simulated annealing algorithm. Therefore, the ideal cooling rate cannot be determined beforehand, and should be empirically adjusted for each problem. [[Adaptive simulated annealing]] algorithms address this problem by connecting the cooling schedule to the search progress. Other adaptive approach as Thermodynamic Simulated Annealing<ref>{{cite journal |doi=10.1016/j.physleta.2003.08.070 |title=Placement by thermodynamic simulated annealing |year=2003 |last1=De Vicente |first1=Juan |last2=Lanchares |first2=Juan |last3=Hermida |first3=Román |journal=Physics Letters A |volume=317 |issue=5–6 |pages=415–423|bibcode=2003PhLA..317..415D }}</ref>, automatically adjusts the temperature at each step based on the energy difference between the two states, according to the laws of thermodynamics.
第402行: 第416行:  
The physical analogy that is used to justify simulated annealing assumes that the cooling rate is low enough for the probability distribution of the current state to be near thermodynamic equilibrium at all times.  Unfortunately, the relaxation time—the time one must wait for the equilibrium to be restored after a change in temperature—strongly depends on the "topography" of the energy function and on the current temperature.  In the simulated annealing algorithm, the relaxation time also depends on the candidate generator, in a very complicated way.  Note that all these parameters are usually provided as black box functions to the simulated annealing algorithm. Therefore, the ideal cooling rate cannot be determined beforehand, and should be empirically adjusted for each problem. Adaptive simulated annealing algorithms address this problem by connecting the cooling schedule to the search progress. Other adaptive approach as Thermodynamic Simulated Annealing, automatically adjusts the temperature at each step based on the energy difference between the two states, according to the laws of thermodynamics.
 
The physical analogy that is used to justify simulated annealing assumes that the cooling rate is low enough for the probability distribution of the current state to be near thermodynamic equilibrium at all times.  Unfortunately, the relaxation time—the time one must wait for the equilibrium to be restored after a change in temperature—strongly depends on the "topography" of the energy function and on the current temperature.  In the simulated annealing algorithm, the relaxation time also depends on the candidate generator, in a very complicated way.  Note that all these parameters are usually provided as black box functions to the simulated annealing algorithm. Therefore, the ideal cooling rate cannot be determined beforehand, and should be empirically adjusted for each problem. Adaptive simulated annealing algorithms address this problem by connecting the cooling schedule to the search progress. Other adaptive approach as Thermodynamic Simulated Annealing, automatically adjusts the temperature at each step based on the energy difference between the two states, according to the laws of thermodynamics.
   −
用来证明模拟退火的物理类比是假设冷却速率足够低,以至于当前状态的概率分布在任何时候都接近热力学平衡。不幸的是,弛豫时间——温度变化后必须等待平衡恢复的时间——强烈地依赖于能量函数的“地形”和当前的温度。在模拟退火算法中,松弛时间以一种非常复杂的方式也依赖于候选生成器。请注意,所有这些参数通常作为黑盒函数提供给模拟退火算法。因此,理想的冷却速率不能事先确定,而应根据每个问题进行经验调整。自适应模拟退火算法通过连接冷却计划和搜索进度来解决这个问题。其他的自适应方法,如热力学模拟退火,根据美国热力学定律协会的说法,可以根据两种状态之间的能量差自动调节每一步的温度。
+
用来证明模拟退火合理性的物理类比是假定冷却速率足够低,使当前状态的概率分布在任何时候都接近热力学平衡。
 +
遗憾的是,弛豫时间——一个人必须等待平衡在温度变化后恢复的时间——强烈地依赖于能量函数的“地形”和当前的温度。
 +
在模拟退火算法中,弛豫时间也以非常复杂的方式依赖于候选生成器。
 +
注意,所有这些参数通常都作为黑盒函数提供给模拟退火算法。
 +
因此,理想的冷却速率不能预先确定,应该根据实际情况对每个问题进行调整。
 +
自适应模拟退火算法通过将冷却计划与搜索过程相结合来解决这个问题。
 +
其他自适应方法如热力模拟退火,根据美国热力学定律协会的说法,可以根据两种状态之间的能量差自动调节每一步的温度。
    
==Restarts==
 
==Restarts==
579

个编辑