“哥德尔机”的版本间的差异

来自集智百科 - 复杂系统|人工智能|复杂科学|复杂网络|自组织
跳到导航 跳到搜索
ZQ讨论 | 贡献
第37行: 第37行:
  
 
There are three variables that are particularly useful in the run time of the Gödel machine.<ref name="Gödel Machines." />
 
There are three variables that are particularly useful in the run time of the Gödel machine.<ref name="Gödel Machines." />
 
有三个变量在哥德尔机器的运行时特别有用<ref name="Gödel Machines." />
 
 
有三个变量在哥德尔机器的运行时特别有用。
 
  
 
* At some time <math>t</math>, the variable <math>\text{time}</math> will have the binary equivalent of <math>t</math>. This is incremented steadily throughout the run time of the machine.
 
* At some time <math>t</math>, the variable <math>\text{time}</math> will have the binary equivalent of <math>t</math>. This is incremented steadily throughout the run time of the machine.
 
* At some time t, the variable \text{time} will have the binary equivalent of t. This is incremented steadily throughout the run time of the machine.
 
 
 
* 在某个时刻 ''t'',变量 '''time''' 将具有与 ''t'' 等价的二进制值。在整个机器的运行时间中,这个值逐步递增。
 
  
 
* Any [[input (computer science)|input]] meant for the Gödel machine from the natural environment is stored in variable <math>x</math>. It is likely the case that <math>x</math> will hold different values for different values of variable <math>\text{time}</math>.
 
* Any [[input (computer science)|input]] meant for the Gödel machine from the natural environment is stored in variable <math>x</math>. It is likely the case that <math>x</math> will hold different values for different values of variable <math>\text{time}</math>.
  
* Any input meant for the Gödel machine from the natural environment is stored in variable x. It is likely the case that x will hold different values for different values of variable \text{time}.
+
* The outputs of the Gödel machine are stored in variable <math>y</math>, where <math>y(t)</math> would be the output bit-string at some time <math>t</math>.
 
 
  
* 自然环境中用于哥德尔机器的任何输入都存储在变量 ''x'' 中。可能的情况是,对于变量 '''time''' 的不同值,''x'' 将持有不同的值。
+
有三个变量在哥德尔机运行时非常有用<ref name="Gödel Machines." />
  
* The outputs of the Gödel machine are stored in variable <math>y</math>, where <math>y(t)</math> would be the output bit-string at some time <math>t</math>.
+
* 在某个时刻<math>t</math>,变量<math>\text{time}</math>将具有与<math>t</math>等价的二进制值。这在机器运行过程中会稳步增加。
  
* The outputs of the Gödel machine are stored in variable y, where y(t) would be the output bit-string at some time t.
+
* 任何来自自然环境的输入都存储在变量<math>x</math>中。可能的情况中,对于不同的<math>\text{time}</math>值,<math>x</math>具有不同的值。
  
 +
* Gödel机器的输出存储在变量<math>y</math>中,其中<math>y(t)</math>是某个时间<math>t</math>的bit-string输出。
  
* 哥德尔机的输出存储在变量 y 中,其中 y (t)是某个时刻的输出位串 t。
 
  
 
At any given time <math>t</math>, where <math>(1 \leq t \leq T)</math>, the goal is to maximize future success or utility. A typical ''utility function'' follows the pattern <math>u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R}</math>:
 
At any given time <math>t</math>, where <math>(1 \leq t \leq T)</math>, the goal is to maximize future success or utility. A typical ''utility function'' follows the pattern <math>u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R}</math>:
 
At any given time t, where (1 \leq t \leq T), the goal is to maximize future success or utility. A typical utility function follows the pattern u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R}:
 
 
在任何给定的时间 t,其中(1 leq t leq t) ,目标是最大化未来的成功或效用。典型的效用函数遵循 u (s,mathrm { Env }) : s 乘以 e,right tarrow mathbb { r } :
 
  
 
: <math>u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg]</math>
 
: <math>u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg]</math>
 
: u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg]
 
 
: u (s,mathrm { Env }) = e _ mu Bigg [ sum { tau = text { time } ^ t r (tau) mid s,mathrm { Env } Bigg ]
 
  
 
where <math>r(t)</math> is a real-valued reward input (encoded within <math>s(t)</math>) at time <math>t</math>, <math>E_\mu [ \cdot \mid \cdot ]</math> denotes the
 
where <math>r(t)</math> is a real-valued reward input (encoded within <math>s(t)</math>) at time <math>t</math>, <math>E_\mu [ \cdot \mid \cdot ]</math> denotes the
第79行: 第61行:
 
set <math>M</math> of possible distributions (<math>M</math> reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned <math>\text{time} = \operatorname{time}(s)</math> is a function of state <math>s</math> which uniquely identifies the current cycle.<ref name="Gödel Machines."/> Note that we take into account the possibility of extending the expected lifespan through appropriate actions.<ref name="Gödel Machines."/>
 
set <math>M</math> of possible distributions (<math>M</math> reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned <math>\text{time} = \operatorname{time}(s)</math> is a function of state <math>s</math> which uniquely identifies the current cycle.<ref name="Gödel Machines."/> Note that we take into account the possibility of extending the expected lifespan through appropriate actions.<ref name="Gödel Machines."/>
  
where r(t) is a real-valued reward input (encoded within s(t)) at time t, E_\mu [ \cdot \mid \cdot ] denotes the
+
在任何给定的时刻<math>t</math>,其中<math>(1 \leq t \leq T)</math>,其目标是最大化未来的成功或有效性。典型的“效用函数(utility function)”遵循<math>u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R}</math>的模式:
conditional expectation operator with respect to some possibly unknown distribution \mu from a
+
 
set M of possible distributions (M reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned \text{time} = \operatorname{time}(s) is a function of state s which uniquely identifies the current cycle. Note that we take into account the possibility of extending the expected lifespan through appropriate actions.
+
: <math>u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg]</math>
 +
 
 +
其中<math>r(t)</math>是在时刻<math>t</math>的实值奖励输入(编码在<math>s(t)</math>中),<math>E_\mu [ \cdot \mid \cdot ]</math>表示条件期望算子,关于来自可能的分布集合<math>M</math>中的某个可能未知的分布<math>\mu</math>(<math>M</math>反映了关于环境的可能概率反应的已知信息),上面提到的<math>\text{time} = \operatorname{time}(s)</math>是状态<math>s</math>的一个函数,它唯一地确定了当前的周期。此外,这里考虑了通过适当的行动来延长预期寿命的可能性。
  
其中 r (t)是时间 t 的实值奖赏输入(编码在 s (t)内) ,e _ mu [ cdot mid cdot ]表示条件期望算子对于一组可能的分布集合 m 中某些可能未知的分布 μ (m 反映了关于环境可能的概率反应的任何已知) ,上述文本{ time } = 操作者{ time }(s)是唯一标识当前周期的状态函数。请注意,我们考虑到了通过适当的行动延长预期寿命的可能性。
 
  
 
{{Clear}}
 
{{Clear}}
  
==Instructions used by proof techniques==
 
 
==Instructions used by proof techniques==
 
  
= = 证明技术使用的指令 = =  
+
== Instructions used by proof techniques ==  
  
 
{{Confusing|Section|date=September 2017}}
 
{{Confusing|Section|date=September 2017}}
第98行: 第78行:
 
to insert an incorrect theorem into proof, thus trivializing proof verification.<ref name="Gödel Machines." />
 
to insert an incorrect theorem into proof, thus trivializing proof verification.<ref name="Gödel Machines." />
  
The nature of the six proof-modifying instructions below makes it impossible
+
以下六个证明修改指令的性质使得不可能将错误的定理插入证明,从而使证明验证变得简单<ref name="Gödel Machines." />。
to insert an incorrect theorem into proof, thus trivializing proof verification.
 
 
 
下面的六个证明修改指令的性质使得不可能在证明中插入一个错误的定理,从而使证明验证变得无足轻重。
 
  
 
===get-axiom(''n'')===
 
===get-axiom(''n'')===
 
===get-axiom(n)===
 
 
= = = get-axiom (n) = =
 
  
 
Appends the ''n''-th [[axiom]] as a theorem to the current theorem sequence. Below is the initial axiom scheme:
 
Appends the ''n''-th [[axiom]] as a theorem to the current theorem sequence. Below is the initial axiom scheme:
 
Appends the n-th axiom as a theorem to the current theorem sequence. Below is the initial axiom scheme:
 
 
将第 n 公理作为一个定理附加到当前定理序列。下面是最初的公理方案:
 
  
 
* '''Hardware Axioms''' formally specify how components of the machine could change from one cycle to the next.
 
* '''Hardware Axioms''' formally specify how components of the machine could change from one cycle to the next.
第122行: 第91行:
 
* '''Utility Axioms''' describe the overall goal in the form of utility function ''u''.
 
* '''Utility Axioms''' describe the overall goal in the form of utility function ''u''.
  
* Hardware Axioms formally specify how components of the machine could change from one cycle to the next.
+
将第''n''个公理作为定理追加到当前的定理序列中。下面是初始公理方案:
* Reward Axioms define the computational cost of hardware instruction and the physical cost of output actions. Related Axioms also define the lifetime of the Gödel machine as scalar quantities representing all rewards/costs.
 
* Environment Axioms restrict the way new inputs x are produced from the environment, based on previous sequences of inputs y.
 
* Uncertainty Axioms/String Manipulation Axioms are standard axioms for arithmetic, calculus, probability theory, and string manipulation that allow for the construction of proofs related to future variable values within the Gödel machine.
 
* Initial State Axioms contain information about how to reconstruct parts or all of the initial state.
 
* Utility Axioms describe the overall goal in the form of utility function u.
 
  
 +
* '''硬件公理'''正式指定了机器的组件如何从一个周期变为下一个周期。
  
* 硬件公理正式指定机器的组件如何从一个周期改变到下一个周期。
+
* '''奖励公理'''定义了硬件指令的计算成本以及输出动作的物理成本。相关公理还将哥德尔机的寿命定义为表示所有奖励/成本的标量。
* 奖励公理定义硬件指令的计算成本和输出动作的实际成本。相关公理还将哥德尔机的寿命定义为表示所有报酬/成本的标量。
 
* 环境公理限制了基于先前输入序列 y 从环境中产生新输入 x 的方式。
 
* 不确定性公理/字符串操作公理是算术、微积分、概率论和字符串操作的标准公理,允许在哥德尔机器中构造与未来变量值相关的证明。
 
* 初始状态公理包含如何重构部分或全部初始状态的信息。
 
* 效用公理以效用函数 u 的形式描述总体目标。
 
  
===apply-rule(''k'', ''m'', ''n'')===
+
* '''环境公理'''限制了根据先前输入序列''y''从环境中产生新输入''x''的方式。
  
===apply-rule(k, m, n)===
+
* '''不确定性公理/字符串操作公理'''是关于算术、微积分、概率论和字符串操作的标准公理,允许在哥德尔机内构建与未来变量值有关的证明。
  
= = = apply-rule (k,m,n) = =
+
* ''初始状态公理'''包含有关如何重建部分或所有初始状态的信息。
  
Takes in the index ''k'' of an inference rule (such as [[Modus tollens]], [[Modus ponens]]), and attempts to apply it to the two previously proved theorems ''m'' and ''n''. The resulting theorem is then added to the proof.
+
* '''效用公理'''以效用函数''u''的形式描述了总体目标。
  
Takes in the index k of an inference rule (such as Modus tollens, Modus ponens), and attempts to apply it to the two previously proved theorems m and n. The resulting theorem is then added to the proof.
 
  
引入推理规则的索引 k (如否定式、否定式) ,并试图将其应用于前面已证明的两个定理 m 和 n。然后将得到的定理加到证明中。
+
===apply-rule(''k'', ''m'', ''n'')===
  
===delete-theorem(''m'')===
+
Takes in the index ''k'' of an inference rule (such as [[Modus tollens]], [[Modus ponens]]), and attempts to apply it to the two previously proved theorems ''m'' and ''n''. The resulting theorem is then added to the proof.
  
===delete-theorem(m)===
+
引入推理规则(如Modus tollens,Modus ponens)的索引''k'',并尝试将其应用于两个先前证明的定理''m''和''n''。然后将得到的定理添加到证明中。
  
= = delete-theorem (m) = =  
+
===delete-theorem(''m'')===
  
 
Deletes the theorem stored at index ''m'' in the current proof. This helps to mitigate storage constraints caused by redundant and unnecessary theorems. Deleted theorems can no longer be referenced by the above '''''apply-rule''''' function.
 
Deletes the theorem stored at index ''m'' in the current proof. This helps to mitigate storage constraints caused by redundant and unnecessary theorems. Deleted theorems can no longer be referenced by the above '''''apply-rule''''' function.
  
Deletes the theorem stored at index m in the current proof. This helps to mitigate storage constraints caused by redundant and unnecessary theorems. Deleted theorems can no longer be referenced by the above apply-rule function.
+
删除当前证明中索引''m''处存储的定理。这有助于减轻由冗余和不必要的定理引起的存储限制。已删除的定理将无法再被上述'''''apply-rule'''''函数引用。
  
删除当前证明中存储在索引 m 处的定理。这有助于减轻由冗余和不必要的定理引起的存储约束。上述应用规则函数不能再引用删除定理。
 
  
 
===set-switchprog(''m'', ''n'')===
 
===set-switchprog(''m'', ''n'')===
  
===set-switchprog(m, n)===
 
 
= = set-switchprog (m,n) = =
 
  
 
Replaces ''switchprog'' ''S<sup> p</sup><sub>m:n</sub>'', provided it is a non-empty [[substring]] of ''S<sup> p</sup>''.
 
Replaces ''switchprog'' ''S<sup> p</sup><sub>m:n</sub>'', provided it is a non-empty [[substring]] of ''S<sup> p</sup>''.
  
Replaces switchprog S pm:n, provided it is a non-empty substring of S p.
+
替换''switchprog'' ''S<sup> p</sup><sub>m:n</sub>'', 前提为其是''S<sup> p</sup>''的非空子字符串。
 
 
替换 switchprog s pm: n,前提是它是 s p 的非空子字符串。
 
 
 
===check()===
 
  
 
===check()===
 
===check()===
 
= = 谢谢观赏 = =
 
  
 
Verifies whether the goal of the proof search has been reached. A target theorem states that given the current axiomatized utility function ''u'' (Item 1f), the utility of a switch from ''p'' to the current switchprog would be higher than the utility of continuing the execution of ''p'' (which would keep searching for alternative switchprogs).<ref name="Gödel Machines." /> This is demonstrated in the following description of the decoded check() function for the Gödel Machine:
 
Verifies whether the goal of the proof search has been reached. A target theorem states that given the current axiomatized utility function ''u'' (Item 1f), the utility of a switch from ''p'' to the current switchprog would be higher than the utility of continuing the execution of ''p'' (which would keep searching for alternative switchprogs).<ref name="Gödel Machines." /> This is demonstrated in the following description of the decoded check() function for the Gödel Machine:
  
Verifies whether the goal of the proof search has been reached. A target theorem states that given the current axiomatized utility function u (Item 1f), the utility of a switch from p to the current switchprog would be higher than the utility of continuing the execution of p (which would keep searching for alternative switchprogs). This is demonstrated in the following description of the decoded check() function for the Gödel Machine:
+
验证证明搜索的目标是否已经达到。目标定理指出,根据当前的公理化效用函数''u''(项目1f),从''p''切换到当前切换程序的效用将高于继续执行''p''的效用(这将继续寻找替代切换程序)。<ref name="Gödel Machines." />
 
 
验证证明搜索的目的是否已经达到。一个目标定理指出,给定当前公理化效用函数 u (条目1f) ,从 p 切换到当前切换的效用将高于继续执行 p 的效用(继续搜索替代切换项)。下面对 Gödel Machine 的 decoded check ()函数的描述说明了这一点:
 
 
 
: <math> D_{KA} = \frac{ d_\text{pore}} 3 u = \frac{d_\text{pore}} 3 \sqrt{\frac{8\kappa NT}{\pi M_A}} </math>
 
 
 
:  D_{KA} = \frac{ d_\text{pore}} 3 u = \frac{d_\text{pore}} 3 \sqrt{\frac{8\kappa NT}{\pi M_A}}
 
 
 
3 u = frac { d _ text { pore }3 sqrt { kappa NT }{ pi m _ a }
 
  
 
===state2theorem(''m'', ''n'')===
 
===state2theorem(''m'', ''n'')===
 
===state2theorem(m, n)===
 
 
= = = 状态2theorem (m,n) = =
 
  
 
Takes in two arguments, ''m'' and ''n'', and attempts to convert the contents of ''S<sub>m:n</sub>'' into a theorem.
 
Takes in two arguments, ''m'' and ''n'', and attempts to convert the contents of ''S<sub>m:n</sub>'' into a theorem.
  
Takes in two arguments, m and n, and attempts to convert the contents of Sm:n into a theorem.
+
输入两个参数,''m''和''n'',并尝试将''S<sub>m:n</sub>''的内容转换为一个定理。
  
接受两个参数,m 和 n,并试图将 Sm: n 的内容转换成一个定理。
 
  
 
==Example applications==
 
==Example applications==
 
==Example applications==
 
 
= = 应用实例 = =
 
  
 
===Time-limited NP-hard optimization===
 
===Time-limited NP-hard optimization===
 
===Time-limited NP-hard optimization===
 
 
= = = 时间有限的 np 难优化 = = =
 
  
 
The initial input to the Gödel machine is the representation of a connected graph with a large number of [[Vertex (graph theory)|nodes]] linked by edges of various lengths. Within given time ''T'' it should find a [[cycle (graph theory)|cyclic]] path connecting all nodes. The only real-valued reward will occur at time ''T''. It equals 1 divided by the length of the best path found so far (0 if none was found). There are no other inputs. The by-product of maximizing expected reward is to find the shortest path findable within the limited time, given the initial bias.<ref name="Gödel Machines." />
 
The initial input to the Gödel machine is the representation of a connected graph with a large number of [[Vertex (graph theory)|nodes]] linked by edges of various lengths. Within given time ''T'' it should find a [[cycle (graph theory)|cyclic]] path connecting all nodes. The only real-valued reward will occur at time ''T''. It equals 1 divided by the length of the best path found so far (0 if none was found). There are no other inputs. The by-product of maximizing expected reward is to find the shortest path findable within the limited time, given the initial bias.<ref name="Gödel Machines." />
  
The initial input to the Gödel machine is the representation of a connected graph with a large number of nodes linked by edges of various lengths. Within given time T it should find a cyclic path connecting all nodes. The only real-valued reward will occur at time T. It equals 1 divided by the length of the best path found so far (0 if none was found). There are no other inputs. The by-product of maximizing expected reward is to find the shortest path findable within the limited time, given the initial bias.
+
哥德尔机的初始输入是为连通图,其中包含大量由不同长度的边连接起来的节点。在给定的时间''T''内,哥德尔机需要找到一个连接所有节点的循环路径。唯一的实值奖励将发生在时间''T''。等于1除以到目前为止找到的最佳路径的长度(如果没有找到则为0)。没有其他输入。在有限时间内找到最短路径的副产品是在给定初始偏差下的最大化期望奖励,需要。<ref name="Gödel Machines." />
  
哥德尔机器的初始输入是一个连通图的表示,其中包含大量由不同长度的边连接起来的节点。在给定的时间 t 内,它应该找到一个连接所有节点的循环路径。唯一真正有价值的奖励将发生在 t 时刻。它等于1除以迄今为止找到的最佳路径的长度(如果没有找到,则为0)。没有其他的输入。最大化期望报酬的副产品是在给定初始偏差的情况下,在有限时间内找到最短路径。
 
  
 
===Fast theorem proving===
 
===Fast theorem proving===
 
===Fast theorem proving===
 
 
= = = 快速定理证明 = = =
 
  
 
Prove or disprove as quickly as possible that all even integer > 2 are the sum of two primes ([[Goldbach’s conjecture]]). The reward is 1/''t'', where ''t'' is the time required to produce and verify the first such proof.<ref>{{cite journal|last1=Schmidhuber|first1=Jürgen|title=Ultimate Cognition à la Gödel|journal=Cognitive Computation|volume=1|issue=2|pages=177–193|doi=10.1007/s12559-009-9014-y|date=5 March 2009|citeseerx=10.1.1.218.3323}}<!--|accessdate=11 November 2014--></ref>
 
Prove or disprove as quickly as possible that all even integer > 2 are the sum of two primes ([[Goldbach’s conjecture]]). The reward is 1/''t'', where ''t'' is the time required to produce and verify the first such proof.<ref>{{cite journal|last1=Schmidhuber|first1=Jürgen|title=Ultimate Cognition à la Gödel|journal=Cognitive Computation|volume=1|issue=2|pages=177–193|doi=10.1007/s12559-009-9014-y|date=5 March 2009|citeseerx=10.1.1.218.3323}}<!--|accessdate=11 November 2014--></ref>
  
Prove or disprove as quickly as possible that all even integer > 2 are the sum of two primes (Goldbach’s conjecture). The reward is 1/t, where t is the time required to produce and verify the first such proof.
+
尽可能快地证明或反驳所有大于2的偶数都是两个质数之和(即哥德巴赫猜想)。奖励是1/''t'',其中''t''是产生和验证第一个这样的证明所需的时间。<ref name="ultimate" />
  
尽快证明或证伪所有大于2的偶数都是两个素数之和(哥德巴赫猜想)。奖励是1/t,其中 t 是出示和验证第一份证据所需的时间。
 
  
 
===Maximizing expected reward with bounded resources===
 
===Maximizing expected reward with bounded resources===
  
===Maximizing expected reward with bounded resources===
 
 
= = = 用有界资源最大化期望报酬 = = =
 
 
[[文件:生产任务流程图.jpg|缩略图|570x570像素]]
 
[[文件:生产任务流程图.jpg|缩略图|570x570像素]]
 
A [[cognitive]] robot that needs at least 1 [[liter]] of gasoline per hour interacts with a partially unknown environment, trying to find hidden, limited gasoline depots to occasionally refuel its tank. It is rewarded in proportion to its lifetime, and dies after at most 100 years or as soon as its tank is empty or it falls off a cliff, and so on. The [[probabilistic]] environmental reactions are initially unknown but assumed to be sampled from the axiomatized Speed Prior, according to which hard-to-compute environmental reactions are unlikely. This permits a computable strategy for making near-optimal predictions. One by-product of maximizing expected reward is to maximize expected lifetime.<ref name="Gödel Machines." />
 
A [[cognitive]] robot that needs at least 1 [[liter]] of gasoline per hour interacts with a partially unknown environment, trying to find hidden, limited gasoline depots to occasionally refuel its tank. It is rewarded in proportion to its lifetime, and dies after at most 100 years or as soon as its tank is empty or it falls off a cliff, and so on. The [[probabilistic]] environmental reactions are initially unknown but assumed to be sampled from the axiomatized Speed Prior, according to which hard-to-compute environmental reactions are unlikely. This permits a computable strategy for making near-optimal predictions. One by-product of maximizing expected reward is to maximize expected lifetime.<ref name="Gödel Machines." />
  
A cognitive robot that needs at least 1 liter of gasoline per hour interacts with a partially unknown environment, trying to find hidden, limited gasoline depots to occasionally refuel its tank. It is rewarded in proportion to its lifetime, and dies after at most 100 years or as soon as its tank is empty or it falls off a cliff, and so on. The probabilistic environmental reactions are initially unknown but assumed to be sampled from the axiomatized Speed Prior, according to which hard-to-compute environmental reactions are unlikely. This permits a computable strategy for making near-optimal predictions. One by-product of maximizing expected reward is to maximize expected lifetime.
+
一个认知机器人每小时至少需要1升汽油与部分未知的环境互动,试图找到隐藏的、有限的汽油库以不时地为其油箱加油。它的奖励与其寿命成正比,最多在100年后死亡,或者在油箱空了或掉下悬崖等情况下立即死亡。最初未知的概率环境反应被假设为来自公理化的速度先验,根据这一先验,难以计算的环境反应是不可能的。这允许一个可计算的策略来进行接近最优的预测。最大化期望奖励的一个副产品是最大化预期寿命。<ref name="Gödel Machines." />
  
一个每小时至少需要1升汽油的认知机器人与部分未知的环境互动,试图找到隐藏的、有限的汽油库,偶尔给油箱加油。它被按照寿命的比例给予奖励,最多100年后死亡,或者在它的水箱空了或者掉下悬崖时死亡,等等。概率环境反应最初是未知的,但假定是从公理化速度优先采样,根据难以计算的环境反应是不可能的。这使得一个可计算的策略可以做出接近最优的预测。最大化期望报酬的一个副产品是最大化期望寿命。
 
  
 
==See also==
 
==See also==
[[Gödel's incompleteness theorems]]
 
  
Gödel's incompleteness theorems
+
[[哥德尔不完备定理]]
 
 
= = 也见
 
* 哥德尔的不完备性定理
 
  
 
==References==
 
==References==
第259行: 第173行:
 
* [http://www.idsia.ch/~juergen/goedelmachine.html Goedel machines home page]
 
* [http://www.idsia.ch/~juergen/goedelmachine.html Goedel machines home page]
 
* [https://arxiv.org/abs/cs.LO/0309048 Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements]
 
* [https://arxiv.org/abs/cs.LO/0309048 Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements]
 
 
* Goedel machines home page
 
* Goedel machines home page
 
* Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements
 
* Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements
 
= = 外部链接 = =
 
* Goedel Machines 主页
 
* Goedel Machines: Self-Referential Universal Problem Solvers Making provedable Optimal Self-Improvements
 
  
 
{{DEFAULTSORT:Godel machine}}
 
{{DEFAULTSORT:Godel machine}}
 
[[Category:Artificial intelligence]]
 
[[Category:Artificial intelligence]]
 
  
 
Category:Artificial intelligence
 
Category:Artificial intelligence
 
类别: 人工智能
 
  
 
<noinclude>
 
<noinclude>

2023年4月17日 (一) 16:34的版本

A Gödel machine is a hypothetical self-improving computer program that solves problems in an optimal way.模板:Clarify It uses a recursive self-improvement protocol in which it rewrites its own code when it can prove the new code provides a better strategy.[1][2] The machine was invented by Jürgen Schmidhuber (first proposed in 2003[3]), but is named after Kurt Gödel who inspired the mathematical theories.[4]


哥德尔机是一种假想中可以进行自我完善,从而以最优的方式解决问题的通用计算机程序。哥德尔机基于递归的自我提升架构 (recursive self-improvement),即当它能够自我证明新的代码提供了更好的解决方案时,它会自动更新为更优的代码[1][2] 。哥德尔机是由Jürgen Schmidhuber (LSTM网络提出者) 于2003年提出[3]。因为Jürgen受到库尔特·哥德尔的数学理论的启发而创建该模型,所以命名为哥德尔机[4]


The Gödel machine is often discussed when dealing with issues of meta-learning, also known as "learning to learn." Applications include automating human design decisions and transfer of knowledge between multiple related tasks, and may lead to design of more robust and general learning architectures.[5] Though theoretically possible, no full implementation has been created.[6]

哥德尔机经常在元学习(meta-learning)领域被讨论,是一种被称为“学会学习”的过程。应用包括自动化人类设计决策,以及在多个相关任务之间进行知识传递,用于设计更加强大和通用的学习架构[5] 。但是目前哥德尔机只存在于理论当中,尚未完整的实现[6]

The Gödel machine is often compared with Marcus Hutter's AIXItl, another formal specification for an artificial general intelligence. Schmidhuber points out that the Gödel machine could start out by implementing AIXItl as its initial sub-program, and self-modify after it finds proof that another algorithm for its search code will be better.[7]

The Gödel machine is often compared with Marcus Hutter's AIXItl, another formal specification for an artificial general intelligence. Schmidhuber points out that the Gödel machine could start out by implementing AIXItl as its initial sub-program, and self-modify after it finds proof that another algorithm for its search code will be better.


人们常将哥德尔机与Marcus Hutter的AIXItl进行比较,后者是另一种通用人工智能范式。Schmidhuber指出,哥德尔机可以首先通过实现AIXItl作为其初始子程序,并在找到能够被证明的另一个更好的算法后进行自我修改(self-modify)。

局限性

Traditional problems solved by a computer only require one input and provide some output. Computers of this sort had their initial algorithm hardwired.[8] This doesn't take into account the dynamic natural environment, and thus was a goal for the Gödel machine to overcome.


传统的由计算机解决的问题只需要进行输入,然后计算机给出输出。这类计算机的初始算法是硬编码的[8] 。这并没有考虑到动态的自然环境,因此哥德尔机的目标就是克服这一问题。

The Gödel machine has limitations of its own, however. According to Gödel's First Incompleteness Theorem, any formal system that encompasses arithmetic is either flawed or allows for statements that cannot be proved in the system. Hence even a Gödel machine with unlimited computational resources must ignore those self-improvements whose effectiveness it cannot prove.[3]

The Gödel machine has limitations of its own, however. According to Gödel's First Incompleteness Theorem, any formal system that encompasses arithmetic is either flawed or allows for statements that cannot be proved in the system. Hence even a Gödel machine with unlimited computational resources must ignore those self-improvements whose effectiveness it cannot prove.

然而,哥德尔机器本身也有局限性。根据哥德尔的第一条不完备定理,任何包含算术的形式系统要么是有缺陷的,要么允许系统内无法证明的陈述。因此,即使一个拥有无限计算资源的哥德尔机也必须忽略那些无法证明其有效性的自我改进。

Variables of interest

一些变量

模板:Confusing

There are three variables that are particularly useful in the run time of the Gödel machine.[3]

  • At some time [math]\displaystyle{ t }[/math], the variable [math]\displaystyle{ \text{time} }[/math] will have the binary equivalent of [math]\displaystyle{ t }[/math]. This is incremented steadily throughout the run time of the machine.
  • Any input meant for the Gödel machine from the natural environment is stored in variable [math]\displaystyle{ x }[/math]. It is likely the case that [math]\displaystyle{ x }[/math] will hold different values for different values of variable [math]\displaystyle{ \text{time} }[/math].
  • The outputs of the Gödel machine are stored in variable [math]\displaystyle{ y }[/math], where [math]\displaystyle{ y(t) }[/math] would be the output bit-string at some time [math]\displaystyle{ t }[/math].

有三个变量在哥德尔机运行时非常有用[3]

  • 在某个时刻[math]\displaystyle{ t }[/math],变量[math]\displaystyle{ \text{time} }[/math]将具有与[math]\displaystyle{ t }[/math]等价的二进制值。这在机器运行过程中会稳步增加。
  • 任何来自自然环境的输入都存储在变量[math]\displaystyle{ x }[/math]中。可能的情况中,对于不同的[math]\displaystyle{ \text{time} }[/math]值,[math]\displaystyle{ x }[/math]具有不同的值。
  • Gödel机器的输出存储在变量[math]\displaystyle{ y }[/math]中,其中[math]\displaystyle{ y(t) }[/math]是某个时间[math]\displaystyle{ t }[/math]的bit-string输出。


At any given time [math]\displaystyle{ t }[/math], where [math]\displaystyle{ (1 \leq t \leq T) }[/math], the goal is to maximize future success or utility. A typical utility function follows the pattern [math]\displaystyle{ u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R} }[/math]:

[math]\displaystyle{ u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg] }[/math]

where [math]\displaystyle{ r(t) }[/math] is a real-valued reward input (encoded within [math]\displaystyle{ s(t) }[/math]) at time [math]\displaystyle{ t }[/math], [math]\displaystyle{ E_\mu [ \cdot \mid \cdot ] }[/math] denotes the conditional expectation operator with respect to some possibly unknown distribution [math]\displaystyle{ \mu }[/math] from a set [math]\displaystyle{ M }[/math] of possible distributions ([math]\displaystyle{ M }[/math] reflects whatever is known about the possibly probabilistic reactions of the environment), and the above-mentioned [math]\displaystyle{ \text{time} = \operatorname{time}(s) }[/math] is a function of state [math]\displaystyle{ s }[/math] which uniquely identifies the current cycle.[3] Note that we take into account the possibility of extending the expected lifespan through appropriate actions.[3]

在任何给定的时刻[math]\displaystyle{ t }[/math],其中[math]\displaystyle{ (1 \leq t \leq T) }[/math],其目标是最大化未来的成功或有效性。典型的“效用函数(utility function)”遵循[math]\displaystyle{ u(s, \mathrm{Env}) : S \times E \rightarrow \mathbb{R} }[/math]的模式:

[math]\displaystyle{ u(s, \mathrm{Env}) = E_\mu \Bigg[ \sum_{\tau=\text{time}}^T r(\tau) \mid s, \mathrm{Env} \Bigg] }[/math]

其中[math]\displaystyle{ r(t) }[/math]是在时刻[math]\displaystyle{ t }[/math]的实值奖励输入(编码在[math]\displaystyle{ s(t) }[/math]中),[math]\displaystyle{ E_\mu [ \cdot \mid \cdot ] }[/math]表示条件期望算子,关于来自可能的分布集合[math]\displaystyle{ M }[/math]中的某个可能未知的分布[math]\displaystyle{ \mu }[/math][math]\displaystyle{ M }[/math]反映了关于环境的可能概率反应的已知信息),上面提到的[math]\displaystyle{ \text{time} = \operatorname{time}(s) }[/math]是状态[math]\displaystyle{ s }[/math]的一个函数,它唯一地确定了当前的周期。此外,这里考虑了通过适当的行动来延长预期寿命的可能性。



Instructions used by proof techniques

模板:Confusing

The nature of the six proof-modifying instructions below makes it impossible to insert an incorrect theorem into proof, thus trivializing proof verification.[3]

以下六个证明修改指令的性质使得不可能将错误的定理插入证明,从而使证明验证变得简单[3]

get-axiom(n)

Appends the n-th axiom as a theorem to the current theorem sequence. Below is the initial axiom scheme:

  • Hardware Axioms formally specify how components of the machine could change from one cycle to the next.
  • Reward Axioms define the computational cost of hardware instruction and the physical cost of output actions. Related Axioms also define the lifetime of the Gödel machine as scalar quantities representing all rewards/costs.
  • Environment Axioms restrict the way new inputs x are produced from the environment, based on previous sequences of inputs y.
  • Uncertainty Axioms/String Manipulation Axioms are standard axioms for arithmetic, calculus, probability theory, and string manipulation that allow for the construction of proofs related to future variable values within the Gödel machine.
  • Initial State Axioms contain information about how to reconstruct parts or all of the initial state.
  • Utility Axioms describe the overall goal in the form of utility function u.

将第n个公理作为定理追加到当前的定理序列中。下面是初始公理方案:

  • 硬件公理正式指定了机器的组件如何从一个周期变为下一个周期。
  • 奖励公理定义了硬件指令的计算成本以及输出动作的物理成本。相关公理还将哥德尔机的寿命定义为表示所有奖励/成本的标量。
  • 环境公理限制了根据先前输入序列y从环境中产生新输入x的方式。
  • 不确定性公理/字符串操作公理是关于算术、微积分、概率论和字符串操作的标准公理,允许在哥德尔机内构建与未来变量值有关的证明。
  • 初始状态公理'包含有关如何重建部分或所有初始状态的信息。
  • 效用公理以效用函数u的形式描述了总体目标。


apply-rule(k, m, n)

Takes in the index k of an inference rule (such as Modus tollens, Modus ponens), and attempts to apply it to the two previously proved theorems m and n. The resulting theorem is then added to the proof.

引入推理规则(如Modus tollens,Modus ponens)的索引k,并尝试将其应用于两个先前证明的定理mn。然后将得到的定理添加到证明中。

delete-theorem(m)

Deletes the theorem stored at index m in the current proof. This helps to mitigate storage constraints caused by redundant and unnecessary theorems. Deleted theorems can no longer be referenced by the above apply-rule function.

删除当前证明中索引m处存储的定理。这有助于减轻由冗余和不必要的定理引起的存储限制。已删除的定理将无法再被上述apply-rule函数引用。


set-switchprog(m, n)

Replaces switchprog S pm:n, provided it is a non-empty substring of S p.

替换switchprog S pm:n, 前提为其是S p的非空子字符串。

check()

Verifies whether the goal of the proof search has been reached. A target theorem states that given the current axiomatized utility function u (Item 1f), the utility of a switch from p to the current switchprog would be higher than the utility of continuing the execution of p (which would keep searching for alternative switchprogs).[3] This is demonstrated in the following description of the decoded check() function for the Gödel Machine:

验证证明搜索的目标是否已经达到。目标定理指出,根据当前的公理化效用函数u(项目1f),从p切换到当前切换程序的效用将高于继续执行p的效用(这将继续寻找替代切换程序)。[3]

state2theorem(m, n)

Takes in two arguments, m and n, and attempts to convert the contents of Sm:n into a theorem.

输入两个参数,mn,并尝试将Sm:n的内容转换为一个定理。


Example applications

Time-limited NP-hard optimization

The initial input to the Gödel machine is the representation of a connected graph with a large number of nodes linked by edges of various lengths. Within given time T it should find a cyclic path connecting all nodes. The only real-valued reward will occur at time T. It equals 1 divided by the length of the best path found so far (0 if none was found). There are no other inputs. The by-product of maximizing expected reward is to find the shortest path findable within the limited time, given the initial bias.[3]

哥德尔机的初始输入是为连通图,其中包含大量由不同长度的边连接起来的节点。在给定的时间T内,哥德尔机需要找到一个连接所有节点的循环路径。唯一的实值奖励将发生在时间T。等于1除以到目前为止找到的最佳路径的长度(如果没有找到则为0)。没有其他输入。在有限时间内找到最短路径的副产品是在给定初始偏差下的最大化期望奖励,需要。[3]


Fast theorem proving

Prove or disprove as quickly as possible that all even integer > 2 are the sum of two primes (Goldbach’s conjecture). The reward is 1/t, where t is the time required to produce and verify the first such proof.[9]

尽可能快地证明或反驳所有大于2的偶数都是两个质数之和(即哥德巴赫猜想)。奖励是1/t,其中t是产生和验证第一个这样的证明所需的时间。[10]


Maximizing expected reward with bounded resources

生产任务流程图.jpg

A cognitive robot that needs at least 1 liter of gasoline per hour interacts with a partially unknown environment, trying to find hidden, limited gasoline depots to occasionally refuel its tank. It is rewarded in proportion to its lifetime, and dies after at most 100 years or as soon as its tank is empty or it falls off a cliff, and so on. The probabilistic environmental reactions are initially unknown but assumed to be sampled from the axiomatized Speed Prior, according to which hard-to-compute environmental reactions are unlikely. This permits a computable strategy for making near-optimal predictions. One by-product of maximizing expected reward is to maximize expected lifetime.[3]

一个认知机器人每小时至少需要1升汽油与部分未知的环境互动,试图找到隐藏的、有限的汽油库以不时地为其油箱加油。它的奖励与其寿命成正比,最多在100年后死亡,或者在油箱空了或掉下悬崖等情况下立即死亡。最初未知的概率环境反应被假设为来自公理化的速度先验,根据这一先验,难以计算的环境反应是不可能的。这允许一个可计算的策略来进行接近最优的预测。最大化期望奖励的一个副产品是最大化预期寿命。[3]


See also

哥德尔不完备定理

References

  1. 1.0 1.1 Mahmud, M. M. Hassan (2008). Universal Transfer Learning. pp. 16–18. ISBN 9780549909880. 
  2. 2.0 2.1 Anderson, Michael L.; Tim Oates (Spring 2007). "A review of recent research in metareasoning and metalearning". AI Magazine. 28 (1): 7.
  3. 3.00 3.01 3.02 3.03 3.04 3.05 3.06 3.07 3.08 3.09 3.10 3.11 3.12 3.13 3.14 Schmidhuber, Jürgen (December 2006). Gödel Machines: Self-Referential ¨ Universal Problem Solvers Making Provably Optimal Self-Improvements. ftp://ftp.idsia.ch/pub/juergen/gm6.pdf. Retrieved 10 November 2014. 
  4. 4.0 4.1 "Gödel machine".
  5. 5.0 5.1 Schaul, Tom; Schmidhuber, Juergen (2010). "Metalearning". Scholarpedia. 5 (6): 4650. Bibcode:2010SchpJ...5.4650S. doi:10.4249/scholarpedia.4650. Retrieved 10 November 2014.
  6. 6.0 6.1 Steunebrink, Bas R.; Schmidhuber, Jürgen (2011). A Family of Gödel Machine Implementations. 6830. pp. 275–280. doi:10.1007/978-3-642-22887-2_29. ISBN 978-3-642-22886-5. 
  7. Schmidhuber, Jürgen (5 March 2009). "Ultimate Cognition à la Gödel" (PDF). Cognitive Computation. 1 (2): 177–193. CiteSeerX 10.1.1.218.3323. doi:10.1007/s12559-009-9014-y. Retrieved 13 November 2014.
  8. 8.0 8.1 Schmidhuber, Jürgen (5 March 2009). "Ultimate Cognition à la Gödel" (PDF). Cognitive Computation. 1 (2): 177–193. CiteSeerX 10.1.1.218.3323. doi:10.1007/s12559-009-9014-y. Retrieved 13 November 2014.
  9. Schmidhuber, Jürgen (5 March 2009). "Ultimate Cognition à la Gödel". Cognitive Computation. 1 (2): 177–193. CiteSeerX 10.1.1.218.3323. doi:10.1007/s12559-009-9014-y.
  10. 引用错误:无效<ref>标签;未给name属性为ultimate的引用提供文字

External links

Category:Artificial intelligence


This page was moved from wikipedia:en:Gödel machine. Its edit history can be viewed at 哥德尔机/edithistory