第1行: |
第1行: |
− | 此词条暂由水流心不竞初译,未经审校,带来阅读不便,请见谅。
| |
| | | |
− | 本词条已由[[用户:Qige96|Ricky]]审校。
| |
| | | |
| | | |
第266行: |
第264行: |
| </math> | | </math> |
| | | |
− |
| |
− | The set of parents is a subset of the set of non-descendants because the graph is [[Cycle (graph theory)|acyclic]].
| |
− |
| |
− | The set of parents is a subset of the set of non-descendants because the graph is acyclic.
| |
| | | |
| 因为贝叶斯网络是[[非循环的 acyclic]],所以每个节点的父节点集同时也是该节点的是非后代节点集的一个子集。 | | 因为贝叶斯网络是[[非循环的 acyclic]],所以每个节点的父节点集同时也是该节点的是非后代节点集的一个子集。 |
第276行: |
第270行: |
| ===生成一个贝叶斯网络=== | | ===生成一个贝叶斯网络=== |
| | | |
− | Developing a Bayesian network often begins with creating a DAG ''G'' such that ''X'' satisfies the local Markov property with respect to ''G''. Sometimes this is a [[Causal graph|causal]] DAG. The conditional probability distributions of each variable given its parents in ''G'' are assessed. In many cases, in particular in the case where the variables are discrete, if the joint distribution of ''X'' is the product of these conditional distributions, then ''X'' is a Bayesian network with respect to ''G''.<ref>{{cite book |first=Richard E. |last=Neapolitan | name-list-format = vanc |title=Learning Bayesian networks |url={{google books |plainurl=y |id=OlMZAQAAIAAJ}} |year=2004 |publisher=Prentice Hall |isbn=978-0-13-012534-7 }}</ref>
| + | 生成一个贝叶斯网络通常从创建一个有向无环图''G''开始,然后使图''G''里的随机变量''X''满足的局部马尔可夫性。有时这个图同时还是一个因果图。接下来,图里的每一个随机变量的概率分布都要被估计出来。在许多情况下,特别是在变量都是离散的情况下,如果 ''X'' 的联合分布是这些单个随机变量的条件分布的乘积,那么 ''X'' 就是 ''G'' 的贝叶斯网络。.<ref>{{cite book |first=Richard E. |last=Neapolitan | name-list-format = vanc |title=Learning Bayesian networks |url={{google books |plainurl=y |id=OlMZAQAAIAAJ}} |year=2004 |publisher=Prentice Hall |isbn=978-0-13-012534-7 }}</ref> |
− | | |
− | Developing a Bayesian network often begins with creating a DAG G such that X satisfies the local Markov property with respect to G. Sometimes this is a causal DAG. The conditional probability distributions of each variable given its parents in G are assessed. In many cases, in particular in the case where the variables are discrete, if the joint distribution of X is the product of these conditional distributions, then X is a Bayesian network with respect to G.
| |
− | | |
− | 生成一个贝叶斯网络通常从创建一个有向无环图''G''开始,然后使图''G''里的随机变量''X''满足的局部马尔可夫性。有时这个图同时还是一个因果图。接下来,图里的每一个随机变量的概率分布都要被估计出来。在许多情况下,特别是在变量都是离散的情况下,如果 ''X'' 的联合分布是这些单个随机变量的条件分布的乘积,那么 ''X'' 就是 ''G'' 的贝叶斯网络。
| |
| | | |
| | | |
第286行: |
第276行: |
| ===马尔科夫毯=== | | ===马尔科夫毯=== |
| | | |
− | The [[Markov blanket]] of a node is the set of nodes consisting of its parents, its children, and any other parents of its children. The Markov blanket renders the node independent of the rest of the network; the joint distribution of the variables in the Markov blanket of a node is sufficient knowledge for calculating the distribution of the node. ''X'' is a Bayesian network with respect to ''G'' if every node is conditionally independent of all other nodes in the network, given its [[Markov blanket]].{{sfn|Russell|Norvig|2003|p=499}}
| + | 一个节点的'''<font color="#ff8000">马尔可夫毯 Markov blanket</font>'''是由其父节点、其子节点和其子节点的所有其他父节点组成的节点集。一个节点的马尔可夫毯可以使该节点独立于网络的其余部分。一个节点的马尔可夫毯中所有变量的联合分布是计算该节点分布的一个充分条件。如果网络中的每个节点再给定其马尔可夫毯的情况下,条件地独立于网络中的所有其他节点,那么 ''X'' 就是 ''G'' 的贝叶斯网络。 |
− | | |
− | The Markov blanket of a node is the set of nodes consisting of its parents, its children, and any other parents of its children. The Markov blanket renders the node independent of the rest of the network; the joint distribution of the variables in the Markov blanket of a node is sufficient knowledge for calculating the distribution of the node. X is a Bayesian network with respect to G if every node is conditionally independent of all other nodes in the network, given its Markov blanket.
| |
− | | |
− | 一个节点的马尔可夫毯是由其父节点、其子节点和其子节点的所有其他父节点组成的节点集。一个节点的马尔可夫毯可以使该节点独立于网络的其余部分。一个节点的马尔可夫毯中所有变量的联合分布是计算该节点分布的一个充分条件。如果网络中的每个节点再给定其马尔可夫毯的情况下,条件地独立于网络中的所有其他节点,那么 ''X'' 就是 ''G'' 的贝叶斯网络。
| |
| | | |
| | | |
第296行: |
第282行: |
| ===''d''-分隔=== | | ===''d''-分隔=== |
| | | |
| + | 使用''d''-分隔的概念,贝叶斯网络的定义可以更加通用,其中 d 代表有向的。<ref>{{cite journal|last=Geiger |first=Dan |last2=Verma |first2=Thomas |last3=Pearl |first3=Judea | name-list-format = vanc |title=Identifying independence in Bayesian Networks |journal=Networks |year=1990 |volume=20 |pages=507–534 |doi=10.1177/089443939100900106 |url=http://ftp.cs.ucla.edu/pub/stat_ser/r116.pdf }}</ref><ref>{{citation |author=Richard Scheines|title=D-separation|url=http://www.andrew.cmu.edu/user/scheines/tutor/d-sep.html}}</ref> 我们首先定义一条轨迹的''d''-分隔的线索,然后我们再定义两个节点间的''d''-分隔。 |
| | | |
− |
| |
− | This definition can be made more general by defining the "d"-separation of two nodes, where d stands for directional.<ref>{{cite journal|last=Geiger |first=Dan |last2=Verma |first2=Thomas |last3=Pearl |first3=Judea | name-list-format = vanc |title=Identifying independence in Bayesian Networks |journal=Networks |year=1990 |volume=20 |pages=507–534 |doi=10.1177/089443939100900106 |url=http://ftp.cs.ucla.edu/pub/stat_ser/r116.pdf }}</ref><ref>{{citation |author=Richard Scheines|title=D-separation|url=http://www.andrew.cmu.edu/user/scheines/tutor/d-sep.html}}</ref> We first define the "d"-separation of a trail and then we will define the "d"-separation of two nodes in terms of that.
| |
− |
| |
− | This definition can be made more general by defining the "d"-separation of two nodes, where d stands for directional. We first define the "d"-separation of a trail and then we will define the "d"-separation of two nodes in terms of that.
| |
− |
| |
− | 使用''d''-分隔的概念,贝叶斯网络的定义可以更加通用,其中 d 代表有向(directed)。我们首先定义一条轨迹的''d''-分隔的线索,然后我们再定义两个节点间的''d''-分隔。
| |
− |
| |
− |
| |
− | Let ''P'' be a trail from node ''u'' to ''v''. A trail is a loop-free, undirected (i.e. all edge directions are ignored) path between two nodes. Then ''P'' is said to be ''d''-separated by a set of nodes ''Z'' if any of the following conditions holds:
| |
− |
| |
− | Let P be a trail from node u to v. A trail is a loop-free, undirected (i.e. all edge directions are ignored) path between two nodes. Then P is said to be d-separated by a set of nodes Z if any of the following conditions holds:
| |
| | | |
| 设 ''P'' 是从节点 ''u'' 到节点 ''v'' 的轨迹。轨迹是一条两个节点之间的无环、无向的(即所有边方向被忽略)的路径。如果下列任何一个条件成立,则称 ''P'' 被一组节点 ''Z'' 分隔: | | 设 ''P'' 是从节点 ''u'' 到节点 ''v'' 的轨迹。轨迹是一条两个节点之间的无环、无向的(即所有边方向被忽略)的路径。如果下列任何一个条件成立,则称 ''P'' 被一组节点 ''Z'' 分隔: |
− |
| |
− | *''P'' contains (but does not need to be entirely) a directed chain, <math> u \cdots \leftarrow m \leftarrow \cdots v</math> or <math> u \cdots \rightarrow m \rightarrow \cdots v</math>, such that the middle node ''m'' is in ''Z'',
| |
− | *“P”包含一条有向链<math> u \cdots \leftarrow m \leftarrow \cdots v</math> or <math> u \cdots \rightarrow m \rightarrow \cdots v</math>, 其中间节点''m''属于点集''Z'',
| |
− | *''P'' contains a fork, <math> u \cdots \leftarrow m \rightarrow \cdots v</math>, such that the middle node ''m'' is in ''Z'', or
| |
− | *“P”包含一个分叉<math> u \cdots \leftarrow m \rightarrow \cdots v</math>, 其中间节点''m''位于“Z”中,或
| |
− | *''P'' contains an inverted fork (or collider), <math> u \cdots \rightarrow m \leftarrow \cdots v</math>, such that the middle node ''m'' is not in ''Z'' and no descendant of ''m'' is in ''Z''.
| |
− | *“P”包含一个倒叉(或称对撞),<math> u \cdots \rightarrow m \leftarrow \cdots v</math>,其中间节点''m''不在''Z''中,在''Z''中也没有''m''的后代节点。
| |
| | | |
| | | |
− | The nodes ''u'' and ''v'' are ''d''-separated by ''Z'' if all trails between them are ''d''-separated. If ''u'' and ''v'' are not d-separated, they are d-connected.
| + | *''P''包含一条有向链<math> u \cdots \leftarrow m \leftarrow \cdots v</math> or <math> u \cdots \rightarrow m \rightarrow \cdots v</math>, 其中间节点''m''属于点集''Z'', |
| + | *''P''包含一个分叉<math> u \cdots \leftarrow m \rightarrow \cdots v</math>, 其中间节点''m''位于“Z”中,或 |
| + | *''P''包含一个倒叉(或称对撞),<math> u \cdots \rightarrow m \leftarrow \cdots v</math>,其中间节点''m''不在''Z''中,在''Z''中也没有''m''的后代节点。 |
| | | |
− | The nodes u and v are d-separated by Z if all trails between them are d-separated. If u and v are not d-separated, they are d-connected.
| |
| | | |
− | 如果节点 ''u'' 和 ''v'' 之间的所有轨迹都是''d''-分隔的,则称节点 ''u'' 和 ''v'' 被 ''Z'' 分开。如果 '''u'' 和 ''v'' 不是''d''-分隔的,则称它们是 ‘’d‘’-连通的。 | + | 如果节点 ''u'' 和 ''v'' 之间的所有轨迹都是''d''-分隔的,则称节点 ''u'' 和 ''v'' 被 ''Z'' 分开。如果 '''u'' 和 ''v'' 不是''d''-分隔的,则称它们是 ''d''-连通的。 |
| | | |
− |
| |
− |
| |
− | ''X'' is a Bayesian network with respect to ''G'' if, for any two nodes ''u'', ''v'':
| |
− |
| |
− | X is a Bayesian network with respect to G if, for any two nodes u, v:
| |
| | | |
| 我们称''X''是''G''的贝叶斯网络,当对于图''G''中任意两个节点 ''u'',''v''满足: | | 我们称''X''是''G''的贝叶斯网络,当对于图''G''中任意两个节点 ''u'',''v''满足: |
− |
| |
| | | |
| | | |
| : <math>X_u \perp\!\!\!\perp X_v \mid X_Z</math> | | : <math>X_u \perp\!\!\!\perp X_v \mid X_Z</math> |
| | | |
− |
| |
− | where ''Z'' is a set which ''d''-separates ''u'' and ''v''. (The [[Markov blanket]] is the minimal set of nodes which ''d''-separates node ''v'' from all other nodes.)
| |
− |
| |
− | where Z is a set which d-separates u and v. (The Markov blanket is the minimal set of nodes which d-separates node v from all other nodes.)
| |
| | | |
| 其中 ''Z'' 是一个将 ''u'' 和 ''v''进行了 ''d''-分隔的集合(马尔可夫毯其实就是将节点 ''v'' 与其他所有节点''d''-分隔的最小节点集合)。 | | 其中 ''Z'' 是一个将 ''u'' 和 ''v''进行了 ''d''-分隔的集合(马尔可夫毯其实就是将节点 ''v'' 与其他所有节点''d''-分隔的最小节点集合)。 |
第347行: |
第307行: |
| | | |
| ===因果网络=== | | ===因果网络=== |
− |
| |
− | Although Bayesian networks are often used to represent [[causality|causal]] relationships, this need not be the case: a directed edge from ''u'' to ''v'' does not require that ''X<sub>v</sub>'' be causally dependent on ''X<sub>u</sub>''. This is demonstrated by the fact that Bayesian networks on the graphs:
| |
− |
| |
− | Although Bayesian networks are often used to represent causal relationships, this need not be the case: a directed edge from u to v does not require that X<sub>v</sub> be causally dependent on X<sub>u</sub>. This is demonstrated by the fact that Bayesian networks on the graphs:
| |
| | | |
| 虽然贝叶斯网络经常被用来表示因果关系,但这种情形并不是必须的: 从 ''u'' 到 ''v'' 的有向边并不意味着 X<sub>v</sub>一定是X<sub>u</sub>导致的结果。下面这两个贝叶斯网络是等价的: | | 虽然贝叶斯网络经常被用来表示因果关系,但这种情形并不是必须的: 从 ''u'' 到 ''v'' 的有向边并不意味着 X<sub>v</sub>一定是X<sub>u</sub>导致的结果。下面这两个贝叶斯网络是等价的: |
第357行: |
第313行: |
| :<math> a \rightarrow b \rightarrow c \qquad \text{and} \qquad a \leftarrow b \leftarrow c </math> | | :<math> a \rightarrow b \rightarrow c \qquad \text{and} \qquad a \leftarrow b \leftarrow c </math> |
| | | |
− |
| |
− | are equivalent: that is they impose exactly the same conditional independence requirements.
| |
− |
| |
− | are equivalent: that is they impose exactly the same conditional independence requirements.
| |
| | | |
| 也就是说,它们表示的条件独立要求完全相同。 | | 也就是说,它们表示的条件独立要求完全相同。 |
| | | |
| | | |
− | | + | 当一个贝叶斯网络中的节点关系全都是因果关系时,这网络才能被称为时因果网络。因果网络还规定,如果一个节点 ''X''再被干预的情况下变成了状态 x (写作do(''X'' = ''x'')) ,那么概率密度函数就会发生改变,表示成一个从 ''X'' 的父节点到 ''X''的链接被切断的网络,并将变量''X''的值设置为''x''。利用这些规则,我们可以在真正实施外部干预前,就能从数据中预测到实施干预后会产生的影响。 |
− | A causal network is a Bayesian network with the requirement that the relationships be causal. The additional semantics of causal networks specify that if a node ''X'' is actively caused to be in a given state ''x'' (an action written as do(''X'' = ''x'')), then the probability density function changes to that of the network obtained by cutting the links from the parents of ''X'' to ''X'', and setting ''X'' to the caused value ''x''.<ref name=pearl2000/> Using these semantics, the impact of external interventions from data obtained prior to intervention can be predicted.
| |
− | | |
− | A causal network is a Bayesian network with the requirement that the relationships be causal. The additional semantics of causal networks specify that if a node X is actively caused to be in a given state x (an action written as do(X = x)), then the probability density function changes to that of the network obtained by cutting the links from the parents of X to X, and setting X to the caused value x. Using these semantics, the impact of external interventions from data obtained prior to intervention can be predicted.
| |
− | | |
− | 当一个贝叶斯网络中的节点关系全都是因果关系时,这网络才能被称为时因果网络。因果网络还规定,如果一个节点 ''X''再被干预的情况下变成了状态 x (写作do(''X'' = ''x'')) ,那么概率密度函数就会发生改变,表示成一个从 ''X'' 的父节点到 ''X''的链接被切断的网络,并将变量''X''的值设置为''x''。利用这些规则,我们可以在真正实施外部干预前,就能从数据中预测到实施干预后会产生的影响。 | |
| | | |
| | | |
第376行: |
第323行: |
| ==推理复杂度与近似算法== | | ==推理复杂度与近似算法== |
| | | |
− | In 1990, while working at Stanford University on large bioinformatic applications, Cooper proved that exact inference in Bayesian networks is [[NP-hard]].<ref>
| + | 1990年,Cooper在斯坦福大学研究大规模生物信息学应用时,证明了贝叶斯网络中的精确推理是[[NP-hard]]<ref> |
| {{cite journal | first = Gregory F. | last = Cooper | name-list-style = vanc | title = The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks | url = https://stat.duke.edu/~sayan/npcomplete.pdf | journal = Artificial Intelligence | volume = 42 | issue = 2–3 | date = 1990 | pages = 393–405 | doi = 10.1016/0004-3702(90)90060-d }} | | {{cite journal | first = Gregory F. | last = Cooper | name-list-style = vanc | title = The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks | url = https://stat.duke.edu/~sayan/npcomplete.pdf | journal = Artificial Intelligence | volume = 42 | issue = 2–3 | date = 1990 | pages = 393–405 | doi = 10.1016/0004-3702(90)90060-d }} |
− | </ref> This result prompted research on approximation algorithms with the aim of developing a tractable approximation to probabilistic inference. In 1993, Dagum and [[Michael Luby|Luby]] proved two surprising results on the complexity of approximation of probabilistic inference in Bayesian networks.<ref> | + | </ref>(NP难问题,NP 是指非确定性多项式 non-deterministic polynomial)。这个结果促进了近似算法的研究,目的是开发出一种易于处理的近似方法来进行概率推理。1993年,Dagum 和 Luby 证明了贝叶斯网络中近似概率推理复杂度的两个令人惊讶的结果。<ref> |
| {{cite journal | vauthors = Dagum P, Luby M | author-link1 = Paul Dagum | author-link2 = Michael Luby | title = Approximating probabilistic inference in Bayesian belief networks is NP-hard | journal = Artificial Intelligence | volume = 60 | issue = 1 | date = 1993 | pages = 141–153 | doi = 10.1016/0004-3702(93)90036-b | citeseerx = 10.1.1.333.1586 }} | | {{cite journal | vauthors = Dagum P, Luby M | author-link1 = Paul Dagum | author-link2 = Michael Luby | title = Approximating probabilistic inference in Bayesian belief networks is NP-hard | journal = Artificial Intelligence | volume = 60 | issue = 1 | date = 1993 | pages = 141–153 | doi = 10.1016/0004-3702(93)90036-b | citeseerx = 10.1.1.333.1586 }} |
− | </ref> First, they proved that no tractable [[deterministic algorithm]] can approximate probabilistic inference to within an [[absolute error]] ''ɛ'' < 1/2. Second, they proved that no tractable [[randomized algorithm]] can approximate probabilistic inference to within an absolute error ''ɛ'' < 1/2 with confidence probability greater than 1/2. | + | </ref>首先,他们证明了任何易于处理的确定性近似概率推断算法的绝对误差都不可能小于1/2。其次,他们证明了没有任何易于处理的随机化的近似概率推断算法可以在置信度大于1 / 2的情况下,误差小于1 / 2。 |
| | | |
− | 1990年,Cooper在斯坦福大学研究大规模生物信息学应用时,证明了贝叶斯网络中的精确推理是[[NP-hard]](NP难问题,NP 是指非确定性多项式( non-deterministic polynomial ,缩写 NP ) 。 这个结果促进了近似算法的研究,目的是开发出一种易于处理的近似方法来进行概率推理。1993年,Dagum 和 Luby 证明了贝叶斯网络中近似概率推理复杂度的两个令人惊讶的结果。首先,他们证明了任何易于处理的确定性近似概率推断算法的绝对误差都不可能小于1/2。其次,他们证明了没有任何易于处理的随机化的近似概率推断算法可以在置信度大于1 / 2的情况下,误差小于1 / 2。
| |
| | | |
| + | 与此同时,Roth 证明了贝叶斯网络中的精确推理实际上是 #P-完全的(因此就像计数使得一个'''合取范式表达式 conjunctive normal form(CNF) formula'''为真的赋值的数量一样困难)。即使对于受限结构的贝叶斯网络来说,在对于任意''ɛ'' > 0,2<sup>''n''<sup>1−''ɛ''</sup></sup> 的近似推理也是 NP-hard的。<ref>D. Roth, [http://cogcomp.cs.illinois.edu/page/publication_view/5 On the hardness of approximate reasoning], IJCAI (1993)</ref><ref>D. Roth, [http://cogcomp.cs.illinois.edu/papers/hardJ.pdf On the hardness of approximate reasoning], Artificial Intelligence (1996)</ref> |
| | | |
| | | |
− | At about the same time, [[Dan Roth|Roth]] proved that exact inference in Bayesian networks is in fact [[Sharp-P-complete|#P-complete]] (and thus as hard as counting the number of satisfying assignments of a [[conjunctive normal form]] formula (CNF) and that approximate inference within a factor 2<sup>''n''<sup>1−''ɛ''</sup></sup> for every ''ɛ'' > 0, even for Bayesian networks with restricted architecture, is NP-hard.<ref>D. Roth, [http://cogcomp.cs.illinois.edu/page/publication_view/5 On the hardness of approximate reasoning], IJCAI (1993)</ref><ref>D. Roth, [http://cogcomp.cs.illinois.edu/papers/hardJ.pdf On the hardness of approximate reasoning], Artificial Intelligence (1996)</ref>
| + | 实际上,这些复杂性结果表明,虽然贝叶斯网络在人工智能和机器学习应用中是一种强大的表示,但它们在大型实际应用中需要对拓扑结构施加约束(如朴素贝叶斯网络)或对条件概率施加约束。'''<font color="#ff8000">有界方差算法 Bounded variance algorithm </font>'''<ref>{{cite journal | vauthors = Dagum P, Luby M | author-link1 = Paul Dagum | author-link2 = Michael Luby | title = An optimal approximation algorithm for Bayesian inference | url = http://icsi.berkeley.edu/~luby/PAPERS/bayesian.ps | journal = Artificial Intelligence | volume = 93 | issue = 1–2 | date = 1997 | pages = 1–27 | doi = 10.1016/s0004-3702(97)00013-1 | citeseerx = 10.1.1.36.7946 | access-date = 2015-12-19 | archive-url = https://web.archive.org/web/20170706064354/http://www1.icsi.berkeley.edu/~luby/PAPERS/bayesian.ps | archive-date = 2017-07-06 | url-status = dead }}</ref>是贝叶斯网络中第一个被证明在保证误差的情况下还能进行快速近似概率推理的算法。这个强大的算法需要对贝叶斯网络的条件概率进行细微的限制,使其处于[0 + 1/p(n), 1 - ''p''(''n'')]的区间内 ,其中''p''(''n'')是网络中节点数n的任意多项式。 |
| | | |
− | At about the same time, Roth proved that exact inference in Bayesian networks is in fact #P-complete (and thus as hard as counting the number of satisfying assignments of a conjunctive normal form formula (CNF) and that approximate inference within a factor 2<sup>n<sup>1−ɛ</sup></sup> for every ɛ > 0, even for Bayesian networks with restricted architecture, is NP-hard.
| |
| | | |
− | 与此同时,Roth 证明了贝叶斯网络中的精确推理实际上是 #P-完全的(因此就像计数使得一个合取范式表达式(CNF)为真的赋值的数量一样困难) 。即使对于受限结构的贝叶斯网络来说,在对于任意''ɛ'' > 0,2<sup>''n''<sup>1−''ɛ''</sup></sup> 的近似推理也是 NP-hard的。
| |
− |
| |
− |
| |
− |
| |
− | In practical terms, these complexity results suggested that while Bayesian networks were rich representations for AI and machine learning applications, their use in large real-world applications would need to be tempered by either topological structural constraints, such as naïve Bayes networks, or by restrictions on the conditional probabilities. The bounded variance algorithm<ref>{{cite journal | vauthors = Dagum P, Luby M | author-link1 = Paul Dagum | author-link2 = Michael Luby | title = An optimal approximation algorithm for Bayesian inference | url = http://icsi.berkeley.edu/~luby/PAPERS/bayesian.ps | journal = Artificial Intelligence | volume = 93 | issue = 1–2 | date = 1997 | pages = 1–27 | doi = 10.1016/s0004-3702(97)00013-1 | citeseerx = 10.1.1.36.7946 | access-date = 2015-12-19 | archive-url = https://web.archive.org/web/20170706064354/http://www1.icsi.berkeley.edu/~luby/PAPERS/bayesian.ps | archive-date = 2017-07-06 | url-status = dead }}</ref> was the first provable fast approximation algorithm to efficiently approximate probabilistic inference in Bayesian networks with guarantees on the error approximation. This powerful algorithm required the minor restriction on the conditional probabilities of the Bayesian network to be bounded away from zero and one by 1/''p''(''n'') where ''p''(''n'') was any polynomial on the number of nodes in the network ''n''.
| |
− |
| |
− | In practical terms, these complexity results suggested that while Bayesian networks were rich representations for AI and machine learning applications, their use in large real-world applications would need to be tempered by either topological structural constraints, such as naïve Bayes networks, or by restrictions on the conditional probabilities. The bounded variance algorithm was the first provable fast approximation algorithm to efficiently approximate probabilistic inference in Bayesian networks with guarantees on the error approximation. This powerful algorithm required the minor restriction on the conditional probabilities of the Bayesian network to be bounded away from zero and one by 1/p(n) where p(n) was any polynomial on the number of nodes in the network n.
| |
− |
| |
− | 实际上,这些复杂性结果表明,虽然贝叶斯网络在人工智能和机器学习应用中是一种强大的表示,但它们在大型实际应用中需要对拓扑结构施加约束(如朴素贝叶斯网络)或对条件概率施加约束。'''<font color="#ff8000">有界方差算法 Bounded variance algorithm </font>'''是贝叶斯网络中第一个被证明在保证误差的情况下还能进行快速近似概率推理的算法。这个强大的算法需要对贝叶斯网络的条件概率进行细微的限制,使其处于[0 + 1/p(n), 1 - 1/p(n)]的区间内 ,其中 p (n)是网络中节点数n的任意多项式。
| |
− |
| |
− |
| |
− |
| |
− | ==Software软件==
| |
− |
| |
− | <!--Entries in this list should be "notable" with a sourced Wikipedia article. See WP:GNG and WP:WTAF. -->
| |
− |
| |
− | <!--Entries in this list should be "notable" with a sourced Wikipedia article. See WP:GNG and WP:WTAF. -->
| |
− |
| |
− | ! ——这个列表中的条目与来源于维基百科的文章同样“值得注意”。参见 WP: GNG 和 WP: WTAF。-->
| |
− |
| |
− | Notable software for Bayesian networks include:
| |
− |
| |
− | Notable software for Bayesian networks include:
| |
| | | |
| + | ==软件== |
| 著名的贝叶斯网络软件包括: | | 著名的贝叶斯网络软件包括: |
| | | |
− | * [[Just another Gibbs sampler]] (JAGS) – Open-source alternative to WinBUGS. Uses Gibbs sampling.
| |
| * [[Just another Gibbs sampler]] (JAGS) WinBUGS的开源替代品。使用了吉布斯采样。 | | * [[Just another Gibbs sampler]] (JAGS) WinBUGS的开源替代品。使用了吉布斯采样。 |
− | * [[OpenBUGS]] – Open-source development of WinBUGS.
| |
| * [[OpenBUGS]] –WinBUGS的开源版本。 | | * [[OpenBUGS]] –WinBUGS的开源版本。 |
− | * [[SPSS Modeler]] – Commercial software that includes an implementation for Bayesian networks.
| |
| * [[SPSS Modeler]] –一个包括了贝叶斯网络实现的商业软件。 | | * [[SPSS Modeler]] –一个包括了贝叶斯网络实现的商业软件。 |
− | * [[Stan (software)]] – Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler (NUTS),<ref>{{Cite document |arxiv = 1111.4246|bibcode = 2011arXiv1111.4246H|last1 = Hoffman|first1 = Matthew D.|last2 = Gelman|first2 = Andrew|title = The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo|year = 2011}}</ref> a variant of Hamiltonian Monte Carlo.
| |
| * [[Stan (software)]] – stan是一个开源软件包,用于使用No-U-Turn取样器(NUTS),<ref>{{Cite document |arxiv = 1111.4246|bibcode = 2011arXiv1111.4246H|last1 = Hoffman|first1 = Matthew D.|last2 = Gelman|first2 = Andrew|title = The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo|year = 2011}}</ref> 是汉密尔顿蒙特卡洛方法的一个变体。 | | * [[Stan (software)]] – stan是一个开源软件包,用于使用No-U-Turn取样器(NUTS),<ref>{{Cite document |arxiv = 1111.4246|bibcode = 2011arXiv1111.4246H|last1 = Hoffman|first1 = Matthew D.|last2 = Gelman|first2 = Andrew|title = The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo|year = 2011}}</ref> 是汉密尔顿蒙特卡洛方法的一个变体。 |
− | * [[PyMC3]] – A Python library implementing an embedded domain specific language to represent bayesian networks, and a variety of samplers (including NUTS)
| |
| * [[PyMC3]] – 一个python库,它实现了一个能用来表示贝叶斯网络的微型语言,以及各种采样器(包括No-U-Turn取样器)。 | | * [[PyMC3]] – 一个python库,它实现了一个能用来表示贝叶斯网络的微型语言,以及各种采样器(包括No-U-Turn取样器)。 |
− | * [[WinBUGS]] – One of the first computational implementations of MCMC samplers. No longer maintained.
| |
| * [[WinBUGS]] –马尔可夫链蒙特卡罗采样器最早的实现之一,但这个软件已经不再维护。 | | * [[WinBUGS]] –马尔可夫链蒙特卡罗采样器最早的实现之一,但这个软件已经不再维护。 |
| | | |
| | | |
− | ==History历史== | + | ==历史== |
− | | |
− | The term Bayesian network was coined by [[Judea Pearl]] in 1985 to emphasize:
| |
| | | |
− | 1985年,朱迪亚 · 珀尔创造了'''<font color="#ff8000">贝叶斯网络</font>'''一词来强调:
| + | 1985年,朱迪亚·珀尔 Judea Pearl创造了'''<font color="#ff8000">贝叶斯网络</font>'''一词来强调: |
| | | |
− | *the often subjective nature of the input information
| |
| *输入信息通常是主观的 | | *输入信息通常是主观的 |
− | *the reliance on Bayes' conditioning as the basis for updating information
| |
| *依赖贝叶斯条件作为信息更新的基础 | | *依赖贝叶斯条件作为信息更新的基础 |
− | *the distinction between causal and evidential modes of reasoning<ref>{{Cite journal | last = Bayes | first = T. | name-list-format = vanc | authorlink = Thomas Bayes | year = 1763 | title = An Essay towards solving a Problem in the Doctrine of Chances | journal = [[Philosophical Transactions of the Royal Society]] | volume = 53 | pages = 370–418 | doi = 10.1098/rstl.1763.0053 | last2 = Price | title-link = An Essay towards solving a Problem in the Doctrine of Chances | doi-access = free }}</ref> | + | *因果推理和相关推理是有区别的<ref>{{Cite journal | last = Bayes | first = T. | name-list-format = vanc | authorlink = Thomas Bayes | year = 1763 | title = An Essay towards solving a Problem in the Doctrine of Chances | journal = [[Philosophical Transactions of the Royal Society]] | volume = 53 | pages = 370–418 | doi = 10.1098/rstl.1763.0053 | last2 = Price | title-link = An Essay towards solving a Problem in the Doctrine of Chances | doi-access = free }}</ref> |
− | *因果推理和相关推理是有区别的
| |
| | | |
− | In the late 1980s Pearl's ''Probabilistic Reasoning in Intelligent Systems''<ref>{{cite book | vauthors = Pearl J |title=Probabilistic Reasoning in Intelligent Systems |publisher=[[Morgan Kaufmann]] |location=San Francisco CA |isbn=978-1558604797 |pages=1988 |url={{google books |plainurl=y |id=AvNID7LyMusC}}|date=1988-09-15 }}</ref> and [[Richard E. Neapolitan|Neapolitan]]'s ''Probabilistic Reasoning in Expert Systems''<ref>{{cite book |first=Richard E. |last=Neapolitan | name-list-format = vanc |title=Probabilistic reasoning in expert systems: theory and algorithms |url={{google books |plainurl=y |id=7X5KLwEACAAJ}} |year=1989 |publisher=Wiley |isbn=978-0-471-61840-9}}</ref> summarized their properties and established them as a field of study.
| |
| | | |
− | In the late 1980s Pearl's Probabilistic Reasoning in Intelligent Systems and Neapolitan's Probabilistic Reasoning in Expert Systems summarized their properties and established them as a field of study.
| + | 20世纪80年代后期, Pearl的《智能系统中的概率推理 Probabilistic Reasoning in Intelligent Systems》<ref>{{cite book | vauthors = Pearl J |title=Probabilistic Reasoning in Intelligent Systems |publisher=[[Morgan Kaufmann]] |location=San Francisco CA |isbn=978-1558604797 |pages=1988 |url={{google books |plainurl=y |id=AvNID7LyMusC}}|date=1988-09-15 }}</ref>和Neapolitan的《专家系统中的概率推理 Probabilistic Reasoning in Expert Systems》<ref>{{cite book |first=Richard E. |last=Neapolitan | name-list-format = vanc |title=Probabilistic reasoning in expert systems: theory and algorithms |url={{google books |plainurl=y |id=7X5KLwEACAAJ}} |year=1989 |publisher=Wiley |isbn=978-0-471-61840-9}}</ref> 总结了它们的性质,并将它们确立为一个研究领域。 |
− | | |
− | 20世纪80年代后期,珀尔的《智能系统中的概率推理》和那不勒斯的《专家系统中的概率推理》总结了它们的性质,并将它们确立为一个研究领域。
| |
− | | |
− | == 更多 == | |
− | {{Portal|Mathematics}} | |
− | {{columns-list|colwidth=30em| | |
− | *[[Bayesian programming]]
| |
− | *[[Causal inference]]
| |
− | *[[Causal loop diagram]]
| |
− | *[[Chow–Liu tree]]
| |
− | *[[Computational intelligence]]
| |
− | *[[Computational phylogenetics]]
| |
− | *[[Deep belief network]]
| |
− | *[[Dempster–Shafer theory]] – a generalization of Bayes' theorem
| |
− | *[[Expectation–maximization algorithm]]
| |
− | *[[Factor graph]]
| |
− | *[[Hierarchical temporal memory]]
| |
− | *[[Kalman filter]]
| |
− | *[[Memory-prediction framework]]
| |
− | *[[Mixture distribution]]
| |
− | *[[Mixture model]]
| |
− | *[[Naive Bayes classifier]]
| |
− | *[[Polytree]]
| |
− | *[[Sensor fusion]]
| |
− | *[[Sequence alignment]]
| |
− | *[[Structural equation modeling]]
| |
− | *[[Subjective logic]]
| |
− | *[[Variable-order Bayesian network]]
| |
− | }} | |
| | | |
| + | == 另见 == |
| {{Portal|Mathematics}} | | {{Portal|Mathematics}} |
| {{columns-list|colwidth=30em| | | {{columns-list|colwidth=30em| |
第536行: |
第420行: |
| * {{cite book| last = Heckerman | first =David | date = March 1, 1995 | contribution = Tutorial on Learning with Bayesian Networks | contribution-url = http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-95-06 | editor-last = Jordan | editor-first = Michael Irwin | title = Learning in Graphical Models | series = Adaptive Computation and Machine Learning | location = [[Cambridge, Massachusetts]] | publication-date = 1998 | publisher = [[MIT Press]] | pages = 301–354 | isbn = 978-0-262-60032-3}} | | * {{cite book| last = Heckerman | first =David | date = March 1, 1995 | contribution = Tutorial on Learning with Bayesian Networks | contribution-url = http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-95-06 | editor-last = Jordan | editor-first = Michael Irwin | title = Learning in Graphical Models | series = Adaptive Computation and Machine Learning | location = [[Cambridge, Massachusetts]] | publication-date = 1998 | publisher = [[MIT Press]] | pages = 301–354 | isbn = 978-0-262-60032-3}} |
| | | |
− | :Also appears as {{cite journal |date=March 1997|title= Bayesian Networks for Data Mining |journal= [[Data Mining and Knowledge Discovery]]|volume= 1| issue= 1 |pages= 79–119 |doi= 10.1023/A:1009730122752 |last1= Heckerman |first1= David}} | + | :也显示为{{cite journal |date=March 1997|title= Bayesian Networks for Data Mining |journal= [[Data Mining and Knowledge Discovery]]|volume= 1| issue= 1 |pages= 79–119 |doi= 10.1023/A:1009730122752 |last1= Heckerman |first1= David}} |
− | | |
− | Also appears as
| |
| | | |
− | 也显示为
| |
| | | |
| :An earlier version appears as [https://web.archive.org/web/20060719171558/http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-95-06 Technical Report MSR-TR-95-06], [[Microsoft Research]], March 1, 1995. The paper is about both parameter and structure learning in Bayesian networks. | | :An earlier version appears as [https://web.archive.org/web/20060719171558/http://research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-95-06 Technical Report MSR-TR-95-06], [[Microsoft Research]], March 1, 1995. The paper is about both parameter and structure learning in Bayesian networks. |
第605行: |
第486行: |
| | | |
| | | |
| + | 此词条暂由水流心不竞初译,未经审校,带来阅读不便,请见谅。 |
| | | |
− | {{DEFAULTSORT:Bayesian Network}}
| + | 本词条已由[[用户:Qige96|Ricky]]审校。 |
− | | |
− | [[Category:Bayesian networks| ]] | |
− | | |
− | [[Category:Graphical models]]
| |
− | | |
− | Category:Graphical models
| |
− | | |
− | 类别: 图形模型
| |
− | | |
− | [[Category:Causality]]
| |
− | | |
− | Category:Causality
| |
− | | |
− | 分类: 因果关系
| |
− | | |
− | [[Category:Causal inference]]
| |
− | | |
− | Category:Causal inference
| |
− | | |
− | 类别: 因果推理
| |
− | | |
− | <noinclude>
| |
| | | |
− | <small>This page was moved from [[wikipedia:en:Bayesian network]]. Its edit history can be viewed at [[贝叶斯网络/edithistory]]</small></noinclude>
| |
| | | |
− | [[Category:待整理页面]] | + | [[Category:图形模型]] |
| + | [[Category:因果关系]] |
| + | [[Category:因果推理]] |