Bayesian networks perform three main inference tasks:
 
Because a Bayesian network is a complete model for its variables and their relationships, it can be used to answer probabilistic queries about them. For example, the network can be used to update knowledge of the state of a subset of variables when other variables (the evidence variables) are observed. This process of computing the posterior distribution of variables given evidence is called probabilistic inference. The posterior gives a universal sufficient statistic for detection applications, when choosing values for the variable subset that minimize some expected loss function, for instance the probability of decision error. A Bayesian network can thus be considered a mechanism for automatically applying Bayes' theorem to complex problems.
 
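As a concrete illustration of such a probabilistic query, the sketch below computes a posterior by brute-force enumeration of the joint distribution of a tiny three-node network (the classic rain/sprinkler/wet-grass example; the network and all probabilities are invented for illustration and are not taken from this article):

<syntaxhighlight lang="python">
from itertools import product

# Toy DAG: Rain -> Sprinkler, and Rain, Sprinkler -> GrassWet.
P_rain = {True: 0.2, False: 0.8}                   # P(Rain)
P_sprinkler = {True:  {True: 0.01, False: 0.99},   # P(Sprinkler | Rain=True)
               False: {True: 0.4,  False: 0.6}}    # P(Sprinkler | Rain=False)
P_wet = {(True, True): 0.99, (True, False): 0.9,   # P(Wet=True | Sprinkler, Rain)
         (False, True): 0.8, (False, False): 0.0}

def joint(r, s, w):
    """Chain rule along the DAG: P(R) * P(S | R) * P(W | S, R)."""
    pw = P_wet[(s, r)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1.0 - pw)

# Posterior P(Rain=True | Wet=True) by enumerating the joint distribution.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print("P(Rain | GrassWet) =", num / den)           # ~= 0.358
</syntaxhighlight>

Enumeration like this is exponential in the number of variables; it is shown only to make the idea of "applying Bayes' theorem automatically" concrete.
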
In order to fully specify the Bayesian network and thus fully represent the joint probability distribution, it is necessary to specify for each node X the probability distribution for X conditional upon X's parents. The distribution of X conditional upon its parents may have any form. It is common to work with discrete or Gaussian distributions since that simplifies calculations. Sometimes only constraints on a distribution are known; one can then use the principle of maximum entropy to determine a single distribution, the one with the greatest entropy given the constraints. (Analogously, in the specific context of a dynamic Bayesian network, the conditional distribution for the hidden state's temporal evolution is commonly specified to maximize the entropy rate of the implied stochastic process.)
 
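One common way to represent such a discrete conditional distribution is a table with one probability row per configuration of the parents. The sketch below (variable names and numbers are invented for illustration) also checks the defining constraint that every row sums to one:

<syntaxhighlight lang="python">
# Hypothetical conditional probability table for a node X with two parents.
cpt_x = {
    # (parent1, parent2): {x_value: P(X = x_value | parents)}
    ("low",  "off"): {"ok": 0.95, "fail": 0.05},
    ("low",  "on"):  {"ok": 0.80, "fail": 0.20},
    ("high", "off"): {"ok": 0.60, "fail": 0.40},
    ("high", "on"):  {"ok": 0.30, "fail": 0.70},
}

# Every row must itself be a probability distribution over X's values.
for parents, row in cpt_x.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9, parents
</syntaxhighlight>

If only the support of a row were known and nothing else, the maximum-entropy choice described above would make that row uniform.
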
Often these conditional distributions include parameters that are unknown and must be estimated from data, e.g., via the maximum likelihood approach. Direct maximization of the likelihood (or of the posterior probability) is often complex given unobserved variables. A classical approach to this problem is the expectation-maximization algorithm, which alternates computing expected values of the unobserved variables conditional on observed data, with maximizing the complete likelihood (or posterior) assuming that previously computed expected values are correct. Under mild regularity conditions this process converges on maximum likelihood (or maximum posterior) values for parameters.
 
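A minimal sketch of this alternation, for the smallest interesting case: a two-node network X → Y where X is never observed, P(Y | X) is assumed known, and EM estimates theta = P(X=1) from observed values of Y. All numbers are invented for illustration.

<syntaxhighlight lang="python">
p_y_given_x = {1: 0.9, 0: 0.2}          # assumed known: P(Y=1 | X)

def em(ys, theta=0.5, iters=50):
    for _ in range(iters):
        # E-step: posterior responsibility P(X=1 | y) under current theta.
        qs = []
        for y in ys:
            l1 = theta * (p_y_given_x[1] if y else 1 - p_y_given_x[1])
            l0 = (1 - theta) * (p_y_given_x[0] if y else 1 - p_y_given_x[0])
            qs.append(l1 / (l1 + l0))
        # M-step: maximize the expected complete-data likelihood.
        theta = sum(qs) / len(qs)
    return theta

ys = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]     # only the child Y is observed
print(em(ys))                           # converges to the MLE, ~0.714 here
</syntaxhighlight>

Each iteration never decreases the observed-data likelihood, which is the property behind the convergence statement above.
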
In the simplest case, a Bayesian network is specified by an expert and is then used to perform inference. In other applications the task of defining the network is too complex for humans. In this case the network structure and the parameters of the local distributions must be learned from data.
 
Automatically learning the graph structure of a Bayesian network (BN) is a challenge pursued within machine learning. The basic idea goes back to a recovery algorithm developed by Rebane and Pearl and rests on the distinction between the three possible patterns allowed in a 3-node DAG:
 
    
{| class="wikitable"
! Pattern !! Model
|-
| Chain || <math>X \rightarrow Y \rightarrow Z</math>
|-
| Fork || <math>X \leftarrow Y \rightarrow Z</math>
|-
| Collider || <math>X \rightarrow Y \leftarrow Z</math>
|}
The first 2 represent the same dependencies (<math>X</math> and <math>Z</math> are independent given <math>Y</math>) and are, therefore, indistinguishable. The collider, however, can be uniquely identified, since <math>X</math> and <math>Z</math> are marginally independent and all other pairs are dependent. Thus, while the skeletons (the graphs stripped of arrows) of these three triplets are identical, the directionality of the arrows is partially identifiable. The same distinction applies when <math>X</math> and <math>Z</math> have common parents, except that one must first condition on those parents. Algorithms have been developed to systematically determine the skeleton of the underlying graph and, then, orient all arrows whose directionality is dictated by the conditional independences observed.
 
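The independence signature that singles out the collider can be checked empirically. In the sketch below the data-generating process is invented: X and Z are independent coin flips and Y = X or Z, so the true graph is the collider X → Y ← Z. Marginally, X and Z look independent; conditioned on Y, they become dependent.

<syntaxhighlight lang="python">
import random

random.seed(0)
n = 100_000
xs = [random.random() < 0.5 for _ in range(n)]
zs = [random.random() < 0.5 for _ in range(n)]
ys = [x or z for x, z in zip(xs, zs)]    # Y is a child of both X and Z

def p(pred, rows):
    rows = list(rows)
    return sum(map(pred, rows)) / len(rows)

triples = list(zip(xs, ys, zs))

# Marginally: P(X=1 | Z=1) ~= P(X=1), consistent with independence.
print(p(lambda t: t[0], (t for t in triples if t[2])),
      p(lambda t: t[0], triples))

# Conditioned on the collider: P(X=1 | Y=1, Z=1) != P(X=1 | Y=1).
print(p(lambda t: t[0], (t for t in triples if t[1] and t[2])),
      p(lambda t: t[0], (t for t in triples if t[1])))
</syntaxhighlight>

The two marginal frequencies agree (about 0.5 each), while the conditional ones diverge (about 0.5 versus 0.67); this divergence under conditioning is exactly the cue the recovery algorithm uses to orient arrows.
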
In order to deal with problems with thousands of variables, a different approach is necessary. One is to first sample an ordering, and then find the optimal BN structure with respect to that ordering. This implies working on the search space of the possible orderings, which is convenient as it is smaller than the space of network structures. Multiple orderings are then sampled and evaluated. This method has been shown to be the best available in the literature when the number of variables is huge.
 
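A hedged sketch of the idea, not the published algorithm: sample node orderings, and for each ordering pick, per node, the best parent set among its predecessors (here brute force over parent sets of size at most 2, scored by a BIC-style score for binary variables). The data, the score, and the parent-set limit are all illustrative choices.

<syntaxhighlight lang="python">
import itertools, math, random

random.seed(1)

def bic(child, parents, data):
    """Log-likelihood of the child's CPT minus a BIC complexity penalty."""
    n = len(data)
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        c = counts.setdefault(key, [0, 0])
        c[row[child]] += 1
    ll = 0.0
    for c0, c1 in counts.values():
        for c in (c0, c1):
            if c:
                ll += c * math.log(c / (c0 + c1))
    n_params = len(counts)        # one free parameter per parent config
    return ll - 0.5 * n_params * math.log(n)

def best_parents(child, preds, data, max_parents=2):
    candidates = [ps for k in range(max_parents + 1)
                  for ps in itertools.combinations(preds, k)]
    return max(candidates, key=lambda ps: bic(child, ps, data))

def search(data, n_vars, n_orderings=20):
    best = (-math.inf, None)
    for _ in range(n_orderings):
        order = random.sample(range(n_vars), n_vars)   # one sampled ordering
        net = {v: best_parents(v, order[:i], data)
               for i, v in enumerate(order)}
        total = sum(bic(v, ps, data) for v, ps in net.items())
        best = max(best, (total, net), key=lambda t: t[0])
    return best

# Toy data from a chain X0 -> X1 -> X2 with 10% flip noise.
data = []
for _ in range(2000):
    x0 = random.random() < 0.5
    x1 = x0 if random.random() < 0.9 else not x0
    x2 = x1 if random.random() < 0.9 else not x1
    data.append([int(x0), int(x1), int(x2)])

score, net = search(data, 3)
print(net)   # e.g. {0: (), 1: (0,), 2: (1,)}, up to ordering-equivalence
</syntaxhighlight>

Restricting each node's parents to its predecessors in the ordering guarantees acyclicity by construction, which is why the ordering space is a convenient proxy for the structure space.
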
Learning Bayesian networks with bounded treewidth is necessary to allow exact, tractable inference, since the worst-case inference complexity is exponential in the treewidth k (under the exponential time hypothesis). Yet, as a global property of the graph, it considerably increases the difficulty of the learning process. In this context it is possible to use K-tree for effective learning.
 
==Statistical introduction==
 