{{#seo:
|keywords=信息论,信息时代,正规科学,控制论,计算机科学
|description=信息论,信息时代,正规科学,控制论,计算机科学
}}
'''信息论 Information theory'''研究的是信息的量化、存储与传播。信息论最初是由[[克劳德·香农 Claude Shannon]]在1948年的一篇题为'''<font color="#ff8000">《一种通信的数学理论 A Mathematical Theory of Communication 》</font>'''的里程碑式论文中提出的,其目的是找到信号处理和通信操作(如数据压缩)的基本限制。信息论对于旅行者号深空探测任务的成功、光盘的发明、移动电话的可行性、互联网的发展、语言学和人类感知的研究、对黑洞的理解以及许多其他领域的研究都是至关重要的。<br />
<br />
该领域是数学、统计学、计算机科学、物理学、神经生物学、信息工程和电气工程的交叉学科。这一理论也在其他领域得到了应用,比如推论统计学、自然语言处理、密码学、神经生物学<ref name="Spikes">{{cite book|title=Spikes: Exploring the Neural Code|author1=F. Rieke|author2=D. Warland|author3=R Ruyter van Steveninck|author4=W Bialek|publisher=The MIT press|year=1997|isbn=978-0262681087}}</ref>、人类视觉<ref>{{Cite journal|last1=Delgado-Bonal|first1=Alfonso|last2=Martín-Torres|first2=Javier|date=2016-11-03|title=Human vision is determined based on information theory|journal=Scientific Reports|language=En|volume=6|issue=1|pages=36038|bibcode=2016NatSR...636038D|doi=10.1038/srep36038|issn=2045-2322|pmc=5093619|pmid=27808236}}</ref>、分子编码的进化、和功能(生物信息学)、统计学中的模型选择<ref>Burnham, K. P. and Anderson D. R. (2002) ''Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition'' (Springer Science, New York)}}.</ref>、热物理学<ref>{{cite journal|last1=Jaynes|first1=E. T.|year=1957|title=Information Theory and Statistical Mechanics|url=http://bayes.wustl.edu/|journal=Phys. Rev.|volume=106|issue=4|page=620|bibcode=1957PhRv..106..620J|doi=10.1103/physrev.106.620}}</ref> 、量子计算、语言学、剽窃检测<ref>{{cite journal|last1=Bennett|first1=Charles H.|last2=Li|first2=Ming|last3=Ma|first3=Bin|year=2003|title=Chain Letters and Evolutionary Histories|url=http://sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|journal=Scientific American|volume=288|issue=6|pages=76–81|bibcode=2003SciAm.288f..76B|doi=10.1038/scientificamerican0603-76|pmid=12764940|access-date=2008-03-11|archive-url=https://web.archive.org/web/20071007041539/http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|archive-date=2007-10-07|url-status=dead}}</ref>、模式识别和异常检测<ref>{{Cite web|url=http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|title=Some background on why people in the empirical sciences may want to better understand the information-theoretic methods|author=David R. Anderson|date=November 1, 2003|archiveurl=https://web.archive.org/web/20110723045720/http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|archivedate=July 23, 2011|url-status=dead|accessdate=2010-06-23}}<br />
</ref>。 <br />
<br />
信息论的重要分支包括信源编码、算法复杂性理论、算法信息论、信息理论安全性、灰色系统理论和信息度量。<br />
<br />
信息论在应用领域的基本课题包括无损数据压缩(例如:ZIP压缩文件)、有损数据压缩(例如:MP3和JPEG格式),以及信道编码(例如:DSL)。信息论在信息检索、情报收集、赌博,甚至在音乐创作中也有应用。
<br />
信息论中的一个关键度量是'''[[熵]]'''。熵量化了一个随机变量的值或者一个随机过程的结果所包含的不确定性。例如,识别一次公平抛硬币的结果(有两个同样可能的结果)所提供的信息(较低的熵)少于识别抛一次骰子的结果(有六个同样可能的结果)。信息论中的其他一些重要指标有:互信息、信道容量、误差指数和相对熵。<br />
<br />
==概览==<br />
<br />
信息论主要研究信息的传递、处理、提取和利用。抽象地说,信息可以被看作对不确定性的消解。1948年,Claude Shannon在他的论文《一种通信的数学理论》中将这个抽象的概念具体化:在这篇论文中,“信息”被认为是一组可能的信号,这些信号在通过带有噪声的信道发送后,接收者能以较低的错误概率将其重构出来。Shannon的主要结论是有噪信道编码定理,它表明:在信道使用次数很多的极限情况下,渐近可达的信息传输速率等于信道容量,而信道容量是一个仅依赖于信号所经过的信道本身统计特性的量。(译注:当信道的信息传输率不超过信道容量时,采用合适的编码方法可以实现任意高的传输可靠性;但若信息传输率超过了信道容量,就不可能实现可靠的传输。)
<br />
信息论与一系列纯科学和应用科学密切相关。在过去半个多世纪里,世界范围内的许多学科既对信息论进行理论研究,也将其转化为工程实践,比如[[自适应系统]]、预期系统、人工智能、[[复杂系统]]、[[复杂性科学]]、[[控制论]]、信息学、[[机器学习]]以及[[系统科学]]。信息论是一个广博而深邃的数学理论,也具有广泛而深入的应用,其中'''编码理论'''是至关重要的领域。
<br />
编码理论研究的是寻找明确的编码方法,以提高传输效率,并将有噪信道上数据传输的错误率降低到信道容量所允许的极限附近。这些编码可大致分为数据压缩(信源编码)和纠错(信道编码)两类技术。对于纠错技术,Shannon证明了其理论极限,但直到很多年后,人们才找到真正达到这一理论最优的编码方法。
<br />
第三类信息论编码是'''密码算法'''(包括编码与密码)。编码理论和信息论的概念、方法和结果在密码学和密码分析中得到了广泛的应用。
<br />
==历史背景==<br />
<br />
1948年7月和10月,[[克劳德·E·香农 Claude E. Shannon]]在《贝尔系统技术期刊》上发表了经典论文:《一种通信的数学理论》,这就是建立信息论学科并立即引起全世界关注的里程碑事件。<br />
<br />
在此之前,贝尔实验室已经发展出一些有限的信息论思想,它们都隐含地假设各事件的概率相等。Harry Nyquist 在1924年发表的论文《影响电报速率的若干因素 Certain Factors Affecting Telegraph Speed》中包含一个理论章节,量化了“情报”以及通信系统传输情报的“线路速度”,并给出关系式 {{math|1=''W'' = ''K'' log ''m''}}(其形式可与玻尔兹曼常数联系起来),其中 ''W'' 是情报传输的速度,''m'' 是每个时间步长可以选择的不同电压电平数,''K'' 是常数。Ralph Hartley 在1928年发表的论文《信息的传输 Transmission of Information》中,把“信息”一词用作一个可测量的量,用以反映接收者将一个符号序列与其他符号序列区分开来的能力,从而将信息量化为 {{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}},其中 ''S'' 是可用符号的数量,''n'' 是一次传输中符号的数量。此时信息的自然单位是十进制数字;为了纪念 Hartley,这一单位后来有时被称为哈特利 Hartley,作为信息的单位、尺度或度量。1940年,阿兰·图灵 Alan Turing 在二战期间破解德国恩尼格玛密码 Enigma ciphers 的统计分析中使用了类似的思想。
<br />
信息论背后的许多数学理论(包括不同概率的事件)都是由[[路德维希·玻尔兹曼 Ludwig Boltzmann]]和[[约西亚·威拉德·吉布斯 J. Willard Gibbs]]为热力学领域开发出来的。<br />
<br />
Shannon的那篇革命性的、开创性的论文,于1944年年底便已在贝尔实验室基本完成。在这篇论文里,Shannon将通信看作一个统计学过程,首次提出了通信的量化模型,并以此为基础推导出了信息论。论文开篇便提出了以下论断:
<br />
''<blockquote>“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.”<br>
“通信的基本问题是在一点上精确地或近似地复现在另一点上所选择的消息。”</blockquote>''
<br />
与此相关的一些想法包括:<br />
<br />
* 信息熵和信源冗余,以及'''<font color="#ff8000">信源编码定理</font>''';<br />
<br />
* '''<font color="#ff8000">互信息,有噪信道的信道容量</font>''',包括无损通信的证明,和'''<font color="#ff8000">有噪信道编码定理</font>''';<br />
<br />
* '''<font color="#ff8000">香农-哈特利定律 Shannon–Hartley law</font>'''应用于高斯信道的信道容量的结果;<br />
<br />
* '''<font color="#ff8000">比特 bit</font>'''——一种新的度量信息的最基本单位。<br />
<br />
==信息的度量==<br />
<br />
信息论基于概率论和统计学,经常涉及度量随机变量分布所含的信息。信息论中重要的信息量有:熵(单个随机变量所含信息的度量)和互信息(两个随机变量之间共有信息的度量)。熵是随机变量概率分布的一个属性,它给出了由服从该分布的独立采样所组成数据的可压缩率的极限。互信息是两个随机变量联合概率分布的一个属性:当信道的统计特性由该联合分布确定时,它是在码块长度趋于无穷的极限下,通过有噪信道进行可靠通信的最大速率。
<br />
在下列公式中,对数底数的选择决定了信息熵的单位。信息的常见单位是比特(基于二进制对数)。其他单位包括 nat(自然对数)和十进制数字(常用对数)。<br />
<br />
下文中,按惯例将 {{math|1=''p'' = 0}} 时的表达式{{math|''p'' log ''p''}}的值视为等于零,因为<math>\lim_{p \rightarrow 0+} p \log p = 0</math>适用于任何对数底。<br />
<br />
===信源的熵===<br />
<br />
基于每个用于通信的源符号的概率质量函数,'''<font color="#ff8000">香农熵 Shannon Entropy</font>'''(以比特为单位)由下式给出:<br />
<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
其中{{math|''p<sub>i</sub>''}}是源符号的第{{math|''i''}}个可能值出现的概率。该方程使用以2为底的对数,因此给出的熵以比特(每符号)为单位;为表纪念,这个量有时被称为'''香农熵'''。熵的计算也常使用自然对数(以[[E (mathematical constant)|{{mvar|e}}]]为底,其中{{mvar|e}}是欧拉数),此时熵以纳特为单位,这种形式有时可以省去公式中额外的常数,从而简化分析。其他底数也是可行的,但较少使用:例如以{{math|1=2<sup>8</sup> = 256}}为底的对数,得出的值就以字节(而非比特)为单位;以10为底的对数,则给出以十进制数字(即哈特利)为单位的度量值。
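作为说明,下面给出一个计算香农熵的简短Python草稿(仅为示意,其中的函数名 shannon_entropy 是为本文虚构的,并非某个库的标准接口),它同时演示了上文关于对数底数与单位的讨论,并呼应了引言中硬币与骰子的比较:

<syntaxhighlight lang="python">
import math

def shannon_entropy(probs, base=2):
    """计算离散分布的香农熵。probs 为各结果的概率,其和应为 1。
    按惯例 0·log 0 = 0,因此直接跳过概率为 0 的项。"""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# 公平硬币:两个等可能结果,熵为 1 比特
print(shannon_entropy([0.5, 0.5]))           # 1.0
# 公平骰子:六个等可能结果,熵为 log2(6) ≈ 2.585 比特
print(shannon_entropy([1/6] * 6))            # ≈ 2.585
# 换用自然对数作底,同一分布的熵改以纳特为单位
print(shannon_entropy([0.5, 0.5], math.e))   # ≈ 0.693
</syntaxhighlight>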
<br />
直观地看,离散型随机变量{{math|''X''}}的熵{{math|''H<sub>X</sub>''}}度量的是:在仅知道{{math|''X''}}的分布时,{{math|''X''}}的取值所带有的不确定性大小。
<br />
当一个信息源发出一串含有{{math|''N''}}个符号的序列,且每个符号[[独立同分布]]时,其熵为{{math|''N'' ⋅ ''H''}}比特(对每条含{{math|''N''}}个符号的消息而言)。
如果源数据符号是同分布但不独立的,则长度为{{math|''N''}}的消息的熵将小于{{math|''N'' ⋅ ''H''}}。
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px| 伯努利实验的熵,作为一个成功概率的函数,通常被称为二值熵函数, {{math|''H''<sub>b</sub>(''p'')}}。当使用一个无偏的硬币做实验时,两个可能结果出现的概率相等,此时的熵值最大,为1。]]<br />
<br />
如果一个人发送了1000比特(由0和1组成的序列),然而接收者在发送之前就已经知道这串序列中每一位的值,那么显然这次通信没有传递任何信息(译注:如果你要告诉我一个我已经知道的消息,那么本次通信没有传递任何信息)。但是,如果消息未知,且每个比特独立且等可能地为0或1,则本次通信传输了1000香农的信息(“香农”这一单位通常也称为“比特”)。在这两个极端之间,信息可以按以下方式量化。如果𝕏是{{math|''X''}}所有可能消息的集合{{math|{''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}},且{{math|''p''(''x'')}}是某个<math>x \in \mathbb X</math>出现的概率,那么{{math|''X''}}的熵{{math|''H''}}定义如下: <ref name = Reza>{{cite book | title = An Introduction to Information Theory | author = Fazlollah M. Reza | publisher = Dover Publications, Inc., New York | origyear = 1961| year = 1994 | isbn = 0-486-68210-2 | url = https://books.google.com/books?id=RtzpRAiX6OgC&pg=PA8&dq=intitle:%22An+Introduction+to+Information+Theory%22++%22entropy+of+a+simple+source%22}}</ref>
<br />
:<math> H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x)</math><br />
<br />
(其中{{math|''I''(''x'')}}是[[自信息]],表示单条消息的熵贡献;{{math|𝔼<sub>''X''</sub>}}表示对{{math|''X''}}取期望。)熵的一个特性是:当消息空间中的所有消息等概率出现,即{{math|1=''p''(''x'') = 1/''n''}}时,熵最大;也就是说,此时结果最不可预测,且熵值为{{math|1=''H''(''X'') = log ''n''}}。
<br />
对于只有两种可能取值的随机变量,其信息熵的特殊情况是二值熵函数(通常取以2为底的对数,因此以香农(Sh)为单位):
<br />
:<math>H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p)</math><br />
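下面的Python片段(示意性草稿,函数名 binary_entropy 为本文虚构)直接实现了这一函数,并验证了右图所示的性质:熵在 ''p'' = 0.5 处取得最大值 1:

<syntaxhighlight lang="python">
import math

def binary_entropy(p):
    """二值熵函数 H_b(p),单位为比特;按惯例 H_b(0) = H_b(1) = 0。"""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))   # 1.0:无偏硬币,不确定性最大
print(binary_entropy(0.1))   # ≈ 0.469:结果高度可预测,熵较低
</syntaxhighlight>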
<br />
<br><br />
<br />
===联合熵 Joint entropy===<br />
<br />
两个离散随机变量{{math|''X''}}和{{math|''Y''}}的'''<font color="#ff8000">联合熵 Joint Entropy</font>'''就是二者配对{{math|(''X'', ''Y'')}}的熵。若{{math|''X''}}和{{math|''Y''}}相互独立,那么它们的联合熵就是各自熵的总和。
<br />
例如:如果{{math|(''X'', ''Y'')}}代表棋子的位置({{math|''X''}} 表示行和{{math|''Y''}}表示列),那么棋子所在位置的熵就是棋子行、列的联合熵。<br />
<br />
:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y) \,</math><br />
<br />
尽管记号相似,但注意不要将联合熵与交叉熵混淆。
<br />
<br />
===条件熵(含糊度)Conditional entropy (equivocation)===<br />
<br />
在给定随机变量{{math|''Y''}}时,{{math|''X''}}的'''<font color="#ff8000">条件熵 Conditional Entropy</font>'''(或条件不确定性,也称{{math|''X''}}关于{{math|''Y''}}的含糊度)是对{{math|''Y''}}取平均后的条件熵: <ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref>
<br />
:<math> H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log p(x|y).</math><br />
<br />
由于熵能够以随机变量或该随机变量的某个值为条件,所以应注意不要混淆条件熵的这两个定义(前者更为常用)。该类条件熵的一个基本属性为:<br />
<br />
: <math> H(X|Y) = H(X,Y) - H(Y) .\,</math><br />
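下面用一个小的联合分布数值验证这一恒等式(示意性Python草稿,分布本身是为演示而随意选取的):

<syntaxhighlight lang="python">
import math

def H(probs):
    """离散分布的熵(比特)。"""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 一个示意用的联合分布 p(x, y),以 {(x, y): 概率} 的字典表示
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
py = {}
for (x, y), p in pxy.items():
    py[y] = py.get(y, 0) + p                 # 边缘分布 p(y)

# 按定义直接计算条件熵 H(X|Y) = -Σ p(x,y) log p(x|y)
H_x_given_y = -sum(p * math.log2(p / py[y])
                   for (x, y), p in pxy.items() if p > 0)
print(H_x_given_y)                        # ≈ 0.722
print(H(pxy.values()) - H(py.values()))   # 与上式相等:H(X,Y) - H(Y)
</syntaxhighlight>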
<br />
<br />
===互信息(转移信息) Mutual information (transinformation)===<br />
<br />
'''<font color="#ff8000">互信息 Mutual Information</font>'''度量的是某个随机变量在通过观察另一个随机变量时可以获得的信息量。在通信中可以用它来最大化发送和接收信号之间共享的信息量,这一点至关重要。{{math|''X''}}相对于{{math|''Y''}}的互信息由以下公式给出:<br />
<br />
:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math><br />
<br />
其中{{math|SI}} (Specific mutual Information,特定互信息)是点间的互信息。<br />
<br />
互信息的一个基本属性是:<br />
<br />
: <math>I(X;Y) = H(X) - H(X|Y)\,</math><br />
<br />
也就是说,在编码''X''的过程中,知道''Y''比不知道''Y''平均节省{{math|''I''(''X''; ''Y'')}}比特。<br />
<br />
互信息是对称的:<br />
<br />
: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y)\,</math><br />
<br />
互信息可以表示为在给定''Y''值的情况下''X''的后验分布,以及''X''的先验概率分布之间的平均 Kullback-Leibler 散度(信息增益):<br />
<br />
: <math>I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )]</math><br />
<br />
换句话说,这个量度量的是:在我们得知''Y''的值之后,''X''上的概率分布平均会改变多少。它通常也被重新表述为实际联合分布与两个边缘分布之积之间的散度:
<br />
: <math>I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y))</math><br />
<br />
互信息与列联表中的似然比检验,多项分布,以及皮尔森卡方检验密切相关: 互信息可以视为评估一对变量之间独立性的统计量,并且具有明确指定的渐近分布。<br />
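下面的Python草稿(仅为示意,沿用前文的小联合分布)按三种等价方式计算互信息,数值上验证了上述恒等式:

<syntaxhighlight lang="python">
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # 示意用联合分布
px, py = {}, {}
for (x, y), p in pxy.items():
    px[x] = px.get(x, 0) + p
    py[y] = py.get(y, 0) + p

# 方式一:按定义对 p(x,y)·log[p(x,y)/(p(x)p(y))] 求和
I1 = sum(p * math.log2(p / (px[x] * py[y]))
         for (x, y), p in pxy.items() if p > 0)
# 方式二:I(X;Y) = H(X) - H(X|Y)
H_x_given_y = -sum(p * math.log2(p / py[y])
                   for (x, y), p in pxy.items() if p > 0)
I2 = H(px.values()) - H_x_given_y
# 方式三:I(X;Y) = H(X) + H(Y) - H(X,Y)
I3 = H(px.values()) + H(py.values()) - H(pxy.values())
print(I1, I2, I3)   # 三个结果相等,约为 0.278 比特
</syntaxhighlight>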
<br />
<br><br />
<br />
===Kullback-Leibler散度(信息增益) Kullback–Leibler divergence (information gain)===<br />
<br />
'''Kullback-Leibler 散度'''(或信息散度、相对熵、信息增益)是比较两个分布的一种方法:“真实的”概率分布''p(X)''和任意概率分布''q(X)''。如果我们按照''q(X)''是真实分布的假设去压缩数据,而实际上''p(X)''才是正确的分布,那么 Kullback-Leibler 散度就是压缩时平均每个数据额外需要的比特数。它定义为:
<br />
:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)} = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math><br />
<br />
尽管KL散度有时被用作一种“距离度量”,但它并不是真正的度量 metric:它是不对称的,也不满足三角不等式(不过KL散度是一种拟度量)。
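可以用几行Python直观地看到这种不对称性(示意性草稿,函数名 kl_divergence 为本文虚构;注意它要求凡 p(x) > 0 处必有 q(x) > 0,否则散度为无穷大):

<syntaxhighlight lang="python">
import math

def kl_divergence(p, q):
    """离散分布之间的 KL 散度 D(p‖q),单位为比特。"""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.9, 0.1]
q = [0.5, 0.5]
print(kl_divergence(p, q))   # ≈ 0.531
print(kl_divergence(q, p))   # ≈ 0.737:与上式不等,可见 D(p‖q) ≠ D(q‖p)
</syntaxhighlight>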
<br />
KL散度的另一种解释是:错误的先验所带来的“不必要的惊讶”。假设即将从一个离散集合中随机抽取一个数''X'',其真实概率分布为''p(x)'':如果Alice知道真实分布''p(x)'',而Bob(基于其先验)认为分布是''q(x)'',那么在看到被抽出的''X''的值后,平均而言Bob将比Alice更惊讶。KL散度就是Bob的(主观)惊讶程度的期望值减去Alice的惊讶程度的期望值(如果对数以2为底,则以比特为单位);这样,Bob先验“错误”的程度就可以用他“不必要的惊讶”的期望值来量化。
<br />
===其他度量===<br />
<br />
信息论中其他重要的量包括'''<font color="#ff8000">瑞丽熵 Rényi Entropy</font>'''(一种熵的推广),微分熵(信息量推广到连续分布),以及条件互信息。<br />
<br />
<br />
==编码理论==<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|在可读CD的表面上显示划痕的图片。音乐和数据CD使用纠错编码进行编码,因此即使它们有轻微的划痕,也可以通过错误检测和纠正来对CD进行读取。]]<br />
<br />
'''<font color="#ff8000">编码理论 Coding Theory</font>'''是信息论最重要、最直接的应用之一,可以细分为'''<font color="#ff8000">信源编码理论 Source Coding Theory</font>'''和'''<font color="#ff8000">信道编码理论 Channel Coding Theory</font>'''。信息论使用统计学来量化描述数据所需的比特数,也就是源的信息熵。<br />
<br />
* 数据压缩(信源编码):压缩问题有下面两种提法;
<br />
* [[无损数据压缩]]:数据必须准确重构;<br />
<br />
* [[有损数据压缩]]:在由失真函数度量的给定保真度水平内,分配重构数据所需的比特数。信息论中研究这一问题的分支称为率失真理论。
<br />
*纠错码(信道编码):数据压缩会尽可能地消除冗余,而纠错码则添加恰当类型的冗余(即纠错能力),以便在有噪信道上高效且保真地传输数据(参见本列表之后的重复码示例)。
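作为信道编码的一个最简单的例子,下面的Python草稿(仅为示意,并非任何实际系统所用的编码)在模拟的二进制对称信道上演示了(3,1)重复码:它以三倍带宽为代价,将误码率从 p 量级压低到约 3p² 量级。重复码效率很低、远未达到信道容量,但足以说明“添加冗余以纠错”的思想:

<syntaxhighlight lang="python">
import random

def bsc(bits, p):
    """模拟二进制对称信道:每一位以概率 p 被独立翻转。"""
    return [b ^ (random.random() < p) for b in bits]

def encode(bits):
    return [b for b in bits for _ in range(3)]        # (3,1) 重复码

def decode(coded):
    return [int(sum(coded[i:i + 3]) >= 2)             # 多数表决译码
            for i in range(0, len(coded), 3)]

random.seed(0)
msg = [random.randint(0, 1) for _ in range(10000)]
p = 0.05
raw = sum(a != b for a, b in zip(msg, bsc(msg, p)))
cod = sum(a != b for a, b in zip(msg, decode(bsc(encode(msg), p))))
print(raw / len(msg))   # ≈ 0.05:未编码时的误码率
print(cod / len(msg))   # ≈ 0.007:编码后的误码率显著下降
</syntaxhighlight>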
<br />
信息传输定理,或者说“信源-信道分离定理”,为将编码理论划分为压缩和传输两部分提供了依据。该定理证明了在许多情况下使用比特作为信息的''通用货币''是合理的,但这只在单个发送用户与单个接收用户通信的情况下才成立。在具有多个发送器(多址接入信道)、多个接收器(广播信道)或中转器(中继信道)的情形下,或者在更一般的计算机网络中,先压缩再传输可能就不再是最优选择。[[网络信息论]]研究的就是这些多主体通信模型。
<br />
<br />
<br />
===信源理论===<br />
<br />
任何产生一连串消息的过程都可以视为信息源。无记忆信源是指每条消息都是独立同分布的随机变量;遍历性和平稳性等性质则对信源施加相对宽松的约束。所有这些信源都可以看作随机过程,而这些术语本身在信息论之外也已有深入研究。
<br />
<br />
====速率====<br />
<br />
'''<font color="#ff8000">信息速率 Information Rate</font>'''(熵率)是每个符号的平均熵。对于无记忆信源,信息速率仅表示每个符号的熵,而在平稳随机过程中,它是:<br />
<br />
:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots)</math>;<br />
<br />
也就是在给定所有先前已生成符号的条件下,下一个符号的条件熵。对于更一般的非平稳过程,平均速率为:
<br />
:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n)</math>;<br />
<br />
也就是每个符号的联合熵的极限。对于平稳源,这两个表达式得出的结果相同。<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
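对平稳马尔可夫信源,上述熵率有闭式:''r'' = Σ<sub>''i''</sub> μ<sub>''i''</sub> Σ<sub>''j''</sub> −''P<sub>ij</sub>'' log ''P<sub>ij</sub>'',其中 μ 是平稳分布。下面的Python草稿(示意用,转移矩阵为随意选取)演示了一个两状态马尔可夫信源的熵率低于独立同分布二元信源每符号 1 比特的上限:

<syntaxhighlight lang="python">
import math

# 示意用的两状态马尔可夫信源:P[i][j] 是从状态 i 转移到状态 j 的概率
P = [[0.9, 0.1],
     [0.4, 0.6]]
# 平稳分布 μ 满足 μP = μ;两状态链可直接解出
mu1 = P[0][1] / (P[0][1] + P[1][0])
mu = [1 - mu1, mu1]                      # 此例中 μ = [0.8, 0.2]
# 熵率 r = Σ_i μ_i · H(下一符号 | 当前状态 i)
rate = -sum(mu[i] * P[i][j] * math.log2(P[i][j])
            for i in range(2) for j in range(2))
print(rate)   # ≈ 0.57 比特/符号,低于独立同分布二元信源的 1 比特
</syntaxhighlight>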
<br />
<br />
在信息论中谈论一种语言的“速率”或“熵”是很常见的,也是很合适的,比如当信源是英文散文时。信息源的速率与其冗余度以及可被压缩程度有关。<br />
<br />
<br />
<br />
===信道容量===<br />
<br />
通过信道(例如一根以太网电缆)进行通信是信息论的主要动机。然而,这样的信道往往不能实现信号的精确重建:噪声、静默时段以及其他形式的信号损坏常常会降低信号质量。
<br />
考虑离散信道上的通信过程。该过程的简单模型如下:<br />
<br />
<br />
[[File:Channel model.svg|center|800px|Channel model]]<br />
<br />
这里''X''表示要发送的消息的空间(全集),''Y''表示单位时间内通过信道接收到的消息的空间。设{{math|''p''(''y''{{pipe}}''x'')}}是给定''X''时''Y''的条件概率分布函数。我们将{{math|''p''(''y''{{pipe}}''x'')}}视为通信信道的固有属性(表示信道噪声的性质)。那么''X''和''Y''的联合分布完全由信道和{{math|''f''(''x'')}}决定,其中{{math|''f''(''x'')}}是我们选择通过信道发送的消息的边缘分布。在这些约束条件下,我们希望最大化可以通过信道传递的信息速率(即信号速率)。对此的适当度量是互信息,而互信息的最大值就称为信道容量,由下式给出:
<br />
:<math> C = \max_{f} I(X;Y).\! </math><br />
<br />
信道容量具有以下与以信息速率''R''进行通信有关的性质(其中''R''通常为每符号比特数):对于任意信息速率''R'' < ''C''和编码错误概率ε > 0,当''N''足够大时,存在长度为''N''、码率大于等于''R''的编码以及相应的解码算法,使得块错误的最大概率小于等于ε;也就是说,总是可以以任意小的块错误概率进行传输。此外,对于任何速率''R'' > ''C'',都不可能以任意小的块错误概率进行传输。
<br />
信道编码就是寻找这样一类接近最优的编码:它们可以在有噪信道上以接近信道容量的速率传输数据,且编码错误概率很小。
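下面的Python草稿从数值上演示 <math>C = \max_{f} I(X;Y)</math>(仅为示意:对二进制对称信道在输入分布上做粗粒度网格搜索,并与下文给出的闭式容量 1 − ''H''<sub>b</sub>(''p'') 相比较):

<syntaxhighlight lang="python">
import math

def H2(p):
    """二值熵函数(比特)。"""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_info_bsc(f, p):
    """交叉概率为 p 的二进制对称信道上,输入分布 P(X=1)=f 时的 I(X;Y)。
    利用 I(X;Y) = H(Y) - H(Y|X),且对该信道有 H(Y|X) = H_b(p)。"""
    q = f * (1 - p) + (1 - f) * p        # 输出符号为 1 的概率
    return H2(q) - H2(p)

p = 0.1
# 在所有输入分布上搜索互信息的最大值,即信道容量 C
C = max(mutual_info_bsc(f / 1000, p) for f in range(1001))
print(C, 1 - H2(p))   # 两者几乎相等:最优输入为均匀分布,C = 1 - H_b(p) ≈ 0.531
</syntaxhighlight>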
<br />
<br />
====特定信道容量模型====<br />
<br />
*连续时间内受高斯噪声 Gaussian noise限制的模拟通信信道(详细内容请参见[[Shannon–Hartley定理]])。<br />
<br />
<br />
*'''二进制对称信道 binary symmetric channel(BSC)'''是交叉概率为''p''的二进制输入、二进制输出信道(以概率''p''翻转输入位)。每次信道使用的BSC容量为{{math|1 &minus; ''H''<sub>b</sub>(''p'')}}比特,其中{{math|''H''<sub>b</sub>}}是以2为底的二值熵函数:
<br />
<br />
::[[File:Binary symmetric channel.svg]]<br />
<br />
*'''二进制擦除信道 binary erasure channel(BEC)'''是擦除概率为''p''的二进制输入、三进制输出信道。可能的信道输出为0、1和擦除符号“e”。擦除表示输入位信息的完全丢失。每次信道使用的BEC容量为{{nowrap|1 &minus; ''p''}}比特。
<br />
<br />
::[[File:Binary erasure channel.svg]]<br />
<br />
==在其他领域的应用==<br />
<br />
===情报使用和安全应用===<br />
<br />
信息论的概念可以应用于密码学和密码分析。图灵的信息单位[[Ban(unit)| ban]]曾用于Ultra项目,帮助破解了德国的恩尼格玛密码,加速了二战在欧洲的结束。香农定义了一个重要的概念,现在称为'''单一性距离 [[unicity distance]]''':它基于明文的冗余性,给出实现唯一可解密性所需的最少密文量。
<br />
信息论使我们相信,保密比乍看之下要困难得多。穷举攻击可以破解基于非对称密钥算法或最常用的对称密钥算法(有时也称为私钥算法,如分组密码)的系统。所有这些方法的安全性都来自以下假设:在可观的时间内没有已知的攻击方法可以破解它们。
<br />
信息论安全性指的是诸如一次性密码本之类不易受到这种暴力攻击的方法。在这种情况下,明文和密文之间(以密钥为条件)的条件互信息可以保证正确的传输,而明文和密文之间的无条件互信息保持为零,从而实现绝对安全的通信。换句话说,窃听者无法通过获取密文(而非密钥)来改进其对明文的猜测。但是,就像在其他任何密码系统中一样,即便是信息论意义上安全的方法,也必须小心而正确地使用;Venona项目之所以能够破解苏联的一次性密码本,就是因为苏联不当地重复使用了密钥材料。
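一次性密码本本身只是明文与等长均匀随机密钥的逐位异或。下面的Python草稿(仅为示意,不构成密码学建议)演示了其基本机制:

<syntaxhighlight lang="python">
import secrets

def xor_bytes(a, b):
    """逐字节异或。"""
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = "绝对安全的通信".encode("utf-8")
key = secrets.token_bytes(len(plaintext))   # 与明文等长的均匀随机密钥,只使用一次
ciphertext = xor_bytes(plaintext, key)

# 密钥均匀随机且仅用一次时,密文与明文的无条件互信息为零,
# 窃听者拿到密文也无法改进对明文的猜测;持有密钥的接收者则可完全恢复明文:
print(xor_bytes(ciphertext, key).decode("utf-8"))
</syntaxhighlight>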
<br />
===伪随机数的生成===<br />
<br />
伪随机数生成器在计算机语言库和应用程序中被广泛使用,但由于它们无法规避现代计算机设备和软件的确定性,因此普遍不适用于密码学。一类改进的随机数生成器称为密码学安全伪随机数生成器,但它们也需要来自软件外部的随机种子才能按预期工作,而这可以通过提取器获得。度量提取器输入是否“足够随机”的概念是最小熵,该值通过[[瑞丽熵]]与香农熵相关联;瑞丽熵还用于评估密码系统中的随机性。虽然两者相关,但香农熵较高的随机变量并不一定适合用在提取器中,因而也未必适用于密码学。
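下面的Python草稿(示意用,分布为随意构造)对比了同一分布的香农熵与最小熵,说明为什么后者才是提取器关心的量:

<syntaxhighlight lang="python">
import math

def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """最小熵 H∞ = -log2(max p),只取决于最可能的那个结果。"""
    return -math.log2(max(probs))

# 一个“长尾”分布:一个高概率结果,加上 1024 个低概率结果
probs = [0.5] + [0.5 / 1024] * 1024
print(shannon_entropy(probs))   # 6.0 比特:香农熵看上去相当可观
print(min_entropy(probs))       # 1.0 比特:但攻击者有一半概率直接猜中
</syntaxhighlight>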
<br />
<br />
<br />
===地震勘探===<br />
<br />
信息论的一个早期商业应用是在地震石油勘探领域。在该领域的应用可以从期望的地震信号中剔除和分离不需要的噪声。与以前的模拟方法相比,信息论和数字信号处理大大提高了图像的分辨率和清晰度。<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
<br />
<br />
===符号学===<br />
<br />
符号学家Doede Nauta和Winfried Nöth都认为Charles Sanders Peirce在他的符号学著作中创造了信息论。<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref><ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref> Nauta将符号信息论定义为研究编码、过滤和信息处理的内部过程。<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
信息论的概念(例如冗余和代码控制)已被Umberto Eco和Ferruccio Rossi-Landi等符号学家用来解释意识形态:将意识形态视为一种消息传输的形式,占统治地位的社会阶层通过使用具有高度冗余性的符号来发布其消息,使得从这些符号中只能解码出唯一一种消息,而不会是其他可能的消息。<ref>Nöth, Winfried (1981). "[https://kobra.uni-kassel.de/bitstream/handle/123456789/2014122246977/semi_2004_002.pdf?sequence=1&isAllowed=y Semiotics of ideology]". ''Semiotica'', Issue 148.</ref>
<br />
===其他应用===<br />
<br />
信息论在赌博、黑洞信息论和生物信息学中也有应用。<br />
<br />
<br />
<br />
==参见==<br />
{{div col|colwidth=20em}}<br />
* [[算法概率]]<br />
* [[贝叶斯推断]]<br />
* [[通信理论]]<br />
* [[构造器理论]] - 一种包含了量子信息的广义信息论<br />
* [[归纳概率]]<br />
* [[信息度量]]<br />
* [[最小消息长度]]<br />
* [[最小描述长度]]<br />
* [[计算机科学重要文献列表#信息论|重要文献列表]]<br />
* [[信息的哲学]]<br />
{{div col end}}<br />
<br />
===应用===<br />
<br />
{{div col|colwidth=20em}}<br />
* [[主动网络]]<br />
* [[密码分析]]<br />
* [[密码学]]<br />
* [[控制论]]<br />
* [[热力学和信息论中的熵]]<br />
* [[赌博]]<br />
* [[智能 (信息采集)]]<br />
* [[反射波勘探法|地震勘探]]<br />
<br />
{{div col end}}<br />
<br />
===理论===<br />
<br />
{{div col|colwidth=20em}} <br />
* [[编码理论]]<br />
* [[探测理论]]<br />
* [[估计理论]]<br />
* [[费舍尔理论]]<br />
* [[信息代数]]<br />
* [[信息不对称性]]<br />
* [[信息场论]]<br />
* [[信息几何]]<br />
* [[信息论与测度论]]
* [[柯尔莫哥洛夫复杂度]]<br />
* [[信息论未解之谜]]<br />
* [[信息的逻辑]]<br />
* [[网络编码]]<br />
* [[信息的哲学]]<br />
* [[量子信息科学]]<br />
* [[信源编码]]<br />
<br />
{{div col end}}<br />
<br />
===概念===<br />
<br />
{{div col|colwidth=20em}}<br />
* [[Ban (单位)]] —— 以10为底的对数信息量单位<br />
* [[信道容量]]<br />
* [[信道]]<br />
* [[信源]]<br />
* [[条件熵]]<br />
* [[转换信道]]<br />
* [[数据压缩]]<br />
* 解码器<br />
* [[微分熵]]<br />
* [[可互换信息]]<br />
* [[信息波动复杂度]]<br />
* [[信息熵]]<br />
* [[联合熵]]<br />
* [[Kullback–Leibler散度]]<br />
* [[互信息]]<br />
* [[点间互信息]](PMI)<br />
* [[接收器 (信息论)]]<br />
* [[冗余 (信息论)|冗余]]<br />
* [[瑞丽熵]]<br />
* [[子信息]]<br />
* [[单一性距离]]<br />
* [[种类 (控制论)|种类]]<br />
* [[汉明距离]]<br />
<br />
{{div col end}}<br />
<br />
==参考资料==<br />
<br />
{{Reflist}}<br />
<br />
<br />
===经典之作===<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]] (1948), "A Mathematical Theory of Communication", ''Bell System Technical Journal'', 27, pp.&nbsp;379–423 & 623–656, July & October, 1948. [http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf PDF.] <br />[http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html Notes and other formats.]<br />
<br />
* R.V.L. Hartley, [http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf "Transmission of Information"], ''Bell System Technical Journal'', July 1928<br />
<br />
* Andrey Kolmogorov (1968), "[https://www.tandfonline.com/doi/pdf/10.1080/00207166808803030 Three approaches to the quantitative definition of information]" in International Journal of Computer Mathematics.<br />
<br />
===其他期刊文章===<br />
<br />
* J. L. Kelly, Jr., [http://betbubbles.com/wp-content/uploads/2017/07/kelly.pdf Betbubbles.com], "A New Interpretation of Information Rate" ''Bell System Technical Journal'', Vol. 35, July 1956, pp.&nbsp;917–26.<br />
<br />
* R. Landauer, [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=615478 IEEE.org], "Information is Physical" ''Proc. Workshop on Physics and Computation PhysComp'92'' (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp.&nbsp;1–4.<br />
<br />
* {{cite journal | last1 = Landauer | first1 = R. | year = 1961 | title = Irreversibility and Heat Generation in the Computing Process | url = http://www.research.ibm.com/journal/rd/441/landauerii.pdf | journal = IBM J. Res. Dev. | volume = 5 | issue = 3| pages = 183–191 | doi = 10.1147/rd.53.0183 }}<br />
<br />
* {{cite arXiv |last=Timme |first=Nicholas|last2=Alford |first2=Wesley|last3=Flecker |first3=Benjamin|last4=Beggs |first4=John M.|date=2012 |title=Multivariate information measures: an experimentalist's perspective |eprint=1111.6857|class=cs.IT}}<br />
<br />
===信息论教材===<br />
<br />
* Arndt, C. ''Information Measures, Information and its Description in Science and Engineering'' (Springer Series: Signals and Communication Technology), 2004,<br />
<br />
* Ash, RB. ''Information Theory''. New York: Interscience, 1965. New York: Dover 1990. <br />
<br />
* Gallager, R. ''Information Theory and Reliable Communication.'' New York: John Wiley and Sons, 1968. <br />
<br />
* Goldman, S. ''Information Theory''. New York: Prentice Hall, 1953. New York: Dover 1968, 2005.<br />
<br />
* {{cite book |last1=Cover |first1=Thomas |author-link1=Thomas M. Cover |last2=Thomas |first2=Joy A. |title=Elements of information theory |edition=2nd |location=New York |publisher=[[Wiley-Interscience]] |date=2006}}<br />
<br />
* Csiszar, I, Korner, J. ''Information Theory: Coding Theorems for Discrete Memoryless Systems'' Akademiai Kiado: 2nd edition, 1997. <br />
<br />
* David J. C. MacKay|MacKay, David J. C.. ''[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]'' Cambridge: Cambridge University Press, 2003. <br />
<br />
* Mansuripur, M. ''Introduction to Information Theory''. New York: Prentice Hall, 1987. <br />
<br />
* Robert McEliece|McEliece, R. ''The Theory of Information and Coding". Cambridge, 2002. <br />
<br />
*Pierce, JR. "An introduction to information theory: symbols, signals and noise". Dover (2nd Edition). 1961 (reprinted by Dover 1980).<br />
<br />
* Reza, F. ''An Introduction to Information Theory''. New York: McGraw-Hill 1961. New York: Dover 1994. <br />
<br />
* {{cite book |last1=Shannon |first1=Claude |author-link1=Claude Shannon |last2=Weaver |first2=Warren |author-link2=Warren Weaver |date=1949 |title=The Mathematical Theory of Communication |url=http://monoskop.org/images/b/be/Shannon_Claude_E_Weaver_Warren_The_Mathematical_Theory_of_Communication_1963.pdf |location=[[Urbana, Illinois]] |publisher=[[University of Illinois Press]] |lccn=49-11922 }}<br />
<br />
* Stone, JV. Chapter 1 of book [http://jim-stone.staff.shef.ac.uk/BookInfoTheory/InfoTheoryBookMain.html "Information Theory: A Tutorial Introduction"], University of Sheffield, England, 2014.<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book/ A First Course in Information Theory]'' Kluwer Academic/Plenum Publishers, 2002.<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book2/ Information Theory and Network Coding]'' Springer 2008, 2002.<br />
<br />
===其他书籍===<br />
<br />
* Leon Brillouin, ''Science and Information Theory'', Mineola, N.Y.: Dover, [1956, 1962] 2004.<br />
<br />
* James Gleick, ''The Information: A History, a Theory, a Flood'', New York: Pantheon, 2011.<br />
<br />
* A. I. Khinchin, ''Mathematical Foundations of Information Theory'', New York: Dover, 1957.<br />
<br />
* H. S. Leff and A. F. Rex, Editors, ''Maxwell's Demon: Entropy, Information, Computing'', Princeton University Press, Princeton, New Jersey (1990). <br />
<br />
* Robert K. Logan. ''What is Information? - Propagating Organization in the Biosphere, the Symbolosphere, the Technosphere and the Econosphere'', Toronto: DEMO Publishing.<br />
<br />
* Tom Siegfried, ''The Bit and the Pendulum'', Wiley, 2000.<br />
<br />
* Charles Seife, ''Decoding the Universe'', Viking, 2006.<br />
<br />
* Jeremy Campbell, ''Grammatical Man'', Touchstone/Simon & Schuster, 1982, <br />
<br />
* Henri Theil, ''Economics and Information Theory'', Rand McNally & Company - Chicago, 1967.<br />
<br />
* Escolano, Suau, Bonev, ''[https://www.springer.com/computer/image+processing/book/978-1-84882-296-2 Information Theory in Computer Vision and Pattern Recognition]'', Springer, 2009. <br />
<br />
* Vlatko Vedral, ''Decoding Reality: The Universe as Quantum Information'', Oxford University Press 2010.<br />
<br />
===信息论大型开放式课程===<br />
<br />
* Raymond W. Yeung, "[http://www.inc.cuhk.edu.hk/InformationTheory/index.html Information Theory]" (The Chinese University of Hong Kong)<br />
<br />
==外部链接==<br />
<br />
* Lambert F. L. (1999), "[http://jchemed.chem.wisc.edu/Journal/Issues/1999/Oct/abs1385.html Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!]", ''Journal of Chemical Education''<br />
<br />
* [http://www.itsoc.org/ IEEE Information Theory Society] and [https://www.itsoc.org/resources/surveys ITSOC Monographs, Surveys, and Reviews]<br />
<br />
==编者推荐==<br />
[[File:Last1.png|400px|thumb|right|[https://swarma.org/?p=13364 用神经学习模型计算海量实际网络中的节点中心性度量 | 论文速递1篇|集智俱乐部]]]<br />
===集智文章推荐===<br />
====[https://swarma.org/?p=21423 计算美学前沿速递:用信息论“重新发现”风景画艺术史]====<br />
美术研究中的一个核心问题是,不同年代和流派的绘画作品,在组织架构上,是否有着相似之处?2020年10月发表在美国国家科学院院刊PNAS的论文中,研究者通过信息论和网络分析,对来自61个国家,1476名画家总计14912幅西方风景画的研究,证实了该假说。<br />
<br/><br/><br />
<br />
====[https://swarma.org/?p=20253 Science前沿:用信息论解释动植物间的军备竞赛]====<br />
在植物与植食性昆虫组成的生态系统中,不同物种在相互作用的过程中彼此适应,形成了一个相互影响的协同适应系统。近期Science的一项研究从气味信息的角度,讨论动植物协同进化中的军备竞赛,对研究生态网络内部的交流机制很有启发。同期的一篇相关评论文章对该话题进行了全新解答,本文是该评论文章的编译。
<br />
<br />
<br/><br />
----<br />
本中文词条由[[用户:Pjhhh|Pjhhh]]、[[用户:Moonscar|Moonscar]]参与编译, [[用户:Qige96|Ricky]] 审校,[[用户:不是海绵宝宝|不是海绵宝宝]]、[[用户:唐糖糖|唐糖糖]]编辑,欢迎在讨论页面留言。<br />
<br />
'''本词条内容源自wikipedia及公开资料,遵守 CC3.0协议。'''<br />
[[分类: 信息时代]] [[分类: 正规科学]] [[分类: 控制论]] [[分类: 计算机科学]]
<hr />
<div>{{#seo:<br />
|keywords=信息论,信息时代,正规科学,控制论,计算机科学<br />
|description=信息论,信息时代,正规科学,控制论,计算机科学<br />
}}<br />
'''信息论 Information theory'''研究的是信息的量化、存储与传播。信息论最初是由[[克劳德·香农 Claude Shannon]]在1948年的一篇题为'''<font color="#ff8000">《一种通信的数学理论 A Mathematical Theory of Communication 》</font>'''的里程碑式论文中提出的,其目的是找到信号处理和通信操作(如数据压缩)的基本限制。信息论对于旅行者号深空探测任务的成功、光盘的发明、移动电话的可行性、互联网的发展、语言学和人类感知的研究、对黑洞的理解以及许多其他领域的研究都是至关重要的。<br />
<br />
该领域是数学、统计学、计算机科学、物理学、神经生物学、信息工程和电气工程的交叉学科。这一理论也在其他领域得到了应用,比如推论统计学、自然语言处理、密码学、神经生物学<ref name="Spikes">{{cite book|title=Spikes: Exploring the Neural Code|author1=F. Rieke|author2=D. Warland|author3=R Ruyter van Steveninck|author4=W Bialek|publisher=The MIT press|year=1997|isbn=978-0262681087}}</ref>、人类视觉<ref>{{Cite journal|last1=Delgado-Bonal|first1=Alfonso|last2=Martín-Torres|first2=Javier|date=2016-11-03|title=Human vision is determined based on information theory|journal=Scientific Reports|language=En|volume=6|issue=1|pages=36038|bibcode=2016NatSR...636038D|doi=10.1038/srep36038|issn=2045-2322|pmc=5093619|pmid=27808236}}</ref>、分子编码的进化、和功能(生物信息学)、统计学中的模型选择<ref>Burnham, K. P. and Anderson D. R. (2002) ''Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition'' (Springer Science, New York)}}.</ref>、热物理学<ref>{{cite journal|last1=Jaynes|first1=E. T.|year=1957|title=Information Theory and Statistical Mechanics|url=http://bayes.wustl.edu/|journal=Phys. Rev.|volume=106|issue=4|page=620|bibcode=1957PhRv..106..620J|doi=10.1103/physrev.106.620}}</ref> 、量子计算、语言学、剽窃检测<ref>{{cite journal|last1=Bennett|first1=Charles H.|last2=Li|first2=Ming|last3=Ma|first3=Bin|year=2003|title=Chain Letters and Evolutionary Histories|url=http://sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|journal=Scientific American|volume=288|issue=6|pages=76–81|bibcode=2003SciAm.288f..76B|doi=10.1038/scientificamerican0603-76|pmid=12764940|access-date=2008-03-11|archive-url=https://web.archive.org/web/20071007041539/http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|archive-date=2007-10-07|url-status=dead}}</ref>、模式识别和异常检测<ref>{{Cite web|url=http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|title=Some background on why people in the empirical sciences may want to better understand the information-theoretic methods|author=David R. Anderson|date=November 1, 2003|archiveurl=https://web.archive.org/web/20110723045720/http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|archivedate=July 23, 2011|url-status=dead|accessdate=2010-06-23}}<br />
</ref>。 <br />
<br />
信息论的重要分支包括信源编码、算法复杂性理论、算法信息论、信息理论安全性、灰色系统理论和信息度量。<br />
<br />
信息论在应用领域的基本课题包括无损数据压缩(例如:ZIP压缩文件)、有损数据压缩(例如:Mp3和jpeg格式),以及频道编码(例如:DSL)。信息论在信息检索、情报收集、赌博,甚至在音乐创作中也有应用。<br />
<br />
信息论中的一个关键度量是'''[[熵]]'''。熵量化了一个随机变量的值或者一个随机过程的结果所包含的不确定性。例如,识别一次公平抛硬币的结果(有两个同样可能的结果)所提供的信息(较低的熵)少于识别抛一次骰子的结果(有六个同样可能的结果)。信息论中的其他一些重要指标有:互信息、信道容量、误差指数和相对熵。<br />
<br />
==概览==<br />
<br />
信息论主要研究信息的传递、处理、提取和利用。抽象地说,信息可以作为不确定性的解决方案。1948年,Claude Shannon在他的论文《一种通信的数学理论》中将这个抽象的概念具体化,在这篇论文中“信息”被认为是一组可能的信号,这些信号在通过带有噪声的信道发送后,接收者能在信道噪声的影响下以较低的错误概率来重构这些信号。 Shannon的主要结论,有噪信道编码定理,表明在信道使用的许多限制情况下,渐近可达到信息传输速率等于的信道容量,一个仅仅依赖于信息发送所经过的信道本身的统计量。(译注:当信道的信息传输率不超过信道容量时,采用合适的编码方法可以实现任意高的传输可靠性,但若信息传输率超过了信道容量,就不可能实现可靠的传输。)<br />
<br />
信息论与一系列纯科学和应用科学密切相关。在过去半个世纪甚至更久的时间里,在全球范围内已经有各种各样的学科理论被研究和化归为工程实践,比如在[[自适应系统]],预期系统,人工智能,[[复杂系统]],[[复杂性科学]],[[控制论]],信息学,[[机器学习]],以及[[系统科学]]。信息论是一个广博而深遂的数学理论,也具有广泛而深入的应用,其中'''编码理论'''是至关重要的领域。<br />
<br />
编码理论与寻找明确的方法(编码)有关,用于提高效率和将有噪信道上传输的数据错误率降低到接近信道容量。这些编码可大致分为数据压缩编码(信源编码)和纠错(信道编码)技术。对于纠错技术,Shannon证明了理论极限很多年后才有人找到了真正实现了理论最优的方法。<br />
<br />
第三类信息论代码是'''密码算法'''(包括密文和密码)。编码理论和信息论的概念、方法和结果在密码学和密码分析中得到了广泛的应用。<br />
<br />
==历史背景==<br />
<br />
1948年7月和10月,[[克劳德·E·香农 Claude E. Shannon]]在《贝尔系统技术期刊》上发表了经典论文:《一种通信的数学理论》,这就是建立信息论学科并立即引起全世界关注的里程碑事件。<br />
<br />
在此之前,贝尔实验室已经提出了有限的信息论思想,所有这些理论都隐性地假设了概率均等的事件。Harry Nyquist 在1924年发表的论文《集中影响电报速率的因素 Certain Factors Affecting Telegraph Speed》中包含一个理论章节,量化了“情报”和通信系统可以传输的“线路速度”,并给出了关系式 {{math|1=''W'' = ''K'' log ''m''}} (参考玻尔兹曼常数) ,其中 ''W'' 是情报传输的速度, ''m'' 是每个时间步长可以选择的不同电压电平数,''K'' 是常数。Ralph Hartley 在1928年发表的论文《信息的传输 Transmission of Information》中,将单词信息作为一个可测量的量,以此反映接收者区分一系列符号的能力,从而将信息量化为 {{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}},其中 ''S'' 是可以使用的符号的数量,''n'' 是传输中符号的数量。因此信息的单位就是十进制数字,为了表示对他的尊敬,这个单位有时被称为 Hartley,作为信息的单位、尺度或度量。1940年,图灵在二战时期破解德国的“迷”密码 Enigma ciphers的统计分析中使用了类似的思想。<br />
<br />
信息论背后的许多数学理论(包括不同概率的事件)都是由[[路德维希·玻尔兹曼 Ludwig Boltzmann]]和[[约西亚·威拉德·吉布斯 J. Willard Gibbs]]为热力学领域开发出来的。<br />
<br />
Shannon的那篇革命性的、开创性的论文,于1944年的年底便已基本在贝尔实验室完成。在这论文里, Shannon将通信看作一个统计学过程,首次提出了通信的量化模型,并以此为基础推导出了信息论。论文开篇便提出了一下论断:<br />
<br />
''<blockquote>“The basic problem of communication is the accurate or approximate representation at one point of selected information at another point.”<br><br />
“通信的基本问题是在一点上精确地或近似地再现在另一点上选择的信息。”</blockquote>''<br />
<br />
与此相关的一些想法包括:<br />
<br />
* 信息熵和信源冗余,以及'''<font color="#ff8000">信源编码定理</font>''';<br />
<br />
* '''<font color="#ff8000">互信息,有噪信道的信道容量</font>''',包括无损通信的证明,和'''<font color="#ff8000">有噪信道编码定理</font>''';<br />
<br />
* '''<font color="#ff8000">香农-哈特利定律 Shannon–Hartley law</font>'''应用于高斯信道的信道容量的结果;<br />
<br />
* '''<font color="#ff8000">比特 bit</font>'''——一种新的度量信息的最基本单位。<br />
<br />
==信息的度量==<br />
<br />
信息论基于概率论和统计学,其中经常涉及衡量随机变量的分布的信息。信息论中重要的信息量有:熵(单个随机变量中信息的度量)和互信息(两个随机变量之间的信息的度量)。熵是随机变量的概率分布的一个属性,它限制了从给定分布中独立采样得到的数据的压缩率。互信息是两个随机变量的联合概率分布的一个属性,是当信道的统计量由联合分布确定时,在长块长度的限制下,通过有噪信道的可靠通信的最大速率。<br />
<br />
在下列公式中,对数底数的选择决定了信息熵的单位。信息的常见单位是比特(基于二进制对数)。其他单位包括 nat(自然对数)和十进制数字(常用对数)。<br />
<br />
下文中,按惯例将 {{math|1=''p'' = 0}} 时的表达式{{math|''p'' log ''p''}}的值视为等于零,因为<math>\lim_{p \rightarrow 0+} p \log p = 0</math>适用于任何对数底。<br />
<br />
===信源的熵===<br />
<br />
基于每个用于通信的源符号的概率质量函数,'''<font color="#ff8000">香农熵 Shannon Entropy</font>'''(以比特为单位)由下式给出:<br />
<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
其中{{math|''p<sub>i</sub>''}}是源符号的第{{math|''i''}}个可能值出现的概率。该方程以比特(每个符号)为单位给出熵,因为它使用以2为底的对数。为表纪念,这个熵有时被称为'''香农熵'''。熵的计算也通常使用自然对数(以[[E (mathematical constant)|{{mvar|e}}]]为底数,其中{{mvar|e}}是欧拉数,其他底数也是可行的,但不常用),这样就可以测量每个符号的熵值,有时在公式中可以通过避免额外的常量来简化分析。例如以{{math|1=2<sup>8</sup> = 256}}为底的对数,得出的值就以字节(而非比特)作为单位。以10为底的对数,每个符号将产生以十进制数字(或哈特利)为单位的测量值。<br />
<br />
直观的来看,离散型随机变量{{math|''X''}}的熵{{math|''H<sub>X</sub>''}}是对不确定性的度量,当只知道其分布时,它的值与{{math|''X''}}的值相关。<br />
<br />
当一个信息源发出了一串含有{{math|''N''}}个符号的序列,且每个符号[[独立同分布]]时,其熵为{{math|''N'' ⋅ ''H''}}位(每个信息{{math|''N''}}符号)。<br />
如果源数据符号是同分布但不独立的,则长度为{{math|''N''}}的消息的熵将小于{{math|''N'' ⋅ ''H''}}。<br />
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px| 伯努利实验的熵,作为一个成功概率的函数,通常被称为二值熵函数, {{math|''H''<sub>b</sub>(''p'')}}。当使用一个无偏的硬币做实验时,两个可能结果出现的概率相等,此时的熵值最大,为1。]]<br />
<br />
如果一个人发送了1000比特(0s和1s),然而接收者在发送之前就已知这串比特序列中的每一个位的值,显然这个通信过程并没有任何信息(译注:如果你要告诉我一个我已经知到的消息,那么本次通信没有传递任何信息)。但是,如果消息未知,且每个比特独立且等可能的为0或1时,则本次通信传输了1000香农的信息(通常称为“比特”)。在这两个极端之间,信息可以按以下方式进行量化。如果𝕏是{{math|''X''}}可能在的所有消息的集合{{math|{''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}},且{{math|''p''(''x'')}}是<math>x \in \mathbb X</math>的概率,那么熵、{{math|''H''}}和{{math|''H''}}的定义如下: <ref name = Reza>{{cite book | title = An Introduction to Information Theory | author = Fazlollah M. Reza | publisher = Dover Publications, Inc., New York | origyear = 1961| year = 1994 | isbn = 0-486-68210-2 | url = https://books.google.com/books?id=RtzpRAiX6OgC&pg=PA8&dq=intitle:%22An+Introduction+to+Information+Theory%22++%22entropy+of+a+simple+source%22}}</ref><br />
<br />
:<math> H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x)</math><br />
<br />
(其中:{{math|''I''(''x'')}}是[[自信息]],表示单个信息的熵贡献;{{math|''I''(''x'')}}{{math|𝔼<sub>''X''</sub>}}为{{math|''X''}}的期望。)熵的一个特性是,当消息空间中的所有消息都是等概率{{math|1=''p''(''x'') = 1/''n''}}时熵最大; 也就是说,在{{math|1=''H''(''X'') = log ''n''}}这种情况下,熵是最不可预测的。<br />
<br />
对于只有两种可能取值的随机变量的信息熵,其特殊情况为二值熵函数(通常用以为底2对数,因此以香农(Sh)为单位):<br />
<br />
:<math>H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p)</math><br />
<br />
<br><br />
<br />
===联合熵 Joint entropy===<br />
<br />
两个离散的随机变量{{math|''X''}}和{{math|''Y''}}的'''<font color="#ff8000">联合熵 Joint Entropy</font>'''大致是它们的配对: {{math|(''X'', ''Y'')}}。若{{math|''X''}}和{{math|''Y''}}是独立的,那么它们的联合熵就是其各自熵的总和。<br />
<br />
例如:如果{{math|(''X'', ''Y'')}}代表棋子的位置({{math|''X''}} 表示行和{{math|''Y''}}表示列),那么棋子所在位置的熵就是棋子行、列的联合熵。<br />
<br />
:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y) \,</math><br />
<br />
尽管符号相似,注意联合熵与交叉熵不能混淆。<br />
<br />
<br />
===条件熵(含糊度)Conditional entropy (equivocation)===<br />
<br />
在给定随机变量{{math|''Y''}}下{{math|''X''}}的'''<font color="#ff8000">条件熵 Conditional Entropy</font>'''(或条件不确定性,也可称为{{math|''X''}}关于{{math|''Y''}}的含糊度))是{{math|''Y''}}上的平均条件熵: <ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref><br />
<br />
:<math> H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log p(x|y).</math><br />
<br />
由于熵能够以随机变量或该随机变量的某个值为条件,所以应注意不要混淆条件熵的这两个定义(前者更为常用)。该类条件熵的一个基本属性为:<br />
<br />
: <math> H(X|Y) = H(X,Y) - H(Y) .\,</math><br />
<br />
<br />
===互信息(转移信息) Mutual information (transinformation)===<br />
<br />
'''<font color="#ff8000">互信息 Mutual Information</font>'''度量的是某个随机变量在通过观察另一个随机变量时可以获得的信息量。在通信中可以用它来最大化发送和接收信号之间共享的信息量,这一点至关重要。{{math|''X''}}相对于{{math|''Y''}}的互信息由以下公式给出:<br />
<br />
:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math><br />
<br />
其中{{math|SI}} (Specific mutual Information,特定互信息)是点间的互信息。<br />
<br />
互信息的一个基本属性是:<br />
<br />
: <math>I(X;Y) = H(X) - H(X|Y)\,</math><br />
<br />
也就是说,在编码''X''的过程中,知道''Y''比不知道''Y''平均节省{{math|''I''(''X''; ''Y'')}}比特。<br />
<br />
互信息是对称的:<br />
<br />
: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y)\,</math><br />
<br />
互信息可以表示为在给定''Y''值的情况下''X''的后验分布,以及''X''的先验概率分布之间的平均 Kullback-Leibler 散度(信息增益):<br />
<br />
: <math>I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )]</math><br />
<br />
换句话说,这个指标度量:当我们给出''Y''的值,得出''X''上的概率分布将会平均变化多少。这通常用于计算边缘分布的乘积与实际联合分布的差异:<br />
<br />
: <math>I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y))</math><br />
<br />
互信息与列联表中的似然比检验,多项分布,以及皮尔森卡方检验密切相关: 互信息可以视为评估一对变量之间独立性的统计量,并且具有明确指定的渐近分布。<br />
<br />
<br><br />
<br />
===Kullback-Leibler散度(信息增益) Kullback–Leibler divergence (information gain)===<br />
<br />
'''Kullback-Leibler 散度'''(或信息散度、相对熵、信息增益)是比较两种分布的方法: “真实的”概率分布''p(X)''和任意概率分布''q(X)''。若假设''q(X)''是基于某种方式压缩的数据的分布,而实际上''p(X)''才是真正分布,那么 Kullback-Leibler 散度是每个数据压缩所需的平均额外比特数。因此定义:<br />
<br />
:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)} = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math><br />
<br />
尽管有时会将KL散度用作距离量度但它并不是一个真正的指标,因为它是不对称的,同时也不满足三角不等式(KL散度可以作为一个半准度量)。<br />
<br />
KL散度的另一种解释是一种先验知识引入的“不必要的惊讶”。假设将从概率分布为“ p(x)”的离散集合中随机抽取数字“ X”,如果Alice知道真实的分布“p(x)”,而Bob(因为具有先验知识)认为概率分布是“q(x)”,那么在看到抽取出来的''X''的值后,平均而言,Bob将比Alice更加惊讶。KL散度就是Bob惊讶的期望值减去Alice惊讶的期望值(如果对数以2为底,则以比特为单位),这样Bob所拥有的先验知识的“错误的”程度可以用他“不必要的惊讶”的期望值来进行量化。<br />
<br />
===其他度量===<br />
<br />
信息论中其他重要的量包括'''<font color="#ff8000">瑞丽熵 Rényi Entropy</font>'''(一种熵的推广),微分熵(信息量推广到连续分布),以及条件互信息。<br />
<br />
<br />
==编码理论==<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|在可读CD的表面上显示划痕的图片。音乐和数据CD使用纠错编码进行编码,因此即使它们有轻微的划痕,也可以通过错误检测和纠正来对CD进行读取。]]<br />
<br />
'''<font color="#ff8000">编码理论 Coding Theory</font>'''是信息论最重要、最直接的应用之一,可以细分为'''<font color="#ff8000">信源编码理论 Source Coding Theory</font>'''和'''<font color="#ff8000">信道编码理论 Channel Coding Theory</font>'''。信息论使用统计学来量化描述数据所需的比特数,也就是源的信息熵。<br />
<br />
* 数据压缩(源编码):压缩问题有两个相关公式;<br />
<br />
* [[无损数据压缩]]:数据必须准确重构;<br />
<br />
* [[有损数据压缩]]:由失真函数测得的在指定保真度级别内分配重构数据所需的比特数。信息论中的这个部分称为率失真理论。<br />
<br />
*纠错码(信道编码):数据压缩会尽可能多的消除冗余,而纠错码会添加所需的冗余(即纠错),以便在嘈杂的信道上有效且保真地传输数据。<br />
<br />
信息传输定理,或着说“信源-信道分离定理”证明,编码理论应当划分为压缩和传输两部分。定理证明了在许多情况下使用比特作为信息的''通用货币''是合理的,但这只在发送用户与特定接收用户建立通信的情况下才成立。在具有多个发送器(多路访问信道),多个接收器(广播信道)或中转器(中继信道)或多个计算机网络的情况下,压缩后再进行传输可能就不再是最佳选择。[[网络信息论]]指的就是这些多主体通信模型。<br />
<br />
<br />
<br />
===信源理论===<br />
<br />
生成连续消息的任何过程都可以视为信息的通讯来源。无记忆信源是指每个消息都是独立同分布的随机变量,而遍历理论和平稳过程的性质对信源施加的限制较少。所有这些信源都可以看作随机的。在信息论领域外,这些术语也已经有很全面的相关研究。<br />
<br />
<br />
====速率====<br />
<br />
'''<font color="#ff8000">信息速率 Information Rate</font>'''(熵率)是每个符号的平均熵。对于无记忆信源,信息速率仅表示每个符号的熵,而在平稳随机过程中,它是:<br />
<br />
:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots)</math>;<br />
<br />
也就是,给定所有之前生成的符号下,一个符号的条件熵。对于非平稳的过程的更一般情况,平均速率为:<br />
<br />
:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n)</math>;<br />
<br />
也就是每个符号的联合熵的极限。对于平稳源,这两个表达式得出的结果相同。<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
<br />
<br />
在信息论中谈论一种语言的“速率”或“熵”是很常见的,也是很合适的,比如当信源是英文散文时。信息源的速率与其冗余度以及可被压缩程度有关。<br />
<br />
<br />
<br />
===信道容量===<br />
<br />
通过信道(例如,以太网电缆)进行通信是信息论的主要动机。然而,这样的信道往往不能产生信号的精确重建;静默时段内、噪声、其他形式的信号损坏往往会使得信息质量的降低。<br />
<br />
考虑离散信道上的通信过程。该过程的简单模型如下:<br />
<br />
<br />
[[File:Channel model.svg|center|800px|Channel model]]<br />
<br />
这里''X''表示要发送的信息的空间(全集),''Y''表示单位时间内通过信道接收的信息的空间。设{{math|''p''(''y''{{pipe}}''x'')}}是给定''X''的''Y''的条件概率分布函数。我们将{{math|''p''(''y''{{pipe}}''x'')}}视为通信信道的固定属性(表示信道噪声的性质)。那么''X''和''Y''的联合分布完全取决于所选用的信道和{{math|''f''(''x'')}},以及通过信道发送的信息的边缘分布。在这些约束条件下,我们希望最大化信息速率或信号速率,可以通过信道进行通信。对此的适当度量为互信息,信道容量即为最大互信息,且由下式给出:<br />
<br />
:<math> C = \max_{f} I(X;Y).\! </math><br />
<br />
信道容量具有以下与以信息速率“R”进行通信有关的属性(其中“R”通常为每个符号的比特数)。对于任意信息速率''R < C''和编码错误ε > 0,存在足够大的长度为''N''和速率大于等于R的代码以及解码算法使得块错误的最大概率小于等于ε;即总是可以在任意小的块错误下进行传输。此外对于任何速率的“ R> C”,不可能以很小的块错误进行发送。<br />
<br />
信道编码就寻找一种接近最优的编码,它可以用于在噪声信道上以接近信道容量的速率传输数据,且编码错误很小。<br />
<br />
<br />
====特定信道容量模型====<br />
<br />
*连续时间内受高斯噪声 Gaussian noise限制的模拟通信信道(详细内容请参见[[Shannon–Hartley定理]])。<br />
<br />
<br />
*'''二进制对称通道 binary symmetric channel(BSC)'''是交叉概率为''p''的二进制输入、二进制输出(以概率''p''翻转输入位)通道。每个通道使用的BSC容量为{{math|1 &minus; ''H''<sub>b</sub>(''p'')}}比特,其中{{math|''H''<sub>b</sub>}}是以2为底的对数的二进制熵函数:<br />
<br />
<br />
::[[File:Binary symmetric channel.svg]]<br />
<br />
*'''二进制擦除通道 binary erasure channel(BEC)'''是擦除概率为“ p”的二进制输入、三进制输出通道。可能的通道输出为0、1和擦除符号'e'。擦除表示信息输入位的完全丢失。每个通道使用的BEC容量为{{nowrap|1 &minus; ''p''}}比特。<br />
<br />
<br />
::[[File:Binary erasure channel.svg]]<br />
<br />
==在其他领域的应用==<br />
<br />
===情报使用和安全应用===<br />
<br />
信息论的概念可以应用于密码学和密码分析。在Ultra的项目中就使用了图灵的信息单位[[Ban(unit)| ban]],破解了德国的恩尼格玛密码,加速了二战在欧洲的结束。香农定义了一个重要的概念,现在称为'''单一性距离 [[unicity distance]]''',基于明文的冗余性尝试给出具有唯一可解密性所需的最少量的密文。<br />
<br />
信息论使我们觉得保密比最初看起来要困难得多。穷举法也可以破解基于非对称密钥算法或最常用的对称密钥算法(也称为密钥算法),如分块加密。所有这些方法的安全性都来自以下假设:在一定的的时间内没有已知的攻击方法可以破解它们。<br />
<br />
信息理论安全性指的是诸如一次性密钥之类的不易受到这种暴力攻击的方法。在这种情况下,可以确保明文和密文(以密钥为条件)之间的正条件互信息正确的传输,而明文和密文之间的无条件互信息仍为零,从而保证绝对安全的通信。换句话说,窃听者将无法通过获取密文而不是密钥的知识来改善其对原文本的猜测。但是,就像在其他任何密码系统中一样,即便时信息论中安全的方法必须小心正确的使用;之所以Venona 项目能够破解苏联的一次性密钥,就是因为苏联不当地重复使用关键材料。<br />
<br />
===伪随机数的生成===<br />
<br />
伪随机数生成器在计算机语言库和应用程序中广泛应用。由于它们没有规避现代计算机设备和软件的确定性,因此普遍不适合用在密码学中。一类改进的随机数生成器称为加密安全的伪随机数生成器,但也需要软件外部的随机种子才能正常工作,这通过提取器来获得。用来度量提取器中充分随机性的概念是最小熵,该值通过[[瑞丽熵]]与香农熵关联;瑞丽熵还用于评估密码系统中的随机性。虽然相关,但具有较高香农熵的随机变量不一定适合在提取器中使用,因此也不能用在密码学中。<br />
<br />
<br />
<br />
===地震勘探===<br />
<br />
信息论的一个早期商业应用是在地震石油勘探领域。在该领域的应用可以从期望的地震信号中剔除和分离不需要的噪声。与以前的模拟方法相比,信息论和数字信号处理大大提高了图像的分辨率和清晰度。<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
<br />
<br />
===符号学===<br />
<br />
符号学家Doede Nauta和Winfried Nöth都认为Charles Sanders Peirce在他的符号学著作中创造了信息论。<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref><ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref> Nauta将符号信息论定义为研究编码、过滤和信息处理的内部过程。<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
信息论的概念(例如冗余和代码控制)已被符号学家如Umberto Eco和Ferruccio Rossi-Landi用来解释意识形态,将其作为消息传输的一种形式,占统治地位的社会阶层通过使用具有高度冗余性的标志来发出其信息,使得从符号中解码出来的消息只有一种,而不会时其他可能的消息。<ref>Nöth, Winfried (1981). "[https://kobra.uni-kassel.de/bitstream/handle/123456789/2014122246977/semi_2004_002.pdf?sequence=1&isAllowed=y Semiotics of ideology]". ''Semiotica'', Issue 148.</ref><br />
<br />
===其他应用===<br />
<br />
信息论在赌博、黑洞信息论和生物信息学中也有应用。<br />
<br />
<br />
<br />
==参见==<br />
{{div col|colwidth=20em}}<br />
* [[算法概率]]<br />
* [[贝叶斯推断]]<br />
* [[通信理论]]<br />
* [[构造器理论]] - 一种包含了量子信息的广义信息论<br />
* [[归纳概率]]<br />
* [[信息度量]]<br />
* [[最小消息长度]]<br />
* [[最小描述长度]]<br />
* [[计算机科学重要文献列表#信息论|重要文献列表]]<br />
* [[信息的哲学]]<br />
{{div col end}}<br />
<br />
===应用===<br />
<br />
{{div col|colwidth=20em}}<br />
* [[主动网络]]<br />
* [[密码分析]]<br />
* [[密码学]]<br />
* [[控制论]]<br />
* [[热力学和信息论中的熵]]<br />
* [[赌博]]<br />
* [[智能 (信息采集)]]<br />
* [[反射波勘探法|地震勘探]]<br />
<br />
{{div col end}}<br />
<br />
===理论===<br />
<br />
{{div col|colwidth=20em}} <br />
* [[编码理论]]<br />
* [[探测理论]]<br />
* [[估计理论]]<br />
* [[费舍尔理论]]<br />
* [[信息代数]]<br />
* [[信息不对称性]]<br />
* [[信息场论]]<br />
* [[信息几何]]<br />
* [[信息论于测度论]]<br />
* [[柯尔莫哥洛夫复杂度]]<br />
* [[信息论未解之谜]]<br />
* [[信息的逻辑]]<br />
* [[网络编码]]<br />
* [[信息的哲学]]<br />
* [[量子信息科学]]<br />
* [[信源编码]]<br />
<br />
{{div col end}}<br />
<br />
===概念===<br />
<br />
{{div col|colwidth=20em}}<br />
* [[Ban (单位)]] —— 以10为底的对数信息量单位<br />
* [[信道容量]]<br />
* [[信道]]<br />
* [[信源]]<br />
* [[条件熵]]<br />
* [[转换信道]]<br />
* [[数据压缩]]<br />
* 解码器<br />
* [[微分熵]]<br />
* [[可互换信息]]<br />
* [[信息波动复杂度]]<br />
* [[信息熵]]<br />
* [[联合熵]]<br />
* [[Kullback–Leibler散度]]<br />
* [[互信息]]<br />
* [[点间互信息]](PMI)<br />
* [[接收器 (信息论)]]<br />
* [[冗余 (信息论)|冗余]]<br />
* [[瑞丽熵]]<br />
* [[子信息]]<br />
* [[单一性距离]]<br />
* [[种类 (控制论)|种类]]<br />
* [[汉明距离]]<br />
<br />
{{div col end}}<br />
<br />
==参考资料==<br />
<br />
{{Reflist}}<br />
<br />
<br />
===经典之作===<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]] (1948), "A Mathematical Theory of Communication", ''Bell System Technical Journal'', 27, pp.&nbsp;379–423 & 623–656, July & October, 1948. [http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf PDF.] <br />[http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html Notes and other formats.]<br />
<br />
* R.V.L. Hartley, [http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf "Transmission of Information"], ''Bell System Technical Journal'', July 1928<br />
<br />
* Andrey Kolmogorov (1968), "[https://www.tandfonline.com/doi/pdf/10.1080/00207166808803030 Three approaches to the quantitative definition of information]" in International Journal of Computer Mathematics.<br />
<br />
===其他期刊文章===<br />
<br />
* J. L. Kelly, Jr., [http://betbubbles.com/wp-content/uploads/2017/07/kelly.pdf Betbubbles.com], "A New Interpretation of Information Rate" ''Bell System Technical Journal'', Vol. 35, July 1956, pp.&nbsp;917–26.<br />
<br />
* R. Landauer, [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=615478 IEEE.org], "Information is Physical" ''Proc. Workshop on Physics and Computation PhysComp'92'' (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp.&nbsp;1–4.<br />
<br />
* {{cite journal | last1 = Landauer | first1 = R. | year = 1961 | title = Irreversibility and Heat Generation in the Computing Process | url = http://www.research.ibm.com/journal/rd/441/landauerii.pdf | journal = IBM J. Res. Dev. | volume = 5 | issue = 3| pages = 183–191 | doi = 10.1147/rd.53.0183 }}<br />
<br />
* {{cite arXiv |last=Timme |first=Nicholas|last2=Alford |first2=Wesley|last3=Flecker |first3=Benjamin|last4=Beggs |first4=John M.|date=2012 |title=Multivariate information measures: an experimentalist's perspective |eprint=1111.6857|class=cs.IT}}<br />
<br />
===信息论教材===<br />
<br />
* Arndt, C. ''Information Measures, Information and its Description in Science and Engineering'' (Springer Series: Signals and Communication Technology), 2004,<br />
<br />
* Ash, RB. ''Information Theory''. New York: Interscience, 1965. New York: Dover 1990. <br />
<br />
* Gallager, R. ''Information Theory and Reliable Communication.'' New York: John Wiley and Sons, 1968. <br />
<br />
* Goldman, S. ''Information Theory''. New York: Prentice Hall, 1953. New York: Dover 1968, 2005.<br />
<br />
* {{cite book |last1=Cover |first1=Thomas |author-link1=Thomas M. Cover |last2=Thomas |first2=Joy A. |title=Elements of information theory |edition=2nd |location=New York |publisher=[[Wiley-Interscience]] |date=2006}}<br />
<br />
* Csiszar, I, Korner, J. ''Information Theory: Coding Theorems for Discrete Memoryless Systems'' Akademiai Kiado: 2nd edition, 1997. <br />
<br />
* David J. C. MacKay|MacKay, David J. C.. ''[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]'' Cambridge: Cambridge University Press, 2003. <br />
<br />
* Mansuripur, M. ''Introduction to Information Theory''. New York: Prentice Hall, 1987. <br />
<br />
* Robert McEliece|McEliece, R. ''The Theory of Information and Coding". Cambridge, 2002. <br />
<br />
*Pierce, JR. "An introduction to information theory: symbols, signals and noise". Dover (2nd Edition). 1961 (reprinted by Dover 1980).<br />
<br />
* Reza, F. ''An Introduction to Information Theory''. New York: McGraw-Hill 1961. New York: Dover 1994. <br />
<br />
* {{cite book |last1=Shannon |first1=Claude |author-link1=Claude Shannon |last2=Weaver |first2=Warren |author-link2=Warren Weaver |date=1949 |title=The Mathematical Theory of Communication |url=http://monoskop.org/images/b/be/Shannon_Claude_E_Weaver_Warren_The_Mathematical_Theory_of_Communication_1963.pdf |location=[[Urbana, Illinois]] |publisher=[[University of Illinois Press]] |lccn=49-11922 }}<br />
<br />
* Stone, JV. Chapter 1 of book [http://jim-stone.staff.shef.ac.uk/BookInfoTheory/InfoTheoryBookMain.html "Information Theory: A Tutorial Introduction"], University of Sheffield, England, 2014.<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book/ A First Course in Information Theory]'' Kluwer Academic/Plenum Publishers, 2002.<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book2/ Information Theory and Network Coding]'' Springer 2008, 2002.<br />
<br />
===其他书籍===<br />
<br />
* Leon Brillouin, ''Science and Information Theory'', Mineola, N.Y.: Dover, [1956, 1962] 2004.<br />
<br />
* James Gleick, ''The Information: A History, a Theory, a Flood'', New York: Pantheon, 2011.<br />
<br />
* A. I. Khinchin, ''Mathematical Foundations of Information Theory'', New York: Dover, 1957.<br />
<br />
* H. S. Leff and A. F. Rex, Editors, ''Maxwell's Demon: Entropy, Information, Computing'', Princeton University Press, Princeton, New Jersey (1990). <br />
<br />
* Robert K. Logan. ''What is Information? - Propagating Organization in the Biosphere, the Symbolosphere, the Technosphere and the Econosphere'', Toronto: DEMO Publishing.<br />
<br />
* Tom Siegfried, ''The Bit and the Pendulum'', Wiley, 2000.<br />
<br />
* Charles Seife, ''Decoding the Universe'', Viking, 2006.<br />
<br />
* Jeremy Campbell, ''Grammatical Man'', Touchstone/Simon & Schuster, 1982, <br />
<br />
* Henri Theil, ''Economics and Information Theory'', Rand McNally & Company - Chicago, 1967.<br />
<br />
* Escolano, Suau, Bonev, ''[https://www.springer.com/computer/image+processing/book/978-1-84882-296-2 Information Theory in Computer Vision and Pattern Recognition]'', Springer, 2009. <br />
<br />
* Vlatko Vedral, ''Decoding Reality: The Universe as Quantum Information'', Oxford University Press 2010.<br />
<br />
===信息论大型开放式课程===<br />
<br />
* Raymond W. Yeung, "[http://www.inc.cuhk.edu.hk/InformationTheory/index.html Information Theory]" (The Chinese University of Hong Kong)<br />
<br />
==外部链接==<br />
<br />
* Lambert F. L. (1999), "[http://jchemed.chem.wisc.edu/Journal/Issues/1999/Oct/abs1385.html Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!]", ''Journal of Chemical Education''<br />
<br />
* [http://www.itsoc.org/ IEEE Information Theory Society] and [https://www.itsoc.org/resources/surveys ITSOC Monographs, Surveys, and Reviews]<br />
<br />
==编者推荐==<br />
[[File:Last1.png|400px|thumb|right|[https://swarma.org/?p=13364 用神经学习模型计算海量实际网络中的节点中心性度量 | 论文速递1篇|集智俱乐部]]]<br />
===集智文章推荐===<br />
====[https://swarma.org/?p=21423 计算美学前沿速递:用信息论“重新发现”风景画艺术史]====<br />
美术研究中的一个核心问题是,不同年代和流派的绘画作品,在组织架构上,是否有着相似之处?2020年10月发表在美国国家科学院院刊PNAS的论文中,研究者通过信息论和网络分析,对来自61个国家,1476名画家总计14912幅西方风景画的研究,证实了该假说。<br />
<br/><br/><br />
<br />
====[https://swarma.org/?p=20253 Science前沿:用信息论解释动植物间的军备竞赛]====<br />
在植物与植食性昆虫组成的生态系统中,不同物种在相互作用的过程中, 彼此适应,形成了一个相互影响的协同适应系统。近期Sicence的一项研究从气味信息的角度,讨论动植物协同进化中的军备竞赛,对研究生态网络内部的交流机制很有启发。同期的一篇相关评论文章对该话题进行了全新解答,本文是该评论文章的编译。<br />
<br />
<br />
<br/><br />
----<br />
本中文词条由[[用户:Pjhhh|Pjhhh]][[用户:Moonscar|Moonscar]]参与编译, [[用户:Qige96|Ricky]] 审校,[[用户:不是海绵宝宝|不是海绵宝宝]]、[[用户:唐糖糖|唐糖糖]]编辑,欢迎在讨论页面留言。<br />
<br />
'''本词条内容源自wikipedia及公开资料,遵守 CC3.0协议。'''<br />
[[分类: 信息时代]] [[分类: 正规科学]] [[分类: 控制论]] [[分类: 计算机科学]]</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E7%94%A8%E6%88%B7:Pjhhh&diff=18150用户:Pjhhh2020-11-12T06:57:31Z<p>Pjhhh:/* Hi,我是瑾晗 */</p>
<hr />
<div>== '''Hi,我是Pjhhh''' ==<br />
<br />
*'''性别:'''男<br />
*'''当前就读:'''中国民航大学空中交通管理学院研究生在读,本科也曾就读于中国民航大学空中交通管理学院;<br />
*'''主要研究内容:'''空中交通流量管理、飞行区调度规划、交通运输网络相关内容、交通复杂网络、航空网络弹性等;<br />
*'''兴趣与爱好:'''长跑、骑行、爬山;喜欢在一个陌生的地方漫无目的闲逛;做一些自己没尝试过的菜;<br />
*'''联系方式:'''mail:2019031013@cauc.edu.cn</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E4%BF%A1%E6%81%AF%E8%AE%BA_Information_theory&diff=13528信息论 Information theory2020-09-01T07:27:12Z<p>Pjhhh:</p>
<hr />
<div>{{distinguish|Information science}}<br />
<br />
<br />
<br />
{{Information theory}}<br />
<br />
<br />
<br />
'''Information theory''' studies the [[quantification (science)|quantification]], [[computer data storage|storage]], and [[telecommunication|communication]] of [[information]]. It was originally proposed by [[Claude Shannon]] in 1948 to find fundamental limits on [[signal processing]] and communication operations such as [[data compression]], in a landmark paper titled "[[A Mathematical Theory of Communication]]". Its impact has been crucial to the success of the [[Voyager program|Voyager]] missions to deep space, the invention of the [[compact disc]], the feasibility of mobile phones, the development of the Internet, the study of [[linguistics]] and of human perception, the understanding of [[black hole]]s, and numerous other fields.<br />
<br />
Information theory studies the quantification, storage, and communication of information. It was originally proposed by Claude Shannon in 1948 to find fundamental limits on signal processing and communication operations such as data compression, in a landmark paper titled "A Mathematical Theory of Communication". Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields.<br />
<br />
'''信息论'''研究的是信息的量化、存储与传播。信息论最初是由[[Claude Shannon]]在1948年的一篇题为"[[A Mathematical Theory of Communication]]"的论文中提出的,其目的是找到信号处理和通信操作(如数据压缩)的基本限制。信息论对于旅行者号深空探测任务的成功、光盘的发明、移动电话的可行性、互联网的发展、语言学和人类感知的研究、对黑洞的理解以及许多其他领域的研究都是至关重要的。<br />
<br />
<br />
<br />
<br />
<br />
The field is at the intersection of mathematics, [[statistics]], computer science, physics, [[Neuroscience|neurobiology]], [[information engineering (field)|information engineering]], and electrical engineering. The theory has also found applications in other areas, including [[statistical inference]], [[natural language processing]], [[cryptography]], [[neurobiology]],<ref name="Spikes">{{cite book|title=Spikes: Exploring the Neural Code|author1=F. Rieke|author2=D. Warland|author3=R Ruyter van Steveninck|author4=W Bialek|publisher=The MIT press|year=1997|isbn=978-0262681087}}</ref> [[human vision]],<ref>{{Cite journal|last=Delgado-Bonal|first=Alfonso|last2=Martín-Torres|first2=Javier|date=2016-11-03|title=Human vision is determined based on information theory|journal=Scientific Reports|language=En|volume=6|issue=1|pages=36038|bibcode=2016NatSR...636038D|doi=10.1038/srep36038|issn=2045-2322|pmc=5093619|pmid=27808236}}</ref> the evolution<ref>{{cite journal|last1=cf|last2=Huelsenbeck|first2=J. P.|last3=Ronquist|first3=F.|last4=Nielsen|first4=R.|last5=Bollback|first5=J. P.|year=2001|title=Bayesian inference of phylogeny and its impact on evolutionary biology|url=|journal=Science|volume=294|issue=5550|pages=2310–2314|bibcode=2001Sci...294.2310H|doi=10.1126/science.1065889|pmid=11743192}}</ref> and function<ref>{{cite journal|last1=Allikmets|first1=Rando|last2=Wasserman|first2=Wyeth W.|last3=Hutchinson|first3=Amy|last4=Smallwood|first4=Philip|last5=Nathans|first5=Jeremy|last6=Rogan|first6=Peter K.|year=1998|title=Thomas D. Schneider], Michael Dean (1998) Organization of the ABCR gene: analysis of promoter and splice junction sequences|url=http://alum.mit.edu/www/toms/|journal=Gene|volume=215|issue=1|pages=111–122|doi=10.1016/s0378-1119(98)00269-8|pmid=9666097}}</ref> of molecular codes ([[bioinformatics]]), [[model selection]] in statistics,<ref>Burnham, K. P. and Anderson D. R. (2002) ''Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition'' (Springer Science, New York) {{ISBN|978-0-387-95364-9}}.</ref> [[thermal physics]],<ref>{{cite journal|last1=Jaynes|first1=E. T.|year=1957|title=Information Theory and Statistical Mechanics|url=http://bayes.wustl.edu/|journal=Phys. Rev.|volume=106|issue=4|page=620|bibcode=1957PhRv..106..620J|doi=10.1103/physrev.106.620}}</ref> [[quantum computing]], linguistics, [[plagiarism detection]],<ref>{{cite journal|last1=Bennett|first1=Charles H.|last2=Li|first2=Ming|last3=Ma|first3=Bin|year=2003|title=Chain Letters and Evolutionary Histories|url=http://sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|journal=Scientific American|volume=288|issue=6|pages=76–81|bibcode=2003SciAm.288f..76B|doi=10.1038/scientificamerican0603-76|pmid=12764940|access-date=2008-03-11|archive-url=https://web.archive.org/web/20071007041539/http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|archive-date=2007-10-07|url-status=dead}}</ref> [[pattern recognition]], and [[anomaly detection]].<ref>{{Cite web|url=http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|title=Some background on why people in the empirical sciences may want to better understand the information-theoretic methods|author=David R. Anderson|date=November 1, 2003|archiveurl=https://web.archive.org/web/20110723045720/http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|archivedate=July 23, 2011|url-status=dead|accessdate=2010-06-23}}<br />
<br />
<br />
该领域是数学、统计学、计算机科学、物理学、神经生物学、信息工程和电气工程的交叉学科。这一理论也在其他领域得到了应用,比如推论统计学、自然语言处理、密码学、神经生物学、人类视觉、分子编码的进化和功能(生物信息学)、统计学中的模型选择、热物理学、量子计算、语言学、剽窃检测、模式识别和异常检测。 <br />
<br />
Important sub-fields of information theory include [[source coding]], [[algorithmic complexity theory]], [[algorithmic information theory]], [[information-theoretic security]], [[Grey system theory]] and measures of information.<br />
<br />
<br />
信息论的重要分支包括信源编码、算法复杂性理论、算法信息论、信息理论安全性、灰色系统理论和信息度量。<br />
<br />
<br />
<br />
<br />
<br />
Applications of fundamental topics of information theory include [[lossless data compression]] (e.g. [[ZIP (file format)|ZIP files]]), [[lossy data compression]] (e.g. [[MP3]]s and [[JPEG]]s), and [[channel capacity|channel coding]] (e.g. for [[digital subscriber line|DSL]]). Information theory is used in [[information retrieval]], [[intelligence (information gathering)|intelligence gathering]], gambling, and even in musical composition.<br />
<br />
<br />
信息论基本课题的应用包括无损数据压缩(例如:ZIP压缩文件)、有损数据压缩(例如:MP3和JPEG格式),以及信道编码(例如:DSL)。信息论还应用于信息检索、情报收集、赌博,甚至音乐创作。<br />
<br />
<br />
<br />
<br />
<br />
A key measure in information theory is [[information entropy|entropy]]. Entropy quantifies the amount of uncertainty involved in the value of a [[random variable]] or the outcome of a [[random process]]. For example, identifying the outcome of a fair [[coin flip]] (with two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a [[dice|die]] (with six equally likely outcomes). Some other important measures in information theory are [[mutual information]], channel capacity, [[error exponent]]s, and [[relative entropy]].<br />
<br />
<br />
信息论中的一个关键度量是熵。熵量化了一个随机变量的值或者一个随机过程的结果所包含的不确定性。例如,确定一次公平抛硬币的结果(有两个同样可能的结果)所提供的信息(较低的熵)少于确定一次掷骰子的结果(有六个同样可能的结果)。信息论中的其他一些重要度量有:互信息、信道容量、误差指数和相对熵。<br />
<br />
<br />
<br />
<br />
<br />
==Overview==<br />
<br />
<br />
概览<br />
<br />
<br />
<br />
<br />
<br />
Information theory studies the transmission, processing, extraction, and utilization of information. Abstractly, information can be thought of as the resolution of uncertainty. In the case of communication of information over a noisy channel, this abstract concept was made concrete in 1948 by Claude Shannon in his paper "A Mathematical Theory of Communication", in which "information" is thought of as a set of possible messages, where the goal is to send these messages over a noisy channel, and then to have the receiver reconstruct the message with low probability of error, in spite of the channel noise. Shannon's main result, the [[noisy-channel coding theorem]], showed that, in the limit of many channel uses, the rate of information that is asymptotically achievable is equal to the channel capacity, a quantity dependent merely on the statistics of the channel over which the messages are sent.<ref name="Spikes" /><br />
<br />
<br />
信息论主要研究信息的传输、处理、提取和利用。抽象地说,信息可以看作是对不确定性的消解。1948年,Claude Shannon在他的论文"[[A Mathematical Theory of Communication]]"中将这个抽象概念具体化:在通过噪声信道传递信息的情形下,“信息”被看作一组可能的消息,目标是通过噪声信道发送这些消息,并让接收者在信道噪声的影响下仍能以较低的错误概率重构消息。Shannon的主要结果,即有噪信道编码定理,表明在大量使用信道的极限情形下,渐近可达的信息传输速率等于信道容量,而信道容量仅取决于消息所经过信道本身的统计特性。<br />
<br />
<br />
<br />
<br />
Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of [[Rubric (academic)|rubrics]] throughout the world over the past half century or more: [[adaptive system]]s, [[anticipatory system]]s, [[artificial intelligence]], [[complex system]]s, [[complexity science]], [[cybernetics]], [[Informatics (academic field)|informatics]], [[machine learning]], along with [[systems science]]s of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of [[coding theory]].<br />
<br />
<br />
信息论与一系列纯粹学科和应用学科密切相关。在过去半个多世纪里,这些学科在世界各地以各种名目得到研究并落实为工程实践,例如自适应系统、预期系统、人工智能、复杂系统、复杂性科学、控制论、信息学、机器学习,以及各种各样的系统科学。信息论是一个既广又深的数学理论,其应用同样既广又深,其中至关重要的领域是编码理论。<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is concerned with finding explicit methods, called ''codes'', for increasing the efficiency and reducing the error rate of data communication over noisy channels to near the channel capacity. These codes can be roughly subdivided into data compression (source coding) and [[error-correction]] (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible.<br />
<br />
<br />
编码理论关注寻找称为“编码”的显式方法,以提高效率并降低在噪声信道上传输数据的错误率,使传输速率逼近信道容量。这些编码可大致分为数据压缩(信源编码)和纠错(信道编码)技术。对于后者,人们花了很多年才找到Shannon的工作所证明可能存在的那些方法。<br />
<br />
<br />
<br />
<br />
<br />
A third class of information theory codes are cryptographic algorithms (both [[code (cryptography)|code]]s and [[cipher]]s). Concepts, methods and results from coding theory and information theory are widely used in cryptography and [[cryptanalysis]]. ''See the article [[ban (unit)]] for a historical application.''<br />
<br />
<br />
第三类信息论编码是密码算法(包括编码和密码)。编码理论和信息论的概念、方法和结果在密码学和密码分析中得到了广泛应用。有关历史上的一个应用,请参阅条目[[ban (unit)]](信息量单位“班”)。<br />
<br />
<br />
<br />
<br />
==Historical background==<br />
<br />
<br />
历史背景<br />
<br />
{{Main|History of information theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The landmark event that ''established'' the discipline of information theory and brought it to immediate worldwide attention was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the ''[[Bell System Technical Journal]]'' in July and October 1948.<br />
<br />
<br />
1948年7月和10月,Claude Shannon在''[[Bell System Technical Journal]]''上发表了经典论文:"A Mathematical Theory of Communication",这是建立信息论学科并立即引起全世界关注的里程碑事件。<br />
<br />
<br />
<br />
<br />
<br />
Prior to this paper, limited information-theoretic ideas had been developed at [[Bell Labs]], all implicitly assuming events of equal probability. [[Harry Nyquist]]'s 1924 paper, ''Certain Factors Affecting Telegraph Speed'', contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation {{math|1=''W'' = ''K'' log ''m''}} (recalling [[Boltzmann's constant]]), where ''W'' is the speed of transmission of intelligence, ''m'' is the number of different voltage levels to choose from at each time step, and ''K'' is a constant. [[Ralph Hartley]]'s 1928 paper, ''Transmission of Information'', uses the word ''information'' as a measurable quantity, reflecting the receiver's ability to distinguish one [[sequence of symbols]] from any other, thus quantifying information as {{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}}, where ''S'' was the number of possible symbols, and ''n'' the number of symbols in a transmission. The unit of information was therefore the [[decimal digit]], which has since sometimes been called the [[Hartley (unit)|hartley]] in his honor as a unit or scale or measure of information. [[Alan Turing]] in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war [[Cryptanalysis of the Enigma|Enigma]] ciphers.<br />
<br />
<br />
在这篇论文之前,贝尔实验室已经发展出一些有限的信息论思想,它们都隐含地假设各事件概率相等。Harry Nyquist在1924年的论文“Certain Factors Affecting Telegraph Speed”中有一个理论部分,量化了“情报”(intelligence)以及通信系统传输它的“线路速度”,给出了关系式{{math|1=''W'' = ''K'' log ''m''}}(令人联想到玻尔兹曼常数),其中''W''是情报传输的速度,''m''是每个时间步可供选择的不同电压电平数,''K''是常数。Ralph Hartley在1928年的论文“Transmission of Information”中把“信息”一词用作可测量的量,反映接收者区分一个符号序列与其他任何序列的能力,从而将信息量化为{{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}},其中''S''是可能符号的数目,''n''是一次传输中的符号数。因此信息的单位是十进制位;为了纪念他,这一信息单位、尺度或度量后来有时被称为哈特利(hartley)。1940年,Alan Turing在破解德国二战恩尼格玛(Enigma)密码的统计分析中使用了类似的思想。<br />
<br />
Much of the mathematics behind information theory with events of different probabilities were developed for the field of [[thermodynamics]] by [[Ludwig Boltzmann]] and [[J. Willard Gibbs]]. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by [[Rolf Landauer]] in the 1960s, are explored in ''[[Entropy in thermodynamics and information theory]]''.<br />
<br />
<br />
信息论中涉及不同概率事件的许多数学工具,最初是由Ludwig Boltzmann和J. Willard Gibbs为热力学领域发展起来的。信息论的熵与热力学的熵之间的联系,包括Rolf Landauer在20世纪60年代的重要贡献,在“Entropy in thermodynamics and information theory”(热力学与信息论中的熵)中有专门探讨。<br />
<br />
<br />
<br />
<br />
In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that<br />
<br />
<br />
Shannon这篇革命性、开创性论文的相关工作在1944年底之前已在贝尔实验室基本完成。论文首次提出了将通信视为统计过程的定性与定量模型,奠定了信息论的基础,并开篇断言:<br />
:"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
“通信的基本问题是在一点上精确地或近似地再现在另一点上所选择的消息。”<br />
<br />
<br />
<br />
<br />
<br />
With it came the ideas of<br />
<br />
<br />
随之而来的观点包括:<br />
<br />
* the information entropy and [[redundancy (information theory)|redundancy]] of a source, and its relevance through the [[source coding theorem]];<br />
<br />
<br />
* the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;<br />
<br />
<br />
* the practical result of the [[Shannon–Hartley law]] for the channel capacity of a [[Gaussian channel]]; as well as<br />
<br />
<br />
<br />
* the [[bit]]—a new way of seeing the most fundamental unit of information.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Quantities of information==<br />
<br />
<br />
信息的度量<br />
<br />
{{Main|Quantities of information}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Information theory is based on [[probability theory]] and statistics. Information theory often concerns itself with measures of information of the distributions associated with random variables. Important quantities of information are entropy, a measure of information in a single random variable, and mutual information, a measure of information in common between two random variables. The former quantity is a property of the probability distribution of a random variable and gives a limit on the rate at which data generated by independent samples with the given distribution can be reliably compressed. The latter is a property of the joint distribution of two random variables, and is the maximum rate of reliable communication across a noisy [[Communication channel|channel]] in the limit of long block lengths, when the channel statistics are determined by the joint distribution.<br />
<br />
<br />
信息论基于概率论和统计学,经常关注与随机变量相关的分布的信息度量。重要的信息量包括:熵(单个随机变量所含信息的度量)和互信息(两个随机变量之间共有信息的度量)。前者是随机变量概率分布的属性,给出了由服从该分布的独立样本所生成数据可被可靠压缩的速率极限;后者是两个随机变量联合分布的属性,当信道统计特性由该联合分布确定时,它是长块长度极限下通过噪声信道进行可靠通信的最大速率。<br />
<br />
<br />
<br />
<br />
<br />
The choice of logarithmic base in the following formulae determines the [[units of measurement|unit]] of information entropy that is used. A common unit of information is the bit, based on the [[binary logarithm]]. Other units include the [[nat (unit)|nat]], which is based on the [[natural logarithm]], and the [[deciban|decimal digit]], which is based on the [[common logarithm]].<br />
<br />
<br />
在下列公式中,对数底数的选择决定了信息熵的单位。信息的常见单位是比特(基于二进制对数)。其他单位包括nat(自然对数)和十进制数字(常用对数)。<br />
<br />
<br />
<br />
<br />
In what follows, an expression of the form {{math|''p'' log ''p''}} is considered by convention to be equal to zero whenever {{math|1=''p'' = 0}}. This is justified because <math>\lim_{p \rightarrow 0+} p \log p = 0</math> for any logarithmic base.<br />
<br />
<br />
下文中,按照惯例,当{{math|1=''p'' = 0}}时,形如{{math|''p'' log ''p''}}的表达式视为等于零。这是合理的,因为对于任何对数底都有<math>\lim_{p \rightarrow 0+} p \log p = 0</math>。<br />
<br />
<br />
<br />
<br />
<br />
===Entropy of an information source===<br />
<br />
<br />
信源的熵<br />
<br />
Based on the [[probability mass function]] of each source symbol to be communicated, the Shannon [[Entropy (information theory)|entropy]] {{math|''H''}}, in units of bits (per symbol), is given by<br />
<br />
:<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
基于每个待传输源符号的概率质量函数,香农熵{{math|''H''}}(以比特每符号为单位)由下式给出:<br />
:<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
<br />
where {{math|''p<sub>i</sub>''}} is the probability of occurrence of the {{math|''i''}}-th possible value of the source symbol. This equation gives the entropy in the units of "bits" (per symbol) because it uses a logarithm of base 2, and this base-2 measure of entropy has sometimes been called the [[Shannon (unit)|shannon]] in his honor. Entropy is also commonly computed using the natural logarithm (base [[E (mathematical constant)|{{mvar|e}}]], where {{mvar|e}} is Euler's number), which produces a measurement of entropy in nats per symbol and sometimes simplifies the analysis by avoiding the need to include extra constants in the formulas. Other bases are also possible, but less commonly used. For example, a logarithm of base {{nowrap|1=2<sup>8</sup> = 256}} will produce a measurement in [[byte]]s per symbol, and a logarithm of base 10 will produce a measurement in decimal digits (or hartleys) per symbol.<br />
<br />
<br />
其中{{math|''p<sub>i</sub>''}}是信源符号第{{math|''i''}}个可能取值出现的概率。该式使用以2为底的对数,因此给出以“比特”(每符号)为单位的熵;这种以2为底的熵度量有时被称为香农(shannon)以纪念他。熵也常用自然对数(以{{mvar|e}}为底,{{mvar|e}}是欧拉数)计算,此时熵以nat每符号为单位,有时可避免在公式中引入额外常数,从而简化分析。其他底数也可行,但较少使用:例如以{{nowrap|1=2<sup>8</sup> = 256}}为底的对数给出以字节每符号为单位的度量,以10为底的对数给出以十进制位(或哈特利)每符号为单位的度量。<br />
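<br />
As a quick illustration of the formula above (a minimal Python sketch, not part of the original article; the distributions are assumed toy examples), the entropy of a fair coin and a fair die can be computed directly:<br />
<syntaxhighlight lang="python">
import math

def entropy(probs, base=2):
    """Shannon entropy H = -sum_i p_i * log(p_i); zero-probability terms are skipped."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))           # fair coin: 1.0 bit
print(entropy([1 / 6] * 6))          # fair die: ~2.585 bits
print(entropy([1 / 6] * 6, math.e))  # the same source measured in nats: ~1.792
</syntaxhighlight>
<br />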
<br />
<br />
Intuitively, the entropy {{math|''H<sub>X</sub>''}} of a discrete random variable {{math|''X''}} is a measure of the amount of ''uncertainty'' associated with the value of {{math|''X''}} when only its distribution is known.<br />
<br />
<br />
直观地看,离散随机变量{{math|''X''}}的熵{{math|''H<sub>X</sub>''}}度量的是在仅已知其分布时,{{math|''X''}}的取值所带有的“不确定性”的大小。<br />
<br />
<br />
<br />
<br />
<br />
The entropy of a source that emits a sequence of {{math|''N''}} symbols that are [[independent and identically distributed]] (iid) is {{math|''N'' ⋅ ''H''}} bits (per message of {{math|''N''}} symbols). If the source data symbols are identically distributed but not independent, the entropy of a message of length {{math|''N''}} will be less than {{math|''N'' ⋅ ''H''}}.<br />
<br />
<br />
发出{{math|''N''}}个[[独立同分布]](iid)符号序列的信源,其熵为{{math|''N'' ⋅ ''H''}}比特(每条{{math|''N''}}个符号的消息)。如果源数据符号同分布但不独立,则长度为{{math|''N''}}的消息的熵将小于{{math|''N'' ⋅ ''H''}}。<br />
<br />
<br />
<br />
<br />
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px|The entropy of a [[Bernoulli trial]] as a function of success probability, often called the {{em|[[binary entropy function]]}}, {{math|''H''<sub>b</sub>(''p'')}}. The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.]]<br />
<br />
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px|以[[Bernoulli trial]]的熵作为成功概率的函数,通常称作{{em|[[binary entropy function]]}}, {{math|''H''<sub>b</sub>(''p'')}}。当两个可能的结果发生概率相等时(例如投掷无偏硬币),每次试验的熵最大为1比特。]]<br />
<br />
<br />
<br />
<br />
<br />
If one transmits 1000 bits (0s and 1s), and the value of each of these bits is known to the receiver (has a specific value with certainty) ahead of transmission, it is clear that no information is transmitted. If, however, each bit is independently equally likely to be 0 or 1, 1000 shannons of information (more often called bits) have been transmitted. Between these two extremes, information can be quantified as follows. If 𝕏 is the set of all messages {{math|{{mset|''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}}} that {{math|''X''}} could be, and {{math|''p''(''x'')}} is the probability of some <math>x \in \mathbb X</math>, then the entropy, {{math|''H''}}, of {{math|''X''}} is defined:<ref name = Reza>{{cite book | title = An Introduction to Information Theory | author = Fazlollah M. Reza | publisher = Dover Publications, Inc., New York | origyear = 1961| year = 1994 | isbn = 0-486-68210-2 | url = https://books.google.com/books?id=RtzpRAiX6OgC&pg=PA8&dq=intitle:%22An+Introduction+to+Information+Theory%22++%22entropy+of+a+simple+source%22}}</ref><br />
<br />
<br />
如果发送1000比特(0和1),而接收者在传输之前就已确知其中每一比特的值,那么显然没有传输任何信息。但如果每个比特独立且等可能地取0或1,则传输了1000香农的信息(更常称为比特)。在这两个极端之间,信息可以如下量化:如果𝕏是{{math|''X''}}所有可能消息{{math|{{mset|''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}}}的集合,{{math|''p''(''x'')}}是某个<math>x \in \mathbb X</math>出现的概率,那么{{math|''X''}}的熵{{math|''H''}}定义为:<ref name = Reza /><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
(Here, {{math|''I''(''x'')}} is the [[self-information]], which is the entropy contribution of an individual message, and {{math|𝔼<sub>''X''</sub>}} is the [[expected value]].) A property of entropy is that it is maximized when all the messages in the message space are equiprobable {{math|1=''p''(''x'') = 1/''n''}}; i.e., most unpredictable, in which case {{math|1=''H''(''X'') = log ''n''}}.<br />
<br />
<br />
(其中{{math|''I''(''x'')}}是[[自信息]],即单条消息的熵贡献;{{math|𝔼<sub>''X''</sub>}}是期望值。)熵的一个性质是:当消息空间中的所有消息等概率出现,即{{math|1=''p''(''x'') = 1/''n''}}(此时最不可预测)时,熵取得最大值{{math|1=''H''(''X'') = log ''n''}}。<br />
<br />
<br />
<br />
<br />
<br />
The special case of information entropy for a random variable with two outcomes is the binary entropy function, usually taken to the logarithmic base 2, thus having the shannon (Sh) as unit:<br />
<br />
<br />
只有两种结果的随机变量的信息熵是一个特例,即二元熵函数(通常取以2为底的对数,因而以香农(Sh)为单位):<br />
<br />
<br />
<br />
<br />
<br />
:<math>H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p).</math><br />
<br />
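<br />
A small Python sketch of the binary entropy function defined above (illustrative only); it shows the maximum of 1 bit at p = 0.5, matching the coin-toss figure:<br />
<syntaxhighlight lang="python">
import math

def binary_entropy(p):
    """H_b(p) = -p log2 p - (1 - p) log2 (1 - p), with H_b(0) = H_b(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.1, 0.5, 0.9):
    print(p, round(binary_entropy(p), 4))  # peaks at 1 bit when p = 0.5
</syntaxhighlight>
<br />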
<br />
<br />
<br />
<br />
<br />
===Joint entropy===<br />
<br />
<br />
联合熵<br />
<br />
The {{em|[[joint entropy]]}} of two discrete random variables {{math|''X''}} and {{math|''Y''}} is merely the entropy of their pairing: {{math|(''X'', ''Y'')}}. This implies that if {{math|''X''}} and {{math|''Y''}} are [[statistical independence|independent]], then their joint entropy is the sum of their individual entropies.<br />
<br />
<br />
两个离散随机变量{{math|''X''}}和{{math|''Y''}}的联合熵就是二元组{{math|(''X'', ''Y'')}}的熵。这意味着,若{{math|''X''}}和{{math|''Y''}}相互独立,则它们的联合熵是各自熵之和。<br />
<br />
<br />
<br />
<br />
<br />
For example, if {{math|(''X'', ''Y'')}} represents the position of a chess piece — {{math|''X''}} the row and {{math|''Y''}} the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.<br />
<br />
<br />
例如,若{{math|(''X'', ''Y'')}}表示一枚棋子的位置({{math|''X''}}表示行,{{math|''Y''}}表示列),那么该棋子所在行与所在列的联合熵就是棋子位置的熵。<br />
<br />
<br />
<br />
<br />
<br />
:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y) \,</math><br />
<br />
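<br />
The following illustrative Python snippet (the joint distribution is an assumed toy example, not from the article) computes H(X, Y) from the formula above and compares it with H(X) + H(Y), which it can never exceed, with equality exactly under independence:<br />
<syntaxhighlight lang="python">
import math

def H(probs):
    """Entropy of an iterable of probabilities, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed toy joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

p_x, p_y = {}, {}
for (x, y), v in joint.items():
    p_x[x] = p_x.get(x, 0.0) + v
    p_y[y] = p_y.get(y, 0.0) + v

print(H(joint.values()))                  # H(X, Y) ~ 1.722 bits
print(H(p_x.values()) + H(p_y.values()))  # H(X) + H(Y) = 2 bits >= H(X, Y)
</syntaxhighlight>
<br />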
<br />
<br />
<br />
<br />
<br />
Despite similar notation, joint entropy should not be confused with {{em|[[cross entropy]]}}.<br />
<br />
<br />
尽管记号相似,但不应将联合熵与交叉熵混淆。<br />
<br />
<br />
<br />
<br />
<br />
===Conditional entropy (equivocation)===<br />
<br />
<br />
条件熵(含糊度)<br />
<br />
The {{em|[[conditional entropy]]}} or ''conditional uncertainty'' of {{math|''X''}} given random variable {{math|''Y''}} (also called the ''equivocation'' of {{math|''X''}} about {{math|''Y''}}) is the average conditional entropy over {{math|''Y''}}:<ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref><br />
<br />
在给定随机变量{{math|''Y''}}的条件下,{{math|''X''}}的条件熵(或条件不确定性,也称{{math|''X''}}关于{{math|''Y''}}的含糊度)是对{{math|''Y''}}取平均的条件熵:<ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log p(x|y).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:<br />
<br />
<br />
由于熵能够以随机变量或该随机变量的某个值为条件,所以应注意不要混淆条件熵的这两个定义(前者更为常用)。该类条件熵的一个基本属性为:<br />
<br />
<br />
<br />
<br />
<br />
: <math> H(X|Y) = H(X,Y) - H(Y) .\,</math><br />
<br />
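<br />
A short Python check of this identity on an assumed toy joint distribution (illustrative, not the article's own example):<br />
<syntaxhighlight lang="python">
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # assumed toy p(x, y)

p_y = {}
for (x, y), v in joint.items():
    p_y[y] = p_y.get(y, 0.0) + v

# Direct definition: H(X|Y) = -sum_{x,y} p(x, y) log2 p(x|y), where p(x|y) = p(x, y) / p(y)
h_direct = -sum(v * math.log2(v / p_y[y]) for (x, y), v in joint.items() if v > 0)

# Identity: H(X|Y) = H(X, Y) - H(Y)
h_identity = H(joint.values()) - H(p_y.values())

print(h_direct, h_identity)  # both ~0.722 bits
</syntaxhighlight>
<br />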
<br />
<br />
<br />
<br />
<br />
===Mutual information (transinformation)===<br />
<br />
<br />
互信息(转移信息)<br />
<br />
''[[Mutual information]]'' measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of {{math|''X''}} relative to {{math|''Y''}} is given by:<br />
<br />
<br />
互信息度量的是通过观察一个随机变量而能获得的关于另一个随机变量的信息量。它在通信中十分重要,可用于最大化发送信号与接收信号之间共享的信息量。{{math|''X''}}相对于{{math|''Y''}}的互信息由以下公式给出:<br />
<br />
<br />
<br />
<br />
<br />
:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math><br />
<br />
<br />
where {{math|SI}} (''S''pecific mutual ''I''nformation) is the [[pointwise mutual information]].<br />
<br />
<br />
其中{{math|SI}}(Specific mutual Information)是逐点互信息(pointwise mutual information)。<br />
<br />
<br />
<br />
<br />
<br />
A basic property of the mutual information is that<br />
<br />
<br />
互信息的一个基本属性为:<br />
<br />
: <math>I(X;Y) = H(X) - H(X|Y).\,</math><br />
<br />
<br />
That is, knowing ''Y'', we can save an average of {{math|''I''(''X''; ''Y'')}} bits in encoding ''X'' compared to not knowing ''Y''.<br />
<br />
<br />
也就是说,与不知道''Y''相比,知道''Y''时对''X''进行编码平均可以节省{{math|''I''(''X''; ''Y'')}}比特。<br />
<br />
<br />
<br />
<br />
<br />
Mutual information is [[symmetric function|symmetric]]:<br />
<br />
<br />
互信息是对称的:<br />
<br />
: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y).\,</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the [[posterior probability|posterior probability distribution]] of ''X'' given the value of ''Y'' and the [[prior probability|prior distribution]] on ''X'':<br />
<br />
<br />
互信息可以表示为:给定''Y''的值时''X''的后验概率分布与''X''的先验分布之间的平均 Kullback–Leibler 散度(信息增益):<br />
<br />
: <math>I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )].</math><br />
<br />
<br />
In other words, this is a measure of how much, on the average, the probability distribution on ''X'' will change if we are given the value of ''Y''. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:<br />
<br />
<br />
换句话说,它度量的是:若我们得知''Y''的值,''X''上的概率分布平均会改变多少。它也常被重新表示为从边缘分布的乘积到实际联合分布的散度:<br />
<br />
: <math>I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y)).</math><br />
<br />
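<br />
The following Python sketch (with an assumed toy joint distribution) evaluates the defining sum for I(X;Y) and checks it against the identity I(X;Y) = H(X) + H(Y) − H(X,Y) given above:<br />
<syntaxhighlight lang="python">
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # assumed toy p(x, y)

p_x, p_y = {}, {}
for (x, y), v in joint.items():
    p_x[x] = p_x.get(x, 0.0) + v
    p_y[y] = p_y.get(y, 0.0) + v

# Defining sum: I(X;Y) = sum_{x,y} p(x, y) log2 [ p(x, y) / (p(x) p(y)) ]
i_def = sum(v * math.log2(v / (p_x[x] * p_y[y])) for (x, y), v in joint.items() if v > 0)

# Identity: I(X;Y) = H(X) + H(Y) - H(X, Y)
i_id = H(p_x.values()) + H(p_y.values()) - H(joint.values())

print(i_def, i_id)  # both ~0.278 bits
</syntaxhighlight>
<br />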
<br />
<br />
<br />
<br />
<br />
Mutual information is closely related to the [[likelihood-ratio test|log-likelihood ratio test]] in the context of contingency tables and the [[multinomial distribution]] and to [[Pearson's chi-squared test|Pearson's χ<sup>2</sup> test]]: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.<br />
<br />
<br />
在列联表和多项分布的背景下,互信息与对数似然比检验以及皮尔逊χ<sup>2</sup>检验密切相关:互信息可以视为评估一对变量之间独立性的统计量,并且具有明确的渐近分布。<br />
<br />
<br />
<br />
<br />
<br />
===Kullback–Leibler divergence (information gain)===<br />
<br />
<br />
Kullback-Leibler 散度(信息增益)<br />
<br />
The ''[[Kullback–Leibler divergence]]'' (or ''information divergence'', ''information gain'', or ''relative entropy'') is a way of comparing two distributions: a "true" [[probability distribution]] ''p(X)'', and an arbitrary probability distribution ''q(X)''. If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression. It is thus defined<br />
<br />
<br />
Kullback–Leibler 散度(又称信息散度、信息增益或相对熵)是比较两个分布的一种方式:一个“真实的”概率分布''p(X)''和一个任意的概率分布''q(X)''。如果我们按照''q(X)''是数据底层分布的假设来压缩数据,而实际上''p(X)''才是正确的分布,那么 Kullback–Leibler 散度就是压缩时每个数据平均所需的额外比特数。它由此定义为:<br />
<br />
<br />
<br />
<br />
<br />
:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)} = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Although it is sometimes used as a 'distance metric', KL divergence is not a true [[Metric (mathematics)|metric]] since it is not symmetric and does not satisfy the [[triangle inequality]] (making it a semi-quasimetric).<br />
<br />
<br />
尽管KL散度有时被用作一种“距离度量”,但它并不是真正的度量(metric),因为它不对称,也不满足三角不等式(因此是一种半拟度量)。<br />
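<br />
An illustrative Python snippet (the distributions are assumed examples) computing the definition above; the two orderings give different values, which is why KL divergence is not a metric:<br />
<syntaxhighlight lang="python">
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]  # "true" distribution (assumed example)
q = [0.9, 0.1]  # distribution assumed by the compressor

print(kl_divergence(p, q))  # ~0.737 extra bits per symbol
print(kl_divergence(q, p))  # ~0.531 bits: a different value, so KL is not symmetric
</syntaxhighlight>
<br />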
<br />
<br />
<br />
<br />
<br />
Another interpretation of the KL divergence is the "unnecessary surprise" introduced by a prior from the truth: suppose a number ''X'' is about to be drawn randomly from a discrete set with probability distribution ''p(x)''. If Alice knows the true distribution ''p(x)'', while Bob believes (has a [[prior probability|prior]]) that the distribution is ''q(x)'', then Bob will be more [[Information content|surprised]] than Alice, on average, upon seeing the value of ''X''. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the ''log'' is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.<br />
<br />
<br />
KL散度的另一种解释是由偏离事实的先验所带来的“不必要的惊讶”:假设要从一个概率分布为''p(x)''的离散集合中随机抽取一个数''X''。如果Alice知道真实分布''p(x)'',而Bob相信(持有先验)分布是''q(x)'',那么平均而言,Bob在看到''X''的值时会比Alice更惊讶。KL散度就是Bob的(主观)惊讶度减去Alice的惊讶度的(客观)期望值(若对数以2为底,则以比特计)。这样,Bob的先验“错误”的程度就可以用它预期使他产生多少“不必要的惊讶”来量化。<br />
<br />
<br />
<br />
===Other quantities===<br />
<br />
<br />
其他度量<br />
<br />
Other important information theoretic quantities include [[Rényi entropy]] (a generalization of entropy), [[differential entropy]] (a generalization of quantities of information to continuous distributions), and the [[conditional mutual information]].<br />
<br />
Other important information theoretic quantities include Rényi entropy (a generalization of entropy), differential entropy (a generalization of quantities of information to continuous distributions), and the conditional mutual information.<br />
<br />
信息论中其他重要的量包括Rényi熵(一种熵的推广),微分熵(信息量推广到连续分布),以及条件互信息。<br />
<br />
<br />
<br />
<br />
<br />
==Coding theory==<br />
<br />
<br />
编码理论<br />
<br />
{{Main|Coding theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|A picture showing scratches on the readable surface of a CD-R. Music and data CDs are coded using error correcting codes and thus can still be read even if they have minor scratches using [[error detection and correction]].]]<br />
<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|一张显示CD-R可读面上划痕的图片。音乐和数据CD使用纠错码进行编码,因此即使有轻微划痕,仍可借助检错与纠错将其读出。]]<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is one of the most important and direct applications of information theory. It can be subdivided into [[data compression|source coding]] theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.<br />
<br />
<br />
编码理论是信息论最重要、最直接的应用之一,可细分为信源编码理论和信道编码理论。信息论利用对数据的统计描述,量化描述这些数据所需的比特数,即信源的信息熵。<br />
<br />
<br />
<br />
<br />
* Data compression (source coding): There are two formulations for the compression problem (see the sketch after this list):<br />
*数据压缩(信源编码):压缩问题有两种表述(参见本列表之后的示意代码):<br />
<br />
<br />
*[[lossless data compression]]: the data must be reconstructed exactly;<br />
* [[无损数据压缩]]:数据必须被精确重构;<br />
<br />
<br />
*[[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called ''[[rate–distortion theory]]''.<br />
* [[有损数据压缩]]:在由失真函数度量的指定保真度水平内,分配重构数据所需的比特数。信息论的这一子领域称为率失真理论。<br />
<br />
<br />
* Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.<br />
*纠错码(信道编码):数据压缩尽可能多地消除冗余,而纠错码则恰好添加所需类型的冗余(即纠错能力),以便在噪声信道上高效而忠实地传输数据。<br />
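<br />
As a concrete illustration of source coding (a sketch with an assumed toy source, not the article's own example), a Huffman code for a dyadic distribution achieves an average code length exactly equal to the source entropy:<br />
<syntaxhighlight lang="python">
import heapq
import itertools
import math

def huffman_lengths(probs):
    """Code lengths of a Huffman code for the given symbol probabilities."""
    tie = itertools.count()  # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return lengths

probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}  # assumed toy source
lengths = huffman_lengths(probs)
avg_len = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * math.log2(p) for p in probs.values())
print(avg_len, entropy)  # 1.75 and 1.75: the code meets the entropy bound here
</syntaxhighlight>
<br />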
<br />
<br />
<br />
<br />
<br />
<br />
This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the [[broadcast channel]]) or intermediary "helpers" (the [[relay channel]]), or more general [[computer network|networks]], compression followed by transmission may no longer be optimal. [[Network information theory]] refers to these multi-agent communication models.<br />
<br />
<br />
信息传输定理(即信源-信道分离定理)证明了将编码理论划分为压缩与传输两部分的合理性;这些定理表明,在许多场合下可以把比特用作信息的“通用货币”。然而,这些定理只在一个发送用户希望与一个接收用户通信的情形下成立。在存在多个发送者(多址接入信道)、多个接收者(广播信道)、居间“帮助者”(中继信道)或更一般的网络的场景中,先压缩再传输可能不再是最优的。[[网络信息论]]指的就是这些多主体通信模型。<br />
<br />
<br />
<br />
===Source theory===<br />
<br />
<br />
源理论<br />
<br />
Any process that generates successive messages can be considered a {{em|[[Communication source|source]]}} of information. A memoryless source is one in which each message is an [[Independent identically distributed random variables|independent identically distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose less restrictive constraints. All such sources are [[stochastic process|stochastic]]. These terms are well studied in their own right outside information theory.<br />
<br />
<br />
生成连续消息的任何过程都可以视为信息的信源。无记忆信源是指每条消息都是独立同分布随机变量的信源,而遍历性和平稳性所施加的约束则没有那么严格。所有这类信源都是随机过程。这些术语在信息论之外本身也得到了深入研究。<br />
<br />
<br />
<br />
====Rate====<!-- This section is linked from [[Channel capacity]] --><br />
<br />
<br />
====速率====<br />
Information ''[[Entropy rate|rate]]'' is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is<br />
<br />
<br />
信息的熵率是每个符号的平均熵。对于无记忆信源,信息的熵率仅表示每个符号的熵,而在平稳随机过程中,它是:<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the ''average rate'' is<br />
<br />
<br />
也就是在给定之前生成的所有符号的条件下,当前符号的条件熵。对于过程不一定平稳这一更一般的情形,平均速率为:<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
<br />
<br />
也就是每符号联合熵的极限。对于平稳信源,这两个表达式给出相同的结果。<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
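<br />
An illustrative Python sketch (the two-state Markov chain is an assumed example): for a stationary Markov source, the conditional-entropy formula for the rate reduces to the stationary-weighted entropy of the next symbol given the current state:<br />
<syntaxhighlight lang="python">
import math

# Assumed two-state stationary Markov source; P[i][j] = p(next state j | current state i).
P = [[0.9, 0.1],
     [0.5, 0.5]]

# The stationary distribution pi solves pi = pi P; for two states it has a closed form.
pi0 = P[1][0] / (P[0][1] + P[1][0])
pi = [pi0, 1.0 - pi0]

def row_entropy(row):
    return -sum(p * math.log2(p) for p in row if p > 0)

# Entropy rate r = sum_i pi_i * H(next symbol | state i).
rate = sum(pi[i] * row_entropy(P[i]) for i in range(len(P)))
print(rate)  # ~0.558 bits per symbol
</syntaxhighlight>
<br />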
<br />
<br />
<br />
<br />
<br />
<br />
It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of {{em|source coding}}.<br />
<br />
<br />
在信息论中,谈论一种语言的“速率”或“熵”是很常见的,例如当信源是英文散文时就很合适。信源的速率与其冗余度以及可被压缩的程度有关,这正是信源编码所研究的主题。<br />
<br />
<br />
<br />
<br />
<br />
===Channel capacity===<br />
<br />
<br />
信道容量<br />
<br />
{{Main|Channel capacity}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Communications over a channel—such as an [[ethernet]] cable—is the primary motivation of information theory. However, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.<br />
<br />
<br />
通过信道(例如以太网电缆)进行通信是信息论的主要动机。然而,这样的信道往往不能精确地重建信号;噪声、静默时段以及其他形式的信号损坏常常会使质量下降。<br />
<br />
<br />
<br />
<br />
<br />
Consider the communications process over a discrete channel. A simple model of the process is shown below:<br />
<br />
<br />
考虑离散信道上的通信过程。该过程的简单模型如下:<br />
<br />
<br />
<br />
<br />
<br />
[[File:Channel model.svg|center|800px|Channel model]]<br />
<br />
<br />
[[File:Channel model.svg|center|800px|信道模型]]<br />
<br />
<br />
<br />
<br />
<br />
Here ''X'' represents the space of messages transmitted, and ''Y'' the space of messages received during a unit time over our channel. Let {{math|''p''(''y''{{pipe}}''x'')}} be the [[conditional probability]] distribution function of ''Y'' given ''X''. We will consider {{math|''p''(''y''{{pipe}}''x'')}} to be an inherent fixed property of our communications channel (representing the nature of the ''[[Signal noise|noise]]'' of our channel). Then the joint distribution of ''X'' and ''Y'' is completely determined by our channel and by our choice of {{math|''f''(''x'')}}, the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the ''[[Signal (electrical engineering)|signal]]'', we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the {{em|channel capacity}} and is given by:<br />
<br />
<br />
这里''X''表示发送消息的空间,''Y''表示单位时间内通过信道接收消息的空间。设{{math|''p''(''y''{{pipe}}''x'')}}是给定''X''时''Y''的条件概率分布函数。我们把{{math|''p''(''y''{{pipe}}''x'')}}视为通信信道固有的不变属性(代表信道噪声的性质)。于是''X''和''Y''的联合分布完全由信道以及我们选择的{{math|''f''(''x'')}}(即我们选择通过信道发送的消息的边缘分布)决定。在这些约束下,我们希望最大化能够通过信道传达的信息速率,即信号。对此的适当度量是互信息,这个最大互信息就称为信道容量,由下式给出:<br />
<br />
<br />
:<math> C = \max_{f} I(X;Y).\! </math><br />
<br />
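<br />
As a sanity check on this definition (an illustrative sketch; the crossover probability is an assumed value), one can brute-force the maximization over input distributions for the binary symmetric channel described below and recover the closed form 1 − H<sub>b</sub>(''p''):<br />
<syntaxhighlight lang="python">
import math

def binary_entropy(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(q, p):
    """I(X;Y) for a binary symmetric channel: crossover p, input distribution P(X=1) = q."""
    # H(Y|X) = H_b(p) for every input, so I = H(Y) - H(Y|X) = H_b(P(Y=1)) - H_b(p).
    return binary_entropy(q * (1 - p) + (1 - q) * p) - binary_entropy(p)

p = 0.1  # assumed crossover probability
capacity = max(bsc_mutual_information(q / 1000, p) for q in range(1001))
print(capacity, 1 - binary_entropy(p))  # both ~0.531 bits per channel use
</syntaxhighlight>
<br />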
<br />
This capacity has the following property related to communicating at information rate ''R'' (where ''R'' is usually bits per symbol). For any information rate ''R < C'' and coding error ε > 0, for large enough ''N'', there exists a code of length ''N'' and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate ''R &gt; C'', it is impossible to transmit with arbitrarily small block error.<br />
<br />
<br />
信道容量具有以下与以信息速率''R''(其中''R''通常为每符号比特数)进行通信有关的性质:对于任意信息速率''R'' < ''C''和编码错误ε > 0,当''N''足够大时,存在长度为''N''、速率不小于''R''的编码以及相应的解码算法,使得块错误的最大概率不超过ε;也就是说,总是可以以任意小的块错误概率进行传输。此外,对于任何速率''R'' > ''C'',都不可能以任意小的块错误概率进行传输。<br />
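<br />
The maximization above rarely has a closed form for an arbitrary discrete channel. Below is a minimal numerical sketch (not from the original article) using the standard Blahut–Arimoto iteration to approximate {{math|''C''}} from the transition matrix {{math|''p''(''y''{{pipe}}''x'')}}; function names and tolerances are illustrative choices.<br />
<br />
<syntaxhighlight lang="python">
import numpy as np

def blahut_arimoto(P, tol=1e-9, max_iter=1000):
    """Approximate the capacity (bits per channel use) of a discrete
    memoryless channel with transition matrix P, where P[x, y] = p(y|x)."""
    f = np.full(P.shape[0], 1.0 / P.shape[0])   # input distribution f(x), start uniform
    for _ in range(max_iter):
        q = f @ P                               # output marginal q(y)
        # D[x] = KL divergence D( p(.|x) || q ) in bits
        D = np.array([np.sum(row[row > 0] * np.log2(row[row > 0] / q[row > 0]))
                      for row in P])
        f_new = f * np.exp2(D)                  # reweight inputs toward informative ones
        f_new /= f_new.sum()
        if np.max(np.abs(f_new - f)) < tol:
            return float(f_new @ D), f_new
        f = f_new
    return float(f @ D), f

# Binary symmetric channel with crossover probability 0.1:
C, f_opt = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))
print(C)      # ~0.531 bits per channel use, i.e. 1 - H_b(0.1)
print(f_opt)  # ~[0.5, 0.5]: uniform input achieves capacity here
</syntaxhighlight>
<br />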
<br />
<br />
<br />
<br />
''[[Channel code|Channel coding]]'' is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.<br />
<br />
<br />
''信道编码''旨在寻找这样一类接近最优的编码:利用它们可以在有噪信道上以接近信道容量的速率传输数据,同时保持很小的编码错误。<br />
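<br />
As a toy illustration of trading rate for reliability (a sketch, not a capacity-approaching code), the snippet below simulates a BSC and compares the raw bit-error rate with that of a 3-fold repetition code; practical near-capacity codes such as LDPC codes are far more elaborate.<br />
<br />
<syntaxhighlight lang="python">
import random

def bsc(bits, p):
    """Binary symmetric channel: flip each bit independently with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def encode_rep3(bits):
    return [b for b in bits for _ in range(3)]      # send every bit three times

def decode_rep3(bits):
    return [int(sum(bits[i:i + 3]) >= 2)            # majority vote per triple
            for i in range(0, len(bits), 3)]

random.seed(0)
msg = [random.randint(0, 1) for _ in range(100_000)]
p = 0.1
raw = sum(a != b for a, b in zip(msg, bsc(msg, p))) / len(msg)
coded = sum(a != b for a, b in
            zip(msg, decode_rep3(bsc(encode_rep3(msg), p)))) / len(msg)
print(raw)    # ~0.10: the channel's crossover probability
print(coded)  # ~0.028 = 3p^2(1-p) + p^3, bought at rate 1/3
</syntaxhighlight>
<br />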
<br />
<br />
<br />
<br />
<br />
====Capacity of particular channel models====<br />
<br />
<br />
特定信道模型的容量<br />
<br />
* A continuous-time analog communications channel subject to [[Gaussian noise]] — see [[Shannon–Hartley theorem]].<br />
<br />
*受高斯噪声(Gaussian noise)影响的连续时间模拟通信信道(参见[[Shannon–Hartley定理]])。<br />
<br />
* A [[binary symmetric channel]] (BSC) with crossover probability ''p'' is a binary input, binary output channel that flips the input bit with probability ''p''. The BSC has a capacity of {{math|1 &minus; ''H''<sub>b</sub>(''p'')}} bits per channel use, where {{math|''H''<sub>b</sub>}} is the binary entropy function to the base-2 logarithm:<br />
<br />
*二进制对称信道([[binary symmetric channel]],BSC)是交叉概率为''p''的二进制输入、二进制输出信道,它以概率''p''翻转输入比特。BSC的容量为每次信道使用{{math|1 &minus; ''H''<sub>b</sub>(''p'')}}比特,其中{{math|''H''<sub>b</sub>}}是以2为底对数的二元熵函数:<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary symmetric channel.svg]]<br />
<br />
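<br />
A minimal sketch of the closed-form BSC capacity stated above, assuming the base-2 logarithm; helper names are illustrative.<br />
<br />
<syntaxhighlight lang="python">
import math

def h_b(p):
    """Binary entropy function H_b(p) in bits (base-2 logarithm)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - h_b(p)

print(bsc_capacity(0.0))  # 1.0: a noiseless binary channel carries one bit per use
print(bsc_capacity(0.1))  # ~0.531 bits per channel use
print(bsc_capacity(0.5))  # 0.0: the output is independent of the input
</syntaxhighlight>
<br />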
<br />
<br />
<br />
<br />
<br />
* A [[binary erasure channel]] (BEC) with erasure probability ''p'' is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is {{nowrap|1 &minus; ''p''}} bits per channel use.<br />
*二进制擦除信道([[binary erasure channel]],BEC)是擦除概率为''p''的二进制输入、三进制输出信道。可能的信道输出为0、1和被称为擦除的第三个符号'e'。擦除表示关于输入比特的信息完全丢失。BEC的容量为每次信道使用{{nowrap|1 &minus; ''p''}}比特。<br />
<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary erasure channel.svg]]<br />
<br />
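<br />
For comparison, a sketch of the BEC capacity stated above; the interpretive comments are ours, not text from the article.<br />
<br />
<syntaxhighlight lang="python">
def bec_capacity(p):
    """Capacity of a binary erasure channel with erasure probability p."""
    # A fraction p of the bits is erased, but the receiver knows *which*
    # positions were lost, so coding can exploit the surviving 1 - p fraction.
    return 1.0 - p

# For the same p <= 1/2 the BEC is the more benign channel: an erasure reveals
# where information was lost, while a BSC bit flip does not.
print(bec_capacity(0.1))  # 0.9 bits per channel use, vs ~0.531 for a BSC(0.1)
</syntaxhighlight>
<br />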
<br />
<br />
<br />
<br />
<br />
==Applications to other fields==<br />
<br />
<br />
其他领域的应用<br />
<br />
<br />
<br />
<br />
<br />
===Intelligence uses and secrecy applications===<br />
<br />
<br />
情报用途与保密应用<br />
<br />
Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the [[Ban (unit)|ban]], was used in the [[Ultra]] project, breaking the German [[Enigma machine]] code and hastening the [[Victory in Europe Day|end of World War II in Europe]]. Shannon himself defined an important concept now called the [[unicity distance]]. Based on the redundancy of the [[plaintext]], it attempts to give a minimum amount of [[ciphertext]] necessary to ensure unique decipherability.<br />
<br />
<br />
信息论概念应用于密码学和密码分析。Turing的信息单位[[Ban (unit)|ban]]曾用于[[Ultra]]计划,帮助破解德国的恩尼格玛密码,加速了二战在欧洲的结束。香农本人定义了一个重要概念,即现在所称的单一性距离([[unicity distance]]):它基于明文的冗余度,试图给出确保唯一可解密性所需的最少密文量。<br />
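<br />
A back-of-the-envelope unicity-distance estimate in the spirit of Shannon's definition. The key entropy of a simple substitution cipher and the roughly 1.5 bits per letter information rate of English are commonly quoted figures, not values taken from this article.<br />
<br />
<syntaxhighlight lang="python">
import math

# Unicity distance U ~= H(K) / D: the amount of ciphertext needed before the
# key is, in principle, uniquely determined by the plaintext's redundancy.
key_entropy = math.log2(math.factorial(26))  # simple substitution key space: ~88.4 bits
english_rate = 1.5                           # assumed information content of English, bits/letter
redundancy = math.log2(26) - english_rate    # D ~= 3.2 bits of redundancy per letter

print(key_entropy / redundancy)              # ~28 letters of ciphertext suffice in principle
</syntaxhighlight>
<br />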
<br />
<br />
<br />
<br />
Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A [[brute force attack]] can break systems based on [[public-key cryptography|asymmetric key algorithms]] or on most commonly used methods of [[symmetric-key algorithm|symmetric key algorithms]] (sometimes called secret key algorithms), such as [[block cipher]]s. The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.<br />
<br />
<br />
信息论使我们认识到,保守秘密比乍看起来要困难得多。暴力破解可以攻破基于非对称密钥算法的系统,也可以攻破最常用的对称密钥算法(有时称为秘密密钥算法,例如分组密码)。目前所有这些方法的安全性都来自以下假设:没有任何已知的攻击能在可行的时间内破解它们。<br />
<br />
<br />
<br />
<br />
[[Information theoretic security]] refers to methods such as the [[one-time pad]] that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the [[key (cryptography)|key]]) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the [[Venona project]] was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.<br />
<br />
<br />
信息理论安全性指的是诸如一次性密码本之类不易受到此类暴力攻击的方法。在这种情况下,明文和密文之间(以密钥为条件)的正条件互信息可以确保正确传输,而明文和密文之间的无条件互信息保持为零,从而实现绝对安全的通信。换句话说,窃听者即使获得了密文(但没有密钥),也无法改进其对明文的猜测。但是,与任何其他密码系统一样,即使是信息论意义上安全的方法,也必须小心地正确使用;Venona项目之所以能够破解苏联的一次性密码本,正是因为苏联不当地重复使用了密钥材料。<br />
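<br />
A minimal sketch of the XOR one-time pad and of the key-reuse failure mentioned above (the Venona-style leak); variable names are illustrative.<br />
<br />
<syntaxhighlight lang="python">
import secrets

def otp(data: bytes, key: bytes) -> bytes:
    """XOR one-time pad; encryption and decryption are the same operation."""
    assert len(key) >= len(data), "the pad must be at least as long as the message"
    return bytes(d ^ k for d, k in zip(data, key))

msg = b"attack at dawn"
key = secrets.token_bytes(len(msg))   # fresh, uniformly random pad
ct = otp(msg, key)
assert otp(ct, key) == msg            # XORing with the same pad decrypts

# With a uniform, single-use pad, I(plaintext; ciphertext) = 0: every equal-length
# plaintext is equally consistent with ct. Reusing the pad destroys this:
ct2 = otp(b"attack at dusk", key)
print(bytes(a ^ b for a, b in zip(ct, ct2)))  # equals the XOR of the two plaintexts
</syntaxhighlight>
<br />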
<br />
<br />
<br />
<br />
===Pseudorandom number generation===<br />
<br />
<br />
伪随机数的生成<br />
<br />
[[Pseudorandom number generator]]s are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed [[cryptographically secure pseudorandom number generator]]s, but even they require [[random seed]]s external to the software to work as intended. These can be obtained via [[Extractor (mathematics)|extractors]], if done carefully. The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.<br />
<br />
<br />
伪随机数生成器在计算机语言库和应用程序中广泛可用。但它们几乎无一例外地不适合密码学用途,因为它们无法规避现代计算机设备和软件的确定性本质。一类改进的随机数生成器称为密码学安全的伪随机数生成器,但即便是它们,也需要软件外部的随机种子才能按预期工作。如果足够谨慎,这些种子可以通过提取器获得。提取器中衡量充分随机性的指标是最小熵,该值通过[[Rényi熵]]与Shannon熵相关联;Rényi熵还用于评估密码系统中的随机性。尽管这些度量相互关联,但它们之间的差别意味着,具有较高Shannon熵的随机变量并不一定适合在提取器中使用,因而也不一定适合密码学用途。<br />
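<br />
A small illustration of why Shannon entropy alone is not the right yardstick for extractors: the distribution below (a standard textbook-style construction, not taken from this article) has high Shannon entropy but only one bit of min-entropy.<br />
<br />
<syntaxhighlight lang="python">
import numpy as np

def shannon_entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def min_entropy(p):
    # H_min = -log2(max_i p_i): the Renyi entropy of order infinity.
    return float(-np.log2(np.max(p)))

# One outcome has probability 1/2; the rest is spread over 2**20 outcomes.
n = 2**20
p = np.concatenate(([0.5], np.full(n, 0.5 / n)))
print(shannon_entropy(p))  # ~11 bits: looks random on average
print(min_entropy(p))      # 1.0 bit: a guesser is right half the time
</syntaxhighlight>
<br />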
<br />
<br />
<br />
===Seismic exploration===<br />
<br />
<br />
地震勘探<br />
<br />
One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and [[digital signal processing]] offer a major improvement of resolution and image clarity over previous analog methods.<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
信息论的一个早期商业应用是在地震石油勘探领域。这一领域的工作使得从期望的地震信号中剥离并分离不需要的噪声成为可能。与以前的模拟方法相比,信息论和数字信号处理大大提高了分辨率和图像清晰度。<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
<br />
<br />
===Semiotics===<br />
<br />
<br />
符号学<br />
<br />
[[Semiotics|Semioticians]] [[:nl:Doede Nauta|Doede Nauta]] and [[Winfried Nöth]] both considered [[Charles Sanders Peirce]] as having created a theory of information in his works on semiotics.<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref>{{rp|171}}<ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref>{{rp|137}} Nauta defined semiotic information theory as the study of "the internal processes of coding, filtering, and information processing."<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
<br />
符号学家[[:nl:Doede Nauta|Doede Nauta]]和[[Winfried Nöth]]都认为[[Charles Sanders Peirce]]在其符号学著作中创立了一种信息理论。<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref>{{rp|171}}<ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref>{{rp|137}} Nauta将符号学信息论定义为对"编码、过滤和信息处理的内部过程"的研究。<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
Concepts from information theory such as redundancy and code control have been used by semioticians such as [[Umberto Eco]] and [[:it:Ferruccio Rossi-Landi|Ferruccio Rossi-Landi]] to explain ideology as a form of message transmission whereby a dominant social class emits its message by using signs that exhibit a high degree of redundancy such that only one message is decoded among a selection of competing ones.<ref>Nöth, Winfried (1981). "[https://kobra.uni-kassel.de/bitstream/handle/123456789/2014122246977/semi_2004_002.pdf?sequence=1&isAllowed=y Semiotics of ideology]". ''Semiotica'', Issue 148.</ref><br />
<br />
<br />
信息论中的冗余和码控制等概念,已被Umberto Eco和Ferruccio Rossi-Landi等符号学家用来解释意识形态:意识形态被视为一种消息传递形式,占支配地位的社会阶层通过使用具有高度冗余性的符号来发出其消息,使得在一系列相互竞争的消息中只有一条被解码。<br />
<br />
<br />
<br />
===Miscellaneous applications===<br />
<br />
<br />
其他应用<br />
Information theory also has applications in [[Gambling and information theory]], [[black hole information paradox|black holes]], and [[bioinformatics]].<br />
<br />
<br />
信息论在赌博、黑洞(信息悖论)和生物信息学等领域也有应用。<br />
<br />
<br />
<br />
<br />
<br />
==See also==<br />
<br />
<br />
另请参阅<br />
<br />
{{Portal|Mathematics}}<br />
<br />
<br />
<br />
* [[Algorithmic probability]]<br />
<br />
<br />
<br />
* [[Bayesian inference]]<br />
<br />
<br />
<br />
* [[Communication theory]]<br />
<br />
<br />
<br />
* [[Constructor theory]] - a generalization of information theory that includes quantum information<br />
<br />
<br />
<br />
* [[Inductive probability]]<br />
<br />
<br />
<br />
* [[Info-metrics]]<br />
<br />
<br />
<br />
* [[Minimum message length]]<br />
<br />
<br />
<br />
* [[Minimum description length]]<br />
<br />
<br />
<br />
* [[List of important publications in theoretical computer science#Information theory|List of important publications]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Applications===<br />
<br />
<br />
应用<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Active networking]]<br />
<br />
<br />
<br />
* [[Cryptanalysis]]<br />
<br />
<br />
<br />
* [[Cryptography]]<br />
<br />
<br />
<br />
* [[Cybernetics]]<br />
<br />
<br />
<br />
* [[Entropy in thermodynamics and information theory]]<br />
<br />
<br />
<br />
* [[Gambling]]<br />
<br />
<br />
<br />
* [[Intelligence (information gathering)]]<br />
<br />
<br />
<br />
* [[reflection seismology|Seismic exploration]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===History===<br />
<br />
<br />
历史<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ralph Hartley|Hartley, R.V.L.]]<br />
<br />
<br />
<br />
* [[History of information theory]]<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]]<br />
<br />
<br />
<br />
* [[Timeline of information theory]]<br />
<br />
<br />
<br />
* [[Hubert Yockey|Yockey, H.P.]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Theory===<br />
<br />
<br />
理论<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Coding theory]]<br />
<br />
<br />
<br />
* [[Detection theory]]<br />
<br />
<br />
<br />
* [[Estimation theory]]<br />
<br />
<br />
<br />
* [[Fisher information]]<br />
<br />
<br />
<br />
* [[Information algebra]]<br />
<br />
<br />
<br />
* [[Information asymmetry]]<br />
<br />
<br />
<br />
* [[Information field theory]]<br />
<br />
<br />
<br />
* [[Information geometry]]<br />
<br />
<br />
<br />
* [[Information theory and measure theory]]<br />
<br />
<br />
<br />
* [[Kolmogorov complexity]]<br />
<br />
<br />
<br />
* [[List of unsolved problems in information theory]]<br />
<br />
<br />
<br />
* [[Logic of information]]<br />
<br />
<br />
<br />
* [[Network coding]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
* [[Quantum information science]]<br />
<br />
<br />
<br />
* [[Source coding]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Concepts===<br />
<br />
<br />
概念<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ban (unit)]]<br />
<br />
<br />
<br />
* [[Channel capacity]]<br />
<br />
<br />
<br />
* [[Communication channel]]<br />
<br />
<br />
<br />
* [[Communication source]]<br />
<br />
<br />
<br />
* [[Conditional entropy]]<br />
<br />
<br />
<br />
* [[Covert channel]]<br />
<br />
<br />
<br />
* [[Data compression]]<br />
<br />
<br />
<br />
* [[Decoder]]<br />
<br />
<br />
<br />
* [[Differential entropy]]<br />
<br />
<br />
<br />
* [[Fungible information]]<br />
<br />
<br />
<br />
* [[Information fluctuation complexity]]<br />
<br />
<br />
<br />
* [[Information entropy]]<br />
<br />
<br />
<br />
* [[Joint entropy]]<br />
<br />
<br />
<br />
* [[Kullback–Leibler divergence]]<br />
<br />
<br />
<br />
* [[Mutual information]]<br />
<br />
<br />
<br />
* [[Pointwise mutual information]] (PMI)<br />
<br />
<br />
<br />
* [[Receiver (information theory)]]<br />
<br />
<br />
<br />
* [[Redundancy (information theory)|Redundancy]]<br />
<br />
<br />
<br />
* [[Rényi entropy]]<br />
<br />
<br />
<br />
* [[Self-information]]<br />
<br />
<br />
<br />
* [[Unicity distance]]<br />
<br />
<br />
<br />
* [[Variety (cybernetics)|Variety]]<br />
<br />
<br />
<br />
* [[Hamming distance]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==References==<br />
<br />
<br />
参考资料<br />
<br />
{{Reflist}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===The classic work===<br />
<br />
<br />
经典之作<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]] (1948), "[[A Mathematical Theory of Communication]]", ''Bell System Technical Journal'', 27, pp.&nbsp;379–423 & 623–656, July & October, 1948. [http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf PDF.] <br />[http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html Notes and other formats.]<br />
<br />
<br />
<br />
* R.V.L. Hartley, [http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf "Transmission of Information"], ''Bell System Technical Journal'', July 1928<br />
<br />
<br />
<br />
* [[Andrey Kolmogorov]] (1968), "[https://www.tandfonline.com/doi/pdf/10.1080/00207166808803030 Three approaches to the quantitative definition of information]" in International Journal of Computer Mathematics.<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other journal articles===<br />
<br />
<br />
其他期刊文章<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* J. L. Kelly, Jr., [http://betbubbles.com/wp-content/uploads/2017/07/kelly.pdf Betbubbles.com]{{Dead link|date=January 2020 |bot=InternetArchiveBot |fix-attempted=yes }}, "A New Interpretation of Information Rate" ''Bell System Technical Journal'', Vol. 35, July 1956, pp.&nbsp;917–26.<br />
<br />
<br />
<br />
* R. Landauer, [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=615478 IEEE.org], "Information is Physical" ''Proc. Workshop on Physics and Computation PhysComp'92'' (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp.&nbsp;1–4.<br />
<br />
<br />
<br />
* {{cite journal | last1 = Landauer | first1 = R. | year = 1961 | title = Irreversibility and Heat Generation in the Computing Process | url = http://www.research.ibm.com/journal/rd/441/landauerii.pdf | journal = IBM J. Res. Dev. | volume = 5 | issue = 3| pages = 183–191 | doi = 10.1147/rd.53.0183 }}<br />
<br />
<br />
<br />
* {{cite arXiv |last=Timme |first=Nicholas|last2=Alford |first2=Wesley|last3=Flecker |first3=Benjamin|last4=Beggs |first4=John M.|date=2012 |title=Multivariate information measures: an experimentalist's perspective |eprint=1111.6857|class=cs.IT}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Textbooks on information theory===<br />
<br />
<br />
信息论教材<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Arndt, C. ''Information Measures, Information and its Description in Science and Engineering'' (Springer Series: Signals and Communication Technology), 2004, {{isbn|978-3-540-40855-0}}<br />
<br />
<br />
<br />
* Ash, RB. ''Information Theory''. New York: Interscience, 1965. {{isbn|0-470-03445-9}}. New York: Dover 1990. {{isbn|0-486-66521-6}}<br />
<br />
<br />
<br />
* [[Gallager, R]]. ''Information Theory and Reliable Communication.'' New York: John Wiley and Sons, 1968. {{isbn|0-471-29048-3}}<br />
<br />
<br />
<br />
* Goldman, S. ''Information Theory''. New York: Prentice Hall, 1953. New York: Dover 1968 {{isbn|0-486-62209-6}}, 2005 {{isbn|0-486-44271-3}}<br />
<br />
<br />
<br />
* {{cite book |last1=Cover |first1=Thomas |author-link1=Thomas M. Cover |last2=Thomas |first2=Joy A. |title=Elements of information theory |edition=2nd |location=New York |publisher=[[Wiley-Interscience]] |date=2006 |isbn=0-471-24195-4}}<br />
<br />
<br />
<br />
* [[Csiszar, I]], Korner, J. ''Information Theory: Coding Theorems for Discrete Memoryless Systems'' Akademiai Kiado: 2nd edition, 1997. {{isbn|963-05-7440-3}}<br />
<br />
<br />
<br />
* [[David J. C. MacKay|MacKay, David J. C.]]. ''[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]'' Cambridge: Cambridge University Press, 2003. {{isbn|0-521-64298-1}}<br />
<br />
<br />
<br />
* Mansuripur, M. ''Introduction to Information Theory''. New York: Prentice Hall, 1987. {{isbn|0-13-484668-0}}<br />
<br />
<br />
<br />
* [[Robert McEliece|McEliece, R]]. ''The Theory of Information and Coding''. Cambridge, 2002. {{isbn|978-0521831857}}<br />
<br />
<br />
<br />
*Pierce, JR. ''An Introduction to Information Theory: Symbols, Signals and Noise''. Dover (2nd edition), 1961 (reprinted by Dover, 1980).<br />
<br />
<br />
<br />
* [[Reza, F]]. ''An Introduction to Information Theory''. New York: McGraw-Hill 1961. New York: Dover 1994. {{isbn|0-486-68210-2}}<br />
<br />
<br />
<br />
* {{cite book |last1=Shannon |first1=Claude |author-link1=Claude Shannon |last2=Weaver |first2=Warren |author-link2=Warren Weaver |date=1949 |title=The Mathematical Theory of Communication |url=http://monoskop.org/images/b/be/Shannon_Claude_E_Weaver_Warren_The_Mathematical_Theory_of_Communication_1963.pdf |location=[[Urbana, Illinois]] |publisher=[[University of Illinois Press]] |lccn=49-11922 |isbn=0-252-72548-4}}<br />
<br />
<br />
<br />
* Stone, JV. Chapter 1 of book [http://jim-stone.staff.shef.ac.uk/BookInfoTheory/InfoTheoryBookMain.html "Information Theory: A Tutorial Introduction"], University of Sheffield, England, 2014. {{isbn|978-0956372857}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book/ A First Course in Information Theory]'' Kluwer Academic/Plenum Publishers, 2002. {{isbn|0-306-46791-7}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book2/ Information Theory and Network Coding]'' Springer 2008, 2002. {{isbn|978-0-387-79233-0}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other books===<br />
<br />
<br />
其他书籍<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Leon Brillouin, ''Science and Information Theory'', Mineola, N.Y.: Dover, [1956, 1962] 2004. {{isbn|0-486-43918-6}}<br />
<br />
<br />
<br />
* [[James Gleick]], ''[[The Information: A History, a Theory, a Flood]]'', New York: Pantheon, 2011. {{isbn|978-0-375-42372-7}}<br />
<br />
<br />
<br />
* A. I. Khinchin, ''Mathematical Foundations of Information Theory'', New York: Dover, 1957. {{isbn|0-486-60434-9}}<br />
<br />
<br />
<br />
* H. S. Leff and A. F. Rex, Editors, ''Maxwell's Demon: Entropy, Information, Computing'', Princeton University Press, Princeton, New Jersey (1990). {{isbn|0-691-08727-X}}<br />
<br />
<br />
<br />
* [[Robert K. Logan]]. ''What is Information? - Propagating Organization in the Biosphere, the Symbolosphere, the Technosphere and the Econosphere'', Toronto: DEMO Publishing.<br />
<br />
<br />
<br />
* Tom Siegfried, ''The Bit and the Pendulum'', Wiley, 2000. {{isbn|0-471-32174-5}}<br />
<br />
<br />
<br />
* Charles Seife, ''[[Decoding the Universe]]'', Viking, 2006. {{isbn|0-670-03441-X}}<br />
<br />
<br />
<br />
* Jeremy Campbell, ''[[Grammatical Man]]'', Touchstone/Simon & Schuster, 1982, {{isbn|0-671-44062-4}}<br />
<br />
<br />
<br />
* Henri Theil, ''Economics and Information Theory'', Rand McNally & Company - Chicago, 1967.<br />
<br />
<br />
<br />
* Escolano, Suau, Bonev, ''[https://www.springer.com/computer/image+processing/book/978-1-84882-296-2 Information Theory in Computer Vision and Pattern Recognition]'', Springer, 2009. {{isbn|978-1-84882-296-2}}<br />
<br />
<br />
<br />
* Vlatko Vedral, ''Decoding Reality: The Universe as Quantum Information'', Oxford University Press 2010. {{ISBN|0-19-923769-7}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===MOOC on information theory===<br />
<br />
<br />
信息论大型开放式课程<br />
<br />
* Raymond W. Yeung, "[http://www.inc.cuhk.edu.hk/InformationTheory/index.html Information Theory]" ([[The Chinese University of Hong Kong]])<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==External links==<br />
<br />
<br />
外部链接<br />
<br />
{{Wikiquote}}<br />
<br />
<br />
<br />
{{Library resources box}}<br />
<br />
<br />
<br />
* {{SpringerEOM |title=Information |id=p/i051040}}<br />
<br />
<br />
<br />
* Lambert F. L. (1999), "[http://jchemed.chem.wisc.edu/Journal/Issues/1999/Oct/abs1385.html Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!]", ''Journal of Chemical Education''<br />
<br />
<br />
<br />
* [http://www.itsoc.org/ IEEE Information Theory Society] and [https://www.itsoc.org/resources/surveys ITSOC Monographs, Surveys, and Reviews]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Cybernetics}}<br />
<br />
<br />
<br />
{{Compression methods}}<br />
<br />
<br />
<br />
{{Areas of mathematics}}<br />
<br />
<br />
<br />
{{Computer science}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Authority control}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{DEFAULTSORT:Information Theory}}<br />
<br />
<br />
<br />
[[Category:Information theory| ]]<br />
<br />
<br />
<br />
[[Category:Computer science]]<br />
<br />
<br />
[[Category:Cybernetics]]<br />
<br />
<br />
[[Category:Formal sciences]]<br />
<br />
<br />
[[Category:Information Age]]<br />
<br />
<br />
<noinclude><br />
<br />
<small>This page was moved from [[wikipedia:en:Information theory]]. Its edit history can be viewed at [[信息论/edithistory]]</small></noinclude><br />
<br />
[[Category:待整理页面]]</div>
Pjhhh
https://wiki.swarma.org/index.php?title=%E4%BF%A1%E6%81%AF%E8%AE%BA_Information_theory&diff=10926
信息论 Information theory 2020-07-20T07:11:53Z
<p>Pjhhh:</p>
<hr />
<div>此词条暂由彩云小译翻译,未经人工整理和审校,带来阅读不便,请见谅。{{distinguish|Information science}}<br />
<br />
<br />
<br />
{{Information theory}}<br />
<br />
<br />
<br />
'''Information theory''' studies the [[quantification (science)|quantification]], [[computer data storage|storage]], and [[telecommunication|communication]] of [[information]]. It was originally proposed by [[Claude Shannon]] in 1948 to find fundamental limits on [[signal processing]] and communication operations such as [[data compression]], in a landmark paper titled "[[A Mathematical Theory of Communication]]". Its impact has been crucial to the success of the [[Voyager program|Voyager]] missions to deep space, the invention of the [[compact disc]], the feasibility of mobile phones, the development of the Internet, the study of [[linguistics]] and of human perception, the understanding of [[black hole]]s, and numerous other fields.<br />
<br />
<br />
'''信息论'''研究的是信息的量化、存储与传播。信息论最初是由[[Claude Shannon]]在1948年的一篇题为"[[A Mathematical Theory of Communication]]"的论文中提出的,其目的是找到信号处理和通信操作(如数据压缩)的基本限制。信息论对于旅行者号深空探测任务的成功、光盘的发明、移动电话的可行性、互联网的发展、语言学和人类感知的研究、对黑洞的理解以及许多其他领域的研究都是至关重要的。<br />
<br />
<br />
<br />
<br />
<br />
The field is at the intersection of mathematics, [[statistics]], computer science, physics, [[Neuroscience|neurobiology]], [[information engineering (field)|information engineering]], and electrical engineering. The theory has also found applications in other areas, including [[statistical inference]], [[natural language processing]], [[cryptography]], [[neurobiology]],<ref name="Spikes">{{cite book|title=Spikes: Exploring the Neural Code|author1=F. Rieke|author2=D. Warland|author3=R Ruyter van Steveninck|author4=W Bialek|publisher=The MIT press|year=1997|isbn=978-0262681087}}</ref> [[human vision]],<ref>{{Cite journal|last=Delgado-Bonal|first=Alfonso|last2=Martín-Torres|first2=Javier|date=2016-11-03|title=Human vision is determined based on information theory|journal=Scientific Reports|language=En|volume=6|issue=1|pages=36038|bibcode=2016NatSR...636038D|doi=10.1038/srep36038|issn=2045-2322|pmc=5093619|pmid=27808236}}</ref> the evolution<ref>{{cite journal|last1=cf|last2=Huelsenbeck|first2=J. P.|last3=Ronquist|first3=F.|last4=Nielsen|first4=R.|last5=Bollback|first5=J. P.|year=2001|title=Bayesian inference of phylogeny and its impact on evolutionary biology|url=|journal=Science|volume=294|issue=5550|pages=2310–2314|bibcode=2001Sci...294.2310H|doi=10.1126/science.1065889|pmid=11743192}}</ref> and function<ref>{{cite journal|last1=Allikmets|first1=Rando|last2=Wasserman|first2=Wyeth W.|last3=Hutchinson|first3=Amy|last4=Smallwood|first4=Philip|last5=Nathans|first5=Jeremy|last6=Rogan|first6=Peter K.|year=1998|title=Thomas D. Schneider], Michael Dean (1998) Organization of the ABCR gene: analysis of promoter and splice junction sequences|url=http://alum.mit.edu/www/toms/|journal=Gene|volume=215|issue=1|pages=111–122|doi=10.1016/s0378-1119(98)00269-8|pmid=9666097}}</ref> of molecular codes ([[bioinformatics]]), [[model selection]] in statistics,<ref>Burnham, K. P. and Anderson D. R. (2002) ''Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition'' (Springer Science, New York) {{ISBN|978-0-387-95364-9}}.</ref> [[thermal physics]],<ref>{{cite journal|last1=Jaynes|first1=E. T.|year=1957|title=Information Theory and Statistical Mechanics|url=http://bayes.wustl.edu/|journal=Phys. Rev.|volume=106|issue=4|page=620|bibcode=1957PhRv..106..620J|doi=10.1103/physrev.106.620}}</ref> [[quantum computing]], linguistics, [[plagiarism detection]],<ref>{{cite journal|last1=Bennett|first1=Charles H.|last2=Li|first2=Ming|last3=Ma|first3=Bin|year=2003|title=Chain Letters and Evolutionary Histories|url=http://sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|journal=Scientific American|volume=288|issue=6|pages=76–81|bibcode=2003SciAm.288f..76B|doi=10.1038/scientificamerican0603-76|pmid=12764940|access-date=2008-03-11|archive-url=https://web.archive.org/web/20071007041539/http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|archive-date=2007-10-07|url-status=dead}}</ref> [[pattern recognition]], and [[anomaly detection]].<ref>{{Cite web|url=http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|title=Some background on why people in the empirical sciences may want to better understand the information-theoretic methods|author=David R. Anderson|date=November 1, 2003|archiveurl=https://web.archive.org/web/20110723045720/http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|archivedate=July 23, 2011|url-status=dead|accessdate=2010-06-23}}<br />
<br />
<br />
该领域是数学、统计学、计算机科学、物理学、神经生物学、信息工程和电气工程的交叉学科。这一理论也在其他领域得到了应用,比如推论统计学、自然语言处理、密码学、神经生物学、人类视觉、分子编码的进化和功能(生物信息学)、统计学中的模型选择、热物理学、量子计算、语言学、剽窃检测、模式识别和异常检测。 <br />
<br />
</ref> Important sub-fields of information theory include [[source coding]], [[algorithmic complexity theory]], [[algorithmic information theory]], [[information-theoretic security]], [[Grey system theory]] and measures of information.<br />
<br />
<br />
信息论的重要分支包括信源编码、算法复杂性理论、算法信息论、信息理论安全性、灰色系统理论和信息度量。<br />
<br />
<br />
<br />
<br />
<br />
Applications of fundamental topics of information theory include [[lossless data compression]] (e.g. [[ZIP (file format)|ZIP files]]), [[lossy data compression]] (e.g. [[MP3]]s and [[JPEG]]s), and [[channel capacity|channel coding]] (e.g. for [[digital subscriber line|DSL]]). Information theory is used in [[information retrieval]], [[intelligence (information gathering)|intelligence gathering]], gambling, and even in musical composition.<br />
<br />
<br />
信息论基本课题的应用包括无损数据压缩(例如:ZIP压缩文件)、有损数据压缩(例如:Mp3和jpeg格式),以及频道编码(例如:DSL)。信息论在信息检索、情报收集、赌博,甚至音乐创作中都有应用。<br />
<br />
<br />
<br />
<br />
<br />
A key measure in information theory is [[information entropy|entropy]]. Entropy quantifies the amount of uncertainty involved in the value of a [[random variable]] or the outcome of a [[random process]]. For example, identifying the outcome of a fair [[coin flip]] (with two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a {{dice}} (with six equally likely outcomes). Some other important measures in information theory are [[mutual information]], channel capacity, [[error exponent]]s, and [[relative entropy]].<br />
<br />
<br />
信息论中的一个关键度量是熵。熵量化了一个随机变量的值或一个随机过程的结果所包含的不确定性。例如,识别一次公平抛硬币的结果(有两个同样可能的结果)所提供的信息(较低的熵)少于识别掷一次骰子的结果(有六个同样可能的结果)。信息论中的其他一些重要度量有:互信息、信道容量、误差指数和相对熵。<br />
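<br />
A one-line check of the coin-versus-die comparison above, assuming fair (uniform) outcomes:<br />
<br />
<syntaxhighlight lang="python">
import math

# The entropy of a uniform distribution over k outcomes is log2(k) bits.
print(math.log2(2))  # fair coin: 1.0 bit per toss
print(math.log2(6))  # fair die: ~2.585 bits per roll, i.e. more uncertainty
</syntaxhighlight>
<br />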
<br />
<br />
<br />
<br />
<br />
==Overview==<br />
<br />
<br />
概览<br />
<br />
<br />
<br />
<br />
<br />
Information theory studies the transmission, processing, extraction, and utilization of information. Abstractly, information can be thought of as the resolution of uncertainty. In the case of communication of information over a noisy channel, this abstract concept was made concrete in 1948 by Claude Shannon in his paper "A Mathematical Theory of Communication", in which "information" is thought of as a set of possible messages, where the goal is to send these messages over a noisy channel, and then to have the receiver reconstruct the message with low probability of error, in spite of the channel noise. Shannon's main result, the [[noisy-channel coding theorem]] showed that, in the limit of many channel uses, the rate of information that is asymptotically achievable is equal to the channel capacity, a quantity dependent merely on the statistics of the channel over which the messages are sent.<ref name="Spikes" /><br />
<br />
<br />
信息论主要研究信息的传递、处理、提取和利用。抽象地说,信息可以被视为不确定性的消解。1948年,Claude Shannon在他的论文"A Mathematical Theory of Communication"中将这个抽象概念具体化:论文中的"信息"被视为一组可能的消息,目标是通过有噪信道发送这些消息,并使接收者在信道噪声的影响下以较低的错误概率重构消息。Shannon的主要结果,即有噪信道编码定理表明,在信道多次使用的极限下,渐近可达的信息速率等于信道容量,而信道容量仅取决于消息所经过信道本身的统计特性。<br />
<br />
<br />
<br />
<br />
Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of [[Rubric (academic)|rubrics]] throughout the world over the past half century or more: [[adaptive system]]s, [[anticipatory system]]s, [[artificial intelligence]], [[complex system]]s, [[complexity science]], [[cybernetics]], [[Informatics (academic field)|informatics]], [[machine learning]], along with [[systems science]]s of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of [[coding theory]].<br />
<br />
<br />
信息论与一系列纯粹学科和应用学科密切相关。在过去半个多世纪里,这些学科在世界范围内以各种名目被研究并转化为工程实践,例如自适应系统、预期系统、人工智能、复杂系统、复杂性科学、控制论、信息学、机器学习以及各种系统科学。信息论是一个既广泛又深刻的数学理论,其应用同样既广泛又深刻,其中编码理论是至关重要的领域。<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is concerned with finding explicit methods, called ''codes'', for increasing the efficiency and reducing the error rate of data communication over noisy channels to near the channel capacity. These codes can be roughly subdivided into data compression (source coding) and [[error-correction]] (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible.<br />
<br />
<br />
编码理论关注寻找称为''编码''的明确方法,以提高效率并将有噪信道上数据通信的错误率降低到接近信道容量。这些编码可大致分为数据压缩(信源编码)和纠错(信道编码)技术。对于后者,人们花了很多年才找到Shannon的工作所证明可能存在的方法。<br />
<br />
<br />
<br />
<br />
<br />
A third class of information theory codes are cryptographic algorithms (both [[code (cryptography)|code]]s and [[cipher]]s). Concepts, methods and results from coding theory and information theory are widely used in cryptography and [[cryptanalysis]]. ''See the article [[ban (unit)]] for a historical application.''<br />
<br />
<br />
第三类信息论编码是密码算法(既包括密本也包括密码)。编码理论和信息论的概念、方法和结果在密码学和密码分析中得到了广泛应用。有关历史应用,请参见词条[[ban (unit)]]。<br />
<br />
<br />
<br />
<br />
==Historical background==<br />
<br />
<br />
历史背景<br />
<br />
{{Main|History of information theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The landmark event that ''established'' the discipline of information theory and brought it to immediate worldwide attention was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the ''[[Bell System Technical Journal]]'' in July and October 1948.<br />
<br />
<br />
1948年7月和10月,Claude Shannon在''[[Bell System Technical Journal]]''上发表了经典论文:"A Mathematical Theory of Communication",这是建立信息论学科并立即引起全世界关注的里程碑事件。<br />
<br />
<br />
<br />
<br />
<br />
Prior to this paper, limited information-theoretic ideas had been developed at [[Bell Labs]], all implicitly assuming events of equal probability. [[Harry Nyquist]]'s 1924 paper, ''Certain Factors Affecting Telegraph Speed'', contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation {{math|1=''W'' = ''K'' log ''m''}} (recalling [[Boltzmann's constant]]), where ''W'' is the speed of transmission of intelligence, ''m'' is the number of different voltage levels to choose from at each time step, and ''K'' is a constant. [[Ralph Hartley]]'s 1928 paper, ''Transmission of Information'', uses the word ''information'' as a measurable quantity, reflecting the receiver's ability to distinguish one [[sequence of symbols]] from any other, thus quantifying information as {{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}}, where ''S'' was the number of possible symbols, and ''n'' the number of symbols in a transmission. The unit of information was therefore the [[decimal digit]], which has since sometimes been called the [[Hartley (unit)|hartley]] in his honor as a unit or scale or measure of information. [[Alan Turing]] in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war [[Cryptanalysis of the Enigma|Enigma]] ciphers.<br />
<br />
<br />
在这篇论文之前,贝尔实验室已经发展出有限的信息论思想,它们都隐含地假设各事件概率相等。Harry Nyquist在1924年发表的论文《Certain Factors Affecting Telegraph Speed》中有一个理论部分,量化了"智能"以及通信系统传输它的"线路速度",给出了关系式{{math|1=''W'' = ''K'' log ''m''}}(让人联想到Boltzmann常数),其中''W''是智能传输的速度,''m''是每个时间步可供选择的不同电压电平的数目,''K''是一个常数。Ralph Hartley在1928年发表的论文《Transmission of Information》中,把''信息''一词用作可测量的量,反映接收者区分不同符号序列的能力,从而将信息量化为{{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}},其中''S''是可能符号的数量,''n''是一次传输中的符号数。因此,信息的单位是十进制数字,后来为纪念Hartley,这一信息的单位、尺度或度量有时被称为hartley。1940年,Alan Turing在破解德国二战恩尼格玛密码 Enigma 的统计分析中使用了类似的思想。<br />
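<br />
A quick worked example of Hartley's 1928 measure {{math|1=''H'' = ''n'' log ''S''}}; the function name and the 26-letter alphabet are illustrative choices.<br />
<br />
<syntaxhighlight lang="python">
import math

def hartley_information(n_symbols, alphabet_size, base=10):
    """H = n * log(S); with base 10 the unit is the decimal digit (hartley)."""
    return n_symbols * math.log(alphabet_size, base)

print(hartley_information(10, 26))          # ten letters from a 26-symbol alphabet: ~14.1 hartleys
print(hartley_information(10, 26, base=2))  # the same quantity in bits: ~47.0
</syntaxhighlight>
<br />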
<br />
Much of the mathematics behind information theory with events of different probabilities were developed for the field of [[thermodynamics]] by [[Ludwig Boltzmann]] and [[J. Willard Gibbs]]. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by [[Rolf Landauer]] in the 1960s, are explored in ''[[Entropy in thermodynamics and information theory]]''.<br />
<br />
<br />
信息论背后涉及不同概率事件的许多数学理论,是由Ludwig Boltzmann和J. Willard Gibbs为热力学领域发展起来的。信息论熵与热力学熵之间的联系(包括Rolf Landauer在20世纪60年代的重要贡献)在[[Entropy in thermodynamics and information theory]]中进行了探讨。<br />
<br />
<br />
<br />
<br />
In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that<br />
<br />
<br />
Shannon这篇具有革命性和开创性的论文,其工作到1944年底已在贝尔实验室基本完成。论文首次引入了作为信息论基础统计过程的定性和定量通信模型,开篇即断言:<br />
:"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
:"通信的基本问题是,在一点上精确地或近似地再现在另一点上所选择的消息。"<br />
<br />
<br />
<br />
<br />
<br />
With it came the ideas of<br />
<br />
<br />
随之而来的还有以下思想:<br />
<br />
* the information entropy and [[redundancy (information theory)|redundancy]] of a source, and its relevance through the [[source coding theorem]];<br />
<br />
<br />
* the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;<br />
<br />
<br />
* the practical result of the [[Shannon–Hartley law]] for the channel capacity of a [[Gaussian channel]]; as well as<br />
<br />
<br />
<br />
* the [[bit]]—a new way of seeing the most fundamental unit of information.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Quantities of information==<br />
<br />
<br />
信息量<br />
<br />
{{Main|Quantities of information}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Information theory is based on [[probability theory]] and statistics. Information theory often concerns itself with measures of information of the distributions associated with random variables. Important quantities of information are entropy, a measure of information in a single random variable, and mutual information, a measure of information in common between two random variables. The former quantity is a property of the probability distribution of a random variable and gives a limit on the rate at which data generated by independent samples with the given distribution can be reliably compressed. The latter is a property of the joint distribution of two random variables, and is the maximum rate of reliable communication across a noisy [[Communication channel|channel]] in the limit of long block lengths, when the channel statistics are determined by the joint distribution.<br />
<br />
<br />
信息论以概率论和统计学为基础。信息论常常关注与随机变量相关的分布的信息度量。重要的信息量包括熵和互信息:熵是对单个随机变量中信息的度量,互信息是对两个随机变量之间共有信息的度量。前者是单个随机变量概率分布的性质,它给出了由服从给定分布的独立样本所生成的数据能够被可靠压缩的速率极限;后者是两个随机变量联合分布的性质,当信道统计特性由该联合分布确定时,它是在长块长度极限下通过有噪信道进行可靠通信的最大速率。<br />
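<br />
A hedged sketch computing mutual information from a joint distribution via the standard identity {{math|1=''I''(''X'';''Y'') = ''H''(''X'') + ''H''(''Y'') &minus; ''H''(''X'',''Y'')}}; the example joint matrix corresponds to a binary symmetric channel with crossover 0.1 driven by a uniform input.<br />
<br />
<syntaxhighlight lang="python">
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(joint):
    """I(X;Y) in bits, from a joint pmf with joint[x, y] = p(x, y)."""
    px = joint.sum(axis=1)   # marginal of X
    py = joint.sum(axis=0)   # marginal of Y
    return entropy(px) + entropy(py) - entropy(joint.ravel())

joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])
print(mutual_information(joint))  # ~0.531 bits, this channel's capacity
</syntaxhighlight>
<br />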
<br />
<br />
<br />
<br />
<br />
The choice of logarithmic base in the following formulae determines the [[units of measurement|unit]] of information entropy that is used. A common unit of information is the bit, based on the [[binary logarithm]]. Other units include the [[nat (unit)|nat]], which is based on the [[natural logarithm]], and the [[deciban|decimal digit]], which is based on the [[common logarithm]].<br />
<br />
<br />
以下公式中对数底的选择决定了所使用的信息熵单位。常见的信息单位是比特,基于以2为底的对数;其他单位包括基于自然对数的nat,以及基于常用对数的十进制数字。<br />
<br />
<br />
<br />
<br />
<br />
In what follows, an expression of the form {{math|''p'' log ''p''}} is considered by convention to be equal to zero whenever {{math|1=''p'' = 0}}. This is justified because <math>\lim_{p \rightarrow 0+} p \log p = 0</math> for any logarithmic base.<br />
<br />
<br />
在下文中,按照惯例,每当{{math|1=''p'' = 0}}时,形如{{math|''p'' log ''p''}}的表达式都被视为等于零。这是合理的,因为对任何对数底都有<math>\lim_{p \rightarrow 0+} p \log p = 0</math>。<br />
<br />
<br />
<br />
<br />
<br />
===Entropy of an information source===<br />
<br />
<br />
信息源的熵<br />
<br />
Based on the [[probability mass function]] of each source symbol to be communicated, the Shannon [[Entropy (information theory)|entropy]] {{math|''H''}}, in units of bits (per symbol), is given by<br />
<br />
<br />
基于要通信的每个源符号的概率质量函数,以比特(每符号)为单位的香农熵{{math|''H''}}由下式给出:<br />
<br />
:<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
<br />
where {{math|''p<sub>i</sub>''}} is the probability of occurrence of the {{math|''i''}}-th possible value of the source symbol. This equation gives the entropy in the units of "bits" (per symbol) because it uses a logarithm of base 2, and this base-2 measure of entropy has sometimes been called the [[Shannon (unit)|shannon]] in his honor. Entropy is also commonly computed using the natural logarithm (base [[E (mathematical constant)|{{mvar|e}}]], where {{mvar|e}} is Euler's number), which produces a measurement of entropy in nats per symbol and sometimes simplifies the analysis by avoiding the need to include extra constants in the formulas. Other bases are also possible, but less commonly used. For example, a logarithm of base {{nowrap|1=2<sup>8</sup> = 256}} will produce a measurement in [[byte]]s per symbol, and a logarithm of base 10 will produce a measurement in decimal digits (or hartleys) per symbol.<br />
<br />
<br />
其中是源符号的第-个可能值出现的概率。这个方程给出了以“比特”(每个符号)为单位的熵,因为它使用了以2为底的对数,而这个以2为底的熵度量有时被称为香农(shannon) ,以纪念他。熵的计算也通常使用自然对数(基数 e (数学常数) | ,其中是欧拉数) ,它产生以每个符号的 nats 为单位的熵的测量,有时通过避免在公式中包含额外常数来简化分析。其他基地也是可能的,但不常用。例如,以为底的对数将产生以每个符号的字节为单位的测量值,以10为底的对数将产生以每个符号的十进制数字(或哈特利)为单位的测量值。<br />
<br />
<br />
<br />
<br />
<br />
Intuitively, the entropy {{math|''H<sub>X</sub>''}} of a discrete random variable {{math|''X''}} is a measure of the amount of ''uncertainty'' associated with the value of {{math|''X''}} when only its distribution is known.<br />
<br />
<br />
<br />
<br />
<br />
<br />
The entropy of a source that emits a sequence of {{math|''N''}} symbols that are [[independent and identically distributed]] (iid) is {{math|''N'' ⋅ ''H''}} bits (per message of {{math|''N''}} symbols). If the source data symbols are identically distributed but not independent, the entropy of a message of length {{math|''N''}} will be less than {{math|''N'' ⋅ ''H''}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px|The entropy of a [[Bernoulli trial]] as a function of success probability, often called the {{em|[[binary entropy function]]}}, {{math|''H''<sub>b</sub>(''p'')}}. The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
If one transmits 1000 bits (0s and 1s), and the value of each of these bits is known to the receiver (has a specific value with certainty) ahead of transmission, it is clear that no information is transmitted. If, however, each bit is independently equally likely to be 0 or 1, 1000 shannons of information (more often called bits) have been transmitted. Between these two extremes, information can be quantified as follows. If 𝕏 is the set of all messages {{math|{{mset|''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}}} that {{math|''X''}} could be, and {{math|''p''(''x'')}} is the probability of some <math>x \in \mathbb X</math>, then the entropy, {{math|''H''}}, of {{math|''X''}} is defined:<ref name = Reza>{{cite book | title = An Introduction to Information Theory | author = Fazlollah M. Reza | publisher = Dover Publications, Inc., New York | origyear = 1961| year = 1994 | isbn = 0-486-68210-2 | url = https://books.google.com/books?id=RtzpRAiX6OgC&pg=PA8&dq=intitle:%22An+Introduction+to+Information+Theory%22++%22entropy+of+a+simple+source%22}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
(Here, {{math|''I''(''x'')}} is the [[self-information]], which is the entropy contribution of an individual message, and {{math|𝔼<sub>''X''</sub>}} is the [[expected value]].) A property of entropy is that it is maximized when all the messages in the message space are equiprobable {{math|1=''p''(''x'') = 1/''n''}}; i.e., most unpredictable, in which case {{math|1=''H''(''X'') = log ''n''}}.<br />
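A small sketch of this maximization property (the two distributions are assumed examples): for {{math|1=''n'' = 8}} outcomes the uniform distribution attains the maximum {{math|1=log<sub>2</sub> 8 = 3}} bits, and any skew lowers the entropy.<br />
<syntaxhighlight lang="python">
import math

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 8
uniform = [1.0 / n] * n                      # equiprobable: most unpredictable
skewed  = [0.9] + [0.1 / (n - 1)] * (n - 1)  # one dominant outcome

print(entropy_bits(uniform))   # 3.0 bits = log2(8), the maximum for 8 outcomes
print(entropy_bits(skewed))    # ≈ 0.75 bits: far more predictable
</syntaxhighlight>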
<br />
<br />
<br />
<br />
<br />
<br />
The special case of information entropy for a random variable with two outcomes is the binary entropy function, usually taken to the logarithmic base 2, thus having the shannon (Sh) as unit:<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p).</math><br />
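Evaluating this function numerically (a minimal sketch; the sampled points are arbitrary) reproduces the shape of the plot above, with the maximum of 1 Sh at {{math|1=''p'' = 0.5}}:<br />
<syntaxhighlight lang="python">
import math

def H_b(p):
    """Binary entropy in shannons; H_b(0) = H_b(1) = 0 by the 0·log 0 convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"H_b({p}) = {H_b(p):.4f}")   # symmetric about 0.5 and maximal there (1 bit)
</syntaxhighlight>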
<br />
<br />
<br />
<br />
<br />
<br />
===Joint entropy===<br />
<br />
<br />
The {{em|[[joint entropy]]}} of two discrete random variables {{math|''X''}} and {{math|''Y''}} is merely the entropy of their pairing: {{math|(''X'', ''Y'')}}. This implies that if {{math|''X''}} and {{math|''Y''}} are [[statistical independence|independent]], then their joint entropy is the sum of their individual entropies.<br />
<br />
<br />
<br />
<br />
<br />
<br />
For example, if {{math|(''X'', ''Y'')}} represents the position of a chess piece — {{math|''X''}} the row and {{math|''Y''}} the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y) \,</math><br />
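The chess-piece example can be checked directly (a sketch assuming a single piece placed uniformly on an 8×8 board): row and column are independent, so the joint entropy of the position is the sum of their individual entropies.<br />
<syntaxhighlight lang="python">
import math
from collections import defaultdict

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example: a piece placed uniformly on an 8×8 board; X = row, Y = column.
joint = {(r, c): 1.0 / 64 for r in range(8) for c in range(8)}

p_x, p_y = defaultdict(float), defaultdict(float)
for (r, c), p in joint.items():
    p_x[r] += p
    p_y[c] += p

print(entropy_bits(p_x.values()))    # 3.0 bits for the row
print(entropy_bits(p_y.values()))    # 3.0 bits for the column
print(entropy_bits(joint.values()))  # 6.0 bits for the position: the sum, by independence
</syntaxhighlight>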
<br />
<br />
<br />
<br />
<br />
<br />
Despite similar notation, joint entropy should not be confused with {{em|[[cross entropy]]}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Conditional entropy (equivocation)===<br />
<br />
<br />
The {{em|[[conditional entropy]]}} or ''conditional uncertainty'' of {{math|''X''}} given random variable {{math|''Y''}} (also called the ''equivocation'' of {{math|''X''}} about {{math|''Y''}}) is the average conditional entropy over {{math|''Y''}}:<ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log p(x|y).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:<br />
<br />
<br />
<br />
<br />
<br />
<br />
: <math> H(X|Y) = H(X,Y) - H(Y) .\,</math><br />
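Both the defining sum and this chain-rule identity can be evaluated on a toy joint distribution (an assumed example) to confirm that they agree:<br />
<syntaxhighlight lang="python">
import math
from collections import defaultdict

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example: a dependent joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_y[y] += p

# Defining sum: H(X|Y) = -Σ p(x,y) log2 p(x|y), with p(x|y) = p(x,y)/p(y).
H_X_given_Y = -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items())

# Chain-rule form: H(X|Y) = H(X,Y) - H(Y).
print(H_X_given_Y)                                                # ≈ 0.7219 bits
print(entropy_bits(joint.values()) - entropy_bits(p_y.values()))  # ≈ 0.7219 bits
</syntaxhighlight>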
<br />
<br />
<br />
<br />
<br />
<br />
===Mutual information (transinformation)===<br />
<br />
<br />
''[[Mutual information]]'' measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of {{math|''X''}} relative to {{math|''Y''}} is given by:<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math><br />
<br />
<br />
where {{math|SI}} (''S''pecific mutual ''I''nformation) is the [[pointwise mutual information]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
A basic property of the mutual information is that<br />
<br />
<br />
: <math>I(X;Y) = H(X) - H(X|Y).\,</math><br />
<br />
<br />
That is, knowing ''Y'', we can save an average of {{math|''I''(''X''; ''Y'')}} bits in encoding ''X'' compared to not knowing ''Y''.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information is [[symmetric function|symmetric]]:<br />
<br />
<br />
: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y).\,</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the [[posterior probability|posterior probability distribution]] of ''X'' given the value of ''Y'' and the [[prior probability|prior distribution]] on ''X'':<br />
<br />
<br />
: <math>I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )].</math><br />
<br />
<br />
In other words, this is a measure of how much, on the average, the probability distribution on ''X'' will change if we are given the value of ''Y''. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:<br />
<br />
<br />
: <math>I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y)).</math><br />
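A sketch on the same kind of toy joint distribution (an assumed example) confirms that the defining sum, the entropy identity, and this divergence formulation all give the same value:<br />
<syntaxhighlight lang="python">
import math
from collections import defaultdict

def entropy_bits(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}   # assumed example

p_x, p_y = defaultdict(float), defaultdict(float)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

# I(X;Y) as D_KL( p(X,Y) || p(X)p(Y) ), directly from the definition.
I_kl = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items())

# I(X;Y) = H(X) + H(Y) - H(X,Y).
I_h = (entropy_bits(p_x.values()) + entropy_bits(p_y.values())
       - entropy_bits(joint.values()))

print(I_kl, I_h)   # both ≈ 0.2781 bits
</syntaxhighlight>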
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information is closely related to the [[likelihood-ratio test|log-likelihood ratio test]] in the context of contingency tables and the [[multinomial distribution]] and to [[Pearson's chi-squared test|Pearson's χ<sup>2</sup> test]]: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Kullback–Leibler divergence (information gain)===<br />
<br />
<br />
The ''[[Kullback–Leibler divergence]]'' (or ''information divergence'', ''information gain'', or ''relative entropy'') is a way of comparing two distributions: a "true" [[probability distribution]] ''p(X)'', and an arbitrary probability distribution ''q(X)''. If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression. It is thus defined<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)} = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Although it is sometimes used as a 'distance metric', KL divergence is not a true [[Metric (mathematics)|metric]] since it is not symmetric and does not satisfy the [[triangle inequality]] (making it a semi-quasimetric).<br />
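The asymmetry is easy to exhibit numerically (a minimal sketch; the two distributions are assumed examples):<br />
<syntaxhighlight lang="python">
import math

def kl_bits(p, q):
    """D_KL(p || q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]   # the "true" distribution
q = [0.9, 0.1]   # a mismatched model of it

print(kl_bits(p, q))   # ≈ 0.737 extra bits per datum when compressing p with a code for q
print(kl_bits(q, p))   # ≈ 0.531 bits: swapping the arguments changes the value
</syntaxhighlight>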
<br />
<br />
<br />
<br />
<br />
<br />
Another interpretation of the KL divergence is the "unnecessary surprise" introduced by a prior from the truth: suppose a number ''X'' is about to be drawn randomly from a discrete set with probability distribution ''p(x)''. If Alice knows the true distribution ''p(x)'', while Bob believes (has a [[prior probability|prior]]) that the distribution is ''q(x)'', then Bob will be more [[Information content|surprised]] than Alice, on average, upon seeing the value of ''X''. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the ''log'' is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other quantities===<br />
<br />
<br />
Other important information theoretic quantities include [[Rényi entropy]] (a generalization of entropy), [[differential entropy]] (a generalization of quantities of information to continuous distributions), and the [[conditional mutual information]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Coding theory==<br />
<br />
<br />
{{Main|Coding theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|A picture showing scratches on the readable surface of a CD-R. Music and data CDs are coded using error correcting codes and thus can still be read even if they have minor scratches using [[error detection and correction]].]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is one of the most important and direct applications of information theory. It can be subdivided into [[data compression|source coding]] theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.<br />
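As an illustration of source coding approaching this entropy bound, the following sketch (an illustrative implementation, not code from the article's sources) builds a [[Huffman coding|Huffman code]] and compares its average code length with the source entropy; for the assumed dyadic distribution the two coincide exactly.<br />
<syntaxhighlight lang="python">
import heapq, math

def huffman_lengths(probs):
    """Code-word lengths of a Huffman code for the given symbol probabilities."""
    # Heap entries: (probability, tie-break id, list of (symbol, depth) pairs).
    heap = [(p, i, [(i, 0)]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, counter, [(s, d + 1) for s, d in t1 + t2]))
        counter += 1
    return dict(heap[0][2])

probs = [0.5, 0.25, 0.125, 0.125]                  # assumed dyadic source
lengths = huffman_lengths(probs)
avg = sum(probs[s] * l for s, l in lengths.items())
H = -sum(p * math.log2(p) for p in probs)
print(avg, H)   # 1.75 and 1.75 bits/symbol: the code meets the entropy bound here
</syntaxhighlight>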
<br />
<br />
<br />
<br />
<br />
<br />
* Data compression (source coding): There are two formulations for the compression problem:<br />
<br />
<br />
<br />
*[[lossless data compression]]: the data must be reconstructed exactly;<br />
<br />
<br />
<br />
*[[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called ''[[rate–distortion theory]]''.<br />
<br />
<br />
<br />
* Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the [[broadcast channel]]) or intermediary "helpers" (the [[relay channel]]), or more general [[computer network|networks]], compression followed by transmission may no longer be optimal. [[Network information theory]] refers to these multi-agent communication models.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Source theory===<br />
<br />
<br />
Any process that generates successive messages can be considered a {{em|[[Communication source|source]]}} of information. A memoryless source is one in which each message is an [[Independent identically distributed random variables|independent identically distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose less restrictive constraints. All such sources are [[stochastic process|stochastic]]. These terms are well studied in their own right outside information theory.<br />
<br />
<br />
<br />
<br />
<br />
<br />
====Rate====<!-- This section is linked from [[Channel capacity]] --><br />
<br />
<br />
Information ''[[Entropy rate|rate]]'' is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the ''average rate'' is<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
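For a concrete stationary source with memory, consider a two-state Markov chain (the transition probabilities are assumed for illustration); its entropy rate is the conditional entropy of the next symbol given the current state, and it falls below the entropy of the marginal symbol distribution:<br />
<syntaxhighlight lang="python">
import math

def H_b(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Assumed two-state Markov source: Pr(1|0) = a, Pr(0|1) = b.
a, b = 0.1, 0.3
pi0, pi1 = b / (a + b), a / (a + b)   # stationary distribution (b, a)/(a + b)

r = pi0 * H_b(a) + pi1 * H_b(b)       # entropy rate: E[ H(next | current state) ]
print(r)           # ≈ 0.572 bits per symbol
print(H_b(pi1))    # ≈ 0.811 bits: marginal entropy, which ignores the memory
</syntaxhighlight>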
<br />
<br />
<br />
<br />
<br />
<br />
It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of {{em|source coding}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Channel capacity===<br />
<br />
<br />
{{Main|Channel capacity}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Communications over a channel—such as an [[ethernet]] cable—is the primary motivation of information theory. However, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Consider the communications process over a discrete channel. A simple model of the process is shown below:<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:Channel model.svg|center|800px|Channel model]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
Here ''X'' represents the space of messages transmitted, and ''Y'' the space of messages received during a unit time over our channel. Let {{math|''p''(''y''{{pipe}}''x'')}} be the [[conditional probability]] distribution function of ''Y'' given ''X''. We will consider {{math|''p''(''y''{{pipe}}''x'')}} to be an inherent fixed property of our communications channel (representing the nature of the ''[[Signal noise|noise]]'' of our channel). Then the joint distribution of ''X'' and ''Y'' is completely determined by our channel and by our choice of {{math|''f''(''x'')}}, the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the ''[[Signal (electrical engineering)|signal]]'', we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the {{em|channel capacity}} and is given by:<br />
<br />
<br />
:<math> C = \max_{f} I(X;Y).\! </math><br />
<br />
<br />
This capacity has the following property related to communicating at information rate ''R'' (where ''R'' is usually bits per symbol). For any information rate ''R < C'' and coding error ε > 0, for large enough ''N'', there exists a code of length ''N'' and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate ''R > C'', it is impossible to transmit with arbitrarily small block error.<br />
<br />
<br />
<br />
<br />
<br />
<br />
''[[Channel code|Channel coding]]'' is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.<br />
<br />
<br />
<br />
<br />
<br />
<br />
====Capacity of particular channel models====<br />
<br />
<br />
* A continuous-time analog communications channel subject to [[Gaussian noise]] — see [[Shannon–Hartley theorem]].<br />
<br />
<br />
<br />
* A [[binary symmetric channel]] (BSC) with crossover probability ''p'' is a binary input, binary output channel that flips the input bit with probability ''p''. The BSC has a capacity of {{math|1 &minus; ''H''<sub>b</sub>(''p'')}} bits per channel use, where {{math|''H''<sub>b</sub>}} is the binary entropy function to the base-2 logarithm:<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary symmetric channel.svg]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
* A [[binary erasure channel]] (BEC) with erasure probability ''p'' is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is {{nowrap|1 &minus; ''p''}} bits per channel use. Both this and the BSC capacity are evaluated numerically in the sketch after this list.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary erasure channel.svg]]<br />
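A minimal Python sketch evaluating the two capacity formulas above (the crossover and erasure probabilities are assumed examples):<br />
<syntaxhighlight lang="python">
import math

def H_b(p):
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """1 - H_b(p): capacity of the binary symmetric channel."""
    return 1.0 - H_b(p)

def bec_capacity(p):
    """1 - p: capacity of the binary erasure channel."""
    return 1.0 - p

for p in (0.0, 0.05, 0.11, 0.5):
    print(f"p = {p}: BSC {bsc_capacity(p):.3f}, BEC {bec_capacity(p):.3f} bits/use")
# At p = 0.5 the BSC output is independent of its input (capacity 0),
# while the BEC still delivers half of the inputs intact.
</syntaxhighlight>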
<br />
<br />
<br />
<br />
<br />
<br />
==Applications to other fields==<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Intelligence uses and secrecy applications===<br />
<br />
<br />
Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the [[Ban (unit)|ban]], was used in the [[Ultra]] project, breaking the German [[Enigma machine]] code and hastening the [[Victory in Europe Day|end of World War II in Europe]]. Shannon himself defined an important concept now called the [[unicity distance]]. Based on the redundancy of the [[plaintext]], it attempts to give a minimum amount of [[ciphertext]] necessary to ensure unique decipherability.<br />
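As a rough worked example of the unicity distance (the figures below are common textbook assumptions, not values from this article's sources): for a simple substitution cipher on English plaintext, the distance is approximately the key entropy divided by the per-character redundancy.<br />
<syntaxhighlight lang="python">
import math

# Assumed figures: a simple substitution cipher has 26! keys, and English text
# is commonly estimated at about 1.5 bits of entropy per character.
key_entropy = math.log2(math.factorial(26))   # ≈ 88.4 bits
redundancy  = math.log2(26) - 1.5             # ≈ 3.2 bits of redundancy per character

print(key_entropy / redundancy)   # ≈ 28 ciphertext characters suffice, in principle
</syntaxhighlight>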
<br />
<br />
<br />
<br />
<br />
<br />
Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A [[brute force attack]] can break systems based on [[public-key cryptography|asymmetric key algorithms]] or on most commonly used methods of [[symmetric-key algorithm|symmetric key algorithms]] (sometimes called secret key algorithms), such as [[block cipher]]s. The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[Information theoretic security]] refers to methods such as the [[one-time pad]] that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the [[key (cryptography)|key]]) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the [[Venona project]] was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Pseudorandom number generation===<br />
<br />
<br />
[[Pseudorandom number generator]]s are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed [[cryptographically secure pseudorandom number generator]]s, but even they require [[random seed]]s external to the software to work as intended. These can be obtained via [[Extractor (mathematics)|extractors]], if done carefully. The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.<br />
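The gap between Shannon entropy and min-entropy mentioned above can be made concrete (a sketch with an assumed distribution): a source may look rich by the Shannon measure yet offer an extractor only one bit of guessing resistance.<br />
<syntaxhighlight lang="python">
import math

def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """H_min = -log2(max p): the worst-case measure relevant to extractors."""
    return -math.log2(max(probs))

# Assumed example: half the mass on one outcome, the rest spread over 1024 others.
n = 1024
probs = [0.5] + [0.5 / n] * n

print(shannon_entropy(probs))   # 6.0 bits by the Shannon measure
print(min_entropy(probs))       # 1.0 bit: a guesser still succeeds half the time
</syntaxhighlight>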
<br />
<br />
<br />
<br />
<br />
<br />
===Seismic exploration===<br />
<br />
<br />
One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and [[digital signal processing]] offer a major improvement of resolution and image clarity over previous analog methods.<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
===Semiotics===<br />
<br />
<br />
[[Semiotics|Semioticians]] [[:nl:Doede Nauta|Doede Nauta]] and [[Winfried Nöth]] both considered [[Charles Sanders Peirce]] as having created a theory of information in his works on semiotics.<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref>{{rp|171}}<ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref>{{rp|137}} Nauta defined semiotic information theory as the study of "the internal processes of coding, filtering, and information processing."<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
Concepts from information theory such as redundancy and code control have been used by semioticians such as [[Umberto Eco]] and [[:it:Ferruccio Rossi-Landi|Ferruccio Rossi-Landi]] to explain ideology as a form of message transmission whereby a dominant social class emits its message by using signs that exhibit a high degree of redundancy such that only one message is decoded among a selection of competing ones.<ref>Nöth, Winfried (1981). "[https://kobra.uni-kassel.de/bitstream/handle/123456789/2014122246977/semi_2004_002.pdf?sequence=1&isAllowed=y Semiotics of ideology]". ''Semiotica'', Issue 148.</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
===Miscellaneous applications===<br />
<br />
<br />
Information theory also has applications in [[Gambling and information theory]], [[black hole information paradox|black holes]], and [[bioinformatics]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
==See also==<br />
<br />
<br />
{{Portal|Mathematics}}<br />
<br />
<br />
<br />
* [[Algorithmic probability]]<br />
<br />
<br />
<br />
* [[Bayesian inference]]<br />
<br />
<br />
<br />
* [[Communication theory]]<br />
<br />
<br />
<br />
* [[Constructor theory]] - a generalization of information theory that includes quantum information<br />
<br />
<br />
<br />
* [[Inductive probability]]<br />
<br />
<br />
<br />
* [[Info-metrics]]<br />
<br />
<br />
<br />
* [[Minimum message length]]<br />
<br />
<br />
<br />
* [[Minimum description length]]<br />
<br />
<br />
<br />
* [[List of important publications in theoretical computer science#Information theory|List of important publications]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Applications===<br />
<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Active networking]]<br />
<br />
<br />
<br />
* [[Cryptanalysis]]<br />
<br />
<br />
<br />
* [[Cryptography]]<br />
<br />
<br />
<br />
* [[Cybernetics]]<br />
<br />
<br />
<br />
* [[Entropy in thermodynamics and information theory]]<br />
<br />
<br />
<br />
* [[Gambling]]<br />
<br />
<br />
<br />
* [[Intelligence (information gathering)]]<br />
<br />
<br />
<br />
* [[reflection seismology|Seismic exploration]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===History===<br />
<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ralph Hartley|Hartley, R.V.L.]]<br />
<br />
<br />
<br />
* [[History of information theory]]<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]]<br />
<br />
<br />
<br />
* [[Timeline of information theory]]<br />
<br />
<br />
<br />
* [[Hubert Yockey|Yockey, H.P.]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Theory===<br />
<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Coding theory]]<br />
<br />
<br />
<br />
* [[Detection theory]]<br />
<br />
<br />
<br />
* [[Estimation theory]]<br />
<br />
<br />
<br />
* [[Fisher information]]<br />
<br />
<br />
<br />
* [[Information algebra]]<br />
<br />
<br />
<br />
* [[Information asymmetry]]<br />
<br />
<br />
<br />
* [[Information field theory]]<br />
<br />
<br />
<br />
* [[Information geometry]]<br />
<br />
<br />
<br />
* [[Information theory and measure theory]]<br />
<br />
<br />
<br />
* [[Kolmogorov complexity]]<br />
<br />
<br />
<br />
* [[List of unsolved problems in information theory]]<br />
<br />
<br />
<br />
* [[Logic of information]]<br />
<br />
<br />
<br />
* [[Network coding]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
* [[Quantum information science]]<br />
<br />
<br />
<br />
* [[Source coding]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Concepts===<br />
<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ban (unit)]]<br />
<br />
<br />
<br />
* [[Channel capacity]]<br />
<br />
<br />
<br />
* [[Communication channel]]<br />
<br />
<br />
<br />
* [[Communication source]]<br />
<br />
<br />
<br />
* [[Conditional entropy]]<br />
<br />
<br />
<br />
* [[Covert channel]]<br />
<br />
<br />
<br />
* [[Data compression]]<br />
<br />
<br />
<br />
* Decoder<br />
<br />
<br />
<br />
* [[Differential entropy]]<br />
<br />
<br />
<br />
* [[Fungible information]]<br />
<br />
<br />
<br />
* [[Information fluctuation complexity]]<br />
<br />
<br />
<br />
* [[Information entropy]]<br />
<br />
<br />
<br />
* [[Joint entropy]]<br />
<br />
<br />
<br />
* [[Kullback–Leibler divergence]]<br />
<br />
<br />
<br />
* [[Mutual information]]<br />
<br />
<br />
<br />
* [[Pointwise mutual information]] (PMI)<br />
<br />
<br />
<br />
* [[Receiver (information theory)]]<br />
<br />
<br />
<br />
* [[Redundancy (information theory)|Redundancy]]<br />
<br />
<br />
<br />
* [[Rényi entropy]]<br />
<br />
<br />
<br />
* [[Self-information]]<br />
<br />
<br />
<br />
* [[Unicity distance]]<br />
<br />
<br />
<br />
* [[Variety (cybernetics)|Variety]]<br />
<br />
<br />
<br />
* [[Hamming distance]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==References==<br />
<br />
<br />
{{Reflist}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===The classic work===<br />
<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]] (1948), "[[A Mathematical Theory of Communication]]", ''Bell System Technical Journal'', 27, pp.&nbsp;379–423 & 623–656, July & October, 1948. [http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf PDF.] <br />[http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html Notes and other formats.]<br />
<br />
<br />
<br />
* R.V.L. Hartley, [http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf "Transmission of Information"], ''Bell System Technical Journal'', July 1928<br />
<br />
<br />
<br />
* [[Andrey Kolmogorov]] (1968), "[https://www.tandfonline.com/doi/pdf/10.1080/00207166808803030 Three approaches to the quantitative definition of information]" in International Journal of Computer Mathematics.<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other journal articles===<br />
<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* J. L. Kelly, Jr., [http://betbubbles.com/wp-content/uploads/2017/07/kelly.pdf Betbubbles.com]{{Dead link|date=January 2020 |bot=InternetArchiveBot |fix-attempted=yes }}, "A New Interpretation of Information Rate" ''Bell System Technical Journal'', Vol. 35, July 1956, pp.&nbsp;917–26.<br />
<br />
<br />
<br />
* R. Landauer, [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=615478 IEEE.org], "Information is Physical" ''Proc. Workshop on Physics and Computation PhysComp'92'' (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp.&nbsp;1–4.<br />
<br />
<br />
<br />
* {{cite journal | last1 = Landauer | first1 = R. | year = 1961 | title = Irreversibility and Heat Generation in the Computing Process | url = http://www.research.ibm.com/journal/rd/441/landauerii.pdf | journal = IBM J. Res. Dev. | volume = 5 | issue = 3| pages = 183–191 | doi = 10.1147/rd.53.0183 }}<br />
<br />
<br />
<br />
* {{cite arXiv |last=Timme |first=Nicholas|last2=Alford |first2=Wesley|last3=Flecker |first3=Benjamin|last4=Beggs |first4=John M.|date=2012 |title=Multivariate information measures: an experimentalist's perspective |eprint=1111.6857|class=cs.IT}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Textbooks on information theory===<br />
<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Arndt, C. ''Information Measures, Information and its Description in Science and Engineering'' (Springer Series: Signals and Communication Technology), 2004, {{isbn|978-3-540-40855-0}}<br />
<br />
<br />
<br />
* Ash, RB. ''Information Theory''. New York: Interscience, 1965. {{isbn|0-470-03445-9}}. New York: Dover 1990. {{isbn|0-486-66521-6}}<br />
<br />
<br />
<br />
* [[Gallager, R]]. ''Information Theory and Reliable Communication.'' New York: John Wiley and Sons, 1968. {{isbn|0-471-29048-3}}<br />
<br />
<br />
<br />
* Goldman, S. ''Information Theory''. New York: Prentice Hall, 1953. New York: Dover 1968 {{isbn|0-486-62209-6}}, 2005 {{isbn|0-486-44271-3}}<br />
<br />
<br />
<br />
* {{cite book |last1=Cover |first1=Thomas |author-link1=Thomas M. Cover |last2=Thomas |first2=Joy A. |title=Elements of information theory |edition=2nd |location=New York |publisher=[[Wiley-Interscience]] |date=2006 |isbn=0-471-24195-4}}<br />
<br />
<br />
<br />
* [[Csiszar, I]], Korner, J. ''Information Theory: Coding Theorems for Discrete Memoryless Systems'' Akademiai Kiado: 2nd edition, 1997. {{isbn|963-05-7440-3}}<br />
<br />
<br />
<br />
* [[David J. C. MacKay|MacKay, David J. C.]]. ''[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]'' Cambridge: Cambridge University Press, 2003. {{isbn|0-521-64298-1}}<br />
<br />
<br />
<br />
* Mansuripur, M. ''Introduction to Information Theory''. New York: Prentice Hall, 1987. {{isbn|0-13-484668-0}}<br />
<br />
<br />
<br />
* [[Robert McEliece|McEliece, R]]. ''The Theory of Information and Coding''. Cambridge, 2002. {{isbn|978-0521831857}}<br />
<br />
<br />
<br />
* Pierce, JR. ''An Introduction to Information Theory: Symbols, Signals and Noise''. Dover (2nd edition), 1961 (reprinted by Dover, 1980).<br />
<br />
<br />
<br />
* [[Reza, F]]. ''An Introduction to Information Theory''. New York: McGraw-Hill 1961. New York: Dover 1994. {{isbn|0-486-68210-2}}<br />
<br />
<br />
<br />
* {{cite book |last1=Shannon |first1=Claude |author-link1=Claude Shannon |last2=Weaver |first2=Warren |author-link2=Warren Weaver |date=1949 |title=The Mathematical Theory of Communication |url=http://monoskop.org/images/b/be/Shannon_Claude_E_Weaver_Warren_The_Mathematical_Theory_of_Communication_1963.pdf |location=[[Urbana, Illinois]] |publisher=[[University of Illinois Press]] |lccn=49-11922 |isbn=0-252-72548-4}}<br />
<br />
<br />
<br />
* Stone, JV. Chapter 1 of book [http://jim-stone.staff.shef.ac.uk/BookInfoTheory/InfoTheoryBookMain.html "Information Theory: A Tutorial Introduction"], University of Sheffield, England, 2014. {{isbn|978-0956372857}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book/ A First Course in Information Theory]'' Kluwer Academic/Plenum Publishers, 2002. {{isbn|0-306-46791-7}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book2/ Information Theory and Network Coding]'' Springer 2008, 2002. {{isbn|978-0-387-79233-0}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other books===<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Leon Brillouin, ''Science and Information Theory'', Mineola, N.Y.: Dover, [1956, 1962] 2004. {{isbn|0-486-43918-6}}<br />
<br />
<br />
<br />
* [[James Gleick]], ''[[The Information: A History, a Theory, a Flood]]'', New York: Pantheon, 2011. {{isbn|978-0-375-42372-7}}<br />
<br />
<br />
<br />
* A. I. Khinchin, ''Mathematical Foundations of Information Theory'', New York: Dover, 1957. {{isbn|0-486-60434-9}}<br />
<br />
<br />
<br />
* H. S. Leff and A. F. Rex, Editors, ''Maxwell's Demon: Entropy, Information, Computing'', Princeton University Press, Princeton, New Jersey (1990). {{isbn|0-691-08727-X}}<br />
<br />
<br />
<br />
* [[Robert K. Logan]]. ''What is Information? - Propagating Organization in the Biosphere, the Symbolosphere, the Technosphere and the Econosphere'', Toronto: DEMO Publishing.<br />
<br />
<br />
<br />
* Tom Siegfried, ''The Bit and the Pendulum'', Wiley, 2000. {{isbn|0-471-32174-5}}<br />
<br />
<br />
<br />
* Charles Seife, ''[[Decoding the Universe]]'', Viking, 2006. {{isbn|0-670-03441-X}}<br />
<br />
<br />
<br />
* Jeremy Campbell, ''[[Grammatical Man]]'', Touchstone/Simon & Schuster, 1982, {{isbn|0-671-44062-4}}<br />
<br />
<br />
<br />
* Henri Theil, ''Economics and Information Theory'', Rand McNally & Company - Chicago, 1967.<br />
<br />
<br />
<br />
* Escolano, Suau, Bonev, ''[https://www.springer.com/computer/image+processing/book/978-1-84882-296-2 Information Theory in Computer Vision and Pattern Recognition]'', Springer, 2009. {{isbn|978-1-84882-296-2}}<br />
<br />
<br />
<br />
* Vlatko Vedral, ''Decoding Reality: The Universe as Quantum Information'', Oxford University Press 2010. {{ISBN|0-19-923769-7}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===MOOC on information theory===<br />
<br />
* Raymond W. Yeung, "[http://www.inc.cuhk.edu.hk/InformationTheory/index.html Information Theory]" ([[The Chinese University of Hong Kong]])<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==External links==<br />
<br />
{{Wikiquote}}<br />
<br />
<br />
<br />
{{Library resources box}}<br />
<br />
<br />
<br />
* {{SpringerEOM |title=Information |id=p/i051040}}<br />
<br />
<br />
<br />
* Lambert F. L. (1999), "[http://jchemed.chem.wisc.edu/Journal/Issues/1999/Oct/abs1385.html Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!]", ''Journal of Chemical Education''<br />
<br />
<br />
<br />
* [http://www.itsoc.org/ IEEE Information Theory Society] and [https://www.itsoc.org/resources/surveys ITSOC Monographs, Surveys, and Reviews]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Cybernetics}}<br />
<br />
<br />
<br />
{{Compression methods}}<br />
<br />
<br />
<br />
{{Areas of mathematics}}<br />
<br />
<br />
<br />
{{Computer science}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Authority control}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{DEFAULTSORT:Information Theory}}<br />
<br />
<br />
<br />
[[Category:Information theory| ]]<br />
<br />
<br />
<br />
[[Category:Computer science]]<br />
<br />
[[Category:Cybernetics]]<br />
<br />
[[Category:Formal sciences]]<br />
<br />
[[Category:Information Age]]<br />
<br />
<noinclude><br />
<br />
<small>This page was moved from [[wikipedia:en:Information theory]]. Its edit history can be viewed at [[信息论/edithistory]]</small></noinclude><br />
<br />
[[Category:待整理页面]]</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E7%94%A8%E6%88%B7:Pjhhh&diff=10923用户:Pjhhh2020-07-20T07:04:34Z<p>Pjhhh:</p>
<hr />
<div>== '''Hi, I am 瑾晗 (Jinhan)''' ==<br />
<br />
*'''Gender:''' Male<br />
*'''Currently studying:''' Graduate student at the College of Air Traffic Management, Civil Aviation University of China, where I also completed my undergraduate degree;<br />
*'''Main research interests:''' Air traffic flow management, transportation networks, complex transportation networks, air transport network resilience, and related topics;<br />
*'''Hobbies:''' Long-distance running, cycling, and hiking; wandering aimlessly around unfamiliar places; cooking dishes I have never tried before;<br />
*'''Contact:''' mail: 2019031013@cauc.edu.cn</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E4%BF%A1%E6%81%AF%E8%AE%BA_Information_theory&diff=10913信息论 Information theory2020-07-20T06:10:02Z<p>Pjhhh:</p>
<hr />
<div>This entry was initially machine-translated by the Caiyun translation service (彩云小译) and has not yet been manually edited or proofread; we apologize for any inconvenience in reading.{{distinguish|Information science}}<br />
<br />
<br />
<br />
{{Information theory}}<br />
<br />
<br />
<br />
'''Information theory''' studies the [[quantification (science)|quantification]], [[computer data storage|storage]], and [[telecommunication|communication]] of [[information]]. It was originally proposed by [[Claude Shannon]] in 1948 to find fundamental limits on [[signal processing]] and communication operations such as [[data compression]], in a landmark paper titled "[[A Mathematical Theory of Communication]]". Its impact has been crucial to the success of the [[Voyager program|Voyager]] missions to deep space, the invention of the [[compact disc]], the feasibility of mobile phones, the development of the Internet, the study of [[linguistics]] and of human perception, the understanding of [[black hole]]s, and numerous other fields.<br />
<br />
<br />
<br />
<br />
<br />
<br />
The field is at the intersection of mathematics, [[statistics]], computer science, physics, [[Neuroscience|neurobiology]], [[information engineering (field)|information engineering]], and electrical engineering. The theory has also found applications in other areas, including [[statistical inference]], [[natural language processing]], [[cryptography]], [[neurobiology]],<ref name="Spikes">{{cite book|title=Spikes: Exploring the Neural Code|author1=F. Rieke|author2=D. Warland|author3=R Ruyter van Steveninck|author4=W Bialek|publisher=The MIT press|year=1997|isbn=978-0262681087}}</ref> [[human vision]],<ref>{{Cite journal|last=Delgado-Bonal|first=Alfonso|last2=Martín-Torres|first2=Javier|date=2016-11-03|title=Human vision is determined based on information theory|journal=Scientific Reports|language=En|volume=6|issue=1|pages=36038|bibcode=2016NatSR...636038D|doi=10.1038/srep36038|issn=2045-2322|pmc=5093619|pmid=27808236}}</ref> the evolution<ref>{{cite journal|last1=cf|last2=Huelsenbeck|first2=J. P.|last3=Ronquist|first3=F.|last4=Nielsen|first4=R.|last5=Bollback|first5=J. P.|year=2001|title=Bayesian inference of phylogeny and its impact on evolutionary biology|url=|journal=Science|volume=294|issue=5550|pages=2310–2314|bibcode=2001Sci...294.2310H|doi=10.1126/science.1065889|pmid=11743192}}</ref> and function<ref>{{cite journal|last1=Allikmets|first1=Rando|last2=Wasserman|first2=Wyeth W.|last3=Hutchinson|first3=Amy|last4=Smallwood|first4=Philip|last5=Nathans|first5=Jeremy|last6=Rogan|first6=Peter K.|year=1998|title=Thomas D. Schneider], Michael Dean (1998) Organization of the ABCR gene: analysis of promoter and splice junction sequences|url=http://alum.mit.edu/www/toms/|journal=Gene|volume=215|issue=1|pages=111–122|doi=10.1016/s0378-1119(98)00269-8|pmid=9666097}}</ref> of molecular codes ([[bioinformatics]]), [[model selection]] in statistics,<ref>Burnham, K. P. and Anderson D. R. (2002) ''Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition'' (Springer Science, New York) {{ISBN|978-0-387-95364-9}}.</ref> [[thermal physics]],<ref>{{cite journal|last1=Jaynes|first1=E. T.|year=1957|title=Information Theory and Statistical Mechanics|url=http://bayes.wustl.edu/|journal=Phys. Rev.|volume=106|issue=4|page=620|bibcode=1957PhRv..106..620J|doi=10.1103/physrev.106.620}}</ref> [[quantum computing]], linguistics, [[plagiarism detection]],<ref>{{cite journal|last1=Bennett|first1=Charles H.|last2=Li|first2=Ming|last3=Ma|first3=Bin|year=2003|title=Chain Letters and Evolutionary Histories|url=http://sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|journal=Scientific American|volume=288|issue=6|pages=76–81|bibcode=2003SciAm.288f..76B|doi=10.1038/scientificamerican0603-76|pmid=12764940|access-date=2008-03-11|archive-url=https://web.archive.org/web/20071007041539/http://www.sciamdigital.com/index.cfm?fa=Products.ViewIssuePreview&ARTICLEID_CHAR=08B64096-0772-4904-9D48227D5C9FAC75|archive-date=2007-10-07|url-status=dead}}</ref> [[pattern recognition]], and [[anomaly detection]].<ref>{{Cite web|url=http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|title=Some background on why people in the empirical sciences may want to better understand the information-theoretic methods|author=David R. Anderson|date=November 1, 2003|archiveurl=https://web.archive.org/web/20110723045720/http://aicanderson2.home.comcast.net/~aicanderson2/home.pdf|archivedate=July 23, 2011|url-status=dead|accessdate=2010-06-23}}<br />
<br />
<br />
</ref> Important sub-fields of information theory include [[source coding]], [[algorithmic complexity theory]], [[algorithmic information theory]], [[information-theoretic security]], [[Grey system theory]] and measures of information.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Applications of fundamental topics of information theory include [[lossless data compression]] (e.g. [[ZIP (file format)|ZIP files]]), [[lossy data compression]] (e.g. [[MP3]]s and [[JPEG]]s), and [[channel capacity|channel coding]] (e.g. for [[digital subscriber line|DSL]]). Information theory is used in [[information retrieval]], [[intelligence (information gathering)|intelligence gathering]], gambling, and even in musical composition.<br />
<br />
<br />
<br />
<br />
<br />
<br />
A key measure in information theory is [[information entropy|entropy]]. Entropy quantifies the amount of uncertainty involved in the value of a [[random variable]] or the outcome of a [[random process]]. For example, identifying the outcome of a fair [[coin flip]] (with two equally likely outcomes) provides less information (lower entropy) than specifying the outcome from a roll of a {{dice}} (with six equally likely outcomes). Some other important measures in information theory are [[mutual information]], channel capacity, [[error exponent]]s, and [[relative entropy]].<br />
<br />
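To make the comparison concrete, here is a worked check under the stated assumption of equally likely outcomes and base-2 logarithms:<br />
<br />
:<math>H_{\text{coin}} = \log_2 2 = 1 \text{ bit}, \qquad H_{\text{die}} = \log_2 6 \approx 2.585 \text{ bits}.</math><br />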
<br />
<br />
<br />
<br />
<br />
==Overview==<br />
<br />
<br />
<br />
<br />
<br />
<br />
Information theory studies the transmission, processing, extraction, and utilization of information. Abstractly, information can be thought of as the resolution of uncertainty. In the case of communication of information over a noisy channel, this abstract concept was made concrete in 1948 by Claude Shannon in his paper "A Mathematical Theory of Communication", in which "information" is thought of as a set of possible messages, where the goal is to send these messages over a noisy channel, and then to have the receiver reconstruct the message with low probability of error, in spite of the channel noise. Shannon's main result, the [[noisy-channel coding theorem]] showed that, in the limit of many channel uses, the rate of information that is asymptotically achievable is equal to the channel capacity, a quantity dependent merely on the statistics of the channel over which the messages are sent.<ref name="Spikes" /><br />
<br />
<br />
<br />
<br />
<br />
Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of [[Rubric (academic)|rubrics]] throughout the world over the past half century or more: [[adaptive system]]s, [[anticipatory system]]s, [[artificial intelligence]], [[complex system]]s, [[complexity science]], [[cybernetics]], [[Informatics (academic field)|informatics]], [[machine learning]], along with [[systems science]]s of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of [[coding theory]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is concerned with finding explicit methods, called ''codes'', for increasing the efficiency and reducing the error rate of data communication over noisy channels to near the channel capacity. These codes can be roughly subdivided into data compression (source coding) and [[error-correction]] (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible.<br />
<br />
<br />
<br />
<br />
<br />
<br />
A third class of information theory codes are cryptographic algorithms (both [[code (cryptography)|code]]s and [[cipher]]s). Concepts, methods and results from coding theory and information theory are widely used in cryptography and [[cryptanalysis]]. ''See the article [[ban (unit)]] for a historical application.''<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Historical background==<br />
<br />
<br />
{{Main|History of information theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The landmark event that ''established'' the discipline of information theory and brought it to immediate worldwide attention was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the ''[[Bell System Technical Journal]]'' in July and October 1948.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Prior to this paper, limited information-theoretic ideas had been developed at [[Bell Labs]], all implicitly assuming events of equal probability. [[Harry Nyquist]]'s 1924 paper, ''Certain Factors Affecting Telegraph Speed'', contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation {{math|1=''W'' = ''K'' log ''m''}} (recalling [[Boltzmann's constant]]), where ''W'' is the speed of transmission of intelligence, ''m'' is the number of different voltage levels to choose from at each time step, and ''K'' is a constant. [[Ralph Hartley]]'s 1928 paper, ''Transmission of Information'', uses the word ''information'' as a measurable quantity, reflecting the receiver's ability to distinguish one [[sequence of symbols]] from any other, thus quantifying information as {{math|1=''H'' = log ''S''<sup>''n''</sup> = ''n'' log ''S''}}, where ''S'' was the number of possible symbols, and ''n'' the number of symbols in a transmission. The unit of information was therefore the [[decimal digit]], which has since sometimes been called the [[Hartley (unit)|hartley]] in his honor as a unit or scale or measure of information. [[Alan Turing]] in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war [[Cryptanalysis of the Enigma|Enigma]] ciphers.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Much of the mathematics behind information theory with events of different probabilities were developed for the field of [[thermodynamics]] by [[Ludwig Boltzmann]] and [[J. Willard Gibbs]]. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by [[Rolf Landauer]] in the 1960s, are explored in ''[[Entropy in thermodynamics and information theory]]''.<br />
<br />
<br />
<br />
<br />
<br />
<br />
In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that<br />
<br />
<br />
:"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."<br />
<br />
通信的基本问题是在一点上精确地或近似地再现在另一点上选择的信息<br />
<br />
<br />
<br />
<br />
<br />
With it came the ideas of<br />
<br />
<br />
* the information entropy and [[redundancy (information theory)|redundancy]] of a source, and its relevance through the [[source coding theorem]];<br />
<br />
<br />
<br />
* the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;<br />
<br />
<br />
<br />
* the practical result of the [[Shannon–Hartley law]] for the channel capacity of a [[Gaussian channel]]; as well as<br />
<br />
<br />
<br />
* the [[bit]]—a new way of seeing the most fundamental unit of information.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Quantities of information==<br />
<br />
<br />
{{Main|Quantities of information}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Information theory is based on [[probability theory]] and statistics. Information theory often concerns itself with measures of information of the distributions associated with random variables. Important quantities of information are entropy, a measure of information in a single random variable, and mutual information, a measure of information in common between two random variables. The former quantity is a property of the probability distribution of a random variable and gives a limit on the rate at which data generated by independent samples with the given distribution can be reliably compressed. The latter is a property of the joint distribution of two random variables, and is the maximum rate of reliable communication across a noisy [[Communication channel|channel]] in the limit of long block lengths, when the channel statistics are determined by the joint distribution.<br />
<br />
<br />
<br />
<br />
<br />
<br />
The choice of logarithmic base in the following formulae determines the [[units of measurement|unit]] of information entropy that is used. A common unit of information is the bit, based on the [[binary logarithm]]. Other units include the [[nat (unit)|nat]], which is based on the [[natural logarithm]], and the [[deciban|decimal digit]], which is based on the [[common logarithm]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
In what follows, an expression of the form {{math|''p'' log ''p''}} is considered by convention to be equal to zero whenever {{math|1=''p'' = 0}}. This is justified because <math>\lim_{p \rightarrow 0+} p \log p = 0</math> for any logarithmic base.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Entropy of an information source===<br />
<br />
<br />
Based on the [[probability mass function]] of each source symbol to be communicated, the Shannon [[Entropy (information theory)|entropy]] {{math|''H''}}, in units of bits (per symbol), is given by<br />
<br />
<br />
:<math>H = - \sum_{i} p_i \log_2 (p_i)</math><br />
<br />
<br />
where {{math|''p<sub>i</sub>''}} is the probability of occurrence of the {{math|''i''}}-th possible value of the source symbol. This equation gives the entropy in the units of "bits" (per symbol) because it uses a logarithm of base 2, and this base-2 measure of entropy has sometimes been called the [[Shannon (unit)|shannon]] in his honor. Entropy is also commonly computed using the natural logarithm (base [[E (mathematical constant)|{{mvar|e}}]], where {{mvar|e}} is Euler's number), which produces a measurement of entropy in nats per symbol and sometimes simplifies the analysis by avoiding the need to include extra constants in the formulas. Other bases are also possible, but less commonly used. For example, a logarithm of base {{nowrap|1=2<sup>8</sup> = 256}} will produce a measurement in [[byte]]s per symbol, and a logarithm of base 10 will produce a measurement in decimal digits (or hartleys) per symbol.<br />
<br />
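As a minimal illustrative sketch of the entropy definition and the base conventions above (the helper name ''entropy'' and the example distributions are ours, not part of the original text):<br />
<br />
<syntaxhighlight lang="python">
import math

def entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution.

    `probs` is a sequence of probabilities summing to 1; terms with
    p == 0 contribute nothing, matching the p log p -> 0 convention.
    With base=2 the result is in bits (shannons); base=math.e gives nats.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                # fair coin: 1.0 bit
print(entropy([1/6] * 6))                 # fair die: ~2.585 bits
print(entropy([0.5, 0.5], base=math.e))   # same coin in nats: ~0.693
</syntaxhighlight><br />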
<br />
<br />
<br />
<br />
<br />
Intuitively, the entropy {{math|''H<sub>X</sub>''}} of a discrete random variable {{math|''X''}} is a measure of the amount of ''uncertainty'' associated with the value of {{math|''X''}} when only its distribution is known.<br />
<br />
<br />
<br />
<br />
<br />
<br />
The entropy of a source that emits a sequence of {{math|''N''}} symbols that are [[independent and identically distributed]] (iid) is {{math|''N'' ⋅ ''H''}} bits (per message of {{math|''N''}} symbols). If the source data symbols are identically distributed but not independent, the entropy of a message of length {{math|''N''}} will be less than {{math|''N'' ⋅ ''H''}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:Binary entropy plot.svg|thumbnail|right|200px|The entropy of a [[Bernoulli trial]] as a function of success probability, often called the {{em|[[binary entropy function]]}}, {{math|''H''<sub>b</sub>(''p'')}}. The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
If one transmits 1000 bits (0s and 1s), and the value of each of these bits is known to the receiver (has a specific value with certainty) ahead of transmission, it is clear that no information is transmitted. If, however, each bit is independently equally likely to be 0 or 1, 1000 shannons of information (more often called bits) have been transmitted. Between these two extremes, information can be quantified as follows. If 𝕏 is the set of all messages {{math|{{mset|''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub>}}}} that {{math|''X''}} could be, and {{math|''p''(''x'')}} is the probability of some <math>x \in \mathbb X</math>, then the entropy, {{math|''H''}}, of {{math|''X''}} is defined:<ref name = Reza>{{cite book | title = An Introduction to Information Theory | author = Fazlollah M. Reza | publisher = Dover Publications, Inc., New York | origyear = 1961| year = 1994 | isbn = 0-486-68210-2 | url = https://books.google.com/books?id=RtzpRAiX6OgC&pg=PA8&dq=intitle:%22An+Introduction+to+Information+Theory%22++%22entropy+of+a+simple+source%22}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
(Here, {{math|''I''(''x'')}} is the [[self-information]], which is the entropy contribution of an individual message, and {{math|𝔼<sub>''X''</sub>}} is the [[expected value]].) A property of entropy is that it is maximized when all the messages in the message space are equiprobable {{math|1=''p''(''x'') = 1/''n''}}; i.e., most unpredictable, in which case {{math|1=''H''(''X'') = log ''n''}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
The special case of information entropy for a random variable with two outcomes is the binary entropy function, usually taken to the logarithmic base 2, thus having the shannon (Sh) as unit:<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>H_{\mathrm{b}}(p) = - p \log_2 p - (1-p)\log_2 (1-p).</math><br />
<br />
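As a worked check of this formula: <math>H_{\mathrm{b}}(0.5) = 1</math> Sh (the maximum, matching the figure above), while <math>H_{\mathrm{b}}(0.9) = -0.9\log_2 0.9 - 0.1\log_2 0.1 \approx 0.469</math> Sh for a heavily biased coin.<br />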
<br />
<br />
<br />
<br />
<br />
===Joint entropy===<br />
<br />
<br />
The {{em|[[joint entropy]]}} of two discrete random variables {{math|''X''}} and {{math|''Y''}} is merely the entropy of their pairing: {{math|(''X'', ''Y'')}}. This implies that if {{math|''X''}} and {{math|''Y''}} are [[statistical independence|independent]], then their joint entropy is the sum of their individual entropies.<br />
<br />
<br />
<br />
<br />
<br />
<br />
For example, if {{math|(''X'', ''Y'')}} represents the position of a chess piece — {{math|''X''}} the row and {{math|''Y''}} the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y) \,</math><br />
<br />
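Continuing the chess example under the added assumption that all 64 squares are equally likely (so row and column are independent and uniform):<br />
<br />
:<math>H(X,Y) = \log_2 64 = 6 \text{ bits} = H(X) + H(Y) = 3 + 3.</math><br />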
<br />
<br />
<br />
<br />
<br />
Despite similar notation, joint entropy should not be confused with {{em|[[cross entropy]]}}.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Conditional entropy (equivocation)===<br />
<br />
<br />
The {{em|[[conditional entropy]]}} or ''conditional uncertainty'' of {{math|''X''}} given random variable {{math|''Y''}} (also called the ''equivocation'' of {{math|''X''}} about {{math|''Y''}}) is the average conditional entropy over {{math|''Y''}}:<ref name=Ash>{{cite book | title = Information Theory | author = Robert B. Ash | publisher = Dover Publications, Inc. | origyear = 1965| year = 1990 | isbn = 0-486-66521-6 | url = https://books.google.com/books?id=ngZhvUfF0UIC&pg=PA16&dq=intitle:information+intitle:theory+inauthor:ash+conditional+uncertainty}}</ref><br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math> H(X|Y) = \mathbb E_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log p(x|y).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:<br />
<br />
<br />
<br />
<br />
<br />
<br />
: <math> H(X|Y) = H(X,Y) - H(Y) .\,</math><br />
<br />
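Under the same uniform chess-position assumption as above, this property gives a quick check: <math>H(X|Y) = H(X,Y) - H(Y) = 6 - 3 = 3</math> bits; knowing the column reveals nothing about the row when the two are independent.<br />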
<br />
<br />
<br />
<br />
<br />
===Mutual information (transinformation)===<br />
<br />
<br />
''[[Mutual information]]'' measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of {{math|''X''}} relative to {{math|''Y''}} is given by:<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math><br />
<br />
<br />
where {{math|SI}} (''S''pecific mutual ''I''nformation) is the [[pointwise mutual information]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
A basic property of the mutual information is that<br />
<br />
<br />
: <math>I(X;Y) = H(X) - H(X|Y).\,</math><br />
<br />
<br />
That is, knowing ''Y'', we can save an average of {{math|''I''(''X''; ''Y'')}} bits in encoding ''X'' compared to not knowing ''Y''.<br />
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information is [[symmetric function|symmetric]]:<br />
<br />
<br />
: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y).\,</math><br />
<br />
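The identities above are easy to verify numerically. The following is a minimal sketch (the joint distribution and the helper name ''H'' are hypothetical choices of ours):<br />
<br />
<syntaxhighlight lang="python">
import math

def H(probs):
    """Entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A small hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions p(x) and p(y).
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# I(X;Y) computed directly from the definition ...
I = sum(p * math.log2(p / (px[x] * py[y])) for (x, y), p in joint.items())

# ... agrees with I(X;Y) = H(X) + H(Y) - H(X,Y).
assert abs(I - (H(px.values()) + H(py.values()) - H(joint.values()))) < 1e-12
print(I)  # ~0.278 bits for this particular joint distribution
</syntaxhighlight><br />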
<br />
<br />
<br />
<br />
<br />
Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the [[posterior probability|posterior probability distribution]] of ''X'' given the value of ''Y'' and the [[prior probability|prior distribution]] on ''X'':<br />
<br />
<br />
: <math>I(X;Y) = \mathbb E_{p(y)} [D_{\mathrm{KL}}( p(X|Y=y) \| p(X) )].</math><br />
<br />
<br />
In other words, this is a measure of how much, on the average, the probability distribution on ''X'' will change if we are given the value of ''Y''. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:<br />
<br />
<br />
: <math>I(X; Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y)).</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Mutual information is closely related to the [[likelihood-ratio test|log-likelihood ratio test]] in the context of contingency tables and the [[multinomial distribution]] and to [[Pearson's chi-squared test|Pearson's χ<sup>2</sup> test]]: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Kullback–Leibler divergence (information gain)===<br />
<br />
<br />
The ''[[Kullback–Leibler divergence]]'' (or ''information divergence'', ''information gain'', or ''relative entropy'') is a way of comparing two distributions: a "true" [[probability distribution]] ''p(X)'', and an arbitrary probability distribution ''q(X)''. If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression. It is thus defined<br />
<br />
<br />
<br />
<br />
<br />
<br />
:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} -p(x) \log {q(x)} \, - \, \sum_{x \in X} -p(x) \log {p(x)} = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
Although it is sometimes used as a 'distance metric', KL divergence is not a true [[Metric (mathematics)|metric]] since it is not symmetric and does not satisfy the [[triangle inequality]] (making it a semi-quasimetric).<br />
<br />
<br />
<br />
<br />
<br />
<br />
Another interpretation of the KL divergence is the "unnecessary surprise" introduced by a prior from the truth: suppose a number ''X'' is about to be drawn randomly from a discrete set with probability distribution ''p(x)''. If Alice knows the true distribution ''p(x)'', while Bob believes (has a [[prior probability|prior]]) that the distribution is ''q(x)'', then Bob will be more [[Information content|surprised]] than Alice, on average, upon seeing the value of ''X''. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the ''log'' is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.<br />
<br />
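A minimal sketch of this surprisal reading (Alice's ''p'' and Bob's ''q'' below are hypothetical example distributions), which also shows why the asymmetry matters:<br />
<br />
<syntaxhighlight lang="python">
import math

def kl_divergence(p, q):
    """D_KL(p || q) in bits for discrete distributions given as lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]   # the true coin, known to Alice
q = [0.9, 0.1]   # Bob's prior belief about the coin

# The divergence is not symmetric, one reason it is not a true metric.
print(kl_divergence(p, q))  # ~0.737 bits of expected extra surprisal
print(kl_divergence(q, p))  # ~0.531 bits in the reverse direction
</syntaxhighlight><br />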
<br />
<br />
<br />
<br />
<br />
===Other quantities===<br />
<br />
<br />
Other important information theoretic quantities include [[Rényi entropy]] (a generalization of entropy), [[differential entropy]] (a generalization of quantities of information to continuous distributions), and the [[conditional mutual information]].<br />
<br />
<br />
<br />
<br />
<br />
<br />
==Coding theory==<br />
<br />
<br />
{{Main|Coding theory}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
[[File:CDSCRATCHES.jpg|thumb|right|A picture showing scratches on the readable surface of a CD-R. Music and data CDs are coded using error correcting codes and thus can still be read even if they have minor scratches using [[error detection and correction]].]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
Coding theory is one of the most important and direct applications of information theory. It can be subdivided into [[data compression|source coding]] theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.<br />
<br />
<br />
<br />
<br />
<br />
<br />
* Data compression (source coding): There are two formulations for the compression problem:<br />
<br />
<br />
<br />
*[[lossless data compression]]: the data must be reconstructed exactly (a brief sketch follows this list);<br />
<br />
<br />
<br />
*[[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called ''[[rate–distortion theory]]''.<br />
<br />
<br />
<br />
* Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.<br />
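<br />
As a minimal sketch of the lossless formulation above (using Python's standard ''zlib'' compressor; the source parameters are hypothetical), compressed size tracks the entropy of the source:<br />
<br />
<syntaxhighlight lang="python">
import random
import zlib

random.seed(0)
n = 100_000

# Low-entropy source: bytes drawn 90%/10% from two symbols, i.e. about
# H_b(0.9) ~ 0.469 bits of entropy per 8-bit byte.
biased = bytes(random.choice(b"aaaaaaaaab") for _ in range(n))

# High-entropy source: uniformly random bytes, ~8 bits per byte.
uniform = bytes(random.getrandbits(8) for _ in range(n))

# A lossless compressor shrinks the biased stream toward its entropy,
# but can do essentially nothing with the already-random stream.
print(len(zlib.compress(biased, 9)))   # a small fraction of n
print(len(zlib.compress(uniform, 9)))  # about n (plus a little overhead)
</syntaxhighlight><br />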
<br />
<br />
<br />
<br />
<br />
<br />
<br />
This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the [[broadcast channel]]) or intermediary "helpers" (the [[relay channel]]), or more general [[computer network|networks]], compression followed by transmission may no longer be optimal. [[Network information theory]] refers to these multi-agent communication models.<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Source theory===<br />
<br />
<br />
Any process that generates successive messages can be considered a {{em|[[Communication source|source]]}} of information. A memoryless source is one in which each message is an [[Independent identically distributed random variables|independent identically distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose less restrictive constraints. All such sources are [[stochastic process|stochastic]]. These terms are well studied in their own right outside information theory.<br />
<br />
<br />
任何生成连续消息的过程都可以被视为一个信息源。无记忆信源是指每条消息都是独立同分布随机变量的信源,而遍历性和平稳性所施加的约束则较为宽松。所有这些信源都是随机的。这些术语本身在信息论之外也得到了充分的研究。<br />
<br />
<br />
<br />
<br />
<br />
====Rate====<!-- This section is linked from [[Channel capacity]] --><br />
<br />
<br />
速率<br />
<br />
Information ''[[Entropy rate|rate]]'' is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is<br />
<br />
<br />
信息率是每个符号的平均熵。对于无记忆信源,它就是每个符号的熵;而对于平稳随机过程,它是<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the ''average rate'' is<br />
<br />
<br />
也就是说,在给定之前生成的所有符号的条件下,一个符号的条件熵。对于不一定平稳的更一般的过程,平均速率是<br />
<br />
<br />
<br />
<br />
<br />
:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n);</math><br />
<br />
<br />
<br />
<br />
<br />
<br />
that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref><br />
<br />
<br />
也就是说,每个符号的联合熵的极限。对于平稳信源,这两个表达式给出相同的结果。<br />
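<br />
下面给出一个示意性的计算草图(两状态马尔可夫链及其转移概率均为假设数值):对平稳马尔可夫链,上述熵率等于按平稳分布加权的各状态转移分布的熵,即条件熵 <math>H(X_n|X_{n-1})</math>:<br />
<syntaxhighlight lang="python"><br />
from math import log2<br />
<br />
# 假设的两状态马尔可夫链:P[i][j] = Pr(X_{n+1}=j | X_n=i)<br />
P = [[0.9, 0.1],<br />
     [0.4, 0.6]]<br />
<br />
# 平稳分布 pi 满足 pi P = pi;两状态链可以解析求出<br />
pi0 = P[1][0] / (P[0][1] + P[1][0])<br />
pi = [pi0, 1 - pi0]<br />
<br />
def H(row):<br />
    """一行转移分布的熵(比特)。"""<br />
    return -sum(p * log2(p) for p in row if p > 0)<br />
<br />
# 熵率 r = sum_i pi_i * H(P[i]),即 H(X_n | X_{n-1})<br />
rate = sum(pi[i] * H(P[i]) for i in range(2))<br />
print(f"熵率约 {rate:.4f} 比特/符号")<br />
</syntaxhighlight><br />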
<br />
<br />
<br />
<br />
<br />
It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of {{em|source coding}}.<br />
<br />
<br />
在信息论中,谈论一种语言的“速率”或“熵”是很常见的。例如,当信息源是英语散文时,这样说就是恰当的。信源的速率与其冗余度以及可被压缩的程度有关,这正是信源编码的研究主题。<br />
<br />
<br />
<br />
<br />
<br />
===Channel capacity===<br />
<br />
<br />
信道容量<br />
<br />
{{Main|Channel capacity}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Communications over a channel—such as an [[ethernet]] cable—is the primary motivation of information theory. However, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.<br />
<br />
<br />
通过信道(如以太网电缆)进行通信是信息论的主要动机。然而,这样的信道往往不能精确地重建信号;噪声、静默期以及其他形式的信号损坏常常会使质量下降。<br />
<br />
<br />
<br />
<br />
<br />
Consider the communications process over a discrete channel. A simple model of the process is shown below:<br />
<br />
<br />
考虑一个离散信道上的通信过程。该过程的一个简单模型如下:<br />
<br />
<br />
<br />
<br />
<br />
[[File:Channel model.svg|center|800px|Channel model]]<br />
<br />
<br />
信道模型<br />
<br />
<br />
<br />
<br />
<br />
Here ''X'' represents the space of messages transmitted, and ''Y'' the space of messages received during a unit time over our channel. Let {{math|''p''(''y''{{pipe}}''x'')}} be the [[conditional probability]] distribution function of ''Y'' given ''X''. We will consider {{math|''p''(''y''{{pipe}}''x'')}} to be an inherent fixed property of our communications channel (representing the nature of the ''[[Signal noise|noise]]'' of our channel). Then the joint distribution of ''X'' and ''Y'' is completely determined by our channel and by our choice of {{math|''f''(''x'')}}, the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the ''[[Signal (electrical engineering)|signal]]'', we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the {{em|channel capacity}} and is given by:<br />
<br />
<br />
这里 ''X'' 表示所发送消息的空间,''Y'' 表示单位时间内通过信道接收到的消息的空间。设 ''p''(''y''|''x'') 为给定 ''X'' 时 ''Y'' 的条件概率分布函数。我们将 ''p''(''y''|''x'') 视为通信信道固有的固定属性(代表信道噪声的性质)。那么 ''X'' 和 ''Y'' 的联合分布完全由信道和 ''f''(''x'')(即我们选择通过信道发送的消息的边缘分布)决定。在这些约束下,我们希望最大化能够通过信道传递的信息速率,即信号。对此恰当的度量是互信息,这个最大互信息被称为信道容量,由下式给出:<br />
<br />
:<math> C = \max_{f} I(X;Y).\! </math><br />
<br />
<br />
This capacity has the following property related to communicating at information rate ''R'' (where ''R'' is usually bits per symbol). For any information rate ''R < C'' and coding error ε > 0, for large enough ''N'', there exists a code of length ''N'' and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate ''R &gt; C'', it is impossible to transmit with arbitrarily small block error.<br />
<br />
<br />
该容量具有如下与以信息速率 ''R''(''R'' 通常为每符号比特数)进行通信相关的性质:对于任意信息速率 ''R'' < ''C'' 和编码误差 ε > 0,当 ''N'' 足够大时,存在长度为 ''N''、速率不小于 ''R'' 的码以及相应的译码算法,使得分组错误的最大概率不超过 ε;也就是说,总可以在任意小的分组错误概率下进行传输。此外,对于任意速率 ''R'' > ''C'',都不可能以任意小的分组错误概率进行传输。<br />
<br />
<br />
<br />
<br />
<br />
''[[Channel code|Channel coding]]'' is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.<br />
<br />
<br />
信道编码涉及寻找这类接近最优的编码,用于在有噪信道上以接近信道容量的速率传输数据,同时保持很小的编码错误概率。<br />
<br />
<br />
<br />
<br />
<br />
====Capacity of particular channel models====<br />
<br />
<br />
特定信道模型的容量<br />
<br />
* A continuous-time analog communications channel subject to [[Gaussian noise]] — see [[Shannon–Hartley theorem]].<br />
<br />
<br />
<br />
* A [[binary symmetric channel]] (BSC) with crossover probability ''p'' is a binary input, binary output channel that flips the input bit with probability ''p''. The BSC has a capacity of {{math|1 &minus; ''H''<sub>b</sub>(''p'')}} bits per channel use, where {{math|''H''<sub>b</sub>}} is the binary entropy function to the base-2 logarithm:<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary symmetric channel.svg]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
* A [[binary erasure channel]] (BEC) with erasure probability ''p'' is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is {{nowrap|1 &minus; ''p''}} bits per channel use.<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
::[[File:Binary erasure channel.svg]]<br />
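<br />
作为一个最小的示意草图(交叉/删除概率 p 为假设数值),下面按上述闭式公式计算 BSC 与 BEC 的容量,并通过对输入分布数值地最大化互信息来验证 BSC 的结果:<br />
<syntaxhighlight lang="python"><br />
from math import log2<br />
<br />
def Hb(p):<br />
    """二元熵函数(以 2 为底)。"""<br />
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)<br />
<br />
p = 0.11  # 假设的交叉/删除概率<br />
print(f"BSC 容量(闭式 1 - Hb(p)): {1 - Hb(p):.4f} 比特/信道使用")<br />
print(f"BEC 容量(闭式 1 - p): {1 - p:.4f} 比特/信道使用")<br />
<br />
def mutual_info_bsc(q, p):<br />
    """输入分布 Pr(X=1)=q 时 BSC 的互信息 I(X;Y) = H(Y) - H(Y|X)。"""<br />
    y1 = q * (1 - p) + (1 - q) * p  # Pr(Y=1)<br />
    return Hb(y1) - Hb(p)<br />
<br />
# 在输入分布上做网格搜索:最大值在 q = 1/2 处达到 1 - Hb(p)<br />
best = max(mutual_info_bsc(k / 1000, p) for k in range(1001))<br />
print(f"BSC 容量(数值最大化): {best:.4f}")<br />
</syntaxhighlight><br />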
<br />
<br />
<br />
<br />
<br />
<br />
==Applications to other fields==<br />
<br />
<br />
其他领域的应用<br />
<br />
<br />
<br />
<br />
<br />
===Intelligence uses and secrecy applications===<br />
<br />
<br />
情报使用和保密应用<br />
<br />
Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the [[Ban (unit)|ban]], was used in the [[Ultra]] project, breaking the German [[Enigma machine]] code and hastening the [[Victory in Europe Day|end of World War II in Europe]]. Shannon himself defined an important concept now called the [[unicity distance]]. Based on the redundancy of the [[plaintext]], it attempts to give a minimum amount of [[ciphertext]] necessary to ensure unique decipherability.<br />
<br />
<br />
信息论概念适用于密码学和密码分析。图灵的信息单位“班”(ban)曾用于“超级机密”(Ultra)计划,该计划破解了德国恩尼格玛密码机,加速了第二次世界大战在欧洲的结束。香农本人定义了一个重要概念,现在称为唯一解距离(unicity distance)。它基于明文的冗余度,试图给出为确保唯一可破译性所需的最少密文量。<br />
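<br />
作为一个粗略的示意性估计(密钥空间与英文冗余度均为教科书式的假设数值),唯一解距离常近似为密钥熵除以明文每字符的冗余度:<br />
<syntaxhighlight lang="python"><br />
from math import log2, factorial<br />
<br />
# 假设:单表替换密码,密钥空间为 26! 种字母置换<br />
key_entropy = log2(factorial(26))  # 约 88.4 比特<br />
<br />
# 假设:英文每字符冗余度约为 log2(26) - 1.5 比特(1.5 为常被引用的英文熵估计)<br />
redundancy_per_char = log2(26) - 1.5<br />
<br />
unicity = key_entropy / redundancy_per_char<br />
print(f"唯一解距离约为 {unicity:.0f} 个密文字符")  # 约 28<br />
</syntaxhighlight><br />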
<br />
<br />
<br />
<br />
<br />
Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A [[brute force attack]] can break systems based on [[public-key cryptography|asymmetric key algorithms]] or on most commonly used methods of [[symmetric-key algorithm|symmetric key algorithms]] (sometimes called secret key algorithms), such as [[block cipher]]s. The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.<br />
<br />
<br />
信息论使我们相信,保守秘密比最初看起来要困难得多。暴力攻击可以破解基于非对称密钥算法或大多数常用对称密钥算法(有时称为秘密密钥算法,如分组密码)的系统。目前所有这些方法的安全性都来自这样一个假设:没有任何已知攻击能在可行的时间内破解它们。<br />
<br />
<br />
<br />
<br />
<br />
[[Information theoretic security]] refers to methods such as the [[one-time pad]] that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the [[key (cryptography)|key]]) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the [[Venona project]] was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.<br />
<br />
<br />
信息论安全指的是诸如一次性密码本(one-time pad)之类不易受到这种暴力攻击的方法。在这种情况下,明文和密文之间(以密钥为条件)的正的条件互信息可以确保正确传输,而明文和密文之间的无条件互信息保持为零,从而实现绝对安全的通信。换句话说,窃听者无法通过获知密文(而非密钥)来改进他对明文的猜测。然而,与任何其他密码系统一样,即使是信息论安全的方法,也必须谨慎地正确使用;维诺那(Venona)计划之所以能够破解苏联的一次性密码本,正是因为苏方不当地重复使用了密钥材料。<br />
<br />
<br />
<br />
<br />
<br />
===Pseudorandom number generation===<br />
<br />
<br />
伪随机数生成<br />
<br />
[[Pseudorandom number generator]]s are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed [[cryptographically secure pseudorandom number generator]]s, but even they require [[random seed]]s external to the software to work as intended. These can be obtained via [[Extractor (mathematics)|extractors]], if done carefully. The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.<br />
<br />
<br />
伪随机数生成器在计算机语言库和应用程序中随处可见。它们几乎都不适合密码学用途,因为它们无法摆脱现代计算机设备和软件的确定性本质。一类改进的随机数生成器被称为密码学安全伪随机数生成器,但即使是它们,也需要软件外部的随机种子才能按预期工作。如果操作得当,这些种子可以通过提取器(extractor)获得。衡量提取器中随机性是否充分的度量是最小熵,这个值通过 Rényi 熵与香农熵相关联;Rényi 熵也被用于评估密码系统中的随机性。尽管这些度量彼此相关,它们之间的差异意味着:香农熵很高的随机变量未必适合用于提取器,因而也未必适合密码学用途。<br />
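<br />
下面的最小草图(分布为虚构示例)说明香农熵与最小熵的差别:一个高度偏斜的分布可以有较高的香农熵,但最小熵很低,因而未必适合用作密码学随机性来源:<br />
<syntaxhighlight lang="python"><br />
from math import log2<br />
<br />
def shannon_entropy(dist):<br />
    return -sum(p * log2(p) for p in dist if p > 0)<br />
<br />
def min_entropy(dist):<br />
    # 最小熵只取决于最可能的那个结果<br />
    return -log2(max(dist))<br />
<br />
# 虚构的偏斜分布:一个结果占一半概率,其余 512 个结果平分另一半<br />
dist = [0.5] + [0.5 / 512] * 512<br />
<br />
print(f"香农熵: {shannon_entropy(dist):.2f} 比特")  # 5.50 比特<br />
print(f"最小熵: {min_entropy(dist):.2f} 比特")  # 恰为 1.00 比特<br />
</syntaxhighlight><br />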
<br />
<br />
<br />
<br />
<br />
===Seismic exploration===<br />
<br />
<br />
地震勘探<br />
<br />
One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and [[digital signal processing]] offer a major improvement of resolution and image clarity over previous analog methods.<ref>{{cite journal|doi=10.1002/smj.4250020202 | volume=2 | issue=2 | title=The corporation and innovation | year=1981 | journal=Strategic Management Journal | pages=97–118 | last1 = Haggerty | first1 = Patrick E.}}</ref><br />
<br />
<br />
信息论的一个早期商业应用是在地震石油勘探领域。这一领域的工作使得从所需的地震信号中剥离并分离不需要的噪声成为可能。与以前的模拟方法相比,信息论和数字信号处理大大改善了分辨率和图像清晰度。<br />
<br />
<br />
<br />
<br />
<br />
===Semiotics===<br />
<br />
<br />
符号学<br />
<br />
[[Semiotics|Semioticians]] [[:nl:Doede Nauta|Doede Nauta]] and [[Winfried Nöth]] both considered [[Charles Sanders Peirce]] as having created a theory of information in his works on semiotics.<ref name="Nauta 1972">{{cite book |ref=harv |last1=Nauta |first1=Doede |title=The Meaning of Information |date=1972 |publisher=Mouton |location=The Hague |isbn=9789027919960}}</ref>{{rp|171}}<ref name="Nöth 2012">{{cite journal |ref=harv |last1=Nöth |first1=Winfried |title=Charles S. Peirce's theory of information: a theory of the growth of symbols and of knowledge |journal=Cybernetics and Human Knowing |date=January 2012 |volume=19 |issue=1–2 |pages=137–161 |url=https://edisciplinas.usp.br/mod/resource/view.php?id=2311849}}</ref>{{rp|137}} Nauta defined semiotic information theory as the study of "the internal processes of coding, filtering, and information processing."<ref name="Nauta 1972"/>{{rp|91}}<br />
<br />
<br />
符号学家 Doede Nauta 和 Winfried Nöth 都认为查尔斯·桑德斯·皮尔士在其符号学著作中创立了一种信息理论。Nauta 将符号学信息论定义为对“编码、过滤和信息处理的内部过程”的研究。<br />
<br />
<br />
<br />
<br />
<br />
Concepts from information theory such as redundancy and code control have been used by semioticians such as [[Umberto Eco]] and [[:it:Ferruccio Rossi-Landi|Ferruccio Rossi-Landi]] to explain ideology as a form of message transmission whereby a dominant social class emits its message by using signs that exhibit a high degree of redundancy such that only one message is decoded among a selection of competing ones.<ref>Nöth, Winfried (1981). "[https://kobra.uni-kassel.de/bitstream/handle/123456789/2014122246977/semi_2004_002.pdf?sequence=1&isAllowed=y Semiotics of ideology]". ''Semiotica'', Issue 148.</ref><br />
<br />
<br />
冗余和码控制等信息论概念已被 Umberto Eco 和 Ferruccio Rossi-Landi 等符号学家用来解释意识形态:意识形态是一种消息传递形式,占统治地位的社会阶层通过使用高度冗余的符号来发出其消息,使得在一系列相互竞争的消息中只有一条被解码。<br />
<br />
<br />
<br />
<br />
<br />
===Miscellaneous applications===<br />
<br />
<br />
其他应用<br />
<br />
Information theory also has applications in [[Gambling and information theory]], [[black hole information paradox|black holes]], and [[bioinformatics]].<br />
<br />
<br />
信息论在赌博与信息论、黑洞和生物信息学等方面也有应用。<br />
<br />
<br />
<br />
<br />
<br />
==See also==<br />
<br />
<br />
参见<br />
<br />
{{Portal|Mathematics}}<br />
<br />
<br />
<br />
* [[Algorithmic probability]]<br />
<br />
<br />
<br />
* [[Bayesian inference]]<br />
<br />
<br />
<br />
* [[Communication theory]]<br />
<br />
<br />
<br />
* [[Constructor theory]] - a generalization of information theory that includes quantum information<br />
<br />
<br />
<br />
* [[Inductive probability]]<br />
<br />
<br />
<br />
* [[Info-metrics]]<br />
<br />
<br />
<br />
* [[Minimum message length]]<br />
<br />
<br />
<br />
* [[Minimum description length]]<br />
<br />
<br />
<br />
* [[List of important publications in theoretical computer science#Information theory|List of important publications]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Applications===<br />
<br />
<br />
应用<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Active networking]]<br />
<br />
<br />
<br />
* [[Cryptanalysis]]<br />
<br />
<br />
<br />
* [[Cryptography]]<br />
<br />
<br />
<br />
* [[Cybernetics]]<br />
<br />
<br />
<br />
* [[Entropy in thermodynamics and information theory]]<br />
<br />
<br />
<br />
* [[Gambling]]<br />
<br />
<br />
<br />
* [[Intelligence (information gathering)]]<br />
<br />
<br />
<br />
* [[reflection seismology|Seismic exploration]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===History===<br />
<br />
<br />
历史<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ralph Hartley|Hartley, R.V.L.]]<br />
<br />
<br />
<br />
* [[History of information theory]]<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]]<br />
<br />
<br />
<br />
* [[Timeline of information theory]]<br />
<br />
<br />
<br />
* [[Hubert Yockey|Yockey, H.P.]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Theory===<br />
<br />
<br />
理论<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Coding theory]]<br />
<br />
<br />
<br />
* [[Detection theory]]<br />
<br />
<br />
<br />
* [[Estimation theory]]<br />
<br />
<br />
<br />
* [[Fisher information]]<br />
<br />
<br />
<br />
* [[Information algebra]]<br />
<br />
<br />
<br />
* [[Information asymmetry]]<br />
<br />
<br />
<br />
* [[Information field theory]]<br />
<br />
<br />
<br />
* [[Information geometry]]<br />
<br />
<br />
<br />
* [[Information theory and measure theory]]<br />
<br />
<br />
<br />
* [[Kolmogorov complexity]]<br />
<br />
<br />
<br />
* [[List of unsolved problems in information theory]]<br />
<br />
<br />
<br />
* [[Logic of information]]<br />
<br />
<br />
<br />
* [[Network coding]]<br />
<br />
<br />
<br />
* [[Philosophy of information]]<br />
<br />
<br />
<br />
* [[Quantum information science]]<br />
<br />
<br />
<br />
* [[Source coding]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Concepts===<br />
<br />
<br />
概念<br />
<br />
{{div col|colwidth=20em}}<br />
<br />
<br />
<br />
* [[Ban (unit)]]<br />
<br />
<br />
<br />
* [[Channel capacity]]<br />
<br />
<br />
<br />
* [[Communication channel]]<br />
<br />
<br />
<br />
* [[Communication source]]<br />
<br />
<br />
<br />
* [[Conditional entropy]]<br />
<br />
<br />
<br />
* [[Covert channel]]<br />
<br />
<br />
<br />
* [[Data compression]]<br />
<br />
<br />
<br />
* Decoder<br />
<br />
<br />
<br />
* [[Differential entropy]]<br />
<br />
<br />
<br />
* [[Fungible information]]<br />
<br />
<br />
<br />
* [[Information fluctuation complexity]]<br />
<br />
<br />
<br />
* [[Information entropy]]<br />
<br />
<br />
<br />
* [[Joint entropy]]<br />
<br />
<br />
<br />
* [[Kullback–Leibler divergence]]<br />
<br />
<br />
<br />
* [[Mutual information]]<br />
<br />
<br />
<br />
* [[Pointwise mutual information]] (PMI)<br />
<br />
<br />
<br />
* [[Receiver (information theory)]]<br />
<br />
<br />
<br />
* [[Redundancy (information theory)|Redundancy]]<br />
<br />
<br />
<br />
* [[Rényi entropy]]<br />
<br />
<br />
<br />
* [[Self-information]]<br />
<br />
<br />
<br />
* [[Unicity distance]]<br />
<br />
<br />
<br />
* [[Variety (cybernetics)|Variety]]<br />
<br />
<br />
<br />
* [[Hamming distance]]<br />
<br />
<br />
<br />
{{div col end}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==References==<br />
<br />
<br />
参考资料<br />
<br />
{{Reflist}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===The classic work===<br />
<br />
<br />
经典之作<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* [[Claude Elwood Shannon|Shannon, C.E.]] (1948), "[[A Mathematical Theory of Communication]]", ''Bell System Technical Journal'', 27, pp.&nbsp;379–423 & 623–656, July & October, 1948. [http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf PDF.] <br />[http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html Notes and other formats.]<br />
<br />
<br />
<br />
* R.V.L. Hartley, [http://www.dotrose.com/etext/90_Miscellaneous/transmission_of_information_1928b.pdf "Transmission of Information"], ''Bell System Technical Journal'', July 1928<br />
<br />
<br />
<br />
* [[Andrey Kolmogorov]] (1968), "[https://www.tandfonline.com/doi/pdf/10.1080/00207166808803030 Three approaches to the quantitative definition of information]" in International Journal of Computer Mathematics.<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other journal articles===<br />
<br />
<br />
其他期刊文章<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* J. L. Kelly, Jr., [http://betbubbles.com/wp-content/uploads/2017/07/kelly.pdf Betbubbles.com]{{Dead link|date=January 2020 |bot=InternetArchiveBot |fix-attempted=yes }}, "A New Interpretation of Information Rate" ''Bell System Technical Journal'', Vol. 35, July 1956, pp.&nbsp;917–26.<br />
<br />
<br />
<br />
* R. Landauer, [http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=615478 IEEE.org], "Information is Physical" ''Proc. Workshop on Physics and Computation PhysComp'92'' (IEEE Comp. Sci.Press, Los Alamitos, 1993) pp.&nbsp;1–4.<br />
<br />
<br />
<br />
* {{cite journal | last1 = Landauer | first1 = R. | year = 1961 | title = Irreversibility and Heat Generation in the Computing Process | url = http://www.research.ibm.com/journal/rd/441/landauerii.pdf | journal = IBM J. Res. Dev. | volume = 5 | issue = 3| pages = 183–191 | doi = 10.1147/rd.53.0183 }}<br />
<br />
<br />
<br />
* {{cite arXiv |last=Timme |first=Nicholas|last2=Alford |first2=Wesley|last3=Flecker |first3=Benjamin|last4=Beggs |first4=John M.|date=2012 |title=Multivariate information measures: an experimentalist's perspective |eprint=1111.6857|class=cs.IT}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Textbooks on information theory===<br />
<br />
<br />
信息论教材<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Arndt, C. ''Information Measures, Information and its Description in Science and Engineering'' (Springer Series: Signals and Communication Technology), 2004, {{isbn|978-3-540-40855-0}}<br />
<br />
<br />
<br />
* Ash, RB. ''Information Theory''. New York: Interscience, 1965. {{isbn|0-470-03445-9}}. New York: Dover 1990. {{isbn|0-486-66521-6}}<br />
<br />
<br />
<br />
* [[Gallager, R]]. ''Information Theory and Reliable Communication.'' New York: John Wiley and Sons, 1968. {{isbn|0-471-29048-3}}<br />
<br />
<br />
<br />
* Goldman, S. ''Information Theory''. New York: Prentice Hall, 1953. New York: Dover 1968 {{isbn|0-486-62209-6}}, 2005 {{isbn|0-486-44271-3}}<br />
<br />
<br />
<br />
* {{cite book |last1=Cover |first1=Thomas |author-link1=Thomas M. Cover |last2=Thomas |first2=Joy A. |title=Elements of information theory |edition=2nd |location=New York |publisher=[[Wiley-Interscience]] |date=2006 |isbn=0-471-24195-4}}<br />
<br />
<br />
<br />
* [[Csiszar, I]], Korner, J. ''Information Theory: Coding Theorems for Discrete Memoryless Systems'' Akademiai Kiado: 2nd edition, 1997. {{isbn|963-05-7440-3}}<br />
<br />
<br />
<br />
* [[David J. C. MacKay|MacKay, David J. C.]]. ''[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]'' Cambridge: Cambridge University Press, 2003. {{isbn|0-521-64298-1}}<br />
<br />
<br />
<br />
* Mansuripur, M. ''Introduction to Information Theory''. New York: Prentice Hall, 1987. {{isbn|0-13-484668-0}}<br />
<br />
<br />
<br />
* [[Robert McEliece|McEliece, R]]. ''The Theory of Information and Coding''. Cambridge, 2002. {{isbn|978-0521831857}}<br />
<br />
<br />
<br />
*Pierce, JR. "An introduction to information theory: symbols, signals and noise". Dover (2nd Edition). 1961 (reprinted by Dover 1980).<br />
<br />
<br />
<br />
* [[Reza, F]]. ''An Introduction to Information Theory''. New York: McGraw-Hill 1961. New York: Dover 1994. {{isbn|0-486-68210-2}}<br />
<br />
<br />
<br />
* {{cite book |last1=Shannon |first1=Claude |author-link1=Claude Shannon |last2=Weaver |first2=Warren |author-link2=Warren Weaver |date=1949 |title=The Mathematical Theory of Communication |url=http://monoskop.org/images/b/be/Shannon_Claude_E_Weaver_Warren_The_Mathematical_Theory_of_Communication_1963.pdf |location=[[Urbana, Illinois]] |publisher=[[University of Illinois Press]] |lccn=49-11922 |isbn=0-252-72548-4}}<br />
<br />
<br />
<br />
* Stone, JV. Chapter 1 of book [http://jim-stone.staff.shef.ac.uk/BookInfoTheory/InfoTheoryBookMain.html "Information Theory: A Tutorial Introduction"], University of Sheffield, England, 2014. {{isbn|978-0956372857}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book/ A First Course in Information Theory]'' Kluwer Academic/Plenum Publishers, 2002. {{isbn|0-306-46791-7}}.<br />
<br />
<br />
<br />
* Yeung, RW. ''[http://iest2.ie.cuhk.edu.hk/~whyeung/book2/ Information Theory and Network Coding]'' Springer 2008, 2002. {{isbn|978-0-387-79233-0}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===Other books===<br />
<br />
<br />
其他书籍<br />
<br />
{{refbegin}}<br />
<br />
<br />
<br />
* Leon Brillouin, ''Science and Information Theory'', Mineola, N.Y.: Dover, [1956, 1962] 2004. {{isbn|0-486-43918-6}}<br />
<br />
<br />
<br />
* [[James Gleick]], ''[[The Information: A History, a Theory, a Flood]]'', New York: Pantheon, 2011. {{isbn|978-0-375-42372-7}}<br />
<br />
<br />
<br />
* A. I. Khinchin, ''Mathematical Foundations of Information Theory'', New York: Dover, 1957. {{isbn|0-486-60434-9}}<br />
<br />
<br />
<br />
* H. S. Leff and A. F. Rex, Editors, ''Maxwell's Demon: Entropy, Information, Computing'', Princeton University Press, Princeton, New Jersey (1990). {{isbn|0-691-08727-X}}<br />
<br />
<br />
<br />
* [[Robert K. Logan]]. ''What is Information? - Propagating Organization in the Biosphere, the Symbolosphere, the Technosphere and the Econosphere'', Toronto: DEMO Publishing.<br />
<br />
<br />
<br />
* Tom Siegfried, ''The Bit and the Pendulum'', Wiley, 2000. {{isbn|0-471-32174-5}}<br />
<br />
<br />
<br />
* Charles Seife, ''[[Decoding the Universe]]'', Viking, 2006. {{isbn|0-670-03441-X}}<br />
<br />
<br />
<br />
* Jeremy Campbell, ''[[Grammatical Man]]'', Touchstone/Simon & Schuster, 1982, {{isbn|0-671-44062-4}}<br />
<br />
<br />
<br />
* Henri Theil, ''Economics and Information Theory'', Rand McNally & Company - Chicago, 1967.<br />
<br />
<br />
<br />
* Escolano, Suau, Bonev, ''[https://www.springer.com/computer/image+processing/book/978-1-84882-296-2 Information Theory in Computer Vision and Pattern Recognition]'', Springer, 2009. {{isbn|978-1-84882-296-2}}<br />
<br />
<br />
<br />
* Vlatko Vedral, ''Decoding Reality: The Universe as Quantum Information'', Oxford University Press 2010. {{ISBN|0-19-923769-7}}<br />
<br />
<br />
<br />
{{refend}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
===MOOC on information theory===<br />
<br />
<br />
信息理论大型开放式课程<br />
<br />
* Raymond W. Yeung, "[http://www.inc.cuhk.edu.hk/InformationTheory/index.html Information Theory]" ([[The Chinese University of Hong Kong]])<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
==External links==<br />
<br />
<br />
外部链接<br />
<br />
{{Wikiquote}}<br />
<br />
<br />
<br />
{{Library resources box}}<br />
<br />
<br />
<br />
* {{SpringerEOM |title=Information |id=p/i051040}}<br />
<br />
<br />
<br />
* Lambert F. L. (1999), "[http://jchemed.chem.wisc.edu/Journal/Issues/1999/Oct/abs1385.html Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!]", ''Journal of Chemical Education''<br />
<br />
<br />
<br />
* [http://www.itsoc.org/ IEEE Information Theory Society] and [https://www.itsoc.org/resources/surveys ITSOC Monographs, Surveys, and Reviews]<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Cybernetics}}<br />
<br />
<br />
<br />
{{Compression methods}}<br />
<br />
<br />
<br />
{{Areas of mathematics}}<br />
<br />
<br />
<br />
{{Computer science}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{Authority control}}<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
{{DEFAULTSORT:Information Theory}}<br />
<br />
<br />
<br />
[[Category:Information theory| ]]<br />
<br />
<br />
<br />
[[Category:Computer science]]<br />
<br />
<br />
[[Category:Cybernetics]]<br />
<br />
<br />
[[Category:Formal sciences]]<br />
<br />
<br />
[[Category:Information Age]]<br />
<br />
<br />
<noinclude><br />
<br />
<small>This page was moved from [[wikipedia:en:Information theory]]. Its edit history can be viewed at [[信息论/edithistory]]</small></noinclude><br />
<br />
[[Category:待整理页面]]</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E7%94%A8%E6%88%B7:Pjhhh&diff=5121用户:Pjhhh2020-04-23T06:18:30Z<p>Pjhhh:/* Hi,我是瑾晗 */</p>
<hr />
<div>== '''Hi,我是瑾晗''' ==<br />
<br />
*'''性别:'''男<br />
*'''当前就读:'''中国民航大学空中交通管理学院研究生在读,本科也曾就读于中国民航大学空中交通管理学院<br />
*'''主要研究内容:'''空中交通流量管理、交通运输网络相关内容、交通复杂网络、网络弹性(也才是入门,感兴趣的小伙伴们阔以一起交流)<br />
*'''兴趣与爱好:'''长跑、骑行、爬山;喜欢在一个陌生的地方漫无目的闲逛;做一些自己没尝试过的菜;英雄联盟老黄金选手 佛系游戏<br />
*'''联系方式:'''mail:2019031013@cauc.edu.cn</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E8%B6%85%E5%9B%BE_Hypergraph&diff=4681超图 Hypergraph2020-04-22T09:52:36Z<p>Pjhhh:</p>
<hr />
<div><br />
我们在组织翻译超图这个词条,这个词条是之前Wolfram发的那篇长文中一个非常重要的概念,我们希望可以整理好这个词条,帮助大家更好的理解那篇文章。<br />
<br />
<br />
现在招募6个小伙伴一起翻译超图这个词条 https://wiki.swarma.org/index.php?title=超图_Hypergraph<br />
<br />
*开头正文部分+术语定义(Terminology)——十三维<br />
*二分图模型+不对称性+同构与平等——淑慧<br />
*对称超图+横截面——瑾晗<br />
*关联矩阵+超图着色+分区---厚朴<br />
*定理+超图绘制+超图语法——十三维<br />
*概括+超图学习——世康<br />
<br />
截止时间:北京时间18:00之前。<br />
<br />
<br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order of the hypergraph'', and the size of edges set is the ''size of the hypergraph''. <br />
<br />
在[[数学]]中,'''超图'''是一种广义上的[[graph (discrete mathematics)|图]],它的一条[[graph theory|边]]可以连接任意数量的[[vertex (graph theory)|顶点]]。相对而言,在普通图中,一条边只连接两个顶点。形式上,超图 <math>H</math> 是一个二元组 <math>H = (X,E)</math>,其中 <math>X</math> 是以节点或顶点为元素的集合,即顶点集,而 <math>E</math> 是由 <math>X</math> 的非空子集构成的集合,这些子集被称为超边或边。<br />
因此,<math>E</math> 是 <math>\mathcal{P}(X) \setminus\{\emptyset\}</math> 的一个子集,其中 <math>\mathcal{P}(X)</math> 是 <math>X</math> 的幂集。顶点集的大小被称为超图的阶数,边集的大小被称为超图的大小。<br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
<br />
普通图的边是节点的二元子集,而超边是节点的任意集合,因此可以包含任意数量的节点。然而,人们通常希望研究所有超边具有相同基数的超图:k-均匀超图是所有超边的大小都为 k 的超图(换句话说,这样的超图是一族集合,每个集合都是连接 k 个节点的超边)。因此,2-均匀超图就是图,3-均匀超图是无序三元组的集合,依此类推。超图也被称为从[[全集]](universal set)中抽取的集合系统或[[集族]]。<br />
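<br />
按照上面的定义,下面给出一个表示超图并检验其是否为 k-均匀的最小草图(数据结构与顶点命名均为假设,仅作示意):<br />
<syntaxhighlight lang="python"><br />
# 超图 H = (X, E):X 为顶点集,E 为由 X 的非空子集(超边)构成的集合<br />
X = {"a", "b", "c", "d"}<br />
E = [frozenset(s) for s in [{"a", "b", "c"}, {"b", "c", "d"}, {"a", "c", "d"}]]<br />
<br />
def is_k_uniform(edges, k):<br />
    """当所有超边的基数都为 k 时,超图是 k-均匀的。"""<br />
    return all(len(e) == k for e in edges)<br />
<br />
print("阶(顶点数):", len(X))<br />
print("大小(边数):", len(E))<br />
print("3-均匀?", is_k_uniform(E, 3))  # True:每条超边恰含 3 个顶点<br />
print("2-均匀?", is_k_uniform(E, 2))  # False:2-均匀超图就是普通图<br />
</syntaxhighlight><br />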
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
超图可以看作[[关联结构]](incidence structure)。特别地,每个超图都对应一个二分的“关联图”或“[[列维图]]”(Levi graph);反之,大多数(但不是全部)[[二分图]]都可以看作超图的关联图。<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
超图还有许多其它名称。在[[计算几何学]]中,超图有时可以被称为'''范围空间'''(range space),将超图的边称为''范围''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
在[[合作博弈论]]中,超图被称为'''简单博弈'''(投票博弈);这个概念被应用于解决[[社会选择理论]](social choice theory)中的问题。在一些文献中,边被称为''超链接''或''连接器''。<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
特殊类型的超图包括:上文简单讨论过的 k-均匀超图;散簇(clutter),其中没有一条边是另一条边的子集;以及[[抽象单纯复形]](abstract simplicial complexes),它包含每条边的所有子集。<br />
超图的全体构成一个以超图同态为[[态射]](morphism)的范畴。<br />
<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
超图有不同的类型,如:<br />
* 空超图:没有边的超图<br />
* 非简单(或多重)超图:允许有循环(有单个顶点的超边)或重复边的超图,也就是说可以有两个或两个以上的边包含同一组顶点。<br />
* 简单超图:不包含循环和重复边的超图。<br />
* 𝑘-均匀超图:每条超边都正好包含 k 个顶点的超图。<br />
* 𝑑-正则超图:每个顶点的度数都是 𝑑 的超图<br />
* 无环超图:不包含任何圈的超图。<br />
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
因为超图的链接可以有任意基数,所以有几种子图的概念,分别是''子超图''(subhypergraphs)、''部分超图''(partial hypergraphs)和''分段超图''(section hypergraphs)。<br />
<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
设 <math>H=(X,E)</math> 为一个超图,其顶点集为<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
边集为<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
其中 <math>I_v</math> 和 <math>I_e</math> 分别是顶点和边的索引集。<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
子超图是去掉某些顶点的超图。形式上,由 <math>A \subseteq X</math> 诱导的子超图 <math>H_A</math> 定义为<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land e \cap A \neq \emptyset \rbrace \right).</math><br />
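<br />
下面的最小草图(集合均为假设示例)按该定义计算由顶点子集 <math>A</math> 诱导的子超图 <math>H_A</math> 的边:<br />
<syntaxhighlight lang="python"><br />
# 假设的超图边集与顶点子集<br />
E = [{"a", "b"}, {"b", "c"}, {"c", "d"}]<br />
A = {"a", "b", "c"}<br />
<br />
# H_A = (A, { e ∩ A : e ∈ E, e ∩ A ≠ ∅ })<br />
E_A = [e & A for e in E if e & A]<br />
print("诱导子超图的边:", E_A)  # [{'a','b'}, {'b','c'}, {'c'}]<br />
</syntaxhighlight><br />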
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph where each hyperedge of <math>H</math> that is partially contained in the subhypergraph <math>H_A</math> is fully contained in the extension <math>Ex(H_A)</math>. Formally<br />
一个子超图的扩展是一个超图,其中 <math>H</math> 的每条部分包含在子超图 <math>H_A</math> 中的超边,都完全包含在扩展 <math>Ex(H_A)</math> 中。即在形式上:<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
部分超图是去掉一些边的超图。给定边索引集的一个子集 <math>J \subset I_e</math>,由 <math>J</math> 生成的部分超图是超图<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
而给定一个子集 <math>A\subseteq X</math>,分段超图则是部分超图<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
<math>H</math> 的对偶 <math>H^*</math> 是一个顶点和边互换的超图,其顶点由 <math>\lbrace e_i \rbrace</math> 给出,边由 <math>\lbrace X_m \rbrace</math> 给出,其中<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
当相等的概念被恰当定义时(见下文),取超图对偶的运算是一个对合(involution),即:<br />
:<math>\left(H^*\right)^* = H.</math><br />
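<br />
下面的草图(沿用假设的集合表示)按定义计算超图的对偶:原来的边 <math>e_i</math> 成为顶点,每个原顶点 <math>x_m</math> 对应一条新边 <math>X_m</math>:<br />
<syntaxhighlight lang="python"><br />
# 假设的超图:边以列表给出,e_i 的下标 i 即其标号<br />
X = ["a", "b", "c", "d"]<br />
E = [{"a", "b"}, {"b", "c"}, {"c", "d"}]<br />
<br />
def dual(X, E):<br />
    """对偶超图:顶点为原边的标号,原顶点 x 变为边 X_x = {i | x ∈ e_i}。"""<br />
    Xstar = list(range(len(E)))<br />
    Estar = [{i for i, e in enumerate(E) if x in e} for x in X]<br />
    return Xstar, Estar<br />
<br />
Xs, Es = dual(X, E)<br />
print("对偶的顶点:", Xs)  # [0, 1, 2]<br />
print("对偶的边:", Es)    # [{0}, {0, 1}, {1, 2}, {2}]<br />
</syntaxhighlight><br />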
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
与连通超图 ''H'' 具有相同顶点集的连通图 ''G'',如果 ''H'' 的每条超边都在 ''G'' 中诱导出一个连通子图,则 ''G'' 是 ''H'' 的主图(host graph);<br />
对于不连通的超图 ''H'',如果 ''G'' 与 ''H'' 的连通分量之间存在一个双射,使得 ''G'' 的每个连通分量 ''G<nowiki>'</nowiki>'' 都是对应的 ''H<nowiki>'</nowiki>'' 的主图,则 ''G'' 是主图。<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
一个超图是二分(bipartite)的,当且仅当它的顶点可以被划分为两类 ''U'' 和 ''V'',使得每条基数至少为 2 的超边都至少包含来自这两类中各一个顶点。这样的超图也被称为具有属性 B(Property B)。<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
超图的'''2-段'''(或'''团图'''、'''代表图'''、'''原始图'''、'''Gaifman 图''')是与超图具有相同顶点的图,其中任意两个包含在同一条超边中的顶点之间都有一条边。<br />
<br />
==二部图模型 Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
<br />
[[File:bipartie graph.jpeg|200px|缩略图|右| 设<math>G=(V,E)</math>是一个无向图,如果顶点V可分割为两个互不相交的子集<math> {(group1, group2)}</math>,并且图中的每条边<math>{(i,j)}</math>所关联的两个顶点<math>{i}</math>和<math>{j}</math>分别属于这两个不同的部分<math>{(i \in group1,j \in group2)}</math>,则称图<math>{G}</math>为一个二部图。]]<br />
<br />
一个'''超图 <math>{H}</math>'''可以用二部图 <math>{BG}</math> 表示,其构成如下:集合 "X" 和 "E" 是 <math>{BG}</math> 的两个部分,并且 ("x<sub>1</sub>", "e<sub>1</sub>") 之间有边相连当且仅当顶点 "x<sub>1</sub>" 包含在 <math>H</math> 的边 "e<sub>1</sub>" 中。反之,任何两部分固定且第二部分中没有孤立节点的二部图都以上述方式表示某个超图。这个二部图也称为'''关联图'''。<br />
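<br />
作为上述二部图模型的一个最小示意(表示方式为假设),下面把一个超图转换为它的关联图:左部为顶点,右部为边的标号,(''x'', ''e<sub>i</sub>'') 相连当且仅当 ''x'' 属于 ''e<sub>i</sub>'':<br />
<syntaxhighlight lang="python"><br />
# 假设的超图:顶点与带标号的超边<br />
X = ["a", "b", "c", "d"]<br />
E = [{"a", "b", "c"}, {"c", "d"}]<br />
<br />
def incidence_graph(E):<br />
    """返回关联图(列维图)的边表:左部为顶点,右部为 'e0', 'e1', ...。"""<br />
    return [(x, f"e{i}") for i, e in enumerate(E) for x in sorted(e)]<br />
<br />
for x, e in incidence_graph(E):<br />
    print(x, "--", e)  # 每条边 (x, e_i) 表示顶点 x 属于超边 e_i<br />
</syntaxhighlight><br />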
<br />
==无环性 Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
普通无向图中的'''圈 cycle'''与'''无环图 acyclic graph'''只有单一的自然概念;与此不同,超图的'''无环性 acyclicity'''有多种自然且互不等价的定义,它们在普通图这一特殊情形下都退化为普通图的无环性。<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
Claude Berge 给出了超图无环性的首个定义:<ref>Claude Berge,[https://www.amazon.com/Graphs-hypergraphs-North-Holland-mathematical-library/dp/0444103996 ''Graphs and Hypergraphs'']</ref> 如果一个超图的'''关联图'''(上面定义的二部图)是无环的,则称这个超图是 Berge 无环的 Berge-acyclic。这个定义是非常严格的:例如,如果一个超图有一对顶点 <math>v \neq v'</math> 和一对超边 <math>f \neq f'</math>,使得 <math>v, v' \in f</math> 且 <math>v, v' \in f'</math>,那么它就是 Berge 成环的 Berge-cyclic。通过遍历关联图,可以在线性时间 linear time 内检验 Berge 成环性。<br />
<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]]. In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
<br />
此处,我们可以定义一个较弱的超图无环性概念<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref>,后来被称为 <math>{\alpha}</math>-无环性 <math>{\alpha}</math>-acyclicity。这个无环性概念等价于:超图是共形的(原图的每个团都被某条超边覆盖),且其原图是弦图 chordal graph;它也等价于可以通过 GYO 算法 Graham-Yu-Ozsoyoglu Algorithm(也称为格雷厄姆算法 Graham's algorithm)约化为空图<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref>。GYO 算法是一个合流 confluence(抽象重写 abstract rewriting)迭代过程,它使用耳朵 ear 的广义定义逐步去除超边(图论中的耳朵定义为一条路径,其中除端点外各点的度数均为 2,端点可以重合,且删去后不破坏图的连通性)。众所周知,在数据库理论 database theory 领域中,如果一个数据库模式 database schema 的底层超图是 <math>{\alpha}</math>-无环的,那么它就具有某些理想的性质。<ref>Serge Abiteboul, Richard B. Hull, Victor Vianu, ''Foundations of Databases''</ref> 除此之外,<math>{\alpha}</math>-无环性也与一阶逻辑 first-order logic 的受保护片段 guarded fragment 的表达能力有关。<br />
<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
我们可以在线性时间 linear time 内检验一个超图是否是 <math>{\alpha}</math>-无环的。<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
注意,<math>{\alpha}</math>-无环性具有与直觉相悖的性质:向 <math>{\alpha}</math>-成环超图中添加超边可能使其变为 <math>{\alpha}</math>-无环(例如,添加一条包含超图所有顶点的超边总会使其成为 <math>{\alpha}</math>-无环)。部分由于这一缺点,Ronald Fagin<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> 定义了更强的 <math>{\beta}</math>-无环性 <math>{\beta}</math>-acyclicity 和 <math>{\gamma}</math>-无环性 <math>{\gamma}</math>-acyclicity 概念。我们可以把 <math>{\beta}</math>-无环性表述为:要求超图的所有子超图都是 <math>{\alpha}</math>-无环的,这与 Graham 的早期定义<ref name="graham1979universal"/>等价<ref name="fagin1983degrees"/>。<math>{\gamma}</math>-无环性的概念是一个更严格的条件,它等价于数据库模式的几个理想性质,并且与 Bachman 图 Bachman diagrams 有关。<math>{\beta}</math>-无环性和 <math>{\gamma}</math>-无环性都可以在多项式时间 polynomial time(PTIME)内完成检测。<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
这四个无环性概念是可以相互比较的:Berge-无环性蕴含 <math>{\gamma}</math>-无环性,<math>{\gamma}</math>-无环性蕴含 <math>{\beta}</math>-无环性,<math>{\beta}</math>-无环性又蕴含 <math>{\alpha}</math>-无环性。然而,反向的蕴含关系均不成立,因此这四个概念互不相同。<ref name="fagin1983degrees" /><br />
<br />
==同构和相等 Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
超图同态 homomorphism是指从一个超图的顶点集到另一个超图的顶点集的映射,如此使得每条边映射到另一条边。<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]] <br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of <math>I</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
<br />
如果存在一个双射<br />
:<math>\phi:X \to Y</math><br />
和 <math>I</math> 上的一个置换 <math>\pi</math>,使得<br />
:<math>\phi(e_i) = f_{\pi(i)},</math><br />
则称超图 <math>H=(X,E)</math> 与超图 <math>G=(Y,F)</math> 同构 isomorphic,记作 <math>H \simeq G</math>。这个双射 <math>\phi</math> 被称为图的同构 isomorphism。注意,<math>H \simeq G</math> 当且仅当 <math>H^* \simeq G^*</math>。<br />
<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
<br />
当超图的边被明确标记时,就有了'''“强同构 strong isomorphism”'''这个进一步的概念。 如果上述置换是恒等置换,则称 <math>H</math> 强同构于 <math>G</math>,记作 <math>H \cong G</math>。 注意,所有强同构的图都是同构的,但反之不成立。<br />
<br />
当超图的顶点被明确标记时,就有了'''“等价 equivalence”'''和'''“相等 equality”'''的概念。 如果同构 <math>\phi</math> 满足:<br />
<br />
<math>\phi(x_n) = y_n</math><br />
<br />
而且:<br />
<br />
<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
则称 <math>H</math> 与 <math>G</math> '''等价''',记作 <math>H\equiv G</math>。 注意:<br />
<math>H\equiv G</math> 当且仅当 <math>H^* \cong G^*</math><br />
<br />
进一步地,如果置换 <math>\pi</math> 是恒等置换,则称 <math>H</math> '''等于''' <math>G</math>,记作 <math>H=G</math>。 注意,在这一相等的定义下,图是自对偶的:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
超图'''自同构 automorphism'''是从顶点集到其自身的同构,也就是对顶点的重新标号。 超图 ''H'' (= (''X'',&nbsp;''E'')) 的全体自同构在复合运算下构成一个群 group,称为该超图的'''自同构群 automorphism group''',记作 Aut(''H'')。<br />
<br />
===例子 Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
<br />
<br />
考虑超图<math>H</math>,它的边为:<br />
<br />
<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
<br />
和超图<math>G</math>:<br />
<br />
<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
很明显 <math>H</math> 和 <math>G</math> 同构(取 <math>\phi(a)=\alpha</math> 等),但它们不是强同构的。 例如,在超图 <math>H</math> 中,顶点 <math>a</math> 同时属于第 1、4、6 三条边,所以:<br />
<br />
<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
而在图 <math>G</math> 中,不存在同时属于第 1、4、6 三条边的顶点:<br />
<br />
<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
在这个例子中,<math>H</math> 和 <math>G</math> 是等价的,<math>H\equiv G</math>,而且两者的对偶是强同构的:<math>H^*\cong G^*</math>。<br />
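<br />
上例中的交集计算可以用几行 Python 直接验证(仅为示意,变量名取自上文):<br />
<syntaxhighlight lang="python">
# 用 set 表示上例中两个超图的超边
H = {"e1": {"a", "b"}, "e2": {"b", "c"}, "e3": {"c", "d"},
     "e4": {"d", "a"}, "e5": {"b", "d"}, "e6": {"a", "c"}}
G = {"f1": {"α", "β"}, "f2": {"β", "γ"}, "f3": {"γ", "δ"},
     "f4": {"δ", "α"}, "f5": {"α", "γ"}, "f6": {"β", "δ"}}

print(H["e1"] & H["e4"] & H["e6"])  # {'a'}:顶点 a 同时落在第 1、4、6 条边中
print(G["f1"] & G["f4"] & G["f6"])  # set():G 中不存在这样的顶点
</syntaxhighlight>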
<br />
==对称超图 Symmetric hypergraphs==<br />
The ''rank'' <math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
<br />
超图<math>H</math>的'''秩''' <math>r(H)</math>是该超图中边的最大'''基数'''。如果所有边具有相同的基数''k'',则称该超图为均匀的或''k''-均匀的,或称之为''k''-超图。普通图就是一个2-均匀超图。<br />
<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
<br />
'''顶点'''''v''的'''度'''''d(v)''表示包含该顶点的边的数量。如果每个顶点的度都为''k'',则超图''H''是'''k-正则'''的。<br />
<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
<br />
均匀超图的对偶是正则的,反之亦然。<br />
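<br />
上面的秩、均匀性、度与正则性等定义都可以写成简短的判断函数。下面是一个 Python 示意(函数名为说明而设的假设):<br />
<syntaxhighlight lang="python">
def rank(edges):
    """秩 r(H):超边的最大基数。"""
    return max(len(e) for e in edges)

def degree(v, edges):
    """顶点 v 的度 d(v):包含 v 的超边数目。"""
    return sum(v in e for e in edges)

def is_k_uniform(edges, k):
    return all(len(e) == k for e in edges)               # 所有超边基数均为 k

def is_k_regular(vertices, edges, k):
    return all(degree(v, edges) == k for v in vertices)  # 所有顶点度均为 k

triangle = [{1, 2}, {2, 3}, {3, 1}]                # 普通图即 2-均匀超图
print(rank(triangle), is_k_uniform(triangle, 2))   # 2 True
print(is_k_regular({1, 2, 3}, triangle, 2))        # True:它同时是 2-正则的
</syntaxhighlight>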
<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
<br />
如果存在一个使得<math>\phi(x)=y</math>的自同构,则称超图''H''的两个顶点''x''和''y''是'''对称'''的。如果存在一个使得<math>\phi(e_i)=e_j</math>的自同构,则称两条边<math>e_i</math>和<math>e_j</math>是'''对称'''的。<br />
<br />
A hypergraph is said to be ''vertex-transitive'' (or ''vertex-symmetric'') if all of its vertices are symmetric. Similarly, a hypergraph is ''edge-transitive'' if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply ''transitive''.<br />
<br />
如果超图的所有顶点都是对称的,则称其为'''顶点传递的 vertex-transitive'''(或'''顶点对称的''')。类似地,如果所有边都是对称的,则该超图是'''边传递的 edge-transitive'''。 如果一个超图既是边对称的又是顶点对称的,则该超图就简称为'''传递的 transitive'''。<br />
<br />
Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.<br />
<br />
由于超图的对偶性,对边传递性的研究与对顶点传递性的研究是完全等同的。<br />
<br />
==横截集 Transversals==<br />
A ''[[Transversal (combinatorics)|transversal]]'' (or "[[hitting set]]") of a hypergraph ''H'' = (''X'', ''E'') is a set <math>T\subseteq X</math> that has nonempty [[intersection (set theory)|intersection]] with every edge. A transversal ''T'' is called ''minimal'' if no proper subset of ''T'' is a transversal. The ''transversal hypergraph'' of ''H'' is the hypergraph (''X'', ''F'') whose edge set ''F'' consists of all minimal transversals of ''H''.<br />
<br />
超图''H'' = (''X'', ''E'')的'''横截集'''(或'''命中集 hitting set''')是一个与每条边都有非空交集的集合<math>T\subseteq X</math>。如果''T''的任何真子集都不是横截集,则称''T''为'''极小横截集'''。''H''的'''横截超图'''是超图(''X'', ''F''),其边集''F''由''H''的所有极小横截集构成。<br />
<br />
Computing the transversal hypergraph has applications in [[combinatorial optimization]], in [[game theory]], and in several fields of [[computer science]] such as [[machine learning]], [[Index (database)|indexing of database]]s, the [[Boolean satisfiability problem|satisfiability problem]], [[data mining]], and computer [[program optimization]].<br />
<br />
计算横截超图在[[组合优化 Combinatorial Optimization]]、[[博弈论 Game Theory]]和[[计算机科学 Computer Science]]的若干领域(例如[[机器学习 Machine Learning]]、[[数据库索引 Indexing of Databases]]、[[可满足性问题 the Satisfiability Problem]]、[[数据挖掘 Data Mining]]和[[计算机程序优化 Program Optimization]])都有应用。<br />
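<br />
在很小的例子上,极小横截集可以用穷举法直接算出。下面是一个示意性的 Python 草稿(复杂度为指数级,仅用于演示定义,函数名为假设):<br />
<syntaxhighlight lang="python">
from itertools import combinations

def minimal_transversals(vertices, edges):
    """枚举所有非空顶点子集,保留与每条超边都相交、
    且没有真子集同为横截集的那些。"""
    vertices = sorted(vertices)
    hitting = [set(c) for r in range(1, len(vertices) + 1)
               for c in combinations(vertices, r)
               if all(set(c) & set(e) for e in edges)]
    return [t for t in hitting if not any(s < t for s in hitting)]

print(minimal_transversals({1, 2, 3}, [{1, 2}, {2, 3}]))  # [{2}, {1, 3}]
</syntaxhighlight>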
<br />
==关联矩阵 Incidence matrix==<br />
Let <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math> and <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>. Every hypergraph has an <math>n \times m</math> [[incidence matrix]] <math>A = (a_{ij})</math> where<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
The [[transpose]] <math>A^t</math> of the [[incidence (geometry)|incidence]] matrix defines a hypergraph <math>H^* = (V^*,\ E^*)</math> called the '''dual''' of <math>H</math>, where <math>V^*</math> is an ''m''-element set and <math>E^*</math> is an ''n''-element set of subsets of <math>V^*</math>. For <math>v^*_j \in V^*</math> and <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[if and only if]] <math>a_{ij} = 1</math>.<br />
<br />
<br />
分别设 <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math>, <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>。<br />
每一个超图都有一个 <math>n \times m</math> 的[[关联矩阵]]<math>A = (a_{ij})</math>,其中:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
<br />
关联矩阵的[[转置]] <math>A^t</math> 定义了一个超图 <math>H^* = (V^*,\ E^*)</math>,称为 <math>H</math> 的'''对偶''',其中 <math>V^*</math> 是一个 ''m'' 元集合,<math>E^*</math> 是由 <math>V^*</math> 的子集构成的 ''n'' 元集合。<br />
<br />
对于<math>v^*_j \in V^*</math> 和 <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[当且仅当]] <math>a_{ij} = 1</math>。<br />
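<br />
按照上面的定义,关联矩阵与由其转置给出的对偶超图都可以直接构造。下面是一个使用 numpy 的示意(顶点和超边取一个假设的小例子):<br />
<syntaxhighlight lang="python">
import numpy as np

V = ["v1", "v2", "v3", "v4"]
E = [{"v1", "v2"}, {"v2", "v3", "v4"}, {"v4"}]

# 关联矩阵 A:a_ij = 1 当且仅当 v_i ∈ e_j
A = np.array([[1 if v in e else 0 for e in E] for v in V])
print(A)           # 4×3 的 0-1 矩阵

# 对偶超图 H* 由转置 A^t 给出:原来的超边 e_j 成为顶点,
# 原来的顶点 v_i 成为超边 X_i = { e_j : v_i ∈ e_j }
dual_edges = [{f"e{j + 1}" for j in range(len(E)) if A[i, j]}
              for i in range(len(V))]
print(dual_edges)  # [{'e1'}, {'e1', 'e2'}, {'e2'}, {'e2', 'e3'}](集合内顺序可能不同)
</syntaxhighlight>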
<br />
==超图着色 Hypergraph coloring==<br />
Classic hypergraph coloring is assigning one of the colors from set <math>\{1,2,3,...\lambda\}</math> to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. Minimum number of used distinct colors over all colorings is called the chromatic number of a hypergraph.<br />
<br />
经典超图着色是将集合<math>\{1,2,3,...\lambda\}</math>中的某种颜色赋给超图的每个顶点,使每条超边都至少包含两个颜色不同的顶点。换句话说,不能存在基数至少为2的单色超边。在这个意义上,它是普通图着色的直接推广。所有着色方案中所用不同颜色数的最小值称为超图的'''色数'''。<br />
<br />
Hypergraphs for which there exists a coloring using up to ''k'' colors are referred to as ''k-colorable''. The 2-colorable hypergraphs are exactly the bipartite ones.<br />
如果超图存在一种至多使用''k''种颜色的着色,则称其为'''''k''-可着色的'''。2-可着色的超图恰好就是二分超图。<br />
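<br />
2-可着色性可以在小例子上穷举检验。下面的 Python 草稿仅为示意(函数名为假设);其中 Fano 平面是不可 2-着色的 3-均匀超图的著名例子:<br />
<syntaxhighlight lang="python">
from itertools import product

def is_two_colorable(vertices, edges):
    """穷举全部 2^n 种染色,检验是否存在一种染色,
    使每条基数 ≥2 的超边都不是单色的;仅适用于小例子。"""
    vertices = sorted(vertices)
    for colors in product([0, 1], repeat=len(vertices)):
        c = dict(zip(vertices, colors))
        if all(len({c[v] for v in e}) > 1 for e in edges if len(e) >= 2):
            return True
    return False

fano = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
        {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]          # Fano 平面的 7 条线
print(is_two_colorable(range(1, 8), fano))        # False:色数大于 2
print(is_two_colorable({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}]))  # True:路径是二分的
</syntaxhighlight>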
<br />
There are many generalizations of classic hypergraph coloring. One of them is the so-called mixed hypergraph coloring, when monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, then the minimum and maximum number of used colors are called the lower and upper chromatic numbers respectively. See http://spectrum.troy.edu/voloshin/mh.html for details.<br />
<br />
经典超图着色有许多推广,其中之一是所谓的'''混合超图着色''',它允许出现单色边。一些混合超图对于任意数量的颜色都是不可着色的,而判定不可着色性的一般准则目前尚不清楚。当一个混合超图可着色时,所用颜色数的最小值和最大值分别称为'''下色数'''和'''上色数'''。详情请参阅 http://spectrum.troy.edu/voloshin/mh.html 。<br />
<br />
==分区 Partitions==<br />
A partition theorem due to E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> states that, for an edge-transitive hypergraph <math>H=(X,E)</math>, there exists a [[partition of a set|partition]]<br />
<br />
:<math>(X_1, X_2,\cdots,X_K)</math><br />
<br />
of the vertex set <math>X</math> such that the subhypergraph <math>H_{X_k}</math> generated by <math>X_k</math> is transitive for each <math>1\le k \le K</math>, and such that<br />
<br />
:<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math><br />
<br />
where <math>r(H)</math> is the rank of ''H''.<br />
<br />
As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.<br />
<br />
<br />
由 E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> 提出的一个分区定理表明:对于边传递超图 <math>H=(X,E)</math>,顶点集 <math>X</math> 存在一个[[分区]] <math>(X_1, X_2,\cdots,X_K)</math>,使得对每个 <math>1\le k \le K</math>,由 <math>X_k</math> 生成的子超图 <math>H_{X_k}</math> 都是传递的,并且满足 <math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math>,其中 <math>r(H)</math> 是 ''H'' 的秩。<br />
<br />
作为推论,不是顶点传递的边传递超图是可双着色的(即 2-可着色的)。<br />
<br />
<br />
[[Graph partitioning]] (and in particular, hypergraph partitioning) has many applications to IC design<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> and parallel computing.<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref> Efficient and Scalable [[Graph partition|hypergraph partitioning algorithms]] are also important for processing large scale hypergraphs in machine learning tasks.<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
<br />
[[图分区]](特别是超图分区)在集成电路设计<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> 和并行计算<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref>中有很多应用。在机器学习任务中,高效、可扩展的[[超图分区算法]]对于处理大规模超图也很重要。<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
==定理 Theorems==<br />
Many [[theorem]]s and concepts involving graphs also hold for hypergraphs. [[Ramsey's theorem]] and [[Line graph of a hypergraph]] are typical examples. Some methods for studying symmetries of graphs extend to hypergraphs.<br />
<br />
Two prominent theorems are the [[Erdős–Ko–Rado theorem]] and the [[Kruskal–Katona theorem]] on uniform hypergraphs.<br />
<br />
许多涉及图的定理和概念也适用于超图,典型的例子有[[拉姆西定理]](Ramsey's theorem)和超图的线图。研究图的对称性的一些方法也被扩展到超图。<br />
均匀超图上有[[Erdős-Ko-Rado theorem]]和[[Kruskal-Katona theorem]]两个著名定理。<br />
<br />
==超图绘制 Hypergraph drawing==<br />
[[File:CircuitoDosMallas.png|thumb|This [[circuit diagram]] can be interpreted as a drawing of a hypergraph in which four vertices (depicted as white rectangles and disks) are connected by three hyperedges drawn as trees.(这张电路图可以解释为一幅超图的图示,其中四个顶点(用白色矩形和圆盘表示)由三条画成树状的超边连接)]]<br />
<br />
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.<br />
尽管超图比图更难画在纸上,但一些研究者已经研究了超图可视化方法。<br />
<br />
In one possible visual representation for hypergraphs, similar to the standard [[graph drawing]] style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves.<ref>{{citation<br />
| last = Sander | first = G.<br />
| contribution = Layout of directed hypergraphs with orthogonal hyperedges<br />
| pages = 381–386<br />
| publisher = Springer-Verlag<br />
| series = [[Lecture Notes in Computer Science]]<br />
| title = Proc. 11th International Symposium on Graph Drawing (GD 2003)<br />
| contribution-url = http://gdea.informatik.uni-koeln.de/585/1/hypergraph.ps<br />
| volume = 2912<br />
| year = 2003| title-link = International Symposium on Graph Drawing<br />
}}.</ref><ref>{{citation<br />
| last1 = Eschbach | first1 = Thomas<br />
| last2 = Günther | first2 = Wolfgang<br />
| last3 = Becker | first3 = Bernd<br />
| issue = 2<br />
| journal = [[Journal of Graph Algorithms and Applications]]<br />
| pages = 141–157<br />
| title = Orthogonal hypergraph drawing for improved visibility<br />
| url = http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf<br />
| volume = 10<br />
| year = 2006 | doi=10.7155/jgaa.00122}}.</ref> If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as [[simple closed curve]]s that enclose sets of points.<ref>{{citation<br />
| last = Mäkinen | first = Erkki<br />
| doi = 10.1080/00207169008803875<br />
| issue = 3<br />
| journal = International Journal of Computer Mathematics<br />
| pages = 177–185<br />
| title = How to draw a hypergraph<br />
| volume = 34<br />
| year = 1990}}.</ref><ref>{{citation<br />
| last1 = Bertault | first1 = François<br />
| last2 = Eades | first2 = Peter | author2-link = Peter Eades<br />
| contribution = Drawing hypergraphs in the subset standard<br />
| doi = 10.1007/3-540-44541-2_15<br />
| pages = 45–76<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 8th International Symposium on Graph Drawing (GD 2000)<br />
| volume = 1984<br />
| year = 2001| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><ref>{{citation<br />
| last1 = Naheed Anjum | first1 = Arafat<br />
| last2 = Bressan | first2 = Stéphane<br />
| contribution = Hypergraph Drawing by Force-Directed Placement<br />
| doi = 10.1007/_31<br />
| pages = 387–394<br />
| publisher = Springer International Publishing<br />
| series = Lecture Notes in Computer Science<br />
| title = 28th International Conference on Database and Expert Systems Applications (DEXA 2017)<br />
| volume = 10439<br />
| year = 2017| isbn = <br />
}}.</ref><br />
<br />
其中一种超图的可视化表示法类似于标准的图绘制方式(即用平面内的曲线描绘图的边):把超图的顶点画成点、圆盘或方框,把超边画成以这些顶点为叶子的树[16][17]。如果顶点表示为点,超边也可以画成连接点集的光滑曲线,或画成包围点集的简单闭合曲线[18][19][20]。 <br />
<br />
[[File:Venn's four ellipse construction.svg|thumb|An order-4 Venn diagram, which can be interpreted as a subdivision drawing of a hypergraph with 15 vertices (the 15 colored regions) and 4 hyperedges (the 4 ellipses).(一个4阶维恩图,可以被解释为一个15个顶点(15个有色区域)和4个超边(4个椭圆)的超图的细分图)]]<br />
<br />
In another style of hypergraph visualization, the subdivision model of hypergraph drawing,<ref>{{citation<br />
| last1 = Kaufmann | first1 = Michael<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Speckmann | first3 = Bettina | author3-link = Bettina Speckmann<br />
| contribution = Subdivision drawings of hypergraphs<br />
| doi = 10.1007/_39<br />
| pages = 396–407<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 16th International Symposium on Graph Drawing (GD 2008)<br />
| volume = 5417<br />
| year = 2009| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 2006}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><br />
<br />
超图可视化的另一种样式是超图绘制的细分模型[21]:平面被细分为若干区域,每个区域代表超图的一个顶点;超图的超边由这些区域的连通子集表示,可以通过着色、在其周围画出轮廓或两者并用来标示。例如,一个''n''阶[[维恩图]]可以看作一个具有''n''条超边(定义该图的曲线)和2<sup>''n''</sup>&nbsp;−&nbsp;1个顶点(这些曲线把平面细分出的区域)的超图的细分图。与平面图可以在多项式时间内识别不同,判定一个超图是否有平面细分图是[[NP完全]]的[22];但当区域的邻接模式被限制为路径、圈或树时,这类图示是否存在可以被有效地检验[23]。<br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
<br />
超图的另一种表示法被称作 PAOH[24],如本文顶部的图所示:边是连接顶点的垂直线,顶点在左侧对齐,右侧的图例给出各条边的名称。它是为动态超图设计的,但也可以用于简单超图。<br />
<br />
==超图语法 Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
通过给一类超图添加替换规则,[[图语法]]可以被推广到允许超边。<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
<br />
<br />
==超图概念的延伸==<br />
<br />
超图的概念可以进一步推广,例如允许超图中的一些边指向另一些边。这种推广有两种变体。在第一种变体中,超图的边不仅可以由一组顶点构成,还可以包含顶点的子集、顶点子集的子集,依此类推,以至无穷。本质上,这样的每条边只是树或[[有向无环图]]的一个内部节点,而顶点是叶子节点。于是超图就是一族带有公共共享节点的树(即某个内部节点或叶子可以同时出现在多棵树中);反过来,每个树的集合也都可以理解为这种广义超图。由于树结构在[[计算机科学]]和许多数学分支中被广泛使用,可以说超图也是自然出现的。例如,这种推广自然地作为[[项代数]]的模型出现:边对应于项,顶点对应于常量或变量。<br />
<br />
<br />
对于这样的超图,集合的隶属关系给出了一种排序,但这种排序既不是[[偏序]]也不是[[预序]],因为它不具有传递性。与这一推广的 Levi 图相对应的图是[[有向无环图]]。例如,考虑顶点集为 <math>V= \{a,b\}</math>、边为 <math>e_1=\{a,b\}</math> 和 <math>e_2=\{a,e_1\}</math> 的广义超图。那么,虽然 <math>b\in e_1</math> 且 <math>e_1\in e_2</math>,但 <math>b\in e_2</math> 并不成立。不过,这类超图上集合隶属关系的[[传递闭包]]确实诱导出一个[[偏序]],并把超图“展平”为一个[[偏序集]]。<br />
<br />
<br />
在第二种变体中,允许边指向其他边,而不要求这些边能排成有向无环图。这就允许出现由边构成的圈,而这些圈完全不需要包含任何顶点。例如,考虑由两条边 <math>e_1</math> 和 <math>e_2</math> 组成、顶点个数为零的广义超图,使得 <math>e_1 = \{e_2\}</math> 且 <math>e_2 = \{e_1\}</math>。由于这个圈是无限递归的,作为集合的这些边违反了[[基础公理]]。特别地,对这样的超图而言,集合隶属关系不存在传递闭包。这样的结构乍看起来可能很奇怪,但只要注意到其 Levi 图的相应推广不再是[[二分图]]、而只是一般的[[有向图]],就很容易理解了。<br />
<br />
根据定义,这种超图的广义关联矩阵是一个方阵,其秩等于节点和边的总数。因此,对于上面的示例,关联矩阵为:<br />
<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right]</math>。<br />
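<br />
上面的例子也可以直接写成代码来构造其广义关联矩阵:把顶点与边统一编号(本例没有顶点,矩阵只由两条边张成)。以下 numpy 片段仅为示意,变量名为假设:<br />
<syntaxhighlight lang="python">
import numpy as np

# e1 = {e2}, e2 = {e1}:没有顶点,两条边互相包含对方
objects = ["e1", "e2"]
members = {"e1": {"e2"}, "e2": {"e1"}}

# 广义关联矩阵:第 (i, j) 项为 1 当且仅当 objects[i] ∈ members[objects[j]]
A = np.array([[1 if objects[i] in members[objects[j]] else 0
               for j in range(len(objects))]
              for i in range(len(objects))])
print(A)
# [[0 1]
#  [1 0]]
</syntaxhighlight>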
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsoford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Tue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
<br />
<br />
==超图与机器学习==<br />
<br />
超图已被广泛用作[[机器学习]]任务中的数据模型和分类器[[正则化]]手段[25]。其应用包括[[推荐系统]](社区作为超边)[26]、[[图像检索]](相关性作为超边)[27]和[[生物信息学]](生物化学相互作用作为超边)[28]。有代表性的超图学习技术包括:用超图拉普拉斯算子扩展[[谱图理论]]的超图[[谱聚类]][29],以及通过引入额外的超图结构代价来约束学习结果的超图[[半监督学习]]。对于大规模超图,还可以使用基于[[Apache Spark]]构建的分布式框架[15]。<br />
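<br />
作为示意,下面给出超图谱聚类中使用的归一化超图拉普拉斯矩阵 <math>\Delta = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}</math>(即上文所引 Zhou 等人的定义)的一个 numpy 草稿;边权取全 1 是本例的假设:<br />
<syntaxhighlight lang="python">
import numpy as np

def hypergraph_laplacian(H, w=None):
    """归一化超图拉普拉斯:Δ = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}。
    H: n×m 关联矩阵;w: m 条超边的权重,默认全 1。"""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, float)
    dv = H @ w                      # 顶点度:所含超边的权重之和
    de = H.sum(axis=0)              # 超边度:超边包含的顶点数
    Dv_isqrt = np.diag(1.0 / np.sqrt(dv))
    theta = Dv_isqrt @ H @ np.diag(w) @ np.diag(1.0 / de) @ H.T @ Dv_isqrt
    return np.eye(n) - theta

H = np.array([[1, 0], [1, 1], [0, 1]], float)   # 3 个顶点、2 条超边的小例子
L = hypergraph_laplacian(H)
print(np.round(np.linalg.eigvalsh(L), 3))       # 最小特征值为 0,对应的特征向量可用于谱聚类
</syntaxhighlight>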
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''' Springer-Verlag<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN <br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the[https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]<br />
<br />
<br />
==编者推荐==<br />
*[https://book.douban.com/subject/1237624/ 《超图-限集的组合学》]by [法]Claude Berge<br />
超图的第一本专著,作者是近代图论之父法国数学家Claude Berge,将图里的普通边拓展为超边,小小的一步拓展却引发了一个大的领域。</div>
<hr />
<div><br />
我们在组织翻译超图这个词条,这个词条是之前Wolfram发的那篇长文中一个非常重要的概念,我们希望可以整理好这个词条,帮助大家更好的理解那篇文章。<br />
<br />
<br />
现在招募6个小伙伴一起翻译超图这个词条 https://wiki.swarma.org/index.php?title=超图_Hypergraph<br />
<br />
*开头正文部分+术语定义(Terminology)——十三维<br />
*二分图模型+不对称性+同构与平等——淑慧<br />
*对称超图+横截面——瑾晗<br />
*关联矩阵+超图着色+分区---厚朴<br />
*定理+超图绘制+超图语法——十三维<br />
*概括+超图学习——世康<br />
<br />
截止时间:北京时间18:00之前。<br />
<br />
<br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order of the hypergraph'', and the size of edges set is the ''size of the hypergraph''. <br />
<br />
在[[数学中]], '''超图'''是一种广义上的[[graph(discrete mathematics)|图]] ,它的一条[[graph theory|边]]可以连接任意数量的[[vertex (graph theory)|顶点]]. 相对而言,在普通图中,一条边只能连接两个顶点.形式上, 超图 <math>H</math> 是一个集合组 <math>H = (X,E)</math> 其中<math>X</math> 是一个以节点或顶点为元素的集合,即顶点集, 而 <math>E</math> 是一组非空子,被称为边或超边. <br />
因此,若<math>\mathcal{P}(X)</math>是 <math>E</math>的幂集,则<math>E</math>是 <math>\mathcal{P}(X) \setminus\{\emptyset\}</math> 的一个子集。在<math>H</math>中,顶点集的大小被称为超图的阶数,边集的大小被称为超图的大小。<br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
<br />
普通图的边是节点的二元子集,超边则是节点的任意集合,所以可以包含任意数量的节点。但我们总先需要研究具有相同基数超边的超图,即一个 k-均匀超图,所有超边的大小都为 k。因此一个 2-均匀超图就是图,一个 3-均匀超图就是三元组的集合,依此类推。超图也被称为从[[泛集]](universal set)中抽取的一个集系统或[[集族]]。<br />
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
超图可以看做是[[关联结构]](incidence structure)。特别的,每个超图都有一个与超图相对应的二分 "关联图 "或 "[[列维图]]"(Levi graph),反之,大多数(但不是全部)[[二分图]]都可以看作是超图的关联图。<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
超图还有许多其它名称。在[[计算几何学]]中,超图有时可以被称为'''范围空间'''(range space),将超图的边称为''范围''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
在[[合作博弈论]]中,超图被称为'''简单博弈'''(投票博弈);这个概念被应用于解决[[社会选择理论]](social choice theory)中的问题。在一些文献中,超边被称为''超连接''或''连接器''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
特殊类型的超图包括:上文简单讨论过的 k-均匀超图;散簇,没有一条边作是另一条边的子集;以及[[抽象单纯复形]](abstract simplicial complexes),包含每条边的所有子集。<br />
超图是一个以超图同态为[[态射]](morphism)的范畴。<br />
<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
超图有不同的类型,如:<br />
* 空超图:没有边的超图<br />
* 非简单(或多重)超图:允许有循环(有单个顶点的超边)或重复边的超图,也就是说可以有两个或两个以上的边包含同一组顶点。<br />
* 简单超图:不包含循环和重复边的超图。<br />
* 𝑘-均匀超图:每条超边都正好包含 k 个顶点的超图。<br />
* 𝑑-正则超图:每个顶点的度数都是 𝑑 的超图<br />
* 无环超图:不包含任何圈的超图。<br />
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
因为超图的链接可以有任意基数,所以有几种子图的概念,分别是''子超图''(subhypergraphs)、''部分超图''(partial hypergraphs)和''分段超图''(section hypergraphs)。<br />
<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
让 𝐻=(𝑋,𝐸) 是一个超图,包含顶点集:<br />
𝑋={𝑥𝑖|𝑖∈𝐼𝑣},<br />
和边集<br />
𝐸={𝑒𝑖|𝑖∈𝐼𝑒∧𝑒𝑖⊆𝑋∧𝑒𝑖≠∅𝐸}<br />
其中 𝐼𝑣 和 𝐼𝑒 分别是顶点和边集的索引集。<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
子超图是去掉某些顶点的超图。在形式上,若 𝐴⊆𝑋 是顶点子集,则子超图 𝐻𝐴 被定义为:<br />
𝐻𝐴=(𝐴,{𝑒𝐴∩∩|𝑒𝐴∈𝐸∧𝑒∩𝐴≠∅)<br />
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph where each<br />
hyperedge of <math>H</math> which is partially contained in the subhypergraph <math>H_A</math> and is fully contained in the extension <math>Ex(H_A)</math>.<br />
Formally<br />
一个子超图的扩展是一个超图,其中每个属于 H 的超边都部分包含在子超图的 𝐻𝐴,并且完全包含在扩展的𝐸𝑥(𝐻𝐴) 中。即在形式上:<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
部分超图是去掉一些边的超图。给定一个边索引集的子集 𝐽⊂𝐼𝑒 ,由 𝐽 生成的部分超图就是<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
而给定一个子集 𝐴⊆𝑋,则分段超图是部分超图<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
𝐻 的重记号 𝐻∗ 则是一个顶点和边互换的超图,因此顶点由 {𝑒𝑖 } 给出,边由 {𝑋𝑚} 给出,其中<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
当等式的记号被正确定义时,如下,对一个超图采取两次运算是对偶的:<br />
:<math>\left(H^*\right)^* = H.</math><br />
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
对于不连通的超图 G 和具有相同顶点连通的超图 H,如果 H 的每个超边都有 G 中一个子图连接,则 G 是一个主图(host graph);<br />
对于不连通的超图 H,如果 G 和 H 的连通部分之间存在一个双射,使得 G 的每个连通部分 G' 都是对应的 H' 的主图,则 G 是一个主图。<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
一个超图是二分(bipartite)的,当且仅当它的顶点能被分成两类 U 和 V :每个基数至少为 2 超边包含两类中的至少一个顶点。相反的,超图则被称为具有属性 B。<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
2-段超图(或团图,代表图、原始图、盖夫曼图)是具有相同顶点的图,并且所有顶点对之间的边包含在相同的超边中。<br />
<br />
==二部图模型 Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
<br />
[[File:bipartie graph.jpeg|200px|缩略图|右| 设<math>G=(V,E)</math>是一个无向图,如果顶点V可分割为两个互不相交的子集<math> {(group1, group2)}</math>,并且图中的每条边<math>{(i,j)}</math>所关联的两个顶点<math>{i}</math>和<math>{j}</math>分别属于这两个不同的部分<math>{(i \in group1,j \in group2)}</math>,则称图<math>{G}</math>为一个二部图。]]<br />
<br />
一个'''超图“ <math>{H} </math>”'''可以用二部图“<math>{BG} </math>”表示,其构成如下: 集合"X"和" E"是"BG"的分割,而且 ("x<sub>1</sub>", "e<sub>1</sub>") 与边连通当且仅当顶点"x<sub>1</sub>"包含在" <math>H </math>"的边" e<sub>1</sub>"中。 反之,任何具有固定的'''部分 part'''且在第二部分中没有不连通节点的二部图也代表具有上述性质的部分超图。 这个二部图也称为'''关联图'''。<br />
<br />
==无环性 Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
与只有'''圈 cycle'''和'''森林 forest'''的普通无向图相比,对于超图的特殊情形,那些坍缩为平凡图的无环性超图有多种自然不等价的'''无环性 acyclicity''' 定义。<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
由Claude Berge 给出了超图无环性的首个定义: <ref>Claude Berge,[https://www.amazon.com/Graphs-hypergraphs-North-Holland-mathematical-library/dp/0444103996 ''Graphs and Hypergraphs'']</ref> 如果它的'''关联图'''(上面定义的二部图)是无环的,则称这个超图是 Berge 无环的 Berge-acyclic。 这个定义是非常严格的:例如,假设一个超图有一些顶点<math>v \neq v'</math>和一些超边<math>f \neq f'</math> ,例如 <math>v, v' \in f</math> 和<math>v, v' \in f'</math>,那么它就是 Berge成环的 Berge-cyclic。 通过对关联图的探讨,Berge成环性 berge-cyclity可以在线性时间 linear time内得到有效验证 。<br />
<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]]. In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
<br />
此处,我们可以定义一个减弱的超图无环性的概念<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref>,后来被称为 <math> {\alpha}</math>-无环性 <math> {\alpha}</math> acyclicity。 这个无环性的概念等价于超图是同构的(原图的每个团都被某个超边所覆盖) ,它的原图称为弦图 chordal graph ; 它也等价于通过 GYO 算法 Graham-Yu-Ozsoyoglu Algorithm (也称为格雷厄姆算法 Graham's algorithm) 得到具有可约性的空图<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref>。GYO 算法是一个合流 confluence(抽象重写 abstract rewriting)迭代过程,该算法中使用耳朵 ear的广义定义去除超边 (图论中的耳朵就定义为一条路径,其中除了端点外的点的度数均为 2(端点可以重合),而且删去后不破坏图的连通性)。总所周知, 在数据库理论 database theory 的领域中,如果一个数据库模式 database schema的底层超图是<math> {\alpha}</math>无环的,那么它就具有某些理想的性质。 <ref>Serge Abiteboul, Richard B. Hull, Victor Vianu|V. Vianu, ''Foundations of Databases''</ref> 除此之外,<math> {\alpha}</math>无环性也与一阶逻辑 first-order logic 保护的片段 guarded fragment 的表达能力有关。<br />
<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
我们可以在线性时间 linear time内检验超图是否是-无环的。 <ref>Robert Tarjan|R. E. Tarjan, Mihalis Yannakakis|M. Yannakakis. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
注意到<math> {\alpha}</math>-无环性似乎直觉不相符,即<math> {\alpha}</math>-成环超图添加超边可能使其成为<math> {\alpha}</math>-无环的(例如,添加一条包含超图所有顶点的超边总能其成为<math> {\alpha}</math>-无环的)。 为了克服这个缺点,Ronald Fagin<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> 定义了更强的 <math> {\beta}</math>-无环性 <math> {\beta}</math>-acylicity 和 <math> {\gamma}</math>无环性 <math> {\gamma}</math>-acylicity 概念。 应当指出:<math> {\gamma}</math>无环超图是推出其所有子超图都是<math> {\alpha}</math>无环的必要条件,这与 Graham 的早期定义<ref name="graham1979universal"/> 等价。 <math> {\gamma}</math>无环性的概念是一个更加严苛的条件,它等价于数据库模式的几个理想性质,并且与Bachman 图 Bachman diagrams有关. <math> {\beta}</math>-无环性 和 <math> {\gamma}</math>无环性 都可以在多项式时间 polynomial time(PTIME)内完成检测。<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
无环性的四个概念具有可比性: berge-无环性意味着 <math> {\gamma}</math>- 无环性, <math> {\gamma}</math>- 无环性又意味着 <math> {\beta}</math>- 无环性, <math> {\beta}</math>- 无环性又可以推出 <math> {\alpha}</math> 无环性。 然而,反之均不成立。<ref name="fagin1983degrees" /><br />
<br />
==同构和相等 Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
超图同态 homomorphism是指从一个超图的顶点集到另一个超图的顶点集的映射,如此使得每条边映射到另一条边。<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]] <br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of <math>I</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
<br />
如果一个超图 <math>H=(X,E)</math>同构 isomorphic 与另外一个超图<math>G=(Y,F)</math>,则存在一个双射:<math>H \simeq G</math> :<math>\phi:X \to Y</math><br />
<br />
和 关于<math>I</math>的置换 permutation 使得: :<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
那么这个双射被称为图的同构 isomorphism,记作:<math>H \simeq G</math>但且仅当<math>H^* \simeq G^*</math>。<br />
<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
<br />
当超图的边被明确标记时,就有了'''“强同构 strong isomorphism ”'''这个新的概念。 当前面提及的置换是唯一的,则称<math>H</math> 强同构于 <math>G</math> 。 记作<math>H \cong G</math>。 注意,所有强同构图都是同构的,但反过来就不成立。<br />
<br />
当超图的顶点被明确标记时,就有了'''“等价 equivalence”'''和'''“相等 equality”'''的概念。 我们称<math>H</math>和<math>G</math>等价记作:<math>H\equiv G</math> 如果同构<math>\phi</math> 满足:<br />
<br />
<math>\phi(x_n) = y_n</math><br />
<br />
而且:<br />
<br />
<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
记作:<br />
<math>H\equiv G</math> 当且仅当 <math>H^* \cong G^*</math><br />
<br />
超图'''自同构 automorphism'''是从顶点集到自身的同构,也就是顶点的重标号。 超图“ <math>{H }</math>”(= (''X'',&nbsp;''E''))的自同构集合是超图的群 group,称为超图的'''自同构群 automorphism group''',并写成 <math>{Aut(''H'')}</math>。<br />
<br />
===例子 Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
<br />
<br />
考虑超图<math>H</math>,他的边为:<br />
<br />
<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
<br />
和超图<math>G</math>:<br />
<br />
<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
很明显 <math>H</math> 和 <math>G</math> 同构(有<math>\phi(a)=\alpha</math>等),但是他们不是强同构,因为比如在超图<math>H</math>中,<math>a</math> 顶点连接1,4,6三条边,所以:<br />
<br />
<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
在图<math>G</math>,,不存在连接边1,4,6的顶点:<br />
<br />
<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
在这个例子,<math>H</math> 和 <math>G</math>是等价的, <math>H\equiv G</math>,而且两者强同构的:<math>H^*\cong G^*</math><br />
<br />
==Symmetric hypergraphs==<br />
The<math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
<br />
超图<math>H</math>的<math>r(H)</math>表示该超图中任何一条边的最大'''基数'''。如果所有边具有相同的基数''k'',则称该超图为均匀的或k-均匀的,或称之为k-超图。图只是一个2-均匀的超图。<br />
<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
<br />
'''顶点'''''v''的'''度'''''d(v)''表示包含该顶点的边的数量。如果每个顶点的度都为''k'',则超图''H''是'''k-正则'''的。<br />
<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
<br />
均匀超图的对偶是正则的,反之亦然。<br />
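<br />
作为示意,下面的 Python 草稿按“顶点与边互换”的定义构造对偶,并检验一个 2-均匀超图的对偶确实是 2-正则的(dual 等名称为演示假设):<br />
<syntaxhighlight lang="python">
from collections import defaultdict

def dual(H):
    """对偶超图:以原超图的边标号为顶点,新边 X_m = { e_i : x_m ∈ e_i }。"""
    D = defaultdict(set)
    for e, verts in H.items():
        for v in verts:
            D[v].add(e)
    return dict(D)

H = {1: {"a", "b"}, 2: {"b", "c"}}   # 2-均匀,但本身不是正则的
Hd = dual(H)                          # {'a': {1}, 'b': {1, 2}, 'c': {2}}
deg = {e: sum(e in X for X in Hd.values()) for e in H}
print(deg)                            # {1: 2, 2: 2}:对偶是 2-正则的
</syntaxhighlight>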
<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
<br />
如果存在一个自同构使得 <math>\phi(x)=y</math>,则称超图 ''H'' 的两个顶点 ''x'' 和 ''y'' '''对称'''。如果存在一个自同构使得 <math>\phi(e_i)=e_j</math>,则称两条边 <math>e_i</math> 和 <math>e_j</math> '''对称'''。<br />
<br />
A hypergraph is said to be ''vertex-transitive'' (or ''vertex-symmetric'') if all of its vertices are symmetric. Similarly, a hypergraph is ''edge-transitive'' if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply ''transitive''.<br />
<br />
如果超图的所有顶点两两对称,则称其为'''顶点传递'''的(或'''顶点对称'''的)。类似地,如果所有边两两对称,则该超图是'''边传递'''的。如果一个超图既是边传递的又是顶点传递的,则简称该超图是'''传递'''的。<br />
<br />
Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.<br />
<br />
由于超图的对偶性,边传递性的研究与顶点传递性的研究是相一致的。<br />
<br />
==Transversals==<br />
A ''[[Transversal (combinatorics)|transversal]]'' (or "[[hitting set]]") of a hypergraph ''H'' = (''X'', ''E'') is a set <math>T\subseteq X</math> that has nonempty [[intersection (set theory)|intersection]] with every edge. A transversal ''T'' is called ''minimal'' if no proper subset of ''T'' is a transversal. The ''transversal hypergraph'' of ''H'' is the hypergraph (''X'', ''F'') whose edge set ''F'' consists of all minimal transversals of ''H''.<br />
<br />
超图 ''H'' = (''X'', ''E'') 的'''横截集'''(或'''命中集''')是与每条边都有非空交集的集合 <math>T\subseteq X</math>。如果横截集 ''T'' 的任何真子集都不是横截集,则称 ''T'' 为'''极小横截集'''。''H'' 的'''横截超图'''是超图 (''X'', ''F''),其边集 ''F'' 由 ''H'' 的所有极小横截集组成。<br />
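<br />
作为示意,下面的 Python 草稿用指数级的暴力枚举求一个小超图的全部极小横截集(minimal_transversals 为演示用的假想函数,实际应用需要更高效的算法):<br />
<syntaxhighlight lang="python">
from itertools import combinations

def minimal_transversals(X, E):
    # 先枚举所有横截集(与每条边交集非空的顶点子集)
    hits = [set(T) for r in range(len(X) + 1)
            for T in combinations(sorted(X), r)
            if all(set(T) & e for e in E)]
    # 只保留不真包含其他横截集的那些,即极小横截集
    return [T for T in hits if not any(S < T for S in hits)]

X = {"a", "b", "c", "d"}
E = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
print(minimal_transversals(X, E))   # [{'a','c'}, {'b','c'}, {'b','d'}](次序可能不同)
</syntaxhighlight>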
<br />
Computing the transversal hypergraph has applications in [[combinatorial optimization]], in [[game theory]], and in several fields of [[computer science]] such as [[machine learning]], [[Index (database)|indexing of database]]s, the [[Boolean satisfiability problem|satisfiability problem]], [[data mining]], and computer [[program optimization]].<br />
<br />
计算横截超图在[[组合优化 Combinatorial Optimization]]、[[博弈论 Game Theory]]和[[计算机科学 Computer Science]]的若干领域(例如[[机器学习 Machine Learning]]、[[数据库索引 Indexing of Databases]]、[[可满足性问题 the Satisfiability Problem]]、[[数据挖掘 Data Mining]]和[[计算机程序优化 Program Optimization]])都有应用。<br />
<br />
==Incidence matrix==<br />
Let <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math> and <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>. Every hypergraph has an <math>n \times m</math> [[incidence matrix]] <math>A = (a_{ij})</math> where<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
The [[transpose]] <math>A^t</math> of the [[incidence (geometry)|incidence]] matrix defines a hypergraph <math>H^* = (V^*,\ E^*)</math> called the '''dual''' of <math>H</math>, where <math>V^*</math> is an ''m''-element set and <math>E^*</math> is an ''n''-element set of subsets of <math>V^*</math>. For <math>v^*_j \in V^*</math> and <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[if and only if]] <math>a_{ij} = 1</math>.<br />
<br />
<br />
设 <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math>,<math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>。每一个超图都有一个 <math>n \times m</math> 的[[关联矩阵]] <math>A = (a_{ij})</math>,其中:<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
<br />
关联矩阵的[[转置]] <math>A^t</math> 定义了一个超图 <math>H^* = (V^*,\ E^*)</math>,称为 <math>H</math> 的'''对偶''',其中 <math>V^*</math> 是一个 ''m'' 元集合,<math>E^*</math> 是由 <math>V^*</math> 的子集构成的 ''n'' 元集合。<br />
<br />
对于<math>v^*_j \in V^*</math> 和 <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[当且仅当]] <math>a_{ij} = 1</math>。<br />
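<br />
下面的 Python 示意片段(基于 NumPy,数据仅为演示假设)按上述定义构造关联矩阵,并用它的转置读出对偶超图的边:<br />
<syntaxhighlight lang="python">
import numpy as np

V = ["a", "b", "c"]
E = [{"a", "b"}, {"b", "c"}]

# n×m 关联矩阵:a_ij = 1 当且仅当 v_i ∈ e_j
A = np.array([[1 if v in e else 0 for e in E] for v in V])
print(A)            # [[1 0] [1 1] [0 1]]

# 对偶超图 H* 以 H 的边为顶点,其第 i 条边 e*_i = { e_j : a_ij = 1 },
# 这正是转置矩阵 A^t 的第 i 列
dual_edges = [{j for j in range(len(E)) if A[i, j]} for i in range(len(V))]
print(dual_edges)   # [{0}, {0, 1}, {1}]
</syntaxhighlight>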
<br />
==Hypergraph coloring==<br />
Classic hypergraph coloring is assigning one of the colors from set <math>\{1,2,3,...\lambda\}</math> to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. Minimum number of used distinct colors over all colorings is called the chromatic number of a hypergraph.<br />
<br />
经典超图着色是将集合 <math>\{1,2,3,...\lambda\}</math> 中的某一种颜色赋予超图的每个顶点,使每条超边都至少包含两个颜色不同的顶点。换句话说,不能存在基数至少为 2 的单色超边。在此意义上,它是普通图着色的直接推广。在所有合法着色中所用不同颜色数的最小值称为超图的'''色数'''。<br />
<br />
Hypergraphs for which there exists a coloring using up to ''k'' colors are referred to as ''k-colorable''. The 2-colorable hypergraphs are exactly the bipartite ones.<br />
若一个超图存在使用至多 ''k'' 种颜色的合法着色,则称其为 ''k''-'''可着色'''的。2-可着色的超图恰好就是二分超图。<br />
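<br />
作为示意,下面的 Python 草稿按上述定义以暴力搜索计算小超图的色数(仅用于演示,复杂度为指数级;函数名为演示假设):<br />
<syntaxhighlight lang="python">
from itertools import product

def is_proper(coloring, E):
    # 每条基数至少为 2 的超边必须包含两种不同颜色
    return all(len({coloring[v] for v in e}) > 1
               for e in E if len(e) >= 2)

def chromatic_number(X, E):
    X = sorted(X)
    for k in range(1, len(X) + 1):
        for colors in product(range(k), repeat=len(X)):
            if is_proper(dict(zip(X, colors)), E):
                return k

print(chromatic_number({"a", "b", "c"}, [{"a", "b", "c"}]))   # 2
</syntaxhighlight>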
<br />
There are many generalizations of classic hypergraph coloring. One of them is the so-called mixed hypergraph coloring, when monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, then the minimum and maximum number of used colors are called the lower and upper chromatic numbers respectively. See http://spectrum.troy.edu/voloshin/mh.html for details.<br />
<br />
经典超图着色有许多推广,其中之一是所谓的混合超图着色,它允许出现单色边。有些混合超图对任意数量的颜色都不可着色,而目前尚不知道判定不可着色性的一般准则。当一个混合超图可着色时,所用颜色数的最小值和最大值分别称为下色数和上色数。详情请参阅 http://spectrum.troy.edu/voloshin/mh.html<br />
<br />
==Partitions==<br />
A partition theorem due to E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> states that, for an edge-transitive hypergraph <math>H=(X,E)</math>, there exists a [[partition of a set|partition]]<br />
<br />
:<math>(X_1, X_2,\cdots,X_K)</math><br />
<br />
of the vertex set <math>X</math> such that the subhypergraph <math>H_{X_k}</math> generated by <math>X_k</math> is transitive for each <math>1\le k \le K</math>, and such that<br />
<br />
:<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math><br />
<br />
where <math>r(H)</math> is the rank of ''H''.<br />
<br />
As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.<br />
<br />
<br />
由 E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> 提出的一个划分定理表明:对于边传递超图 <math>H=(X,E)</math>,顶点集 <math>X</math> 存在一个[[划分]] <math>(X_1, X_2,\cdots,X_K)</math>,使得由每个 <math>X_k</math>(<math>1\le k \le K</math>)生成的子超图 <math>H_{X_k}</math> 都是传递的,并且满足 <math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math>,其中 <math>r(H)</math> 是 ''H'' 的秩。<br />
<br />
作为推论,非顶点传递的边传递超图是可双色的。<br />
<br />
<br />
[[Graph partitioning]] (and in particular, hypergraph partitioning) has many applications to IC design<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> and parallel computing.<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref> Efficient and Scalable [[Graph partition|hypergraph partitioning algorithms]] are also important for processing large scale hypergraphs in machine learning tasks.<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
<br />
[[图划分]](特别是超图划分)在集成电路设计<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> 和并行计算<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref>中有很多应用。在机器学习任务中,高效、可扩展的[[超图划分算法]]对于处理大规模超图也很重要。<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
==Theorems==<br />
Many [[theorem]]s and concepts involving graphs also hold for hypergraphs. [[Ramsey's theorem]] and [[Line graph of a hypergraph]] are typical examples. Some methods for studying symmetries of graphs extend to hypergraphs.<br />
<br />
Two prominent theorems are the [[Erdős–Ko–Rado theorem]] and the [[Kruskal–Katona theorem]] on uniform hypergraphs.<br />
<br />
许多涉及图的定理和概念也适用于超图,典型的例子有[[拉姆西定理]](Ramsey's theorem)和超图的线图。研究图的对称性的一些方法也被扩展到超图。<br />
均匀超图上有[[Erdős-Ko-Rado theorem]]和[[Kruskal-Katona theorem]]两个著名定理。<br />
<br />
==Hypergraph drawing==<br />
[[File:CircuitoDosMallas.png|thumb|This [[circuit diagram]] can be interpreted as a drawing of a hypergraph in which four vertices (depicted as white rectangles and disks) are connected by three hyperedges drawn as trees.]](这个电路图可以解释为一个超图的画法:四个顶点(用白色矩形和圆盘表示)由三条画成树状的超边连接)<br />
<br />
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.<br />
尽管超图比图更难画在纸上,但一些研究者已经研究了超图可视化方法。<br />
<br />
In one possible visual representation for hypergraphs, similar to the standard [[graph drawing]] style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves.<ref>{{citation<br />
| last = Sander | first = G.<br />
| contribution = Layout of directed hypergraphs with orthogonal hyperedges<br />
| pages = 381–386<br />
| publisher = Springer-Verlag<br />
| series = [[Lecture Notes in Computer Science]]<br />
| title = Proc. 11th International Symposium on Graph Drawing (GD 2003)<br />
| contribution-url = http://gdea.informatik.uni-koeln.de/585/1/hypergraph.ps<br />
| volume = 2912<br />
| year = 2003| title-link = International Symposium on Graph Drawing<br />
}}.</ref><ref>{{citation<br />
| last1 = Eschbach | first1 = Thomas<br />
| last2 = Günther | first2 = Wolfgang<br />
| last3 = Becker | first3 = Bernd<br />
| issue = 2<br />
| journal = [[Journal of Graph Algorithms and Applications]]<br />
| pages = 141–157<br />
| title = Orthogonal hypergraph drawing for improved visibility<br />
| url = http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf<br />
| volume = 10<br />
| year = 2006 | doi=10.7155/jgaa.00122}}.</ref> If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as [[simple closed curve]]s that enclose sets of points.<ref>{{citation<br />
| last = Mäkinen | first = Erkki<br />
| doi = 10.1080/00207169008803875<br />
| issue = 3<br />
| journal = International Journal of Computer Mathematics<br />
| pages = 177–185<br />
| title = How to draw a hypergraph<br />
| volume = 34<br />
| year = 1990}}.</ref><ref>{{citation<br />
| last1 = Bertault | first1 = François<br />
| last2 = Eades | first2 = Peter | author2-link = Peter Eades<br />
| contribution = Drawing hypergraphs in the subset standard<br />
| doi = 10.1007/3-540-44541-2_15<br />
| pages = 45–76<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 8th International Symposium on Graph Drawing (GD 2000)<br />
| volume = 1984<br />
| year = 2001| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><ref>{{citation<br />
| last1 = Naheed Anjum | first1 = Arafat<br />
| last2 = Bressan | first2 = Stéphane<br />
| contribution = Hypergraph Drawing by Force-Directed Placement<br />
| doi = 10.1007/_31<br />
| pages = 387–394<br />
| publisher = Springer International Publishing<br />
| series = Lecture Notes in Computer Science<br />
| title = 28th International Conference on Database and Expert Systems Applications (DEXA 2017)<br />
| volume = 10439<br />
| year = 2017| isbn = <br />
}}.</ref><br />
<br />
其中一种超图的可视化表示法类似于标准的图绘制方式(用平面上的曲线描绘图的边):把超图的顶点画成点、圆盘或方框,把超边画成以其顶点为叶子的树;如果顶点表示为点,超边也可以画成连接相应点集的平滑曲线,或包围相应点集的简单闭合曲线。<br />
<br />
[[File:Venn's four ellipse construction.svg|thumb|An order-4 Venn diagram, which can be interpreted as a subdivision drawing of a hypergraph with 15 vertices (the 15 colored regions) and 4 hyperedges (the 4 ellipses).]](一个 4 阶维恩图,可以解释为具有 15 个顶点(15 个着色区域)和 4 条超边(4 个椭圆)的超图的细分画法)<br />
<br />
In another style of hypergraph visualization, the subdivision model of hypergraph drawing,<ref>{{citation<br />
| last1 = Kaufmann | first1 = Michael<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Speckmann | first3 = Bettina | author3-link = Bettina Speckmann<br />
| contribution = Subdivision drawings of hypergraphs<br />
| doi = 10.1007/_39<br />
| pages = 396–407<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 16th International Symposium on Graph Drawing (GD 2008)<br />
| volume = 5417<br />
| year = 2009| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 2006}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><br />
<br />
超图可视化的另一种样式是超图绘制的细分模型:平面被细分为若干区域,每个区域代表超图的一个顶点;超图的超边由这些区域的连通子集表示,可以通过着色、在其周围画轮廓或两者兼用来标示。例如,一个 ''n'' 阶维恩图可以看作具有 ''n'' 条超边(定义该图的曲线)和 2<sup>''n''</sup>&nbsp;−&nbsp;1 个顶点(这些曲线将平面细分出的区域)的超图的细分画法。与可在多项式时间内识别的平面图不同,判定一个超图是否存在平面细分画法是 NP完全的;但当区域的邻接模式被限制为路径、圈或树时,这类画法的存在性可以被高效地检验。<br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
<br />
超图的另一种表示法称为 PAOH,如本文顶部的图所示:边是连接顶点的竖直线段,顶点在左侧对齐,右侧的图例给出各条边的名称。这种表示法是为动态超图设计的,但也适用于简单超图。<br />
<br />
==Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
通过给一类超图添加替换规则,[[图语法]]可以被推广到允许超边的情形。<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
<br />
<br />
==超图概念的延伸==<br />
<br />
超图的相关概念可以进行进一步的延伸,如超图中的一些边可以指向另一些边。这种延伸有两种变体。在第一种变体中,超图的边不仅包含一组节点,而且还可以包含这组节点的子集、子集的子集等等。本质上,超图的每条边只是树结构或有向无环图的一个内部节点,而节点就是叶子。从这个意义上来说,超图就是具有共享节点的树的集合(即内部节点或叶子可能出现在不同的树结构中),反过来说,每个树的集合又可以理解为一个超图。因为树结构在计算机科学和许多数学分支中被广泛使用,所以超图的出现也是自然而然的。比如这种延伸是作为项代数的模型而自然产生的:边对应项,节点对应常量或变量。<br />
<br />
<br />
对于这样的超图,集合的隶属关系提供了一种排序,但该排序既不是偏序也不是预序,因为它不具有传递性。与这一推广的 Levi 图相对应的图是有向无环图。例如,考虑顶点集为 <math>V= \{a,b\}</math>、边为 <math>e_1=\{a,b\}</math> 和 <math>e_2=\{a,e_1\}</math> 的广义超图。那么,虽然 <math>b\in e_1</math> 且 <math>e_1\in e_2</math>,但 <math>b\in e_2</math> 并不成立。然而,这类超图上集合隶属关系的传递闭包确实诱导出一个偏序,并将超图“展平”为一个偏序集。<br />
<br />
<br />
第二种变体中,超图的边可以指向其他边,而不要求这些边构成有向无环图。这允许出现带有边循环的图,它们甚至可以完全不含顶点。例如,考虑由两条边 <math>e_1</math> 和 <math>e_2</math> 组成、顶点个数为零的广义超图,使得 <math>e_1 = \{e_2\}</math> 且 <math>e_2 = \{e_1\}</math>。由于这个循环是无限递归的,作为边的集合违反了基础公理。特别地,这样的超图不存在集合隶属关系的传递闭包。虽然这样的结构乍看起来很奇怪,但只要注意到其 Levi 图的相应推广不再是二分图,而只是一般的有向图,就很容易理解了。<br />
<br />
根据定义,这种超图的广义关联矩阵是一个方阵,其秩等于节点和边的总数。因此,对于上面的示例,关联矩阵为:<br />
<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right]</math>。<br />
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsoford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Tue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
<br />
<br />
==超图与机器学习==<br />
<br />
超图已被广泛用于机器学习任务中,作为数据模型和分类器的正则化手段。这些应用包括推荐系统(社团作为超边)、图像检索(相关性作为超边)和生物信息学(生化相互作用作为超边)。有代表性的超图学习技术包括:超图谱聚类(用超图拉普拉斯算子推广谱图理论)和超图半监督学习(引入额外的超图结构代价来约束学习结果)。对于大规模超图,还可以使用基于 Apache Spark 构建的分布式框架。<br />
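<br />
作为示意,下面的 Python 草稿按 Zhou 等人(2006)文中的定义计算归一化超图拉普拉斯矩阵 <math>\Delta = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}</math>(其中 <math>H</math> 为关联矩阵、<math>W</math> 为超边权重对角阵;矩阵数据为演示假设):<br />
<syntaxhighlight lang="python">
import numpy as np

H = np.array([[1., 0.],
              [1., 1.],
              [0., 1.]])   # 3 个顶点 × 2 条超边的关联矩阵
w = np.array([1., 1.])     # 超边权重

De_inv   = np.diag(1. / H.sum(axis=0))   # 超边度(基数)对角阵的逆
dv       = H @ w                          # 顶点的加权度
Dv_isqrt = np.diag(1. / np.sqrt(dv))

L = np.eye(len(dv)) - Dv_isqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_isqrt
print(np.round(L, 3))    # 对称半正定,最小特征值为 0
</syntaxhighlight>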
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''' Springer-Verlag<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN <br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the [https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]<br />
<br />
<br />
==编者推荐==<br />
*[https://book.douban.com/subject/1237624/ 《超图-限集的组合学》]by [法]Claude Berge<br />
超图的第一本专著,作者是近代图论之父法国数学家Claude Berge,将图里的普通边拓展为超边,小小的一步拓展却引发了一个大的领域。</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E8%B6%85%E5%9B%BE_Hypergraph&diff=4643超图 Hypergraph2020-04-22T09:45:09Z<p>Pjhhh:</p>
<hr />
<div><br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order of the hypergraph'', and the size of edges set is the ''size of the hypergraph''. <br />
<br />
在[[数学]]中,'''超图'''是一种广义上的[[graph (discrete mathematics)|图]],它的一条[[graph theory|边]]可以连接任意数量的[[vertex (graph theory)|顶点]]。相对而言,普通图的一条边恰好连接两个顶点。形式上,超图 <math>H</math> 是一个二元组 <math>H = (X,E)</math>,其中 <math>X</math> 是以节点或顶点为元素的集合,即顶点集,而 <math>E</math> 是 <math>X</math> 的一组非空子集,称为'''[[超边]]'''或'''边'''。<br />
因此,<math>E</math> 是 <math>\mathcal{P}(X) \setminus\{\emptyset\}</math> 的一个子集,其中 <math>\mathcal{P}(X)</math> 是 <math>X</math> 的幂集。顶点集的大小称为超图的'''阶''',边集的大小称为超图的'''大小'''。<br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
<br />
普通图的边是顶点的二元子集,而超边是顶点的任意集合,因此可以包含任意数量的顶点。然而,人们常常希望研究所有超边基数相同的超图:''k''-均匀超图是所有超边大小都为 ''k'' 的超图(换言之,它是一族集合,其中每个集合都是连接 ''k'' 个顶点的超边)。因此,2-均匀超图就是普通图,3-均匀超图是无序三元组的集合,依此类推。超图也被称为集合系统或取自[[泛集]](universal set)的[[集族]]。<br />
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
超图可以看作[[关联结构]](incidence structure)。特别地,每个超图都对应一个二分的“关联图”或“[[列维图]]”(Levi graph);反之,大多数(但不是全部)[[二分图]]都可以看作超图的关联图。<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
超图还有许多其它名称。在[[计算几何学]]中,超图有时可以被称为'''范围空间'''(range space),将超图的边称为''范围''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
在[[合作博弈论]]中,超图被称为'''简单博弈'''(投票博弈);这个概念被应用于解决[[社会选择理论]](social choice theory)中的问题。在一些文献中,超边被称为''超连接''或''连接器''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
特殊类型的超图包括:上文简单讨论过的 ''k''-均匀超图;'''杂簇'''(clutter),其中没有一条边是另一条边的子集;以及[[抽象单纯复形]](abstract simplicial complexes),它包含每条边的所有子集。<br />
全体超图以超图同态为[[态射]](morphism)构成一个[[范畴]]。<br />
<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
超图有不同的类型,如:<br />
* 空超图:没有边的超图。<br />
* 非简单(或多重)超图:允许有环(只含单个顶点的超边)或重复边的超图,也就是说可以有两条或多条边包含同一组顶点。<br />
* 简单超图:不包含环和重复边的超图。<br />
* ''k''-均匀超图:每条超边都恰好包含 ''k'' 个顶点的超图。<br />
* ''d''-正则超图:每个顶点的度数都是 ''d'' 的超图。<br />
* 无环超图:不包含任何环的超图。<br />
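<br />
下面用一个极简的 Python 草稿示意上述几种类型的判定(类名与方法名均为演示假设):<br />
<syntaxhighlight lang="python">
from collections import Counter

class Hypergraph:
    def __init__(self, X, E):
        self.X, self.E = set(X), [frozenset(e) for e in E]

    def is_simple(self):
        # 简单超图:无重复边,且无环(单顶点超边)
        return len(set(self.E)) == len(self.E) and all(len(e) > 1 for e in self.E)

    def is_k_uniform(self, k):
        return all(len(e) == k for e in self.E)

    def is_d_regular(self, d):
        deg = Counter(v for e in self.E for v in e)
        return all(deg[v] == d for v in self.X)

H = Hypergraph({"a", "b", "c"}, [{"a", "b"}, {"b", "c"}, {"a", "c"}])
print(H.is_simple(), H.is_k_uniform(2), H.is_d_regular(2))   # True True True
</syntaxhighlight>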
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
因为超边可以有任意基数,所以存在几种不同的子图概念,分别是'''子超图'''(subhypergraph)、'''部分超图'''(partial hypergraph)和'''分段超图'''(section hypergraph)。<br />
<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
设 <math>H=(X,E)</math> 是一个超图,其顶点集为<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
边集为<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
其中 <math>I_v</math> 和 <math>I_e</math> 分别是顶点和边的索引集。<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
子超图是去掉某些顶点后得到的超图。形式上,由顶点子集 <math>A \subseteq X</math> 诱导的子超图 <math>H_A</math> 定义为:<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph where each hyperedge of <math>H</math> that is partially contained in the subhypergraph <math>H_A</math> is fully contained in the extension <math>Ex(H_A)</math>. Formally<br />
子超图的'''扩展'''是一个超图,使得 <math>H</math> 中每条与子超图 <math>H_A</math> 部分相交的超边都完全包含在扩展 <math>Ex(H_A)</math> 中。形式上:<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
部分超图是去掉一些边后得到的超图。给定边索引集的一个子集 <math>J \subset I_e</math>,由 <math>J</math> 生成的部分超图是超图<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
而给定一个子集 <math>A\subseteq X</math>,分段超图是部分超图<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
<math>H</math> 的'''对偶''' <math>H^*</math> 是顶点和边互换得到的超图:其顶点由 <math>\lbrace e_i \rbrace</math> 给出,边由 <math>\lbrace X_m \rbrace</math> 给出,其中<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
当相等的概念被恰当定义时(如下文所述),取超图对偶的运算是一个'''对合 involution''',即:<br />
:<math>\left(H^*\right)^* = H.</math><br />
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
与连通超图 ''H'' 有相同顶点集的连通图 ''G'',如果 ''H'' 的每条超边都在 ''G'' 中诱导出一个连通子图,则称 ''G'' 是 ''H'' 的'''主图'''(host graph)。<br />
对于不连通的超图 ''H'',如果 ''G'' 的连通分量与 ''H'' 的连通分量之间存在一个双射,使得 ''G'' 的每个连通分量 ''G<nowiki>'</nowiki>'' 都是对应分量 ''H<nowiki>'</nowiki>'' 的主图,则 ''G'' 是 ''H'' 的主图。<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
一个超图是'''二分'''(bipartite)的,当且仅当它的顶点可以划分成两类 ''U'' 和 ''V'',使得每条基数至少为 2 的超边都包含来自这两类的至少各一个顶点。或者说,这样的超图被称为具有'''性质 B'''。<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
超图的'''2-截面'''(或'''团图'''、'''代表图'''、'''原图'''、'''盖夫曼图''')是与该超图有相同顶点的图,其中任意两个同属某条超边的顶点之间都连有一条边。<br />
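<br />
作为示意,下面的 Python 片段由超边列表构造 2-截面的边集(two_section 为演示用的假想函数):<br />
<syntaxhighlight lang="python">
from itertools import combinations

def two_section(E):
    # 每条超边内部的所有顶点对都成为 2-截面中的一条边
    return {frozenset(p) for e in E for p in combinations(sorted(e), 2)}

print(sorted(map(sorted, two_section([{"a", "b", "c"}, {"c", "d"}]))))
# [['a', 'b'], ['a', 'c'], ['b', 'c'], ['c', 'd']]
</syntaxhighlight>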
<br />
==二部图模型 Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
<br />
[[File:bipartie graph.jpeg|200px|缩略图|右| 设<math>G=(V,E)</math>是一个无向图,如果顶点V可分割为两个互不相交的子集<math> {(group1, group2)}</math>,并且图中的每条边<math>{(i,j)}</math>所关联的两个顶点<math>{i}</math>和<math>{j}</math>分别属于这两个不同的部分<math>{(i \in group1,j \in group2)}</math>,则称图<math>{G}</math>为一个二部图。]]<br />
<br />
一个'''超图 <math>{H}</math>''' 可以用二部图 ''BG'' 表示,其构成如下:集合 ''X'' 和 ''E'' 是 ''BG'' 的两个部分,且 (''x''<sub>1</sub>, ''e''<sub>1</sub>) 之间连有一条边当且仅当顶点 ''x''<sub>1</sub> 包含在 <math>H</math> 的边 ''e''<sub>1</sub> 中。反之,任何两部分固定且第二部分中没有孤立节点的二部图,都以上述方式表示某个超图。这个二部图也称为'''关联图 incidence graph'''。<br />
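<br />
作为示意,下面的 Python 片段把超图转换成其二部关联图的边表,一侧是原超图的顶点,另一侧是超边的标号(函数名为演示假设):<br />
<syntaxhighlight lang="python">
def incidence_graph(E):
    # 二部图的边:(顶点, 超边标号) 当且仅当该顶点属于该超边
    return {(v, i) for i, e in enumerate(E) for v in e}

print(sorted(incidence_graph([{"a", "b"}, {"b", "c"}])))
# [('a', 0), ('b', 0), ('b', 1), ('c', 1)]
</syntaxhighlight>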
<br />
==无环性 Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
普通无向图只有一种自然的'''圈 cycle'''和'''无环图 acyclic graph'''概念;与此不同,超图的'''无环性 acyclicity'''存在多种互不等价的自然定义,它们在普通图这一特殊情形下都退化为普通图的无环性。<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
Claude Berge 给出了超图无环性的首个定义:<ref>Claude Berge,[https://www.amazon.com/Graphs-hypergraphs-North-Holland-mathematical-library/dp/0444103996 ''Graphs and Hypergraphs'']</ref> 如果一个超图的'''关联图'''(即上面定义的二部图)是无环的,则称这个超图是 Berge 无环的(Berge-acyclic)。这个定义非常严格:例如,如果一个超图中存在一对顶点 <math>v \neq v'</math> 和一对超边 <math>f \neq f'</math>,使得 <math>v, v' \in f</math> 且 <math>v, v' \in f'</math>,那么它就是 Berge 有环的(Berge-cyclic)。显然,通过遍历关联图,可以在线性时间 linear time 内检验 Berge 有环性。<br />
<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]]. In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
<br />
我们可以定义一个较弱的超图无环性概念<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref>,后来被称为 <math>{\alpha}</math>-无环性(<math>{\alpha}</math>-acyclicity)。这个概念等价于:超图是共形的 conformal(其原图的每个团都被某条超边覆盖)且其原图是弦图 chordal graph;它也等价于可以通过 GYO 算法 Graham-Yu-Ozsoyoglu Algorithm(也称格雷厄姆算法 Graham's algorithm)<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref>约化为空超图。GYO 算法是一个合流 confluent(抽象重写 abstract rewriting 意义下)的迭代过程,它使用“耳朵 ear”的广义定义来删除超边(图论中的耳朵定义为一条路径,其中除端点外各点的度数均为 2,端点可以重合,且删去后不破坏图的连通性)。在数据库理论 database theory 领域,众所周知,如果一个数据库模式 database schema 的底层超图是 <math>{\alpha}</math>-无环的,它就具有某些理想的性质。<ref>Serge Abiteboul, Richard B. Hull, Victor Vianu, ''Foundations of Databases''</ref> 此外,<math>{\alpha}</math>-无环性还与一阶逻辑 first-order logic 的受保护片段 guarded fragment 的表达能力有关。<br />
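<br />
作为示意,下面的 Python 草稿实现了 GYO 约化的一个朴素版本来检验 <math>{\alpha}</math>-无环性(并非线性时间实现,函数名为演示假设):<br />
<syntaxhighlight lang="python">
def is_alpha_acyclic(edges):
    E = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        # 规则 1:删除只出现在一条超边中的顶点
        for e in list(E):
            if e not in E:
                continue
            rest = E - {e}
            lonely = {v for v in e if all(v not in f for f in rest)}
            if lonely:
                E.discard(e)
                if e - lonely:
                    E.add(frozenset(e - lonely))
                changed = True
        # 规则 2:删除被另一条超边包含的超边
        for e in list(E):
            if e in E and any(e <= f for f in E - {e}):
                E.discard(e)
                changed = True
    return not E   # 能约化为空当且仅当 α-无环

print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]))   # True
print(is_alpha_acyclic([{"a", "b"}, {"b", "c"}, {"a", "c"}]))        # False
</syntaxhighlight>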
<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
我们可以在线性时间 linear time 内检验一个超图是否是 <math>{\alpha}</math>-无环的。<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
注意,<math>{\alpha}</math>-无环性有一个与直觉相悖的性质:给一个 <math>{\alpha}</math>-有环超图添加超边可能使其变为 <math>{\alpha}</math>-无环的(例如,添加一条包含超图所有顶点的超边总会使其成为 <math>{\alpha}</math>-无环的)。部分由于这一缺陷,Ronald Fagin<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> 定义了更强的 <math>{\beta}</math>-无环性(<math>{\beta}</math>-acyclicity)和 <math>{\gamma}</math>-无环性(<math>{\gamma}</math>-acyclicity)概念。<math>{\beta}</math>-无环性可以表述为:超图的所有子超图都是 <math>{\alpha}</math>-无环的;这与 Graham 早先的一个定义等价<ref name="graham1979universal"/>。<math>{\gamma}</math>-无环性是一个更严格的条件,它等价于数据库模式的几个理想性质,并且与 Bachman 图 Bachman diagrams 有关。<math>{\beta}</math>-无环性和 <math>{\gamma}</math>-无环性都可以在多项式时间 polynomial time(PTIME)内检测。<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
这四个无环性概念是可比较的:Berge-无环性蕴涵 <math>{\gamma}</math>-无环性,<math>{\gamma}</math>-无环性蕴涵 <math>{\beta}</math>-无环性,<math>{\beta}</math>-无环性又蕴涵 <math>{\alpha}</math>-无环性。然而,反向的蕴涵均不成立,因此这四个概念互不相同。<ref name="fagin1983degrees" /><br />
<br />
==同构和相等 Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
超图'''同态 homomorphism'''是从一个超图的顶点集到另一个超图顶点集的映射,使得每条边都映射为另一条边。<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]] <br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of <math>I</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
<br />
如果存在一个双射<br />
:<math>\phi:X \to Y</math><br />
以及 <math>I</math> 的一个置换 permutation <math>\pi</math>,使得<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
则称超图 <math>H=(X,E)</math> 与超图 <math>G=(Y,F)</math> '''同构 isomorphic''',记作 <math>H \simeq G</math>。这个双射 <math>\phi</math> 称为两个超图之间的'''同构映射'''。注意:<math>H \simeq G</math> 当且仅当 <math>H^* \simeq G^*</math>。<br />
<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
<br />
当超图的边被明确标记时,就有了'''强同构 strong isomorphism'''的概念。若上述置换为恒等置换,则称 <math>H</math> '''强同构'''于 <math>G</math>,记作 <math>H \cong G</math>。注意,所有强同构的超图都是同构的,但反之不成立。<br />
<br />
当超图的顶点被明确标记时,就有了'''等价 equivalence'''和'''相等 equality'''的概念。如果同构 <math>\phi</math> 满足:<br />
<br />
<math>\phi(x_n) = y_n</math><br />
<br />
而且:<br />
<br />
<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
则称 <math>H</math> 与 <math>G</math> '''等价''',记作 <math>H\equiv G</math>。注意:<br />
<math>H\equiv G</math> 当且仅当 <math>H^* \cong G^*</math><br />
<br />
超图的'''自同构 automorphism'''是从顶点集到自身的同构,也就是顶点的重新标号。超图 <math>H = (X, E)</math> 的全体自同构在复合运算下构成一个'''群 group''',称为该超图的'''自同构群 automorphism group''',记作 Aut(''H'')。<br />
<br />
===例子 Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
<br />
<br />
考虑超图 <math>H</math>,它的边为:<br />
<br />
<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
<br />
和超图<math>G</math>:<br />
<br />
<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
很明显 <math>H</math> 和 <math>G</math> 同构(取 <math>\phi(a)=\alpha</math> 等),但它们并不强同构。例如,在超图 <math>H</math> 中,顶点 <math>a</math> 同时属于边 1、4、6,因此:<br />
<br />
<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
而在超图 <math>G</math> 中,不存在同时属于边 1、4、6 的顶点:<br />
<br />
<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
在这个例子中,<math>H</math> 和 <math>G</math> 是等价的,即 <math>H\equiv G</math>,并且二者的对偶是强同构的:<math>H^*\cong G^*</math>。<br />
<br />
==Symmetric hypergraphs==<br />
The ''rank'' <math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
<br />
超图 <math>H</math> 的'''秩''' <math>r(H)</math> 是该超图中边的最大基数。如果所有边具有相同的基数 ''k'',则称该超图为均匀的或 ''k''-均匀的,或称之为 ''k''-超图。普通图就是一个 2-均匀超图。<br />
<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
<br />
顶点''v''的度''d(v)''表示包含该顶点的边的数量。如果每个顶点的度都为''k'',则超图''H''是k-正则的。<br />
<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
<br />
均匀超图的对偶是正则的,反之亦然。<br />
<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
<br />
如果存在一个自同构使得 <math>\phi(x)=y</math>,则称超图 ''H'' 的两个顶点 ''x'' 和 ''y'' '''对称'''。如果存在一个自同构使得 <math>\phi(e_i)=e_j</math>,则称两条边 <math>e_i</math> 和 <math>e_j</math> '''对称'''。<br />
<br />
A hypergraph is said to be ''vertex-transitive'' (or ''vertex-symmetric'') if all of its vertices are symmetric. Similarly, a hypergraph is ''edge-transitive'' if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply ''transitive''.<br />
<br />
如果超图的所有顶点两两对称,则称其为'''顶点传递'''的(或'''顶点对称'''的)。类似地,如果所有边两两对称,则该超图是'''边传递'''的。如果一个超图既是边传递的又是顶点传递的,则简称该超图是'''传递'''的。<br />
<br />
Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.<br />
<br />
由于超图的对偶性,边传递性的研究与顶点传递性的研究是相一致的。<br />
<br />
==Transversals==<br />
A ''[[Transversal (combinatorics)|transversal]]'' (or "[[hitting set]]") of a hypergraph ''H'' = (''X'', ''E'') is a set <math>T\subseteq X</math> that has nonempty [[intersection (set theory)|intersection]] with every edge. A transversal ''T'' is called ''minimal'' if no proper subset of ''T'' is a transversal. The ''transversal hypergraph'' of ''H'' is the hypergraph (''X'', ''F'') whose edge set ''F'' consists of all minimal transversals of ''H''.<br />
<br />
超图 ''H'' = (''X'', ''E'') 的'''横截集'''(或'''命中集''')是与每条边都有非空交集的集合 <math>T\subseteq X</math>。如果横截集 ''T'' 的任何真子集都不是横截集,则称 ''T'' 为'''极小横截集'''。''H'' 的'''横截超图'''是超图 (''X'', ''F''),其边集 ''F'' 由 ''H'' 的所有极小横截集组成。<br />
<br />
Computing the transversal hypergraph has applications in [[combinatorial optimization]], in [[game theory]], and in several fields of [[computer science]] such as [[machine learning]], [[Index (database)|indexing of database]]s, the [[Boolean satisfiability problem|satisfiability problem]], [[data mining]], and computer [[program optimization]].<br />
<br />
计算横截超图在[[组合优化 Combinatorial Optimization]]、[[博弈论 Game Theory]]和[[计算机科学 Computer Science]]的若干领域(例如[[机器学习 Machine Learning]]、[[数据库索引 Indexing of Databases]]、[[可满足性问题 the Satisfiability Problem]]、[[数据挖掘 Data Mining]]和[[计算机程序优化 Program Optimization]])都有应用。<br />
<br />
==Incidence matrix==<br />
Let <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math> and <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>. Every hypergraph has an <math>n \times m</math> [[incidence matrix]] <math>A = (a_{ij})</math> where<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
The [[transpose]] <math>A^t</math> of the [[incidence (geometry)|incidence]] matrix defines a hypergraph <math>H^* = (V^*,\ E^*)</math> called the '''dual''' of <math>H</math>, where <math>V^*</math> is an ''m''-element set and <math>E^*</math> is an ''n''-element set of subsets of <math>V^*</math>. For <math>v^*_j \in V^*</math> and <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[if and only if]] <math>a_{ij} = 1</math>.<br />
<br />
<br />
设 <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math>,<math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>。每一个超图都有一个 <math>n \times m</math> 的[[关联矩阵]] <math>A = (a_{ij})</math>,其中:<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
<br />
关联矩阵的[[转置]] <math>A^t</math> 定义了一个超图 <math>H^* = (V^*,\ E^*)</math>,称为 <math>H</math> 的'''对偶''',其中 <math>V^*</math> 是一个 ''m'' 元集合,<math>E^*</math> 是由 <math>V^*</math> 的子集构成的 ''n'' 元集合。<br />
<br />
对于<math>v^*_j \in V^*</math> 和 <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[当且仅当]] <math>a_{ij} = 1</math>。<br />
<br />
==Hypergraph coloring==<br />
Classic hypergraph coloring assigns one of the colors from a set <math>\{1,2,3,...,\lambda\}</math> to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. The minimum number of distinct colors used over all such colorings is called the chromatic number of the hypergraph.<br />
<br />
Hypergraphs for which there exists a coloring using up to ''k'' colors are referred to as ''k-colorable''. The 2-colorable hypergraphs are exactly the bipartite ones.<br />
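A brute-force sketch of these notions (our own illustrative code; the search over color assignments is exponential, so it only works for toy instances):<br />
<syntaxhighlight lang="python">
from itertools import product

def is_proper(coloring, edges):
    """No hyperedge with cardinality at least 2 may be monochromatic."""
    return all(len({coloring[v] for v in e}) > 1 for e in edges if len(e) >= 2)

def chromatic_number(vertices, edges):
    """Smallest k for which some assignment of k colors is proper."""
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        if any(is_proper(dict(zip(vertices, c)), edges)
               for c in product(range(k), repeat=len(vertices))):
            return k

E = [{1, 2, 3}, {1, 4}, {2, 4}]
print(chromatic_number({1, 2, 3, 4}, E))  # 2: this hypergraph is 2-colorable, hence bipartite
</syntaxhighlight>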
<br />
There are many generalizations of classic hypergraph coloring. One of them is so-called mixed hypergraph coloring, in which monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, the minimum and maximum numbers of used colors are called the lower and upper chromatic numbers respectively. See http://spectrum.troy.edu/voloshin/mh.html for details.<br />
<br />
==Partitions==<br />
A partition theorem due to E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> states that, for an edge-transitive hypergraph <math>H=(X,E)</math>, there exists a [[partition of a set|partition]]<br />
<br />
:<math>(X_1, X_2,\cdots,X_K)</math><br />
<br />
of the vertex set <math>X</math> such that the subhypergraph <math>H_{X_k}</math> generated by <math>X_k</math> is transitive for each <math>1\le k \le K</math>, and such that<br />
<br />
:<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math><br />
<br />
where <math>r(H)</math> is the rank of ''H''.<br />
<br />
As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.<br />
<br />
<br />
[[Graph partitioning]] (and in particular, hypergraph partitioning) has many applications to IC design<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> and parallel computing.<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref> Efficient and Scalable [[Graph partition|hypergraph partitioning algorithms]] are also important for processing large scale hypergraphs in machine learning tasks.<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
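The objective that most hypergraph partitioners minimize can be sketched in a few lines (illustrative code, not any particular tool's API):<br />
<syntaxhighlight lang="python">
def hyperedge_cut(edges, block_of):
    """Number of hyperedges spanning more than one block of a vertex
    partition, given as a dict mapping each vertex to its block id."""
    return sum(1 for e in edges if len({block_of[v] for v in e}) > 1)

E = [{1, 2, 3}, {3, 4}, {4, 5, 6}]
print(hyperedge_cut(E, {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}))  # 1: only {3, 4} is cut
</syntaxhighlight>
Partitioners such as those cited above minimize this quantity (or a related connectivity metric) subject to balance constraints on the block sizes.<br />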
<br />
<br />
<br />
==Theorems==<br />
Many [[theorem]]s and concepts involving graphs also hold for hypergraphs. [[Ramsey's theorem]] and [[Line graph of a hypergraph]] are typical examples. Some methods for studying symmetries of graphs extend to hypergraphs.<br />
<br />
Two prominent theorems are the [[Erdős–Ko–Rado theorem]] and the [[Kruskal–Katona theorem]] on uniform hypergraphs.<br />
<br />
<br />
==Hypergraph drawing==<br />
[[File:CircuitoDosMallas.png|thumb|This [[circuit diagram]] can be interpreted as a drawing of a hypergraph in which four vertices (depicted as white rectangles and disks) are connected by three hyperedges drawn as trees.]]<br />
<br />
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.<br />
<br />
In one possible visual representation for hypergraphs, similar to the standard [[graph drawing]] style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves.<ref>{{citation<br />
| last = Sander | first = G.<br />
| contribution = Layout of directed hypergraphs with orthogonal hyperedges<br />
| pages = 381–386<br />
| publisher = Springer-Verlag<br />
| series = [[Lecture Notes in Computer Science]]<br />
| title = Proc. 11th International Symposium on Graph Drawing (GD 2003)<br />
| contribution-url = http://gdea.informatik.uni-koeln.de/585/1/hypergraph.ps<br />
| volume = 2912<br />
| year = 2003| title-link = International Symposium on Graph Drawing<br />
}}.</ref><ref>{{citation<br />
| last1 = Eschbach | first1 = Thomas<br />
| last2 = Günther | first2 = Wolfgang<br />
| last3 = Becker | first3 = Bernd<br />
| issue = 2<br />
| journal = [[Journal of Graph Algorithms and Applications]]<br />
| pages = 141–157<br />
| title = Orthogonal hypergraph drawing for improved visibility<br />
| url = http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf<br />
| volume = 10<br />
| year = 2006 | doi=10.7155/jgaa.00122}}.</ref> If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as [[simple closed curve]]s that enclose sets of points.<ref>{{citation<br />
| last = Mäkinen | first = Erkki<br />
| doi = 10.1080/00207169008803875<br />
| issue = 3<br />
| journal = International Journal of Computer Mathematics<br />
| pages = 177–185<br />
| title = How to draw a hypergraph<br />
| volume = 34<br />
| year = 1990}}.</ref><ref>{{citation<br />
| last1 = Bertault | first1 = François<br />
| last2 = Eades | first2 = Peter | author2-link = Peter Eades<br />
| contribution = Drawing hypergraphs in the subset standard<br />
| doi = 10.1007/3-540-44541-2_15<br />
| pages = 45–76<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 8th International Symposium on Graph Drawing (GD 2000)<br />
| volume = 1984<br />
| year = 2001| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><ref>{{citation<br />
| last1 = Naheed Anjum | first1 = Arafat<br />
| last2 = Bressan | first2 = Stéphane<br />
| contribution = Hypergraph Drawing by Force-Directed Placement<br />
| doi = 10.1007/_31<br />
| pages = 387–394<br />
| publisher = Springer International Publishing<br />
| series = Lecture Notes in Computer Science<br />
| title = 28th International Conference on Database and Expert Systems Applications (DEXA 2017)<br />
| volume = 10439<br />
| year = 2017| isbn = <br />
}}.</ref><br />
<br />
<br />
[[File:Venn's four ellipse construction.svg|thumb|An order-4 Venn diagram, which can be interpreted as a subdivision drawing of a hypergraph with 15 vertices (the 15 colored regions) and 4 hyperedges (the 4 ellipses).]]<br />
<br />
In another style of hypergraph visualization, the subdivision model of hypergraph drawing,<ref>{{citation<br />
| last1 = Kaufmann | first1 = Michael<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Speckmann | first3 = Bettina | author3-link = Bettina Speckmann<br />
| contribution = Subdivision drawings of hypergraphs<br />
| doi = 10.1007/_39<br />
| pages = 396–407<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 16th International Symposium on Graph Drawing (GD 2008)<br />
| volume = 5417<br />
| year = 2009| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 1987}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><br />
<br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
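A toy text rendering in the spirit of PAOH (this is our own sketch, not the PAOHVis system linked below): vertices are rows aligned on the left, and each hyperedge is a vertical column of marks.<br />
<syntaxhighlight lang="python">
def paoh_text(vertices, edges, names):
    """Print one row per vertex and one column per hyperedge,
    marking incidences with '|'."""
    for v in vertices:
        print(f"{v:>3} " + "  ".join("|" if v in e else "." for e in edges))
    print("    " + "  ".join(names))

paoh_text(["a", "b", "c", "d"], [{"a", "b", "c"}, {"c", "d"}], ["1", "2"])
</syntaxhighlight>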
<br />
<br />
==Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, with one row and one column for each vertex and each edge, so its order equals the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
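The two variants can be told apart mechanically: represent each generalized edge as the set of names it contains and look for a cycle in the membership relation (a sketch under this obvious encoding; the names are our own):<br />
<syntaxhighlight lang="python">
def has_membership_cycle(edges):
    """edges maps each edge name to the set of names (vertices or other
    edges) it contains; a directed cycle among the membership arcs means
    the structure violates the axiom of foundation."""
    succ = {e: set(members) for e, members in edges.items()}

    def reachable_from(node, seen):
        if node in seen:
            return True
        return any(reachable_from(m, seen | {node}) for m in succ.get(node, ()))

    return any(reachable_from(e, frozenset()) for e in succ)

print(has_membership_cycle({'e1': {'a', 'b'}, 'e2': {'a', 'e1'}}))  # False
print(has_membership_cycle({'e1': {'e2'}, 'e2': {'e1'}}))           # True
</syntaxhighlight>
In the acyclic case the transitive closure of membership "flattens" the structure into a partial order, exactly as described above.<br />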
<br />
<br />
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsoford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Tue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
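As one concrete anchor for the spectral techniques mentioned above, the normalized hypergraph Laplacian of Zhou et al. can be written down directly from the incidence matrix (a sketch assuming NumPy, unit edge weights by default, nonempty edges, and no isolated vertices):<br />
<syntaxhighlight lang="python">
import numpy as np

def hypergraph_laplacian(H, w=None):
    """L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} for an n x m incidence
    matrix H, diagonal edge weights W, edge degrees De and vertex degrees Dv."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, float)
    dv = H @ w                        # weighted vertex degrees
    de = H.sum(axis=0)                # edge sizes
    d_inv_sqrt = np.diag(dv ** -0.5)
    return np.eye(n) - d_inv_sqrt @ H @ np.diag(w / de) @ H.T @ d_inv_sqrt

H = np.array([[1, 0], [1, 1], [0, 1]], float)  # 3 vertices, 2 hyperedges
print(np.linalg.eigvalsh(hypergraph_laplacian(H)))  # spectrum used for clustering
</syntaxhighlight>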
<br />
<br />
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''' Springer-Verlag<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4<br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the [https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]<br />
<br />
<br />
==Editor's recommendations==<br />
*[https://book.douban.com/subject/1237624/ ''Hypergraphs: Combinatorics of Finite Sets''] by Claude Berge<br />
The first monograph on hypergraphs. Its author, the French mathematician Claude Berge, a father of modern graph theory, extended the ordinary edges of graphs to hyperedges; this small step opened up an entire field.</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E8%B6%85%E5%9B%BE_Hypergraph&diff=4614超图 Hypergraph2020-04-22T09:41:54Z<p>Pjhhh:</p>
<hr />
<div><br />
We are organizing a translation of this Hypergraph entry. Hypergraphs are a key concept in the recent long article by Wolfram, and we hope a well-edited entry will help readers understand that article better.<br />
<br />
<br />
We are now recruiting six collaborators to translate the entry together: https://wiki.swarma.org/index.php?title=超图_Hypergraph<br />
<br />
*Opening section + Terminology: 十三维<br />
*Bipartite graph model + Acyclicity + Isomorphism and equality: 淑慧<br />
*Symmetric hypergraphs + Transversals: 瑾晗<br />
*Incidence matrix + Hypergraph coloring + Partitions: 厚朴<br />
*Theorems + Hypergraph drawing + Hypergraph grammars: 十三维<br />
*Generalizations + Hypergraph learning: 世康<br />
<br />
Deadline: 18:00, Beijing time.<br />
<br />
<br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order'' of the hypergraph, and the size of the edge set is the ''size'' of the hypergraph.<br />
<br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
<br />
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
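These definitions translate directly into code (a minimal sketch; the function names are our own):<br />
<syntaxhighlight lang="python">
def degrees(X, E):
    """Map each vertex to the number of hyperedges containing it."""
    return {v: sum(v in e for e in E) for v in X}

def is_k_uniform(E, k):
    """Every hyperedge contains exactly k vertices."""
    return all(len(e) == k for e in E)

def is_d_regular(X, E, d):
    """Every vertex lies in exactly d hyperedges."""
    return all(c == d for c in degrees(X, E).values())

def is_simple(E):
    """No loops (single-vertex edges) and no repeated edges."""
    return all(len(e) > 1 for e in E) and len({frozenset(e) for e in E}) == len(E)

X, E = {1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}, {4, 1}]
print(is_k_uniform(E, 2), is_d_regular(X, E, 2), is_simple(E))  # True True True
</syntaxhighlight>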
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph where each<br />
hyperedge of <math>H</math> which is partially contained in the subhypergraph <math>H_A</math> and is fully contained in the extension <math>Ex(H_A)</math>.<br />
Formally<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
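The three notions differ only in which edge traces survive, which a short sketch makes plain (our own function names; edges are Python sets):<br />
<syntaxhighlight lang="python">
def subhypergraph(E, A):
    """H_A: restrict every edge to A and keep the nonempty traces."""
    return A, [e & A for e in E if e & A]

def partial_hypergraph(X, E, J):
    """Keep only the edges whose index lies in J."""
    return X, [e for i, e in enumerate(E) if i in J]

def section_hypergraph(E, A):
    """H x A: keep only the edges entirely contained in A."""
    return A, [e for e in E if e <= A]

X, E = {1, 2, 3, 4}, [{1, 2}, {2, 3, 4}, {4}]
print(subhypergraph(E, {2, 3}))          # ({2, 3}, [{2}, {2, 3}])
print(partial_hypergraph(X, E, {0, 2}))  # ({1, 2, 3, 4}, [{1, 2}, {4}])
print(section_hypergraph(E, {2, 3, 4}))  # ({2, 3, 4}, [{2, 3, 4}, {4}])
</syntaxhighlight>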
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
<br />
:<math>\left(H^*\right)^* = H.</math><br />
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
<br />
==Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
<br />
[[File:bipartie graph.jpeg|200px|thumb|right|Let <math>G=(V,E)</math> be an undirected graph. If the vertex set <math>V</math> can be partitioned into two disjoint subsets <math>{(group1, group2)}</math> such that the two endpoints <math>{i}</math> and <math>{j}</math> of every edge <math>{(i,j)}</math> belong to different parts <math>{(i \in group1,j \in group2)}</math>, then <math>{G}</math> is called a bipartite graph.]]<br />
<br />
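A sketch of the construction (our own encoding, tagging the two sides 'v' and 'e' to keep them disjoint):<br />
<syntaxhighlight lang="python">
def levi_graph(vertices, edges):
    """Incidence (Levi) graph of a hypergraph: a bipartite graph whose two
    sides are the vertices and the hyperedges, with v adjacent to e iff v in e."""
    adj = {('v', v): set() for v in vertices}
    adj.update({('e', i): set() for i in range(len(edges))})
    for i, e in enumerate(edges):
        for v in e:
            adj[('v', v)].add(('e', i))
            adj[('e', i)].add(('v', v))
    return adj

adj = levi_graph(['a', 'b', 'c'], [{'a', 'b'}, {'b', 'c'}])
print(sorted(adj[('v', 'b')]))  # [('e', 0), ('e', 1)]: b lies in both hyperedges
</syntaxhighlight>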
<br />
==Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]] (an ear being a path whose internal vertices have degree 2 and whose removal does not disconnect the graph). In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
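A compact sketch of the GYO reduction as just described (our own implementation choices: rule 1 strips vertices that occur in a single edge, rule 2 discards empty edges and edges covered by another edge, keeping one copy of duplicates; α-acyclicity holds iff nothing remains):<br />
<syntaxhighlight lang="python">
def gyo_acyclic(edges):
    """Graham/GYO reduction: alternate the two removal rules until nothing
    changes; the hypergraph is alpha-acyclic iff all edges disappear."""
    edges = [set(e) for e in edges]
    while True:
        before = sorted(map(sorted, edges))
        for e in edges:  # rule 1: drop vertices occurring in only one edge
            e -= {v for v in e if sum(v in f for f in edges) == 1}
        edges = [e for i, e in enumerate(edges)  # rule 2: drop covered edges
                 if e and not any((e < f) or (e == f and j < i)
                                  for j, f in enumerate(edges) if j != i)]
        if sorted(map(sorted, edges)) == before:
            return not edges

print(gyo_acyclic([{1, 2}, {2, 3}, {2, 4}]))  # True: a star is alpha-acyclic
print(gyo_acyclic([{1, 2}, {2, 3}, {1, 3}]))  # False: the triangle of pairs
</syntaxhighlight>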
<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
<br />
==Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]]<br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of <math>I</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
===Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
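For hypergraphs as small as this example, isomorphism can be checked by exhausting all bijections (our own brute-force sketch; factorial time, purely illustrative):<br />
<syntaxhighlight lang="python">
from itertools import permutations
from collections import Counter

def isomorphic(Xh, Eh, Xg, Eg):
    """Search for a bijection phi: Xh -> Xg that carries the edge family
    of H onto the edge family of G (compared as multisets of sets)."""
    target = Counter(frozenset(e) for e in Eg)
    for perm in permutations(Xg):
        phi = dict(zip(Xh, perm))
        if Counter(frozenset(phi[v] for v in e) for e in Eh) == target:
            return True
    return False

Eh = [{'a','b'}, {'b','c'}, {'c','d'}, {'d','a'}, {'b','d'}, {'a','c'}]
Eg = [{'α','β'}, {'β','γ'}, {'γ','δ'}, {'δ','α'}, {'α','γ'}, {'β','δ'}]
print(isomorphic('abcd', Eh, 'αβγδ', Eg))  # True, matching the example above
</syntaxhighlight>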
<br />
==Symmetric hypergraphs==<br />
The ''rank'' <math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
<br />
<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
<br />
<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
<br />
<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 1987}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/978-3-642-11805-0_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = 978-3-642-11804-3<br />
| doi-access = free<br />
}}.</ref><br />
<br />
超图可视化的另一种风格是超图绘制的细分模型[21]:平面被细分为若干区域,每个区域代表超图的一个顶点;超图的超边由这些区域的连通子集表示,这些子集可以通过着色、在其周围勾画轮廓或两者兼用来标示。例如,一个''n''阶[[维恩图]]可以看作一个具有''n''条超边(构成该图的曲线)和2<sup>''n''</sup>&nbsp;−&nbsp;1个顶点(这些曲线把平面细分出的区域)的超图的细分图。与可在多项式时间内识别的[[平面图]]不同,判定一个超图是否存在平面细分图是[[NP完全]]的;但当区域的邻接模式被限制为路径、环或树时,可以高效地检验这类图是否存在。<br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
<br />
超图的另一种表示法称为PAOH[24],如本文顶部的图所示:边是连接顶点的垂直线,顶点在左侧对齐,右侧的图例给出各条边的名称。它是为动态超图设计的,但也可以用于简单超图。<br />
<br />
==Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
通过用替换规则扩充一类超图,可以将[[图语法]]推广到允许超边的情形。<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
<br />
<br />
==超图概念的延伸==<br />
<br />
超图的相关概念可以进行进一步的延伸,如超图中的一些边可以指向另一些边。这种延伸有两种变体。在第一种变体中,超图的边不仅包含一组节点,而且还可以包含这组节点的子集、子集的子集等等。本质上,超图的每条边只是树结构或有向无环图的一个内部节点,而节点就是叶子。从这个意义上来说,超图就是具有共享节点的树的集合(即内部节点或叶子可能出现在不同的树结构中),反过来说,每个树的集合又可以理解为一个超图。因为树结构在计算机科学和许多数学分支中被广泛使用,所以超图的出现也是自然而然的。比如这种延伸是作为项代数的模型而自然产生的:边对应项,节点对应常量或变量。<br />
<br />
<br />
对于这样的超图,集合的成员关系提供了一种排序,但该排序既不是偏序也不是预序,因为它不具有传递性。与这一推广相对应的Levi图是有向无环图。例如,考虑顶点集为<math>V= \{a,b\}</math>、边为<math>e_1=\{a,b\}</math>和<math>e_2=\{a,e_1\}</math>的广义超图。那么,虽然<math>b\in e_1</math>且<math>e_1\in e_2</math>,但<math>b\in e_2</math>并不成立。然而,这类超图上集合成员关系的传递闭包确实诱导出一个偏序,并把超图“展平”为一个偏序集。<br />
<br />
<br />
第二种变体中,超图的边可以指向其他边,而不要求边必须按有向无环图的方式排序。这允许出现带有边环路的图,它们甚至可以完全不含顶点。例如,考虑由两条边<math>e_1</math>和<math>e_2</math>组成、顶点个数为零的广义超图,使得<math>e_1 = \{e_2\}</math>且<math>e_2 = \{e_1\}</math>。由于这个环路是无限递归的,作为边的那些集合违反了基础公理。特别地,对于这样的超图,集合成员关系不存在传递闭包。虽然这样的结构乍看起来可能很奇怪,但只要注意到其Levi图的等价推广不再是二分图,而只是一般的有向图,就很容易理解了。<br />
<br />
根据定义,这种超图的广义关联矩阵是一个方阵,其秩等于节点和边的总数。因此,对于上面的示例,关联矩阵为:<br />
<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right]</math>。<br />
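作为示意,下面的代码按上述定义为这类广义超图构造广义关联矩阵,并复现正文中的例子;其中把广义超图表示为“名字(顶点或边)到其成员集合”的字典,这一表示方式与函数名均为演示用的假设。<br />
<syntaxhighlight lang="python">
import numpy as np

def generalized_incidence(H):
    """广义关联矩阵:行列均按“顶点+边”的名字排列,a_ij = 1 当且仅当元素 i 属于 j。"""
    names = sorted(H)                     # 固定顶点与边的一个次序
    idx = {n: k for k, n in enumerate(names)}
    A = np.zeros((len(names), len(names)), dtype=int)
    for j, container in enumerate(names):
        for member in H[container]:
            A[idx[member], j] = 1
    return names, A

# 正文中的边环路例子:e1 = {e2},e2 = {e1},没有任何顶点
names, A = generalized_incidence({"e1": {"e2"}, "e2": {"e1"}})
print(names)  # ['e1', 'e2']
print(A)      # [[0 1]
              #  [1 0]]
</syntaxhighlight>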
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsoford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Tue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
<br />
<br />
==超图与机器学习==<br />
<br />
超图已被广泛用于机器学习任务中,作为数据模型和分类器的正则化手段[25]。这些应用包括推荐系统(社团作为超边)[26]、图像检索(相关性作为超边)[27]和生物信息学(生物化学相互作用作为超边)[28]。比较典型的超图学习技术包括:超图谱聚类(用超图拉普拉斯算子扩展谱图理论)[29],以及超图半监督学习(引入额外的超图结构代价来约束学习结果)。对于大规模超图,还可以使用基于Apache Spark构建的分布式框架[15]。<br />
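作为示意,下面给出超图谱聚类中一种常见的归一化超图拉普拉斯算子的简化实现(构造方式大致沿用Zhou等人2006年论文中的形式<math>\Delta = I - D_v^{-1/2} H W D_e^{-1} H^{T} D_v^{-1/2}</math>;不同文献的加权细节可能有出入,变量名均为演示用的假设)。<br />
<syntaxhighlight lang="python">
import numpy as np

def hypergraph_laplacian(H, w=None):
    """H:|V| x |E| 的0/1关联矩阵;w:可选的超边权重(假设没有度为零的顶点或边)。"""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w                      # 加权顶点度
    d_e = H.sum(axis=0)              # 超边的度(基数)
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    De_inv = np.diag(1.0 / d_e)
    theta = Dv_inv_sqrt @ H @ np.diag(w) @ De_inv @ H.T @ Dv_inv_sqrt
    return np.eye(n) - theta

# 例:4个顶点,两条超边 {0,1,2} 和 {2,3}
H = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]])
L = hypergraph_laplacian(H)
eigvals, eigvecs = np.linalg.eigh(L)   # 最小的若干特征向量可作谱嵌入再聚类
</syntaxhighlight>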
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''' Springer-Verlag<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4<br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the [https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]<br />
<br />
<br />
==编者推荐==<br />
*[https://book.douban.com/subject/1237624/ 《超图-限集的组合学》]by [法]Claude Berge<br />
超图的第一本专著,作者是近代图论之父法国数学家Claude Berge,将图里的普通边拓展为超边,小小的一步拓展却引发了一个大的领域。</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E8%B6%85%E5%9B%BE_Hypergraph&diff=4607超图 Hypergraph2020-04-22T09:38:20Z<p>Pjhhh:</p>
<hr />
<div><br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order of the hypergraph'', and the size of the edge set is the ''size of the hypergraph''. <br />
<br />
在[[数学]]中,'''超图'''是一种广义上的[[Graph (discrete mathematics)|图]],它的一条[[graph theory|边]]可以连接任意数量的[[vertex (graph theory)|顶点]];相对而言,普通图中的一条边恰好连接两个顶点。形式上,超图<math>H</math>是一个有序对<math>H = (X,E)</math>,其中<math>X</math>是以节点或顶点为元素的集合(即顶点集),<math>E</math>是由<math>X</math>的非空子集构成的集合,这些子集称为边或超边。<br />
因此,<math>E</math>是<math>\mathcal{P}(X) \setminus\{\emptyset\}</math>的一个子集,其中<math>\mathcal{P}(X)</math>是<math>X</math>的[[幂集]]。顶点集的大小称为超图的阶数,边集的大小称为超图的大小。<br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
<br />
普通图的边是顶点的二元子集,而超边是顶点的任意集合,因此可以包含任意数量的顶点。不过,人们常常希望研究所有超边具有相同基数的超图:''k''-均匀超图的所有超边大小都为''k''(换言之,这样的超图是一族集合,其中每个集合都是连接''k''个节点的超边)。因此,2-均匀超图就是图,3-均匀超图是无序三元组的集合,依此类推。超图也被称为从[[泛集]](universal set)中抽取的集合系统或[[集族]]。<br />
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
超图可以看作[[关联结构]](incidence structure)。特别地,每个超图都对应一个二分的“关联图”或“[[列维图]]”(Levi graph);反之,大多数(但不是全部)[[二分图]]都可以看作某个超图的关联图。<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
超图还有许多其它名称。在[[计算几何学]]中,超图有时可以被称为'''范围空间'''(range space),将超图的边称为''范围''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
在[[合作博弈论]]中,超图被称为'''简单博弈'''(投票博弈);这个概念被应用于解决[[社会选择理论]](social choice theory)中的问题。在一些文献中,超边被称为''超链接''或''连接器''。<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
特殊类型的超图包括:上文简单讨论过的k-均匀超图;散簇(clutter),其中没有任何一条边是另一条边的子集;以及[[抽象单纯复形]](abstract simplicial complexes),它包含每条边的所有子集。<br />
超图的全体构成一个以超图同态为[[态射]](morphism)的范畴。<br />
<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
超图有不同的类型,如:<br />
* 空超图:没有边的超图。<br />
* 非简单(或多重)超图:允许自环(只含单个顶点的超边)或重复边的超图,也就是说,可以有两条或更多条边包含同一组顶点。<br />
* 简单超图:不包含自环和重复边的超图。<br />
* <math>k</math>-均匀超图:每条超边都恰好包含<math>k</math>个顶点的超图。<br />
* <math>d</math>-正则超图:每个顶点的度数都是<math>d</math>的超图。<br />
* 无环超图:不包含任何环的超图。<br />
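作为示意,下面用一小段代码检验上述定义中的<math>k</math>-均匀与<math>d</math>-正则性质;其中把超边表示为顶点集合的列表,这一表示方式与函数名均为演示用的假设。<br />
<syntaxhighlight lang="python">
def is_k_uniform(edges, k):
    """每条超边都恰好包含 k 个顶点时返回 True。"""
    return all(len(e) == k for e in edges)

def is_d_regular(vertices, edges, d):
    """每个顶点的度(包含它的超边条数)都等于 d 时返回 True。"""
    return all(sum(v in e for e in edges) == d for v in vertices)

# 例:2-均匀且2-正则的超图,也就是一个普通图(三角形)
V = {1, 2, 3}
E = [{1, 2}, {2, 3}, {1, 3}]
print(is_k_uniform(E, 2))      # True
print(is_d_regular(V, E, 2))   # True
</syntaxhighlight>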
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
因为超图的边可以有任意基数,所以子图的概念有多种版本,分别是''子超图''(subhypergraph)、''部分超图''(partial hypergraph)和''分段超图''(section hypergraph)。<br />
<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph in which each hyperedge of <math>H</math> that is partially contained in the subhypergraph <math>H_A</math> is fully contained in the extension <math>Ex(H_A)</math>.<br />
Formally<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
<br />
:<math>\left(H^*\right)^* = H.</math><br />
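作为示意,下面的代码按上述定义计算对偶,并验证“取对偶”确实是一个对合;其中把超图表示为“边名到顶点集合”的字典,这一表示方式为演示用的假设。<br />
<syntaxhighlight lang="python">
def dual(H):
    """对偶超图:顶点与边互换,X_m = { e_i | x_m ∈ e_i }。"""
    vertices = set().union(*H.values())
    return {v: frozenset(e for e, members in H.items() if v in members)
            for v in vertices}

H = {"e1": frozenset({"a", "b"}), "e2": frozenset({"b", "c"})}
Hd = dual(H)    # {'a': {'e1'}, 'b': {'e1', 'e2'}, 'c': {'e2'}}
assert dual(Hd) == H   # 对偶的对偶(在重新标记的意义下)恢复原超图
</syntaxhighlight>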
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
与连通超图''H''具有相同顶点集的[[连通图]]''G'',如果''H''的每条超边都在''G''中诱导出一个连通子图,则称''G''是''H''的'''主图'''(host graph)。<br />
对于不连通的超图''H'',如果''G''与''H''的连通分量之间存在一个双射,使得''G''的每个连通分量''G<nowiki>'</nowiki>''都是对应的''H<nowiki>'</nowiki>''的主图,则''G''是一个主图。<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
一个超图是''二分的''(bipartite),当且仅当它的顶点集可以划分为两类''U''和''V'',使得每条基数至少为2的超边都至少包含来自这两类的各一个顶点。或者说,这样的超图被称为具有[[属性B]](Property B)。<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
超图的'''2-段'''(或'''团图'''、'''代表图'''、'''原始图'''、'''盖夫曼图 Gaifman graph''')是与该超图有相同顶点集的图,其中任意两个属于同一条超边的顶点之间都连有一条边。<br />
<br />
==二部图模型 Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
<br />
[[File:bipartie graph.jpeg|200px|缩略图|右| 设<math>G=(V,E)</math>是一个无向图,如果顶点V可分割为两个互不相交的子集<math> {(group1, group2)}</math>,并且图中的每条边<math>{(i,j)}</math>所关联的两个顶点<math>{i}</math>和<math>{j}</math>分别属于这两个不同的部分<math>{(i \in group1,j \in group2)}</math>,则称图<math>{G}</math>为一个二部图。]]<br />
<br />
一个'''超图'''<math>H</math>可以用如下的二部图<math>BG</math>表示:集合''X''和''E''是<math>BG</math>的两个部(part),且(''x<sub>1</sub>'', ''e<sub>1</sub>'')之间有边相连当且仅当顶点''x<sub>1</sub>''包含于<math>H</math>的边''e<sub>1</sub>''中。反之,任何部固定且第二部中没有孤立节点的二部图,都以上述方式表示某个超图。这个二部图也称为'''关联图'''。<br />
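作为示意,下面的代码按上述方式由超图构造其二部关联图(Levi图)的边表;表示方式为演示用的假设。<br />
<syntaxhighlight lang="python">
def incidence_graph(H):
    """返回二部图的边表:当且仅当顶点 x 属于超边 e 时,(x, e) 相连。"""
    return [(x, e) for e, members in H.items() for x in members]

H = {"e1": {"a", "b"}, "e2": {"b", "c", "d"}}
print(sorted(incidence_graph(H)))
# [('a', 'e1'), ('b', 'e1'), ('b', 'e2'), ('c', 'e2'), ('d', 'e2')]
</syntaxhighlight>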
<br />
==无环性 Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
普通无向图只有一种自然的'''环 cycle'''与'''无环图 acyclic graph'''概念;与之不同,超图的'''无环性 acyclicity'''有多种自然而互不等价的定义,它们在普通图这一特殊情形下都退化为普通图的无环性。<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
Claude Berge给出了超图无环性的首个定义:<ref>Claude Berge,[https://www.amazon.com/Graphs-hypergraphs-North-Holland-mathematical-library/dp/0444103996 ''Graphs and Hypergraphs'']</ref>如果一个超图的'''关联图'''(即上面定义的二部图)是无环的,则称该超图是Berge无环的(Berge-acyclic)。这个定义的限制性很强:例如,如果一个超图中存在一对顶点<math>v \neq v'</math>和一对超边<math>f \neq f'</math>,使得<math>v, v' \in f</math>且<math>v, v' \in f'</math>,那么它就是Berge有环的(Berge-cyclic)。显然,通过遍历关联图,可以在'''线性时间 linear time'''内检验Berge有环性。<br />
<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]]. In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
<br />
此处我们可以定义一个较弱的超图无环性概念<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref>,后来被称为<math>\alpha</math>-无环性(<math>\alpha</math>-acyclicity)。这个概念等价于:超图是共形的(conformal,即原图的每个团都被某条超边覆盖)且其原图是'''弦图 chordal graph''';它也等价于可以通过GYO算法 Graham-Yu-Ozsoyoglu Algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref>(也称格雷厄姆算法 Graham's algorithm)把超图约化为空超图。GYO算法是一个合流的(confluent,见抽象重写 abstract rewriting)迭代过程,它利用广义的'''耳朵 ear'''定义逐步移除超边(图论中的耳朵定义为一条路径,其中除端点外各点的度数均为2(端点可以重合),且删去后不破坏图的连通性)。在'''数据库理论 database theory'''领域,众所周知,如果一个数据库模式的底层超图是<math>\alpha</math>-无环的,那么它就具有某些理想的性质。<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref>此外,<math>\alpha</math>-无环性还与一阶逻辑 first-order logic 的'''守卫片段 guarded fragment'''的表达能力有关。<br />
<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
<br />
我们可以在'''线性时间 linear time'''内检验一个超图是否是<math>\alpha</math>-无环的。<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
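作为示意,下面给出GYO约化的一个直接实现:反复删除只出现在一条超边中的顶点,并删除空的或被其他超边包含的超边;当且仅当最终所有超边都被删光时,超图是<math>\alpha</math>-无环的。为清晰起见这里采用直观的多项式时间写法,而不是上文提到的线性时间算法;表示方式为演示用的假设。<br />
<syntaxhighlight lang="python">
def is_alpha_acyclic(edges):
    """GYO约化:edges 是顶点集合的列表;约化为空当且仅当超图α-无环。"""
    edges = [set(e) for e in edges if e]
    changed = True
    while changed:
        changed = False
        for e in edges:                      # 删除只属于一条超边的顶点
            only_here = {v for v in e
                         if all(v not in f for f in edges if f is not e)}
            if only_here:
                e -= only_here
                changed = True
        for i, e in enumerate(edges):        # 删除空边或被其他边包含的边
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                edges.pop(i)
                changed = True
                break
    return not edges

print(is_alpha_acyclic([{1, 2, 3}, {2, 3, 4}]))    # True
print(is_alpha_acyclic([{1, 2}, {2, 3}, {1, 3}]))  # False:三角形不共形
</syntaxhighlight>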
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
注意,<math>\alpha</math>-无环性有一个与直觉相悖的性质:向<math>\alpha</math>-有环的超图中添加超边可能使其变为<math>\alpha</math>-无环(例如,添加一条包含超图所有顶点的超边总会使其成为<math>\alpha</math>-无环的)。部分出于这一缺陷,Ronald Fagin<ref name="fagin1983degrees"/>定义了更强的<math>\beta</math>-无环性(<math>\beta</math>-acyclicity)和<math>\gamma</math>-无环性(<math>\gamma</math>-acyclicity)概念。<math>\beta</math>-无环性可以表述为:超图的所有子超图都是<math>\alpha</math>-无环的,这与Graham早先的定义<ref name="graham1979universal"/>等价<ref name="fagin1983degrees"/>。<math>\gamma</math>-无环性是一个更加严格的条件,它等价于数据库模式的若干理想性质,并与'''Bachman图 Bachman diagrams'''有关。<math>\beta</math>-无环性和<math>\gamma</math>-无环性都可以在'''多项式时间 polynomial time'''(PTIME)内检验。<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
这四个无环性概念是逐级蕴含的:Berge-无环性蕴含<math>\gamma</math>-无环性,<math>\gamma</math>-无环性蕴含<math>\beta</math>-无环性,<math>\beta</math>-无环性又蕴含<math>\alpha</math>-无环性。但任何一个反向蕴含都不成立,因此这四个概念互不相同。<ref name="fagin1983degrees" /><br />
<br />
==Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]]<br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of <math>I</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
===Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
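作为示意,下面的代码验证这个例子中的交集计算(集合记号为演示用):<br />
<syntaxhighlight lang="python">
H = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"},
     4: {"d", "a"}, 5: {"b", "d"}, 6: {"a", "c"}}
G = {1: {"α", "β"}, 2: {"β", "γ"}, 3: {"γ", "δ"},
     4: {"δ", "α"}, 5: {"α", "γ"}, 6: {"β", "δ"}}

print(H[1] & H[4] & H[6])   # {'a'}:H 中顶点 a 同时落在边 1、4、6 上
print(G[1] & G[4] & G[6])   # set():G 中没有顶点同时落在边 1、4、6 上
</syntaxhighlight>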
<br />
==Symmetric hypergraphs==<br />
The ''rank'' <math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
超图<math>H</math>的''秩''<math>r(H)</math>是超图中各条边基数的最大值。如果所有边具有相同的基数''k'',则称该超图为''均匀的''或''k-均匀的'',或称之为''k-超图''。图只是2-均匀的超图。<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
顶点''v''的度''d(v)''表示包含该顶点的边的数量。如果每个顶点的度都为''k'',则超图''H''是''k-正则的''。<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
均匀超图的对偶是正则的,反之亦然。<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
如果存在一个自同构使得<math>\phi(x)=y</math>,则称超图''H''的两个顶点''x''和''y''是''对称''的。如果存在一个自同构使得<math>\phi(e_i)=e_j</math>,则称两条边<math>e_i</math>和<math>e_j</math>是''对称''的。<br />
A hypergraph is said to be ''vertex-transitive'' (or ''vertex-symmetric'') if all of its vertices are symmetric. Similarly, a hypergraph is ''edge-transitive'' if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply ''transitive''.<br />
如果超图的所有顶点都是对称的,则称它是''顶点传递的''(或''顶点对称的'')。类似地,如果所有边都是对称的,则该超图是''边传递的''。如果一个超图既是边对称的又是顶点对称的,则简单地称它是''传递的''。<br />
Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.<br />
由于超图的对偶性,边传递性的研究与顶点传递性的研究是相一致的。<br />
==Transversals==<br />
A ''[[Transversal (combinatorics)|transversal]]'' (or "[[hitting set]]") of a hypergraph ''H'' = (''X'', ''E'') is a set <math>T\subseteq X</math> that has nonempty [[intersection (set theory)|intersection]] with every edge. A transversal ''T'' is called ''minimal'' if no proper subset of ''T'' is a transversal. The ''transversal hypergraph'' of ''H'' is the hypergraph (''X'', ''F'') whose edge set ''F'' consists of all minimal transversals of ''H''.<br />
超图''H'' = (''X'', ''E'')的'''横截集'''(或“命中集”)是与每条边都有非空交集的集合<math>T\subseteq X</math>。如果''T''的任何真子集都不是横截集,则称''T''是'''极小'''横截集。''H''的'''横截超图'''是超图(''X'', ''F''),其边集''F''由''H''的所有极小横截集构成。<br />
Computing the transversal hypergraph has applications in [[combinatorial optimization]], in [[game theory]], and in several fields of [[computer science]] such as [[machine learning]], [[Index (database)|indexing of database]]s, the [[Boolean satisfiability problem|satisfiability problem]], [[data mining]], and computer [[program optimization]].<br />
计算横截超图在[[组合优化 Combinatorial Optimization]]、[[博弈论 Game Theory]]以及[[计算机科学 Computer Science]]的若干领域(例如[[机器学习 Machine Learning]]、[[数据库索引 Indexing of Databases]]、[[可满足性问题 the Satisfiability Problem]]、[[数据挖掘 Data Mining]]和[[计算机程序优化 Program Optimization]])中都有应用。<br />
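作为示意,下面用按基数从小到大的暴力枚举求一个小超图的全部极小横截集;这只用于演示定义,规模稍大就不再可行,函数名为演示用的假设。<br />
<syntaxhighlight lang="python">
from itertools import combinations

def minimal_transversals(X, E):
    """按基数从小到大枚举顶点子集,保留与每条边都相交的极小集合。"""
    found = []
    for r in range(len(X) + 1):
        for T in map(set, combinations(sorted(X), r)):
            if all(T & e for e in E) and not any(M <= T for M in found):
                found.append(T)          # 没有已找到的更小横截集包含于 T
    return found

X = {1, 2, 3, 4}
E = [{1, 2}, {2, 3}, {3, 4}]
print(minimal_transversals(X, E))   # [{1, 3}, {2, 3}, {2, 4}]
</syntaxhighlight>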
==Incidence matrix==<br />
Let <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math> and <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>. Every hypergraph has an <math>n \times m</math> [[incidence matrix]] <math>A = (a_{ij})</math> where<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
The [[transpose]] <math>A^t</math> of the [[incidence (geometry)|incidence]] matrix defines a hypergraph <math>H^* = (V^*,\ E^*)</math> called the '''dual''' of <math>H</math>, where <math>V^*</math> is an ''m''-element set and <math>E^*</math> is an ''n''-element set of subsets of <math>V^*</math>. For <math>v^*_j \in V^*</math> and <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[if and only if]] <math>a_{ij} = 1</math>.<br />
<br />
<br />
分别设<math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math>,<math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>。每个超图都有一个<math>n \times m</math>的[[关联矩阵]]<math>A = (a_{ij})</math>,其中<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
<br />
关联矩阵的[[转置]]<math>A^t</math>定义了一个超图<math>H^* = (V^*,\ E^*)</math>,称为<math>H</math>的'''对偶''',其中<math>V^*</math>是一个''m''元集合,<math>E^*</math>是由<math>V^*</math>的子集构成的''n''元集合。<br />
<br />
对于<math>v^*_j \in V^*</math>和<math>e^*_i \in E^*</math>,<math>v^*_j \in e^*_i</math>[[当且仅当]]<math>a_{ij} = 1</math>。<br />
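作为示意,下面的代码按上述定义构造一个小超图的关联矩阵,其转置即给出对偶超图的关联矩阵;顶点与边的次序为演示用的假设。<br />
<syntaxhighlight lang="python">
import numpy as np

V = ["v1", "v2", "v3"]
E = {"e1": {"v1", "v2"}, "e2": {"v2", "v3"}, "e3": {"v1", "v3"}}

# a_ij = 1 当且仅当 v_i ∈ e_j
A = np.array([[1 if v in members else 0 for members in E.values()] for v in V])
print(A)
# [[1 0 1]
#  [1 1 0]
#  [0 1 1]]

print(A.T)   # 对偶超图 H* 的关联矩阵:行对应 V* = {e1, e2, e3}
</syntaxhighlight>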
<br />
==Hypergraph coloring==<br />
Classic hypergraph coloring is assigning one of the colors from set <math>\{1,2,3,\ldots,\lambda\}</math> to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. The minimum number of distinct colors over all proper colorings is called the chromatic number of a hypergraph.<br />
<br />
经典超图着色是把集合<math>\{1,2,3,\ldots,\lambda\}</math>中的某一种颜色赋予超图的每个顶点,使每条超边都至少包含两个颜色不同的顶点;换句话说,不能存在基数至少为2的单色超边。从这个意义上说,它是普通图着色的直接推广。所有合法着色中所用不同颜色数的最小值称为超图的色数。<br />
<br />
Hypergraphs for which there exists a coloring using up to ''k'' colors are referred to as ''k-colorable''. The 2-colorable hypergraphs are exactly the bipartite ones.<br />
允许使用至多''k''种颜色进行着色的超图称为''k-可着色的''。2-可着色的超图恰好就是二分超图。<br />
<br />
There are many generalizations of classic hypergraph coloring. One of them is the so-called mixed hypergraph coloring, when monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, then the minimum and maximum number of used colors are called the lower and upper chromatic numbers respectively. See http://spectrum.troy.edu/voloshin/mh.html for details.<br />
<br />
经典超图着色有许多推广,其中之一是所谓的混合超图着色,它允许存在单色边。有些混合超图无论使用多少种颜色都不可着色,而关于不可着色性的一般判据目前尚不清楚。当一个混合超图可着色时,所用颜色数的最小值和最大值分别称为下色数和上色数。详情请参阅 http://spectrum.troy.edu/voloshin/mh.html 。<br />
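作为示意,下面的代码检验一个顶点着色是否满足上述经典超图着色的要求,即每条基数至少为2的超边都包含两种不同颜色;表示方式为演示用的假设。<br />
<syntaxhighlight lang="python">
def is_proper_coloring(edges, color):
    """每条基数≥2的超边都含有至少两种颜色时返回 True。"""
    return all(len({color[v] for v in e}) >= 2
               for e in edges if len(e) >= 2)

E = [{1, 2, 3}, {3, 4}]
print(is_proper_coloring(E, {1: 1, 2: 2, 3: 1, 4: 2}))  # True
print(is_proper_coloring(E, {1: 1, 2: 1, 3: 1, 4: 2}))  # False:边 {1,2,3} 单色
</syntaxhighlight>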
<br />
==Partitions==<br />
A partition theorem due to E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> states that, for an edge-transitive hypergraph <math>H=(X,E)</math>, there exists a [[partition of a set|partition]]<br />
<br />
:<math>(X_1, X_2,\cdots,X_K)</math><br />
<br />
of the vertex set <math>X</math> such that the subhypergraph <math>H_{X_k}</math> generated by <math>X_k</math> is transitive for each <math>1\le k \le K</math>, and such that<br />
<br />
:<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math><br />
<br />
where <math>r(H)</math> is the rank of ''H''.<br />
<br />
As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.<br />
<br />
<br />
由E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref>提出的一个分区定理表明:对于边传递超图<math>H=(X,E)</math>,顶点集<math>X</math>存在一个[[分区]]<math>(X_1, X_2,\cdots,X_K)</math>,使得对每个<math>1\le k \le K</math>,由<math>X_k</math>生成的子超图<math>H_{X_k}</math>都是传递的,并且<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math>,其中<math>r(H)</math>是''H''的秩。<br />
<br />
作为推论,非顶点传递的边传递超图是可双色的。<br />
<br />
<br />
[[Graph partitioning]] (and in particular, hypergraph partitioning) has many applications to IC design<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> and parallel computing.<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref> Efficient and Scalable [[Graph partition|hypergraph partitioning algorithms]] are also important for processing large scale hypergraphs in machine learning tasks.<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
<br />
[[图分区]](特别是超图分区)在集成电路设计<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> 和并行计算<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref>中有很多应用。在机器学习任务中,高效、可扩展的[[超图分区算法]]对于处理大规模超图也很重要。<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
==Theorems==<br />
Many [[theorem]]s and concepts involving graphs also hold for hypergraphs. [[Ramsey's theorem]] and [[Line graph of a hypergraph]] are typical examples. Some methods for studying symmetries of graphs extend to hypergraphs.<br />
<br />
Two prominent theorems are the [[Erdős–Ko–Rado theorem]] and the [[Kruskal–Katona theorem]] on uniform hypergraphs.<br />
<br />
许多涉及图的定理和概念也适用于超图,典型的例子有[[拉姆西定理]](Ramsey's theorem)和超图的线图。研究图的对称性的一些方法也被扩展到超图。<br />
均匀超图上有[[Erdős-Ko-Rado theorem]]和[[Kruskal-Katona theorem]]两个著名定理。<br />
<br />
==Hypergraph drawing==<br />
[[File:CircuitoDosMallas.png|thumb|This [[circuit diagram]] can be interpreted as a drawing of a hypergraph in which four vertices (depicted as white rectangles and disks) are connected by three hyperedges drawn as trees.]](这个线路图可以解释为一个超图,其中四个顶点(用白色的矩形和圆盘表示)由三个用树表示的超图连接)<br />
<br />
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.<br />
尽管超图比图更难画在纸上,但一些研究者已经研究了超图可视化方法。<br />
<br />
In one possible visual representation for hypergraphs, similar to the standard [[graph drawing]] style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves.<ref>{{citation<br />
| last = Sander | first = G.<br />
| contribution = Layout of directed hypergraphs with orthogonal hyperedges<br />
| pages = 381–386<br />
| publisher = Springer-Verlag<br />
| series = [[Lecture Notes in Computer Science]]<br />
| title = Proc. 11th International Symposium on Graph Drawing (GD 2003)<br />
| contribution-url = http://gdea.informatik.uni-koeln.de/585/1/hypergraph.ps<br />
| volume = 2912<br />
| year = 2003| title-link = International Symposium on Graph Drawing<br />
}}.</ref><ref>{{citation<br />
| last1 = Eschbach | first1 = Thomas<br />
| last2 = Günther | first2 = Wolfgang<br />
| last3 = Becker | first3 = Bernd<br />
| issue = 2<br />
| journal = [[Journal of Graph Algorithms and Applications]]<br />
| pages = 141–157<br />
| title = Orthogonal hypergraph drawing for improved visibility<br />
| url = http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf<br />
| volume = 10<br />
| year = 2006 | doi=10.7155/jgaa.00122}}.</ref> If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as [[simple closed curve]]s that enclose sets of points.<ref>{{citation<br />
| last = Mäkinen | first = Erkki<br />
| doi = 10.1080/00207169008803875<br />
| issue = 3<br />
| journal = International Journal of Computer Mathematics<br />
| pages = 177–185<br />
| title = How to draw a hypergraph<br />
| volume = 34<br />
| year = 1990}}.</ref><ref>{{citation<br />
| last1 = Bertault | first1 = François<br />
| last2 = Eades | first2 = Peter | author2-link = Peter Eades<br />
| contribution = Drawing hypergraphs in the subset standard<br />
| doi = 10.1007/3-540-44541-2_15<br />
| pages = 45–76<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 8th International Symposium on Graph Drawing (GD 2000)<br />
| volume = 1984<br />
| year = 2001| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref><ref>{{citation<br />
| last1 = Naheed Anjum | first1 = Arafat<br />
| last2 = Bressan | first2 = Stéphane<br />
| contribution = Hypergraph Drawing by Force-Directed Placement<br />
| doi = 10.1007/_31<br />
| pages = 387–394<br />
| publisher = Springer International Publishing<br />
| series = Lecture Notes in Computer Science<br />
| title = 28th International Conference on Database and Expert Systems Applications (DEXA 2017)<br />
| volume = 10439<br />
| year = 2017| isbn = <br />
}}.</ref><br />
<br />
其中一种超图的可视化表示法,类似于标准的图的画法:用平面内的曲线来描绘图边,将超图的顶点画成点状、圆盘或盒子,超边则被描绘成以顶点为叶子的树[16][17]。如果顶点表示为点,超边也可以被描绘成连接点集的平滑曲线,或显示为封闭点集的简单闭合曲线[18][19][20]。 <br />
<br />
[[File:Venn's four ellipse construction.svg|thumb|An order-4 Venn diagram, which can be interpreted as a subdivision drawing of a hypergraph with 15 vertices (the 15 colored regions) and 4 hyperedges (the 4 ellipses).]](一个4阶维恩图,可以被解释为一个15个顶点(15个有色区域)和4个超边(4个椭圆)的超图的细分图)<br />
<br />
In another style of hypergraph visualization, the subdivision model of hypergraph drawing,<ref>{{citation<br />
| last1 = Kaufmann | first1 = Michael<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Speckmann | first3 = Bettina | author3-link = Bettina Speckmann<br />
| contribution = Subdivision drawings of hypergraphs<br />
| doi = 10.1007/_39<br />
| pages = 396–407<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 16th International Symposium on Graph Drawing (GD 2008)<br />
| volume = 5417<br />
| year = 2009| title-link = International Symposium on Graph Drawing<br />
| isbn = <br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 2006}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/978-3-642-11805-0_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = 978-3-642-11804-3<br />
| doi-access = free<br />
}}.</ref><br />
<br />
超图可视化的另一种样式,是绘制超图的细分模型[21],平面被细分为区域,每个区域代表超图的一个顶点。超图的超边用这些区域的相邻子集来表示,这些子集可以通过着色、或在它们周围画轮廓来表示,或者兼而有之。<br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
<br />
超图的另一种表示法被称做 PAOH[24],如上图所示,边是连接顶点的垂线,顶点在左边是对齐的。右边的图例显示了边的名称。它是为动态超图设计的,但也可以用于简单的超图。<br />
<br />
==Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
通过扩充一组替换规则于超图,[[图语法]]可以被推广超边上。<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
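
This flattening is just the computation of a transitive closure over the membership relation, as the following minimal Python sketch shows (the dictionary encoding of "edge directly contains ..." and all names are illustrative choices, not from any particular library):<br />
<syntaxhighlight lang="python">
def flatten(member):
    """member: dict mapping each edge name to the set of items (vertices
    or other edge names) it directly contains.  Returns, for every edge,
    everything reachable through the membership relation."""
    closure = {k: set(v) for k, v in member.items()}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for below in closure.values():
            extra = set().union(*(closure.get(m, set()) for m in below))
            if not extra <= below:
                below |= extra
                changed = True
    return closure

# The example from the text: e1 = {a, b}, e2 = {a, e1}.
member = {"e1": {"a", "b"}, "e2": {"a", "e1"}}
print(flatten(member)["e2"])  # {'a', 'e1', 'b'}: b reaches e2 only after flattening
</syntaxhighlight>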
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
<br />
<br />
==超图概念的延伸==<br />
<br />
超图的概念还可以进一步推广,例如允许超图中的一些边指向另一些边。这种推广有两种变体。在第一种变体中,超图的边不仅由一组顶点构成,还可以包含顶点的子集、子集的子集,依此类推以至无穷。本质上,每条边就是树或有向无环图中的一个内部节点,而顶点则是叶节点。这样,超图就是一组具有公共共享节点的树的集合(即某个内部节点或叶子可以出现在多棵不同的树中);反过来,每一组树的集合也都可以理解为这样的广义超图。由于树结构在计算机科学和许多数学分支中被广泛使用,可以说超图也同样自然地出现。例如,这种推广自然地构成项代数的模型:边对应于项,顶点对应于常量或变量。<br />
<br />
<br />
对于这样的超图,集合的从属关系提供了一种排序,但这种排序既不是偏序也不是预序,因为它不具有传递性。与这种推广的Levi图相对应的图是有向无环图。例如,考虑顶点集为<math>V= \{a,b\}</math>、边为<math>e_1=\{a,b\}</math>和<math>e_2=\{a,e_1\}</math>的广义超图。此时虽然<math>b\in e_1</math>且<math>e_1\in e_2</math>,却不能由此得出<math>b\in e_2</math>。然而,这类超图上集合从属关系的传递闭包确实诱导出一个偏序,并将超图“展平”为一个偏序集。<br />
<br />
<br />
在第二种变体中,超图的边可以指向其他边,而不必满足边须构成有向无环图那样的排序要求。这允许存在带有边循环的图,它们甚至不需要包含任何顶点。例如,考虑由两条边<math>e_1</math>和<math>e_2</math>组成、顶点个数为零的广义超图,其中<math>e_1 = \{e_2\}</math>且<math>e_2 = \{e_1\}</math>。由于这个循环是无限递归的,这些作为边的集合违反了基础公理。特别地,这样的超图不存在集合从属关系的传递闭包。这类结构乍看起来可能很奇怪,但只要注意到其Levi图的等价推广不再是二分图,而只是一般的有向图,就很容易理解了。<br />
<br />
根据定义,这种超图的广义关联矩阵是一个方阵,其秩等于节点和边的总数。因此,对于上面的示例,关联矩阵为:<br />
<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right]</math>。<br />
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)|regularization]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Yue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large-scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
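
As a concrete illustration of the spectral approach, the following sketch builds the normalized hypergraph Laplacian in the form proposed by Zhou et al. (cited above), <math>\Delta = I - D_v^{-1/2} H W D_e^{-1} H^\mathsf{T} D_v^{-1/2}</math>, on a toy hypergraph and embeds the vertices with its low eigenvectors; the toy data and all variable names are our own:<br />
<syntaxhighlight lang="python">
import numpy as np

# Toy hypergraph: two tight groups {0,1,2} and {3,4,5} joined by a
# single "bridge" hyperedge {2,3}.
edges = [{0, 1, 2}, {1, 2}, {2, 3}, {3, 4, 5}, {4, 5}]
n, m = 6, len(edges)

H = np.zeros((n, m))                  # |V| x |E| incidence matrix
for j, e in enumerate(edges):
    for v in e:
        H[v, j] = 1.0

W = np.eye(m)                         # unit edge weights
Dv = np.diag(H @ W @ np.ones(m))      # (weighted) vertex degrees
De = np.diag(H.sum(axis=0))           # edge degrees (edge sizes)

Dv_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Dv)))
Theta = Dv_inv_sqrt @ H @ W @ np.linalg.inv(De) @ H.T @ Dv_inv_sqrt
L = np.eye(n) - Theta                 # normalized hypergraph Laplacian

vals, vecs = np.linalg.eigh(L)        # eigenpairs in ascending order
# The second eigenvector (Fiedler-like) should take opposite signs on
# the two groups, giving a spectral bipartition of the vertices.
print(np.round(vecs[:, 1], 2))
</syntaxhighlight>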
<br />
<br />
==超图与机器学习==<br />
<br />
超图已被广泛用于机器学习任务中,既作为数据模型,也作为分类器正则化的手段[25]。这些应用包括推荐系统(社团作为超边)[26]、图像检索(相关性作为超边)[27]和生物信息学(生物化学相互作用作为超边)[28]。有代表性的超图学习方法包括:超图谱聚类(用超图Laplacian将谱图理论推广到超图)[29],以及超图半监督学习(通过引入额外的超图结构代价来约束学习结果)。对于大规模超图,还可以使用基于Apache Spark构建的分布式框架[15]。<br />
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''', Springer-Verlag, 1974.<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4<br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the [https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]<br />
<br />
<br />
==编者推荐==<br />
*[https://book.douban.com/subject/1237624/ 《超图-限集的组合学》] by [法]Claude Berge<br />
超图的第一本专著,作者是近代图论之父法国数学家Claude Berge,将图里的普通边拓展为超边,小小的一步拓展却引发了一个大的领域。</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E8%B6%85%E5%9B%BE_Hypergraph&diff=4485超图 Hypergraph2020-04-22T04:58:51Z<p>Pjhhh:</p>
<hr />
<div><br />
<br />
In [[mathematics]], a '''hypergraph''' is a generalization of a [[Graph (discrete mathematics)|graph]] in which an [[graph theory|edge]] can join any number of [[vertex (graph theory)|vertices]]. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, a hypergraph <math>H</math> is a pair <math>H = (X,E)</math> where <math>X</math> is a set of elements called ''nodes'' or ''vertices'', and <math>E</math> is a set of non-empty subsets of <math>X</math> called ''[[hyperedges]]'' or ''edges''. Therefore, <math>E</math> is a subset of <math>\mathcal{P}(X) \setminus\{\emptyset\}</math>, where <math>\mathcal{P}(X)</math> is the [[power set]] of <math>X</math>. The size of the vertex set is called the ''order of the hypergraph'', and the size of the edge set is the ''size of the hypergraph''. <br />
<br />
While graph edges are 2-element subsets of nodes, hyperedges are arbitrary sets of nodes, and can therefore contain an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a ''k-uniform hypergraph'' is a hypergraph such that all its hyperedges have size ''k''. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting ''k'' nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. A hypergraph is also called a ''set system'' or a ''[[family of sets]]'' drawn from the [[universal set]]. <br />
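
These definitions are easy to make concrete. The following minimal Python sketch (the class and method names are illustrative, not from any particular library) stores a hypergraph as a vertex set together with a family of non-empty vertex subsets, and computes the order, the size, and a ''k''-uniformity check:<br />
<syntaxhighlight lang="python">
from itertools import combinations

class Hypergraph:
    """A hypergraph H = (X, E): X a vertex set, E non-empty subsets of X."""
    def __init__(self, vertices, edges):
        self.X = frozenset(vertices)
        self.E = [frozenset(e) for e in edges]
        assert all(e and e <= self.X for e in self.E), \
            "every edge must be a non-empty subset of X"

    def order(self):   # |X|, the order of the hypergraph
        return len(self.X)

    def size(self):    # |E|, the size of the hypergraph
        return len(self.E)

    def is_k_uniform(self, k):   # every hyperedge has exactly k vertices
        return all(len(e) == k for e in self.E)

# A 3-uniform hypergraph on 4 vertices: all 3-element subsets as edges.
H = Hypergraph("abcd", combinations("abcd", 3))
print(H.order(), H.size(), H.is_k_uniform(3))   # 4 4 True
</syntaxhighlight>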
<br />
Hypergraphs can be viewed as [[incidence structure]]s. In particular, there is a bipartite "incidence graph" or "[[Levi graph]]" corresponding to every hypergraph, and conversely, most, but not all, [[bipartite graph]]s can be regarded as incidence graphs of hypergraphs.<br />
<br />
Hypergraphs have many other names. In [[computational geometry]], a hypergraph may sometimes be called a '''range space''' and then the hyperedges are called ''ranges''.<ref>{{citation<br />
| last1 = Haussler | first1 = David | author1-link = David Haussler<br />
| last2 = Welzl | first2 = Emo | author2-link = Emo Welzl<br />
| doi = 10.1007/BF02187876<br />
| issue = 2<br />
| journal = [[Discrete and Computational Geometry]]<br />
| mr = 884223<br />
| pages = 127–151<br />
| title = ε-nets and simplex range queries<br />
| volume = 2<br />
| year = 1987| doi-access = free<br />
}}.</ref><br />
In [[cooperative game]] theory, hypergraphs are called '''simple games''' (voting games); this notion is applied to solve problems in [[social choice theory]]. In some literature edges are referred to as ''hyperlinks'' or ''connectors''.<ref>Judea Pearl, in ''HEURISTICS Intelligent Search Strategies for Computer Problem Solving'', Addison Wesley (1984), p. 25.</ref><br />
<br />
Special kinds of hypergraphs include: [[#Symmetric hypergraphs|''k''-uniform ones]], as discussed briefly above; [[clutter (mathematics)|clutter]]s, where no edge appears as a subset of another edge; and [[abstract simplicial complex]]es, which contain all subsets of every edge.<br />
<br />
The collection of hypergraphs is a [[Category (mathematics)|category]] with hypergraph [[homomorphism]]s as [[morphism]]s.<br />
<br />
==Terminology==<br />
<br />
==== Definitions ====<br />
There are different types of hypergraphs such as:<br />
* ''Empty hypergraph'': a hypergraph with no edges. <br />
* ''Non-simple (or multiple) hypergraph'': a hypergraph allowing loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.<br />
* ''Simple hypergraph'': a hypergraph that contains no loops and no repeated edges.<br />
* ''<math>k </math>-uniform hypergraph'': a hypergraph where each edge contains precisely <math>k</math> vertices.<br />
* ''<math>d </math>-regular hypergraph'': a hypergraph where every vertex has degree <math>d </math>.<br />
* ''Acyclic hypergraph'': a hypergraph that does not contain any cycles.<br />
<br />
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called ''subhypergraphs'', ''partial hypergraphs'' and ''section hypergraphs''.<br />
<br />
Let <math>H=(X,E)</math> be the hypergraph consisting of vertices<br />
<br />
:<math>X = \lbrace x_i | i \in I_v \rbrace,</math><br />
<br />
and having ''edge set''<br />
<br />
:<math>E = \lbrace e_i | i\in I_e \land e_i \subseteq X \land e_i \neq \emptyset \rbrace,</math><br />
<br />
where <math>I_v</math> and <math>I_e</math> are the [[index set]]s of the vertices and edges respectively.<br />
<br />
A ''subhypergraph'' is a hypergraph with some vertices removed. Formally, the subhypergraph <math>H_A</math> induced by <math>A \subseteq X </math> is defined as<br />
<br />
:<math>H_A=\left(A, \lbrace e \cap A | e \in E \land<br />
e \cap A \neq \emptyset \rbrace \right).</math><br />
<br />
An ''extension'' of a ''subhypergraph'' is a hypergraph where each hyperedge of <math>H</math> which is partially contained in the subhypergraph <math>H_A</math> is fully contained in the extension <math>Ex(H_A)</math>.<br />
Formally<br />
:<math>Ex(H_A) = (A \cup A', E' )</math> with <math>A' = \bigcup_{e \in E} e \setminus A</math> and <math>E' = \lbrace e \in E | e \subseteq (A \cup A') \rbrace</math>.<br />
<br />
The ''partial hypergraph'' is a hypergraph with some edges removed. Given a subset <math>J \subset I_e</math> of the edge index set, the partial hypergraph generated by <math>J</math> is the hypergraph<br />
<br />
:<math>\left(X, \lbrace e_i | i\in J \rbrace \right).</math><br />
<br />
Given a subset <math>A\subseteq X</math>, the ''section hypergraph'' is the partial hypergraph<br />
<br />
:<math>H \times A = \left(A, \lbrace e_i | <br />
i\in I_e \land e_i \subseteq A \rbrace \right).</math><br />
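
The three constructions above amount to simple set operations, as the following illustrative Python sketch shows (the function names and the plain (X, E) encoding are choices made only for this example):<br />
<syntaxhighlight lang="python">
def subhypergraph(X, E, A):
    """Subhypergraph induced by A: restrict each edge to A,
    keeping only the non-empty intersections."""
    A = frozenset(A)
    return A, [e & A for e in E if e & A]

def partial_hypergraph(X, E, J):
    """Partial hypergraph: keep only the edges whose index lies in J."""
    return frozenset(X), [E[i] for i in J]

def section_hypergraph(X, E, A):
    """Section hypergraph H x A: keep the edges fully contained in A."""
    A = frozenset(A)
    return A, [e for e in E if e <= A]

X = frozenset("abcd")
E = [frozenset("ab"), frozenset("bcd"), frozenset("cd")]
print(subhypergraph(X, E, "bc"))       # edges {b}, {b,c}, {c}
print(partial_hypergraph(X, E, [0]))   # only the edge {a,b}
print(section_hypergraph(X, E, "bcd")) # edges {b,c,d} and {c,d}
</syntaxhighlight>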
<br />
The '''dual''' <math>H^*</math> of <math>H</math> is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by <math>\lbrace e_i \rbrace</math> and whose edges are given by <math>\lbrace X_m \rbrace</math> where<br />
<br />
:<math>X_m = \lbrace e_i | x_m \in e_i \rbrace. </math><br />
<br />
When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an [[involution (mathematics)|involution]], i.e.,<br />
<br />
:<math>\left(H^*\right)^* = H.</math><br />
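
Concretely, dualization just inverts the membership relation, and applying it twice recovers the original hypergraph. A small illustrative sketch, with edges stored in a hypothetical name-to-set dictionary of our own design:<br />
<syntaxhighlight lang="python">
def dual(edges):
    """edges: dict mapping an edge name to a frozenset of vertices.
    Returns the dual: each vertex becomes an edge whose elements are
    the names of the original edges containing it."""
    star = {}
    for name, e in edges.items():
        for v in e:
            star.setdefault(v, set()).add(name)
    return {v: frozenset(s) for v, s in star.items()}

H = {"e1": frozenset("ab"), "e2": frozenset("bc")}
print(dual(H))             # {'a': {'e1'}, 'b': {'e1','e2'}, 'c': {'e2'}}
print(dual(dual(H)) == H)  # True: taking the dual twice recovers H
</syntaxhighlight>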
<br />
A [[connected graph]] ''G'' with the same vertex set as a connected hypergraph ''H'' is a '''host graph''' for ''H'' if every hyperedge of ''H'' [[induced subgraph|induces]] a connected subgraph in ''G''. For a disconnected hypergraph ''H'', ''G'' is a host graph if there is a bijection between the [[connected component (graph theory)|connected components]] of ''G'' and of ''H'', such that each connected component ''G<nowiki>'</nowiki>'' of ''G'' is a host of the corresponding ''H<nowiki>'</nowiki>''.<br />
<br />
A hypergraph is ''bipartite'' if and only if its vertices can be partitioned into two classes ''U'' and ''V'' in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. Alternatively, such a hypergraph is said to have [[Property B]].<br />
<br />
The '''2-section''' (or '''clique graph''', '''representing graph''', '''primal graph''', '''Gaifman graph''') of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.<br />
<br />
==Bipartite graph model==<br />
A hypergraph ''H'' may be represented by a [[bipartite graph]] ''BG'' as follows: the sets ''X'' and ''E'' are the partitions of ''BG'', and (''x<sub>1</sub>'', ''e<sub>1</sub>'') are connected with an edge if and only if vertex ''x<sub>1</sub>'' is contained in edge ''e<sub>1</sub>'' in ''H''. Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called [[incidence graph]].<br />
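
A minimal sketch of this construction (encoding the bipartite graph as a set of vertex–edge pairs is an illustrative choice of our own):<br />
<syntaxhighlight lang="python">
def incidence_graph(edges):
    """edges: dict mapping an edge name to a set of vertices.  The
    incidence (Levi) graph is bipartite: one part holds the vertices,
    the other the edge names, and (x, e) is a graph edge iff x lies in e."""
    return {(v, name) for name, e in edges.items() for v in e}

H = {"e1": {"a", "b"}, "e2": {"b", "c"}}
print(sorted(incidence_graph(H)))
# [('a', 'e1'), ('b', 'e1'), ('b', 'e2'), ('c', 'e2')]
</syntaxhighlight>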
<br />
==Acyclicity==<br />
In contrast with ordinary undirected graphs for which there is a single natural notion of [[cycle (graph theory)|cycles]] and [[Forest (graph theory)|acyclic graphs]], there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.<br />
<br />
A first definition of acyclicity for hypergraphs was given by [[Claude Berge]]:<ref>[[Claude Berge]], ''Graphs and Hypergraphs''</ref> a hypergraph is Berge-acyclic if its [[incidence graph]] (the [[bipartite graph]] defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair <math>v \neq v'</math> of vertices and some pair <math>f \neq f'</math> of hyperedges such that <math>v, v' \in f</math> and <math>v, v' \in f'</math>, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in [[linear time]] by an exploration of the incidence graph.<br />
<br />
We can define a weaker notion of hypergraph acyclicity,<ref>C. Beeri, [[Ronald Fagin|R. Fagin]], D. Maier, [[Mihalis Yannakakis|M. Yannakakis]], ''On the Desirability of Acyclic Database Schemes''</ref> later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being [[chordal graph|chordal]]; it is also equivalent to reducibility to the empty graph through the GYO algorithm<ref>C. T. Yu and M. Z. Özsoyoğlu. ''[https://www.computer.org/csdl/proceedings/cmpsac/1979/9999/00/00762509.pdf An algorithm for tree-query membership of a distributed query]''. In Proc. IEEE COMPSAC, pages 306-312, 1979</ref><ref name="graham1979universal">M. H. Graham. ''On the universal relation''. Technical Report, University of Toronto, Toronto, Ontario, Canada, 1979</ref> (also known as Graham's algorithm), a [[confluence (abstract rewriting)|confluent]] iterative process which removes hyperedges using a generalized definition of [[ear (graph theory)|ears]]. In the domain of [[database theory]], it is known that a [[database schema]] enjoys certain desirable properties if its underlying hypergraph is α-acyclic.<ref>[[Serge Abiteboul|S. Abiteboul]], [[Richard B. Hull|R. B. Hull]], [[Victor Vianu|V. Vianu]], ''Foundations of Databases''</ref> Besides, α-acyclicity is also related to the expressiveness of the [[guarded fragment]] of [[first-order logic]].<br />
<br />
We can test in [[linear time]] if a hypergraph is α-acyclic.<ref>[[Robert Tarjan|R. E. Tarjan]], [[Mihalis Yannakakis|M. Yannakakis]]. ''Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs''. SIAM J. on Computing, 13(3):566-579, 1984.</ref><br />
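
The reduction itself is easy to state in code. The following sketch implements the GYO reduction described above in a straightforward quadratic way; it is not the linear-time algorithm of Tarjan and Yannakakis, and all names are illustrative:<br />
<syntaxhighlight lang="python">
from collections import Counter

def is_alpha_acyclic(edge_list):
    """GYO reduction: a hypergraph is alpha-acyclic iff repeatedly
    (1) deleting vertices that lie in exactly one hyperedge and
    (2) deleting hyperedges contained in other hyperedges
    reduces it to the empty hypergraph."""
    edges = [set(e) for e in edge_list]
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        # (1) drop "ear" vertices occurring in exactly one edge
        count = Counter(v for e in edges for v in e)
        for e in edges:
            lonely = {v for v in e if count[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # (2) drop empty edges and edges contained in another edge
        # (for equal edges, keep only the first occurrence)
        kept = []
        for i, e in enumerate(edges):
            redundant = (not e) or any(
                e < f or (e == f and j < i)
                for j, f in enumerate(edges) if j != i)
            if redundant:
                changed = True
            else:
                kept.append(e)
        edges = kept
    return not edges

# {a,b},{b,c},{a,b,c} is alpha-acyclic; the triangle {a,b},{b,c},{c,a}
# (with no covering edge) is not.
print(is_alpha_acyclic([{"a","b"}, {"b","c"}, {"a","b","c"}]))  # True
print(is_alpha_acyclic([{"a","b"}, {"b","c"}, {"c","a"}]))      # False
</syntaxhighlight>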
<br />
Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, [[Ronald Fagin]]<ref name="fagin1983degrees">[[Ronald Fagin]], ''Degrees of Acyclicity for Hypergraphs and Relational Database Schemes''</ref> defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent<ref name="fagin1983degrees"/> to an earlier definition by Graham.<ref name="graham1979universal"/> The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to [[Bachman diagram]]s. Both β-acyclicity and γ-acyclicity can be tested in [[PTIME|polynomial time]].<br />
<br />
Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.<ref name="fagin1983degrees" /><br />
<br />
==Isomorphism and equality==<br />
A hypergraph [[homomorphism]] is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.<br />
<br />
A hypergraph <math>H=(X,E)</math> is ''isomorphic'' to a hypergraph <math>G=(Y,F)</math>, written as <math>H \simeq G</math> if there exists a [[bijection]]<br />
<br />
:<math>\phi:X \to Y</math><br />
<br />
and a [[permutation]] <math>\pi</math> of the edge index set <math>I_e</math> such that<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
The bijection <math>\phi</math> is then called the [[isomorphism]] of the graphs. Note that<br />
<br />
:<math>H \simeq G</math> if and only if <math>H^* \simeq G^*</math>.<br />
<br />
When the edges of a hypergraph are explicitly labeled, one has the additional notion of ''strong isomorphism''. One says that <math>H</math> is ''strongly isomorphic'' to <math>G</math> if the permutation is the identity. One then writes <math>H \cong G</math>. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.<br />
<br />
When the vertices of a hypergraph are explicitly labeled, one has the notions of ''equivalence'', and also of ''equality''. One says that <math>H</math> is ''equivalent'' to <math>G</math>, and writes <math>H\equiv G</math> if the isomorphism <math>\phi</math> has<br />
<br />
:<math>\phi(x_n) = y_n</math><br />
<br />
and<br />
<br />
:<math>\phi(e_i) = f_{\pi(i)}</math><br />
<br />
Note that<br />
<br />
:<math>H\equiv G</math> if and only if <math>H^* \cong G^*</math><br />
<br />
If, in addition, the permutation <math>\pi</math> is the identity, one says that <math>H</math> equals <math>G</math>, and writes <math>H=G</math>. Note that, with this definition of equality, graphs are self-dual:<br />
<br />
:<math>\left(H^*\right) ^* = H</math><br />
<br />
A hypergraph [[automorphism]] is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph ''H'' (= (''X'',&nbsp;''E'')) is a [[group (mathematics)|group]] under composition, called the [[automorphism group]] of the hypergraph and written Aut(''H'').<br />
<br />
===Examples===<br />
Consider the hypergraph <math>H</math> with edges<br />
:<math>H = \lbrace<br />
e_1 = \lbrace a,b \rbrace,<br />
e_2 = \lbrace b,c \rbrace,<br />
e_3 = \lbrace c,d \rbrace,<br />
e_4 = \lbrace d,a \rbrace,<br />
e_5 = \lbrace b,d \rbrace,<br />
e_6 = \lbrace a,c \rbrace<br />
\rbrace</math><br />
and<br />
:<math>G = \lbrace<br />
f_1 = \lbrace \alpha,\beta \rbrace,<br />
f_2 = \lbrace \beta,\gamma \rbrace,<br />
f_3 = \lbrace \gamma,\delta \rbrace,<br />
f_4 = \lbrace \delta,\alpha \rbrace,<br />
f_5 = \lbrace \alpha,\gamma \rbrace,<br />
f_6 = \lbrace \beta,\delta \rbrace<br />
\rbrace</math><br />
<br />
Then clearly <math>H</math> and <math>G</math> are isomorphic (with <math>\phi(a)=\alpha</math>, ''etc.''), but they are not strongly isomorphic. So, for example, in <math>H</math>, vertex <math>a</math> meets edges 1, 4 and 6, so that,<br />
<br />
:<math>e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace</math><br />
<br />
In graph <math>G</math>, there does not exist any vertex that meets edges 1, 4 and 6:<br />
<br />
:<math>f_1 \cap f_4 \cap f_6 = \varnothing</math><br />
<br />
In this example, <math>H</math> and <math>G</math> are equivalent, <math>H\equiv G</math>, and the duals are strongly isomorphic: <math>H^*\cong G^*</math>.<br />
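
Both claims can be checked mechanically by brute force over all vertex bijections, as in the following illustrative sketch (feasible here because each vertex set has only four elements; the function names are our own):<br />
<syntaxhighlight lang="python">
from itertools import permutations

# H and G are the hypergraphs from the example above; both edge lists
# are ordered e_1, ..., e_6 and f_1, ..., f_6.
H_v = list("abcd")
H_e = [frozenset("ab"), frozenset("bc"), frozenset("cd"),
       frozenset("da"), frozenset("bd"), frozenset("ac")]
G_v = list("αβγδ")
G_e = [frozenset("αβ"), frozenset("βγ"), frozenset("γδ"),
       frozenset("δα"), frozenset("αγ"), frozenset("βδ")]

def isomorphic(Hv, He, Gv, Ge):
    """Some vertex bijection maps the edge set of H onto that of G."""
    return any({frozenset(dict(zip(Hv, p))[v] for v in e) for e in He}
               == set(Ge)
               for p in permutations(Gv))

def strongly_isomorphic(Hv, He, Gv, Ge):
    """Some vertex bijection maps edge e_i onto f_i for every i,
    i.e. the edge permutation is the identity."""
    return any(all(frozenset(dict(zip(Hv, p))[v] for v in e) == f
                   for e, f in zip(He, Ge))
               for p in permutations(Gv))

print(isomorphic(H_v, H_e, G_v, G_e))           # True
print(strongly_isomorphic(H_v, H_e, G_v, G_e))  # False
</syntaxhighlight>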
<br />
==Symmetric hypergraphs==<br />
The ''rank'' <math>r(H)</math> of a hypergraph <math>H</math> is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality ''k'', the hypergraph is said to be ''uniform'' or ''k-uniform'', or is called a ''k-hypergraph''. A graph is just a 2-uniform hypergraph.<br />
<br />
The degree ''d(v)'' of a vertex ''v'' is the number of edges that contain it. ''H'' is ''k-regular'' if every vertex has degree ''k''.<br />
<br />
The dual of a uniform hypergraph is regular and vice versa.<br />
<br />
Two vertices ''x'' and ''y'' of ''H'' are called ''symmetric'' if there exists an automorphism such that <math>\phi(x)=y</math>. Two edges <math>e_i</math> and <math>e_j</math> are said to be ''symmetric'' if there exists an automorphism such that <math>\phi(e_i)=e_j</math>.<br />
<br />
A hypergraph is said to be ''vertex-transitive'' (or ''vertex-symmetric'') if all of its vertices are symmetric. Similarly, a hypergraph is ''edge-transitive'' if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply ''transitive''.<br />
<br />
Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.<br />
<br />
==Transversals==<br />
A ''[[Transversal (combinatorics)|transversal]]'' (or "[[hitting set]]") of a hypergraph ''H'' = (''X'', ''E'') is a set <math>T\subseteq X</math> that has nonempty [[intersection (set theory)|intersection]] with every edge. A transversal ''T'' is called ''minimal'' if no proper subset of ''T'' is a transversal. The ''transversal hypergraph'' of ''H'' is the hypergraph (''X'', ''F'') whose edge set ''F'' consists of all minimal transversals of ''H''.<br />
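
For small hypergraphs, the minimal transversals can be enumerated by brute force, as in this illustrative sketch (the function name and encoding are our own; the general enumeration problem is much harder):<br />
<syntaxhighlight lang="python">
from itertools import combinations

def minimal_transversals(X, E):
    """All inclusion-minimal vertex sets meeting every edge of (X, E)."""
    X = sorted(X)
    transversals = [frozenset(c)
                    for r in range(len(X) + 1)
                    for c in combinations(X, r)
                    if all(frozenset(c) & e for e in E)]
    # keep only the sets with no strictly smaller transversal inside them
    return [T for T in transversals
            if not any(S < T for S in transversals)]

E = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
print(minimal_transversals("abc", E))
# the minimal transversals of a triangle: {a,b}, {a,c}, {b,c}
</syntaxhighlight>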
<br />
Computing the transversal hypergraph has applications in [[combinatorial optimization]], in [[game theory]], and in several fields of [[computer science]] such as [[machine learning]], [[Index (database)|indexing of database]]s, the [[Boolean satisfiability problem|satisfiability problem]], [[data mining]], and computer [[program optimization]].<br />
<br />
==Incidence matrix==<br />
Let <math>V = \{v_1, v_2, ~\ldots, ~ v_n\}</math> and <math>E = \{e_1, e_2, ~ \ldots ~ e_m\}</math>. Every hypergraph has an <math>n \times m</math> [[incidence matrix]] <math>A = (a_{ij})</math> where<br />
:<math>a_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.</math><br />
The [[transpose]] <math>A^t</math> of the [[incidence (geometry)|incidence]] matrix defines a hypergraph <math>H^* = (V^*,\ E^*)</math> called the '''dual''' of <math>H</math>, where <math>V^*</math> is an ''m''-element set and <math>E^*</math> is an ''n''-element set of subsets of <math>V^*</math>. For <math>v^*_j \in V^*</math> and <math>e^*_i \in E^*, ~ v^*_j \in e^*_i</math> [[if and only if]] <math>a_{ij} = 1</math>.<br />
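
A short illustrative sketch of the incidence matrix and of reading the dual off its transpose (toy data and names are our own):<br />
<syntaxhighlight lang="python">
import numpy as np

V = ["v1", "v2", "v3"]
E = [{"v1", "v2"}, {"v2", "v3"}]

# a_ij = 1 iff vertex v_i lies in edge e_j
A = np.array([[1 if v in e else 0 for e in E] for v in V])
print(A)        # [[1 0], [1 1], [0 1]]

# Row i of the transpose is indexed by edge e_i of H; column j lists,
# for vertex v_j, the edges of H containing it -- exactly the j-th
# edge of the dual hypergraph H*.
print(A.T)      # [[1 1 0], [0 1 1]]
</syntaxhighlight>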
<br />
==Hypergraph coloring==<br />
Classic hypergraph coloring is the assignment of one of the colors from a set <math>\{1,2,3,\ldots,\lambda\}</math> to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. The minimum number of distinct colors used over all such colorings is called the chromatic number of the hypergraph.<br />
<br />
Hypergraphs for which there exists a coloring using up to ''k'' colors are referred to as ''k-colorable''. The 2-colorable hypergraphs are exactly the bipartite ones.<br />
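
For small instances, 2-colorability (Property B) can be tested by exhausting all 2-colorings, as in this illustrative sketch (names are our own):<br />
<syntaxhighlight lang="python">
from itertools import product

def is_two_colorable(X, E):
    """Try every 2-coloring; require that no hyperedge with at least
    two vertices is monochromatic (Property B)."""
    X = sorted(X)
    for colors in product((0, 1), repeat=len(X)):
        c = dict(zip(X, colors))
        if all(len({c[v] for v in e}) >= 2 for e in E if len(e) >= 2):
            return True
    return False

triangle = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
print(is_two_colorable("abc", triangle))            # False: not bipartite
print(is_two_colorable("abc", [frozenset("abc")]))  # True
</syntaxhighlight>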
<br />
There are many generalizations of classic hypergraph coloring. One of them is the so-called mixed hypergraph coloring, when monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, then the minimum and maximum number of used colors are called the lower and upper chromatic numbers respectively. See http://spectrum.troy.edu/voloshin/mh.html for details.<br />
<br />
==Partitions==<br />
A partition theorem due to E. Dauber<ref>E. Dauber, in ''Graph theory'', ed. F. Harary, Addison Wesley, (1969) p. 172.</ref> states that, for an edge-transitive hypergraph <math>H=(X,E)</math>, there exists a [[partition of a set|partition]]<br />
<br />
:<math>(X_1, X_2,\cdots,X_K)</math><br />
<br />
of the vertex set <math>X</math> such that the subhypergraph <math>H_{X_k}</math> generated by <math>X_k</math> is transitive for each <math>1\le k \le K</math>, and such that<br />
<br />
:<math>\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)</math><br />
<br />
where <math>r(H)</math> is the rank of ''H''.<br />
<br />
As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.<br />
<br />
[[Graph partitioning]] (and in particular, hypergraph partitioning) has many applications to IC design<ref>{{Citation |title=Multilevel hypergraph partitioning: applications in VLSI domain |author=Karypis, G., Aggarwal, R., Kumar, V., and Shekhar, S. |journal=IEEE Transactions on Very Large Scale Integration (VLSI) Systems |date=March 1999 |volume=7 |issue=1 |pages=69–79 |doi=10.1109/92.748202 |postscript=.|citeseerx=10.1.1.553.2367 }}</ref> and parallel computing.<ref>{{Citation |doi=10.1016/S0167-8191(00)00048-X |title=Graph partitioning models for parallel computing |author= Hendrickson, B., Kolda, T.G. |journal=Parallel Computing | year=2000 |volume=26 |issue=12 |pages=1519–1545 |postscript=.|url=https://digital.library.unt.edu/ark:/67531/metadc684945/ |type=Submitted manuscript }}</ref><ref>{{Cite conference |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=A Hypergraph Model for Mapping Repeated Sparse Matrix-Vector Product Computations onto Multicomputers |conference=Proc. International Conference on Hi Performance Computing (HiPC'95) |year=1995}}</ref><ref>{{Citation |last1=Catalyurek |first1=U.V. |last2=Aykanat |first2=C. |title=Hypergraph-Partitioning Based Decomposition for Parallel Sparse-Matrix Vector Multiplication |journal=IEEE Transactions on Parallel and Distributed Systems |volume=10 |issue=7 |pages=673–693 |year=1999|doi=10.1109/71.780863 |postscript=. |citeseerx=10.1.1.67.2498 }}</ref> Efficient and Scalable [[Graph partition|hypergraph partitioning algorithms]] are also important for processing large scale hypergraphs in machine learning tasks.<ref name=hyperx>{{citation|last1=Huang|first1=Jin|last2=Zhang|first2=Rui|last3=Yu|first3=Jeffrey Xu|journal=Proceedings of the IEEE International Conference on Data Mining|title=Scalable Hypergraph Learning and Processing|year=2015}}</ref><br />
<br />
==Theorems==<br />
Many [[theorem]]s and concepts involving graphs also hold for hypergraphs. [[Ramsey's theorem]] and [[Line graph of a hypergraph]] are typical examples. Some methods for studying symmetries of graphs extend to hypergraphs.<br />
<br />
Two prominent theorems are the [[Erdős–Ko–Rado theorem]] and the [[Kruskal–Katona theorem]] on uniform hypergraphs.<br />
<br />
==Hypergraph drawing==<br />
[[File:CircuitoDosMallas.png|thumb|This [[circuit diagram]] can be interpreted as a drawing of a hypergraph in which four vertices (depicted as white rectangles and disks) are connected by three hyperedges drawn as trees.]]<br />
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.<br />
<br />
In one possible visual representation for hypergraphs, similar to the standard [[graph drawing]] style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves.<ref>{{citation<br />
| last = Sander | first = G.<br />
| contribution = Layout of directed hypergraphs with orthogonal hyperedges<br />
| pages = 381–386<br />
| publisher = Springer-Verlag<br />
| series = [[Lecture Notes in Computer Science]]<br />
| title = Proc. 11th International Symposium on Graph Drawing (GD 2003)<br />
| contribution-url = http://gdea.informatik.uni-koeln.de/585/1/hypergraph.ps<br />
| volume = 2912<br />
| year = 2003| title-link = International Symposium on Graph Drawing<br />
}}.</ref><ref>{{citation<br />
| last1 = Eschbach | first1 = Thomas<br />
| last2 = Günther | first2 = Wolfgang<br />
| last3 = Becker | first3 = Bernd<br />
| issue = 2<br />
| journal = [[Journal of Graph Algorithms and Applications]]<br />
| pages = 141–157<br />
| title = Orthogonal hypergraph drawing for improved visibility<br />
| url = http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf<br />
| volume = 10<br />
| year = 2006 | doi=10.7155/jgaa.00122}}.</ref> If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as [[simple closed curve]]s that enclose sets of points.<ref>{{citation<br />
| last = Mäkinen | first = Erkki<br />
| doi = 10.1080/00207169008803875<br />
| issue = 3<br />
| journal = International Journal of Computer Mathematics<br />
| pages = 177–185<br />
| title = How to draw a hypergraph<br />
| volume = 34<br />
| year = 1990}}.</ref><ref>{{citation<br />
| last1 = Bertault | first1 = François<br />
| last2 = Eades | first2 = Peter | author2-link = Peter Eades<br />
| contribution = Drawing hypergraphs in the subset standard<br />
| doi = 10.1007/3-540-44541-2_15<br />
| pages = 45–76<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 8th International Symposium on Graph Drawing (GD 2000)<br />
| volume = 1984<br />
| year = 2001| title-link = International Symposium on Graph Drawing<br />
| isbn = 978-3-540-41554-1<br />
| doi-access = free<br />
}}.</ref><ref>{{citation<br />
| last1 = Naheed Anjum | first1 = Arafat<br />
| last2 = Bressan | first2 = Stéphane<br />
| contribution = Hypergraph Drawing by Force-Directed Placement<br />
| doi = 10.1007/978-3-319-64471-4_31<br />
| pages = 387–394<br />
| publisher = Springer International Publishing<br />
| series = Lecture Notes in Computer Science<br />
| title = 28th International Conference on Database and Expert Systems Applications (DEXA 2017)<br />
| volume = 10439<br />
| year = 2017| isbn = 978-3-319-64470-7<br />
}}.</ref><br />
<br />
[[File:Venn's four ellipse construction.svg|thumb|An order-4 Venn diagram, which can be interpreted as a subdivision drawing of a hypergraph with 15 vertices (the 15 colored regions) and 4 hyperedges (the 4 ellipses).]]<br />
In another style of hypergraph visualization, the subdivision model of hypergraph drawing,<ref>{{citation<br />
| last1 = Kaufmann | first1 = Michael<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Speckmann | first3 = Bettina | author3-link = Bettina Speckmann<br />
| contribution = Subdivision drawings of hypergraphs<br />
| doi = 10.1007/978-3-642-00219-9_39<br />
| pages = 396–407<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 16th International Symposium on Graph Drawing (GD 2008)<br />
| volume = 5417<br />
| year = 2009| title-link = International Symposium on Graph Drawing<br />
| isbn = 978-3-642-00218-2<br />
| doi-access = free<br />
}}.</ref> the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-''n'' [[Venn diagram]], for instance, may be viewed as a subdivision drawing of a hypergraph with ''n'' hyperedges (the curves defining the diagram) and 2<sup>''n''</sup>&nbsp;−&nbsp;1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of [[planar graph]]s, it is [[NP-complete]] to determine whether a hypergraph has a planar subdivision drawing,<ref>{{citation<br />
| last1 = Johnson | first1 = David S. | author1-link = David S. Johnson<br />
| last2 = Pollak | first2 = H. O.<br />
| doi = 10.1002/jgt.3190110306<br />
| issue = 3<br />
| journal = Journal of Graph Theory<br />
| pages = 309–325<br />
| title = Hypergraph planarity and the complexity of drawing Venn diagrams<br />
| volume = 11<br />
| year = 1987}}.</ref> but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.<ref>{{citation<br />
| last1 = Buchin | first1 = Kevin<br />
| last2 = van Kreveld | first2 = Marc<br />
| last3 = Meijer | first3 = Henk<br />
| last4 = Speckmann | first4 = Bettina<br />
| last5 = Verbeek | first5 = Kevin<br />
| contribution = On planar supports for hypergraphs<br />
| doi = 10.1007/978-3-642-11805-0_33<br />
| pages = 345–356<br />
| publisher = Springer-Verlag<br />
| series = Lecture Notes in Computer Science<br />
| title = Proc. 17th International Symposium on Graph Drawing (GD 2009)<br />
| volume = 5849<br />
| year = 2010| title-link = International Symposium on Graph Drawing<br />
| isbn = 978-3-642-11804-3<br />
| doi-access = free<br />
}}.</ref><br />
<br />
An alternative representation of the hypergraph called PAOH<ref name="paoh" /> is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.<br />
<br />
==Hypergraph grammars==<br />
{{main|Hypergraph grammar}}<br />
By augmenting a class of hypergraphs with replacement rules, [[graph grammar]]s can be generalised to allow hyperedges.<br />
<br />
== Generalizations == <br />
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ''ad infinitum''. In essence, every edge is just an internal node of a tree or [[directed acyclic graph]], and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout [[computer science]] and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of [[term algebra]]; edges correspond to [[term (logic)|terms]] and vertices correspond to constants or variables.<br />
<br />
For such a hypergraph, set membership then provides an ordering, but the ordering is neither a [[partial order]] nor a [[preorder]], since it is not transitive. The graph corresponding to the Levi graph of this generalization is a [[directed acyclic graph]]. Consider, for example, the generalized hypergraph whose vertex set is <math>V= \{a,b\}</math> and whose edges are <math>e_1=\{a,b\}</math> and <math>e_2=\{a,e_1\}</math>. Then, although <math>b\in e_1</math> and <math>e_1\in e_2</math>, it is not true that <math>b\in e_2</math>. However, the [[transitive closure]] of set membership for such hypergraphs does induce a [[partial order]], and "flattens" the hypergraph into a [[partially ordered set]].<br />
<br />
Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges <math>e_1</math> and <math>e_2</math>, and zero vertices, so that <math>e_1 = \{e_2\}</math> and <math>e_2 = \{e_1\}</math>. As this loop is infinitely recursive, sets that are the edges violate the [[axiom of foundation]]. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer [[Bipartite graph|bipartite]], but is rather just some general [[directed graph]].<br />
<br />
The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the [[incidence matrix]] is simply<br />
<br />
:<math>\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].</math><br />
<br />
==Hypergraph learning== <br />
<br />
Hypergraphs have been extensively used in [[machine learning]] tasks as the data model and classifier [[regularization (mathematics)|regularization]].<ref>{{citation| last1 = Zhou | first1 = Dengyong| last2 = Huang | first2 = Jiayuan | last3=Scholkopf | first3=Bernhard| issue = 2| journal = Advances in Neural Information Processing Systems| pages = 1601–1608| title = Learning with hypergraphs: clustering, classification, and embedding| year = 2006}}</ref> The applications include [[recommender system]] (communities as hyperedges),<ref>{{citation|last1=Tan | first1=Shulong | last2=Bu | first2=Jiajun | last3=Chen | first3=Chun | last4=Xu | first4=Bin | last5=Wang | first5=Can | last6=He | first6=Xiaofei|issue = 1| journal = ACM Transactions on Multimedia Computing, Communications, and Applications| title = Using rich social media information for music recommendation via hypergraph model| year = 2013|url=https://www.researchgate.net/publication/226075153| bibcode=2011smma.book..213T }}</ref> [[image retrieval]] (correlations as hyperedges),<ref>{{citation|last1=Liu | first1=Qingshan | last2=Huang | first2=Yuchi | last3=Metaxas | first3=Dimitris N. |issue = 10–11| journal = Pattern Recognition| title = Hypergraph with sampling for image retrieval| pages=2255–2262| year = 2013| doi=10.1016/j.patcog.2010.07.014 | volume=44}}</ref> and [[bioinformatics]] (biochemical interactions as hyperedges).<ref>{{citation|last1=Patro |first1=Rob | last2=Kingsford | first2=Carl| issue = 10–11| journal = Bioinformatics| title = Predicting protein interactions via parsimonious network history inference| year = 2013| pages=237–246|doi=10.1093/bioinformatics/btt224 |pmid=23812989 |pmc=3694678 | volume=29}}</ref> Representative hypergraph learning techniques include hypergraph [[spectral clustering]] that extends the [[spectral graph theory]] with hypergraph Laplacian,<ref>{{citation|last1=Gao | first1=Yue | last2=Wang | first2=Meng | last3=Zha|first3=Zheng-Jun|last4=Shen|first4=Jialie|last5=Li|first5=Xuelong|last6=Wu|first6=Xindong|issue = 1| journal = IEEE Transactions on Image Processing| volume=22 | title = Visual-textual joint relevance learning for tag-based social image search| year = 2013| pages=363–376|url=http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=2510&context=sis_research | doi=10.1109/tip.2012.2202676| pmid=22692911 | bibcode=2013ITIP...22..363Y }}</ref> and hypergraph [[semi-supervised learning]] that introduces extra hypergraph structural cost to restrict the learning results.<ref>{{citation|last1=Tian|first1=Ze|last2=Hwang|first2=TaeHyun|last3=Kuang|first3=Rui|issue = 21| journal = Bioinformatics| title = A hypergraph-based learning algorithm for classifying gene expression and arrayCGH data with prior knowledge| year = 2009| pages=2831–2838|doi=10.1093/bioinformatics/btp467|pmid=19648139| volume=25|doi-access=free}}</ref> For large-scale hypergraphs, a distributed framework<ref name=hyperx /> built using [[Apache Spark]] is also available.<br />
<br />
==See also==<br />
{{Commons category|Hypergraphs}}<br />
<br />
* [[Simplicial complex]]<br />
<br />
* [[Combinatorial design]]<br />
* [[Factor graph]]<br />
* [[Greedoid]]<br />
* [[Incidence structure]]<br />
* [[Matroid]]<br />
* [[Multigraph]]<br />
* [[P system]]<br />
* [[Sparse matrix-vector multiplication]]<br />
*[[Matching in hypergraphs]]<br />
<br />
==Notes==<br />
{{Reflist}}<br />
<br />
==References==<br />
* Claude Berge, "Hypergraphs: Combinatorics of finite sets". North-Holland, 1989.<br />
* Claude Berge, Dijen Ray-Chaudhuri, "Hypergraph Seminar, Ohio State University 1972", ''Lecture Notes in Mathematics'' '''411''', Springer-Verlag, 1974.<br />
* Hazewinkel, Michiel, ed. (2001) [1994], "Hypergraph", [https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics Encyclopedia of Mathematics], Springer Science+Business Media B.V. / Kluwer Academic Publishers, ISBN 978-1-55608-010-4<br />
* Alain Bretto, "Hypergraph Theory: an Introduction", Springer, 2013.<br />
* Vitaly I. Voloshin. "Coloring Mixed Hypergraphs: Theory, Algorithms and Applications". Fields Institute Monographs, American Mathematical Society, 2002.<br />
* Vitaly I. Voloshin. "Introduction to Graph and Hypergraph Theory". [[Nova Science Publishers, Inc.]], 2009.<br />
* This article incorporates material from hypergraph on PlanetMath, which is licensed under the [https://en.wikipedia.org/wiki/Wikipedia:CC-BY-SA Creative Commons Attribution/Share-Alike License].<br />
<br />
==External links==<br />
* [https://www.aviz.fr/paohvis PAOHVis]: open-source PAOHVis system for visualizing dynamic hypergraphs.<br />
<br />
{{Graph representations}}<br />
<br />
[[Category:Hypergraphs| ]]<br />
<br />
[[de:Graph (Graphentheorie)#Hypergraph]]</div>Pjhhhhttps://wiki.swarma.org/index.php?title=%E7%94%A8%E6%88%B7:Pjhhh&diff=1048用户:Pjhhh2020-03-26T13:19:02Z<p>Pjhhh:</p>
<hr />
<div>== '''Hi,我是瑾晗''' ==<br />
<br />
*'''姓名:'''彭瑾晗<br />
*'''性别:'''男<br />
*'''当前就读:'''中国民航大学空中交通管理学院研究生在读,本科也曾就读于中国民航大学空中交通管理学院<br />
*'''主要研究内容:'''空中交通流量管理、交通运输网络相关内容、交通复杂网络、网络弹性(也才是入门,感兴趣的小伙伴们阔以一起交流)<br />
*'''兴趣与爱好:'''长跑、骑行、爬山;喜欢在一个陌生的地方漫无目的闲逛;做一些自己没尝试过的菜;英雄联盟老黄金选手 佛系游戏<br />
*'''联系方式:'''mail:2019031013@cauc.edu.cn</div>Pjhhh