Other important quantities in information theory include Rényi entropy (a generalization of entropy), differential entropy (a generalization of quantities of information to continuous distributions), and the conditional mutual information.

==Coding theory==

{{Main|Coding theory}}

[[File:CDSCRATCHES.jpg|thumb|right|A picture showing scratches on the readable surface of a CD-R. Music and data CDs are coded using error correcting codes and thus can still be read even if they have minor scratches using [[error detection and correction]].]]

Coding theory is one of the most important and direct applications of information theory. It can be subdivided into source coding theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.

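For instance, the entropy of a concrete source can be computed directly from its symbol probabilities. The sketch below is a hypothetical Python example (the four-symbol distribution is made up for illustration):

```python
import math

# Hypothetical 4-symbol source; its entropy is the minimum average number
# of bits per symbol that any lossless code for it can achieve.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
H = -sum(p * math.log2(p) for p in probs.values())
print(H)  # 1.75 bits/symbol
```

Here the prefix code a→0, b→10, c→110, d→111 averages exactly 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits per symbol, matching the entropy.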
* [[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called ''[[rate–distortion theory]]''.
* Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.

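To make "just the right kind of redundancy" concrete, the sketch below implements the classic Hamming(7,4) code in Python: three parity bits protect four data bits, so any single flipped bit can be located and corrected. This is an illustrative sketch, not code from the article:

```python
def hamming_encode(d):
    """Encode 4 data bits into a Hamming(7,4) codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming_decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity check over positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity check over positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity check over positions 4,5,6,7
    err = s1 + 2 * s2 + 4 * s3       # 0 = no error; else 1-based error position
    if err:
        c[err - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
code = hamming_encode(data)
code[4] ^= 1                          # simulate a single-bit channel error
assert hamming_decode(code) == data   # the receiver still recovers the data
```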
This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more general networks, compression followed by transmission may no longer be optimal. Network information theory refers to these multi-agent communication models.

===Source theory===

Any process that generates successive messages can be considered a {{em|[[Communication source|source]]}} of information. A memoryless source is one in which each message is an [[Independent identically distributed random variables|independent identically distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose less restrictive constraints. All such sources are [[stochastic process|stochastic]]. These terms are well studied in their own right outside information theory.

====Rate====<!-- This section is linked from [[Channel capacity]] -->

Information ''[[Entropy rate|rate]]'' is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is

:<math>r = \lim_{n \to \infty} H(X_n|X_{n-1},X_{n-2},X_{n-3}, \ldots);</math>

that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the average rate is

:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots X_n);</math>

that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.<ref>{{cite book | title = Digital Compression for Multimedia: Principles and Standards | author = Jerry D. Gibson | publisher = Morgan Kaufmann | year = 1998 | url = https://books.google.com/books?id=aqQ2Ry6spu0C&pg=PA56&dq=entropy-rate+conditional#PPA57,M1 | isbn = 1-55860-369-7 }}</ref>

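The agreement of the two rate expressions for stationary sources can be checked numerically. The sketch below is a hypothetical Python example: a two-state stationary Markov source (made-up transition matrix) evaluated with both the conditional-entropy form and the joint-entropy-per-symbol form; it uses the fact that for a stationary Markov chain H(X_1..X_n) = H(X_1) + (n-1)·H(X_2|X_1).

```python
import math

# Hypothetical two-state stationary Markov source (states 0 and 1).
P = [[0.9, 0.1],
     [0.4, 0.6]]          # P[i][j] = Pr(next symbol = j | current = i)
pi = [0.8, 0.2]           # stationary distribution: pi @ P == pi

def h(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Conditional-entropy form: r = lim H(X_n | X_{n-1}, ...) = sum_i pi_i H(P[i]).
r_cond = sum(pi[i] * h(P[i]) for i in range(2))

# Joint-entropy form: H(X_1..X_n)/n, with the Markov identity above.
n = 10_000
r_joint = (h(pi) + (n - 1) * r_cond) / n

print(r_cond, r_joint)    # the two rates agree for a stationary source
```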
It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of source coding.

===Channel capacity===

{{Main|Channel capacity}}

Communications over a channel—such as an [[ethernet]] cable—is the primary motivation of information theory. However, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.

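How severely such noise limits a channel is quantified by its capacity. As an aside not derived in this section, a standard result is that a binary symmetric channel flipping each bit with probability p has capacity C = 1 - H(p) bits per channel use, where H is the binary entropy function; a short Python check:

```python
import math

def h2(p):
    """Binary entropy function, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Capacity of a binary symmetric channel with crossover probability p.
# It falls from 1 bit (noiseless) to 0 bits as p approaches 1/2.
for p in (0.0, 0.11, 0.5):
    print(p, 1 - h2(p))
```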
Consider the communications process over a discrete channel. A simple model of the process is shown below:

[[File:Channel model.svg|center|800px|Channel model]]
