Difference between revisions of "香农信源编码定理" (Shannon's source coding theorem)
小 (Moved page from wikipedia:en:Shannon's source coding theorem (history)) |
|||
第1行: | 第1行: | ||
− | + | 本词条由11初步翻译 | |
+ | |||
+ | https://wiki.swarma.org/index.php?title=%E5%B9%B3%E8%A1%A1%E7%90%86%E8%AE%BA#:~:text=%E6%9C%AC%E8%AF%8D%E6%9D%A1%E7%94%B1,11%E5%88%9D%E6%AD%A5%E7%BF%BB%E8%AF%91 | ||
{{short description|Establishes the limits to possible data compression}} | {{short description|Establishes the limits to possible data compression}} | ||
+ | {{简述|建立可能的数据压缩限制}} | ||
{{Information theory}} | {{Information theory}} | ||
− | + | {{信息论}} | |
{{about|the theory of source coding in data compression|the term in computer programming|Source code}} | {{about|the theory of source coding in data compression|the term in computer programming|Source code}} | ||
+ | {{关于数据压缩中的源码理论|计算机编程中的术语|源码}} | ||
第15行: | 第19行: | ||
In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy. | In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy. | ||
− | + | 在信息理论中,'''<font color="#ff8000"> 香农信源编码定理Shannon's source coding theorem </font>'''(或无噪声编码定理)建立了可能的数据压缩的极限,以及香农熵的操作意义。 | |
− | |||
第23行: | 第26行: | ||
Named after Claude Shannon, the source coding theorem shows that (in the limit, as the length of a stream of independent and identically-distributed random variable (i.i.d.) data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss. | Named after Claude Shannon, the source coding theorem shows that (in the limit, as the length of a stream of independent and identically-distributed random variable (i.i.d.) data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss. | ||
− | + | 以Claude Shannon克劳德·香农命名的信源编码定理表明,(在极限情况下,当独立的、相同分布的随机变量(i.i.d.)数据流的长度趋于无穷大时)不可能压缩数据,使码率(每个符号的平均比特数)小于信源的香农熵,而事实上又不能确定信息会丢失。然而,可以任意地使码率接近香农熵,损失的概率可以忽略不计。 | |
第36行: | 第39行: | ||
== Statements == | == Statements == | ||
+ | 说明 | ||
''Source coding'' is a mapping from (a sequence of) symbols from an information [[Information theory#Source theory|source]] to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind [[data compression]]. | ''Source coding'' is a mapping from (a sequence of) symbols from an information [[Information theory#Source theory|source]] to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind [[data compression]]. | ||
第41行: | 第45行: | ||
Source coding is a mapping from (a sequence of) symbols from an information source to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind data compression. | Source coding is a mapping from (a sequence of) symbols from an information source to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind data compression. | ||
− | 信源编码是从信源符号序列到字母符号序列(通常是比特)的映射,以使信源符号能够准确地从二进制比特位(无损源编码)恢复或在某种失真(有损源编码) | + | 信源编码是从信源符号序列到字母符号序列(通常是比特)的映射,以使信源符号能够准确地从二进制比特位(无损源编码)恢复或在某种失真(有损源编码)范围内恢复。这就是数据压缩的概念。 |
=== Source coding theorem === | === Source coding theorem === | ||
+ | 信源编码定理 | ||
In information theory, the '''source coding theorem''' (Shannon 1948)<ref name="Shannon"/> informally states that (MacKay 2003, pg. 81,<ref name="MacKay"/> Cover 2006, Chapter 5<ref name="Cover"/>): | In information theory, the '''source coding theorem''' (Shannon 1948)<ref name="Shannon"/> informally states that (MacKay 2003, pg. 81,<ref name="MacKay"/> Cover 2006, Chapter 5<ref name="Cover"/>): | ||
第60行: | 第65行: | ||
}} | }} | ||
− | + | <blockquote>{{mvar|N}}。[[独立和相同分布的随机变量|i.i.d.]]每个随机变量都有[[熵(信息论)|熵]]。{{math|''H''(''X'')}}可以压缩成多个{{math|''N H''(''X'')}}。[[位]]的信息丢失风险可以忽略不计,如{{math|''N'' → ∞}};但反过来说,如果它们被压缩成少于{{math|''N H''(''X'')}}位,则几乎可以肯定信息将丢失。 | |
=== Source coding theorem for symbol codes === | === Source coding theorem for symbol codes === | ||
+ | 符号码的源码定理 | ||
Category:Information theory | Category:Information theory | ||
第82行: | 第88行: | ||
Suppose that {{mvar|X}} is a random variable taking values in {{math|Σ<sub>1</sub>}} and let {{math| ''f'' }} be a [[Variable-length code#Uniquely decodable codes|uniquely decodable]] code from {{math|Σ{{su|b=1|p=∗}}}} to {{math|Σ{{su|b=2|p=∗}}}} where {{math|{{!}}Σ<sub>2</sub>{{!}} {{=}} ''a''}}. Let {{mvar|S}} denote the random variable given by the length of codeword {{math| ''f'' (''X'')}}. | Suppose that {{mvar|X}} is a random variable taking values in {{math|Σ<sub>1</sub>}} and let {{math| ''f'' }} be a [[Variable-length code#Uniquely decodable codes|uniquely decodable]] code from {{math|Σ{{su|b=1|p=∗}}}} to {{math|Σ{{su|b=2|p=∗}}}} where {{math|{{!}}Σ<sub>2</sub>{{!}} {{=}} ''a''}}. Let {{mvar|S}} denote the random variable given by the length of codeword {{math| ''f'' (''X'')}}. | ||
+ | |||
+ | 假设{{mvar|X}}是一个随机变量,取值在{{math|Σ<sub>1</sub>}},让{{math| ''f''  }}是一个[[可变长度代码#唯一可解码代码|唯一可解码]]代码,从{{math|Σ{su|b=1|p=∗}}}}到{{math|Σ{su|b=2|p=∗}}}},其中{{math|{{! }}Σ<sub>2</sub>{{!}}。{{=}} ''a''}}. 让{{mvar|S}}表示由码字{{math| ''f'' (''X'')}}的长度给出的随机变量。 | ||
Category:Presentation layer protocols | Category:Presentation layer protocols |
Revision as of 18:24, 1 November 2020 (Sunday)
This entry was initially translated by 11.
In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy.
在信息理论中, 香农信源编码定理Shannon's source coding theorem (或无噪声编码定理)建立了可能的数据压缩的极限,以及香农熵的操作意义。
Named after Claude Shannon, the source coding theorem shows that, in the limit as the length of a stream of independent and identically distributed (i.i.d.) random variables tends to infinity, it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source without it being virtually certain that information will be lost. However, it is possible to get the code rate arbitrarily close to the Shannon entropy with negligible probability of loss.
以克劳德·香农(Claude Shannon)命名的信源编码定理表明:在极限情况下,当独立同分布(i.i.d.)随机变量数据流的长度趋于无穷大时,不可能将数据压缩到码率(每个符号的平均比特数)小于信源的香农熵,而又不几乎必然丢失信息。然而,可以使码率任意接近香农熵,而信息丢失的概率可以忽略不计。
The source coding theorem for symbol codes places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word (which is viewed as a random variable) and of the size of the target alphabet.
符号码的信源编码定理在最小可能期望码字长度上设置了一个上下界,该上下界是输入字(被视为一个随机变量)熵和目标字母表大小的函数。
Statements
定理陈述
Source coding is a mapping from (a sequence of) symbols from an information source to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind data compression.
信源编码是从信源符号(序列)到字母表符号序列(通常是比特)的映射,使得信源符号能够从二进制比特中精确恢复(无损信源编码),或在一定失真范围内恢复(有损信源编码)。这就是数据压缩背后的概念。
Source coding theorem
信源编码定理
In information theory, the source coding theorem (Shannon 1948)[1] informally states that (MacKay 2003, pg. 81,[2] Cover 2006, Chapter 5[3]):
在信息论中,信源编码定理(Shannon 1948)[1]非正式地表述为(MacKay 2003,第81页;[2] Cover 2006,第5章[3]):
N i.i.d. random variables each with entropy H(X) can be compressed into more than N H(X) bits with negligible risk of information loss, as N → ∞; but conversely, if they are compressed into fewer than N H(X) bits it is virtually certain that information will be lost.
N 个熵均为 H(X) 的独立同分布(i.i.d.)随机变量,当 N → ∞ 时,可以压缩成多于 N H(X) 位,而信息丢失的风险可以忽略不计;但反过来说,如果它们被压缩成少于 N H(X) 位,则几乎可以肯定信息将丢失。
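To make the N H(X) threshold in the statement above concrete, the following is a minimal sketch that computes the Shannon entropy of a discrete source and the corresponding bit budget for N i.i.d. symbols. The function names `entropy_bits` and `bit_budget`, and the biased-coin distribution, are illustrative assumptions and do not come from the article.

```python
import math


def entropy_bits(probs):
    """Shannon entropy H(X) in bits of a discrete probability distribution."""
    # Terms with p == 0 contribute nothing (the limit of p*log2(p) is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


def bit_budget(probs, n):
    """The quantity N * H(X) that the theorem uses as the compression threshold."""
    return n * entropy_bits(probs)


# Hypothetical source: a biased coin with P(heads) = 0.9.
biased_coin = [0.9, 0.1]
n_symbols = 1000

print("H(X) in bits per symbol:", entropy_bits(biased_coin))
print("N * H(X) in bits:", bit_budget(biased_coin, n_symbols))
print("Uncompressed size in bits (1 bit per flip):", n_symbols)
```

For this assumed source the threshold is well below one bit per symbol, which is the sense in which a long run of flips is compressible; a fair coin would give exactly one bit per symbol and no room for compression.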
Source coding theorem for symbol codes
符号码的信源编码定理
Let Σ1, Σ2 denote two finite alphabets and let Σ1* and Σ2* denote the sets of all finite words from those alphabets (respectively).
Suppose that X is a random variable taking values in Σ1 and let f be a uniquely decodable code from Σ1* to Σ2* where |Σ2| = a. Let S denote the random variable given by the length of codeword f(X).
假设 X 是一个取值于 Σ1 的随机变量,f 是一个从 Σ1* 到 Σ2* 的唯一可解码编码,其中 |Σ2| = a。令 S 表示由码字 f(X) 的长度给出的随机变量。
If f is optimal in the sense that it has the minimal expected word length for X, then (Shannon 1948):
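The displayed bound that this sentence introduces is missing from this revision. As a hedged reconstruction from the standard statement of the theorem (not from text in this revision), with E denoting the expected value and H(X) measured in bits, it reads:

```latex
\frac{H(X)}{\log_2 a} \;\le\; \mathbb{E}[S] \;<\; \frac{H(X)}{\log_2 a} + 1
```

For a binary target alphabet (a = 2) this reduces to H(X) ≤ E[S] < H(X) + 1, which matches the earlier description of an upper and a lower bound that depend on the entropy of the input and the size of the target alphabet.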
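As an illustration of the symbol-code setting, the sketch below builds a binary Huffman code and compares its expected codeword length with the entropy of the source. Huffman coding is not named in this article; it is used here as one well-known way to obtain an optimal prefix code for a binary target alphabet (a = 2), for which the standard form of the theorem gives H(X) ≤ E[S] < H(X) + 1. The function names and the example distribution are assumptions made for illustration.

```python
import heapq
import math


def entropy_bits(probs):
    """Shannon entropy H(X) in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) == 1:
        return [1]  # a single symbol still needs one code symbol
    # Heap items: (subtree probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie_breaker = len(probs)
    while len(heap) > 1:
        p1, _, symbols1 = heapq.heappop(heap)
        p2, _, symbols2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every codeword beneath them.
        for i in symbols1 + symbols2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, tie_breaker, symbols1 + symbols2))
        tie_breaker += 1
    return lengths


def expected_length(probs):
    """Expected codeword length E[S] of the Huffman code for probs."""
    return sum(p * n for p, n in zip(probs, huffman_lengths(probs)))


# Hypothetical four-symbol source.
source = [0.4, 0.3, 0.2, 0.1]
print("H(X):", entropy_bits(source))
print("E[S]:", expected_length(source))
```

When every probability is a negative power of two, for example [0.5, 0.25, 0.125, 0.125], the expected length equals the entropy and the lower bound is met exactly.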
This page was moved from wikipedia:en:Shannon's source coding theorem. Its edit history can be viewed at 香农信源编码定理/edithistory