SSIM is a perception-based model that considers image degradation as ''perceived change in structural information'', while also incorporating important perceptual phenomena, including both luminance masking and contrast masking terms. The difference with other techniques such as MSE or PSNR is that these approaches estimate ''absolute errors''. Structural information is the idea that the pixels have strong inter-dependencies especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene. Luminance masking is a phenomenon whereby image distortions (in this context) tend to be less visible in bright regions, while contrast masking is a phenomenon whereby distortions become less visible where there is significant activity or "texture" in the image.

== History ==
The predecessor of SSIM was called ''Universal Quality Index'' (UQI), or ''Wang–Bovik Index'', which was developed by Zhou Wang and [[Alan Bovik]] in 2001. This evolved, through their collaboration with Hamid Sheikh and [[Eero Simoncelli]], into the current version of SSIM, which was published in April 2004 in the ''[[IEEE Transactions on Image Processing]]''.<ref>{{Cite journal|title = Image quality assessment: from error visibility to structural similarity|journal = IEEE Transactions on Image Processing|date = 2004-04-01|issn = 1057-7149|pages = 600–612|volume = 13|issue = 4|doi = 10.1109/TIP.2003.819861|first = Zhou|last = Wang|first2 = A.C.|last2 = Bovik|first3 = H.R.|last3 = Sheikh|first4 = E.P.|last4 = Simoncelli|pmid = 15376593|bibcode = 2004ITIP...13..600W|citeseerx = 10.1.1.2.5689}}</ref> In addition to defining the SSIM quality index, the paper provides a general context for developing and evaluating perceptual quality measures, including connections to human visual neurobiology and perception, and direct validation of the index against human subject ratings.

The basic model was developed in the Laboratory for Image and Video Engineering (LIVE) at [[The University of Texas at Austin]] and further developed jointly with the Laboratory for Computational Vision (LCV) at [[New York University]]. Further variants of the model have been developed in the Image and Visual Computing Laboratory at [[University of Waterloo]] and have been commercially marketed.

SSIM subsequently found strong adoption in the image processing community and in the television and social media industries. The 2004 SSIM paper has been cited over 20,000 times according to [[Google Scholar]],<ref>{{Cite web|url=https://scholar.google.com/scholar?cites=3765725703375628854&as_sdt=400005&sciodt=0,14&hl=en|title=Google Scholar|website=scholar.google.com|access-date=2019-07-04}}</ref> making it one of the highest cited papers in the image processing and video engineering fields. It was accorded the [[IEEE Signal Processing Society]] Best Paper Award for 2009.<ref>{{Cite web|url = http://www.signalprocessingsociety.org/uploads/awards/Best_Paper.pdf|title = IEEE Signal Processing Society, Best Paper Award|date = |access-date = |website = |publisher = |last = |first = }}</ref> It also received the [[IEEE Signal Processing Society]] Sustained Impact Award for 2016, indicative of a paper having an unusually high impact for at least 10 years following its publication. Because of its high adoption by the television industry, the authors of the original SSIM paper were each accorded a [[Primetime Engineering Emmy Award]] in 2015 by the [[Academy of Television Arts & Sciences|Television Academy]].

== Algorithm ==
The SSIM index is calculated on various windows of an image. The measure between two windows <math>x</math> and <math>y</math> of common size <math>N\times N</math> is:<ref name=":0" />

<math display="block">\hbox{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}</math>

with:
* <math>\mu_x</math> the [[average]] of <math>x</math>;
* <math>\mu_y</math> the [[average]] of <math>y</math>;
* <math>\sigma_x^2</math> the [[variance]] of <math>x</math>;
* <math>\sigma_y^2</math> the [[variance]] of <math>y</math>;
* <math>\sigma_{xy}</math> the [[covariance]] of <math>x</math> and <math>y</math>;
* <math>c_1 = (k_1L)^2</math>, <math>c_2 = (k_2L)^2</math> two variables to stabilize the division with weak denominator;
* <math>L</math> the [[dynamic range]] of the pixel-values (typically this is <math>2^{\#bits\ per\ pixel}-1</math>);
* <math> k_1 = 0.01</math> and <math>k_2 = 0.03</math> by default.
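
As a concrete illustration of the expression above, the following is a minimal NumPy sketch of the single-window statistic (the window contents, the 8-bit dynamic range <math>L=255</math>, and the use of population rather than unbiased sample statistics are assumptions of the example, not part of the definition):

<syntaxhighlight lang="python">
import numpy as np

def ssim_window(x, y, L=255, k1=0.01, k2=0.03):
    """SSIM between two equally sized grayscale windows x and y."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2          # stabilizing constants
    mu_x, mu_y = x.mean(), y.mean()                # local means
    var_x, var_y = x.var(), y.var()                # local variances (population form)
    cov_xy = np.mean((x - mu_x) * (y - mu_y))      # local covariance
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
           (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Example: an 8x8 window compared against a noisy copy of itself.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(8, 8))
y = np.clip(x + rng.normal(0, 10, size=(8, 8)), 0, 255)
print(ssim_window(x, x))  # identical windows -> 1.0
print(ssim_window(x, y))  # noisy copy -> slightly below 1.0
</syntaxhighlight>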

=== Formula components ===
The SSIM formula is based on three comparison measurements between the samples of <math>x</math> and <math>y</math>: luminance (<math>l</math>), contrast (<math>c</math>) and structure (<math>s</math>). The individual comparison functions are:<ref name=":0" />

<math display="block">l(x,y)=\frac{2\mu_x\mu_y + c_1}{\mu^2_x + \mu^2_y + c_1}</math>

<math display="block">c(x,y)=\frac{2\sigma_x\sigma_y + c_2}{\sigma_x^2 + \sigma_y^2 + c_2}</math>

<math display="block">s(x,y)=\frac{\sigma_{xy} + c_3}{\sigma_x \sigma_y + c_3}</math>

with, in addition to the above definitions:
* <math>c_3 = c_2 / 2</math>
SSIM is then a weighted combination of those comparative measures:

<math display="block">\text{SSIM}(x,y) = l(x,y)^\alpha \cdot c(x,y)^\beta \cdot s(x,y)^\gamma </math>

Setting the weights <math>\alpha,\beta,\gamma</math> to 1, the formula can be reduced to the form shown above.
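
For completeness, the reduction can be checked directly: with <math>c_3 = c_2/2</math>, the contrast and structure terms combine as

<math display="block">c(x,y)\cdot s(x,y) = \frac{2(\sigma_x\sigma_y + c_3)}{\sigma_x^2+\sigma_y^2+c_2}\cdot\frac{\sigma_{xy}+c_3}{\sigma_x\sigma_y+c_3} = \frac{2\sigma_{xy}+c_2}{\sigma_x^2+\sigma_y^2+c_2},</math>

and multiplying by <math>l(x,y)</math> yields the single expression given in the previous section.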

=== Mathematical Properties ===
SSIM satisfies the [[identity of indiscernibles]] and symmetry properties, but not the triangle inequality or non-negativity, and thus is not a [[distance function]]. However, under certain conditions, SSIM may be converted to a normalized root MSE measure, which is a distance function.<ref name=":BrunetTIP2012">{{Cite journal|last=Brunet|first=D. |last2=Vass|first2=J.|last3=Vrscay|first3=E. R.|last4=Wang|first4=Z.|date=April 2012|title=On the mathematical properties of the structural similarity index|journal=IEEE Transactions on Image Processing|volume=21|issue=4|pages=2324–2328|doi=10.1109/TIP.2011.2173206|pmid=22042163 |url=https://ece.uwaterloo.ca/~z70wang/publications/TIP_SSIM_MathProperties.pdf|bibcode=2012ITIP...21.1488B }}</ref> The square of such a function is not convex, but is locally convex and [[quasiconvex]],<ref name=":BrunetTIP2012" /> making SSIM a feasible target for optimization.

=== Application of the formula ===
In order to evaluate the image quality, this formula is usually applied only on [[Luma (video)|luma]], although it may also be applied on color (e.g., [[RGB color model|RGB]]) values or chromatic (e.g. [[YCbCr]]) values. The resultant SSIM index is a decimal value between -1 and 1, and value 1 is only reachable in the case of two identical sets of data and therefore indicates perfect structural similarity. A value of 0 indicates no structural similarity. For an image, it is typically calculated using a sliding Gaussian window of size 11×11 or a block window of size 8×8. The window can be displaced pixel-by-pixel on the image to create an SSIM quality map of the image. In the case of video quality assessment,<ref name=":Wang2004SignalProcessing">{{Cite journal|last=Wang|first=Z.|last2=Lu|first2=L.|last3=Bovik|first3=A. C.|date=February 2004|title=Video quality assessment based on structural distortion measurement|journal=Signal Processing: Image Communication|volume=19|issue=2|pages=121–132|doi=10.1016/S0923-5965(03)00076-6|url=https://ece.uwaterloo.ca/~z70wang/publications/vssim.html|citeseerx=10.1.1.2.6330}}</ref> the authors propose to use only a subgroup of the possible windows to reduce the complexity of the calculation.
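
For example, the scikit-image implementation linked under External links exposes both the mean index and the per-pixel quality map; the Gaussian-window settings below are one common way to approximate the 11×11 Gaussian window described above, and the exact parameter names may differ between library versions:

<syntaxhighlight lang="python">
import numpy as np
from skimage import data, img_as_float
from skimage.metrics import structural_similarity

reference = img_as_float(data.camera())  # grayscale (luma-like) test image
rng = np.random.default_rng(0)
distorted = np.clip(reference + rng.normal(0, 0.05, reference.shape), 0, 1)

# full=True also returns the per-pixel SSIM quality map described above.
score, ssim_map = structural_similarity(
    reference, distorted,
    data_range=1.0,            # dynamic range L of the float images
    gaussian_weights=True,     # sliding Gaussian window instead of a uniform block
    sigma=1.5,
    use_sample_covariance=False,
    full=True,
)
print(score, ssim_map.shape)
</syntaxhighlight>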

=== Variants ===

==== Multi-Scale SSIM ====
A more advanced form of SSIM, called Multiscale SSIM (MS-SSIM)<ref name=":0">{{Cite book|title = Multiscale structural similarity for image quality assessment|journal = Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2004|date = 2003-11-01|pages = 1398–1402 Vol.2|volume = 2|doi = 10.1109/ACSSC.2003.1292216|first = Z.|last = Wang|first2 = E.P.|last2 = Simoncelli|first3 = A.C.|last3 = Bovik|isbn = 978-0-7803-8104-9|citeseerx = 10.1.1.58.1939}}</ref> is conducted over multiple scales through a process of multiple stages of sub-sampling, reminiscent of multiscale processing in the early vision system. It has been shown to perform as well as or better than SSIM on different subjective image and video databases.<ref name=":0" /><ref name=":1">{{Cite journal|last=Søgaard|first=Jacob|last2=Krasula|first2=Lukáš|last3=Shahid|first3=Muhammad|last4=Temel|first4=Dogancan|last5=Brunnström|first5=Kjell|last6=Razaak|first6=Manzoor|date=2016-02-14|title=Applicability of Existing Objective Metrics of Perceptual Quality for Adaptive Video Streaming|journal=Electronic Imaging|volume=2016|issue=13|pages=1–7|doi=10.2352/issn.2470-1173.2016.13.iqsp-206|url=http://hal.univ-nantes.fr/hal-01395510/file/applicability_of_existing_etc_iqsp_revised_1.3.pdf}}</ref><ref name=":2" />
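
A much-simplified sketch of the multi-scale idea is shown below; it merely evaluates the single-scale index after repeated 2× downsampling and combines the scales as a weighted geometric mean (the exponents are those commonly cited for MS-SSIM), whereas the actual MS-SSIM algorithm applies the luminance term only at the coarsest scale:

<syntaxhighlight lang="python">
import numpy as np
from skimage.metrics import structural_similarity
from skimage.transform import downscale_local_mean

def ms_ssim_sketch(ref, dist, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333)):
    """Simplified multi-scale SSIM: weighted geometric mean of per-scale scores."""
    scores = []
    for _ in weights:
        scores.append(structural_similarity(ref, dist, data_range=1.0))
        # low-pass and downsample by 2 before moving to the next (coarser) scale
        ref = downscale_local_mean(ref, (2, 2))
        dist = downscale_local_mean(dist, (2, 2))
    return float(np.prod([s ** w for s, w in zip(scores, weights)]))
</syntaxhighlight>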

==== Multi-component SSIM ====
{{vanchor|Three-component SSIM}} (3-SSIM) is a form of SSIM that takes into account the fact that the human eye can see differences more precisely on textured or edge regions than on smooth regions.<ref>{{Cite journal|last=Li|first=Chaofeng|last2=Bovik|first2=Alan Conrad|date=2010-01-01|title=Content-weighted video quality assessment using a three-component image model|journal=Journal of Electronic Imaging|volume=19|issue=1|pages=011003–011003–9|doi=10.1117/1.3267087|issn=1017-9909|bibcode=2010JEI....19a1003L}}</ref> The resulting metric is calculated as a weighted average of SSIM for three categories of regions: edges, textures, and smooth regions. The proposed weighting is 0.5 for edges, 0.25 for the textured and smooth regions. The authors mention that a 1/0/0 weighting (ignoring anything but edge distortions) leads to results that are closer to subjective ratings. This suggests that edge regions play a dominant role in image quality perception.

The authors of 3-SSIM have also extended the model into {{vanchor|four-component SSIM}} (4-SSIM). The edge types are further subdivided into preserved and changed edges by their distortion status. The proposed weighting is 0.25 for all four components.<ref>{{cite journal |last1=Li |first1=Chaofeng |last2=Bovik |first2=Alan C. |title=Content-partitioned structural similarity index for image quality assessment |journal=Signal Processing: Image Communication |date=August 2010 |volume=25 |issue=7 |pages=517–526 |doi=10.1016/j.image.2010.03.004}}</ref>

==== Structural Dissimilarity ====
Structural dissimilarity (DSSIM) may be derived from SSIM, though it does not constitute a distance function as the triangle inequality is not necessarily satisfied.

<math display="block">\hbox{DSSIM}(x,y) = \frac{1 - \hbox{SSIM}(x, y)}{2}</math>

==== Video quality metrics and temporal variants ====
It is worth noting that the original version SSIM was designed to measure the quality of still images. It does not contain any parameters directly related to temporal effects of human perception and human judgment.<ref name=":1" /> A common practice is to calculate the average SSIM value over all frames in the video sequence. However, several temporal variants of SSIM have been developed.<ref>{{cite web|url=http://www.compression.ru/video/quality_measure/info_en.html#stssim|title=Redirect page|website=www.compression.ru}}</ref><ref name=":Wang2004SignalProcessing"/><ref name=":Wang2007OpticalSociety">{{Cite journal|last=Wang|first=Z.|last2=Li|first2=Q.|date=December 2007|title=Video quality assessment using a statistical model of human visual speed perception|journal=Journal of the Optical Society of America A|volume=24|issue=12|pages=B61–B69|url=https://ece.uwaterloo.ca/~z70wang/publications/josa07.pdf|doi=10.1364/JOSAA.24.000B61|pmid=18059915|citeseerx=10.1.1.113.4177|bibcode=2007JOSAA..24...61W}}</ref>

==== Complex Wavelet SSIM ====
The complex wavelet transform variant of the SSIM (CW-SSIM) is designed to deal with issues of image scaling, translation and rotation. Instead of giving low scores to images with such conditions, the CW-SSIM takes advantage of the complex wavelet transform and therefore yields higher scores to said images. The CW-SSIM is defined as follows:

<math display="block">\text{CW-SSIM}(c_x,c_y)=\bigg(\frac{2 \sum_{i=1}^N |c_{x,i}||c_{y,i}|+K}{\sum_{i=1}^N |c_{x,i}|^2+\sum_{i=1}^N |c_{y,i}|^2+K}\bigg)\bigg(\frac{2|\sum_{i=1}^N c_{x,i}c_{y,i}^*|+K}{2\sum_{i=1}^N |c_{x,i}c_{y,i}^*|+K}\bigg)</math>

Where <math>c_x</math> is the complex wavelet transform of the signal <math>x</math> and <math>c_y</math> is the complex wavelet transform for the signal <math>y</math>. Additionally, <math>K</math> is a small positive number used for the purposes of function stability. Ideally, it should be zero. Like the SSIM, the CW-SSIM has a maximum value of 1. The maximum value of 1 indicates that the two signals are perfectly structurally similar while a value of 0 indicates no structural similarity.<ref name="auto">{{Cite journal|last=Zhou Wang|last2=Bovik|first2=A.C.|date=January 2009|title=Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures|journal=IEEE Signal Processing Magazine|volume=26|issue=1|pages=98–117|doi=10.1109/msp.2008.930649|issn=1053-5888|bibcode=2009ISPM...26...98W}}</ref>
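
Assuming the complex wavelet coefficients <math>c_x</math> and <math>c_y</math> have already been computed (obtaining them requires a complex wavelet transform, e.g. a steerable pyramid or dual-tree CWT, which is outside this sketch), the index itself is a short NumPy computation:

<syntaxhighlight lang="python">
import numpy as np

def cw_ssim_index(cx, cy, K=1e-8):
    """CW-SSIM for two arrays of corresponding complex wavelet coefficients.

    K is the small stabilizing constant from the definition; ideally zero.
    """
    cx = np.asarray(cx, dtype=complex).ravel()
    cy = np.asarray(cy, dtype=complex).ravel()
    # magnitude (structure) factor
    mag = (2 * np.sum(np.abs(cx) * np.abs(cy)) + K) / (
          np.sum(np.abs(cx) ** 2) + np.sum(np.abs(cy) ** 2) + K)
    # phase-consistency factor
    phase = (2 * np.abs(np.sum(cx * np.conj(cy))) + K) / (
            2 * np.sum(np.abs(cx * np.conj(cy))) + K)
    return mag * phase
</syntaxhighlight>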

==== SSIMPLUS ====
The SSIMPLUS index is based on SSIM and is a commercially available tool.<ref name="ssimplus">{{Cite journal|last1=Rehman|first1=A.|last2=Zeng|first2=K.|last3=Wang|first3=Zhou|editor3-first=Huib|editor3-last=De Ridder|editor2-first=Thrasyvoulos N|editor2-last=Pappas|editor1-first=Bernice E|editor1-last=Rogowitz|date=February 2015|title=Display device-adapted video quality-of-experience assessment|url=https://ece.uwaterloo.ca/~z70wang/publications/HVEI15.pdf|journal=IS&T-SPIE Electronic Imaging, Human Vision and Electronic Imaging XX|volume=9394|pages=939406|bibcode=2015SPIE.9394E..06R|doi=10.1117/12.2077917|series=Human Vision and Electronic Imaging XX}}</ref> It extends SSIM's capabilities, mainly to target video applications. It provides scores in the range of 0–100, linearly matched to human subjective ratings. It also allows adapting the scores to the intended viewing device, comparing video across different resolutions and contents.

According to its authors, SSIMPLUS achieves higher accuracy and higher speed than other image and video quality metrics. However, no independent evaluation of SSIMPLUS has been performed, as the algorithm itself is not publicly available.

==== cSSIM ====
In order to further investigate the standard ''discrete'' SSIM from a theoretical perspective, the ''continuous'' SSIM (cSSIM)<ref name="cssim">{{Cite journal|last1=Marchetti|first1=F.|date=January 2021|title=Convergence rate in terms of the continuous SSIM (cSSIM) index in RBF interpolation|url=https://drna.padovauniversitypress.it/system/files/papers/Marchetti_2021_CRT.pdf|journal=Dolom. Res. Notes Approx.|volume=14|pages=27–32}}</ref> has been introduced and studied in the context of [[Radial basis function interpolation]].

==== Other simple modifications ====
The r* cross-correlation metric is based on the variance metrics of SSIM. It's defined as {{math|1=''r''*(''x'', ''y'') = {{sfrac|''&sigma;''<sub>''xy''</sub>|''&sigma;''<sub>''x''</sub>''&sigma;''<sub>''y''</sub>}}}} when {{math|''&sigma;''<sub>''x''</sub>''&sigma;''<sub>''y''</sub> ≠ 0}}, {{math|1=1}} when both standard deviations are zero, and {{math|1=0}} when only one is zero. It has found use in analyzing human response to contrast-detail phantoms.<ref>{{cite journal |last1=Prieto |first1=Gabriel |last2=Guibelalde |first2=Eduardo |last3=Chevalier |first3=Margarita |last4=Turrero |first4=Agustín |title=Use of the cross-correlation component of the multiscale structural similarity metric (R* metric) for the evaluation of medical images: R* metric for the evaluation of medical images |journal=Medical Physics |date=21 July 2011 |volume=38 |issue=8 |pages=4512–4517 |doi=10.1118/1.3605634}}</ref>
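
Read literally, that piecewise definition translates into a few lines of NumPy (the helper name is arbitrary):

<syntaxhighlight lang="python">
import numpy as np

def r_star(x, y):
    """Cross-correlation component r* of SSIM, with the zero-variance special cases."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    sigma_x, sigma_y = x.std(), y.std()
    if sigma_x == 0.0 and sigma_y == 0.0:
        return 1.0
    if sigma_x == 0.0 or sigma_y == 0.0:
        return 0.0
    sigma_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return float(sigma_xy / (sigma_x * sigma_y))
</syntaxhighlight>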

SSIM has also been used on the [[gradient]] of images, making it "G-SSIM". G-SSIM is especially useful on blurred images.<ref>{{cite journal |last1=Chen |first1=Guan-hao |last2=Yang |first2=Chun-ling |last3=Xie |first3=Sheng-li |title=Gradient-Based Structural Similarity for Image Quality Assessment |journal=2006 International Conference on Image Processing |date=October 2006 |pages=2929–2932 |doi=10.1109/ICIP.2006.313132}}</ref>

The modifications above can be combined. For example, 4-G-r* is a combination of 4-SSIM, G-SSIM, and r*. It is able to reflect radiologist preference for images much better than other SSIM variants tested.<ref>{{cite journal |last1=Renieblas |first1=Gabriel Prieto |last2=Nogués |first2=Agustín Turrero |last3=González |first3=Alberto Muñoz |last4=Gómez-Leon |first4=Nieves |last5=del Castillo |first5=Eduardo Guibelalde |title=Structural similarity index family for image quality assessment in radiological images |journal=Journal of Medical Imaging |date=26 July 2017 |volume=4 |issue=3 |pages=035501 |doi=10.1117/1.JMI.4.3.035501 |pmc=5527267 |pmid=28924574}}</ref>

== Application ==
SSIM has applications in a variety of different problems. Some examples are:

* Image Compression: In lossy [[image compression]], information is deliberately discarded to decrease the storage space of images and video. The MSE is typically used in such compression schemes. According to its authors, using SSIM instead of MSE is suggested to produce better results for the decompressed images.<ref name="auto"/>
* Image Restoration: [[Image restoration]] focuses on solving the problem <math>y=h * x+n</math> where <math>y</math> is the blurry image that should be restored, <math>h</math> is the blur kernel, <math>n</math> is the additive noise and <math>x</math> is the original image we wish to recover. The traditional filter which is used to solve this problem is the Wiener Filter. However, the Wiener filter design is based on the MSE. Using an SSIM variant, specifically Stat-SSIM, is claimed to produce better visual results, according to the algorithm's authors.<ref name="auto"/>
* Pattern Recognition: Since SSIM mimics aspects of human perception, it could be used for recognizing patterns. When faced with issues like image scaling, translation and rotation, the algorithm's authors claim that it is better to use CW-SSIM,<ref name=":Gao2011">{{Cite conference|last=Gao|first=Y.|last2=Rehman|first2=A.|last3=Wang|first3=Z.|date=September 2011|title=CW-SSIM based image classification|url=https://ece.uwaterloo.ca/~z70wang/publications/icip11b.pdf|conference=IEEE International Conference on Image Processing (ICIP11)}}</ref> which is insensitive to these variations and may be directly applied by template matching without using any training sample. Since data-driven pattern recognition approaches may produce better performance when a large amount of data is available for training, the authors suggest using CW-SSIM in data-driven approaches.<ref name=":Gao2011" />

==Performance comparison==
Due to its popularity, SSIM is often compared to other metrics, including more simple metrics such as MSE and PSNR, and other perceptual image and [[Video quality|video quality metrics]]. SSIM has been repeatedly shown to significantly outperform MSE and its derivates in accuracy, including research by its own authors and others.<ref name=":1" /><ref>{{Cite book|last=Zhang|first=Lin|last2=Zhang|first2=Lei|last3=Mou|first3=X.|last4=Zhang|first4=D.|date=September 2012|title=A comprehensive evaluation of full reference image quality assessment algorithms|journal=2012 19th IEEE International Conference on Image Processing|pages=1477–1480|doi=10.1109/icip.2012.6467150|isbn=978-1-4673-2533-2|citeseerx=10.1.1.476.2566}}</ref><ref>{{Cite journal|last=Zhou Wang|last2=Wang|first2=Zhou|last3=Li|first3=Qiang|title=Information Content Weighting for Perceptual Image Quality Assessment|journal=IEEE Transactions on Image Processing|volume=20|issue=5|pages=1185–1198|doi=10.1109/tip.2010.2092435|pmid=21078577|date=May 2011|bibcode=2011ITIP...20.1185W}}</ref><ref>{{Cite book|last=Channappayya|first=S. S.|last2=Bovik|first2=A. C.|last3=Caramanis|first3=C.|last4=Heath|first4=R. W.|date=March 2008|title=SSIM-optimal linear image restoration|journal=2008 IEEE International Conference on Acoustics, Speech and Signal Processing|pages=765–768|doi=10.1109/icassp.2008.4517722|isbn=978-1-4244-1483-3|citeseerx=10.1.1.152.7952}}</ref><ref>{{Cite journal|last=Gore|first=Akshay|last2=Gupta|first2=Savita|date=2015-02-01|title=Full reference image quality metrics for JPEG compressed images|journal=AEU - International Journal of Electronics and Communications|volume=69|issue=2|pages=604–608|doi=10.1016/j.aeue.2014.09.002}}</ref><ref name=":Wang2008JOV">{{Cite journal|last=Wang|first=Z.|last2=Simoncelli|first2=E. P.|date=September 2008|title=Maximum differentiation (MAD) competition: a methodology for comparing computational models of perceptual quantities|url=https://ece.uwaterloo.ca/~z70wang/publications/MAD.pdf|journal=Journal of Vision|volume=8|issue=12|pages=8.1–13|doi=10.1167/8.12.8|pmid=18831621|pmc=4143340}}</ref>

A paper by Dosselmann and Yang claims that the performance of SSIM is "much closer to that of the MSE" than usually assumed. While they do not dispute the advantage of SSIM over MSE, they state an analytical and functional dependency between the two metrics.<ref name=":2">{{Cite journal|title = A comprehensive assessment of the structural similarity index|journal = Signal, Image and Video Processing|date = 2009-11-06|issn = 1863-1703|pages = 81–91|volume = 5|issue = 1|doi = 10.1007/s11760-009-0144-1|first = Richard|last = Dosselmann|first2 = Xue Dong|last2 = Yang}}</ref> According to their research, SSIM has been found to correlate as well as MSE-based methods on subjective databases other than the databases from SSIM's creators. As an example, they cite Reibman and Poole, who found that MSE outperformed SSIM on a database containing packet-loss–impaired video.<ref>{{Cite book|last=Reibman|first=A. R.|last2=Poole|first2=D.|date=September 2007|title=Characterizing packet-loss impairments in compressed video|journal=2007 IEEE International Conference on Image Processing|volume=5|pages=V – 77–V – 80|doi=10.1109/icip.2007.4379769|isbn=978-1-4244-1436-9|citeseerx=10.1.1.159.5710}}</ref> In another paper, an analytical link between PSNR and SSIM was identified.<ref>{{Cite book|last=Hore|first=A.|last2=Ziou|first2=D.|date=August 2010|title=Image Quality Metrics: PSNR vs. SSIM|journal=2010 20th International Conference on Pattern Recognition|pages=2366–2369|doi=10.1109/icpr.2010.579|isbn=978-1-4244-7542-1}}</ref>

==See also==
* [[Mean squared error]]
* [[Peak signal-to-noise ratio]]
* [[Video quality]]

==References==
{{reflist}}

==External links==
* [https://ece.uwaterloo.ca/~z70wang/research/ssim/ Home page]
* [https://github.com/pornel/dssim Rust Implementation]
* [http://mehdi.rabah.free.fr/SSIM/ C/C++ Implementation]
* [https://web.archive.org/web/20110206110328/http://pholia.tdi.informatik.uni-frankfurt.de/~philipp/software/dssim.shtml DSSIM C++ Implementation]
* [http://www.lomont.org/software/misc/ssim/SSIM.html Chris Lomont's C# Implementation]
* [http://qpsnr.youlink.org/ qpsnr implementation (multi threaded C++)]
* [http://mmspg.epfl.ch/vqmt Implementation in VQMT software]
* [https://scikit-image.org/docs/dev/api/skimage.metrics.html#skimage.metrics.structural_similarity Implementation in Python]
* [https://elib.dlr.de/91439/1/Gintautas_Palubinskas_ICIP_2014.pdf "Mystery Behind Similarity Measures MSE and SSIM", Gintautas Palubinskas, 2014]

[[Category:Image processing]]

<noinclude>

<small>This page was moved from [[wikipedia:en:Structural similarity]]. Its edit history can be viewed at [[结构相似性/edithistory]]</small></noinclude>

[[Category:待整理页面]]