Changes

32 bytes removed, 15:38, 16 August 2021 (Mon)
Some details
Line 31: Line 31:
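'''Affective computing''' (also known as artificial emotional intelligence or emotion AI) is the study and development of systems and devices that can recognize, interpret, process, and simulate human emotions. It is an interdisciplinary field spanning '''computer science''', '''psychology''', and '''cognitive science'''.<ref name="TaoTan" /> Although some core ideas in the field can be traced back to early philosophical inquiries into emotion,<ref name=":0" /> the modern branch of computer science originated with Rosalind Picard's 1995 paper on affective computing<ref name=":1" /> and her book ''Affective Computing'',<ref name="Affective Computing" /> published by MIT Press.<ref name=":2" /><ref name=":3" /> One motivation for this research is to give machines emotional intelligence, including '''empathy'''. A machine should be able to interpret human emotional states, adapt to them, and respond to them appropriately.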
 
   −
= = Research scope = =
+
= Research scope =
    
=== Detecting and recognizing emotional information ===
 
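Line 90: Line 90:
Speech analysis is an effective method of identifying affective state, with an average reported accuracy of 70% to 80% in recent research.<ref name=":10" /><ref name=":11" /> These systems tend to outperform average human accuracy (approximately 60%<ref name="Dellaert" />) but are less accurate than systems that use other modalities for emotion detection, such as physiological states or facial expressions.<ref name="Hudlicka-2003-p24" /> However, since many speech characteristics are independent of semantics or culture, this technique is considered a promising direction for further research.<ref name="Hudlicka-2003-p25" />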
 
   −
==== = = Algorithms = = ====
+
==== Algorithms ====
    
The process of speech/text affect detection requires the creation of a reliable [[database]], [[knowledge base]], or [[vector space model]],<ref name="Osgood75">
 
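Line 131: Line 131:
Research has shown that, given enough acoustic evidence, a person's emotional state can be classified by a set of majority-voting classifiers. The proposed set is built from three main classifiers: kNN, C4.5, and SVM with an RBF kernel. This set achieves better classification performance than each base classifier taken separately. It was compared with two other sets of classifiers: (1) a one-against-all (OAA) multiclass SVM with hybrid kernels, and (2) a set consisting of two base classifiers, C5.0 and a neural network. The proposed variant performs better than both of these sets.<ref name=":13" />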
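One common way to combine three base classifiers is majority voting. The following is a minimal sketch of that idea, not the cited authors' implementation: the three stub functions, their thresholds, and the feature names (<code>pitch_mean</code>, <code>energy</code>, <code>speech_rate</code>) are hypothetical stand-ins for trained kNN, C4.5, and SVM-RBF models.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    `predictions` holds one label per base classifier. Ties are broken
    in favour of the classifier that appears earliest in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label, then take the majority."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical stand-ins for trained models (assumption: each maps a
# feature dict to an emotion label; the thresholds are illustrative only).
def knn_stub(sample):
    return "anger" if sample["pitch_mean"] > 220 else "sadness"


def tree_stub(sample):
    return "anger" if sample["energy"] > 0.6 else "sadness"


def svm_stub(sample):
    return "anger" if sample["speech_rate"] > 5.0 else "sadness"


sample = {"pitch_mean": 240.0, "energy": 0.4, "speech_rate": 5.5}
label = ensemble_predict([knn_stub, tree_stub, svm_stub], sample)
print(label)
```

In this example two of the three stubs vote for the same label, so the ensemble returns that label even though one base classifier disagrees.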
 
   −
==== = = = Databases = = ====
+
==== Databases ====
    
The vast majority of present systems are data-dependent. This creates one of the biggest challenges in detecting emotions from speech, because it requires choosing an appropriate database to train the classifier. Most of the currently available data was obtained from actors and is thus a representation of archetypal emotions. These so-called acted databases are usually based on the Basic Emotions theory (by [[Paul Ekman]]), which assumes the existence of six basic emotions (anger, fear, disgust, surprise, joy, sadness), the others simply being a mix of these.<ref name="Ekman, P. 1969">Ekman, P. & Friesen, W. V. (1969). [http://www.communicationcache.com/uploads/1/0/8/8/10887248/the-repertoire-of-nonverbal-behavior-categories-origins-usage-and-coding.pdf The repertoire of nonverbal behavior: Categories, origins, usage, and coding]. Semiotica, 1, 49–98.</ref> Nevertheless, these databases still offer high audio quality and balanced classes (although often too few), which contribute to high success rates in recognizing emotions.
 
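Line 145: Line 145:
Although natural data has many advantages over acted data, it is difficult to obtain and usually has low emotional intensity. Moreover, data obtained in a natural setting has lower signal quality because of ambient noise and the distance between the subjects and the microphone. The FAU Aibo Emotion Corpus for CEICES (Combining Efforts for Improving Automatic Classification of Emotional User States) from the University of Erlangen-Nuremberg was the first attempt to build such a '''natural emotion database'''; it was collected in a realistic setting in which children aged 10 to 13 played with Sony's Aibo robot pet. Likewise, establishing a standard database for emotion research would provide a way to evaluate and compare different emotion recognition systems.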
 
   −
==== = = Speech descriptors = = ====
+
==== Speech descriptors ====
    
The complexity of the affect recognition process increases with the number of classes (affects) and speech descriptors used within the classifier. It is therefore crucial to select only the most relevant features, both to ensure that the model can identify emotions successfully and to improve performance, which is particularly significant for real-time detection. The range of possible choices is vast, with some studies mentioning the use of over 200 distinct features.<ref name="Scherer-2010-p241"/> Identifying the features that are redundant or undesirable optimizes the system and increases the success rate of correct emotion detection. The most common speech characteristics are categorized into the following groups.<ref name="Steidl-2011"/><ref name="Scherer-2010-p243"/>
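As an illustration of removing redundant speech descriptors, the sketch below drops any feature that is strongly correlated with one already kept. This is one simple filter among many, not the method used in the cited studies; the feature names, the sample values, and the 0.95 threshold are assumptions chosen for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if its absolute correlation with every
    previously kept feature is below `threshold`."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical descriptors measured on five utterances.
features = {
    "pitch_mean_hz": [180.0, 220.0, 200.0, 260.0, 190.0],
    # Same quantity in kHz, so it carries no new information.
    "pitch_mean_khz": [0.18, 0.22, 0.20, 0.26, 0.19],
    "energy": [0.9, 0.2, 0.5, 0.4, 0.1],
}
selected = drop_redundant(features)
print(selected)
```

Because the kHz feature is a rescaled copy of the Hz feature, only one of the two survives, while the weakly correlated energy feature is retained. The filter is greedy, so which member of a correlated pair is kept depends on the order of the dictionary.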
 
Line 457: Line 457:
   −
==General sources==
+
==Other resources==
    
* {{cite journal | last = Hudlicka | first =  Eva | title = To feel or not to feel: The role of affect in human–computer interaction | journal = International Journal of Human–Computer Studies |  volume = 59 | issue = 1–2 | year = 2003 | pages = 1–32 | citeseerx = 10.1.1.180.6429 | doi=10.1016/s1071-5819(03)00047-8}}
 