Line 21:
=== More challenging examples ===

* I do not dislike cabin cruisers. (Negation handling)
* Disliking watercraft is not really my thing. (Negation, inverted word order)
* Sometimes I really hate RIBs. (Adverbial modifies the sentiment)
* I'd really truly love going out in this weather! (Possibly sarcastic)
* Chris Craft is better looking than Limestone. (Two brand names; identifying the target of the attitude is difficult)
* Chris Craft is better looking than Limestone, but Limestone projects seaworthiness and reliability. (Two attitudes, two brand names)
* The movie is surprising with plenty of unsettling plot twists. (Negative term used in a positive sense in certain domains)
* You should see their decadent dessert menu. (The polarity of some attitudinal terms has recently shifted in certain domains)
* I love my mobile but would not recommend it to any of my colleagues. (Qualified positive sentiment, difficult to categorise)

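The negation examples above can be illustrated with a small sketch. The lexicon, the negator list, and both scoring functions below are illustrative assumptions rather than part of any cited system; the point is only that a context-free word count misreads "I do not dislike cabin cruisers", while even a crude negation rule recovers the intended polarity.

```python
# Toy polarity lexicon; the words and weights are illustrative assumptions.
LEXICON = {"love": 1, "like": 1, "hate": -1, "dislike": -1}
NEGATORS = {"not", "never", "no"}


def naive_score(text: str) -> int:
    """Sum lexicon weights for each token, ignoring context entirely."""
    return sum(LEXICON.get(tok, 0) for tok in text.lower().split())


def negation_aware_score(text: str) -> int:
    """Flip the sign of the next sentiment word that follows a negator."""
    score, flip = 0, False
    for tok in text.lower().split():
        if tok in NEGATORS:
            flip = not flip
        elif tok in LEXICON:
            score += -LEXICON[tok] if flip else LEXICON[tok]
            flip = False
    return score


sentence = "I do not dislike cabin cruisers"
print(naive_score(sentence))           # -1: the sentence is read as negative
print(negation_aware_score(sentence))  # 1: the negator flips "dislike"
```

A rule this simple still fails on the other examples in the list (sarcasm, comparisons between two brands, domain-dependent polarity), which is why they are described as challenging.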
Line 35:

The most basic task in sentiment analysis is to identify whether the polarity expressed in a given opinionated text is positive, negative, or neutral. Depending on the granularity of the text being processed, sentiment analysis can be studied at three levels: document, sentence, and word. Advanced "beyond polarity" sentiment classification looks at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise.<ref name=":2">Vong Anh Ho, Duong Huynh-Cong Nguyen, Danh Hoang Nguyen, Linh Thi-Van Pham, Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen. "Emotion Recognition for Vietnamese Social Media Text". In Proceedings of the 2019 International Conference of the Pacific Association for Computational Linguistics (PACLING 2019), Hanoi, Vietnam (2019).</ref>

Precursors to sentiment analysis include ''The General Inquirer'',<ref name=":3">Stone, Philip J., Dexter C. Dunphy, and Marshall S. Smith. "The general inquirer: A computer approach to content analysis." MIT Press, Cambridge, MA (1966).</ref> which provided hints toward quantifying patterns in text, and psychological research that examined a person's psychological state based on an analysis of their verbal behavior.<ref name=":4">Gottschalk, Louis August, and Goldine C. Gleser. The measurement of psychological states through the content analysis of verbal behavior. Univ of California Press, 1969.</ref>

Line 50:
| first3 = Shivakumar | last3 = Vaithyanathan
| title = Thumbs up? Sentiment Classification using Machine Learning Techniques
| book-title = Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)
| year = 2002
| pages = 79–86
Line 60:
| first2 = Lillian | last2 = Lee
| title = Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales
| book-title = Proceedings of the Association for Computational Linguistics (ACL)
| year = 2005
| pages = 115–124
Line 70:
| first2 = Regina | last2 = Barzilay
| title = Multiple Aspect Ranking using the Good Grief Algorithm
| book-title = Proceedings of the Joint Human Language Technology/North American Chapter of the ACL Conference (HLT-NAACL)
| year = 2007
| pages = 300–307
| url = http://people.csail.mit.edu/regina/my_papers/ggranker.ps
}}
</ref> among others, have attempted this: Pang and Lee<ref name="PangLee05" /> expanded the basic task of classifying a movie review as either positive or negative to predicting its rating on a three- or four-star scale, while Snyder<ref name="SnyderBarzilay07" /> performed an in-depth analysis of restaurant reviews, predicting ratings for individual aspects of a given restaurant, such as the food and atmosphere (on a five-star scale).

At the 2004 AAAI Spring Symposium, linguists, computer scientists, and other interested researchers first brought together various approaches (learning-based, lexical, knowledge-based, and others) and proposed shared tasks and benchmark data sets for the systematic computational study of affect, appeal, subjectivity, and sentiment in text.<ref name=":6">Qu, Yan, James Shanahan, and Janyce Wiebe. "Exploring attitude and affect in text: Theories and applications." In AAAI Spring Symposium, Technical report SS-04-07. AAAI Press, Menlo Park, CA. 2004.</ref>

Line 126:

There are various other types of sentiment analysis, such as feature/aspect-based sentiment analysis, graded sentiment analysis (positive, negative, neutral), multilingual sentiment analysis, and emotion detection.

Line 150:
|last3 = Wiebe
|title = Learning Multilingual Subjective Language via Cross-Lingual Projections
|book-title = Proceedings of the Association for Computational Linguistics (ACL)
|year = 2007
|pages = 976–983
Line 170:
| first2 = Lillian | last2 = Lee
| title = A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts
| book-title = Proceedings of the Association for Computational Linguistics (ACL)
| year = 2004
| pages = 271–278
Line 182:
* Example of an objective sentence: "To be elected president of the United States, a candidate must be at least thirty-five years of age."

The term "subjective" describes an incident that contains non-factual information in various forms, such as personal opinions, judgments, and predictions. Quirk et al. refer to these as "private states".<ref name=":10">{{Cite book|last1=Quirk|first1=Randolph|title=A Comprehensive Grammar of the English Language (General Grammar)|last2=Greenbaum|first2=Sidney|last3=Leech|first3=Geoffrey|last4=Svartvik|first4=Jan|publisher=Longman|year=1985|isbn=1933108312|pages=175–239}}</ref> In the example below, the sentence reflects a private state of "we Americans". The target entity being commented on can take many forms, from tangible products to intangible topics (Liu, 2010).<ref name="Liu2010" /> Liu (2010) also observed three types of attitudes: 1) positive opinions, 2) neutral opinions, and 3) negative opinions.<ref name="Liu2010" />

* Example of a subjective sentence: "We Americans need to elect a president who is mature and who is able to make wise decisions."
Line 189:

A collection of word or phrase indicators is defined for each category in order to locate the desired patterns in unannotated text. For subjective expressions, a separate word list has been created. Riloff and Wiebe (2003) note that multiple researchers in linguistics and natural language processing have developed lists of subjective indicators in the form of words or phrases.<ref name=":11">{{Cite journal|last1=Riloff|first1=Ellen|last2=Wiebe|first2=Janyce|date=2003-07-11|title=Learning extraction patterns for subjective expressions|journal=Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing|series=EMNLP '03|volume=10|location=USA|publisher=Association for Computational Linguistics|pages=105–112|doi=10.3115/1119355.1119369|doi-access=free}}</ref> A dictionary of extraction rules has to be created in order to measure given expressions. Over the years, subjectivity detection has progressed from hand-crafted feature extraction in 1999 to automated feature learning in 2005.<ref name=":12">{{Cite journal|last1=Chaturvedi|first1=Iti|last2=Cambria|first2=Erik|last3=Welsch|first3=Roy E.|last4=Herrera|first4=Francisco|date=November 2018|title=Distinguishing between facts and opinions for sentiment analysis: Survey and challenges|url=https://sentic.net/subjectivity-detection.pdf|journal=Information Fusion|volume=44|pages=65–77|doi=10.1016/j.inffus.2017.12.006|via=Elsevier Science Direct|doi-access=free}}</ref> At present, automated learning methods can be further divided into supervised and unsupervised learning. Pattern extraction with machine learning over annotated and unannotated text has become an active topic of academic research.

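As a rough illustration of how a list of subjective indicators can be applied to unannotated text, the sketch below counts indicator words in a sentence and labels it subjective when the count reaches a threshold. The indicator list, the function names, and the threshold are all hypothetical; published systems rely on much larger lexicons and on learned extraction patterns.

```python
from __future__ import annotations

import re

# Hypothetical indicator list; real subjectivity lexicons are far larger.
SUBJECTIVE_CLUES = {
    "need", "should", "believe", "think",
    "wise", "mature", "terrible", "wonderful",
}


def subjective_clues(sentence: str) -> list[str]:
    """Return the indicator words found in the sentence, in order."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [tok for tok in tokens if tok in SUBJECTIVE_CLUES]


def classify(sentence: str, threshold: int = 1) -> str:
    """Label a sentence by counting indicator words against a threshold."""
    hits = len(subjective_clues(sentence))
    return "subjective" if hits >= threshold else "objective"


objective = ("To be elected president of the United States, "
             "a candidate must be at least thirty-five years of age.")
subjective = ("We Americans need to elect a president who is mature "
              "and who is able to make wise decisions.")

print(classify(objective))           # objective
print(subjective_clues(subjective))  # ['need', 'mature', 'wise']
```

A fixed word list of this kind is exactly what runs into the challenges of rule development: it has no notion of metaphor, context, or shifts in usage over time.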
However, researchers have recognized several challenges in developing a fixed set of rules for classifying expressions. Most of these challenges stem from the nature of textual information. Six challenges have been identified: 1) metaphorical expressions, 2) discrepancies in writing, 3) context sensitivity, 4) time sensitivity, 5) representative words with few usages, and 6) ever-growing volume.

# Metaphorical expressions. Text containing metaphorical expressions may degrade extraction performance.<ref name=":13">{{Cite journal|last1=Wiebe|first1=Janyce|last2=Riloff|first2=Ellen|date=July 2011|title=Finding Mutual Benefit between Subjectivity Analysis and Information Extraction|url=https://ieeexplore.ieee.org/document/5959154|journal=IEEE Transactions on Affective Computing|volume=2|issue=4|pages=175–191|doi=10.1109/T-AFFC.2011.19|issn=1949-3045}}</ref> In addition, metaphors can take different forms, which makes them harder to detect.
Line 209:
# Variations in comprehension. During manual annotation, annotators may disagree on whether an instance is subjective or objective because of the ambiguity of language.
# Human error. Manual annotation is meticulous work that requires intense concentration to complete.
# Time-consuming. Manual annotation is laborious. Riloff (1996) reports that a single annotator needs 8 hours to annotate 160 texts.<ref name=":17">{{Cite journal|last=Riloff|first=Ellen|date=1996-08-01|title=An empirical study of automated dictionary construction for information extraction in three domains|url=https://dx.doi.org/10.1016%2F0004-3702%2895%2900123-9|journal=Artificial Intelligence|language=en|volume=85|issue=1|pages=101–134|doi=10.1016/0004-3702(95)00123-9|issn=0004-3702|doi-access=free}}</ref>

Line 215:

# Meta-Bootstrapping (Riloff & Jones, 1999).<ref name=":18">{{Cite journal|last1=Riloff|first1=Ellen|last2=Jones|first2=Rosie|date=July 1999|title=Learning dictionaries for information extraction by multi-level bootstrapping|url=https://aaai.org/Papers/AAAI/1999/AAAI99-068.pdf|journal=AAAI '99/IAAI '99: Proceedings of the Sixteenth National Conference on Artificial Intelligence and the Eleventh Innovative Applications of Artificial Intelligence Conference Innovative Applications of Artificial Intelligence|pages=474–479}}</ref> Step one: generate extraction patterns from predefined rules and score each pattern by the number of seed words it contains. Step two: mark the 5 top-scoring words and add them to the semantic dictionary. Repeat.
# Basilisk (Bootstrapping Approach to SemantIc Lexicon Induction using SemantIc Knowledge) (Thelen & Riloff, 2002).<ref name=":19">{{Cite journal|last1=Thelen|first1=Michael|last2=Riloff|first2=Ellen|date=2002-07-06|title=A bootstrapping method for learning semantic lexicons using extraction pattern contexts|journal=Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10|series=EMNLP '02|volume=10|location=USA|publisher=Association for Computational Linguistics|pages=214–221|doi=10.3115/1118693.1118721|doi-access=free}}</ref> Step one: generate extraction patterns. Step two: move the best patterns from the pattern pool to the candidate word pool. Step three: mark the 10 top-scoring words and add them to the semantic dictionary. Repeat.

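Both procedures above share the same outer loop: score patterns against the current dictionary, score candidate words through the patterns that extract them, add the best few words, and repeat. The sketch below shows only that loop. It is a deliberately simplified stand-in, not the scoring used in either published method; the function name `bootstrap_lexicon`, the data layout, and the toy patterns are assumptions made for illustration.

```python
def bootstrap_lexicon(extractions, seeds, top_k=5, iterations=2):
    """Grow a lexicon from seed words.

    extractions: dict mapping a pattern string to the set of words it extracts.
    seeds: initial words known to belong to the target category.
    """
    lexicon = set(seeds)
    for _ in range(iterations):
        # Step 1: score each pattern by how many lexicon words it extracts.
        pattern_scores = {
            pattern: len(words & lexicon)
            for pattern, words in extractions.items()
        }
        # Step 2: score each unseen word by the summed scores of the
        # patterns that extract it.
        word_scores = {}
        for pattern, words in extractions.items():
            for word in words - lexicon:
                word_scores[word] = word_scores.get(word, 0) + pattern_scores[pattern]
        # Step 3: keep the top_k positively scored words (ties alphabetical).
        ranked = sorted(
            (w for w, s in word_scores.items() if s > 0),
            key=lambda w: (-word_scores[w], w),
        )
        new_words = ranked[:top_k]
        if not new_words:
            break  # nothing left to learn from the current lexicon
        lexicon.update(new_words)
    return lexicon


# Toy extraction results; patterns and words are made up for the example.
extractions = {
    "lives in <X>": {"paris", "london", "denial"},
    "traveled to <X>": {"paris", "tokyo", "london"},
    "afraid of <X>": {"spiders", "heights"},
}

print(sorted(bootstrap_lexicon(extractions, {"paris"}, top_k=1, iterations=1)))
# ['london', 'paris']
```

With a larger `top_k`, the toy run also admits "denial", since it shares a pattern with the seed word. Errors of this kind compound in later iterations, which is one reason to admit only a small number of top-ranked words per round.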
Line 226:
* Stock price prediction: in the finance industry, a classifier aids the prediction model by processing auxiliary information from social media and other textual information from the Internet. A study of Japanese stock prices by Deng et al. showed that models with a subjectivity/objectivity module may perform better than those without one.<ref name=":21">{{Cite journal|last1=Deng|first1=Shangkun|last2=Mitsubuchi|first2=Takashi|last3=Shioda|first3=Kei|last4=Shimada|first4=Tatsuro|last5=Sakurai|first5=Akito|date=December 2011|title=Combining Technical Analysis with Sentiment Analysis for Stock Price Prediction|url=http://dx.doi.org/10.1109/dasc.2011.138|journal=2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing|pages=800–807|publisher=IEEE|doi=10.1109/dasc.2011.138}}</ref>
* Social media analysis.
* Classification of student feedback.<ref name=":22">{{Cite journal|last1=Nguyen|first1=Kiet Van|last2=Nguyen|first2=Vu Duc|last3=Nguyen|first3=Phu X.V.|last4=Truong|first4=Tham T.H.|last5=Nguyen|first5=Ngan L-T.|date=2018-10-01|title=UIT-VSFC: Vietnamese Students' Feedback Corpus for Sentiment Analysis|url=https://ieeexplore.ieee.org/document/8573337|journal=2018 10th International Conference on Knowledge and Systems Engineering (KSE)|pages=19–24|location=Vietnam|publisher=IEEE|doi=10.1109/KSE.2018.8573337}}</ref>
* Document summarisation: a classifier can extract comments about a specified target and gather the opinions expressed on one particular entity.
* Complex question answering: a classifier can break down complex questions by classifying the language subject, the target, and the focused target. Yu et al. (2003) developed sentence-level and document-level clustering to identify opinion pieces.<ref name=":23">{{Cite journal|last1=Yu|first1=Hong|last2=Hatzivassiloglou|first2=Vasileios|date=2003-07-11|title=Towards answering opinion questions: separating facts from opinions and identifying the polarity of opinion sentences|journal=Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing|series=EMNLP '03|location=USA|publisher=Association for Computational Linguistics|pages=129–136|doi=10.3115/1119355.1119372|doi-access=free}}</ref>
* Domain-specific applications.
* Email analysis: a subjective/objective classifier detects spam by tracing the language patterns around target words.
Line 347:
|first2 = E. H.
|title = Identifying and Analyzing Judgment Opinions.
|book-title = Proceedings of the Human Language Technology / North American Association of Computational Linguistics conference (HLT-NAACL 2006). New York, NY.
|year = 2006
|url = http://acl.ldc.upenn.edu/P/P06/P06-2063.pdf
Line 426:

To better fit market needs, the evaluation of sentiment analysis has moved toward more task-based measures, formulated together with representatives of PR agencies and market research professionals. The RepLab evaluation data set, for example, focuses less on the content of the text under consideration and more on the effect of that text on brand reputation.<ref name=":31">
Amigó, Enrique, Adolfo Corujo, Julio Gonzalo, Edgar Meij, and Maarten de Rijke. "Overview of RepLab 2012: Evaluating Online Reputation Management Systems." In CLEF (Online Working Notes/Labs/Workshop). 2012.
</ref><ref name=":32">
Amigó, Enrique, Jorge Carrillo De Albornoz, Irina Chugur, Adolfo Corujo, Julio Gonzalo, Tamara Martín, Edgar Meij, Maarten de Rijke, and Damiano Spina. "Overview of replab 2013: Evaluating online reputation monitoring systems." In International Conference of the Cross-Language Evaluation Forum for European Languages, pp. 333-352. Springer Berlin Heidelberg, 2013.
Line 442:

Research has taken first steps toward this goal. Several research teams at universities around the world are currently working to understand the dynamics of sentiment in online communities through sentiment analysis.<ref name="Collective emotions in cyberspace">CORDIS. [http://cordis.europa.eu/fetch?CALLER=FP7_PROJ_EN&ACTION=D&DOC=1&CAT=PROJ&QUERY=011e4ea33ef2:358b:41dc0328&RCN=89032 "Collective emotions in cyberspace (CYBEREMOTIONS)"], ''European Commission'', 2009-02-03. Retrieved on 2010-12-13.</ref> The CyberEmotions project, for example, recently identified the role of negative emotions in driving social network discussions.<ref name="NewSci_flaming">Condliffe, Jamie. [https://www.newscientist.com/article/dn19821-flaming-drives-online-social-networks.html "Flaming drives online social networks"], ''New Scientist'', 2010-12-07. Retrieved on 2010-12-13.</ref>

Line 448:

Even though short text strings might be a problem, sentiment analysis of microblogs has shown that Twitter can be regarded as a valid online indicator of political sentiment. The political sentiment of tweets corresponds closely to the political positions of parties and politicians, which indicates that the content of Twitter messages plausibly reflects the offline political landscape.<ref name=":34">Tumasjan, Andranik; O.Sprenger, Timm; G.Sandner, Philipp; M.Welpe, Isabell (2010). [http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852 "Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment"]. "Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media"</ref> Sentiment analysis on Twitter has also been shown to capture the public mood behind human reproductive cycles worldwide,<ref name="r25">{{cite journal|doi=10.1038/s41598-017-18262-5|pmid=29269945|pmc=5740080|title=Human Sexual Cycles are Driven by Culture and Match Collective Moods|journal=Scientific Reports|volume=7|issue=1|pages=17973|year=2017|last1=Wood|first1=Ian B.|last2=Varela|first2=Pedro L.|last3=Bollen|first3=Johan|last4=Rocha|first4=Luis M.|last5=Gonçalves-Sá|first5=Joana|bibcode=2017NatSR...717973W|arxiv=1707.03959}}</ref> as well as other issues of public-health relevance such as adverse drug reactions.<ref name="r27">{{cite journal|doi=10.1016/j.jbi.2016.06.007|pmid=27363901|pmc=4981644|title=Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts|journal=Journal of Biomedical Informatics|volume=62|pages=148–158|year=2016|last1=Korkontzelos|first1=Ioannis|last2=Nikfarjam|first2=Azadeh|last3=Shardlow|first3=Matthew|last4=Sarker|first4=Abeed|last5=Ananiadou|first5=Sophia|last6=Gonzalez|first6=Graciela H.}}</ref>
