更改

情感分析 (查看源代码)

2021年8月11日 (三) 17:01的版本

删除220字节、 2021年8月11日 (三) 17:01

→‎Web 2.0

第310行：第310行：

* 电子邮件分析: 主观和客观分类器通过追踪目标单词的语言模式来检测垃圾邮件。

−

=== Feature/aspect-based功能/属性为基础的情感分析 ===

+

=== '''Feature/aspect-based功能/属性为基础的情感分析''' ===

It refers to determining the opinions or sentiments expressed on different features or aspects of entities, e.g., of a cell phone, a digital camera, or a bank.<ref name="HuLiu04">{{cite conference

| first1 = Minqing | last1 = Hu

第537行：第537行：

== Web 2.0 ==

−

~~参阅：~~

+

参阅：声誉管理（Reputation management）、web 2.0和web数据挖掘（web mining）

−

The rise of [[social media]] such as [[blogs]] and [[social network]]s has fueled interest in sentiment analysis. With the proliferation of reviews, ratings, recommendations and other forms of online expression, online opinion has turned into a kind of virtual currency for businesses looking to market their products, identify new opportunities and manage their reputations. As businesses look to automate the process of filtering out the noise, understanding the conversations, identifying the relevant content and actioning it appropriately, many are now looking to the field of sentiment analysis.<ref name="Mining the Web for Feelings, Not Facts">Wright, Alex. [https://www.nytimes.com/2009/08/24/technology/internet/24emotion.html?_r=1 "Mining the Web for Feelings, Not Facts"], ''[[New York Times]]'', 2009-08-23. Retrieved on 2009-10-01.</ref> Further complicating the matter, is the rise of anonymous social media platforms such as [[4chan]] and [[Reddit]].<ref>{{cite web|title=Sentiment Analysis on Reddit|url=http://news.humanele.com/sentiment-analysis-reddit/|access-date=10 October 2014|date=2014-09-30}}</ref> If [[web 2.0]] was all about democratizing publishing, then the next stage of the web may well be based on democratizing [[data mining]] of all the content that is getting published.<ref name="The Future of Social Media Monitoring">Kirkpatrick, Marshall. [https://readwrite.com/2009/04/15/whats_next_in_social_media_monitoring/ "], ''[[ReadWriteWeb]]'', 2009-04-15. Retrieved on 2009-10-01.</ref>

+

The rise of [[social media]] such as [[blogs]] and [[social network]]s has fueled interest in sentiment analysis. With the proliferation of reviews, ratings, recommendations and other forms of online expression, online opinion has turned into a kind of virtual currency for businesses looking to market their products, identify new opportunities and manage their reputations. As businesses look to automate the process of filtering out the noise, understanding the conversations, identifying the relevant content and actioning it appropriately, many are now looking to the field of sentiment analysis.<ref name="Mining the Web for Feelings, Not Facts">Wright, Alex. [https://www.nytimes.com/2009/08/24/technology/internet/24emotion.html?_r=1 "Mining the Web for Feelings, Not Facts"], ''[[New York Times]]'', 2009-08-23. Retrieved on 2009-10-01.</ref> Further complicating the matter is the rise of anonymous social media platforms such as [[4chan]] and [[Reddit]].<ref name=":33">{{cite web|title=Sentiment Analysis on Reddit|url=http://news.humanele.com/sentiment-analysis-reddit/|access-date=10 October 2014|date=2014-09-30}}</ref> If [[web 2.0]] was all about democratizing publishing, then the next stage of the web may well be based on democratizing [[data mining]] of all the content that is getting published.<ref name="The Future of Social Media Monitoring">Kirkpatrick, Marshall. [https://readwrite.com/2009/04/15/whats_next_in_social_media_monitoring/ "], ''[[ReadWriteWeb]]'', 2009-04-15. Retrieved on 2009-10-01.</ref>

−

博客和社交网络等社交媒体的兴起激发了人们对情绪分析的兴趣。随着评论、评级、推荐和其他形式的网络表达的激增，网络舆论已经变成了一种虚拟货币，企业可以通过这种货币来推销自己的产品、寻找新的机会和管理自己的声誉。随着企业寻求自动化过滤噪音的过程，理解对话，识别相关内容并适当活动，许多企业现在正在寻找情绪分析领域。莱特，亚历克斯。“从网上挖掘情感，而不是事实”，纽约时报，2009-08-23。2009-10-01.更复杂的是，匿名社交媒体平台的兴起，如4chan 和 Reddit。如果说 web 2.0完全是关于民主化发布，那么 web 的下一个阶段很可能是基于对所有正在发布的内容的民主化数据挖掘。马歇尔 · 柯克帕特里克。”，ReadWriteWeb，2009-04-15。2009-10-01.

+

博客和社交网络等社交媒体的兴起激发了人们对情感分析的兴趣。随着评论、评级、推荐和其他形式的网络在线表达的激增，网络在线评论语料已经变成了一种虚拟货币，企业可以借此来推销自己的产品、寻找新的机会和管理自己的声誉。随着企业寻求将过滤噪音、理解对话、识别相关内容并采取适当行动的过程的自动化程度加深，许多企业将目光投向了情感分析领域。<ref name="Mining the Web for Feelings, Not Facts" /> 使问题进一步复杂化的是匿名社交媒体平台的崛起，如4chan和Reddit。<ref name=":33" />如果说web 2.0完全是关于民主化发布，那么web的下一个阶段很可能是基于对所有正在发布的内容的民主化数据挖掘。<ref name="The Future of Social Media Monitoring" />

One step towards this aim is accomplished in research. Several research teams in universities around the world currently focus on understanding the dynamics of sentiment in [[Virtual community|e-communities]] through sentiment analysis.<ref name="Collective emotions in cyberspace">CORDIS. [http://cordis.europa.eu/fetch?CALLER=FP7_PROJ_EN&ACTION=D&DOC=1&CAT=PROJ&QUERY=011e4ea33ef2:358b:41dc0328&RCN=89032 "Collective emotions in cyberspace (CYBEREMOTIONS)"], ''[[European Commission]]'', 2009-02-03. Retrieved on 2010-12-13.</ref> The [[CyberEmotions|CyberEmotions project]], for instance, recently identified the role of negative [[emotion]]s in driving social networks discussions.<ref name="NewSci_flaming">Condliffe, Jamie. [https://www.newscientist.com/article/dn19821-flaming-drives-online-social-networks.html "Flaming drives online social networks "], ''[[New Scientist]]'', 2010-12-07. Retrieved on 2010-12-13.</ref>

−

实现这一目标的一个步骤就是研究。目前，世界各地的一些大学的研究团队通过情绪分析专注于了解电子社区中情绪的动态。「网络空间的集体情绪」，欧洲委员会，2009-02-03。2010-12-13.例如，CyberEmotions 项目最近发现了负面情绪在推动社交网络讨论中的作用。“火焰驱动在线社交网络”，《新科学家》，2010-12-07。2010-12-13.

+

在研究中，朝着这个目标迈出了一步。目前，世界各地大学的几个研究团队正致力于通过情感分析来了解网络社区中的情感动态。<ref name="Collective emotions in cyberspace" /> 例如，CyberEmotions项目最近发现了负面情绪在推动社交网络讨论中的作用。<ref name="NewSci_flaming" />

+

The problem is that most sentiment analysis algorithms use simple terms to express sentiment about a product or service. However, cultural factors, linguistic nuances, and differing contexts make it extremely difficult to turn a string of written text into a simple pro or con sentiment.<ref name="Mining the Web for Feelings, Not Facts" /> The fact that humans often disagree on the sentiment of text illustrates how big a task it is for computers to get this right. The shorter the string of text, the harder it becomes.

−

问题是，大多数情绪分析算法使用简单的术语来表达对产品或服务的情绪。然而，文化因素、语言上的细微差别以及不同的语境使得将一串文字转换成简单的赞成或反对的情绪变得极其困难。事实上，人们经常不同意文本的情绪，这说明了计算机要做好这件事是多么艰巨的任务。字符串越短，就越难。

+

问题是，大多数情感分析算法使用简单的术语来表达关于产品或服务的情感。然而，受到文化因素、语言上的细微差别以及不同的语境的影响，将文本字符串转换成简单的赞成或反对的情感变得极其困难。事实上，人类经常对文本的情感产生分歧，这一事实说明了计算机要做好这项工作是一项多么艰巨的任务。文本字符串越短，难度就越大。

+

−

Even though short text strings might be a problem, sentiment analysis within [[microblogging]] has shown that [[Twitter]] can be seen as a valid online indicator of political sentiment. Tweets' political sentiment demonstrates close correspondence to parties' and politicians' political positions, indicating that the content of Twitter messages plausibly reflects the offline political landscape.<ref>Tumasjan, Andranik; O.Sprenger, Timm; G.Sandner, Philipp; M.Welpe, Isabell (2010). [http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852 "Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment"]. "Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media"</ref> Furthermore, sentiment analysis on [[Twitter]] has also been shown to capture the public mood behind human reproduction cycles on a planetary scale~~{{peacock term|date=June 2018}}~~,<ref name="r25">{{cite journal|doi=10.1038/s41598-017-18262-5|pmid=29269945|pmc=5740080|title=Human Sexual Cycles are Driven by Culture and Match Collective Moods|journal=Scientific Reports|volume=7|issue=1|pages=17973|year=2017|last1=Wood|first1=Ian B.|last2=Varela|first2=Pedro L.|last3=Bollen|first3=Johan|last4=Rocha|first4=Luis M.|last5=Gonçalves-Sá|first5=Joana|bibcode=2017NatSR...717973W|arxiv=1707.03959}}</ref> as well as other problems of public-health relevance such as adverse drug reactions.<ref name="r27">{{cite journal|doi=10.1016/j.jbi.2016.06.007|pmid=27363901|pmc=4981644|title=Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts|journal=Journal of Biomedical Informatics|volume=62|pages=148–158|year=2016|last1=Korkontzelos|first1=Ioannis|last2=Nikfarjam|first2=Azadeh|last3=Shardlow|first3=Matthew|last4=Sarker|first4=Abeed|last5=Ananiadou|first5=Sophia|last6=Gonzalez|first6=Graciela H.}}</ref>

+

Even though short text strings might be a problem, sentiment analysis within [[microblogging]] has shown that [[Twitter]] can be seen as a valid online indicator of political sentiment. Tweets' political sentiment demonstrates close correspondence to parties' and politicians' political positions, indicating that the content of Twitter messages plausibly reflects the offline political landscape.<ref name=":34">Tumasjan, Andranik; O.Sprenger, Timm; G.Sandner, Philipp; M.Welpe, Isabell (2010). [http://www.aaai.org/ocs/index.php/ICWSM/ICWSM10/paper/viewFile/1441/1852 "Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment"]. "Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media"</ref> Furthermore, sentiment analysis on [[Twitter]] has also been shown to capture the public mood behind human reproduction cycles on a planetary scale,<ref name="r25">{{cite journal|doi=10.1038/s41598-017-18262-5|pmid=29269945|pmc=5740080|title=Human Sexual Cycles are Driven by Culture and Match Collective Moods|journal=Scientific Reports|volume=7|issue=1|pages=17973|year=2017|last1=Wood|first1=Ian B.|last2=Varela|first2=Pedro L.|last3=Bollen|first3=Johan|last4=Rocha|first4=Luis M.|last5=Gonçalves-Sá|first5=Joana|bibcode=2017NatSR...717973W|arxiv=1707.03959}}</ref> as well as other problems of public-health relevance such as adverse drug reactions.<ref name="r27">{{cite journal|doi=10.1016/j.jbi.2016.06.007|pmid=27363901|pmc=4981644|title=Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts|journal=Journal of Biomedical Informatics|volume=62|pages=148–158|year=2016|last1=Korkontzelos|first1=Ioannis|last2=Nikfarjam|first2=Azadeh|last3=Shardlow|first3=Matthew|last4=Sarker|first4=Abeed|last5=Ananiadou|first5=Sophia|last6=Gonzalez|first6=Graciela H.}}</ref>

−

尽管短文字符串可能是个问题，微博内的情绪分析已经表明，Twitter 可以被视为一个有效的政治情绪在线指标。推特的政治情绪表明，它与政党和政客的政治立场非常吻合，这表明推特信息的内容合理地反映了线下的政治格局。安德拉尼克; O.Sprenger，Timm; G.Sandner，Philipp; M.Welpe，Isabell (2010)。“用 Twitter 预测选举: 140个人物揭示的政治情绪”。此外，推特上的情绪分析还显示，在全球范围内，人类生殖周期背后的公众情绪，以及其他与公共健康相关的问题，如药物不良反应。

+

尽管短文本字符串可能是个问题，但对微型博客的情感分析已经表明，Twitter可以被视为一个有效的政治情感在线指标。推文所表达的政治情感与政党和政客的政治立场非常吻合，这表明Twitter消息的内容合理地反映了线下的政治格局。<ref name=":34" /> 此外，Twitter上的情感分析也被证明可以捕捉到全球范围内人类生殖周期背后的公众情感<ref name="r25" /> 以及其他与公共健康相关的问题（如药物不良反应）背后的公众情感<ref name="r27" />。

−

== Application in recommender systems ==

+

== Application in recommender systems 推荐系统中的应用 ==

+

参阅：推荐系统（Recommender system）

+

For a [[recommender system]], sentiment analysis has been proven to be a valuable technique. A [[recommender system]] aims to predict a target user's preference for an item. Mainstream recommender systems work on explicit data sets. For example, [[collaborative filtering]] works on the rating matrix, and [[content-based filtering]] works on the [[Metadata|meta-data]] of the items.

−

一个推荐系统以来，情绪分析已经被证明是一种有价值的技术。推荐系统的目的是预测目标用户对某个商品的偏好。主流推荐系统工作在显式数据集上。例如，协同过滤工作在评级矩阵上，基于内容的过滤工作在项目的元数据上。

+

对于一个推荐系统来说，情感分析已经被证明是一种有价值的技术。推荐系统的目的是预测目标用户对某个项目的偏好。'''主流推荐系统是基于显性数据集工作的。例如，协同过滤（collaborative filtering）基于评分矩阵工作，基于内容的过滤（content-based filtering）基于项目元数据工作。'''

−

In many [[social networking service]]s or [[e-commerce]] websites, users can provide text review, comment or feedback to the items. These user-generated text provide a rich source of user's sentiment opinions about numerous products and items. Potentially, for an item, such text can reveal both the related feature/aspects of the item and the users' sentiments on each feature.<ref>{{cite journal|url=https://pdfs.semanticscholar.org/8f1b/9b97183b8aa2caa0fb6c9563b14daabe8316.pdf|archive-url=https://web.archive.org/web/20180524004208/https://pdfs.semanticscholar.org/8f1b/9b97183b8aa2caa0fb6c9563b14daabe8316.pdf|url-status=dead|archive-date=2018-05-24|first1=Huifeng|last1=Tang|first2=Songbo|last2=Tan|first3=Xueqi|last3=Cheng|title=A survey on sentiment detection of reviews|journal=Expert Systems with Applications|volume=36|issue=7|year=2009|pages=10760–10773|doi=10.1016/j.eswa.2009.02.063|s2cid=2178380}}</ref> The item's feature/aspects described in the text play the same role with the meta-data in [[content-based filtering]], but the former are more valuable for the recommender system. Since these features are broadly mentioned by users in their reviews, they can be seen as the most crucial features that can significantly influence the user's experience on the item, while the meta-data of the item (usually provided by the producers instead of consumers) may ignore features that are concerned by the users. For different items with common features, a user may give different sentiments. Also, a feature of the same item may receive different sentiments from different users. Users' sentiments on the features can be regarded as a multi-dimensional rating score, reflecting their preference on the items.

+

In many [[social networking service]]s or [[e-commerce]] websites, users can provide text reviews, comments or feedback on the items. This user-generated text provides a rich source of users' sentiment opinions about numerous products and items. Potentially, for an item, such text can reveal both the related features/aspects of the item and the users' sentiments on each feature.<ref name=":35">{{cite journal|url=https://pdfs.semanticscholar.org/8f1b/9b97183b8aa2caa0fb6c9563b14daabe8316.pdf|archive-url=https://web.archive.org/web/20180524004208/https://pdfs.semanticscholar.org/8f1b/9b97183b8aa2caa0fb6c9563b14daabe8316.pdf|url-status=dead|archive-date=2018-05-24|first1=Huifeng|last1=Tang|first2=Songbo|last2=Tan|first3=Xueqi|last3=Cheng|title=A survey on sentiment detection of reviews|journal=Expert Systems with Applications|volume=36|issue=7|year=2009|pages=10760–10773|doi=10.1016/j.eswa.2009.02.063|s2cid=2178380}}</ref> The item's features/aspects described in the text play the same role as the meta-data in [[content-based filtering]], but the former are more valuable for the recommender system. Since these features are broadly mentioned by users in their reviews, they can be seen as the most crucial features that can significantly influence the user's experience of the item, while the meta-data of the item (usually provided by the producers instead of consumers) may ignore features that users care about. For different items with common features, a user may give different sentiments. Also, a feature of the same item may receive different sentiments from different users. Users' sentiments on the features can be regarded as a multi-dimensional rating score, reflecting their preference for the items.

−

在许多社交网络服务或电子商务网站，用户可以提供文本审查，评论或反馈的项目。这些用户生成的文本提供了丰富的来源，用户对许多产品和项目的情感意见。对于一个项目，这样的文本可以显示项目的相关特性/~~方面以及用户对每个特性的看法。在基于内容的过滤中，文本中描述的条目的特征~~/方面与元数据起着同样的作用，但前者对推荐系统更有价值。由于用户在评论中广泛提到这些功能，它们可以被视为能够显著影响用户对产品的体验的最关键的功能，而产品的元数据(通常由生产者而不是消费者提供)可能忽略用户关心的功能。对于具有共同特征的不同项目，用户可能会给出不同的感受。而且，同一个项目的某个特性可能会收到不同用户的不同意见。用户对特征的感受可以看作是一个多维度的评分分值，反映了用户对特征的偏好。

+

在许多社交网络服务或电子商务网站，用户可以对商品提供文本评论、意见或反馈。这些用户生成的文本提供了丰富的用户对众多产品和商品的情感意见。对于一个商品而言，这样的文本可以同时显示商品的相关功能/属性以及用户对每个特性的看法。<ref name=":35" /> 在基于内容的过滤中，文本中描述的商品的功能/属性与元数据起着同样的作用，但前者对推荐系统更有价值。由于用户在评论中广泛提到这些特性，它们可以被视为能够显著影响用户对产品的体验的最关键的特性，而产品的元数据（通常由生产者而不是消费者提供）则可能忽略用户关心的特性。对于具有共同特征的不同商品，用户可能会有不同的情感意见。而且，同一个商品的同一特性也可能会从不同用户那里得到不同的情感意见。用户对特征的情感可以看作是一个多维度的评分分值，它反映了用户对商品的偏好。
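
As a minimal sketch of the "multi-dimensional rating score" idea, the Python below averages per-aspect sentiment for each item. The `opinions` tuples, the item and aspect names, and the [-1, 1] sentiment scale are all hypothetical illustration values, not data or a method taken from the cited survey; in practice the tuples would come from an aspect-based sentiment extractor.

```python
from collections import defaultdict

# Hypothetical input: (user, item, aspect, sentiment) tuples, sentiment in [-1, 1].
opinions = [
    ("alice", "phone_a", "battery", 1.0),
    ("bob", "phone_a", "battery", 0.5),
    ("alice", "phone_a", "screen", -0.5),
    ("bob", "phone_b", "battery", -1.0),
]


def aspect_ratings(opinions):
    """Average the sentiment per (item, aspect): one rating dimension per aspect."""
    scores = defaultdict(lambda: defaultdict(list))
    for _user, item, aspect, sentiment in opinions:
        scores[item][aspect].append(sentiment)
    return {
        item: {aspect: sum(vals) / len(vals) for aspect, vals in aspects.items()}
        for item, aspects in scores.items()
    }


ratings = aspect_ratings(opinions)
print(ratings["phone_a"])  # per-aspect averages for one item
```

Each item ends up with a small vector of aspect scores rather than a single star rating, which is the sense in which the sentiment can be read as multi-dimensional.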

−

Based on the feature/aspects and the sentiments extracted from the user-generated text, a hybrid recommender system can be constructed.<ref name=":0">Jakob, Niklas, et al. "Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations." ''Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion''. ACM, 2009.</ref> There are two types of motivation to recommend a candidate item to a user. The first motivation is the candidate item have numerous common features with the user's preferred items,<ref>{{cite journal|first1=Hu|last1=Minqing|first2=Bing|last2=Liu|title=Mining opinion features in customer reviews|journal=AAAI|volume=4|issue=4|year=2004|s2cid=5724860|url=https://pdfs.semanticscholar.org/ee6c/726b55c66d4c222556cfae62a4eb69aa86b7.pdf|archive-url=https://web.archive.org/web/20180524004041/https://pdfs.semanticscholar.org/ee6c/726b55c66d4c222556cfae62a4eb69aa86b7.pdf|url-status=dead|archive-date=2018-05-24}}</ref> while the second motivation is that the candidate item receives a high sentiment on its features. For a preferred item, it is reasonable to believe that items with the same features will have a similar function or utility. So, these items will also likely to be preferred by the user. On the other hand, for a shared feature of two candidate items, other users may give positive sentiment to one of them while giving negative sentiment to another. Clearly, the high evaluated item should be recommended to the user. Based on these two motivations, a combination ranking score of similarity and sentiment rating can be constructed for each candidate item.<ref name=":0" />

+

Based on the features/aspects and the sentiments extracted from the user-generated text, a hybrid recommender system can be constructed.<ref name=":0">Jakob, Niklas, et al. "Beyond the stars: exploiting free-text user reviews to improve the accuracy of movie recommendations." ''Proceedings of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion''. ACM, 2009.</ref> There are two types of motivation to recommend a candidate item to a user. The first motivation is that the candidate item has numerous common features with the user's preferred items,<ref name=":36">{{cite journal|first1=Hu|last1=Minqing|first2=Bing|last2=Liu|title=Mining opinion features in customer reviews|journal=AAAI|volume=4|issue=4|year=2004|s2cid=5724860|url=https://pdfs.semanticscholar.org/ee6c/726b55c66d4c222556cfae62a4eb69aa86b7.pdf|archive-url=https://web.archive.org/web/20180524004041/https://pdfs.semanticscholar.org/ee6c/726b55c66d4c222556cfae62a4eb69aa86b7.pdf|url-status=dead|archive-date=2018-05-24}}</ref> while the second motivation is that the candidate item receives a high sentiment on its features. For a preferred item, it is reasonable to believe that items with the same features will have a similar function or utility. So, these items are also likely to be preferred by the user. On the other hand, for a shared feature of two candidate items, other users may give positive sentiment to one of them while giving negative sentiment to the other. Clearly, the more highly evaluated item should be recommended to the user. Based on these two motivations, a combined ranking score of similarity and sentiment rating can be constructed for each candidate item.<ref name=":0" />

−

~~基于特征~~/~~方面和从用户生成的文本中提取的情感，可以构造一个混合推荐系统。雅各布，尼克拉斯，等等。“超越明星~~: 利用免费文本用户评论来提高电影推荐的准确性。”第一届国际信息和通信技术会议论文集——民意情绪分析。美国计算机协会，2009。向用户推荐候选商品有两种动机。第一个动机是候选项目与用户偏好项目具有许多共同特征，第二个动机是候选项目对其特征的高度评价。对于一个首选项目，有理由相信具有相同特性的项目将具有类似的功能或实用性。因此，这些项目也可能是首选的用户。另一方面，对于两个候选项目的共同特征，其他用户可能给予其中一个正面的情绪，而给予另一个负面的情绪。显然，应该向用户推荐评价较高的项目。基于这两个动机，可以为每个候选项目建立相似度和情感评分的组合排序评分。

+

基于功能/属性和从用户生成的文本中提取的情感，可以构造一个混合推荐系统。<ref name=":0" /> 向用户推荐候选商品的动机有两种。第一种动机是候选商品与用户偏好商品具有许多共同特征，<ref name=":36" /> 第二种动机是候选商品在其特征上获得了高度的情感评价。对于一个偏好商品来说，有理由相信具有相同特性的商品将具有类似的功能或实用性。因此，这些商品也将有可能被用户所青睐。另一方面，对于两个候选商品的共同特征，其他用户可能给予其中一个正面的评价，而给予另一个负面的评价。显然，应该向用户推荐评价较高的商品。基于这两种动机，可以为每个候选商品建立相似度和情感评分的组合排序评分。<ref name=":0" />
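
One way to picture the combined ranking score is the sketch below. It is not the formula from the cited paper: the Jaccard similarity, the linear weighting with `alpha`, the [-1, 1] sentiment scale, and all item and feature names are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Share of features two items have in common (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_score(candidate_features, preferred_features, aspect_sentiment, alpha=0.5):
    """Blend feature similarity with mean aspect sentiment (assumed to lie in [-1, 1])."""
    similarity = jaccard(candidate_features, preferred_features)
    if aspect_sentiment:
        mean = sum(aspect_sentiment.values()) / len(aspect_sentiment)
    else:
        mean = 0.0  # no opinions: treat the item as neutral
    sentiment = (mean + 1) / 2  # rescale [-1, 1] to [0, 1]
    return alpha * similarity + (1 - alpha) * sentiment


def rank(candidates, preferred_features, alpha=0.5):
    """Return candidate names sorted by descending combined score."""
    scored = {
        name: combined_score(c["features"], preferred_features, c["sentiment"], alpha)
        for name, c in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical data: features of the user's preferred items and two candidates.
preferred = {"battery", "screen", "camera"}
candidates = {
    "phone_a": {"features": {"battery", "screen"},
                "sentiment": {"battery": 0.75, "screen": -0.5}},
    "phone_b": {"features": {"battery", "screen", "camera"},
                "sentiment": {"battery": -1.0}},
}
print(rank(candidates, preferred))
```

With these made-up numbers, `phone_b` matches the preferred features exactly but is rated poorly by other users, so the better-reviewed `phone_a` ranks first. Raising `alpha` shifts the weight back toward feature similarity.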

−

Except for the difficulty of the sentiment analysis itself, applying sentiment analysis on reviews or feedback also faces the challenge of spam and biased reviews. One direction of work is focused on evaluating the helpfulness of each review.<ref>{{cite book|first1=Yang|last1=Liu|first2=Xiangji|last2=Huang|first3=Aijun|last3=An|first4=Xiaohui|last4=Yu|chapter-url=http://www.yorku.ca/xhyu/papers/ICDM2008.pdf|chapter=Modeling and predicting the helpfulness of online reviews|year=2008|title=ICDM'08. Eighth IEEE international conference on Data mining|pages=443–452|publisher= IEEE|doi=10.1109/ICDM.2008.94|isbn=978-0-7695-3502-9|s2cid=18235238}}</ref> Review or feedback poorly written is hardly helpful for recommender system. Besides, a review can be designed to hinder sales of a target product, thus be harmful to the recommender system even it is well written.

+

Apart from the difficulty of the sentiment analysis itself, applying sentiment analysis to reviews or feedback also faces the challenge of spam and biased reviews. One direction of work is focused on evaluating the helpfulness of each review.<ref name=":37">{{cite book|first1=Yang|last1=Liu|first2=Xiangji|last2=Huang|first3=Aijun|last3=An|first4=Xiaohui|last4=Yu|chapter-url=http://www.yorku.ca/xhyu/papers/ICDM2008.pdf|chapter=Modeling and predicting the helpfulness of online reviews|year=2008|title=ICDM'08. Eighth IEEE international conference on Data mining|pages=443–452|publisher= IEEE|doi=10.1109/ICDM.2008.94|isbn=978-0-7695-3502-9|s2cid=18235238}}</ref> A poorly written review or piece of feedback is hardly helpful to a recommender system. Besides, a review can be designed to hinder sales of a target product, and thus be harmful to the recommender system even if it is well written.

−

除了情感分析本身的困难之外，对评论或反馈进行情感分析也面临着垃圾评论和有偏见的评论的挑战。其中一个工作方向是评估每个审查的有用性。写得不好的评论或反馈对推荐系统几乎没有任何帮助。此外，审查可能被设计成阻碍目标产品的销售，因此即使它写得很好也会对推荐系统产品造成伤害。

+

除了情感分析本身的困难之外，对评论或反馈进行情感分析还面临着垃圾评论和有偏见的评论的挑战。其中一个工作方向是评估每条评论的有用性，<ref name=":37" />因为粗劣的评论或反馈对推荐系统几乎没有任何帮助。此外，评论可能被刻意设计成阻碍目标产品销售，因此即使它写得很好也会对推荐系统造成伤害。

−

Researchers also found that long and short forms of user-generated text should be treated differently. An interesting result shows that short-form reviews are sometimes more helpful than long-form,<ref>{{cite book|doi=10.1145/1871437.1871741|last1=Bermingham|first1=Adam|last2=Smeaton|first2=Alan F.|title=Classifying sentiment in microblogs: is brevity an advantage?|journal=Proceedings of the 19th ACM International Conference on Information and Knowledge Management|pages=1833|year=2010|isbn=9781450300995|s2cid=2084603|url=http://doras.dcu.ie/15663/1/cikm1079-bermingham.pdf}}</ref> because it is easier to filter out the noise in a short-form text. For the long-form text, the growing length of the text does not always bring a proportionate increase in the number of features or sentiments in the text.

+

Researchers also found that long and short forms of user-generated text should be treated differently. An interesting result shows that short-form reviews are sometimes more helpful than long-form,<ref name=":38">{{cite book|doi=10.1145/1871437.1871741|last1=Bermingham|first1=Adam|last2=Smeaton|first2=Alan F.|title=Classifying sentiment in microblogs: is brevity an advantage?|journal=Proceedings of the 19th ACM International Conference on Information and Knowledge Management|pages=1833|year=2010|isbn=9781450300995|s2cid=2084603|url=http://doras.dcu.ie/15663/1/cikm1079-bermingham.pdf}}</ref> because it is easier to filter out the noise in a short-form text. For the long-form text, the growing length of the text does not always bring a proportionate increase in the number of features or sentiments in the text.

−

研究人员还发现，用户生成的长文本和短文本应该区别对待。一个有趣的结果表明，短形式的评论有时比长形式的评论更有帮助，因为它更容易过滤掉短形式文本中的干扰。对于长篇文本，文本长度的增长并不总是带来文本中特征或情感数量的相应增加。

+

研究人员还发现，应该用不同的方法处理用户生成的长文本和短文本。一个有趣的结果表明，短形式的评论有时比长形式的评论更有帮助，<ref name=":38" /> 因为它更容易过滤掉短形式文本中的干扰。对于长文本而言，文本长度的增长并不总是带来文本中特征或情感数量的相应增加。

−

Lamba & Madhusudhan<ref>{{cite journal |last1=Lamba |first1=Manika |last2=Madhusudhan |first2=Margam |title=Application of sentiment analysis in libraries to provide temporal information service: a case study on various facets of productivity |journal=Social Network Analysis and Mining |year=2018 |volume=8 |issue=1|pages=1–12|doi=10.1007/s13278-018-0541-y |s2cid=53047128 }}</ref> introduce a nascent way to cater the information needs of today’s library users by repackaging the results from sentiment analysis of social media platforms like Twitter and provide it as a consolidated time-based service in different formats. Further, they propose a new way of conducting marketing in libraries using social media mining and sentiment analysis.

+

Lamba & Madhusudhan<ref name=":39">{{cite journal |last1=Lamba |first1=Manika |last2=Madhusudhan |first2=Margam |title=Application of sentiment analysis in libraries to provide temporal information service: a case study on various facets of productivity |journal=Social Network Analysis and Mining |year=2018 |volume=8 |issue=1|pages=1–12|doi=10.1007/s13278-018-0541-y |s2cid=53047128 }}</ref> introduce a nascent way to cater to the information needs of today’s library users by repackaging the results of sentiment analysis of social media platforms like Twitter and providing them as a consolidated time-based service in different formats. Further, they propose a new way of conducting marketing in libraries using social media mining and sentiment analysis.

−

Lamba & Madhusudhan 介绍了一种新的方法来满足当今图书馆用户的信息需求，方法是将 Twitter 等社交媒体平台的情绪分析结果重新打包，以不同的格式提供综合的基于时间的服务。此外，他们还提出了一种利用社会媒体挖掘和情感分析在图书馆进行营销的新方法。

+

Lamba和Madhusudhan<ref name=":39" /> 介绍了一种新的方法，即通过重新打包Twitter等社交媒体平台的情感分析结果，并以不同的形式提供基于时间的综合服务，来满足当今图书馆用户的信息需求。此外，他们还提出了一种利用社交媒体挖掘和情感分析在图书馆进行营销的新方法。

−

==See ~~also~~==

+

==See also参阅==

* [[Emotion recognition]]

* [[Market sentiment]]

* [[Stylometry]]

−

* 情感识别

−

* ~~市场情绪~~

+

* 市场情感

* 文体学

== References ==

−

~~{{Natural Language Processing}}~~

第602行：第604行：

[[Category:Social media]]

[[Category:Polling]]

−

~~Category:Natural language processing~~

−

~~Category:Affective computing~~

−

~~Category:Social media~~

−

~~Category:Polling~~

−

~~类别: 自然语言处理类别: 情感计算类别: 社会媒体类别: 轮询~~

−

~~<noinclude>~~

−

~~This page was moved from [[wikipedia:en:Sentiment analysis]]. Its edit history can be viewed at [[情感分析/edithistory]]</noinclude>~~

−

[[Category:待整理页面]]

Kuangmy

54

个编辑
