Line 44: |
Line 44: |
| | | |
| ===Computational content analysis=== | | ===Computational content analysis=== |
| + | Content analysis has long been a traditional method in the social sciences and media studies. The automation of content analysis has brought about a "big data" revolution in the field, with studies of social media and newspaper content covering millions of news items.<ref>{{cite journal|author1=I. Flaounas|author2=M. Turchi|author3=O. Ali|author4=N. Fyson|author5=T. De Bie|author6=N. Mosdell|author7=J. Lewis|author8=N. Cristianini|title=The Structure of EU Mediasphere|journal=PLOS One|volume=5|issue=12|pages=e14243|year=2010|doi=10.1371/journal.pone.0014243|url=https://orca-mwe.cf.ac.uk/50732/1/Flaounas%202010.pdf|pmid=21170383|pmc=2999531|bibcode=2010PLoSO...514243F}}</ref><ref>{{cite journal|title=Nowcasting Events from the Social Web with Statistical Learning|author1=V Lampos|author2=N Cristianini|journal=ACM Transactions on Intelligent Systems and Technology |volume=3|issue=4|page=72|doi=10.1145/2337542.2337557|year=2012|url=http://www.lampos.net/sites/default/files/papers/lampos2012nowcasting.pdf}}</ref><ref>{{cite conference|title=NOAM: news outlets analysis and monitoring system|author1=I. Flaounas|author2=O. Ali|author3=M. Turchi|author4=T Snowsill|author5=F Nicart|author6=T De Bie|author7=N Cristianini|conference=Proc. of the 2011 ACM SIGMOD international conference on Management of data|year=2011|url=http://www.tijldebie.net/system/files/SIGMOD_11_demo_Ilias.pdf|doi=10.1145/1989323.1989474}}</ref><ref>{{cite book|author=N Cristianini|title=''Combinatorial Pattern Matching''|pages=2–13|year=2011|volume=6661|series= Lecture Notes in Computer Science|isbn=978-3-642-21457-8|doi=10.1007/978-3-642-21458-5_2|chapter=Automatic Discovery of Patterns in Media Content|citeseerx=10.1.1.653.9525}}</ref><ref>{{Cite journal|last=Lansdall-Welfare|first=Thomas|last2=Sudhahar|first2=Saatviga|last3=Thompson|first3=James|last4=Lewis|first4=Justin|last5=Team|first5=FindMyPast Newspaper|last6=Cristianini|first6=Nello|date=2017-01-09|title=Content analysis of 150 years of British periodicals|url=http://www.pnas.org/content/early/2017/01/03/1606380114|journal=Proceedings of the National Academy of Sciences|volume=114|issue=4|language=en|pages=E457–E465|doi=10.1073/pnas.1606380114|issn=0027-8424|pmid=28069962|pmc=5278459}}</ref> Using text mining methods over millions of documents, gender bias, readability, content similarity, reader preferences, and even mood can be analyzed. The study of readability, gender bias and topic bias by Flaounas et al.<ref>{{cite journal|author1=I. Flaounas|author2=O. Ali|author3=M. Turchi|author4=T. Lansdall-Welfare|author5=T. De Bie|author6=N. Mosdell|author7=J. Lewis|author8=N. Cristianini|title=Research methods in the age of digital journalism|journal=Digital Journalism|year=2012|doi=10.1080/21670811.2012.714928|volume=1|pages=102–116}}</ref> showed how different topics have different gender biases and levels of readability; the possibility of detecting mood shifts in a large population by analyzing Twitter content has also been demonstrated.<ref>{{cite conference|title=Effects of the Recession on Public Mood in the UK|author=T Lansdall-Welfare|author2=V Lampos|author3=N Cristianini|series=Mining Social Network Dynamics (MSND) session on Social Media Applications|doi=10.1145/2187980.2188264|conference=Proceedings of the 21st International Conference on World Wide Web|pages=1221–1226|location=New York, NY, USA|url=http://www.cs.bris.ac.uk/Publications/Papers/2001521.pdf}}</ref> |
| | | |
− | [[Content analysis]] has been a traditional part of social sciences and media studies for a long time. The automation of content analysis has allowed a "[[big data]]" revolution to take place in that field, with studies in social media and newspaper content that include millions of news items. [[Gender bias]], [[readability]], content similarity, reader preferences, and even mood have been analyzed based on [[text mining]] methods over millions of documents.<ref>{{cite journal|author1=I. Flaounas|author2=M. Turchi|author3=O. Ali|author4=N. Fyson|author5=T. De Bie|author6=N. Mosdell|author7=J. Lewis|author8=N. Cristianini|title=The Structure of EU Mediasphere|journal=PLOS One|volume=5|issue=12|pages=e14243|year=2010|doi=10.1371/journal.pone.0014243|url=https://orca-mwe.cf.ac.uk/50732/1/Flaounas%202010.pdf|pmid=21170383|pmc=2999531|bibcode=2010PLoSO...514243F}}</ref><ref>{{cite journal|title=Nowcasting Events from the Social Web with Statistical Learning|author1=V Lampos|author2=N Cristianini|journal=ACM Transactions on Intelligent Systems and Technology |volume=3|issue=4|page=72|doi=10.1145/2337542.2337557|year=2012|url=http://www.lampos.net/sites/default/files/papers/lampos2012nowcasting.pdf}}</ref><ref>{{cite conference|title=NOAM: news outlets analysis and monitoring system|author1=I. Flaounas|author2=O. Ali|author3=M. Turchi|author4=T Snowsill|author5=F Nicart|author6=T De Bie|author7=N Cristianini|conference=Proc. 
of the 2011 ACM SIGMOD international conference on Management of data|year=2011|url=http://www.tijldebie.net/system/files/SIGMOD_11_demo_Ilias.pdf|doi=10.1145/1989323.1989474}}</ref><ref>{{cite book|author=N Cristianini|title=''Combinatorial Pattern Matching''|pages=2–13|year=2011|volume=6661|series= Lecture Notes in Computer Science|isbn=978-3-642-21457-8|doi=10.1007/978-3-642-21458-5_2|chapter=Automatic Discovery of Patterns in Media Content|citeseerx=10.1.1.653.9525}}</ref><ref>{{Cite journal|last=Lansdall-Welfare|first=Thomas|last2=Sudhahar|first2=Saatviga|last3=Thompson|first3=James|last4=Lewis|first4=Justin|last5=Team|first5=FindMyPast Newspaper|last6=Cristianini|first6=Nello|date=2017-01-09|title=Content analysis of 150 years of British periodicals|url=http://www.pnas.org/content/early/2017/01/03/1606380114|journal=Proceedings of the National Academy of Sciences|volume=114|issue=4|language=en|pages=E457–E465|doi=10.1073/pnas.1606380114|issn=0027-8424|pmid=28069962|pmc=5278459}}</ref> The analysis of readability, gender bias and topic bias was demonstrated in Flaounas et al.<ref>{{cite journal|author1=I. Flaounas|author2=O. Ali|author3=M. Turchi|author4=T. Lansdall-Welfare|author5=T. De Bie|author6=N. Mosdell|author7=J. Lewis|author8=N. 
Cristianini|title=Research methods in the age of digital journalism|journal=Digital Journalism|year=2012|doi=10.1080/21670811.2012.714928|volume=1|pages=102–116}}</ref> showing how different topics have different gender biases and levels of readability; the possibility of detecting mood shifts in a large population by analysing Twitter content was demonstrated as well.<ref>{{cite conference|title=Effects of the Recession on Public Mood in the UK|author=T Lansdall-Welfare|author2=V Lampos|author3=N Cristianini|series=Mining Social Network Dynamics (MSND) session on Social Media Applications|doi=10.1145/2187980.2188264|conference=Proceedings of the 21st International Conference on World Wide Web|pages=1221–1226|location=New York, NY, USA|url=http://www.cs.bris.ac.uk/Publications/Papers/2001521.pdf}}</ref>
| |
| | | |
− | Content analysis has long been a traditional part of the social sciences and media studies. The automation of content analysis has allowed a "big data" revolution to take place in the field, with studies of social media and newspaper content that include millions of news items. Gender bias, readability, content similarity, reader preferences, and even mood have been analyzed using text mining methods over millions of documents. Flaounas et al. demonstrated the analysis of readability, gender bias and topic bias, showing how different topics have different gender biases and levels of readability; the possibility of detecting mood shifts in a large population by analysing Twitter content was demonstrated as well.
| + | The analysis of vast quantities of historical newspaper content was pioneered by Dzogang et al.,<ref>{{Cite journal|last=Dzogang|first=Fabon|last2=Lansdall-Welfare|first2=Thomas|last3=Team|first3=FindMyPast Newspaper|last4=Cristianini|first4=Nello|date=2016-11-08|title=Discovering Periodic Patterns in Historical News|journal=PLOS One|volume=11|issue=11|pages=e0165736|doi=10.1371/journal.pone.0165736|issn=1932-6203|pmc=5100883|pmid=27824911|bibcode=2016PLoSO..1165736D}}</ref> who showed how '''periodic structures''' can be automatically discovered in historical newspapers. A similar analysis performed on social media also revealed strongly periodic structures. Reference: [https://core.ac.uk/download/pdf/83929129.pdf Seasonal Fluctuations in Collective Mood Revealed by Wikipedia Searches and Twitter Posts] |
− | | |
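Content analysis has long been a traditional method in the social sciences and media studies. The automation of content analysis has brought about a "big data" revolution in the field, with studies of social media and newspaper content covering millions of news items. Using text mining methods over millions of documents, '''gender bias''', '''readability''', '''content similarity''', '''reader preferences''', and even '''mood''' can be analyzed. The study of readability, gender bias and '''topic bias''' by Flaounas et al. showed how different topics have different gender biases and levels of readability; the possibility of detecting mood shifts in a large population by analyzing Twitter content has also been demonstrated.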
| |
− | | |
− | | |
− | | |
The analysis of vast quantities of historical newspaper content was pioneered by Dzogang et al.,<ref>{{Cite journal|last=Dzogang|first=Fabon|last2=Lansdall-Welfare|first2=Thomas|last3=Team|first3=FindMyPast Newspaper|last4=Cristianini|first4=Nello|date=2016-11-08|title=Discovering Periodic Patterns in Historical News|journal=PLOS One|volume=11|issue=11|pages=e0165736|doi=10.1371/journal.pone.0165736|issn=1932-6203|pmc=5100883|pmid=27824911|bibcode=2016PLoSO..1165736D}}</ref> who showed how periodic structures can be automatically discovered in historical newspapers. A similar analysis performed on social media again revealed strongly periodic structures.<ref>[https://core.ac.uk/download/pdf/83929129.pdf Seasonal Fluctuations in Collective Mood Revealed by Wikipedia Searches and Twitter Posts] F Dzogang, T Lansdall-Welfare, N Cristianini - 2016 IEEE International Conference on Data Mining, Workshop on ''Data Mining'' in Human Activity Analysis
| |
− | | |
The analysis of vast quantities of historical newspaper content was pioneered by Dzogang et al., who showed how periodic structures can be automatically discovered in historical newspapers. A similar analysis performed on social media again revealed strongly periodic structures.<ref>[https://core.ac.uk/download/pdf/83929129.pdf Seasonal Fluctuations in Collective Mood Revealed by Wikipedia Searches and Twitter Posts] F Dzogang, T Lansdall-Welfare, N Cristianini - 2016 IEEE International Conference on Data Mining, Workshop on Data Mining in Human Activity Analysis
| |
− | | |
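The analysis of vast quantities of historical newspaper content was pioneered by Dzogang et al., who showed how '''periodic structures''' can be automatically discovered in historical newspapers. A similar analysis performed on social media also revealed strongly periodic structures. Reference: [https://core.ac.uk/download/pdf/83929129.pdf Seasonal Fluctuations in Collective Mood Revealed by Wikipedia Searches and Twitter Posts] F Dzogang, T Lansdall-Welfare, N Cristianini - 2016 IEEE International Conference on Data Mining, Workshop on Data Mining in Human Activity Analysis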
| |
− | | |
− | </ref>
| |
− | | |
− | </ref>
| |
− | | |
/ Reference
| |
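As an illustration of what "automatically discovering periodic structures" can mean in practice, the sketch below finds the dominant cycle length in a time series of term counts with a plain discrete Fourier transform. It is a minimal sketch under stated assumptions, not the method of the cited studies: the function name `dominant_period` and the synthetic `daily_counts` series are hypothetical, and a real corpus would need detrending and significance testing before any peak is trusted.

```python
import cmath
import math


def dominant_period(series):
    """Return the period, in samples, of the strongest periodic component.

    Uses a plain discrete Fourier transform (a periodogram). This is an
    illustrative stand-in, not the method used in the cited studies.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the constant offset

    best_k, best_power = None, 0.0
    # Try every frequency from 1 to n // 2 cycles over the whole series.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return None if best_k is None else n / best_k


# Hypothetical data: daily counts of one term over ten weeks, with a weekly cycle.
daily_counts = [10 + 5 * math.sin(2 * math.pi * day / 7) for day in range(70)]
period = dominant_period(daily_counts)
```

Here `period` comes out as a cycle length of seven days, matching the weekly pattern built into the synthetic series. The direct transform costs quadratic time in the series length, which is acceptable for a sketch but would normally be replaced by a fast Fourier transform on large collections.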
| | | |
| ==Challenges== | | ==Challenges== |