Funding: Supported in part by the National Science Foundation (NSF) Algorithms for Threat Detection (ATD) Program (No. DMS #1737861) and the NSF Computer and Network Systems (CNS) Program (No. CNS #1816995).
Abstract: Fake news has recently leveraged the power and scale of online social media to spread misinformation effectively, which not only erodes people's trust in the traditional press and journalism but also manipulates public opinion and sentiment. Detecting fake news is a daunting challenge because the differences between real and fake news are subtle. As a first step in fighting fake news, this paper characterizes hundreds of popular fake and real news stories, with popularity measured by shares, reactions, and comments on Facebook, from two perspectives: domain reputation and content understanding. Our domain reputation analysis reveals that the Web sites of fake and real news publishers differ in registration behavior, registration timing, domain ranking, and domain popularity. In addition, fake news tends to disappear from the Web after a certain amount of time. The content characterization of the fake and real news corpus suggests that simply applying term frequency-inverse document frequency (tf-idf) and Latent Dirichlet Allocation (LDA) topic modeling is inefficient at detecting fake news, while exploring document similarity with term and word vectors is a very promising direction for predicting fake and real news. To the best of our knowledge, this is the first effort to systematically study the domain reputations and content characteristics of fake and real news, which will provide key insights for effectively detecting fake news on social media.
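As a rough illustration of the "document similarity with term vectors" idea mentioned in the abstract above, the sketch below builds tf-idf vectors and compares documents by cosine similarity. It is not the authors' pipeline: the whitespace tokenizer, the smoothed idf formula, the function names (tokenize, tfidf_vectors, cosine), and the toy documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict of term -> weight) per document.

    Assumption: smoothed idf = log((1 + N) / (1 + df)) + 1, one common variant.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Term frequency is normalized by document length.
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy, made-up headlines used only to exercise the functions.
    docs = [
        "senate passes budget bill after long debate",
        "budget bill passes senate after debate",
        "celebrity secretly replaced by clone says insider",
    ]
    vecs = tfidf_vectors(docs)
    print("doc0 vs doc1:", round(cosine(vecs[0], vecs[1]), 3))
    print("doc0 vs doc2:", round(cosine(vecs[0], vecs[2]), 3))
```

In such a setup, a new article could be scored by its similarity to known real and known fake documents; how those scores would be turned into a prediction is left open here, since the abstract does not specify it.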
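Abstract: Objective: To conduct a scoping review of social media-based content analysis studies related to breast cancer. Methods: Following the methodological framework for scoping reviews, relevant studies were retrieved from Web of Science, PubMed, Cochrane Library, CINAHL, Embase, CNKI, Wanfang, and the Chinese Biomedical Database, with a search period from January 1, 2013 to January 1, 2023, and the included literature was summarized and analyzed. Results: A total of 70 articles were included; most of the studies came from the United States and were published between 2019 and 2022. The studies took breast cancer patients or stakeholders as their subjects and focused on topics such as social support, the accuracy of posted content, and treatment. They paid more attention to social media platforms with broad audiences, such as Twitter and Facebook, and to breast cancer-specific social media such as Breastcancer.org. Most collected data by retrieving posts through keywords, hashtags, and algorithms, chose manual processing or machine algorithms according to the number of posts and the research purpose, and carried out content analysis of text, images, and other material along two main dimensions: topic and sentiment. Conclusion: Current social media-based content analysis research on breast cancer focuses on directions such as social support and the accuracy of posted content, and its data analysis methods range from manual analysis of small samples to machine learning on large samples. The results enrich research on the needs and experiences of breast cancer patients and their stakeholders and can provide diverse findings for research on patient-reported experience. Future studies could actively explore the real needs and experiences of breast cancer patients in mainstream media and specify the need characteristics of different groups, so as to build precise, social media-based information service plans for the breast cancer population.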