Abstract
This paper proposes an Internet public sentiment analysis system based on data fusion. The system uses web crawlers to collect information from Internet news, WeChat official accounts, blogs, forums, apps, microblogs (Weibo), newspapers, and video, and combines it with China Mobile's own DPI data. It applies sentiment analysis and other natural language processing algorithms to fuse and analyze these sources, establishing associations between different types of data and extracting more value from them. The system also uses a multi-tenant model to share the underlying data while isolating each user's public sentiment information, which substantially reduces hardware storage costs and gives each user a personalized view of public sentiment.
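The multi-tenant idea in the abstract (one shared copy of the collected data, with each user's sentiment view kept separate) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names `SharedStore`, `Tenant`, and `OpinionService`, the keyword-matching rule, and the sample documents are hypothetical and are not taken from the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Crawled documents, stored once and shared by every tenant (hypothetical)."""
    docs: dict[int, str] = field(default_factory=dict)

    def add(self, doc_id: int, text: str) -> None:
        self.docs[doc_id] = text


@dataclass
class Tenant:
    """Per-tenant state: keywords and matched ids only, never document copies."""
    name: str
    keywords: set[str]
    matched_ids: set[int] = field(default_factory=set)


class OpinionService:
    """Illustrative service: shared storage underneath, isolated views on top."""

    def __init__(self) -> None:
        self.store = SharedStore()
        self.tenants: dict[str, Tenant] = {}

    def register(self, name: str, keywords: list[str]) -> None:
        self.tenants[name] = Tenant(name, set(keywords))

    def ingest(self, doc_id: int, text: str) -> None:
        # Store the document once, then record which tenants it is relevant to.
        self.store.add(doc_id, text)
        for tenant in self.tenants.values():
            if any(keyword in text for keyword in tenant.keywords):
                tenant.matched_ids.add(doc_id)

    def view(self, name: str) -> list[str]:
        # A tenant only ever sees documents matched against its own keywords.
        tenant = self.tenants[name]
        return [self.store.docs[i] for i in sorted(tenant.matched_ids)]


service = OpinionService()
service.register("telecom_team", ["5G", "roaming"])
service.register("retail_team", ["discount"])
service.ingest(1, "New 5G plan launched")
service.ingest(2, "Holiday discount on handsets")
service.ingest(3, "5G handset discount announced")
```

In this sketch, document 3 is relevant to both tenants but is stored only once. Each tenant keeps only a set of ids, which is one plausible way that shared storage could reduce hardware cost while keeping views separate.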
Authors
杜晓黎
钱岭
张海文
杨希
DU Xiao-li; QIAN Ling; ZHANG Hai-wen; YANG Xi (China Mobile (Suzhou) Software Technology Co., Ltd. / China Mobile Suzhou R&D Center, Suzhou 215163, China)
Source
《电信工程技术与标准化》
2017, No. 7, pp. 26-30 (5 pages)
Telecom Engineering Technics and Standardization
Keywords
Internet web crawler
data fusion
public sentiment
multi-tenant
natural language processing