Abstract
Feature extraction is one of the most important steps in online public opinion analysis, and a good feature extraction algorithm can greatly improve both the efficiency and the accuracy of that analysis. Analyzing and monitoring tourism-related online public opinion makes it possible to detect unexpected incidents in Yunnan tourism promptly and to pass them to the relevant departments so that they can respond quickly and appropriately, which benefits the development of Yunnan's tourism industry. This paper analyzes the shortcomings of traditional feature extraction algorithms, namely low accuracy and poor runtime efficiency, and applies domain ontology knowledge to feature extraction for tourism online public opinion analysis. A domain ontology for tourism online public opinion is built through investigation and consultation with experts, and the ontology is then used to optimize feature extraction and the computation of feature-word weights. Repeated experiments on large data sets show that the optimized method significantly improves the accuracy and runtime efficiency of feature extraction, indicating that domain knowledge has a clearly positive effect on the analysis of tourism online public opinion.
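The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of ontology-guided feature weighting, not the paper's method. It assumes a plain TF-IDF baseline and a hypothetical multiplier `boost` applied to terms that appear in a domain ontology; the function names, the smoothing in the IDF term, and the sample tokens are all illustrative assumptions.

```python
import math
from collections import Counter


def ontology_weighted_tfidf(docs, ontology_terms, boost=2.0):
    """Return one {term: weight} dict per document.

    docs           -- list of token lists (already segmented text)
    ontology_terms -- set of terms taken from a domain ontology (assumed input)
    boost          -- hypothetical multiplier for ontology terms (assumption)
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        doc_weights = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF so that terms present in every document keep a
            # positive weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            weight = tf * idf
            # Terms known to the domain ontology are weighted up.
            if term in ontology_terms:
                weight *= boost
            doc_weights[term] = weight
        weights.append(doc_weights)
    return weights


def top_features(doc_weights, k=3):
    """Return the k highest-weighted terms of one document."""
    return sorted(doc_weights, key=doc_weights.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy, made-up token lists; real input would be segmented posts.
    sample_docs = [
        ["lijiang", "hotel", "complaint", "hotel"],
        ["weather", "hotel", "sunny"],
    ]
    sample_ontology = {"complaint"}
    result = ontology_weighted_tfidf(sample_docs, sample_ontology)
    print(top_features(result[0], k=2))
```

In this sketch a frequent but generic term can be outranked by a rarer term that the ontology marks as domain-relevant; how strongly to boost, and whether to use ontology structure beyond simple membership, are design choices the abstract leaves open.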
Source
《云南民族大学学报(自然科学版)》
CAS
2017, Issue 3, pp. 252-257 (6 pages)
Journal of Yunnan Minzu University: Natural Sciences Edition
Funding
Funded project: Yunnan Provincial Universities Business Intelligence Science and Technology Innovation Team (42212217010)
Keywords
public opinion on the tourism network
domain ontology
feature extraction
weights