Abstract
[Objective] This paper proposes a new product feature identification method, based on natural language processing techniques and complex network theory, to improve the extraction of product features. [Methods] We first construct a weighted bipartite network of product feature-sentiment word pairs, which describes the relationship between feature words and sentiment words more clearly and intuitively from a network perspective. We then propose the NodeRank algorithm to rank product feature words by importance, improving the precision of feature word extraction. [Results] In experiments on real review data from jd.com, a popular online shopping site in China, the NodeRank algorithm achieved higher precision, recall and F-score than the HAC, TF-IDF and TextRank baseline algorithms. [Limitations] The computational complexity of the NodeRank algorithm is relatively high and needs further optimization. [Conclusions] The NodeRank algorithm is an accurate and effective feature extraction method that can support product feature extraction, product marketing and other business activities.
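The network construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the NodeRank formula, so the ranking below uses plain weighted degree (node strength) as a stand-in and should not be read as the authors' algorithm. The function names, the sample pairs, and the choice of co-occurrence counts as edge weights are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_bipartite(pairs):
    """Build a weighted bipartite network from (feature, sentiment) pairs.

    Assumption: the edge weight is the number of times a pair co-occurs.
    """
    return Counter(pairs)


def rank_features(edge_weights):
    """Rank feature words by weighted degree (a stand-in, NOT NodeRank)."""
    strength = defaultdict(int)
    for (feature, _sentiment), weight in edge_weights.items():
        strength[feature] += weight
    # Sort by descending strength, breaking ties alphabetically.
    return sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical pairs, as they might be extracted from product reviews.
pairs = [
    ("screen", "clear"),
    ("screen", "bright"),
    ("battery", "durable"),
    ("screen", "clear"),
    ("price", "cheap"),
    ("battery", "durable"),
    ("battery", "weak"),
]

edges = build_bipartite(pairs)
ranking = rank_features(edges)
print(ranking)
```

Any importance measure defined on the bipartite network could replace `rank_features` while keeping the same edge representation.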
Authors
Zhou Lixin (周立欣)
Lin Jie (林杰)
School of Economics and Management, Tongji University, Shanghai 200092, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
CSSCI
CSCD
PKU Core Journal (北大核心)
2018, Issue 4, pp. 90-98 (9 pages)
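Funding
This work is one of the research outcomes of the following projects:
National Natural Science Foundation of China project "Research on a Measurement Model of User Innovation Value in Social Media and Interactive Innovation Management Methods" (Grant No. 71672128)
Fundamental Research Funds for the Central Universities project "Research on Big-Data-Based Propagation Mechanisms and Models in Social Networks" (Grant No. 1200219368)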