Abstract
To address the problem of extracting product features from Chinese customer reviews on the Internet, this paper proposes using the FP-growth algorithm to obtain a set of candidate product features. The candidates are then filtered by p-support (independent support), by rules for frequent nouns that are not features, and by a PMI threshold, which yields the final product feature set. This enables automatic mining of product feature information from Chinese online customer reviews. The method is tested on a corpus of mobile phone reviews provided by Datatang, and the experimental results support its effectiveness.
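The pipeline named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frequent itemsets are counted by brute force as a stand-in for FP-growth (it finds the same frequent itemsets up to `max_len` nouns, without building an FP-tree); p-support is taken to mean the number of sentences that contain a candidate but no larger candidate that includes it; PMI is computed from sentence co-occurrence with a product word inside the corpus; and the frequent-noun non-feature rules are omitted because the abstract does not specify them. The sample sentences, thresholds, and all function names are hypothetical.

```python
from itertools import combinations
from math import log2


def frequent_itemsets(transactions, min_support, max_len=2):
    """Brute-force stand-in for FP-growth: keep noun sets meeting min_support."""
    counts = {}
    for t in transactions:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(t), k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    n = len(transactions)
    return {s for s, c in counts.items() if c / n >= min_support}


def p_support(candidate, transactions, candidates):
    """Sentences containing candidate but no larger candidate that includes it (assumed definition)."""
    supersets = [c for c in candidates if candidate < c]
    return sum(
        1 for t in transactions
        if candidate <= t and not any(s <= t for s in supersets)
    )


def pmi(candidate, product_word, transactions):
    """PMI between a candidate and the product word, from sentence co-occurrence (assumption)."""
    n = len(transactions)
    has_c = sum(1 for t in transactions if candidate <= t)
    has_p = sum(1 for t in transactions if product_word in t)
    both = sum(1 for t in transactions if candidate <= t and product_word in t)
    if both == 0:
        return float("-inf")
    return log2(both * n / (has_c * has_p))


def extract_features(sentences, product_word, min_support, min_p_support, min_pmi):
    """Candidate generation followed by p-support and PMI filtering."""
    transactions = [frozenset(s) for s in sentences]
    frequent = frequent_itemsets(transactions, min_support)
    candidates = [c for c in frequent if product_word not in c]
    kept = [
        c for c in candidates
        if p_support(c, transactions, candidates) >= min_p_support
        and pmi(c, product_word, transactions) >= min_pmi
    ]
    return sorted(tuple(sorted(c)) for c in kept)


# Hypothetical input: the nouns found in each review sentence after
# word segmentation and part-of-speech tagging.
sentences = [
    ["phone", "screen"],
    ["phone", "battery", "life"],
    ["phone", "battery", "life"],
    ["phone", "screen"],
    ["phone", "battery"],
    ["friend"],
    ["friend"],
    ["battery"],
]
features = extract_features(
    sentences, "phone", min_support=0.25, min_p_support=1, min_pmi=0.0
)
print(features)
```

On this toy input, "life" is dropped by the p-support filter because it never occurs outside "battery life", and "friend" is dropped by the PMI filter because it never co-occurs with the product word. On a real corpus, a library FP-growth implementation and search-engine or large-corpus counts for PMI would likely replace the brute-force parts.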
Source
《现代图书情报技术》
CSSCI
Peking University Core Journals
2013, Issue 12, pp. 70-73 (4 pages)
New Technology of Library and Information Service
Funding
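National Social Science Fund of China project "Research on the Influence Mechanism of Error Management Climate on Enterprise Innovation Behavior and Countermeasures" (Grant No. 12CGL049)
Natural Science Foundation of Chongqing project "Research on Public Opinion Evolution and Social Collaborative Filtering Recommendation Algorithms Based on Online Social Networks" (Grant No. CSTC2011jjA40045); this paper is one of the research outputs of these projects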
Keywords
Product features
Feature extraction
Association rules
Review mining