Abstract
Learning the product feature hierarchy is an important step in opinion mining. To learn product feature hierarchies more accurately, this paper proposes an algorithm that learns them from both regular and irregular opinion text corpora. The algorithm learns simultaneously from regular corpora that contain professional descriptions and from irregular corpora organized under manually specified topics. Text feature-word identification is used to remove words that are only weakly related to the topic, and relative entropy together with syntactic structure analysis is used to derive hierarchical relations from the corpora. Experimental results show that the algorithm learns feature hierarchies well.
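The abstract names relative entropy (Kullback-Leibler divergence) as one signal for deriving hierarchical relations, but does not say how it is applied. The following is only a minimal sketch of the quantity itself, assuming each candidate feature term is represented by a smoothed distribution over its context words. The sample tokens and the helper names `distribution` and `relative_entropy` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def distribution(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary.

    Smoothing (alpha > 0) keeps every probability positive, so the
    logarithm in relative_entropy is always defined.
    """
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def relative_entropy(p, q):
    """D(P || Q) = sum_w P(w) * log(P(w) / Q(w)), measured in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


# Hypothetical context words observed near two candidate feature terms.
camera_ctx = "lens battery screen zoom flash lens battery price".split()
lens_ctx = "lens zoom focus lens".split()

vocab = sorted(set(camera_ctx) | set(lens_ctx))
p_camera = distribution(camera_ctx, vocab)
p_lens = distribution(lens_ctx, vocab)

# Relative entropy is asymmetric, so both directions are computed.
d_lens_camera = relative_entropy(p_lens, p_camera)
d_camera_lens = relative_entropy(p_camera, p_lens)

print(f"D(lens || camera) = {d_lens_camera:.4f}")
print(f"D(camera || lens) = {d_camera_lens:.4f}")
```

Because the measure is asymmetric, the two directions generally differ. Whether and how that asymmetry is used to decide which term is the parent in the hierarchy is a design choice that the abstract does not specify.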
Source
Microprocessors (《微处理机》)
2010, Issue 5, pp. 81-85 (5 pages)
Funding
National Key Technology R&D Program of China (0216002343012)
Keywords
Feature hierarchy
Relative entropy
Syntactic structure