Abstract
To make full use of film review data for improving film recommendation, and to build a film-and-review knowledge graph with specific attributes, a semi-automatic method for extracting knowledge from film reviews is proposed. First, unstructured review data collected from the web are preprocessed by cleaning and word segmentation. Next, each sentence is parsed to obtain a syntax tree of the review, and the film element words and sentiment words that appear are statistically analyzed; from these, a dictionary for knowledge extraction is built and the review data are annotated. Finally, knowledge extraction rules are formulated, and extraction is carried out by combining the dictionary with abstract quantitative clustering, yielding structured review knowledge that is fused with film ontology knowledge to form the film-and-review knowledge graph. Because the new knowledge structure captures subjective factors such as user experience, a knowledge graph that includes review information can be better applied to intelligent recommendation and other knowledge graph applications.
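The dictionary-plus-rules step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicons, the function names, and the sample review are hypothetical, English whitespace tokens stand in for Chinese word segmentation, and a simple nearest-word rule replaces the syntax-tree rules and clustering used in the paper.

```python
import re

# Hypothetical toy lexicons; the paper derives its dictionaries from corpus statistics.
ELEMENT_WORDS = {"plot", "acting", "soundtrack", "ending"}
SENTIMENT_WORDS = {"gripping": 1, "superb": 1, "weak": -1, "boring": -1}


def clean(text):
    """Lowercase the text and drop everything except letters, spaces, and sentence marks."""
    return re.sub(r"[^a-z\s.!?]", " ", text.lower())


def split_sentences(text):
    """Split on sentence-ending punctuation and discard empty fragments."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def extract_triples(film, review):
    """Pair each element word with the closest sentiment word in the same sentence.

    Returns (film, element, polarity) triples that could be attached to a film
    node in a knowledge graph. The nearest-word rule is an assumption made for
    this sketch, not the rule set from the paper.
    """
    triples = []
    for sentence in split_sentences(clean(review)):
        tokens = sentence.split()
        sentiments = [(i, t) for i, t in enumerate(tokens) if t in SENTIMENT_WORDS]
        for i, token in enumerate(tokens):
            if token in ELEMENT_WORDS and sentiments:
                _, word = min(sentiments, key=lambda p: abs(p[0] - i))
                triples.append((film, token, SENTIMENT_WORDS[word]))
    return triples


triples = extract_triples(
    "Example Film",
    "The plot was gripping. Acting felt weak, but the soundtrack is superb!",
)
print(triples)
```

Each triple links a film to one of its elements together with a polarity, which is the kind of structured review knowledge the abstract describes fusing with the film ontology.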
Authors
许智宏
于子琪
董永峰
闫文杰
XU Zhi-hong; YU Zi-qi; DONG Yong-feng; YAN Wen-jie (School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; Hebei Province Key Laboratory of Big Data Calculation, Tianjin 300401, China)
Source
《计算机仿真》
Peking University Core Journal (北大核心)
2020, Issue 8, pp. 424-430 (7 pages)
Computer Simulation
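Funding
National Natural Science Foundation of China (61702157)
Hebei Province Science and Technology Support Program (15210506)
Natural Science Foundation of Tianjin (16JCQNJC00400, 16JCYBJC15600)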
Keywords
Knowledge graph
Film reviews
Natural language processing
Sentiment analysis