Abstract
The adoption of new materials, technologies, and processes means that newly marketed products frequently carry new attributes. Existing product attribute extraction methods usually focus on the main attributes of the evaluated object and have not studied new-attribute recognition in depth, which can affect the experimental conclusions of research that builds on attribute extraction. To address this, we cast new product attribute recognition as a classification task and apply classification models, a conditional random field (CRF), and a deep learning model that combines a bidirectional long short-term memory network with a CRF (Bi-LSTM-CRF). Based on an analysis of the experimental results, we select the CRF model to obtain candidate new attributes. We then filter noise with four strongly constrained rules to refine the model's output. Finally, to make the recognized new attributes more interpretable, we cluster the new attributes together with seed attributes, following the idea of hierarchical clustering, so that the seed attributes explain the new ones. The experimental results show that the proposed scheme can effectively expand the set of product attributes.
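The final step of the abstract, explaining new attributes through the seed attributes they cluster with, can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, the three-dimensional vectors, the cosine distance, the average-linkage criterion, and the 0.2 threshold are all assumptions chosen for illustration, since the abstract does not specify how attributes are represented or merged.

```python
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(vectors, threshold):
    """Bottom-up clustering: merge the closest pair until none is within threshold."""
    clusters = [[name] for name in vectors]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def explain(clusters, seeds):
    """Map each non-seed attribute to the seed attributes in its cluster."""
    explanation = {}
    for cluster in clusters:
        cluster_seeds = [a for a in cluster if a in seeds]
        for a in cluster:
            if a not in seeds:
                explanation[a] = cluster_seeds
    return explanation


# Hypothetical toy vectors; a real system would derive them from review text.
vectors = {
    "battery": (0.9, 0.1, 0.0),
    "fast charging": (0.8, 0.2, 0.1),
    "screen": (0.1, 0.9, 0.0),
    "notch display": (0.2, 0.8, 0.1),
}
seeds = {"battery", "screen"}  # assumed seed attributes

clusters = agglomerate(vectors, threshold=0.2)
explanation = explain(clusters, seeds)
print(explanation)
```

In this toy setting each candidate new attribute ends up in a cluster with exactly one seed attribute, which then serves as its explanation. A new attribute whose cluster contains no seed would map to an empty list, signaling that no seed explains it under the chosen threshold.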
Authors
秦成磊
章成志
Qin Chenglei; Zhang Chengzhi (Department of Information Management, Nanjing University of Science and Technology, Nanjing 210094)
Source
《信息资源管理学报》
CSSCI
2020, Issue 3, pp. 78-91 (14 pages)
Journal of Information Resources Management
Funding
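This work is one of the outcomes of the Major Project of the National Social Science Fund of China, "Research on Data Science Theory and Methods for Knowledge Innovation Services" (16ZAD224).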