Abstract
Inter-enterprise relationships are among the most important components of the enterprise value chain and among the information of greatest interest to managers, so they matter greatly for enterprise decision-making and management. Extracting enterprise relationships and entity relationships from annual report text demands high timeliness and strong robustness. The core problem of entity relation extraction is the selection and extraction of relation patterns. Because Chinese sentence structure is complex, expression is flexible, and semantics are varied, relation instances are described inaccurately and their semantic information is insufficiently represented. An enterprise relation extraction model based on feature vectors and SVO extension is therefore proposed. A trigger-word mechanism is introduced into the method, and relation patterns constrained by trigger words are then used to extract enterprise relationships from annual report text. Experiments on the annual report texts of 1,000 listed companies show that the method considerably improves entity relation extraction performance.
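The idea of a relation pattern constrained by trigger words can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that SVO stands for subject-verb-object, that triples have already been extracted from the text, and that a small trigger-word lexicon is available. The names `SVO`, `TRIGGERS`, and `extract_relations`, the relation types, and the sample data are all hypothetical.

```python
from typing import NamedTuple


class SVO(NamedTuple):
    """A pre-extracted subject-verb-object triple (assumed input format)."""
    subject: str
    verb: str
    obj: str


# Hypothetical trigger-word lexicon: relation type -> trigger words.
TRIGGERS = {
    "supply": {"supplies", "provides"},
    "investment": {"invests", "acquires"},
}


def extract_relations(triples, companies, triggers=TRIGGERS):
    """Keep triples whose subject and object are known companies
    and whose verb matches a trigger word of some relation type."""
    relations = []
    for t in triples:
        # Both arguments must be recognized enterprise entities.
        if t.subject not in companies or t.obj not in companies:
            continue
        # The verb must be a trigger word; the first matching type wins.
        for rel_type, words in triggers.items():
            if t.verb.lower() in words:
                relations.append((t.subject, rel_type, t.obj))
                break
    return relations


# Illustrative data only, not taken from the paper.
companies = {"Company A", "Company B", "Company C"}
triples = [
    SVO("Company A", "supplies", "Company B"),
    SVO("Company A", "praised", "Company C"),   # no trigger word
    SVO("Company C", "acquires", "Company B"),
    SVO("Company B", "supplies", "the market"),  # object is not a company
]
result = extract_relations(triples, companies)
print(result)
```

In this sketch the trigger-word check acts as a filter: a triple that links two enterprises but carries no trigger word is discarded rather than reported as a relation.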
Authors
DAI Jiang-bo (代江波)
MAO Jian-hua (毛建华)
LIU Xue-feng (刘学锋)
ZHANG Hong-yang (张鸿洋)
School of Communication & Information Engineering, Shanghai University, Shanghai 200444, China
Source
Computer Technology and Development (《计算机技术与发展》)
2018, No. 10, pp. 139-144 (6 pages)
Funding
National Natural Science Foundation of China (61271061)
Natural Science Foundation of Shanghai (16ZR1411100)
Keywords
enterprise relationship extraction
trigger words
feature vector
SVO extension
relation pattern