Abstract
The informatization of national standards is an important measure for implementing the reform and development of standardization work. This study examines the food-domain term corpus in national standard texts, mines and analyzes the multiple types of semantic relations among terms and the hierarchical structure of those relations, and performs relation extraction on the terms and their definitions to obtain term entities and the relations between them. The experiments use a word-vector-based approach combined with dependency parsing: the similarity of Chinese word embeddings is computed to cluster the target relations, and seeds are selected and iteratively expanded to carry out relation extraction. The experimental results show that the method achieves relatively good results on extracting relations such as material and ingredient composition.
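The seed-iteration idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-dimensional toy vectors, the English pattern strings, and the 0.95 threshold are all hypothetical, and a centroid-plus-cosine-similarity rule is assumed as the acceptance criterion. Starting from one seed pattern, candidates whose embedding is close enough to the centroid of the accepted set are added, and the process repeats until nothing new qualifies.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def bootstrap(embeddings, seeds, threshold=0.95, max_rounds=10):
    """Grow the seed set with candidates similar to the current centroid.

    embeddings: dict mapping a candidate relation pattern to its vector
    seeds:      initial patterns assumed to express the target relation
    """
    accepted = set(seeds)
    for _ in range(max_rounds):
        center = centroid([embeddings[p] for p in sorted(accepted)])
        new = {
            p
            for p, vec in embeddings.items()
            if p not in accepted and cosine(vec, center) >= threshold
        }
        if not new:  # converged: no candidate passes the threshold
            break
        accepted |= new
    return accepted


# Hypothetical toy embeddings; real vectors would come from a trained
# Chinese word-embedding model, which is outside this sketch.
TOY_EMBEDDINGS = {
    "made from": [1.0, 0.0],
    "produced from": [0.9, 0.1],
    "prepared with": [0.8, 0.29],
    "stored at": [0.0, 1.0],
}

result = bootstrap(TOY_EMBEDDINGS, {"made from"}, threshold=0.95)
print(sorted(result))
```

In this toy run one pattern is admitted only in the second round, after the centroid has shifted. That shift is what lets iteration extend coverage, and it is also why the threshold has to be chosen conservatively to limit semantic drift.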
Authors
於欣澄
刘慧
YU Xincheng; LIU Hui (Shanghai University of International Business and Economics, Shanghai 201620, China)
Source
《信息与电脑》
2022, Issue 5, pp. 45-48 (4 pages)
Information & Computer