Funding: This work is funded by the Shanghai Sailing Program (Grant No. 20YF1413800), the Military Medical Science and Technology Youth Cultivating Program (Grant No. 20QNPY106), the High Performance Computing Center of Shanghai University, and the Shanghai Engineering Research Center of Intelligent Computing System (Grant No. 19DZ2252600).
Abstract: Entity and relation extraction is an indispensable part of domain knowledge graph construction, which serves knowledge needs in a specific domain, such as support for product research, sales, risk control, and domain hotspot analysis. Existing entity and relation extraction methods that rely on pretrained models perform well on open datasets, but their performance degrades on domain-specific datasets. Entity extraction models treat characters as the basic semantic units and ignore known character dependencies in a specific domain. Relation extraction assumes that the relations expressed in a sentence are uniform, neglecting that relations may differ across entity tuples. To address these problems, this paper first introduces prior knowledge in the form of domain dictionaries to strengthen character dependencies. Second, domain rules are built to eliminate noise in entity relations and to promote the extraction of potential entity relations. Finally, experiments are designed to verify the effectiveness of the proposed methods. Experimental results in two domains, the laser industry and unmanned ships, show the superiority of the methods. The F1 value on the laser industry entity, unmanned ship entity, laser industry relation, and unmanned ship relation datasets improves by 1%, 6%, 2%, and 1%, respectively. In addition, the extraction accuracy of entity relation triplets reaches 83% and 76% on the laser industry entity pair and unmanned ship entity pair datasets, respectively.
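The abstract does not say how the domain dictionary or the domain rules are applied, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes the dictionary is used through forward maximum matching to give each character a B/M/E/S/O position tag, and that a domain rule is a table of relations allowed for each pair of entity types. The function names, the lexicon entries, the type names, and the rule table are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def dictionary_tags(text: str, lexicon: Set[str], max_len: int = 6) -> List[str]:
    """Tag every character as B/M/E/S if it lies inside a lexicon match, else O.

    Assumption: forward maximum matching (longest match first, left to right).
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        match_len = 0
        # Try the longest candidate starting at position i first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                match_len = n
                break
        if match_len == 0:
            i += 1  # no dictionary word starts here
        elif match_len == 1:
            tags[i] = "S"  # single-character word
            i += 1
        else:
            tags[i] = "B"
            for j in range(i + 1, i + match_len - 1):
                tags[j] = "M"
            tags[i + match_len - 1] = "E"
            i += match_len
    return tags


# A candidate is (head, head_type, relation, tail, tail_type).
Candidate = Tuple[str, str, str, str, str]


def filter_relations(
    candidates: List[Candidate],
    rules: Dict[Tuple[str, str], Set[str]],
) -> List[Tuple[str, str, str]]:
    """Keep only triplets whose relation is allowed for the entity-type pair."""
    kept = []
    for head, head_type, relation, tail, tail_type in candidates:
        if relation in rules.get((head_type, tail_type), set()):
            kept.append((head, relation, tail))
    return kept


# Hypothetical usage with an invented laser-domain lexicon and rule table.
lexicon = {"光纤激光器", "激光器", "切割"}
print(dictionary_tags("光纤激光器用于切割", lexicon))
# ['B', 'M', 'M', 'M', 'E', 'O', 'O', 'B', 'E']

rules = {("Company", "Product"): {"produces"}}
candidates = [
    ("CompanyA", "Company", "produces", "光纤激光器", "Product"),
    ("CompanyA", "Company", "located_in", "光纤激光器", "Product"),
]
print(filter_relations(candidates, rules))
# [('CompanyA', 'produces', '光纤激光器')]
```

In a setup like this, the per-character tags would typically be fed to the extraction model as extra features alongside the character embeddings, and the rule filter would run on the relation classifier's output to discard triplets that are implausible for the domain.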