Abstract
This paper proposes a network data mining method for discovering relations between named entities in a large corpus. The key idea is to represent the named-entity pairs in the corpus, together with their contexts, as a network and to detect communities in that network. Each community then corresponds to one relation, and the named-entity pairs in the same community share that relation. Finally, each relation is labeled with appropriate words. Experiments on the People's Daily corpus show that the method not only detects relations between named entities with high precision but also labels those relations automatically.
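The pipeline described above can be illustrated with a minimal sketch. The abstract does not specify the exact algorithm; the keyword "betweenness" suggests a betweenness-based community detection, so the sketch assumes a Girvan-Newman style procedure (repeatedly removing the edge with the highest betweenness). The linking rule (two entity pairs are connected when they share a context word), the labeling rule (most frequent context word), all function names, and the toy data are hypothetical and are not taken from the paper or from the People's Daily corpus.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def build_graph(contexts):
    """Assumed linking rule: connect two entity pairs that share a context word."""
    graph = {pair: set() for pair in contexts}
    for a, b in combinations(contexts, 2):
        if set(contexts[a]) & set(contexts[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def edge_betweenness(graph):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = defaultdict(float)
    for source in graph:
        # BFS from source, counting shortest paths (sigma) to every node.
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score


def components(graph):
    """Connected components, each returned as a set of nodes."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(graph[v] - group)
        seen |= group
        result.append(group)
    return result


def detect_communities(graph, target):
    """Remove the highest-betweenness edge until `target` communities exist."""
    graph = {v: set(nbrs) for v, nbrs in graph.items()}  # work on a copy
    while len(components(graph)) < target:
        score = edge_betweenness(graph)
        if not score:  # no edges left to remove
            break
        v, w = max(score, key=score.get)
        graph[v].discard(w)
        graph[w].discard(v)
    return components(graph)


def label(community, contexts):
    """Assumed labeling rule: the context word shared by the most pairs."""
    words = Counter(
        w for pair in sorted(community) for w in dict.fromkeys(contexts[pair])
    )
    return words.most_common(1)[0][0]


# Hypothetical toy data: (entity, entity) pair -> words seen between them.
contexts = {
    ("Alice", "Acme"): ["works", "at"],
    ("Bob", "Initech"): ["works", "for"],
    ("Carol", "Globex"): ["works", "near"],
    ("Acme", "Paris"): ["located", "in"],
    ("Initech", "Austin"): ["located", "within"],
    ("Globex", "Rome"): ["located", "near"],
}
communities = detect_communities(build_graph(contexts), target=2)
labels = sorted(label(c, contexts) for c in communities)
print(labels)
```

In this toy network the word "near" creates a single bridge between the employment pairs and the location pairs. That bridge carries every shortest path between the two groups, so it is the first edge removed, leaving one community per relation. The number of communities is passed in by hand here; how the paper chooses when to stop splitting is not stated in the abstract.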
Source
《咸宁学院学报》
2009, No. 6, pp. 38-40, 102 (4 pages)
Journal of Xianning University
Keywords
Named entity pair
Community
Betweenness