
Social Network Based Cross-Document Personal Name Disambiguation (cited by 13)
Abstract: Cross-document personal name disambiguation is the process of determining whether identical names occurring in different texts refer to the same real-world entity. It is the basis for accurately retrieving information about a person of interest, and it also matters for applications such as multi-document summarization and information fusion. This paper applies social network analysis to the disambiguation of identical personal names across Chinese texts. The approach first clusters the names in the social network with spectral clustering; then, based on how different edge weights and different graph partition criteria affect disambiguation performance, it introduces a modularity threshold as the stopping condition for partitioning the network. Experiments on the CLP 2010 Chinese personal name disambiguation data indicate that social network analysis is effective for this task.
Authors: Chen Chen (陈晨), Wang Houfeng (王厚峰)
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2011, No. 5, pp. 75-82 (8 pages)
Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20090001110047); National Natural Science Foundation of China (60973053, 91024009)
Keywords: computer application technology; personal name disambiguation; social network; spectral clustering; cluster-stopping measure; modularity
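
The abstract names a modularity threshold as the stopping condition for partitioning the social network, but this record does not say how the threshold is applied. The following is a minimal sketch of one plausible reading, not the authors' implementation: it computes Newman modularity for a weighted, undirected graph and keeps a proposed split only if modularity rises by at least a threshold. The function names, the toy graph, the unit edge weights, and the threshold value 0.3 are all assumptions made for illustration. The candidate partitions would come from spectral clustering, which is not shown here.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Newman modularity Q of a partition of a weighted, undirected graph.

    edges: dict mapping (u, v) pairs to positive weights, one entry per edge.
    communities: list of node sets that together cover every node exactly once.
    """
    total = sum(edges.values())  # total edge weight m
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    inside = defaultdict(float)  # weight of edges lying inside each community
    degree = defaultdict(float)  # summed weighted degree of each community
    for (u, v), w in edges.items():
        degree[label[u]] += w
        degree[label[v]] += w
        if label[u] == label[v]:
            inside[label[u]] += w
    return sum(
        inside[i] / total - (degree[i] / (2 * total)) ** 2
        for i in range(len(communities))
    )


def accept_split(edges, before, after, threshold):
    """Hypothetical stopping rule: keep a split only if Q gains >= threshold."""
    return modularity(edges, after) - modularity(edges, before) >= threshold


# Toy network (placeholder nodes): two triangles joined by a single edge.
edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
whole = [{"a", "b", "c", "d", "e", "f"}]
two_groups = [{"a", "b", "c"}, {"d", "e", "f"}]
three_groups = [{"a"}, {"b", "c"}, {"d", "e", "f"}]

print(accept_split(edges, whole, two_groups, threshold=0.3))
print(accept_split(edges, two_groups, three_groups, threshold=0.3))
```

In this toy case the first split separates the two triangles and raises modularity from 0 to 5/14, so it passes the threshold. Splitting one triangle further lowers modularity, so partitioning stops there.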


