
Relevance Terms Recognition Based on Bibliographic Coupling Analysis Method

Cited by: 1
Abstract: Drawing on bibliographic coupling analysis, this paper treats the content words in a term's definition as that term's references and identifies related terms by the coupling strength of those content words. First, term definitions are segmented and part-of-speech tagged, and the results are manually corrected. Next, content words tagged as verbs or nouns are extracted, and the coupling strength of the content words in the definitions serves as the criterion for recommending related terms. Finally, the recommendations are checked manually to compute the precision, recall, and F-measure of related-term recognition, which demonstrates the method's effectiveness. The test set consists of 500 randomly selected terms and their definitions from the Chinese scientific and technical vocabulary system in the new energy vehicle domain; on this set the method achieves high precision and recall.
Source: Journal of Intelligence (情报杂志), CSSCI, Peking University Core Journal, 2014, No. 7, pp. 161-164, 121 (5 pages)
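Funding: One of the research outputs of the National Natural Science Foundation of China project "Research on Key Issues in the Rapid Construction of Knowledge Organization Systems for Specific Intelligence Analysis Applications" (No. 71203208); the National "Twelfth Five-Year Plan" Science and Technology Support Program project "Construction of a Super Scientific and Technical Thesaurus and Ontology for Foreign-Language Scientific and Technical Literature" (No. 2011BAH10B01); and the Institute of Scientific and Technical Information of China key work project "Chinese Scientific and Technical Vocabulary System Construction and Application Engineering" (No. ZD2012-3-2).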
Keywords: term definition; bibliographic coupling analysis; content words coupling; coupling strength; visualization
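
The procedure described in the abstract can be illustrated with a minimal sketch. It assumes the definitions have already been segmented, part-of-speech tagged, and reduced to their noun and verb content words. The toy definitions, the function names, and the threshold of 2 are all hypothetical. Coupling strength is taken here as the raw count of shared content words, by analogy with counting shared references in bibliographic coupling; the record does not state the paper's exact formula or cutoff.

```python
from itertools import combinations

# Hypothetical toy data: each term maps to the set of noun/verb content
# words assumed to remain after segmentation, POS tagging, and filtering.
definitions = {
    "electric vehicle": {"vehicle", "drive", "motor", "battery", "power"},
    "hybrid vehicle": {"vehicle", "drive", "motor", "engine", "fuel"},
    "fuel cell": {"device", "convert", "hydrogen", "energy", "power"},
    "charging pile": {"device", "charge", "battery", "power", "supply"},
}


def coupling_strength(words_a, words_b):
    """Number of content words shared by two definitions (assumed measure)."""
    return len(words_a & words_b)


def related_terms(defs, threshold=2):
    """Map each term to the terms whose coupling strength meets the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    related = {term: set() for term in defs}
    for a, b in combinations(defs, 2):
        if coupling_strength(defs[a], defs[b]) >= threshold:
            related[a].add(b)
            related[b].add(a)
    return related


def precision_recall_f(predicted, gold):
    """Precision, recall, and F-measure of predicted terms against a gold set."""
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


related = related_terms(definitions)

# Hypothetical manual judgment for one term, used only to show the evaluation.
gold_for_ev = {"hybrid vehicle"}
p, r, f = precision_recall_f(related["electric vehicle"], gold_for_ev)

for term in sorted(related):
    print(term, "->", sorted(related[term]))
print("precision, recall, F for 'electric vehicle':", p, r, round(f, 3))
```

A raw count favors long definitions, so a normalized variant (for example, dividing by the size of the union of the two word sets) may be worth comparing; that is a design alternative, not something the record attributes to the paper.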












