
A Study of Methods for the Automatic Identification of Modern Chinese Abbreviations (cited by: 8)
Abstract: Abbreviation identification is an important research topic in Chinese information processing. Because abbreviation dictionaries are scarce, this paper proposes a method for automatically extracting modern Chinese abbreviations from raw corpora. The method first obtains a candidate set of source phrases for each candidate abbreviation, and then pairs source phrases with abbreviations using their contexts, so that an abbreviation dictionary can be generated automatically. The experimental results show that the method is a relatively "smart" approach.
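Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 16, pp. 4052-4054 (3 pages)
Funding: National Natural Science Foundation of China (60473139); Natural Science Foundation of Shanxi Province (20051034); Shanxi University Youth Fund (2006011)
Keywords: source phrase; abbreviation; context; cosine similarity; unknown words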
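The abstract and keywords indicate that source phrases are paired with abbreviations by comparing their contexts with cosine similarity, but this record gives no details of the procedure. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the character-subsequence filter used to select candidate source phrases, the bag-of-words context vectors, the function names, and the toy context data are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def is_subsequence(abbr: str, phrase: str) -> bool:
    """True if the characters of abbr appear in phrase in the same order.

    Assumption: a candidate source phrase must contain every character of
    the abbreviation in order. The record does not state the real criterion.
    """
    it = iter(phrase)
    return all(ch in it for ch in abbr)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_abbreviation(abbr, abbr_context, phrase_contexts):
    """Pick the candidate source phrase whose context best matches abbr's.

    abbr_context: list of words observed near the abbreviation.
    phrase_contexts: dict mapping each phrase to its list of context words.
    Returns (best_phrase, score), or (None, 0.0) if nothing qualifies.
    """
    abbr_vec = Counter(abbr_context)
    best, best_score = None, 0.0
    for phrase, context in phrase_contexts.items():
        # Keep only phrases that are longer than the abbreviation and
        # contain its characters in order (hypothetical candidate filter).
        if len(phrase) <= len(abbr) or not is_subsequence(abbr, phrase):
            continue
        score = cosine(abbr_vec, Counter(context))
        if score > best_score:
            best, best_score = phrase, score
    return best, best_score


# Toy data, invented for illustration only.
abbr_context = ["学生", "教授", "校园", "招生"]
phrase_contexts = {
    "北京大学": ["学生", "教授", "校园", "研究"],
    "北方大厦": ["商场", "停车", "租金", "校园"],
    "清华大学": ["学生", "教授", "校园", "招生"],
}
best, score = pair_abbreviation("北大", abbr_context, phrase_contexts)
print(best, score)
```

In this toy run, 清华大学 is rejected by the character filter even though its context matches exactly, and 北京大学 is preferred over 北方大厦 because it shares more context words with the abbreviation. Repeating the pairing for every candidate abbreviation and keeping the best-scoring pairs would yield a small abbreviation dictionary.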

