Abstract
In the modernization of lexicographic technology, Chinese lexicographers still concentrate on building and using corpora. International research, however, has shifted toward the deep processing of corpus material and the construction of databases, because researchers have realized that sifting through a huge number of concordance lines to find what is needed for a particular purpose is time-consuming and laborious for the lexicographer. Drawing on international experience with modern lexicographic technology, this paper sets out a new concept for the modernization of dictionary making: corpus datamation, that is, applying recent findings in linguistics together with data mining techniques to extract the various kinds of lexical data a dictionary requires from large corpora, turning the corpus into a lexical/lexicographical database and thereby greatly improving the efficiency of both corpus use and dictionary compilation.
Source
《辞书研究》
Peking University Core Journal (北大核心)
2012, No. 2, pp. 1-9, 93 (9 pages in total)
Lexicographical Studies
Funding
Supported by the Science and Technology Commission of Shanghai Municipality
Grant No. 08dz1501100