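Abstract
Studying collocation relationships between words is of great help to natural language processing. The collocations dictionaries currently used by computers are built and maintained by hand, so they are slow to update and cover few words. To address this, text mining techniques are used to analyze a large-scale corpus and mine the deeper relationships among word collocations, and on that basis a collocations dictionary is built automatically. Experimental results show that the method is effective.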
A collocations dictionary is a useful component of many natural language and spoken language processing applications, such as grammar checking, text-to-speech conversion, and machine translation. Currently, collocations dictionaries are constructed manually, which has two drawbacks. First, they may not be updated frequently, and many lexical entries may be missing. Second, constructing a dictionary may require substantial human effort. In this paper, a text-mining approach to constructing a collocations dictionary is investigated. The main purpose is to enable cheap and quick acquisition of a collocations dictionary from a large text corpus. Experimental results show that the approach is effective and suitable.
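The abstract does not describe the mining procedure itself, but the keywords of this record name mutual information as one of the techniques involved. As an illustration only, and not the authors' method, the following minimal sketch scores adjacent word pairs in a tokenized corpus by pointwise mutual information (PMI). The function name `pmi_collocations`, the `min_count` threshold, and the restriction to adjacent pairs are all assumptions made for this example.

```python
import math
from collections import Counter


def pmi_collocations(sentences, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    `sentences` is an iterable of token lists. Pairs seen fewer than
    `min_count` times are skipped, because PMI is unreliable for rare
    events. Both choices are assumptions of this sketch.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        # Count each pair of neighbouring tokens within a sentence.
        pair_counts.update(zip(tokens, tokens[1:]))

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (w1, w2), n in pair_counts.items():
        if n < min_count:
            continue
        p_pair = n / total_pairs
        p_w1 = word_counts[w1] / total_words
        p_w2 = word_counts[w2] / total_words
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs are the strongest collocation candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `pmi_collocations(corpus)` on a list of tokenized sentences returns `((word1, word2), score)` tuples in descending order of score. A real system would likely also need word segmentation for Chinese text and a much larger corpus than any toy example.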
Source
《上海工程技术大学学报》
CAS
2004, No. 4, pp. 323-326 (4 pages)
Journal of Shanghai University of Engineering Science
Funding
Supported by the Youth Fund of Shanghai University of Engineering Science (2003Q03)
Keywords
text mining
mutual information
association rule mining
collocations dictionary