Abstract
This paper summarizes the common forms that Chinese time words and numerals take in text and, on that basis, builds a rule set for recognizing them. It then proposes a rule-based algorithm for automatically recognizing time words and numerals, and discusses the algorithm's application value in competitive intelligence analysis and machine translation.
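To make the idea of rule-based recognition concrete, the following is a minimal sketch in Python. It is not the paper's rule set or algorithm, which this record does not reproduce. The character class `NUMERAL_CHARS`, the unit list `TIME_UNITS`, and the function `recognize` are all hypothetical names chosen for illustration, and the two rules (a numeral followed by a time unit, possibly chained, and a bare numeral) are assumptions about what such rules might look like.

```python
import re

# Assumed numeral characters: Arabic digits plus common Chinese numeral characters.
NUMERAL_CHARS = "0-9〇零一二三四五六七八九十百千万亿两"
NUMERAL = f"[{NUMERAL_CHARS}]+"

# Assumed time units; a real rule set would cover many more forms.
TIME_UNITS = "年|月|日|号|点|时|分|秒"

# Rule 1 (time): one or more "numeral + time unit" pairs, e.g. 2007年3月15日.
# Rule 2 (num): a bare numeral with an optional decimal part.
# The time rule is listed first so it takes priority over the numeral rule.
TOKEN = re.compile(
    rf"(?P<time>(?:{NUMERAL}(?:{TIME_UNITS}))+)"
    rf"|(?P<num>{NUMERAL}(?:\.[0-9]+)?)"
)


def recognize(text):
    """Return (label, matched_text) pairs for time words and numerals."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]


if __name__ == "__main__":
    print(recognize("会议于二〇〇七年三月十五日召开,共有三百二十人参加"))
    # [('time', '二〇〇七年三月十五日'), ('num', '三百二十')]
```

A sketch this small ignores context, so an ambiguous string such as 十分 (which can mean "very" rather than "ten minutes") would be labeled as a time word; handling such cases is where a fuller rule set would need additional contextual rules.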
Source
《现代图书情报技术》
CSSCI
Peking University Core Journals (北大核心)
2007, Issue 3, pp. 46-50 (5 pages)
New Technology of Library and Information Service
Keywords
Word segmentation
Information extraction
Rule