Abstract
With the steady advance of Chinese information processing technology, research on the automatic punctuation of ancient texts is maturing. Existing models, however, are poorly suited to ancient forestry texts, which use distinctive expressions and terminology that set them apart from ancient texts in general. Taking 《树艺篇》 (Shu Yi Pian) as the training text, this paper discusses the construction of a sentence-segmentation corpus for ancient forestry texts.
Source
《科技视界》 (Science & Technology Vision)
2015, Issue 3, pp. 23, 47 (2 pages)
Funding
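Jiangsu Provincial Department of Education, 2013 University Philosophy and Social Sciences Fund Project (2013SJB870004)
Ministry of Education Humanities and Social Sciences Research Youth Fund Project (12YJC870008)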
Keywords
Ancient forestry texts
Sentence segmentation
Corpus
《树艺篇》 (Shu Yi Pian)