Abstract
By sampling 10 different types of Qing dynasty medical books and annotating them manually, the paper builds a small word-segmentation corpus of Qing dynasty medical books. It explores and summarizes segmentation norms for ancient Chinese medicine books and proposes a segmentation guideline: take existing facts and semantic changes as the general principles, and formulate more detailed rules along two dimensions, part-of-speech grammar and semantic type. By analyzing the common types and structures of terms that appear in ancient Chinese medicine books and introducing them into the segmentation principles, the paper constructs a segmentation standard that reflects the linguistic characteristics of these books. This provides some basic support for further building word-segmentation algorithm models for ancient Chinese medicine books and for precise computer-based information extraction.
Authors
付璐
李思
李明正
朱彦
FU Lu; LI Si; LI Ming-zheng; ZHU Yan (The China Institute for History of Medicine and Medical Literature, CACMS, Beijing 100700, China; School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China; Institute of Information on Traditional Chinese Medicine, CACMS, Beijing 100700, China)
Source
《中华中医药杂志》
CAS
CSCD
Peking University Core Journals
2018, No. 10, pp. 4700-4705 (6 pages)
China Journal of Traditional Chinese Medicine and Pharmacy
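Funding
National Natural Science Foundation of China (No. 61702047)
Beijing Natural Science Foundation (No. 7174328, No. 4174098)
China Academy of Chinese Medical Sciences, Fundamental Research Funds, Independent Topic Project (No. zz110318)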
Keywords
Ancient Chinese medicine books
Word segmentation and indexing
Medical books of Qing dynasty
Principle of lexical grammar
Principle of semantic types