Abstract
Guided by HNC theory, this paper annotates coordination with overt conjunctions (COC) in the nominal phrases of a corpus of 30 Chinese patent documents totaling 3613 sentences. The annotation covers eight dimensions: number, level, semantic type, semantic features, interference features, structural features, external context (contextual words), and boundary position. On this basis, the paper analyzes the types of COC and their distribution, then examines and summarizes their semantic features, structural features, and contextual-word features. The aim is to support the design of strategies, linguistic rules, and algorithms for automatically recognizing coordinate structures in Chinese nominal phrases.
Author
Liu Xiaodie (刘小蝶), College of International Education, Beijing Union University, Beijing 100101, China
Source
Journal of Qujing Normal University (《曲靖师范学院学报》)
2017, Issue 5, pp. 42-46 (5 pages)
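Funding
State Language Commission "Twelfth Five-Year Plan" Research Program (YB125-124)
National High Technology Research and Development Program of China (863 Program) (2012AA011104)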
Keywords
Linguistics
Chinese patent literature
coordinate structure (COC)
semantic chunks
semantic features