Abstract
Ontologies for high-speed trains are currently built mostly by hand, which is costly, inefficient, and inflexible. To address this problem, a semi-automatic method for constructing a high-speed train ontology is proposed. First, the word segmentation tool Jieba is used to preprocess high-speed train domain documents, including segmentation and stop-word removal. Next, concepts are extracted with TF-IDF, C-value, and related algorithms. Hierarchical and non-hierarchical relations in the domain are then mined with hierarchical clustering, the Dice measure, and related algorithms. Finally, the Protégé tool is used to build a structured OWL ontology and manage it visually. A worked example of semi-automatic construction of a high-speed train ontology achieves automatic acquisition of concepts and semantic relations, verifying the effectiveness and feasibility of the method.
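The concept-extraction and relation-mining steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so a plain TF-IDF variant (term frequency times log inverse document frequency, keeping the best score per term) and a document-level Dice coefficient are assumed. The sample documents and the function names `tf_idf` and `dice` are hypothetical, and the documents are taken as already segmented and stripped of stop words.

```python
import math
from collections import Counter

# Hypothetical pre-segmented documents. In practice the tokens might come
# from a segmenter such as jieba.lcut(text), followed by stop-word removal.
DOCS = [
    ["bogie", "wheelset", "axle", "bogie"],
    ["bogie", "wheelset", "brake"],
    ["pantograph", "carbody", "brake"],
]


def tf_idf(docs):
    """Return {term: highest TF-IDF score the term reaches in any document}."""
    n_docs = len(docs)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] = max(scores.get(term, 0.0), tf * idf)
    return scores


def dice(docs, a, b):
    """Dice coefficient 2*|A and B| / (|A| + |B|) over document occurrence."""
    with_a = {i for i, doc in enumerate(docs) if a in doc}
    with_b = {i for i, doc in enumerate(docs) if b in doc}
    if not with_a and not with_b:
        return 0.0
    return 2 * len(with_a & with_b) / (len(with_a) + len(with_b))


if __name__ == "__main__":
    # Candidate concepts ranked by score, highest first.
    ranked = sorted(tf_idf(DOCS).items(), key=lambda kv: kv[1], reverse=True)
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
    # Terms that always co-occur receive the maximum Dice score of 1.0.
    print(dice(DOCS, "bogie", "wheelset"))
```

Terms with a high TF-IDF score would be candidate concepts, and term pairs with a high Dice score would be candidates for a non-hierarchical relation. Any thresholds would have to be chosen for the corpus at hand.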
Authors
丁雨秋
黎荣
郑宇飞
黎伟洋
DING Yuqiu; LI Rong; ZHEN Yufei; LI Weiyang (Institute of Advanced Design and Manufacturing, School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China)
Source
《机械设计与研究》
CSCD
Peking University Core Journals
2020, Issue 4, pp. 185-189 (5 pages)
Machine Design And Research
Funding
Supported by the National Key Research and Development Program of China (2016YFB1200506).
Keywords
high-speed train
ontology
semi-automatic
concept extraction
relationship mining