Abstract
To support the sharing and reuse of smart grid knowledge, this paper proposes an ontology-based method for acquiring knowledge from smart grid texts. First, an initial seed ontology is built in Protégé from the "Electricity Thesaurus" (《电力主题词表》) and the "Classified Chinese Thesaurus" (《中国分类主题词表》), and Jena is used to parse this ontology into a concept tree. Next, HTMLParser extracts plain text from web pages related to the smart grid, and ICTCLAS segments the Chinese text to produce a concept word set. Finally, a GSS matching algorithm based on HowNet is proposed: the concept word set is matched against the concept tree, and the acquired concepts and properties are added to the seed ontology, completing one round of knowledge acquisition. The method was validated on web pages crawled from the smart grid domain and gave good results.
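The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not define the GSS algorithm or the HowNet similarity, so `toy_similarity` (a character-overlap Dice coefficient), the `ConceptNode` class, the `acquire` function, and the 0.5 threshold are all hypothetical stand-ins chosen only to show the flow from a concept word set to an extended concept tree.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptNode:
    """One node of a concept tree (hypothetical stand-in for the parsed seed ontology)."""
    name: str
    children: list["ConceptNode"] = field(default_factory=list)

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def toy_similarity(a: str, b: str) -> float:
    """Dice coefficient over the character sets of two terms.

    ASSUMPTION: a placeholder for the HowNet-based semantic similarity,
    which the abstract names but does not specify.
    """
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return 2 * len(sa & sb) / (len(sa) + len(sb))


def acquire(root: ConceptNode, words: list[str], threshold: float = 0.5):
    """Attach each new word under its most similar existing concept.

    Words already in the tree are skipped; words whose best similarity
    falls below the (assumed) threshold are discarded.
    Returns the list of (word, parent concept) pairs that were added.
    """
    added = []
    for word in words:
        nodes = list(root.walk())
        if any(node.name == word for node in nodes):
            continue  # concept already present in the ontology
        best = max(nodes, key=lambda node: toy_similarity(word, node.name))
        if toy_similarity(word, best.name) >= threshold:
            best.children.append(ConceptNode(word))
            added.append((word, best.name))
    return added


if __name__ == "__main__":
    # Tiny seed tree and a segmented word set, both invented for illustration.
    seed = ConceptNode("电网", [ConceptNode("变电站")])
    print(acquire(seed, ["智能电网", "变电站", "天气"]))
```

In this toy run, "智能电网" is attached under "电网", "变电站" is skipped because it already exists, and "天气" is discarded because it shares no characters with any concept. A real system would replace `toy_similarity` with a sememe-based measure and would also handle properties, which this sketch omits.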
Source
Journal of Northeast Electric Power University (《东北电力大学学报》)
2014, No. 5, pp. 60-68 (9 pages)
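Funding
National Natural Science Foundation of China (51077010)
Key Science and Technology Research Project, Social Development Division, Jilin Provincial Department of Science and Technology (20130206085SF)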
Keywords
Ontology
Smart grid
Knowledge acquisition
Natural language processing
Semantic similarity