Semantic Framework-Based Defect Text Mining Technique and Application in Power Grid (cited by: 78)
Abstract: Power grid enterprises hold large volumes of equipment defect texts that contain important reliability information, but mining them manually is inefficient, and the accuracy varies from person to person. Taking transformer defect texts as the study object and analyzing their characteristics, this paper establishes a semantic-framework-based mining model for power grid defect texts. The model addresses two problems, the difficulty of dividing defect-text sentences into their constituents and the inability to extract numeric quantities precisely, and so provides a new technique for unstructured data mining in the power grid domain. First, an ontology lexicon is built, and defect texts are preprocessed on that basis through word segmentation and lexical feature extraction. Next, a power-domain semantic frame and its semantic slots are defined, a procedure for slot filling and semantic frame construction is proposed, and the ontology dictionary is completed automatically by merging word strings. Finally, the application of the mining results to reliability statistics is studied. A case study shows that the proposed technique is feasible and effective when applied to the automatic classification and statistics of power grid defects.
Source: Power System Technology (《电网技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2017, No. 2, pp. 637-643 (7 pages)
Keywords: text mining; semantic framework; reliability statistics; defect text
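
The abstract names three steps: lexicon-based segmentation, slot filling into a semantic frame, and dictionary completion by merging word strings. The sketch below shows one way these steps might fit together. Everything in it is an assumption made for illustration. The lexicon entries, the slot names (`equipment`, `component`, `phenomenon`, `quantity`), the unit list, and the use of forward maximum matching for segmentation are hypothetical. The abstract does not specify the paper's actual algorithms or dictionary.

```python
import re
from dataclasses import dataclass, field

# Hypothetical ontology lexicon mapping a term to its slot type.
# These entries are illustrative assumptions, not the paper's dictionary.
LEXICON = {
    "主变": "equipment",    # main transformer
    "油枕": "component",    # oil conservator
    "套管": "component",    # bushing
    "渗油": "phenomenon",   # oil seepage
    "漏油": "phenomenon",   # oil leakage
}

# A number with an optional unit, e.g. "5滴/分钟" (5 drops per minute).
# The unit list is an assumption for this sketch.
QUANTITY = re.compile(r"\d+(?:\.\d+)?(?:kV|℃|%|滴/分钟)?")


def segment(text, lexicon=LEXICON, max_len=4):
    """Forward maximum matching against the lexicon (assumed technique).

    Numeric quantities are matched first so that value and unit stay together.
    Characters covered by neither rule are emitted one by one as "unknown".
    """
    tokens = []
    i = 0
    while i < len(text):
        m = QUANTITY.match(text, i)
        if m:
            tokens.append((m.group(0), "quantity"))
            i = m.end()
            continue
        for size in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in lexicon:
                tokens.append((word, lexicon[word]))
                i += size
                break
        else:
            tokens.append((text[i], "unknown"))
            i += 1
    return tokens


@dataclass
class DefectFrame:
    """A semantic frame whose fields act as slots (slot names are assumed)."""
    equipment: list = field(default_factory=list)
    component: list = field(default_factory=list)
    phenomenon: list = field(default_factory=list)
    quantity: list = field(default_factory=list)


def fill_frame(tokens):
    """Slot filling: place each recognized token into the slot of its type."""
    frame = DefectFrame()
    for word, slot in tokens:
        if slot != "unknown":
            getattr(frame, slot).append(word)
    return frame


def merge_unknown(tokens):
    """Merge runs of adjacent unknown characters into candidate new terms."""
    candidates, run = [], ""
    for word, slot in tokens:
        if slot == "unknown":
            run += word
            continue
        if len(run) > 1:
            candidates.append(run)
        run = ""
    if len(run) > 1:
        candidates.append(run)
    return candidates
```

For the made-up defect record `"主变油枕呼吸器渗油,5滴/分钟"`, `fill_frame(segment(...))` fills the equipment, component, phenomenon, and quantity slots. `merge_unknown` then proposes `"呼吸器"` (breather) as a candidate term, because its three characters are missing from the lexicon and sit next to each other. A real system would presumably vet such candidates, for example by frequency, before adding them to the dictionary.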