Abstract
Discretization is an important part of data preprocessing for equipment simulation training systems. Traditional discretization methods process attributes one at a time and often ignore the correlation between attributes, which introduces errors into the discretized data. To address this, a discretization method based on hierarchical clustering and compatibility degree is proposed. An overall discretization framework is built by layer-by-layer generalization to handle the mixed-type decision tables of equipment simulation training system data. A hierarchical clustering procedure that determines the number of clusters dynamically produces the initial joint partition of the attributes. Class attribute information and the compatibility degree are then used to merge adjacent intervals and remove redundant partitions. Experimental results show that the proposed method has clear advantages in the total number of intervals and in accuracy.
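The interval-merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "compatibility degree" means the fraction of objects whose discretized condition values determine a single class (the usual rough-set consistency measure), and it takes the initial cut points as given rather than deriving them by hierarchical clustering. The names `compatibility`, `interval_index`, `encode`, and `prune_cuts` are hypothetical.

```python
from collections import defaultdict


def compatibility(rows, labels):
    """Fraction of objects whose condition tuple maps to exactly one class.

    Assumed definition (rough-set consistency); the paper may define it differently.
    """
    classes = defaultdict(set)
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        key = tuple(row)
        classes[key].add(label)
        counts[key] += 1
    consistent = sum(n for key, n in counts.items() if len(classes[key]) == 1)
    return consistent / len(rows)


def interval_index(value, cuts):
    """Index of the interval that contains value, given sorted cut points."""
    return sum(value > c for c in cuts)


def encode(table, cuts):
    """Discretize every row using one list of cut points per attribute."""
    return [
        tuple(interval_index(v, c) for v, c in zip(row, cuts))
        for row in table
    ]


def prune_cuts(table, labels, cuts):
    """Greedily merge adjacent intervals (drop a cut point) whenever the
    compatibility of the whole decision table does not decrease."""
    cuts = [sorted(c) for c in cuts]
    base = compatibility(encode(table, cuts), labels)
    for a in range(len(cuts)):
        i = 0
        while i < len(cuts[a]):
            trial = cuts[:a] + [cuts[a][:i] + cuts[a][i + 1:]] + cuts[a + 1:]
            if compatibility(encode(table, trial), labels) >= base:
                cuts = trial  # the cut point was redundant; keep the merge
            else:
                i += 1  # the cut point is needed; move to the next one
    return cuts


# Toy decision table: two numeric attributes and one class label per row.
table = [[1.0, 10], [2.0, 10], [3.0, 20], [4.0, 20]]
labels = [0, 0, 1, 1]
initial_cuts = [[1.5, 2.5, 3.5], [15]]
print(prune_cuts(table, labels, initial_cuts))
```

Because compatibility is evaluated on all attributes jointly, the cut points of the first attribute in this toy table are all dropped once the second attribute separates the classes on its own. This is the kind of inter-attribute redundancy that attribute-by-attribute discretization cannot detect.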
Authors
DENG Qing (邓青); XUE Qing (薛青); DU Nan (杜楠); FU Chao-bo (付朝博)
Affiliations: Training Center, Army Armored Force Institute, Beijing 100072, China; 68303 Troops, Geermu 816099, China
Source
Science Technology and Engineering (《科学技术与工程》)
Peking University Core Journal (北大核心)
2021, Issue 27, pp. 11674-11680 (7 pages)
Funding
Weapons and Equipment Pre-Research Project (41404060205).
Keywords
equipment simulation training system
data discretization
hierarchical clustering
compatibility degree
data mining