Abstract
To describe how samples are distributed in high-dimensional, discrete classification data, a gain-based score-ratio method is proposed. The method computes a score ratio for each sample according to the importance of its attributes and attribute values, and describes the distribution of samples within each class in terms of each sample's degree of membership in that class. The probability density curve and histogram of the score ratio give a direct view of how typical samples and noise samples are distributed in each class.
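The abstract does not state the score-ratio formula, so the following is only a minimal sketch of one plausible reading, not the paper's definition. It assumes that attribute importance is the information gain of each attribute, that attribute-value importance is the class frequency observed for that value, and that a sample's score ratio is the share of its gain-weighted evidence that supports its own labelled class. The function names `entropy`, `information_gain`, and `score_ratios`, and the toy data, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction from splitting the samples on attribute index `attr`."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def score_ratios(rows, labels):
    """Assumed score ratio: the fraction of a sample's gain-weighted evidence
    that supports its own labelled class (an illustration, not the paper's formula)."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    gains = [information_gain(rows, labels, a) for a in range(n_attrs)]

    # counts[a][value][cls] = number of samples with this attribute value and class
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(rows, labels):
        for a, value in enumerate(row):
            counts[a][value][label] += 1

    ratios = []
    for row, label in zip(rows, labels):
        scores = {c: 0.0 for c in classes}
        for a, value in enumerate(row):
            by_class = counts[a][value]
            total = sum(by_class.values())
            for c in classes:
                # Weight the class frequency of this value by the attribute's gain.
                scores[c] += gains[a] * by_class[c] / total
        denom = sum(scores.values())
        # Fall back to a uniform ratio when no attribute carries any gain.
        ratios.append(scores[label] / denom if denom > 0 else 1.0 / len(classes))
    return ratios


# Toy data (made up): the last sample looks like "pos" but is labelled "neg".
rows = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "y"), ("b", "x"), ("a", "y")]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
ratios = score_ratios(rows, labels)
```

Under these assumptions, samples whose ratio is close to 1 would be typical of their class and samples with a low ratio would be candidates for noise. Binning `ratios` per class would give the kind of histogram the abstract refers to.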
Source
《计算机应用》
CSCD
Peking University Core Journal (北大核心)
2005, No. 5, pp. 1004-1005, 1011 (3 pages in total)
Journal of Computer Applications
Funding
Supported by the National Natural Science Foundation of China (Grant No. 60375005)
Keywords
gain
membership degree
sample distribution