
Oversampling Method for Unbalanced Data by Natural Nearest Neighbor (利用自然最近邻的不平衡数据过采样方法)

Cited by: 2
Abstract: Existing oversampling methods tend to introduce noise points and to generate synthetic samples that overlap. To address these problems, this paper proposes an oversampling method for imbalanced data based on natural nearest neighbors. The method first determines the natural nearest neighbors of the minority-class samples; the number of neighbors of each sample is computed adaptively by the algorithm and reflects how dense or sparse the sample distribution is. The minority-class samples are then clustered according to their natural-neighbor relations, and new samples are generated from core points in dense regions and non-core points in sparse regions of the same cluster. Comparison experiments on a two-dimensional synthetic dataset and UCI datasets verify the feasibility and effectiveness of the method, which improves classification accuracy on imbalanced data.
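The abstract names the building blocks (an adaptive natural-neighbor search, then sample generation between neighboring minority points) but does not give the stopping rule, the clustering step, or the definition of core points. The sketch below is therefore only an illustration under stated assumptions: it uses a commonly cited form of the natural-neighbor search (grow the neighborhood size `r` until the number of points that nobody lists as a neighbor stops changing) and plain linear interpolation between a sample and one of its natural neighbors as a stand-in for the paper's core/non-core generation rule. The function names `natural_neighbors` and `oversample` are hypothetical.

```python
import numpy as np

def natural_neighbors(X, max_r=None):
    """Return (r, neighbors); neighbors[i] holds the mutual r-nearest neighbors of i.

    Assumed stopping rule (not stated in the abstract): increase r until the
    count of points with no reverse neighbor is zero or stops changing.
    """
    n = len(X)
    max_r = max_r or n - 1
    # Pairwise Euclidean distances; each row of `order` lists points nearest-first.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    order = np.argsort(dist, axis=1)

    knn = [set() for _ in range(n)]         # r-nearest neighbors collected so far
    reverse_count = np.zeros(n, dtype=int)  # how often a point is someone's neighbor
    prev_orphans, r = -1, 0
    while r < max_r:
        for i in range(n):
            j = int(order[i, r])
            knn[i].add(j)
            reverse_count[j] += 1
        r += 1
        orphans = int(np.sum(reverse_count == 0))
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Natural neighbors are mutual: i lists j and j lists i.
    neighbors = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return r, neighbors

def oversample(X_min, n_new, seed=0):
    """Create n_new synthetic minority samples by interpolating toward a natural neighbor.

    Simplification: no clustering and no core/non-core distinction is applied.
    """
    rng = np.random.default_rng(seed)
    _, neighbors = natural_neighbors(X_min)
    seeds = [i for i in range(len(X_min)) if neighbors[i]]
    if not seeds:
        raise ValueError("no sample has a natural neighbor")
    new = []
    for _ in range(n_new):
        i = int(rng.choice(seeds))
        j = int(rng.choice(sorted(neighbors[i])))
        gap = rng.random()  # position along the segment from sample i to sample j
        new.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(new)

# Toy minority class: two small groups in the plane.
X_min = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                  [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
r, nbrs = natural_neighbors(X_min)
synthetic = oversample(X_min, n_new=5)
```

Because each point's neighbor set is built from mutual relations, an isolated point simply ends up with no natural neighbors and is never used as a seed, which is one way such a scheme can avoid amplifying noise.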
Authors: MENG Dongxia (孟东霞); LI Yujian (李玉鑑). Affiliations: School of Financial Technology, Hebei Finance University, Baoding, Hebei 071051, China; School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 2, pp. 91-96 (6 pages)
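Funding: Hebei Provincial University Smart Finance Application Technology R&D Center Fund (XGZJ2020008); National Natural Science Foundation of China (61876010).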
Keywords: imbalanced data set; oversampling; natural nearest neighbor; clustering
