Feature-based augmentation and classification for tabular data


Abstract: Generating synthetic samples for tabular data is a strenuous task. The columns (features) of a dataset often do not follow an ideal distribution function. The objective of the proposed algorithm, the Histogram Augmentation Technique (HAT), is to generate a dataset whose distribution is similar to that of the original dataset. The augmentation is performed on individual columns, with separate algorithms designed for continuous and discrete columns. Humans also interpret an object through its features: when making a judgement, they notice prominent features and use them to characterise the perceived object. Conventional machine learning classifiers, however, are designed and trained on the basis of samples. Taking features as the basis for classification instead, this work proposes the Feature Importance Classifier (FIC). FIC treats every feature as independent of the others and ranks the features by their dependence on the class label. FIC achieved the highest accuracy among the classifiers compared, improving accuracy by 5.54% on average. The proposed algorithms were evaluated on five datasets and compared with two augmentation algorithms and four state-of-the-art ML classification algorithms.
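The abstract does not give the details of HAT, so the following is only a minimal sketch of the general idea of column-wise, histogram-based resampling, not the authors' algorithm. The function names `augment_continuous` and `augment_discrete`, the `bins` parameter, and the choice to sample uniformly inside each bin are all assumptions made for illustration.

```python
import numpy as np

def augment_continuous(column, n_new, bins=10, rng=None):
    """Draw n_new synthetic values whose histogram follows that of `column`.

    Illustrative assumption: values are sampled uniformly inside each bin.
    """
    rng = np.random.default_rng(rng)
    counts, edges = np.histogram(column, bins=bins)
    probs = counts / counts.sum()
    # Pick a bin for each new value in proportion to its observed frequency.
    chosen = rng.choice(len(counts), size=n_new, p=probs)
    # Sample uniformly between the left and right edge of the chosen bin.
    return rng.uniform(edges[chosen], edges[chosen + 1])

def augment_discrete(column, n_new, rng=None):
    """Draw n_new synthetic values from the empirical category frequencies."""
    rng = np.random.default_rng(rng)
    values, counts = np.unique(column, return_counts=True)
    return rng.choice(values, size=n_new, p=counts / counts.sum())
```

Because each column is resampled independently, a sketch like this preserves the marginal distribution of every column but not the correlations between columns.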

Authors: Balachander Sathianarayanan, Yogesh Chandra Singh Samant, Prahalad S. Conjeepuram Guruprasad, Varshin B. Hariharan, Nirmala Devi Manickam

Affiliation: Amrita School of Engineering

Source: CAAI Transactions on Intelligence Technology (智能技术学报（英文）), SCIE, EI, 2022, Issue 3, pp. 481-491 (11 pages)

Keywords: function, classifier, columns

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

