Funding: This study is supported by the Tamil Nadu State Council of Science and Technology.
Abstract: With the modernization of machine learning techniques in healthcare, innovations including the support vector machine (SVM) have played a major role in classifying lung cancer, predicting coronavirus disease 2019, and identifying other diseases. Unlike existing works, our algorithm focuses on integrated datasets. In this study, parallel-based SVM (P-SVM) and multiclass-based multiple submodels (MMSM-SVM) were used to analyze the optimal classification of lung diseases. The analysis aimed to find the optimal classification of lung diseases by ID and stage, represented as key-value pairs in MapReduce, combined with P-SVM for binary classification and MMSM-SVM for multiclass classification. For nonlinear classification, a kernel clustering-based SVM embedded with multiple submodels was developed. Both algorithms were developed in the Apache Spark environment, and data for the analysis were retrieved from a microscope lab, UCI, Kaggle, and the General Thoracic Surgery Database, along with electronic health records related to various lung diseases, to increase the dataset size to 5 GB. Performance was measured using the 5 GB dataset on five nodes. Finally, the dataset size was increased, and task analysis and CPU utilization were measured.
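The key-value and MapReduce idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' P-SVM and it does not use Apache Spark; it is a plain-Python stand-in that assumes a simpler technique (each partition trains a linear SVM by hinge-loss subgradient descent, and the reduce step averages the weights). The toy records, the function names (`partition_by_key`, `map_train`, `reduce_sum`, `predict`), and all hyperparameters are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from functools import reduce


def partition_by_key(records, n_parts):
    """Group (record_id, features, label) records by record_id modulo n_parts."""
    parts = defaultdict(list)
    for rec in records:
        parts[rec[0] % n_parts].append(rec)
    return list(parts.values())


def map_train(rows, epochs=50, lr=0.1, lam=0.01):
    """Map step: fit a linear SVM on one partition (labels must be -1 or +1).

    Returns count-weighted sums so the reduce step can form a weighted average.
    """
    w = [0.0] * len(rows[0][1])
    b = 0.0
    for _ in range(epochs):
        for _rid, x, y in rows:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 regularization shrinks the weights on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge-loss subgradient step for points inside the margin.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    n = len(rows)
    return ([wi * n for wi in w], b * n, n)


def reduce_sum(a, c):
    """Reduce step: add two (weighted_w, weighted_b, count) triples."""
    return ([p + q for p, q in zip(a[0], c[0])], a[1] + c[1], a[2] + c[2])


def predict(model, x):
    """Return +1 or -1 from the sign of the linear decision function."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: even ids are class +1, odd ids are the mirrored class -1.
points = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.5, 2.5), (3.0, 2.5)]
records = []
for i, (p, q) in enumerate(points):
    records.append((2 * i, [p, q], 1))
    records.append((2 * i + 1, [-p, -q], -1))

# Map over three partitions, then reduce to one averaged model.
w_sum, b_sum, n = reduce(reduce_sum, map(map_train, partition_by_key(records, 3)))
model = ([wi / n for wi in w_sum], b_sum / n)

# Emit key-value pairs: record id -> predicted class.
labelled = {rid: predict(model, x) for rid, x, _y in records}
```

Weight averaging is only one way to combine per-partition models, and it applies to linear SVMs only; the kernel clustering-based variant described in the abstract would need a different combination step.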