Abstract: This research proposes enhanced collaborative and geometric multi-kernel learning (E-CGMKL), a method that enhances the CGMKL algorithm for multi-class classification problems with non-linear data distributions. CGMKL combines multiple kernel learning with the softmax function using the framework of multi empirical kernel learning (MEKL), in which empirical kernel mapping (EKM) provides explicit feature construction in the high-dimensional kernel space. CGMKL ensures consistent output of samples across kernel spaces and minimizes the within-class distance to highlight geometric features of multiple classes. However, the kernels constructed by CGMKL have no explicit relationship among them and construct high-dimensional feature representations independently of each other. This could be disadvantageous for learning on datasets with complex hidden structures. To overcome this limitation, E-CGMKL constructs kernel spaces from the hidden layers of trained deep neural networks (DNN). Due to the nature of the DNN architecture, these kernel spaces not only provide multiple feature representations but also inherit the compositional hierarchy of the hidden layers, which might be beneficial for enhancing the predictive performance of the CGMKL algorithm on complex data with natural hierarchical structures, for example, image data. Furthermore, our proposed scheme handles image data by constructing kernel spaces from a convolutional neural network (CNN). Considering the effectiveness of the CNN architecture on image data, these kernel spaces provide a major advantage over the CGMKL algorithm, which does not exploit the CNN architecture for constructing kernel spaces from image data. Additionally, the outputs of hidden layers directly provide features for kernel spaces and, unlike CGMKL, do not require an approximate MEKL framework. E-CGMKL combines the consistency and geometry-preserving aspects of CGMKL with the compositional hierarchy of kernel spaces extracted from DNN hidden layers to enhance the predictive performance of CGMKL significantly. The experimental results on various data sets demonstrate the superior performance of the E-CGMKL algorithm compared to other competing methods, including the benchmark CGMKL.
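The two ingredients named in this abstract, consistent outputs across kernel spaces and a small within-class distance, can be illustrated with a minimal sketch. The abstract does not give the actual CGMKL or E-CGMKL objective, so the function names and the exact form of both penalties below are assumptions chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_penalty(outputs_per_space):
    """Hypothetical consistency term for ONE sample.

    outputs_per_space holds one class-probability vector per kernel space.
    The penalty is the mean (over spaces) squared deviation of each vector
    from the average vector, so it is zero when all spaces agree.
    """
    n_spaces = len(outputs_per_space)
    n_classes = len(outputs_per_space[0])
    mean = [sum(o[c] for o in outputs_per_space) / n_spaces
            for c in range(n_classes)]
    squared = sum((o[c] - mean[c]) ** 2
                  for o in outputs_per_space for c in range(n_classes))
    return squared / n_spaces


def within_class_distance(features, labels):
    """Sum of squared Euclidean distances from each sample to its class centroid.

    This is an assumed stand-in for the geometric term; the paper's exact
    definition is not stated in the abstract.
    """
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    total = 0.0
    for rows in by_class.values():
        dim = len(rows[0])
        centroid = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        total += sum((r[d] - centroid[d]) ** 2 for r in rows for d in range(dim))
    return total
```

In a setting like the one described, the per-space probability vectors would come from classifiers trained on features taken from different hidden layers, and both terms would be added to the classification loss with weights that this sketch does not attempt to guess.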
Abstract: Due to unforeseen climate change, complicated chronic diseases, and the mutation of viruses, hospital administration's top challenge is to know the length of stay (LOS) of patients with different diseases. Hospital management does not know exactly when an existing patient will leave the hospital; this information could be crucial, because it could allow the hospital to admit more patients. As a result, hospitals face many problems in managing available resources and in admitting new patients for prompt treatment. Therefore, a robust model needs to be designed to help hospital administration predict patients' LOS. For this purpose, a very large dataset (more than 2.3 million patient records) from New York hospitals, covering a wide range of conditions including bone marrow, tuberculosis, intestinal transplant, mental illness, leukaemia, spinal cord injury, trauma, rehabilitation, kidney and alcoholic patients, HIV patients, malignant breast disorder, asthma, respiratory distress syndrome, etc., has been analyzed to predict the LOS. We selected six machine learning (ML) models: multiple linear regression (MLR), lasso regression (LR), ridge regression (RR), decision tree regression (DTR), extreme gradient boosting regression (XGBR), and random forest regression (RFR). The selected models' predictive performance was checked using R-squared and mean squared error (MSE) as the performance evaluation criteria. Our results revealed the superior predictive performance of the RFR model among all selected models, both in terms of R-squared score (92%) and MSE score (5). From exploratory data analysis (EDA), we conclude that most stays lasted between 0 and 5 days, with a mean stay of 5.3 days per patient, and that patients older than 50 years spent more days in the hospital. Based on the average LOS, results revealed that patients with diagnoses related to birth complications spent more days in the hospital than patients with other diagnoses. This finding could help predict the length of hospital stay of new patients, which will help the hospital administration estimate and manage their resources efficiently.
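The two evaluation criteria named in this abstract, R-squared and mean squared error, follow standard definitions, which the minimal sketch below computes from scratch. The stay lengths in `y_true_toy` and `y_pred_toy` are made-up toy values for illustration, not data or results from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days (toy values, not from the paper).
y_true_toy = [2.0, 4.0, 6.0, 8.0]
y_pred_toy = [3.0, 4.0, 6.0, 7.0]

mse_toy = mean_squared_error(y_true_toy, y_pred_toy)
r2_toy = r_squared(y_true_toy, y_pred_toy)
```

MSE is expressed in squared units of the target (days squared here), so it is only comparable between models evaluated on the same data, whereas R-squared is unitless and measures the share of variance in LOS that the model explains.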
Abstract: Artificial intelligence (AI) and machine learning (ML) help in making predictions and help businesses make key decisions that benefit them. In the online shopping business, it is very important to find trends in the data and to identify the features that drive the success of the business. In this research, a dataset of 12,330 records of customers who visited an online shopping website over a period of one year has been analyzed. The main objective of this research is to find the features that are relevant for correctly predicting the purchasing decisions made by visiting customers, and to build ML models that can make correct predictions on unseen data in the future. The permutation feature importance approach has been used to rank the features by their importance to the output variable (Revenue). Five ML models, i.e., decision tree (DT), random forest (RF), extra tree (ET) classifier, neural network (NN), and logistic regression (LR), have been used to make predictions on unseen data. The performance of each model has been discussed in detail using performance measures such as accuracy score, precision, recall, F1 score, and the ROC-AUC curve. The RF model is the best of the five, with an accuracy score of 90% and an F1 score of 79%, followed by the extra tree classifier. Hence, our study indicates that the RF model can be used by online retailing businesses for predicting consumer buying behaviour. Our research also reveals the importance of page value as a key feature for capturing online purchasing trends. This may guide businesses to focus on this specific feature and to find the key factors behind page value, which in turn will help the online shopping business.
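Permutation feature importance, the approach named in this abstract, can be sketched in a few lines: shuffle one feature column at a time and record how much the model's score drops. The version below is a minimal illustration under stated assumptions. The rule-based `toy_model`, the two-column toy data, and the choice of accuracy as the score are all hypothetical and are not taken from the study.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical classifier: predicts a purchase when column 0 exceeds 10."""
    return row[0] > 10.0


# Toy data: column 0 plays the role of a page-value-like feature,
# column 1 is an unrelated feature that the toy model ignores.
X_TOY = [[0.0, 3], [2.0, 7], [5.0, 1], [8.0, 9],
         [12.0, 2], [20.0, 8], [35.0, 4], [50.0, 6]]
Y_TOY = [toy_model(row) for row in X_TOY]

IMPORTANCES_TOY = permutation_importance(toy_model, X_TOY, Y_TOY)
```

Because the toy model never reads column 1, shuffling it cannot change any prediction and its importance is exactly zero, while shuffling column 0 breaks the link between feature and label and lowers accuracy. With a trained classifier the score would normally be computed on held-out data rather than on the training set.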