Funding: This research was supported by Researchers Supporting Project Number (RSP2021/309), King Saud University, Riyadh, Saudi Arabia. The authors wish to acknowledge Yayasan Universiti Teknologi Petronas for supporting this work through the research grant (015LC0-308).
Abstract: Machine learning (ML) practices such as classification have played a very important role in classifying diseases in medical science. Since medical science is a sensitive field, the pre-processing of medical data requires careful handling to make quality clinical decisions. Generally, medical data is considered high-dimensional and complex, containing many irrelevant and redundant features. These factors indirectly degrade the disease prediction and classification accuracy of any ML model. To address this issue, various data pre-processing methods called Feature Selection (FS) techniques have been presented in the literature. However, the majority of such techniques frequently suffer from local minima issues due to the large solution space. Thus, this study proposes a novel wrapper-based Sand Cat Swarm Optimization (SCSO) technique as an FS approach to find optimum features from ten benchmark medical datasets. The SCSO algorithm replicates the hunting and searching strategies of the sand cat while having the advantage of avoiding local optima and finding the ideal solution with minimal control variables. Moreover, a K-Nearest Neighbor (KNN) classifier was used to evaluate the effectiveness of the features identified by the proposed SCSO algorithm. The performance of the proposed SCSO algorithm was compared with six state-of-the-art and recent wrapper-based optimization algorithms using the validation metrics of classification accuracy, optimum feature size, and computational cost in seconds. The simulation results on the benchmark medical datasets revealed that the proposed SCSO-KNN approach outperformed the comparative algorithms with an average classification accuracy of 93.96% by selecting 14.2 features within 1.91 s. Additionally, the Wilcoxon rank test was used to perform the significance analysis between the proposed SCSO-KNN method and six other algorithms for a p-value less than 5.00E-02. The findings revealed that the proposed algorithm produces better outcomes with an average p-value of 1.82E-02. Moreover, potential future directions are also suggested as a result of the study’s promising findings.
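The wrapper idea in the abstract above (score a candidate feature subset by how well a KNN classifier performs on it) can be illustrated with a minimal sketch. This is not the SCSO algorithm and not the authors' code: the function names, the leave-one-out evaluation, the weighting `alpha=0.99`, and the toy data are all assumptions made for illustration. Any search method, SCSO included, would call a fitness function of this general shape on each candidate mask.

```python
def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of a k-NN classifier using only the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, on the selected columns only
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X)
            if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def wrapper_fitness(X, y, mask, alpha=0.99, k=3):
    """Lower is better: weighted sum of error rate and fraction of features kept.

    alpha is an assumed trade-off weight, not a value taken from the paper.
    """
    error = 1.0 - knn_loo_accuracy(X, y, mask, k)
    ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * ratio


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

fit_informative = wrapper_fitness(X, y, [1, 0])
fit_all = wrapper_fitness(X, y, [1, 1])
print(fit_informative, fit_all)
```

On this toy data the mask that drops the noisy feature receives the lower (better) fitness, which is the signal a wrapper-based search exploits.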
Funding: This research was fully supported by Universiti Teknologi PETRONAS, under the Yayasan Universiti Teknologi PETRONAS (YUTP) Fundamental Research Grant Scheme (015LC0-311).
Abstract: Diabetes mellitus is a long-term condition characterized by hyperglycemia, and it can lead to many complications. Given the rising morbidity in recent years, the world’s diabetic patients will exceed 642 million by 2040, implying that one out of every ten persons will be diabetic. This startling figure requires immediate attention from industry and academia to promote innovation and growth in diabetes risk prediction to save individuals’ lives. Due to its rapid development, deep learning (DL) has been used to predict numerous diseases. However, DL methods still suffer from limited prediction performance due to hyperparameter selection and parameter optimization. Therefore, the selection of hyperparameters is critical in improving classification performance. This study presents a Convolutional Neural Network (CNN), an architecture that has achieved remarkable results in many medical domains, in which the Bayesian optimization algorithm (BOA) is employed for hyperparameter selection and parameter optimization. Two issues were investigated and solved during the experiment to enhance the results. The first is the dataset class imbalance, which is solved using the Synthetic Minority Oversampling Technique (SMOTE). The second issue is the model’s poor performance, which is solved using the Bayesian optimization algorithm. The findings indicate that the Bayesian-based CNN model surpasses all the state-of-the-art models in the literature with an accuracy of 89.36%, F1-score of 0.88.6, and Matthews Correlation Coefficient (MCC) of 0.88.6.
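As a rough illustration of how SMOTE addresses the class imbalance mentioned above, the sketch below implements only its core step: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy points are assumptions for illustration; this is not the study's pipeline, and a maintained library implementation would normally be preferred in practice.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding the point itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical minority-class points in a two-feature space
minority_points = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]
new_points = smote_sketch(minority_points, n_new=5)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the new samples stay inside the region already occupied by the minority class.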
Funding: Supported under the research grant (PO Number: 920138936) from the Institute of Technology PETRONAS Sdn Bhd, 32610, Bandar Seri Iskandar, Perak, Malaysia.
Abstract: Prediction of machine failure is challenging as the dataset is often imbalanced with a low failure rate. The common approach to handle classification involving imbalanced data is to balance the data using a sampling approach such as random undersampling, random oversampling, or the Synthetic Minority Oversampling Technique (SMOTE) algorithms. This paper compared the classification performance of three popular classifiers (Logistic Regression, Gaussian Naïve Bayes, and Support Vector Machine) in predicting machine failure in the oil and gas industry. The original machine failure dataset consists of 20,473 hourly records and is imbalanced, with 19,945 (97%) ‘non-failure’ and 528 (3%) ‘failure’ records. The three independent variables used to predict machine failure were pressure indicator, flow indicator, and level indicator. The accuracy of the classifiers was very high and close to 100%, but the sensitivity of all classifiers using the original dataset was close to zero. The performance of the three classifiers was then evaluated for data with different imbalance rates (10% to 50%) generated from the original data using SMOTE, SMOTE-Support Vector Machine (SMOTE-SVM), and SMOTE-Edited Nearest Neighbour (SMOTE-ENN). The classifiers were evaluated based on improvement in sensitivity and F-measure. Results showed that the sensitivity of all classifiers increases as the imbalance rate increases. SVM with a radial basis function (RBF) kernel has the highest sensitivity when data is balanced (50:50) using SMOTE (Sensitivity_test = 0.5686, F_test = 0.6927) compared to Naïve Bayes (Sensitivity_test = 0.4033, F_test = 0.6218) and Logistic Regression (Sensitivity_test = 0.4194, F_test = 0.621). Overall, the Gaussian Naïve Bayes model consistently improves sensitivity and F-measure as the imbalance ratio increases, but the sensitivity is below 50%. The classifiers performed better when data was balanced using SMOTE-SVM compared to SMOTE and SMOTE-ENN.
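The gap between high accuracy and near-zero sensitivity reported above follows directly from the class counts. The sketch below uses the 19,945 / 528 split from the abstract together with a hypothetical classifier that always predicts 'non-failure'; the function name `imbalance_metrics` and the convention of returning 0 when a ratio is undefined are assumptions.

```python
def imbalance_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, F-measure) from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Undefined ratios (zero denominator) are reported as 0.0 by convention here
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + sensitivity:
        f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f_measure = 0.0
    return accuracy, sensitivity, f_measure


# Always predicting 'non-failure': all 528 failures are missed, all 19,945 non-failures are correct.
accuracy, sensitivity, f_measure = imbalance_metrics(tp=0, fn=528, fp=0, tn=19945)
print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.4f} F={f_measure:.4f}")
```

The baseline scores roughly 97% accuracy while detecting no failures at all, which is why the study evaluates the classifiers on sensitivity and F-measure rather than accuracy.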
Funding: The work presented in this paper is part of ongoing research funded by the Fundamental Research Grant Scheme (FRGS/1/2018/ICT02/UTP/02/1), a grant funded by the Ministry of Higher Education, Malaysia, and the Yayasan Universiti Teknologi PETRONAS grant (015LC0-274 and 015LC0-311).
Abstract: A fuzzy inference system (FIS) is a process of fuzzy logic reasoning that produces an output based on fuzzified inputs. The system starts by identifying inputs from data, applying fuzziness to the inputs using membership functions (MFs), generating fuzzy rules for the fuzzy sets, and obtaining the output. Several types of input MFs can be introduced in an FIS, commonly chosen based on the type of real data, the sensitivity of the rules implied, and computational limits. This paper focuses on the construction of an interval type-2 (IT2) trapezoidal MF from fuzzy C-means (FCM) that is used for the fuzzification process of a Mamdani FIS. In the process, the upper MF (UMF) and lower MF (LMF) of the MF need to be identified to get the range of the footprint of uncertainty (FOU). This paper proposes a genetic tuning process, which is a part of the genetic algorithm (GA), to adjust parameters in order to improve the behavior of the existing system, especially to enhance the accuracy of the system model. This novel process is a hybrid approach that produces a Genetic Fuzzy System (GFS), which helps to enhance fuzzy classification and its performance. The approach provides a new method for the construction and tuning of the IT2 MF, based on the FCM outcomes. The result is compared to a Gaussian IT2 MF and a trapezoidal IT2 MF generated by the classic GA method. It is shown that the proposed approach is able to outperform the mentioned benchmark approaches. The work implies a wider range of IT2 MF types, constructed based on FCM outcomes, and an optimum generation of the FOU so that it can be implemented in practical applications such as prediction, analytics, and rule-based solutions.
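To make the UMF, LMF, and FOU terminology above concrete, here is a minimal sketch of an interval type-2 trapezoidal membership function. It does not reproduce the paper's FCM-based construction or its genetic tuning; the parameter values, the scaled-height LMF, and the function names are illustrative assumptions.

```python
def trapezoid(x, a, b, c, d, height=1.0):
    """Trapezoidal membership with feet a, d and shoulders b, c (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return height * (x - a) / (b - a)  # rising edge
    if x <= c:
        return height  # plateau
    return height * (d - x) / (d - c)  # falling edge


def it2_membership(x, umf, lmf):
    """Return (lower, upper) grades; the interval between them is the FOU at x."""
    upper = trapezoid(x, *umf)
    lower = min(trapezoid(x, *lmf), upper)  # the LMF may never exceed the UMF
    return lower, upper


# Hypothetical parameters: the LMF is nested inside the UMF and has a lower plateau.
UMF = (0.0, 2.0, 6.0, 8.0)
LMF = (1.0, 3.0, 5.0, 7.0, 0.8)

for x in (0.5, 1.5, 4.0):
    print(x, it2_membership(x, UMF, LMF))
```

For each input the function returns an interval of membership grades instead of a single grade. The width of that interval is the uncertainty that the FOU represents, and it is what a tuning procedure would adjust by moving the UMF and LMF parameters.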
Funding: The work presented in this paper is part of ongoing research funded by the Yayasan Universiti Teknologi PETRONAS Grant (015LC0-311 and 015LC0-029).
Abstract: Communication is a basic need of every human being to exchange thoughts and interact with society. Hearing people usually converse through different spoken languages, whereas deaf people cannot do so. Therefore, Sign Language (SL) is the communication medium of such people for their conversation and interaction with society. SL is expressed in terms of a specific gesture for every word, and a gesture consists of a sequence of performed signs. Hearing people normally observe these signs to understand the difference between single and multiple gestures for singular and plural words, respectively. The signs for singular words such as I, eat, drink, home differ from those for plural words such as school, cars, players. Special training is required to gain sufficient knowledge and practice so that people can differentiate and understand every gesture/sign appropriately. Numerous studies have sought a computer-based solution that understands a single gesture with the help of single-hand enumeration. A complete understanding of such communication is possible only when a computer-based SL solution differentiates these gestures so that it can cope with the real-world environment. Hence, there is still a demand for a specific environment to automate such a communication solution for interacting with deaf people. This research focuses on facilitating the deaf community by capturing gestures in video format and then mapping and differentiating them as single or multiple gestures used in words. Finally, these are converted into the respective words/sentences within a reasonable time. This provides a real-time solution for deaf people to communicate and interact with society.
Funding: This is ongoing research supported by the Yayasan Universiti Teknologi PETRONAS Grant Scheme, 015LC0029 and 015LC0277.
Abstract: Communication is a basic need of every human being; through it, people learn, express their feelings, and exchange their ideas, but deaf people cannot listen and speak. For communication, they use various hand gestures, also known as Sign Language (SL), which they learn at special schools. Since hearing people have not taken SL classes, they are unable to perform the signs for daily routine sentences (e.g., what are the specifications of this mobile phone?). A technological solution can help overcome this communication gap so that hearing people can communicate with deaf people. This paper presents an architecture for an application named Sign4PSL that translates sentences to Pakistan Sign Language (PSL) for deaf people, with visual representation using a virtual signing character. This research aims to develop a generic, independent application that is lightweight and reusable on any platform, including web and mobile, with the ability to perform offline text translation. Sign4PSL relies on a knowledge base that stores both a corpus of PSL words and their coded form in the notation system. Sign4PSL takes English language text as input, performs the translation to PSL through sign language notation, and displays gestures to the user using the virtual character. The system was tested on deaf students at a special school. The results showed that the students were able to understand the story presented to them appropriately.
Funding: Supported by the Yayasan Universiti Teknologi PETRONAS Grant Scheme, 015LC0029 and 015LC0277.
Abstract: A web browser is the most basic tool for accessing the internet from any machine or piece of equipment. Recently, data breaches have been reported frequently by users who are concerned about their personal information, as well as threats from criminal actors. Causing loss of data and information to an innocent user falls under the category of cyber-attack. These kinds of cyber-attacks are far more dangerous when it comes to the many types of devices employed in an Internet of Things (IoT) environment. Continuous surveillance of IoT devices and forensic tools are required to overcome the issues pertaining to securing data and assets. Peer-to-peer (P2P) applications have been utilized for criminal operations on the web. Therefore, it is a challenge for a forensic investigator to perform forensic analysis of the evolving hardware and software platforms for IoT. For identity concealment and privacy protection, the Onion Router (Tor) and Chrome with the Invisible Internet Project (I2P) as the foundation browser are often used. Confirmation is required to determine whether Tor is truly as anonymous and private as claimed. Some people, on the other hand, utilize the Tor browser for malicious purposes. Tools and techniques are available for the collection of artifacts, identification of problem areas, and further processing and analysis of data on the computer and IoT devices. The present research explores a few tools for tracing I2P activities on a computer running Windows 10 that reflects IoT devices. According to the results of this research, it leaves a large amount of important digital evidence on the operating system that can be exploited to attack the information of users. This research is based on the Windows operating system and does not cover other operating systems.
Funding: This is ongoing research supported by the Yayasan UTP Grant (015LC0-321 & 015LC0-311) and the Fundamental Research Grant Scheme (FRGS/1/2018/ICT02/UTP/02/1), a grant funded by the Ministry of Higher Education, Malaysia.
Abstract: Electricity price forecasting is a subset of energy and power forecasting that focuses on projecting present and future prices in the commercial electricity market. Electricity price forecasting has been a critical input to energy corporations’ strategic decision-making systems over the last 15 years. Many strategies have been utilized for price forecasting in the past; however, artificial intelligence techniques (fuzzy logic and ANN) have proven to be more efficient than traditional techniques (regression and time series). Fuzzy logic is an approach that uses membership functions (MFs) and a fuzzy inference model to forecast future electricity prices. Fuzzy c-means (FCM) is one of the popular clustering approaches for generating fuzzy membership functions. However, the fuzzy c-means algorithm is limited to producing only one type of MF, the Gaussian MF. The generation of various fuzzy membership functions is critical since it allows for more efficient and optimal problem solutions. As a result, to obtain the best results for electricity price forecasting, an approach to generate multiple type-1 fuzzy MFs using the FCM algorithm is required. Therefore, the objective of this paper is to propose an approach for generating type-1 fuzzy triangular and trapezoidal MFs using the FCM algorithm to overcome the limitations of the FCM algorithm. The approach is used to compute and improve forecasting accuracy for electricity prices, using Australian Energy Market Operator (AEMO) data. The results show that the proposed approach of using FCM to generate type-1 fuzzy MFs is effective and can be adopted.
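One simple way to picture triangular MFs derived from cluster centres is sketched below. It assumes the FCM cluster centres are already available and places one triangle per centre, with its feet at the neighbouring centres (or at the limits of the data range). This rule, the function names, and the price values are illustrative assumptions, not the construction proposed in the paper.

```python
def triangular(x, left, peak, right):
    """Triangular membership with feet at left and right and grade 1 at peak."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def mfs_from_centres(centres, lo, hi):
    """Build one (left, peak, right) triple per centre from the sorted centres."""
    points = [lo] + sorted(centres) + [hi]
    return [(points[i - 1], points[i], points[i + 1]) for i in range(1, len(points) - 1)]


# Hypothetical cluster centres for a price series, with an assumed range of 0 to 120
price_mfs = mfs_from_centres([80.0, 30.0, 50.0], lo=0.0, hi=120.0)
grades_at_40 = [triangular(40.0, *params) for params in price_mfs]
print(price_mfs)
print(grades_at_40)
```

With this layout, a price lying between two centres belongs partly to each of the two adjacent fuzzy sets, and the two grades sum to 1.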