The gasoline inline blending process widely uses real-time optimization techniques to achieve objectives such as minimizing production cost. However, the effectiveness of real-time optimization in gasoline blending relies on accurate blending models and is challenged by stochastic disturbances. Thus, we propose a real-time optimization algorithm based on the soft actor-critic (SAC) deep reinforcement learning strategy to optimize gasoline blending without relying on a single blending model and to be robust against disturbances. Our approach constructs the environment using nonlinear blending models and feedstocks with disturbances. The algorithm incorporates the Lagrange multiplier and path constraints in reward design to manage sparse product constraints. Carefully abstracted states facilitate algorithm convergence, and the normalized action vector in each optimization period allows the agent to generalize to some extent across different target production scenarios. Through these well-designed components, the SAC-based algorithm outperforms real-time optimization methods based on either nonlinear or linear programming. It even demonstrates performance comparable to the time-horizon-based real-time optimization method, which requires knowledge of uncertainty models, confirming its capability to handle uncertainty without accurate models. Our simulation illustrates a promising approach to freeing real-time optimization of the gasoline blending process from uncertainty models that are difficult to acquire in practice.
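The abstract does not give the reward formula, so the following is only a minimal sketch of one common way to fold a Lagrange multiplier into a reinforcement learning reward. The function names, the sign convention (a positive violation means the constraint is broken), and the dual-ascent step size are all assumptions made for illustration, not the authors' design.

```python
def lagrangian_reward(cost, violations, multipliers):
    """Reward = negative cost minus multiplier-weighted constraint violations.

    Hypothetical convention: violations[i] > 0 means constraint i is violated;
    satisfied constraints (violations[i] <= 0) contribute no penalty.
    """
    penalty = sum(lam * max(0.0, v) for lam, v in zip(multipliers, violations))
    return -cost - penalty


def update_multipliers(multipliers, violations, step=0.1):
    """Dual-ascent style update (an assumed scheme): raise a multiplier while
    its constraint is violated, lower it otherwise, and keep it non-negative."""
    return [max(0.0, lam + step * v) for lam, v in zip(multipliers, violations)]


# Illustrative numbers only (not taken from the paper).
reward = lagrangian_reward(cost=10.0, violations=[0.5, -0.2], multipliers=[2.0, 3.0])
new_multipliers = update_multipliers([2.0, 3.0], [0.5, -0.2])
```

In this sketch the penalty term gives the agent a dense signal whenever a product specification is missed, which is one plausible reading of how multipliers can help with sparse product constraints.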
Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the “selection of the optimal processing route” strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address this issue, this work proposes a method that uses Aspen HYSYS and full two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. Subsequently, a deep neural network prediction model is developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytic cracking feedstock in a refinery. The validation and test sets exhibit R² values of 0.99769 and 0.99807, respectively, and the average relative error of molecular composition prediction for raw materials of the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive exPlanations) interpretation method is used to disclose the relationships among different variables by performing global and local weight comparisons and correlation analyses.
The research focuses on improving predictive accuracy in the financial sector through the exploration of machine learning algorithms for stock price prediction. The research follows an organized process combining Agile Scrum and the Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) methodology. Six machine learning models, namely Linear Forecast, Naive Forecast, Simple Moving Average with weekly window (SMA 5), Simple Moving Average with monthly window (SMA 20), Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM), are compared and evaluated through Mean Absolute Error (MAE), with the LSTM model performing the best, showcasing its potential for practical financial applications. A Django web application “Predict It” is developed to implement the LSTM model. Ethical concerns related to predictive modeling in finance are addressed. Data quality, algorithm choice, feature engineering, and preprocessing techniques are emphasized for better model performance. The research acknowledges limitations and suggests future research directions, aiming to equip investors and financial professionals with reliable predictive models for dynamic markets.
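To make the MAE comparison concrete, here is a small sketch of how two of the simpler baselines named above (the naive forecast and a 5-day simple moving average) could be scored on the same days. The price series is made up for illustration, and the helper names are hypothetical; the study's actual data and code are not shown in the abstract.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(prices, start):
    """Predict prices[t] as the previous day's price, for t >= start."""
    return [prices[t - 1] for t in range(start, len(prices))]


def sma_forecast(prices, window):
    """Predict prices[t] as the mean of the preceding `window` prices."""
    return [sum(prices[t - window:t]) / window for t in range(window, len(prices))]


# Hypothetical closing prices, for illustration only.
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 13.0]
window = 5  # "weekly" window of five trading days, as in SMA 5

actual = prices[window:]  # evaluate both models on the same days
naive_mae = mean_absolute_error(actual, naive_forecast(prices, window))
sma_mae = mean_absolute_error(actual, sma_forecast(prices, window))
print(f"naive MAE = {naive_mae:.2f}, SMA-5 MAE = {sma_mae:.2f}")
```

Scoring every model on the same evaluation days matters here: a moving average cannot forecast the first `window` days, so comparing MAE over different ranges would not be like for like.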
Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector, Random Forest, XGBoost, and Light Gradient-Boosting Machine to classify transactions as fraud or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as “V12” and “V14”. SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigate financial losses and protect consumers.
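The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates only that idea in plain Python; it is not the library implementation the study may have used, and the function names, the value of `k`, and the toy points are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from minority-class samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority-class ("fraud") samples with two features, for illustration only.
fraud_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(fraud_points, n_new=6)
```

Because every synthetic point is an interpolation, it stays inside the region already occupied by the minority class. Oversampling is normally applied to the training split only, so that the test set keeps the real class ratio.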
Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation using standard sequential methods. This work analyzes the performance gains from parallel versus sequential hyperparameter optimization. Using scikit-learn’s RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized search. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times compared to sequential tuning. However, test accuracy slightly dropped from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable timeframes, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
The increasing prevalence of Internet of Things (IoT) devices has introduced a new phase of connectivity in recent years and, concurrently, has opened the floodgates for growing cyber threats. Among the myriad potential attacks, Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks remain a dominant concern due to their capability to render services inoperable by overwhelming systems with an influx of traffic. As IoT devices often lack the inherent security measures found in more mature computing platforms, robust DoS/DDoS detection systems tailored to IoT are paramount for the sustainable development of every domain that IoT serves. In this study, we investigate the effectiveness of three machine learning (ML) algorithms, extreme gradient boosting (XGB), multilayer perceptron (MLP), and random forest (RF), for the detection of IoT-targeted DoS/DDoS attacks, together with three feature engineering methods that have not been used in the existing state of the art. We then employ the best-performing algorithm to design a prototype of a novel real-time system for detecting such DoS/DDoS attacks. The CICIoT2023 dataset, derived from recent real-world IoT traffic, incorporates both benign and malicious network traffic patterns. After data preprocessing and feature engineering, the data was fed into our models for both training and validation. The findings suggest that while all three models exhibit commendable accuracy in detecting DoS/DDoS attacks, the use of particle swarm optimization (PSO) for feature selection greatly improves both the performance of the ML models (accuracy, precision, recall, and F1-score of 99.93% for XGB) and their execution time (491.023 seconds for XGB) compared with the recursive feature elimination (RFE) and random forest feature importance (RFI) methods. The proposed real-time system for DoS/DDoS attack detection entails the implementation of a platform capable of effectively processing and analyzing network traffic in real time. This involves employing the best-performing ML algorithm for detection and integrating warning mechanisms. We believe this approach will significantly advance security research, and we will continue to refine it based on future insights and developments.
Objective: We propose a solution that is backed by cloud computing and combines a series of computer vision neural networks. It is capable of detecting, highlighting, and locating breast lesions from a live ultrasound video feed, provides BI-RADS categorizations, and has reliable sensitivity and specificity. Multiple deep-learning models were trained on more than 300,000 breast ultrasound images to achieve object detection and region-of-interest classification. The main objective of this study was to determine whether the performance of our AI-powered solution was comparable to that of ultrasound radiologists. Methods: The noninferiority evaluation was conducted by comparing the examination results for the same screened women between our AI-powered solution and ultrasound radiologists with over 10 years of experience. The study lasted for one and a half years and was carried out in the Duanzhou District Women and Children's Hospital, Zhaoqing, China. A total of 1,133 women between 20 and 70 years old were selected through convenience sampling. Results: The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were 93.03%, 94.90%, 90.71%, 92.68%, and 93.48%, respectively. The area under the curve (AUC) for all positives was 0.91569 and the AUC for all negatives was 0.90461. The comparison indicated that the overall performance of the AI system was comparable to that of ultrasound radiologists. Conclusion: This innovative AI-powered ultrasound solution is cost-effective and user-friendly, and could be applied to mass breast cancer screening.
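The five screening metrics reported above all derive from the four cells of a confusion matrix. The sketch below shows the standard formulas; the counts passed in are hypothetical and are not the study's data, since the abstract reports only the resulting percentages.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased cases flagged / missed; tn/fp: healthy cases cleared / flagged.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = screening_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity describe the test itself, while PPV and NPV also depend on how common the condition is in the screened population, which is why all five are usually reported together.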
Intelligent healthcare networks are a significant component of digital applications, whose key requirements are quality-of-service (QoS) reliability and privacy protection. This paper addresses these requirements through the integration of enabler paradigms, including federated learning (FL), cloud/edge computing, software-defined/virtualized networking infrastructure, and converged prediction algorithms. The study focuses on achieving reliability and efficiency in real-time prediction models, which depend on the interaction flows and network topology. In response to these challenges, we introduce a modified version of federated logistic regression (FLR) that takes into account convergence latencies and the accuracy of the final FL model within healthcare networks. To establish the FLR framework for mission-critical healthcare applications, we provide a comprehensive workflow covering framework setup, iterative round communications, and model evaluation/deployment. Our optimization process covers the formulation of loss functions and gradients within federated optimization and concludes with the generation of service experience batches for model deployment. To assess the practicality of our approach, we conducted experiments using a hypertension prediction model with data sourced from the 2019 annual dataset (Version 2.0.1) of the Korea Medical Panel Survey. Performance metrics, including end-to-end execution delays, model drop/delivery ratios, and final model accuracies, are captured and compared between the proposed FLR framework and other baseline schemes. Our study offers an FLR framework setup for enhancing real-time prediction modeling within intelligent healthcare networks, addressing the critical demands of QoS reliability and privacy preservation.
Predicting the mechanical behavior of structures and perceiving anomalies in advance are essential to ensuring the safe long-term operation of infrastructure. Existing studies consider influencing factors incompletely, and their prediction time scale is coarse. Therefore, this study develops a real-time prediction model that couples spatio-temporal correlation with external load through an autoencoder network (ATENet) based on structural health monitoring (SHM) data. An autoencoder mechanism is used to acquire a high-level representation of raw monitoring data at different spatial positions, and a recurrent neural network is applied to learn the temporal correlation from the time series. The obtained temporal-spatial information is then coupled with dynamic loads through a fully connected layer to predict structural performance over the next 12 h. As a case study, the proposed model is built on SHM data collected from a representative underwater shield tunnel. A robustness study is carried out to verify the reliability and prediction capability of the proposed model. Finally, the ATENet model is compared with some typical models, and the results indicate that it has the best performance. The ATENet model is of great value for predicting the real-time evolution trend of tunnel structures.
Production optimization has gained increasing attention from the smart oilfield community because it can increase economic benefits and oil recovery substantially. While existing methods can produce high-optimality results, they cannot be applied to real-time optimization for large-scale reservoirs due to high computational demands. In addition, most methods assume that the reservoir model is deterministic and ignore the uncertainty of the subsurface environment, making the obtained scheme unreliable for practical deployment. In this work, an efficient and robust method, namely evolutionary-assisted reinforcement learning (EARL), is proposed to achieve real-time production optimization under uncertainty. Specifically, the production optimization problem is modeled as a Markov decision process in which a reinforcement learning agent interacts with the reservoir simulator to train a control policy that maximizes the specified goals. To deal with the brittle convergence properties and lack of efficient exploration strategies of reinforcement learning approaches, a population-based evolutionary algorithm is introduced to assist the training of agents, which provides diverse exploration experiences and promotes stability and robustness due to its inherent redundancy. Compared with prior methods that only optimize a solution for a particular scenario, the proposed approach trains a policy that can adapt to uncertain environments and make real-time decisions to cope with unknown changes. The trained policy, represented by a deep convolutional neural network, can adaptively adjust the well controls based on different reservoir states. Simulation results on two reservoir models show that the proposed approach not only outperforms the RL and EA methods in terms of optimization efficiency but also has strong robustness and real-time decision capacity.
To detect the improper sitting posture of a person sitting on a chair, this work proposes a posture detection system using machine learning classification. The addressed problem relates to the third Sustainable Development Goal (SDG), ensuring healthy lives and promoting well-being for all ages, as specified by the World Health Organization (WHO). Sitting in an improper position for a long time can be fatal, and it can lead to ulcers and lower-spine discomfort. This study includes a practical implementation of a cushion with an embedded 3×3 grid of force-sensitive resistors (FSR) that reads the pressure of the person sitting on it. Additionally, the Body Mass Index (BMI) has been included to increase the resilience of the system across individual physical variances and to identify incorrect postures (backward, front, left, and right-leaning) using five machine learning algorithms: ensemble boosted trees, ensemble bagged trees, ensemble subspace K-Nearest Neighbors (KNN), ensemble subspace discriminant, and ensemble RUSBoosted trees. The proposed arrangement is novel because existing works have only provided simulations without practical implementation, whereas we have implemented the proposed design in Simulink. The results validate the proposed sensor placements, and the machine learning (ML) model reaches a maximum accuracy of 99.99%, which considerably outperforms existing works. The proposed concept is valuable because it makes it easier for people in workplaces or at home to work for long periods without suffering severe harmful effects from poor posture.
This paper presents a machine-learning-based speedup strategy for real-time implementation of model predictive control (MPC) in emergency voltage stabilization of power systems. Despite success in various applications, real-time implementation of MPC in power systems has not been successful because of the online control computation time required for large, complex systems; in power systems, the computation time far exceeds the decision time available in practice. This long-standing problem is addressed here by developing a novel MPC-based framework that (i) computes an optimal strategy for nominal loads in an offline setting and adapts it for real-time scenarios by successive online control corrections at each control instant using the latest measurements, and (ii) employs a machine-learning-based approach to predict the voltage trajectory and its sensitivity to control inputs, thereby accelerating the overall control computation severalfold. Additionally, a realistic control coordination scheme among static var compensators (SVC), load shedding (LS), and load tap-changers (LTC) is presented that incorporates the practical delayed actions of the LTCs. The performance of the proposed scheme is validated for IEEE 9-bus and 39-bus systems, with ±20% variations in nominal loading conditions together with contingencies. We show that our proposed methodology speeds up the online computation 20-fold, bringing it down to a practically feasible value (a fraction of a second), making MPC real-time and feasible for power system control for the first time.
The recent trends in Industry 4.0 and the Internet of Things have encouraged many factory managers to improve inspection processes to achieve automation and high detection rates. However, because of the corresponding cost, sample tests are still used for quality control. A low-cost automated optical inspection system that can be integrated with production lines to fully inspect products without adjustments is introduced herein. The corresponding mechanism design enables each product to maintain a fixed position and orientation during inspection to accelerate the inspection process. The proposed system combines image recognition and deep learning to measure the dimensions of the thread and identify its defects within 20 s, which is shorter than the production-line rate of one product per 30 s. In addition, the system is designed to be used for monitoring production lines and equipment status. The dimensional tolerance of the proposed system reaches 0.012 mm, and 100% accuracy is achieved in defect resolution. In addition, an attention-based visualization approach is used to verify the rationale for using the convolutional neural network model and to identify the location of thread defects.
The aim of this article is to assist farmers in making better crop selection decisions based on soil fertility and weather forecasts through the use of IoT and AI (smart farming). To accomplish this, a prototype was developed that predicts the most suitable crop for a specific plot of land based on soil fertility and makes recommendations based on the weather forecast. A Random Forest machine learning algorithm was trained with Jupyter in the Anaconda framework and achieved an accuracy of about 99%. The prototype combines IoT with the Message Queuing Telemetry Transport (MQTT) protocol, the Random Forest model, and a weather forecast API for crop prediction and recommendations. It accepts nitrogen, phosphorus, potassium, humidity, temperature, and pH as input parameters from the IoT sensors, as well as forecast data from the weather API. The approach was tested in a suburban area of Yaounde (Cameroon). Taking future meteorological parameters (rainfall, wind, and temperature) into account produced better recommendations and therefore better crop selection. All necessary results can be accessed from anywhere and at any time using the IoT system via a web browser.
Deep learning models have had a significant impact on various problems and can be adapted to the problem of brain tumor classification. Several deep learning models have been presented previously, but their classification accuracy needs improvement. An efficient Multi-Feature Approximation Based Convolution Neural Network (CNN) model (MFACNN) is proposed to handle this issue. The method reads the input 3D Magnetic Resonance Imaging (MRI) images and applies Gabor filters at multiple levels. The quality of the noise-removed image is then improved using histogram equalization. Further, features such as white mass, grey mass, texture, and shape are extracted from the images. The extracted features are used to train a deep learning CNN. The network has been designed with a single convolution layer for dimensionality reduction. The texture features obtained from the brain image are transformed into a multi-dimensional feature matrix, which is transformed into a single-dimensional feature vector at the convolution layer. The neurons of the intermediate layer are designed to measure White Mass Texture Support (WMTS), Gray Mass Texture Support (GMTS), White Mass Covariance Support (WMCS), Gray Mass Covariance Support (GMCS), and Class Texture Adhesive Support (CTAS). In the test phase, the neurons at the intermediate layer compute the above-mentioned support values for the various classes of images. Based on these, the method computes a Multi-Variate Feature Similarity Measure (MVFSM). Based on the value of MVFSM, the process determines the class of the given brain image and produces an efficient result.
Spammer detection aims to identify and block users who perform malicious activities. Such users should be identified and removed from social media to keep the social media process organic and to maintain the integrity of online social spaces. Previous research aimed to find spammers using hybrid approaches based on graph mining, posted content, and metadata, with small and manually labeled datasets. However, such hybrid approaches are unscalable, not robust, and dependent on particular datasets, and they require numerous parameters, complex graphs, and natural language processing (NLP) resources to make decisions, which makes them impractical for real-time detection. For example, graph mining requires neighbors’ information, and content-based approaches require multiple tweets from user profiles and then NLP resources to make decisions, which is not feasible in a real-time environment. To fill the gap, firstly, we propose a REal-time Metadata based Spammer detection (REMS) model that uses only metadata features to identify spammers, takes the least number of parameters, and provides adequate results. REMS is a scalable and robust model that uses only 19 metadata features of Twitter users to achieve a 73.81% F1-score using a balanced training dataset (50% spam and 50% genuine users). The 19 features comprise 8 original features of Twitter users and 11 features derived from them, identified through extensive experiments and analysis. Secondly, we present the largest and most diverse dataset in published research, comprising 211 K spam users and 1 million genuine users. The diversity of the dataset is reflected in its users, who posted 2.1 million Tweets on seven topics (100 hashtags) from 6 different geographical locations. REMS’s superior classification performance with multiple machine and deep learning methods indicates that metadata features alone have the potential to identify spammers, without relying on volatile posted content and complex graph structures. The dataset and REMS’s code are available on GitHub (www.github.com/mhadnanali/REMS).
Stroke is a leading cause of disability and mortality worldwide, necessitating the development of advanced technologies to improve its diagnosis, treatment, and patient outcomes. In recent years, machine learning techniques have emerged as promising tools in stroke medicine, enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches. This abstract provides a comprehensive overview of machine learning’s applications, challenges, and future directions in stroke medicine. Recently introduced machine learning algorithms have been extensively employed in all fields of stroke medicine. Machine learning models have demonstrated remarkable accuracy in imaging analysis, diagnosing stroke subtypes, risk stratification, guiding medical treatment, and predicting patient prognosis. Despite the tremendous potential of machine learning in stroke medicine, several challenges must be addressed. These include the need for standardized and interoperable data collection, robust model validation and generalization, and the ethical considerations surrounding privacy and bias. In addition, integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care. Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis, tailored treatment selection, and improved prognostication. Continued research and collaboration among clinicians, researchers, and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care, ultimately leading to enhanced patient outcomes and quality of life. This review aims to summarize the current implications of machine learning in stroke diagnosis, treatment, and prognostic evaluation. It also explores the future perspectives these techniques can offer in combating this disabling disease.
Funding (gasoline inline blending study): Supported by the National Key Research & Development Program-Intergovernmental International Science and Technology Innovation Cooperation Project (2021YFE0112800), the National Natural Science Foundation of China (Key Program: 62136003), the National Natural Science Foundation of China (62073142), the Fundamental Research Funds for the Central Universities (222202417006), and Shanghai AI Lab.
Funding: Supported by the National Natural Science Foundation of China (22108307), the Natural Science Foundation of Shandong Province (ZR2020KB006), and the Outstanding Youth Fund of Shandong Provincial Natural Science Foundation (ZR2020YQ17).
Abstract: Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the “selection of the optimal processing route” strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address this issue, this work proposes an innovative method that utilizes Aspen HYSYS and full two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. Subsequently, a deep neural network prediction model is developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytically cracked raw oil in a refinery. The validation and test sets exhibit R² values of 0.99769 and 0.99807, respectively, and the average relative error of molecular composition prediction for raw materials of the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive ExPlanation) interpretation method is used to disclose the relationship among different variables by performing global and local weight comparisons and correlation analyses.
Abstract: The research focuses on improving predictive accuracy in the financial sector through the exploration of machine learning algorithms for stock price prediction. The research follows an organized process combining Agile Scrum and the Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) methodology. Six forecasting models, namely Linear Forecast, Naive Forecast, Simple Moving Average with weekly window (SMA 5), Simple Moving Average with monthly window (SMA 20), Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM), are compared and evaluated through Mean Absolute Error (MAE), with the LSTM model performing the best, showcasing its potential for practical financial applications. A Django web application “Predict It” is developed to implement the LSTM model. Ethical concerns related to predictive modeling in finance are addressed. Data quality, algorithm choice, feature engineering, and preprocessing techniques are emphasized for better model performance. The research acknowledges limitations and suggests future research directions, aiming to equip investors and financial professionals with reliable predictive models for dynamic markets.
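The baseline comparison described in this abstract can be illustrated with a minimal sketch. It assumes a naive forecast that repeats the previous price and a simple moving average over the preceding `window` prices, both scored by MAE. The function names and the price series are hypothetical and are not taken from the paper.

```python
def naive_forecast(prices):
    """Predict each price as the previous price (one prediction per prices[1:])."""
    return prices[:-1]


def sma_forecast(prices, window):
    """Predict each price as the mean of the preceding `window` prices."""
    return [
        sum(prices[i - window:i]) / window
        for i in range(window, len(prices))
    ]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices, for illustration only.
prices = [10.0, 11.0, 12.0, 11.0, 10.0, 11.0, 12.0, 13.0]

# Each forecast is compared against the prices it was trying to predict.
naive_mae = mean_absolute_error(prices[1:], naive_forecast(prices))
sma5_mae = mean_absolute_error(prices[5:], sma_forecast(prices, 5))
```

Note that the two models are scored on different numbers of points here (seven versus three), so a fair comparison would restrict both to the same evaluation window.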
Abstract: Credit card fraud remains a significant challenge, with financial losses and consumer protection at stake. This study addresses the need for practical, real-time fraud detection methodologies. Using a Kaggle credit card dataset, I tackle class imbalance using the Synthetic Minority Oversampling Technique (SMOTE) to enhance modeling efficiency. I compare several machine learning algorithms, including Logistic Regression, Linear Discriminant Analysis, K-nearest Neighbors, Classification and Regression Tree, Naive Bayes, Support Vector, Random Forest, XGBoost, and Light Gradient-Boosting Machine to classify transactions as fraud or genuine. Rigorous evaluation metrics, such as AUC, PRAUC, F1, KS, Recall, and Precision, identify the Random Forest as the best performer in detecting fraudulent activities. The Random Forest model successfully identifies approximately 92% of transactions scoring 90 and above as fraudulent, equating to a detection rate of over 70% for all fraudulent transactions in the test dataset. Moreover, the model captures more than half of the fraud in each bin of the test dataset. SHAP values provide model explainability, with the SHAP summary plot highlighting the global importance of individual features, such as “V12” and “V14”. SHAP force plots offer local interpretability, revealing the impact of specific features on individual predictions. This study demonstrates the potential of machine learning, particularly the Random Forest model, for real-time credit card fraud detection, offering a promising approach to mitigate financial losses and protect consumers.
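To make the oversampling step concrete, here is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is an illustration under assumptions and not the study's pipeline. The function name, the choice of `k`, and the sample points are hypothetical, and a real project would more likely use an existing library implementation.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbours
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D minority-class (fraud) samples, for illustration only.
fraud = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(fraud, n_new=6)
```

Because every synthetic point is a convex combination of two real minority points, it always stays inside the region the minority class already occupies.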
Abstract: Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation using standard sequential methods. This work analyzes the performance gains from parallel versus sequential hyperparameter optimization. Using scikit-learn’s RandomizedSearchCV, this project tuned a Random Forest classifier for fake news detection via randomized search. Setting n_jobs to -1 enabled full parallelization across CPU cores. Results show the parallel implementation achieved over 5× faster CPU times and 3× faster total run times compared to sequential tuning. However, test accuracy dropped slightly from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the significant computational gains allow more extensive hyperparameter exploration within reasonable timeframes, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
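Randomized search parallelizes well because each sampled candidate is scored independently of the others. The sketch below shows that structure using only the standard library. It is not scikit-learn's implementation: `toy_score` is a hypothetical stand-in for a cross-validated score, the search space is invented, and a thread pool is used only for portability (CPU-bound scoring would normally need worker processes to gain speed).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sample_candidates(space, n_iter, seed):
    """Draw n_iter random hyperparameter settings from a dict of value lists."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]


def toy_score(params):
    """Hypothetical stand-in for a cross-validated score; higher is better."""
    return (-abs(params["n_estimators"] - 200) / 100
            - abs(params["max_depth"] - 10) / 10)


def random_search(score_fn, space, n_iter, seed=0, max_workers=None):
    """Score every sampled candidate and return (best_params, best_score).

    max_workers=1 evaluates one candidate at a time; larger values
    evaluate several candidates concurrently.
    """
    candidates = sample_candidates(space, n_iter, seed)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(score_fn, candidates))  # map keeps input order
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Hypothetical search space, for illustration only.
space = {"n_estimators": [50, 100, 200, 400], "max_depth": [5, 10, 20]}
sequential = random_search(toy_score, space, n_iter=20, seed=0, max_workers=1)
parallel = random_search(toy_score, space, n_iter=20, seed=0, max_workers=4)
```

In this sketch the candidates are sampled before any scoring starts, so a fixed seed yields the same best candidate regardless of the number of workers.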
Abstract: The increasing prevalence of Internet of Things (IoT) devices has introduced a new phase of connectivity in recent years and, concurrently, has opened the floodgates for growing cyber threats. Among the myriad of potential attacks, Denial of Service (DoS) attacks and Distributed Denial of Service (DDoS) attacks remain a dominant concern due to their capability to render services inoperable by overwhelming systems with an influx of traffic. As IoT devices often lack the inherent security measures found in more mature computing platforms, the need for robust DoS/DDoS detection systems tailored to IoT is paramount for the sustainable development of every domain that IoT serves. In this study, we investigate the effectiveness of three machine learning (ML) algorithms, extreme gradient boosting (XGB), multilayer perceptron (MLP), and random forest (RF), for the detection of IoT-targeted DoS/DDoS attacks, together with three feature engineering methods that have not been used in the existing state of the art, and then employ the best-performing algorithm to design a prototype of a novel real-time system for detecting such DoS/DDoS attacks. The CICIoT2023 dataset, derived from the latest real-world IoT traffic, incorporates both benign and malicious network traffic patterns. After data preprocessing and feature engineering, the data was fed into our models for both training and validation. The findings suggest that while all three models exhibit commendable accuracy in detecting DoS/DDoS attacks, the use of particle swarm optimization (PSO) for feature selection greatly improved both the performance of the ML models (accuracy, precision, recall, and F1-score of 99.93% for XGB) and their execution time (491.023 seconds for XGB) compared to the recursive feature elimination (RFE) and random forest feature importance (RFI) methods. The proposed real-time system for DoS/DDoS attack detection entails the implementation of a platform capable of effectively processing and analyzing network traffic in real time. This involves employing the best-performing ML algorithm for detection and the integration of warning mechanisms. We believe this approach will significantly enhance the field of security research, and we will continue to refine it based on future insights and developments.
Abstract: Objective: We propose a solution that is backed by cloud computing; combines a series of AI neural networks for computer vision; is capable of detecting, highlighting, and locating breast lesions from a live ultrasound video feed; provides BI-RADS categorizations; and has reliable sensitivity and specificity. Multiple deep-learning models were trained on more than 300,000 breast ultrasound images to achieve object detection and region-of-interest classification. The main objective of this study was to determine whether the performance of our AI-powered solution was comparable to that of ultrasound radiologists. Methods: The noninferiority evaluation was conducted by comparing the examination results for the same screened women between our AI-powered solution and ultrasound radiologists with over 10 years of experience. The study lasted for one and a half years and was carried out in the Duanzhou District Women and Children's Hospital, Zhaoqing, China. A total of 1,133 females between 20 and 70 years old were selected through convenience sampling. Results: The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were 93.03%, 94.90%, 90.71%, 92.68%, and 93.48%, respectively. The area under the curve (AUC) for all positives was 0.91569 and the AUC for all negatives was 0.90461. The comparison indicated that the overall performance of the AI system was comparable to that of ultrasound radiologists. Conclusion: This innovative AI-powered ultrasound solution is cost-effective and user-friendly, and could be applied to mass breast cancer screening.
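The five screening metrics reported in this abstract all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and chosen only for easy arithmetic; they are not the study's data, and the function name is an assumption.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary screening metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=90, fp=10, tn=80, fn=20)
```

Sensitivity and specificity are conditioned on the true status, whereas PPV and NPV are conditioned on the test result, so the latter two shift with disease prevalence in the screened population.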
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS2022-00167197, Development of Intelligent 5G/6G Infrastructure Technology for the Smart City), in part by the National Research Foundation of Korea (NRF), Ministry of Education, through the Basic Science Research Program under Grant NRF-2020R1I1A3066543, in part by BK21 FOUR (Fostering Outstanding Universities for Research) under Grant 5199990914048, and in part by the Soonchunhyang University Research Fund.
Abstract: Intelligent healthcare networks represent a significant component in digital applications, where the requirements center on quality-of-service (QoS) reliability and safeguarding privacy. This paper addresses these requirements through the integration of enabler paradigms, including federated learning (FL), cloud/edge computing, software-defined/virtualized networking infrastructure, and converged prediction algorithms. The study focuses on achieving reliability and efficiency in real-time prediction models, which depend on the interaction flows and network topology. In response to these challenges, we introduce a modified version of federated logistic regression (FLR) that takes into account convergence latencies and the accuracy of the final FL model within healthcare networks. To establish the FLR framework for mission-critical healthcare applications, we provide a comprehensive workflow in this paper, introducing framework setup, iterative round communications, and model evaluation/deployment. Our optimization process delves into the formulation of loss functions and gradients within the domain of federated optimization, which concludes with the generation of service experience batches for model deployment. To assess the practicality of our approach, we conducted experiments using a hypertension prediction model with data sourced from the 2019 annual dataset (Version 2.0.1) of the Korea Medical Panel Survey. Performance metrics, including end-to-end execution delays, model drop/delivery ratios, and final model accuracies, are captured and compared between the proposed FLR framework and other baseline schemes. Our study offers an FLR framework setup for the enhancement of real-time prediction modeling within intelligent healthcare networks, addressing the critical demands of QoS reliability and privacy preservation.
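The round-based workflow mentioned in this abstract (local training followed by server-side aggregation) can be sketched with plain federated averaging of logistic regression weights. This is a generic illustration and not the modified FLR proposed in the paper. The learning rate, the number of local epochs, and the client data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_update(weights, data, lr=0.5, epochs=5):
    """Run a few epochs of batch gradient descent on one client's data."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def federated_round(global_w, clients):
    """Clients train locally; the server averages weights by sample count."""
    total = sum(len(d) for d in clients)
    updates = [(local_update(global_w, d), len(d)) for d in clients]
    return [
        sum(w[j] * n for w, n in updates) / total
        for j in range(len(global_w))
    ]


def log_loss(w, data):
    """Mean binary cross-entropy of weights w on the given samples."""
    eps = 1e-12
    loss = 0.0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, eps), 1 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss / len(data)


# Hypothetical clients; each sample is ([bias term, feature], label).
clients = [
    [([1.0, 0.2], 0), ([1.0, 0.9], 1), ([1.0, 0.4], 0)],
    [([1.0, 0.8], 1), ([1.0, 0.1], 0)],
    [([1.0, 0.7], 1), ([1.0, 0.3], 0), ([1.0, 0.6], 1)],
]
weights = [0.0, 0.0]
for _ in range(20):
    weights = federated_round(weights, clients)
```

Only weight vectors cross the client boundary in this sketch, which is the property that motivates federated learning for privacy-sensitive healthcare data; raw samples stay with each client.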
Funding: This work is supported by the National Natural Science Foundation of China (Grant No. 51991392), the Key Deployment Projects of the Chinese Academy of Sciences (Grant No. ZDRW-ZS-2021-3-3), and the Second Tibetan Plateau Scientific Expedition and Research Program (STEP) (Grant No. 2019QZKK0904).
Abstract: Predicting the mechanical behavior of structures and perceiving anomalies in advance are essential to ensuring the safe operation of infrastructure in the long run. Existing studies consider the influencing factors incompletely, and their prediction time scale is coarse. Therefore, this study focuses on the development of a real-time prediction model that couples the spatio-temporal correlation with external load through an autoencoder network (ATENet) based on structural health monitoring (SHM) data. An autoencoder mechanism is used to acquire a high-level representation of raw monitoring data at different spatial positions, and a recurrent neural network is applied to capture the temporal correlation in the time series. Then, the obtained temporal-spatial information is coupled with dynamic loads through a fully connected layer to predict structural performance in the next 12 h. As a case study, the proposed model is formulated on the SHM data collected from a representative underwater shield tunnel. A robustness study is carried out to verify the reliability and the prediction capability of the proposed model. Finally, the ATENet model is compared with some typical models, and the results indicate that it has the best performance. The ATENet model is of great value for predicting the real-time evolution trend of the tunnel structure.
Funding: This work is supported by the National Natural Science Foundation of China under Grants 52274057, 52074340, and 51874335, the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-008, the Science and Technology Support Plan for Youth Innovation of University in Shandong Province under Grant 2019KJH002, and the 111 Project under Grant B08028.
Abstract: Production optimization has gained increasing attention from the smart oilfield community because it can increase economic benefits and oil recovery substantially. While existing methods could produce high-optimality results, they cannot be applied to real-time optimization for large-scale reservoirs due to high computational demands. In addition, most methods generally assume that the reservoir model is deterministic and ignore the uncertainty of the subsurface environment, making the obtained scheme unreliable for practical deployment. In this work, an efficient and robust method, namely evolutionary-assisted reinforcement learning (EARL), is proposed to achieve real-time production optimization under uncertainty. Specifically, the production optimization problem is modeled as a Markov decision process in which a reinforcement learning agent interacts with the reservoir simulator to train a control policy that maximizes the specified goals. To deal with the problems of brittle convergence properties and lack of efficient exploration strategies of reinforcement learning approaches, a population-based evolutionary algorithm is introduced to assist the training of agents, which provides diverse exploration experiences and promotes stability and robustness due to its inherent redundancy. Compared with prior methods that only optimize a solution for a particular scenario, the proposed approach trains a policy that can adapt to uncertain environments and make real-time decisions to cope with unknown changes. The trained policy, represented by a deep convolutional neural network, can adaptively adjust the well controls based on different reservoir states. Simulation results on two reservoir models show that the proposed approach not only outperforms the RL and EA methods in terms of optimization efficiency but also has strong robustness and real-time decision capacity.
Abstract: To detect the improper sitting posture of a person sitting on a chair, a posture detection system using machine learning classification is proposed in this work. The addressed problem relates to the third Sustainable Development Goal (SDG), ensuring healthy lives and promoting well-being for all ages, as specified by the World Health Organization (WHO). An improper sitting position can be fatal if one sits for a long time in the wrong position, and it can lead to ulcers and lower spine discomfort. This novel study includes a practical implementation of a cushion with an embedded 3×3 grid of force-sensitive resistors (FSR) that reads the pressure of the person sitting on it. Additionally, the Body Mass Index (BMI) has been included to increase the resilience of the system across individual physical variances and to identify the incorrect postures (backward, front, left, and right-leaning) based on five machine learning algorithms: ensemble boosted trees, ensemble bagged trees, ensemble subspace K-Nearest Neighbors (KNN), ensemble subspace discriminant, and ensemble RUSBoosted trees. The proposed arrangement is novel as existing works have only provided simulations without practical implementation, whereas we have implemented the proposed design in Simulink. The results validate the proposed sensor placements, and the machine learning (ML) model reaches a maximum accuracy of 99.99%, which considerably outperforms the existing works. The proposed concept is valuable as it makes it easier for people in workplaces or even at individual household levels to work for long periods without suffering severe harmful effects from poor posture.
Funding: This work was supported in part by the National Science Foundation (NSF-CSSI-2004766, NSF-PFI-2141084).
Abstract: This paper presents a machine-learning-based speedup strategy for real-time implementation of model-predictive-control (MPC) in emergency voltage stabilization of power systems. Despite success in various applications, real-time implementation of MPC in power systems has not been successful due to the online control computation time required for large-sized complex systems, and in power systems, the computation time exceeds the available decision time used in practice by a large extent. This long-standing problem is addressed here by developing a novel MPC-based framework that i) computes an optimal strategy for nominal loads in an offline setting and adapts it for real-time scenarios by successive online control corrections at each control instant utilizing the latest measurements, and ii) employs a machine-learning based approach for the prediction of voltage trajectory and its sensitivity to control inputs, thereby accelerating the overall control computation by multiple times. Additionally, a realistic control coordination scheme among static var compensators (SVC), load-shedding (LS), and load tap-changers (LTC) is presented that incorporates the practical delayed actions of the LTCs. The performance of the proposed scheme is validated for IEEE 9-bus and 39-bus systems, with ±20% variations in nominal loading conditions together with contingencies. We show that our proposed methodology speeds up the online computation by 20-fold, bringing it down to a practically feasible value (fraction of a second), making the MPC real-time and feasible for power system control for the first time.
Funding: Supported partially by the Ministry of Science and Technology, Taiwan, under contracts MOST-110-2634-F-009-024, 109-2218-E-150-002, and 109-2218-E-005-015.
Abstract: The recent trends in Industry 4.0 and Internet of Things have encouraged many factory managers to improve inspection processes to achieve automation and high detection rates. However, the corresponding cost results of sample tests are still used for quality control. A low-cost automated optical inspection system that can be integrated with production lines to fully inspect products without adjustments is introduced herein. The corresponding mechanism design enables each product to maintain a fixed position and orientation during inspection to accelerate the inspection process. The proposed system combines image recognition and deep learning to measure the dimensions of the thread and identify its defects within 20 s, which is lower than the production-line productivity per 30 s. In addition, the system is designed to be used for monitoring production lines and equipment status. The dimensional tolerance of the proposed system reaches 0.012 mm, and a 100% accuracy is achieved in terms of the defect resolution. In addition, an attention-based visualization approach is utilized to verify the rationale for the use of the convolutional neural network model and identify the location of thread defects.
Abstract: The aim of this article is to assist farmers in making better crop selection decisions based on soil fertility and weather forecasts through the use of IoT and AI (smart farming). To accomplish this, a prototype was developed that is capable of predicting the most suitable crop for a specific plot of land based on soil fertility and of making recommendations based on the weather forecast. A Random Forest machine learning algorithm was trained with Jupyter in the Anaconda framework and achieved an accuracy of about 99%. The resulting system combines IoT with the Message Queuing Telemetry Transport (MQTT) protocol, the Random Forest model, and a weather forecast API for crop prediction and recommendations. The prototype accepts nitrogen, phosphorus, potassium, humidity, temperature, and pH as input parameters from the IoT sensors, as well as forecast data from the weather API. The approach was tested in a suburban area of Yaounde (Cameroon). Taking into account future meteorological parameters (rainfall, wind, and temperature) in this project produced better recommendations and therefore better crop selection. All necessary results can be accessed from anywhere and at any time using the IoT system via a web browser.
Abstract: Deep learning models have been identified as having a significant impact on various problems, and they can be adapted to the problem of brain tumor classification. Several deep learning models have been presented earlier, but they need better classification accuracy. An efficient Multi-Feature Approximation Based Convolution Neural Network (CNN) model (MFACNN) is proposed to handle this issue. The method reads the input 3D Magnetic Resonance Imaging (MRI) images and applies Gabor filters at multiple levels. The noise-removed image is equalized for its quality by using histogram equalization. Further, features such as white mass, grey mass, texture, and shape are extracted from the images. Extracted features are trained with a deep learning Convolution Neural Network (CNN). The network has been designed with a single convolution layer towards dimensionality reduction. The texture features obtained from the brain image are transformed into a multi-dimensional feature matrix, which is transformed into a single-dimensional feature vector at the convolution layer. The neurons of the intermediate layer are designed to measure White Mass Texture Support (WMTS), Gray Mass Texture Support (GMTS), White Mass Covariance Support (WMCS), Gray Mass Covariance Support (GMCS), and Class Texture Adhesive Support (CTAS). In the test phase, the neurons at the intermediate layer compute the above-mentioned support values towards various classes of images. Based on that, the method adds a Multi-Variate Feature Similarity Measure (MVFSM). Based on the importance of MVFSM, the process finds the class of the given brain image and produces an efficient result.
Funding: Supported by the Guangzhou Government Project (Grant No. 62216235) and the National Natural Science Foundation of China (Grant Nos. 61573328, 622260-1).
Abstract: Spammer detection aims to identify and block users who perform malicious activities. Such users should be identified and removed from social media to keep the social media process organic and to maintain the integrity of online social spaces. Previous research aimed to find spammers based on hybrid approaches of graph mining, posted content, and metadata, using small and manually labeled datasets. However, such hybrid approaches are unscalable, not robust, dependent on a particular dataset, and require numerous parameters, complex graphs, and natural language processing (NLP) resources to make decisions, which makes them impractical for real-time detection. For example, graph mining requires neighbors’ information, and posted content-based approaches require multiple tweets from user profiles and then NLP resources to make decisions, which is not applicable in a real-time environment. To fill the gap, firstly, we propose a REal-time Metadata based Spammer detection (REMS) model based on only metadata features to identify spammers, which takes the least number of parameters and provides adequate results. REMS is a scalable and robust model that uses only 19 metadata features of Twitter users to achieve a 73.81% F1-score using a balanced training dataset (50% spam and 50% genuine users). The 19 features are 8 original features of Twitter users and 11 features derived from them, identified with extensive experiments and analysis. Secondly, we present the largest and most diverse dataset of published research, comprising 211 K spam users and 1 million genuine users. The diversity of the dataset is reflected in its users, who posted 2.1 million Tweets on seven topics (100 hashtags) from 6 different geographical locations. REMS’s superior classification performance with multiple machine and deep learning methods indicates that metadata features alone have the potential to identify spammers, rather than focusing on volatile posted content and complex graph structures. The dataset and REMS’s code are available on GitHub (www.github.com/mhadnanali/REMS).
Abstract: Stroke is a leading cause of disability and mortality worldwide, necessitating the development of advanced technologies to improve its diagnosis, treatment, and patient outcomes. In recent years, machine learning techniques have emerged as promising tools in stroke medicine, enabling efficient analysis of large-scale datasets and facilitating personalized and precision medicine approaches. This abstract provides a comprehensive overview of machine learning’s applications, challenges, and future directions in stroke medicine. Recently introduced machine learning algorithms have been extensively employed in all the fields of stroke medicine. Machine learning models have demonstrated remarkable accuracy in imaging analysis, diagnosing stroke subtypes, risk stratifications, guiding medical treatment, and predicting patient prognosis. Despite the tremendous potential of machine learning in stroke medicine, several challenges must be addressed. These include the need for standardized and interoperable data collection, robust model validation and generalization, and the ethical considerations surrounding privacy and bias. In addition, integrating machine learning models into clinical workflows and establishing regulatory frameworks are critical for ensuring their widespread adoption and impact in routine stroke care. Machine learning promises to revolutionize stroke medicine by enabling precise diagnosis, tailored treatment selection, and improved prognostication. Continued research and collaboration among clinicians, researchers, and technologists are essential for overcoming challenges and realizing the full potential of machine learning in stroke care, ultimately leading to enhanced patient outcomes and quality of life. This review aims to summarize all the current implications of machine learning in stroke diagnosis, treatment, and prognostic evaluation. At the same time, another purpose of this paper is to explore all the future perspectives these techniques can provide in combating this disabling disease.