Walking as a unique biometric tool conveys important information for emotion recognition. Individuals in different emotional states exhibit distinct walking patterns. To this end, this paper proposes a novel approach to recognizing emotion during walking using electroencephalogram (EEG) and inertial signals. Accurate recognition of emotion is achieved by training in an end-to-end deep learning fashion and taking multi-modal fusion into account. Subjects wear virtual reality head-mounted display (VR-HMD) equipment to immerse themselves in strong emotions during walking. The VR environment offers excellent imitation and experiential capability, which plays an important role in eliciting and changing emotions. In addition, the multi-modal signals acquired from EEG and inertial sensors are separately represented as virtual emotion images by the discrete wavelet transform (DWT). These serve as input to the attention-based convolutional neural network (CNN) fusion model. The designed network structure is simple and lightweight while integrating a channel attention mechanism to extract and enhance features. To effectively improve the performance of the recognition system, the proposed decision fusion algorithm combines the CRITIC method and a majority voting strategy to determine the weight values that affect the final decision results. The effect of diverse mother wavelet types and wavelet decomposition levels on model performance is investigated, indicating that the 2.2-order reverse biorthogonal (rbio2.2) wavelet with two-level decomposition yields the best recognition performance. Comparative experimental results show that the proposed method outperforms other existing state-of-the-art works with an accuracy of 98.73%.
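The DWT step described above can be sketched in miniature. The paper uses the rbio2.2 wavelet with two-level decomposition (e.g., via a library such as PyWavelets); the sketch below substitutes the much simpler Haar wavelet, an assumption made for brevity, to show how a one-dimensional signal is split into approximation (low-pass) and detail (high-pass) bands that can then be arranged into an image-like representation.

```python
import math

def haar_dwt_level1(signal):
    """One-level Haar DWT: split a signal into approximation (low-pass)
    and detail (high-pass) coefficients."""
    assert len(signal) % 2 == 0, "pad the signal to even length first"
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels=2):
    """Multi-level decomposition: reapply the transform to the
    approximation band, as in the paper's two-level setting."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_level1(approx)
        bands.append(detail)
    return approx, bands
```

The orthonormal scaling by sqrt(2) preserves signal energy across bands, which is why wavelet coefficients are a faithful time-frequency representation of the raw EEG/inertial samples.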
Online Social Networks (OSNs) are based on the sharing of different types of information and on various interactions (comments, reactions, and sharing). One of these important actions is the emotional reaction to the content. The diversity of reaction types available on Facebook (FB) enables users to express their feelings, and its traceability creates and enriches the users' emotional identity in the virtual world. This paper is based on the analysis of 119,875,012 FB reactions (Like, Love, Haha, Wow, Sad, Angry, Thankful, and Pride) made at multiple levels (publications, comments, and sub-comments) to study and classify the users' emotional behavior, visualize the distribution of different types of reactions, and analyze the impact of gender on emotion generation. All of these can be achieved by addressing the following research questions: Who reacts the most? Which emotion is the most expressed?
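The core distribution analysis can be sketched as a simple counting pass. The event schema below (user, gender, reaction type) is an assumption for illustration, not the paper's actual data layout:

```python
from collections import Counter

REACTIONS = ["Like", "Love", "Haha", "Wow", "Sad", "Angry", "Thankful", "Pride"]

def reaction_distribution(events):
    """events: iterable of (user_id, gender, reaction) tuples.
    Returns the overall relative frequency of each reaction type
    and raw per-gender counts."""
    overall = Counter(r for _, _, r in events)
    by_gender = {}
    for _, gender, reaction in events:
        by_gender.setdefault(gender, Counter())[reaction] += 1
    total = sum(overall.values())
    share = {r: overall[r] / total for r in REACTIONS if overall[r]}
    return share, by_gender

events = [("u1", "F", "Love"), ("u2", "M", "Like"),
          ("u3", "F", "Like"), ("u4", "F", "Haha")]
share, by_gender = reaction_distribution(events)
```

At the paper's scale (over one hundred million reactions) the same computation would run as a streaming or map-reduce aggregation rather than in memory, but the per-type and per-gender tallies are identical in form.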
Analysis of the molecular mechanisms that lead to the development of various types of tumors is essential for biology and medicine, because it may help to find new therapeutic opportunities for cancer treatment and cure, including personalized treatment approaches. One of the pathways known to be important for the development of neoplastic diseases and pathological processes is the Hedgehog signaling pathway, which normally controls human embryonic development. The systematic accumulation of various types of biological data, including interactions between proteins, regulation of gene transcription, and the results of proteomics and metabolomics experiments, allows computational analysis of these big data to identify key molecular mechanisms of certain diseases and pathologies as well as promising therapeutic targets. The aim of this study is to develop a computational approach for revealing associations between human proteins and genes interacting with the Hedgehog pathway components, as well as for identifying their roles in the development of various types of tumors. We automatically collect sets of abstract texts from the NCBI PubMed bibliographic database. For recognition of the Hedgehog pathway proteins and genes and of neoplastic diseases, we use a dictionary-based named entity recognition approach, while for all other proteins and genes a machine learning method is used. For association extraction, we develop a set of semantic rules. We complement the results of the text analysis with gene set enrichment analysis. The identified key pathways that may influence the Hedgehog pathway, and their roles in tumor development, are then verified using the information in the literature.
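The dictionary-based named entity recognition step can be sketched as a longest-match lookup over tokens. The toy vocabulary and the three-token lookahead below are illustrative assumptions; the actual dictionaries would come from curated gene and disease nomenclatures:

```python
def dict_ner(text, dictionary):
    """Longest-match dictionary lookup over a lowercased token stream.
    dictionary maps surface forms to entity types, e.g. {"gli1": "gene"}."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    hits = []
    i = 0
    while i < len(tokens):
        # try the longest multi-token entry first (here up to 3 tokens)
        for n in (3, 2, 1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in dictionary:
                hits.append((phrase, dictionary[phrase]))
                i += n
                break
        else:
            i += 1
    return hits

vocab = {"shh": "gene", "gli1": "gene", "basal cell carcinoma": "disease"}
found = dict_ner("SHH and GLI1 are linked to basal cell carcinoma.", vocab)
```

Matching the longest phrase first is what lets multi-word disease names like "basal cell carcinoma" win over any shorter overlapping dictionary entries.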
Traditional auto-scaling approaches are conceived as reactive automations, typically triggered when predefined thresholds are breached by resource consumption metrics. Managing such rules at scale is cumbersome, especially when resources require non-negligible time to be instantiated. This paper introduces an architecture for predictive cloud operations, which enables orchestrators to apply time-series forecasting techniques to estimate the evolution of relevant metrics and take decisions based on the predicted state of the system. In this way, they can anticipate load peaks and trigger appropriate scaling actions in advance, such that new resources are available when needed. The proposed architecture is implemented in OpenStack, extending the monitoring capabilities of Monasca by injecting short-term forecasts of standard metrics. We use our architecture to implement predictive scaling policies leveraging linear regression, autoregressive integrated moving average, feed-forward, and recurrent neural network (RNN) models. We then evaluate their performance on a synthetic workload, comparing them to that of a traditional policy. To assess the ability of the different models to generalize to unseen patterns, we also evaluate them on traces from a real content delivery network (CDN) workload. In particular, the RNN model exhibits the best overall performance in terms of prediction error, observed client-side response latency, and forecasting overhead. The implementation of our architecture is open source.
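The difference between reactive and predictive scaling can be shown with the simplest of the evaluated forecasters, linear regression. This is a minimal sketch, not the paper's Monasca integration: the threshold, horizon, and instance count are invented parameters.

```python
def linear_forecast(history, horizon):
    """Ordinary least squares fit of the metric against its time index,
    extrapolated `horizon` steps beyond the last observation."""
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + horizon)

def predictive_scale(history, threshold, horizon, instances):
    """Scale out when the *forecast* crosses the threshold, so the new
    resource is ready by the time the load actually arrives."""
    predicted = linear_forecast(history, horizon)
    return instances + 1 if predicted > threshold else instances

cpu = [40, 45, 50, 55, 60]  # rising CPU load, currently below an 80% threshold
decision = predictive_scale(cpu, threshold=80, horizon=5, instances=2)
```

A reactive policy would see 60% and do nothing; the predictive policy extrapolates the trend to 85% five steps ahead and adds an instance now, absorbing the instantiation delay.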
The Open Air Interface (OAI) alliance recently introduced a new disaggregated Open Radio Access Network (O-RAN) framework for next-generation telecommunications and networks. This disaggregated architecture is open, automated, software-defined, and virtual, and supports the latest advanced technologies such as Artificial Intelligence/Machine Learning (AI/ML). This novel intelligent architecture enables programmers to design and customize automated applications according to business needs and to improve quality of service in fifth generation (5G) and Beyond 5G (B5G) networks. Its disaggregated and multivendor nature gives new startups and small vendors the opportunity to participate and to provide inexpensive hardware and software solutions, keeping the market competitive. This paper presents the disaggregated and programmable O-RAN architecture with a focus on automation, AI/ML services, and applications with the Flexible RAN Intelligent Controller (FRIC). We schematically demonstrate the reinforcement learning, external applications (xApps), and automation steps required to implement this disaggregated O-RAN architecture. The goal of this research is to implement an AI/ML-enabled automation system for software-defined disaggregated O-RAN, which monitors, manages, and performs AI/ML-related services, including model deployment, optimization, inference, and training.
Robo or unsolicited calls have become a persistent issue in telecommunication networks, posing significant challenges to individuals, businesses, and regulatory authorities. These calls not only trick users into disclosing their private and financial information, but also affect their productivity through unwanted phone ringing. A proactive approach to identify and block such unsolicited calls is essential to protect users and service providers from potential harm. To this end, this paper proposes a solution to identify robo-callers in the telephony network, utilising a set of novel features to evaluate the trustworthiness of callers in a network. The trust score of the callers is then used along with machine learning models to classify them as legitimate or robo-caller. We use a large anonymized dataset (call detail records) from a large telecommunication provider containing more than 1 billion records collected over 10 days. We have conducted an extensive evaluation demonstrating that the proposed approach achieves a high accuracy and detection rate whilst minimizing the error rate. Specifically, the proposed features, when used collectively, achieve a true-positive rate of around 97% with a false-positive rate of less than 0.01%.
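The trust-score idea can be sketched from call-detail-record aggregates. The features and weights below are hypothetical stand-ins, not the paper's actual feature set; they only illustrate how per-caller behaviour maps to a score that a downstream classifier (or a plain threshold, as here) can consume:

```python
def trust_score(stats):
    """stats: per-caller aggregates from call detail records.
    Each feature is normalized to [0, 1]; the weights are illustrative."""
    answered = stats["answered"] / max(stats["outgoing"], 1)
    callbacks = stats["callbacks"] / max(stats["outgoing"], 1)
    spread = min(stats["distinct_callees"] / max(stats["outgoing"], 1), 1.0)
    # legitimate callers tend to get answered, get called back, and do not
    # sweep through huge numbers of distinct numbers
    return 0.4 * answered + 0.4 * callbacks + 0.2 * (1.0 - spread)

def classify(stats, threshold=0.35):
    return "legitimate" if trust_score(stats) >= threshold else "robo-caller"

robo = {"outgoing": 1000, "answered": 50, "callbacks": 2, "distinct_callees": 990}
human = {"outgoing": 30, "answered": 24, "callbacks": 12, "distinct_callees": 15}
```

In the paper the score feeds machine learning models rather than a fixed threshold, but the intuition, asymmetry between dialing behaviour and reciprocated behaviour, is the same.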
Current large-scale Internet of Things (IoT) networks typically generate high-velocity network traffic streams. Attackers use IoT devices to create botnets and launch attacks such as DDoS, spamming, cryptocurrency mining, and phishing. The service providers of large-scale IoT networks need to set up a data pipeline to collect the vast network traffic data from the IoT devices, store it, analyze it, and report the malicious IoT devices and types of attacks. Further, the attacks originating from IoT devices are dynamic, as attackers launch one kind of attack at one time and another kind at another time. The numbers of attack and benign instances also vary over time. This phenomenon of change in attack patterns is called concept drift. Hence, the attack detection system must learn continuously from the ever-changing real-time attack patterns in large-scale IoT network traffic. To meet this requirement, in this work we propose a data pipeline with Apache Kafka, Apache Spark structured streaming, and MongoDB that can adapt to the ever-changing attack patterns in real time and classify attacks in large-scale IoT networks. When concept drift is detected, the proposed system retrains the classifier with the instances that caused the drift and a representative subsample of instances from the previous training of the model. The proposed approach is evaluated with the latest dataset, IoT-23, which consists of benign instances and several attack instances from various IoT devices. Attack classification accuracy is improved from 97.8% to 99.46% by the proposed system. The training time of the distributed random forest algorithm is also studied by varying the number of cores in the Apache Spark environment.
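The retraining policy on drift detection, drift-causing instances plus a representative subsample of the old training set, can be sketched as follows. The subsample ratio is an assumed parameter; the paper's exact sampling strategy may differ:

```python
import random

def build_retrain_set(drift_instances, previous_training,
                      subsample_ratio=0.2, seed=42):
    """Combine the instances that triggered the drift with a random
    representative subsample of the previous training data, so the
    classifier adapts to the new pattern without forgetting the old ones."""
    rng = random.Random(seed)
    k = int(len(previous_training) * subsample_ratio)
    retained = rng.sample(previous_training, k)
    return drift_instances + retained

old = [("flow", i, "benign") for i in range(100)]
new = [("flow", i, "ddos") for i in range(10)]
retrain = build_retrain_set(new, old)
```

In the streaming deployment this set would be materialized from MongoDB and handed to Spark's distributed random forest trainer; the balancing logic itself is this small.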
Telemarketing is a well-established marketing approach for offering products and services to prospective customers. The effectiveness of such an approach, however, is highly dependent on the selection of the appropriate consumer base, as reaching uninterested customers will induce annoyance and consume costly enterprise resources in vain while missing interested ones. The introduction of business intelligence and machine learning models can positively influence the decision-making process by predicting the potential customer base, and the existing literature in this direction shows promising results. However, the selection of influential features and the construction of effective learning models for improved performance remain a challenge. Furthermore, from the modelling perspective, the class-imbalanced nature of the training data, where samples with unsuccessful outcomes highly outnumber successful ones, further compounds the problem by creating biased and inaccurate models. Additionally, customer preferences are likely to change over time for various reasons, and/or a fresh group of customers may be targeted for a new product or service, necessitating model retraining, which is not addressed at all in existing works. A major challenge in model retraining is maintaining a balance between stability (retaining older knowledge) and plasticity (being receptive to new information). To address the above issues, this paper proposes an ensemble machine learning model with feature selection and oversampling techniques to identify potential customers more accurately. A novel online learning method is proposed for model retraining when new samples become available over time. This newly introduced method equips the proposed approach to deal with dynamic data, improving the readiness of the proposed model for practical adoption, and is a highly useful addition to the literature. Extensive experiments with real-world data show that the proposed approach achieves excellent results in all cases (e.g., 98.6% accuracy in classifying customers) and outperforms recent competing models in the literature by a considerable margin of 3% on a widely used dataset.
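The oversampling step that counters class imbalance can be sketched with the simplest variant, random duplication of minority samples; the paper may well use a more elaborate technique such as SMOTE, so treat this as an illustrative stand-in:

```python
import random

def oversample_minority(samples, label_of, seed=7):
    """Duplicate minority-class samples at random until every class has
    as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for s in samples:
        by_class.setdefault(label_of(s), []).append(s)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        balanced.extend(members + extra)
    return balanced

# 90 unsuccessful vs. 10 successful outcomes, mirroring telemarketing data
data = ([("c%d" % i, "no") for i in range(90)] +
        [("c%d" % i, "yes") for i in range(10)])
balanced = oversample_minority(data, label_of=lambda s: s[1])
```

Training on the balanced set prevents the classifier from trivially predicting "no" for everyone, which is exactly the bias the abstract warns about.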
Graph Neural Networks (GNNs) have become a widely used tool for learning from and analyzing data on graph structures, largely due to their ability to preserve graph structure and properties via graph representation learning. However, the effect of depth on the performance of GNNs, particularly isotropic and anisotropic models, remains an active area of research. This study presents a comprehensive exploration of the impact of depth on GNNs, with a focus on the phenomena of over-smoothing and the bottleneck effect in deep graph neural networks. Our research investigates the tradeoff between depth and performance, revealing that increasing depth can lead to over-smoothing and a decrease in performance due to the bottleneck effect. We also examine the impact of node degrees on classification accuracy, finding that nodes with low degrees can pose challenges for accurate classification. Our experiments use several benchmark datasets and a range of evaluation metrics to compare isotropic and anisotropic GNNs of varying depths, and also explore the scalability of these models. Our findings provide valuable insights into the design of deep GNNs and offer potential avenues for future research to improve their performance.
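Over-smoothing can be demonstrated with the propagation step alone, stripped of any learned weights: repeated mean aggregation over neighbours drives all node features toward the same value on a connected graph. The tiny path graph below is an illustrative assumption, not one of the paper's benchmarks:

```python
def propagate(features, adj, layers):
    """Mean aggregation over each node's neighbours plus itself, repeated
    `layers` times -- the isotropic GNN propagation step without weights."""
    for _ in range(layers):
        features = [
            sum(features[j] for j in ([i] + adj[i])) / (1 + len(adj[i]))
            for i in range(len(features))
        ]
    return features

# path graph 0-1-2-3 with very different initial node features
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = [0.0, 0.0, 10.0, 10.0]
shallow = propagate(x, adj, layers=2)
deep = propagate(x, adj, layers=50)
spread = lambda v: max(v) - min(v)
```

After two layers the nodes are still clearly distinguishable; after fifty layers their features have nearly collapsed to a single value, which is why classification accuracy degrades in very deep GNNs.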
Big data has the potential to open up innovative and ground-breaking prospects for the electrical grid, and also helps to obtain a variety of technological, social, and financial benefits. There is an unprecedented amount of heterogeneous big data as a consequence of the growth of power grid technologies, along with data processing and advanced tools. The main obstacles in turning the heterogeneous large dataset into useful results are the computational burden and information security. The original contribution of this paper is to develop a new big data framework for detecting various intrusions in smart grid systems with the use of AI mechanisms. Here, an AdaBelief Exponential Feature Selection (AEFS) technique is used to efficiently handle the huge input datasets from the smart grid for boosting security. Then, a Kernel-based Extreme Neural Network (KENN) technique is used to anticipate security vulnerabilities more effectively. The Polar Bear Optimization (PBO) algorithm is used to efficiently determine the parameters for the estimation of the radial basis function. Moreover, several types of smart grid network datasets are employed during the analysis in order to examine the outcomes and efficiency of the proposed AdaBelief Exponential Feature Selection-Kernel based Extreme Neural Network (AEFS-KENN) big data security framework. The results reveal that the accuracy of the proposed AEFS-KENN is increased up to 99.5%, with a precision and AUC of 99% for all smart grid big datasets used in this study.
Generating novel molecules that satisfy specific properties is a challenging task in modern drug discovery, which requires the optimization of a specific objective while satisfying chemical rules. Herein, we aim to optimize a source molecule so that the generated molecule satisfies specific target properties. The Matched Molecular Pairs (MMPs), which contain the source and target molecules, are used herein, and logD and solubility are selected as the optimization properties. The main innovative work lies in the calculation related to a specific transformer from the perspective of matrix dimensions. Threshold intervals and state changes are then used to encode logD and solubility for subsequent tests. During the experiments, we screen the data based on the proportion of heavy atoms to all atoms in the groups and select 12,365, 1,503, and 1,570 MMPs as the training, validation, and test sets, respectively. Transformer models are compared with the baseline models with respect to their ability to generate molecules with specific properties. Results show that the transformer model can accurately optimize the source molecules to satisfy specific properties.
Data temperature is a response to the ever-growing amount of data. These data have to be stored, but it has been observed that only a small portion of the data is accessed frequently at any one time. This leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation in the current implementation of data temperature, because age automatically assumes that all new data have priority and usage is purely reactive. We propose new variables and conditions that enable smarter decision-making on what constitutes hot or cold data and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables to extend the current data temperature value. We further establish rules and conditions for limiting unnecessary movement of the data, which helps to prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines existing variables with the new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
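A hybrid temperature of this kind can be sketched as a weighted score over usage frequency, recency, and a user-defined override, plus a guard rule against tier thrashing. All variable names, weights, and thresholds below are illustrative assumptions, not the paper's exact formulation:

```python
import time

def temperature(meta, now=None, weights=(0.5, 0.3, 0.2)):
    """Hybrid data-temperature score in [0, 1]: usage frequency, recency,
    and a user-defined pin that can keep data hot regardless of age."""
    now = now or time.time()
    w_freq, w_recency, w_user = weights
    freq = min(meta["accesses_last_30d"] / 100.0, 1.0)
    age_days = (now - meta["last_access"]) / 86400.0
    recency = 1.0 / (1.0 + age_days)
    user = meta.get("user_pin", 0.0)  # 1.0 pins data hot, 0.0 is neutral
    return w_freq * freq + w_recency * recency + w_user * user

def should_migrate(meta, cold_threshold=0.2, min_move_interval=7 * 86400,
                   now=None):
    """Only demote data that is cold AND has not moved recently, limiting
    unnecessary movement and the wasted I/O it causes."""
    now = now or time.time()
    cold = temperature(meta, now) < cold_threshold
    settled = (now - meta["last_moved"]) > min_move_interval
    return cold and settled

NOW = 1_700_000_000
hot = {"accesses_last_30d": 80, "last_access": NOW - 3600,
       "last_moved": NOW - 30 * 86400}
cold = {"accesses_last_30d": 0, "last_access": NOW - 90 * 86400,
        "last_moved": NOW - 30 * 86400}
```

Note how the `user_pin` term lets a user keep rarely accessed data on a fast tier, exactly the kind of control a pure age/frequency model cannot express.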
Nonnegative Matrix Factorization (NMF) is one of the most popular feature learning technologies in the field of machine learning and pattern recognition. It has been widely used and studied in multi-view clustering tasks because of its effectiveness. This study proposes a general semi-supervised multi-view nonnegative matrix factorization algorithm. This algorithm incorporates discriminative and geometric information on the data to learn a better-fused representation, and adopts a feature-normalizing strategy to align the different views. Two specific implementations of this algorithm are developed to validate the effectiveness of the proposed framework: Graph regularization based Discriminatively Constrained Multi-View Nonnegative Matrix Factorization (GDCMVNMF) and Extended Multi-View Constrained Nonnegative Matrix Factorization (ExMVCNMF). The intrinsic connection between these two implementations is discussed, and optimization based on multiplicative update rules is presented. Experiments on six datasets show that GDCMVNMF and ExMVCNMF outperform several representative unsupervised and semi-supervised multi-view NMF approaches.
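The multiplicative update rules referred to above are, in their basic unsupervised single-view form, the classic Lee-Seung updates for Frobenius-norm NMF; the paper's semi-supervised multi-view variants add discriminative and graph terms on top of updates of this shape. A minimal sketch:

```python
import numpy as np

def nmf_multiplicative(V, k, iters=500, eps=1e-9, seed=0):
    """Factorize nonnegative V (n x m) as W (n x k) @ H (k x m) using the
    standard multiplicative updates, which keep W and H nonnegative by
    construction because every update is an elementwise product of
    nonnegative factors."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# exactly rank-2 nonnegative data, so a good rank-2 factorization exists
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
V = A @ B
W, H = nmf_multiplicative(V, k=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Semi-supervised extensions typically append penalty terms to the objective, which surface as extra additive terms in the numerators and denominators of these same update ratios.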
Under the general trend of the rapid development of smart grids, data security and privacy are facing serious challenges; protecting the privacy of individual users' data while obtaining user-aggregated data has attracted widespread attention. In this study, we propose an encryption scheme based on differential privacy for the problem of user privacy leakage when aggregating data from multiple smart meters. First, we use an improved homomorphic encryption method to realize the encrypted aggregation of users' data. Second, we propose a double-blind noise addition protocol to generate distributed noise through interaction between users and a cloud platform, preventing semi-honest participants from stealing data by colluding with one another. Finally, the simulation results show that the proposed scheme can encrypt the transmission of data from multiple smart meters while satisfying the differential privacy mechanism. Even if an attacker has enough background knowledge, the security of each user's electricity information can be ensured.
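The differential-privacy half of such a scheme can be sketched with distributed Laplace noise: each meter perturbs its own reading before reporting, so the aggregator only ever sees noisy shares. This sketch omits the homomorphic-encryption layer entirely and uses a deliberately simple noise budget (scale = sensitivity/epsilon per meter), which is an assumption for illustration rather than the paper's protocol:

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_aggregate(readings, epsilon, sensitivity, seed=1):
    """Each meter adds its own Laplace noise before reporting; the cloud
    platform sums the noisy shares to get a differentially private total."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    noisy = [r + laplace_noise(scale, rng) for r in readings]
    return sum(noisy)

readings = [3.2, 1.7, 4.1, 2.9, 3.3]  # kWh over one metering interval
noisy_total = private_aggregate(readings, epsilon=1.0, sensitivity=1.0)
```

The aggregate stays useful because the independent noise terms partially cancel in the sum, while no single honest-but-curious party ever observes an exact individual reading.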
The e-commerce industry's rapid growth, accelerated by the COVID-19 pandemic, has led to an alarming increase in digital fraud and associated losses. To establish a healthy e-commerce ecosystem, robust cyber security and anti-fraud measures are crucial. However, research on fraud detection systems has struggled to keep pace due to limited real-world datasets. Advances in artificial intelligence, Machine Learning (ML), and cloud computing have revitalized research and applications in this domain. While ML and data mining techniques are popular in fraud detection, specific reviews focusing on their application in e-commerce platforms like eBay and Facebook lack depth. Existing reviews provide broad overviews but fail to grasp the intricacies of ML algorithms in the e-commerce context. To bridge this gap, our study conducts a systematic literature review using the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) methodology. We aim to explore the effectiveness of these techniques in fraud detection within digital marketplaces and the broader e-commerce landscape. Understanding the current state of the literature and emerging trends is crucial given the rising fraud incidents and associated costs. Through our investigation, we identify research opportunities and provide insights to industry stakeholders on key ML and data mining techniques for combating e-commerce fraud. Our paper examines the research on these techniques as published in the past decade. Employing the PRISMA approach, we conducted a content analysis of 101 publications, identifying research gaps and recent techniques, and highlighting the increasing utilization of artificial neural networks in fraud detection within the industry.
The ability to make accurate energy predictions while considering all related energy factors allows production plants, regulatory bodies, and governments to meet energy demand and assess the effects of energy-saving initiatives. When energy consumption falls within normal parameters, the developed model can be used to predict energy consumption and to develop improvements and mitigating measures. The objective of this model is to accurately predict energy consumption without data limitations and to provide results that are easily interpretable. The proposed model is an implementation of a stacked Long Short-Term Memory (LSTM) snapshot ensemble combined with the Fast Fourier Transform (FFT) and a meta-learner. Hebrail and Berard's Individual Household Electric Power Consumption (IHEPC) dataset, combined with weather data, is used to analyse the model's accuracy in predicting energy consumption. The model is trained, and the results measured using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and coefficient of determination (R²) metrics are 0.020, 0.013, 0.017, and 0.999, respectively. The stacked LSTM snapshot ensemble performs better than the compared models in terms of prediction accuracy and minimized errors. The results of this study show that both the prediction accuracy and the stability of the model are high. High accuracy together with high stability suggests good interpretability, which is not typically accounted for in such models but, as this study shows, can be inferred.
Grid-based recommendation algorithms view users and items as abstract nodes, and the information utilised by the algorithm is hidden in the selection relationships between users and items. Although these relationships can be easily handled, much useful information is overlooked, resulting in a less accurate recommendation algorithm. The aim of this paper is to propose improvements to the standard substance diffusion algorithm: taking into account the influence of the user's rating on the recommended item, adding a moderating factor, and optimising the initial resource allocation vector and the resource transfer matrix of the recommendation algorithm. An average ranking score evaluation index is introduced to quantify user satisfaction with the recommendation results. Experiments are conducted on the MovieLens training dataset, and the experimental results show that the proposed algorithm outperforms classical collaborative filtering systems and network-structure-based recommendation systems in terms of recommendation accuracy and hit rate.
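The standard substance (mass) diffusion algorithm that the paper improves upon can be sketched directly on a user-item bipartite graph: each item the target user selected spreads one unit of resource equally to the users who selected it, and each of those users spreads their accumulated resource equally back over their items. The toy graph below is an illustrative assumption:

```python
def mass_diffusion(adj, target_user):
    """Two-step resource diffusion on the user-item bipartite graph.
    adj maps each user to the set of items they selected. Returns scores
    for candidate items; items the target user already selected are
    excluded from the ranking."""
    users = list(adj)
    selected = adj[target_user]
    # item degrees (how many users selected each item)
    item_deg = {}
    for items in adj.values():
        for it in items:
            item_deg[it] = item_deg.get(it, 0) + 1
    # step 1: each selected item spreads 1 unit equally to its users
    user_res = {u: 0.0 for u in users}
    for it in selected:
        for u in users:
            if it in adj[u]:
                user_res[u] += 1.0 / item_deg[it]
    # step 2: each user spreads its resource equally over its items
    item_res = {}
    for u in users:
        if adj[u]:
            share = user_res[u] / len(adj[u])
            for it in adj[u]:
                item_res[it] = item_res.get(it, 0.0) + share
    return {it: s for it, s in item_res.items() if it not in selected}

adj = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"b", "c", "d"}}
scores = mass_diffusion(adj, "u1")
```

The paper's improvements modulate exactly these two spreading steps: ratings weight the initial resource allocation, and a moderating factor reshapes the resource transfer matrix instead of the uniform equal split used here.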
The Quick Access Recorder (QAR), an important device for storing data from various flight parameters, contains a large amount of valuable data and comprehensively records the real state of an airline flight. However, the recorded data have certain missing values due to factors such as weather and equipment anomalies. These missing values seriously affect the analysis of QAR data by aeronautical engineers, for tasks such as flight scenario reproduction and flight safety status assessment. Therefore, imputing missing values in the QAR data, which can further guarantee the flight safety of airlines, is crucial. QAR data also have multivariate, multiprocess, and temporal features. Therefore, we propose the novel imputation models A-AEGAN ("A" denotes attention mechanism, "AE" denotes autoencoder, and "GAN" denotes generative adversarial network) and SA-AEGAN ("SA" denotes self-attentive mechanism) for missing values in QAR data. Specifically, we apply an innovative generative adversarial network to impute missing values from QAR data. An improved gated recurrent unit is then introduced as the neural unit of the GAN, which can successfully capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator. The missing values in the QAR data are imputed by using the adversarial relationship between the generator and the discriminator. We introduce an attention mechanism in the autoencoder to further improve the capability of the proposed model to capture the features of QAR data; attention mechanisms can maintain the correlation among QAR data and improve the model's capability to impute missing data. Furthermore, we improve the proposed model by integrating a self-attention mechanism to further capture the relationships between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can reasonably impute the missing values in QAR data with excellent results.
With the enhancement of data collection capabilities, massive streaming data have been accumulated in numerous application scenarios. Specifically, the issue of classifying data streams based on mobile sensors can be formalized as a multi-task multi-view learning problem, with a specific task comprising multiple views with shared features collected from multiple sensors. Existing incremental learning methods are often single-task single-view, and thus cannot learn shared representations between related tasks and views. An adaptive multi-task multi-view incremental learning framework for data stream classification called MTMVIS is proposed to address the above challenges, utilizing the idea of multi-task multi-view learning. Specifically, an attention mechanism is first used to align sensor data across the different views. In addition, MTMVIS uses adaptive Fisher regularization from the perspective of multi-task multi-view learning to overcome catastrophic forgetting in incremental learning. Experiments on two different datasets reveal that the proposed framework outperforms state-of-the-art baselines.
This study explores the potential of Artificial Intelligence (AI) in the early screening and prognosis of Dry Eye Disease (DED), aiming to enhance the accuracy of therapeutic approaches for eye-care practitioners. Despite the promising opportunities, challenges such as diverse diagnostic evidence, complex etiology, and interdisciplinary knowledge integration impede the interpretability, reliability, and applicability of AI-based DED detection methods. The research conducts a comprehensive review of datasets, diagnostic evidence and standards, and advanced algorithms in AI-based DED detection over the past five years. The DED diagnostic methods are categorized into three groups based on their relationship with AI techniques: (1) those with ground truth and/or comparable standards, (2) potential AI-based methods with significant advantages, and (3) supplementary methods for AI-based DED detection. The study proposes suggested DED detection standards, the combination of multiple sources of diagnostic evidence, and future research directions to guide further investigations. Ultimately, the research contributes to the advancement of ophthalmic disease detection by providing insights into knowledge foundations, advanced methods, challenges, and potential future perspectives, emphasizing the significant role of AI in both the academic and practical aspects of ophthalmology.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61903170, 62173175, and 61877033), the Natural Science Foundation of Shandong Province (Nos. ZR2019BF045 and ZR2019MF021), and the Key Research and Development Project of Shandong Province of China (No. 2019GGX101003).
Abstract: Walking, as a unique biometric tool, conveys important information for emotion recognition: individuals in different emotional states exhibit distinct walking patterns. This paper therefore proposes a novel approach to recognizing emotion during walking using electroencephalogram (EEG) and inertial signals. Accurate recognition is achieved by training in an end-to-end deep learning fashion and taking multi-modal fusion into account. Subjects wear virtual reality head-mounted display (VR-HMD) equipment to immerse themselves in strong emotions while walking; the VR environment provides a highly realistic, immersive experience, which plays an important role in arousing and changing emotions. The multi-modal signals acquired from the EEG and inertial sensors are separately represented as virtual emotion images by the discrete wavelet transform (DWT), and these images serve as input to an attention-based convolutional neural network (CNN) fusion model. The designed network structure is simple and lightweight while integrating a channel attention mechanism to extract and enhance features. To further improve the performance of the recognition system, the proposed decision fusion algorithm combines the CRITIC method and a majority voting strategy to determine the weights that affect the final decision. An investigation of the effect of different mother wavelet types and wavelet decomposition levels on model performance indicates that the 2.2-order reverse biorthogonal (rbio2.2) wavelet with two-level decomposition gives the best recognition performance. Comparative experiments show that the proposed method outperforms existing state-of-the-art works with an accuracy of 98.73%.
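The CRITIC weighting used in the decision fusion can be sketched in a few lines: criteria with high contrast (standard deviation) and low correlation to the other criteria receive larger weights. This is a minimal sketch of the general CRITIC method, not the paper's exact fusion pipeline; the per-modality score matrix below is entirely illustrative.

```python
import numpy as np

def critic_weights(scores: np.ndarray) -> np.ndarray:
    """CRITIC objective weighting: criteria (columns) with high contrast
    and low correlation to the others receive larger weights."""
    # min-max normalize each criterion (column) to [0, 1]
    x = (scores - scores.min(axis=0)) / (scores.max(axis=0) - scores.min(axis=0))
    std = x.std(axis=0, ddof=1)              # contrast intensity
    corr = np.corrcoef(x, rowvar=False)      # conflict between criteria
    info = std * (1.0 - corr).sum(axis=0)    # information content C_j
    return info / info.sum()                 # normalized weights

# hypothetical per-run accuracy of three modality-specific classifiers
scores = np.array([[0.95, 0.80, 0.90],
                   [0.96, 0.70, 0.91],
                   [0.94, 0.85, 0.89],
                   [0.97, 0.75, 0.92]])
w = critic_weights(scores)
print(w)
```

The middle modality varies the most and disagrees with the other two, so it carries the largest weight in the final decision.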
Abstract: Online Social Networks (OSNs) are built on the sharing of different types of information and on various interactions (comments, reactions, and sharing). One important action is the emotional reaction to content. The diversity of reaction types available on Facebook (FB) enables users to express their feelings, and its traceability creates and enriches users' emotional identity in the virtual world. This paper analyzes 119,875,012 FB reactions (Like, Love, Haha, Wow, Sad, Angry, Thankful, and Pride) made at multiple levels (publications, comments, and sub-comments) to study and classify users' emotional behavior, visualize the distribution of the different reaction types, and analyze the impact of gender on emotion generation. These goals are pursued by addressing two research questions: Who reacts the most? Which emotion is the most expressed?
Funding: This work was supported by the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of World-Class Research Centers "Digital Biodesign and Personalized Healthcare" (No. 75-15-2022-305).
Abstract: Analysis of the molecular mechanisms that lead to the development of various types of tumors is essential for biology and medicine, because it may reveal new therapeutic opportunities for cancer treatment and cure, including personalized treatment approaches. One pathway known to be important for the development of neoplastic diseases and pathological processes is the Hedgehog signaling pathway, which normally controls human embryonic development. The systematic accumulation of various types of biological data, including protein-protein interactions, gene transcription regulation, and the results of proteomics and metabolomics experiments, allows computational analysis of these big data to identify key molecular mechanisms of certain diseases and pathologies as well as promising therapeutic targets. The aim of this study is to develop a computational approach for revealing associations between human proteins and genes interacting with the Hedgehog pathway components, and for identifying their roles in the development of various types of tumors. We automatically collect sets of abstracts from the NCBI PubMed bibliographic database. For recognition of Hedgehog pathway proteins and genes and of neoplastic diseases, we use a dictionary-based named entity recognition approach, while a machine learning method is used for all other proteins and genes. For association extraction, we develop a set of semantic rules. We complement the text analysis results with gene set enrichment analysis. The identified key pathways that may influence the Hedgehog pathway, and their roles in tumor development, are then verified against information in the literature.
Funding: This work was supported by the PNRR-M4C2-Investimento 1.3, Partenariato Esteso (No. PE00000013-FAIR).
Abstract: Traditional auto-scaling approaches are conceived as reactive automations, typically triggered when predefined thresholds are breached by resource consumption metrics. Managing such rules at scale is cumbersome, especially when resources require non-negligible time to be instantiated. This paper introduces an architecture for predictive cloud operations, which enables orchestrators to apply time-series forecasting techniques to estimate the evolution of relevant metrics and take decisions based on the predicted state of the system. In this way, they can anticipate load peaks and trigger appropriate scaling actions in advance, such that new resources are available when needed. The proposed architecture is implemented in OpenStack, extending the monitoring capabilities of Monasca by injecting short-term forecasts of standard metrics. We use our architecture to implement predictive scaling policies based on linear regression, autoregressive integrated moving average, feed-forward, and recurrent neural networks (RNNs). We then evaluate their performance on a synthetic workload, comparing them to a traditional reactive policy. To assess the ability of the different models to generalize to unseen patterns, we also evaluate them on traces from a real content delivery network (CDN) workload. The RNN model exhibits the best overall performance in terms of prediction error, observed client-side response latency, and forecasting overhead. The implementation of our architecture is open-source.
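The advantage of predictive over reactive scaling can be illustrated with a minimal linear-trend forecast, a deliberately simple stand-in for the paper's ARIMA and RNN predictors. The metric values, threshold, and look-ahead horizon below are made up for illustration.

```python
import numpy as np

def predictive_scale_decision(history, threshold, horizon):
    """Forecast the metric `horizon` steps ahead with a least-squares
    linear trend and decide to scale *before* the threshold is breached,
    so new instances are ready by the time the peak arrives."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, 1)   # linear trend fit
    forecast = slope * (len(history) - 1 + horizon) + intercept
    return forecast > threshold, forecast

# CPU load ramping up; a reactive policy (load > 80) would not fire yet
load = [40, 46, 53, 59, 66, 71]
scale_now, predicted = predictive_scale_decision(load, threshold=80, horizon=3)
print(scale_now, round(predicted, 1))
```

The current reading (71) is still below the threshold, but the three-step-ahead forecast already exceeds it, so scaling is triggered in advance.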
Abstract: The Open Air Interface (OAI) alliance recently introduced a disaggregated Open Radio Access Network (O-RAN) framework for next-generation telecommunications and networks. This disaggregated architecture is open, automated, software-defined, and virtual, and it supports the latest advanced technologies such as Artificial Intelligence/Machine Learning (AI/ML). This intelligent architecture enables programmers to design and customize automated applications according to business needs and to improve quality of service in fifth generation (5G) and Beyond 5G (B5G) networks. Its disaggregated, multi-vendor nature gives new startups and small vendors the opportunity to participate and to provide inexpensive hardware and software solutions, keeping the market competitive. This paper presents the disaggregated and programmable O-RAN architecture with a focus on automation, AI/ML services, and applications with the Flexible RAN Intelligent Controller (FRIC). We schematically demonstrate reinforcement learning, external applications (xApps), and the automation steps required to implement this disaggregated O-RAN architecture. The goal of this research is to implement an AI/ML-enabled automation system for software-defined disaggregated O-RAN that monitors, manages, and performs AI/ML-related services, including model deployment, optimization, inference, and training.
Abstract: Robocalls, or unsolicited calls, have become a persistent issue in telecommunication networks, posing significant challenges to individuals, businesses, and regulatory authorities. These calls not only trick users into disclosing private and financial information, but also hurt their productivity through unwanted phone ringing. A proactive approach to identifying and blocking such calls is essential to protect users and service providers from potential harm. To this end, this paper proposes a solution for identifying robo-callers in the telephony network that utilises a set of novel features to evaluate the trustworthiness of callers. The trust score of each caller is then used along with machine learning models to classify callers as legitimate or robo-caller. We use a large anonymised dataset (call detail records) from a large telecommunication provider, containing more than 1 billion records collected over 10 days. Our extensive evaluation demonstrates that the proposed approach achieves a high accuracy and detection rate whilst minimising the error rate. Specifically, the proposed features, when used collectively, achieve a true-positive rate of around 97% with a false-positive rate of less than 0.01%.
Abstract: Current large-scale Internet of Things (IoT) networks typically generate high-velocity network traffic streams. Attackers use IoT devices to create botnets and launch attacks such as DDoS, spamming, cryptocurrency mining, and phishing. Service providers of large-scale IoT networks need a data pipeline to collect the vast network traffic data from the IoT devices, store it, analyze it, and report the malicious IoT devices and attack types. Furthermore, the attacks originating from IoT devices are dynamic: attackers launch one kind of attack at one time and another kind at another time, and the numbers of attack and benign instances also vary over time. This change in attack patterns is called concept drift. Hence, the attack detection system must learn continuously from the ever-changing real-time attack patterns in large-scale IoT network traffic. To meet this requirement, we propose a data pipeline with Apache Kafka, Apache Spark Structured Streaming, and MongoDB that can adapt to ever-changing attack patterns in real time and classify attacks in large-scale IoT networks. When concept drift is detected, the proposed system retrains the classifier with the instances that caused the drift together with a representative subsample of instances from the model's previous training. The proposed approach is evaluated on the recent IoT-23 dataset, which consists of benign instances and several attack types from various IoT devices. Attack classification accuracy is improved from 97.8% to 99.46% by the proposed system. The training time of the distributed random forest algorithm is also studied by varying the number of cores in the Apache Spark environment.
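The retrain-on-drift loop can be sketched with a deliberately tiny stand-in classifier: monitor the error rate over a sliding window, and on drift retrain on the drifting window plus a random subsample of the old training data. The window size, tolerance, traffic values, and the one-feature threshold model are all illustrative assumptions; the paper's system uses a distributed random forest on Spark, but the drift-handling mechanism is the same shape.

```python
import random

class DriftAdaptiveThresholdClassifier:
    """Toy one-feature threshold classifier with an error-rate drift test.
    On drift, it retrains on the drifting window plus a representative
    subsample of its previous training data."""

    def __init__(self, window=50, tolerance=0.15, keep_old=0.3, seed=0):
        self.window, self.tolerance, self.keep_old = window, tolerance, keep_old
        self.rng = random.Random(seed)
        self.recent = []

    def fit(self, xs, ys):
        self.train = list(zip(xs, ys))
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        # decision threshold halfway between the class means
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def predict(self, x):
        return 1 if x > self.threshold else 0

    def observe(self, x, y):
        """Consume one labelled stream instance; return True when drift
        was detected (and the model retrained)."""
        self.recent.append((x, y))
        if len(self.recent) < self.window:
            return False
        errors = sum(self.predict(xi) != yi for xi, yi in self.recent)
        drift = errors / self.window > self.tolerance
        if drift:  # retrain: drift window + representative old subsample
            k = int(self.keep_old * len(self.train))
            old = self.rng.sample(self.train, k)
            xs, ys = zip(*(old + self.recent))
            self.fit(xs, ys)
        self.recent = []
        return drift

# train: benign traffic ~2 pkt/s (label 0), old attack pattern ~10 (label 1)
clf = DriftAdaptiveThresholdClassifier()
clf.fit([2.0] * 50 + [10.0] * 50, [0] * 50 + [1] * 50)
# stream: a *new* attack pattern appears around 4.5 pkt/s
stream = [(4.5, 1)] * 40 + [(2.0, 0)] * 10
drifted = any([clf.observe(x, y) for x, y in stream])
print(drifted, clf.predict(4.5))
```

Before retraining, the new attack pattern is misclassified as benign; after the drift-triggered retrain, it is caught while benign traffic is still classified correctly.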
Abstract: Telemarketing is a well-established marketing approach for offering products and services to prospective customers. Its effectiveness, however, is highly dependent on selecting the appropriate consumer base, as reaching uninterested customers induces annoyance and consumes costly enterprise resources in vain while interested customers are missed. Business intelligence and machine learning models can positively influence the decision-making process by predicting the potential customer base, and the existing literature in this direction shows promising results. However, the selection of influential features and the construction of effective learning models for improved performance remain challenging. Furthermore, from the modelling perspective, the class-imbalanced nature of the training data, where samples with unsuccessful outcomes highly outnumber successful ones, further compounds the problem by producing biased and inaccurate models. Additionally, customer preferences are likely to change over time for various reasons, and/or a fresh group of customers may be targeted for a new product or service, necessitating model retraining, which is not addressed at all in existing works. A major challenge in model retraining is maintaining a balance between stability (retaining older knowledge) and plasticity (being receptive to new information). To address these issues, this paper proposes an ensemble machine learning model with feature selection and oversampling techniques to identify potential customers more accurately. A novel online learning method is proposed for model retraining when new samples become available over time. This method equips the proposed approach to deal with dynamic data, improving its readiness for practical adoption, and is a highly useful addition to the literature. Extensive experiments with real-world data show that the proposed approach achieves excellent results in all cases (e.g., 98.6% accuracy in classifying customers) and outperforms recent competing models by a considerable margin of 3% on a widely used dataset.
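The class-imbalance handling described above can be illustrated with plain random oversampling, a simpler stand-in for more elaborate techniques such as SMOTE; the 95/5 outcome split below is made up, not the paper's data.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=42):
    """Duplicate minority-class samples (with replacement) until every
    class matches the size of the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    xs, ys = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)   # sample with replacement
        xs.extend(extra)
        ys.extend([cls] * (target - n))
    return xs, ys

# 95 "no" vs 5 "yes" outcomes, mimicking imbalanced telemarketing labels
x, y = list(range(100)), ["no"] * 95 + ["yes"] * 5
bx, by = random_oversample(x, y)
print(Counter(by))
```

Training on the balanced set keeps the classifier from collapsing to the majority "no" prediction, at the cost of repeated minority samples.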
Abstract: Graph Neural Networks (GNNs) have become a widely used tool for learning and analyzing data on graph structures, largely due to their ability to preserve graph structure and properties via graph representation learning. However, the effect of depth on the performance of GNNs, particularly isotropic and anisotropic models, remains an active area of research. This study presents a comprehensive exploration of the impact of depth on GNNs, focusing on the phenomena of over-smoothing and the bottleneck effect in deep graph neural networks. We investigate the tradeoff between depth and performance, revealing that increasing depth can lead to over-smoothing and a decrease in performance due to the bottleneck effect. We also examine the impact of node degree on classification accuracy, finding that nodes with low degrees can pose challenges for accurate classification. Our experiments use several benchmark datasets and a range of evaluation metrics to compare isotropic and anisotropic GNNs of varying depths, and also explore the scalability of these models. Our findings provide valuable insights into the design of deep GNNs and suggest avenues for future research to improve their performance.
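Over-smoothing is easy to demonstrate in isolation: repeatedly averaging node features over neighbourhoods (the isotropic aggregation at the heart of many GNN layers) drives all node representations toward the same vector. The random graph, feature dimensions, and layer counts below are illustrative, not the paper's experimental setup.

```python
import numpy as np

def propagate(features, adj, layers):
    """Isotropic GNN-style propagation: each layer replaces a node's
    features with the mean of its neighbours' (including itself)."""
    a_hat = adj + np.eye(len(adj))                    # add self-loops
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)  # row-normalize
    h = features
    for _ in range(layers):
        h = a_hat @ h
    return h

rng = np.random.default_rng(0)
adj = (rng.random((20, 20)) < 0.3).astype(float)
adj = np.maximum(adj, adj.T)                          # undirected graph
h0 = rng.standard_normal((20, 8))
# average per-feature spread across nodes at increasing depth
spread = [propagate(h0, adj, k).std(axis=0).mean() for k in (0, 2, 8, 32)]
print([round(s, 3) for s in spread])
```

The spread collapses toward zero as depth grows: after many layers the nodes are nearly indistinguishable, which is exactly why deep isotropic GNNs lose discriminative power.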
Abstract: Big data can open up innovative and ground-breaking prospects for the electrical grid and help to obtain a variety of technological, social, and financial benefits. The growth of power grid technologies, together with advanced data processing tools, has produced an unprecedented amount of heterogeneous big data. The main obstacles in turning these heterogeneous, large datasets into useful results are the computational burden and information security. The original contribution of this paper is a new big data framework for detecting various intrusions in smart grid systems using AI mechanisms. An AdaBelief Exponential Feature Selection (AEFS) technique is used to efficiently handle the huge input datasets from the smart grid and to boost security. A Kernel-based Extreme Neural Network (KENN) technique is then used to anticipate security vulnerabilities more effectively, with the Polar Bear Optimization (PBO) algorithm efficiently determining the parameters of the radial basis function estimate. Several types of smart grid network datasets are employed in the analysis to examine the outcomes and efficiency of the proposed AdaBelief Exponential Feature Selection-Kernel based Extreme Neural Network (AEFS-KENN) big data security framework. The results reveal that the accuracy of the proposed AEFS-KENN reaches 99.5%, with a precision and AUC of 99%, for all smart grid big datasets used in this study.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 62272288, 61972451, and U22A2041) and the Shenzhen Key Laboratory of Intelligent Bioinformatics (No. ZDSYS20220422103800001).
Abstract: Generating novel molecules that satisfy specific properties is a challenging task in modern drug discovery, requiring the optimization of a specific objective while satisfying chemical rules. Herein, we aim to optimize the properties of a source molecule so that the generated molecule satisfies the specified target properties. Matched Molecular Pairs (MMPs), which contain source and target molecules, are used herein, and logD and solubility are selected as the optimization properties. The main innovation lies in the calculations for a specific transformer from the perspective of matrix dimensions. Threshold intervals and state changes are then used to encode logD and solubility for subsequent tests. In the experiments, we screen the data based on the proportion of heavy atoms to all atoms in the groups and select 12,365, 1,503, and 1,570 MMPs as the training, validation, and test sets, respectively. Transformer models are compared with baseline models with respect to their ability to generate molecules with specific properties. Results show that the transformer model can accurately optimize the source molecules to satisfy specific properties.
Abstract: Data temperature is a response to the ever-growing amount of data. These data have to be stored, but it has been observed that only a small portion of the data is accessed frequently at any one time, which leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation of current implementations of data temperature, because age automatically assumes that all new data have priority and usage is purely reactive. We propose new variables and conditions that enable smarter decisions about which data are hot or cold and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables that extend the current data temperature value, and we further establish rules and conditions for limiting unnecessary movement of data, which helps to prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines the existing variables with the new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
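A hybrid temperature of the kind described can be sketched as a weighted blend of usage frequency, recency of access, and an explicit user-assigned priority. The variable names, weights, and saturation constants below are assumptions for illustration only, not the paper's actual formula.

```python
import time

def data_temperature(meta, now, w_freq=0.5, w_recency=0.3, w_user=0.2):
    """Hybrid temperature in [0, 1] combining usage frequency, recency,
    and a user-defined priority, instead of relying on age alone."""
    day = 86400.0
    freq = min(meta["accesses_last_30d"] / 100.0, 1.0)        # saturate at 100 hits
    recency = max(0.0, 1.0 - (now - meta["last_access"]) / (30 * day))
    return w_freq * freq + w_recency * recency + w_user * meta["user_priority"]

now = time.time()
hot = {"accesses_last_30d": 250, "last_access": now - 3600, "user_priority": 0.0}
cold = {"accesses_last_30d": 2, "last_access": now - 45 * 86400, "user_priority": 0.0}
pinned = dict(cold, user_priority=1.0)   # user forces otherwise-cold data to stay warm
for name, m in [("hot", hot), ("cold", cold), ("pinned", pinned)]:
    print(name, round(data_temperature(m, now), 2))
```

Note how the user-defined priority lifts the "pinned" object above plain cold data even though its age and usage are identical, which is exactly the kind of user control age-plus-frequency schemes lack.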
Funding: This work was supported by the National Key Research and Development Project of China (No. 2019YFB2102500), the Strategic Priority CAS Project (No. XDB38040200), the National Natural Science Foundation of China (Nos. 62206269 and U1913210), the Guangdong Provincial Science and Technology Projects (Nos. 2022A1515011217 and 2022A1515011557), and the Shenzhen Science and Technology Projects (No. JSGG20211029095546003).
Abstract: Nonnegative Matrix Factorization (NMF) is one of the most popular feature learning technologies in machine learning and pattern recognition, and it has been widely used and studied in multi-view clustering tasks because of its effectiveness. This study proposes a general semi-supervised multi-view nonnegative matrix factorization algorithm. The algorithm incorporates discriminative and geometric information on the data to learn a better-fused representation, and adopts a feature-normalizing strategy to align the different views. Two specific implementations of this algorithm are developed to validate the effectiveness of the proposed framework: Graph regularization based Discriminatively Constrained Multi-View Nonnegative Matrix Factorization (GDCMVNMF) and Extended Multi-View Constrained Nonnegative Matrix Factorization (ExMVCNMF). The intrinsic connection between these two implementations is discussed, and optimization based on multiplicative update rules is presented. Experiments on six datasets show that GDCMVNMF and ExMVCNMF outperform several representative unsupervised and semi-supervised multi-view NMF approaches.
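The multiplicative update rules that both implementations build on look as follows in their classic single-view form (Lee-Seung updates); the rank and data below are synthetic, and the paper's variants add discriminative, graph, and alignment terms on top of this core.

```python
import numpy as np

def nmf(X, rank, iters=1000, seed=0):
    """Lee-Seung multiplicative updates for X ≈ W @ H; the element-wise
    ratio form keeps W and H nonnegative automatically."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    eps = 1e-10                            # avoid division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# synthetic data with an exact rank-3 nonnegative factorization
rng = np.random.default_rng(1)
X = rng.random((12, 3)) @ rng.random((3, 8))
W, H = nmf(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(round(rel_err, 4))
```

Because each update multiplies by a nonnegative ratio, no projection step is needed to maintain the nonnegativity constraint, which is what makes this family of rules so convenient to extend with extra regularizers.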
Funding: This work was supported by the National Natural Science Foundation of China (No. 51677059) and the Fujian Provincial University Engineering Research Center Open Fund (No. KF-D21009).
Abstract: With the rapid development of smart grids, data security and privacy face serious challenges; protecting the privacy of individual users while still obtaining user-aggregated data has attracted widespread attention. In this study, we propose an encryption scheme based on differential privacy for the problem of user privacy leakage when aggregating data from multiple smart meters. First, we use an improved homomorphic encryption method to realize encrypted aggregation of users' data. Second, we propose a double-blind noise addition protocol that generates distributed noise through interaction between users and a cloud platform, preventing semi-honest participants from stealing data by colluding with one another. Finally, simulation results show that the proposed scheme can encrypt the transmission of multi-smart-meter data while satisfying the differential privacy mechanism: even if an attacker has sufficient background knowledge, the security of users' electricity information can still be ensured.
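The flavour of such a scheme, individual reports that reveal nothing on their own, yet sum to a differentially private aggregate, can be sketched without real cryptography. The toy below uses pairwise random masks that cancel in the sum plus independent Gaussian noise shares (whose sum realizes the Gaussian mechanism); the paper's actual protocol instead uses homomorphic encryption with double-blind Laplace noise, and all constants here are illustrative.

```python
import math
import random

def masked_dp_aggregate(readings, sigma_total=2.0, seed=7):
    """Each meter i reports x_i + sum_j(mask_ij) + noise_i. Pairwise masks
    mask_ij = -mask_ji cancel in the sum, so single reports look random;
    the independent Gaussian shares sum to N(0, sigma_total^2), giving the
    aggregator only a differentially private total."""
    rng = random.Random(seed)
    n = len(readings)
    mask = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.uniform(-1e6, 1e6)        # secret shared by meters i and j
            mask[i][j], mask[j][i] = r, -r
    share_sigma = sigma_total / math.sqrt(n)  # per-meter noise share
    reports = [x + sum(mask[i]) + rng.gauss(0.0, share_sigma)
               for i, x in enumerate(readings)]
    return sum(reports)                       # masks cancel; only noise remains

readings = [3.2, 1.7, 4.4, 2.9, 5.1]          # hypothetical kWh readings
noisy_total = masked_dp_aggregate(readings)
print(round(noisy_total, 2), round(sum(readings), 2))
```

Each individual report is dominated by its large random masks, yet the aggregate stays close to the true total up to the calibrated noise.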
Abstract: The e-commerce industry's rapid growth, accelerated by the COVID-19 pandemic, has led to an alarming increase in digital fraud and associated losses. To establish a healthy e-commerce ecosystem, robust cyber security and anti-fraud measures are crucial. However, research on fraud detection systems has struggled to keep pace due to limited real-world datasets. Advances in artificial intelligence, Machine Learning (ML), and cloud computing have revitalized research and applications in this domain. While ML and data mining techniques are popular in fraud detection, specific reviews of their application to e-commerce platforms such as eBay and Facebook lack depth: existing reviews provide broad overviews but fail to grasp the intricacies of ML algorithms in the e-commerce context. To bridge this gap, our study conducts a systematic literature review using the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA) methodology. We explore the effectiveness of these techniques in fraud detection within digital marketplaces and the broader e-commerce landscape. Understanding the current state of the literature and emerging trends is crucial given the rising fraud incidents and associated costs. Through our investigation, we identify research opportunities and provide insights to industry stakeholders on the key ML and data mining techniques for combating e-commerce fraud. Our paper examines the research on these techniques published in the past decade. Employing the PRISMA approach, we conducted a content analysis of 101 publications, identifying research gaps and recent techniques, and highlighting the increasing industrial use of artificial neural networks in fraud detection.
Abstract: The ability to make accurate energy predictions while considering all related energy factors allows production plants, regulatory bodies, and governments to meet energy demand and assess the effects of energy-saving initiatives. When energy consumption falls within normal parameters, the developed model can be used to predict consumption and to design improvements and mitigating measures. The objective of this model is to accurately predict energy consumption without data limitations and to provide easily interpretable results. The proposed model is an implementation of a stacked Long Short-Term Memory (LSTM) snapshot ensemble combined with the Fast Fourier Transform (FFT) and a meta-learner. Hebrail and Berard's Individual Household Electric-Power Consumption (IHEPC) dataset, combined with weather data, is used to analyse the model's accuracy in predicting energy consumption. The model is trained, and the results measured using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and coefficient of determination (R^(2)) metrics are 0.020, 0.013, 0.017, and 0.999, respectively. The stacked LSTM snapshot ensemble outperforms the compared models in prediction accuracy and minimized errors. These results show that both the prediction accuracy and the stability of the model are high. High accuracy demonstrates genuine predictive ability, and together with high stability it yields good interpretability, a property not typically accounted for in such models but one that, as this study shows, can be inferred.
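The role of the FFT front end can be illustrated by extracting the dominant cycle of a consumption series, the kind of spectral feature a forecasting model can exploit. The synthetic series below has a known 24-hour period; the function name and constants are illustrative, not the paper's implementation.

```python
import numpy as np

def dominant_period(series, sample_spacing_hours=1.0):
    """Use the FFT to find the strongest cycle in a consumption series."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                       # remove the DC component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=sample_spacing_hours)
    peak = spectrum[1:].argmax() + 1       # skip the zero-frequency bin
    return 1.0 / freqs[peak]               # period in hours

hours = np.arange(24 * 14)                 # two weeks of hourly readings
rng = np.random.default_rng(0)
load = 3 + 2 * np.sin(2 * np.pi * hours / 24) + 0.1 * rng.standard_normal(len(hours))
period = dominant_period(load)
print(round(period, 2))
```

The daily cycle is recovered even with measurement noise; feeding such frequency-domain features alongside raw readings gives the downstream LSTM ensemble an explicit view of periodic structure.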
Funding: This work was supported by the National Natural Science Foundation of China (No. 62302199), the China Postdoctoral Science Foundation (No. 2023M731368), the Natural Science Foundation of the Jiangsu Higher Education Institutions (No. 22KJB520016), the Ministry of Education in China (MOE) Youth Foundation Project of Humanities and Social Sciences (No. 22YJC870007), the 2022 Jiangsu University Undergraduate Student English Teaching Excellence Program, and the Ministry of Education's Industry-Education Cooperation Collaborative Education Project (No. 202102306005).
Abstract: Grid-based recommendation algorithms view users and items as abstract nodes, so the information they utilise is hidden in the selection relationships between users and items. Although these relationships are easy to handle, much useful information is overlooked, resulting in less accurate recommendations. This paper proposes improvements to the standard substance diffusion algorithm: it takes into account the influence of a user's rating on the recommended item, adds a moderating factor, and optimises the initial resource allocation vector and the resource transfer matrix of the recommendation algorithm. An average ranking score evaluation index is introduced to quantify user satisfaction with the recommendation results. Experiments conducted on the MovieLens training dataset show that the proposed algorithm outperforms classical collaborative filtering systems and network-structure-based recommendation systems in terms of recommendation accuracy and hit rate.
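The baseline substance (mass) diffusion that the paper improves upon works on the user-item bipartite graph in two steps: resource flows from a user's selected items to the users who share them, then back to items. The tiny selection matrix below is illustrative; the paper's contribution adds ratings, a moderating factor, and optimised allocation on top of this process.

```python
import numpy as np

def mass_diffusion_scores(A, user):
    """Standard substance diffusion on the user-item bipartite graph:
    one unit of resource per selected item flows items -> users -> items,
    and unselected items with the most final resource are recommended."""
    k_items = A.sum(axis=0)                       # item degrees
    k_users = A.sum(axis=1)                       # user degrees
    f0 = A[user].astype(float)                    # initial resource vector
    # step 1: each item splits its resource equally among its users
    to_users = (A * f0 / np.where(k_items > 0, k_items, 1)).sum(axis=1)
    # step 2: each user splits the received resource among their items
    f1 = (A.T * (to_users / np.where(k_users > 0, k_users, 1))).sum(axis=1)
    f1[A[user] == 1] = -np.inf                    # never re-recommend
    return f1

# 4 users x 5 items; user 0 has selected items 0 and 1
A = np.array([[1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 1, 1]])
scores = mass_diffusion_scores(A, user=0)
print(scores.argmax())   # best recommendation for user 0
```

Item 2 wins because it is co-selected by both neighbours of user 0, while items 3 and 4 are only reachable through weaker paths.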
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61972456 and 61402329) and the Natural Science Foundation of Tianjin (Nos. 19JCYBJC15400 and 21YDTPJC00440).
Abstract: The Quick Access Recorder (QAR), an important device for storing data from various flight parameters, contains a large amount of valuable data and comprehensively records the real state of an airline flight. However, the recorded data have missing values due to factors such as weather and equipment anomalies. These missing values seriously hamper the analysis of QAR data by aeronautical engineers, for example in flight scenario reproduction and flight safety status assessment. Imputing the missing values in QAR data, which can further guarantee flight safety, is therefore crucial. QAR data also have multivariate, multiprocess, and temporal features. We therefore propose the imputation models A-AEGAN ("A" denotes attention mechanism, "AE" denotes autoencoder, and "GAN" denotes generative adversarial network) and SA-AEGAN ("SA" denotes self-attention mechanism) for missing values in QAR data. Specifically, we apply an innovative generative adversarial network to impute missing values from QAR data. An improved gated recurrent unit is introduced as the neural unit of the GAN, which can successfully capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator, so that the missing values in the QAR data are imputed through the adversarial relationship between generator and discriminator. We introduce an attention mechanism in the autoencoder to further improve the model's ability to capture the features of QAR data: attention can maintain the correlations within QAR data and improve the imputation of missing values. Furthermore, we improve the proposed model by integrating a self-attention mechanism to further capture the relationships between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can reasonably impute the missing values in QAR data with excellent results.
Abstract: With the enhancement of data collection capabilities, massive streaming data have been accumulated in numerous application scenarios. In particular, classifying data streams based on mobile sensors can be formalized as a multi-task multi-view learning problem, where a specific task comprises multiple views with shared features collected from multiple sensors. Existing incremental learning methods are often single-task single-view and cannot learn shared representations between related tasks and views. To address these challenges, an adaptive multi-task multi-view incremental learning framework for data stream classification, called MTMVIS, is proposed, utilizing the idea of multi-task multi-view learning. Specifically, an attention mechanism is first used to align the sensor data of the different views. In addition, MTMVIS uses adaptive Fisher regularization, from the perspective of multi-task multi-view learning, to overcome catastrophic forgetting in incremental learning. Experiments on two different datasets reveal that the proposed framework outperforms state-of-the-art baselines.
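Fisher regularization against catastrophic forgetting builds on the elastic-weight-consolidation idea: penalize movement of parameters in proportion to their Fisher information from earlier tasks. The sketch below shows the standard quadratic EWC penalty with illustrative numbers; MTMVIS uses an adaptive multi-task multi-view variant of this, not exactly the form shown.

```python
import numpy as np

def ewc_penalty(params, old_params, fisher, lam=100.0):
    """EWC-style penalty: parameters important for previous tasks
    (large Fisher information) are anchored near their old values,
    while unimportant ones remain free to adapt to the new task."""
    return 0.5 * lam * float(np.sum(fisher * (params - old_params) ** 2))

old = np.array([1.0, -0.5, 2.0])
fisher = np.array([5.0, 0.01, 3.0])             # diagonal Fisher estimate, task 1
moved_unimportant = np.array([1.0, 3.0, 2.0])   # changed a low-Fisher weight
moved_important = np.array([2.0, -0.5, 2.0])    # changed a high-Fisher weight
print(ewc_penalty(moved_unimportant, old, fisher),
      ewc_penalty(moved_important, old, fisher))
```

Moving the unimportant weight by a lot costs far less than moving the important one by a little, which is exactly the stability-plasticity trade-off the regularizer enforces during incremental training.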
Funding: This work was funded by the National Natural Science Foundation of China (Nos. U22A2041, 82071915, and 62372047), the Shenzhen Key Laboratory of Intelligent Bioinformatics (No. ZDSYS20220422103800001), the Shenzhen Science and Technology Program (No. KQTD20200820113106007), the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515220015), the Zhuhai Technology and Research Foundation (Nos. ZH22036201210034PWC, 2220004000131, and 2220004002412), the Project of Humanities and Social Science of MOE (Ministry of Education in China) (No. 22YJCZH213), the Science and Technology Research Program of Chongqing Municipal Education Commission (Nos. KJZD-K202203601, KJQN0202203605, and KJQN202203607), and the Natural Science Foundation of Chongqing, China (No. cstc2021jcyj-msxmX1108).
Abstract: This study explores the potential of Artificial Intelligence (AI) in the early screening and prognosis of Dry Eye Disease (DED), aiming to enhance the accuracy of therapeutic approaches for eye-care practitioners. Despite the promising opportunities, challenges such as diverse diagnostic evidence, complex etiology, and interdisciplinary knowledge integration impede the interpretability, reliability, and applicability of AI-based DED detection methods. This research comprehensively reviews the datasets, diagnostic evidence and standards, and advanced algorithms in AI-based DED detection over the past five years. The DED diagnostic methods are categorized into three groups based on their relationship with AI techniques: (1) those with ground truth and/or comparable standards, (2) potential AI-based methods with significant advantages, and (3) supplementary methods for AI-based DED detection. The study proposes suggested DED detection standards, the combination of multiple sources of diagnostic evidence, and future research directions to guide further investigations. Ultimately, the research contributes to the advancement of ophthalmic disease detection by providing insights into knowledge foundations, advanced methods, challenges, and potential future perspectives, emphasizing the significant role of AI in both academic and practical aspects of ophthalmology.