Bundle recommendation aims to provide users with convenient one-stop solutions by recommending bundles of related items that cater to their diverse needs. However, previous research has neglected the interaction between bundle and item views and relied on simplistic methods for predicting user-bundle relationships. To address this limitation, we propose Hybrid Contrastive Learning for Bundle Recommendation (HCLBR). Our approach integrates unsupervised and supervised contrastive learning to enrich user and bundle representations, promoting diversity. By leveraging interconnected views of user-item and user-bundle nodes, HCLBR enhances representation learning for robust recommendations. Evaluation on four public datasets demonstrates the superior performance of HCLBR over state-of-the-art baselines. Our findings highlight the significance of leveraging contrastive learning and interconnected views in bundle recommendation, providing valuable insights for marketing strategies and recommendation system design.
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear application. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may degrade clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns all the self-expressions of multiple views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information for achieving precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity of multiple views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that the CCAC method outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
To improve the recognition accuracy of similar weather scenarios (SWSs) in the terminal area, a recognition model for SWS based on contrastive learning (SWS-CL) is proposed. Firstly, a data augmentation method is designed to improve the number and quality of weather scenario samples according to the characteristics of convective weather images. Secondly, in the pre-trained recognition model of SWS-CL, a loss function is formulated to minimize the distance between the anchor and positive samples, and maximize the distance between the anchor and the negative samples in the latent space. Finally, the pre-trained SWS-CL model is fine-tuned with labeled samples to improve the recognition accuracy of SWS. Comparative experiments on the weather images of the Guangzhou terminal area show that the proposed data augmentation method can effectively improve the quality of the weather image dataset, and that the proposed SWS-CL model can achieve satisfactory recognition accuracy. It is also verified that the fine-tuned SWS-CL model has obvious advantages on datasets with sparse labels.
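The loss described in the SWS-CL abstract (pull the anchor toward positive samples, push it away from negative samples in the latent space) can be illustrated with a generic triplet margin loss. This is only a minimal sketch: the abstract does not give the actual SWS-CL loss, so the margin formulation, the Euclidean distance, and the names `euclidean` and `triplet_loss` are assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negatives, margin=1.0):
    """Generic triplet margin loss (an assumed stand-in, not the SWS-CL loss).

    The loss is zero once every negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    per_negative = [
        max(0.0, d_pos - euclidean(anchor, neg) + margin) for neg in negatives
    ]
    return sum(per_negative) / len(per_negative)


# Hypothetical 2-D latent vectors, used only to exercise the function.
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
far_negative = [3.0, 4.0]    # distance 5.0 from the anchor
near_negative = [0.5, 0.0]   # distance 0.5 from the anchor

loss_far = triplet_loss(anchor, positive, [far_negative])
loss_near = triplet_loss(anchor, positive, [near_negative])
```

A negative that already lies far from the anchor contributes nothing, while a nearby negative is penalized, which is the behaviour the abstract describes qualitatively.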
Previous deep learning-based super-resolution (SR) methods rely on the assumption that the degradation process is predefined (e.g., bicubic downsampling). Thus, their performance deteriorates if the real degradation is not consistent with the assumption. To deal with real-world scenarios, existing blind SR methods are committed to estimating both the degradation and the super-resolved image with an extra loss or iterative scheme. However, degradation estimation that requires more computation would result in limited SR performance due to the accumulated estimation errors. In this paper, we propose a contrastive regularization built upon contrastive learning to exploit both the information of blurry images and clear images as negative and positive samples, respectively. Contrastive regularization ensures that the restored image is pulled closer to the clear image and pushed far away from the blurry image in the representation space. Furthermore, instead of estimating the degradation, we extract global statistical prior information to capture the character of the distortion. Considering the coupling between the degradation and the low-resolution image, we embed the global prior into the distortion-specific SR network to make our method adaptive to changes of distortions. We term our distortion-specific network with contrastive regularization CRDNet. Extensive experiments on synthetic and real-world scenes demonstrate that our lightweight CRDNet surpasses state-of-the-art blind super-resolution approaches.
This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems, and a method built for one does not fit the other. A general geometry problem solver should be able to process problems of various modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn the unified feature representation of text and image, which can solve the heterogeneity issue between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which can effectively adapt to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, a proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
Person re-identification (ReID) aims to recognize the same person in multiple images from different camera views. Training person ReID models is time-consuming and resource-intensive; thus, cloud computing is an appropriate model training solution. However, the massive personal data required for training contain private information, carry a significant risk of data leakage in cloud environments, and lead to significant communication overheads. This paper proposes a federated person ReID method with model-contrastive learning (MOON) in an edge-cloud environment, named FRM. Specifically, based on federated partial averaging, MOON warmup is added to correct the local training of individual edge servers and improve the model's effectiveness by calculating and back-propagating a model-contrastive loss, which represents the similarity between local and global models. In addition, we propose a lightweight person ReID network, named multi-branch combined depth space network (MB-CDNet), to reduce the computing resource usage of the edge device when training and testing the person ReID model. MB-CDNet is a multi-branch version of the combined depth space network (CDNet). We add a part branch and a global branch on the basis of CDNet and introduce an attention pyramid to improve the performance of the model. The experimental results on open-access person ReID datasets demonstrate that FRM achieves better performance than existing baselines.
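The model-contrastive loss mentioned in the FRM abstract can be sketched in the form commonly associated with MOON: the representation produced by the current local model is treated as similar to the global model's representation (positive) and dissimilar to the previous local model's representation (negative). The abstract does not state which variant FRM uses, so the formula below, the temperature value, and the names `cosine` and `model_contrastive_loss` are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def model_contrastive_loss(z_local, z_global, z_prev, temperature=0.5):
    """Assumed MOON-style loss for one sample.

    z_local  : representation from the local model being trained
    z_global : representation from the current global model (positive)
    z_prev   : representation from the previous local model (negative)
    """
    pos = math.exp(cosine(z_local, z_global) / temperature)
    neg = math.exp(cosine(z_local, z_prev) / temperature)
    return -math.log(pos / (pos + neg))


# Hypothetical representations, used only to exercise the function.
z_global = [1.0, 0.0]
z_prev = [0.0, 1.0]

loss_aligned = model_contrastive_loss([1.0, 0.0], z_global, z_prev)
loss_drifted = model_contrastive_loss([0.0, 1.0], z_global, z_prev)
```

The loss is small when the local representation agrees with the global model and large when it stays close to the previous local model, which is how such a term can correct local training drift.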
Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, two challenges remain. First, most real-world systems are multi-relational: entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can serve as self-supervised signals, which are not fully exploited. This study presents a novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM^(2)S). It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, the contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS), with the graph convolutional network as encoder, is introduced first to capture the intra-relation information with multi-scale structure-level and feature-level self-supervised signals. The structure-level information includes the edge structure and sub-graph structure, and the feature-level information represents the output of different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM^(2)S framework is proposed to jointly learn various graph relations in the attributed multiplex graph network to achieve a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods, which outperform existing competitive baselines.
Some reconstruction-based anomaly detection models in multivariate time series have brought impressive performance advancements but suffer from weak generalization ability and a lack of anomaly identification. These limitations can result in the misjudgment of models, leading to a degradation in overall detection performance. This paper proposes a novel transformer-like anomaly detection model adopting a contrastive learning module and a memory block (CLME) to overcome the above limitations. The contrastive learning module tailored for time series data can learn the contextual relationships to generate temporal fine-grained representations. The memory block can record normal patterns of these representations through the utilization of attention-based addressing and reintegration mechanisms. These two modules together effectively alleviate the problem of generalization. Furthermore, this paper introduces a fusion anomaly detection strategy that comprehensively takes into account the residual and feature spaces. Such a strategy can enlarge the discrepancies between normal and abnormal data, which is more conducive to anomaly identification. The proposed CLME model not only efficiently enhances the generalization performance but also improves the ability of anomaly detection. To validate the efficacy of the proposed approach, extensive experiments are conducted on well-established benchmark datasets, including SWaT, PSM, WADI, and MSL. The results demonstrate outstanding performance, with F1 scores of 90.58%, 94.83%, 91.58%, and 91.75%, respectively. These findings affirm the superiority of the CLME model over existing state-of-the-art anomaly detection methodologies in terms of its ability to detect anomalies within complex datasets accurately.
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Psychological studies on human subjects show that contrast detection learning promotes the learner's sensitivity to visual stimulus contrast. The underlying neural mechanisms remain unknown. In this study, three cats (Felis catus) were trained to perform a contrast detection task monocularly using the two-alternative forced choice method. The perceptual ability of each cat improved remarkably with learning, as indicated by a significantly increased contrast sensitivity to visual stimuli. The learning effect displayed an evident specificity to the eye employed for learning but could partially transfer to the naive eye, raising the possibility that contrast detection learning might cause neural plasticity both before and after the information from both eyes is merged in the visual pathway. Further, the contrast sensitivity improvement was evident mainly around the spatial frequency (SF) used for learning, which suggests that the contrast detection learning effect showed, to some extent, SF specificity. This study indicates that cats exhibit a property of contrast detection learning similar to that of human subjects and can be used as an animal model for subsequent investigations of the neural correlates that mediate learning-induced contrast sensitivity improvement in humans.
Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against interclass similarity and intraclass variability of small objects, and UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features to improve detection accuracy and robustness from the information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. Firstly, we present the rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to make object features similar to the same category and dissimilar to different categories. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to prove the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at https://github.com/cq100/RD-MIM (accessed on 2 August 2024).
Knowledge graphs can help improve recommendation performance and are widely applied in various personalized recommendation domains. However, existing knowledge-aware recommendation methods face challenges such as weak user-item interaction supervisory signals and noise in the knowledge graph. To tackle these issues, this paper proposes a neighbor information contrast-enhanced recommendation method that adds subtle noise to construct contrast views and employs contrastive learning to strengthen supervisory signals and reduce knowledge noise. Specifically, this paper first adopts heterogeneous propagation and knowledge-aware attention networks to obtain multi-order neighbor embeddings of users and items, mining the high-order neighbor information of users and items. Next, this paper introduces weak noise following a uniform distribution into the neighbor information to construct neighbor contrast views, effectively reducing the time overhead of view construction. This paper then performs contrastive learning between neighbor views to promote the uniformity of view information, adjusting the neighbor structure and reducing the knowledge noise in the knowledge graph. Finally, this paper introduces multi-task learning to mitigate the problem of weak supervisory signals. To validate the effectiveness of our method, experiments are conducted on the MovieLens-1M, MovieLens-20M, Book-Crossing, and Last-FM datasets. The results show that, compared to the best baselines, our method achieves significant improvements in AUC and F1.
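The view-construction step in this abstract (adding weak uniform noise to neighbor embeddings, then contrasting the two resulting views) can be sketched as follows. The noise scale `eps`, the InfoNCE objective, the temperature, and the function names are all assumptions for illustration; the abstract does not specify the contrastive objective or any hyperparameters.

```python
import math
import random


def perturb(embedding, eps, rng):
    """Add independent uniform noise in [-eps, eps] to every dimension."""
    return [x + rng.uniform(-eps, eps) for x in embedding]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.2):
    """InfoNCE loss (assumed objective): row i of view_a and row i of
    view_b form the positive pair; all other rows of view_b are negatives."""
    total = 0.0
    for i, a in enumerate(view_a):
        logits = [cosine(a, b) / temperature for b in view_b]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Hypothetical neighbor embeddings for two nodes.
rng = random.Random(0)
embeddings = [[1.0, 0.0], [0.0, 1.0]]
eps = 0.01
view_1 = [perturb(e, eps, rng) for e in embeddings]
view_2 = [perturb(e, eps, rng) for e in embeddings]

loss_matched = info_nce(view_1, view_2)
loss_shuffled = info_nce(view_1, list(reversed(view_2)))
```

Because each view is just the original embedding plus cheap random noise, no graph-level augmentation (edge dropping, subgraph sampling) is needed, which is consistent with the reduced view-construction overhead the abstract reports.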
Fruit infections have an impact on both the yield and the quality of the crop. As a result, an automated recognition system for fruit leaf diseases is important. In artificial intelligence (AI) applications, especially in agriculture, deep learning shows promising disease detection and classification results. Recent AI-based techniques face a few challenges in fruit disease recognition, such as low-resolution images, small datasets for learning models, and irrelevant feature extraction. This work proposes a new fruit leaf disease recognition framework using deep learning features and improved pathfinder optimization. Three fruit types, apple, grape, and citrus, are used for validation. In the first step, a noisy dataset is prepared from the original images so that the designed framework learns better. In the next step, the EfficientNet-B0 deep model is fine-tuned and trained separately on the original and noisy data. After that, features are fused using a serial concatenation approach and then optimized using an improved Path Finder Algorithm (PFA). This algorithm aims to select the best features based on the fitness score and ignore redundant information. The selected features are finally classified using machine learning classifiers such as Medium Neural Network, Wide Neural Network, and Support Vector Machine. The experimental process was conducted on each fruit dataset separately and obtained an accuracy of 100%, 99.7%, 99.7%, and 93.4% for apple, grape, Citrus fruit, and citrus plant leaves, respectively. A detailed analysis and a comparison with recent techniques show that the proposed framework achieves improved accuracy.
Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy. We therefore develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms baselines. The source code will be available at https://github.com/didid5/MVTL-LSR.
Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities. Unlike existing approaches that often focus on static gestures and require extensive labeled data, the proposed wearable wristband with self-supervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios. It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes, resulting in high-sensitivity capacitance output. Through wireless transmission from a Wi-Fi module, the proposed algorithm learns latent features from the unlabeled signals of random wrist movements. Remarkably, only few-shot labeled data are sufficient for fine-tuning the model, enabling rapid adaptation to various tasks. The system achieves a high accuracy of 94.9% in different scenarios, including the prediction of eight-direction commands and air-writing of all numbers and letters. The proposed method facilitates smooth transitions between multiple tasks without the need for modifying the structure or undergoing extensive task-specific training. Its utility has been further extended to enhance human–machine interaction over digital platforms, such as game controls, calculators, and three-language login systems, offering users a natural and intuitive way of communication.
Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient network topology learning. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address the above problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. Firstly, the global snapshot is built based on the historical snapshots to express the stable pattern of the dynamic graph, and the random walk is used to obtain the position representation by learning the positional information of the nodes. Secondly, a new data augmentation method is carried out from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of node representation is improved by combining the contrastive learning between different structural augmentation views and between the two representations of structure and position. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is more competitive than the supervised learning method under a semi-supervised setting.
A learner's stages of L2 development are connected by his or her L1 and culture. It is, accordingly, of paramount importance to understand second language learners' culture and learning process and to better assist them through this process when teaching them English. Similarly, inter-language theory (IL) and contrastive rhetoric are affected by factors such as the learner's L1, learning experiences, and culture. This paper discusses the characteristics, constructs, and importance of these two theories, so that language instructors may better understand L2 learning phenomena and devise better methods to help language learners improve their language skills.
In medical imaging, computer vision researchers are faced with a variety of features for verifying the authenticity of classifiers for an accurate diagnosis. In response to the coronavirus 2019 (COVID-19) pandemic, new testing procedures, medical treatments, and vaccines are being developed rapidly. One potential diagnostic tool is a reverse-transcription polymerase chain reaction (RT-PCR). RT-PCR, typically a time-consuming process, was less sensitive to COVID-19 recognition in the disease's early stages. Here we introduce an optimized deep learning (DL) scheme to distinguish COVID-19-infected patients from normal patients according to computed tomography (CT) scans. In the proposed method, contrast enhancement is used to improve the quality of the original images. A pretrained DenseNet-201 DL model is then trained using transfer learning. Two fully connected layers and an average pool are used for feature extraction. The extracted deep features are then optimized with a Firefly algorithm to select the most optimal learning features. Fusing the selected features is important to improving the accuracy of the approach; however, it directly affects the computational cost of the technique. In the proposed method, a new parallel high index technique is used to fuse two optimal vectors; the outcome is then passed on to an extreme learning machine for final classification. Experiments were conducted on a collected database of patients using a 70:30 training:testing ratio. Our results indicated an average classification accuracy of 94.76% with the proposed approach. A comparison of the outcomes to several other DL models demonstrated the effectiveness of our DL method for classifying COVID-19 based on CT scans.
An automated system is proposed for the detection and classification of GI abnormalities. The proposed method operates under two pipeline procedures: (a) segmentation of the bleeding infection region and (b) classification of GI abnormalities by deep learning. First, the bleeding region is segmented using a hybrid approach. A threshold is applied to each channel extracted from the original RGB image. Later, all channels are merged through mutual information and pixel-based techniques. As a result, the image is segmented. Texture and deep learning features are extracted in the proposed classification task. The transfer learning (TL) approach is used for the extraction of deep features. The Local Binary Pattern (LBP) method is used for texture features. Later, an entropy-based feature selection approach is implemented to select the best features of both deep learning and texture vectors. The selected optimal features are combined with a serial-based technique and the resulting vector is fed to the Ensemble Learning Classifier. The experimental process is evaluated on two datasets: Private and KVASIR. The accuracy achieved is 99.8% for the private dataset and 86.4% for the KVASIR dataset. It can be confirmed that the proposed method is effective in detecting and classifying GI abnormalities and exceeds the compared methods.
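The Local Binary Pattern texture features named in this abstract can be illustrated with the basic 8-neighbor, radius-1 operator. This is a minimal sketch: the abstract does not say which LBP variant, neighbor ordering, or histogram size the authors use, so the clockwise bit order and the 256-bin histogram below are assumptions, and `lbp_code` and `lbp_histogram` are illustrative names.

```python
def lbp_code(patch):
    """Basic LBP code of the centre pixel of a 3x3 patch.

    Each neighbour that is >= the centre sets one bit; neighbours are
    visited clockwise starting at the top-left (an assumed ordering).
    """
    center = patch[1][1]
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2],
        patch[2][2], patch[2][1], patch[2][0],
        patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a
    grayscale image given as a list of equal-length rows."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Hypothetical 4x4 flat grayscale image: every interior pixel equals
# all of its neighbours, so every code has all eight bits set.
flat_image = [[5] * 4 for _ in range(4)]
flat_hist = lbp_histogram(flat_image)
```

The resulting histogram is the texture vector; in a pipeline like the one described, it would be concatenated with deep features after feature selection.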
Background: A brain tumor reflects abnormal cell growth. Challenges: Surgery, radiation therapy, and chemotherapy are used to treat brain tumors, but these procedures are painful and costly. Magnetic resonance imaging (MRI) is a non-invasive modality for diagnosing tumors, but scans must be interpreted by an expert radiologist. Methodology: We used deep learning and improved particle swarm optimization (IPSO) to automate brain tumor classification. MRI scan contrast is enhanced by ant colony optimization (ACO); the scans are then used to further train a pretrained deep learning model, via transfer learning (TL), and to extract features from two dense layers. We fused the features of both layers into a single, more informative vector. An IPSO algorithm selected the optimal features, which were classified using a support vector machine. Results: We analyzed high- and low-grade glioma images from the BRATS 2018 dataset; the identification accuracies were 99.9% and 99.3%, respectively. Impact: The accuracy of our method is significantly higher than that of existing techniques; thus, it will help radiologists to make diagnoses by providing a "second opinion."
Abstract: Bundle recommendation aims to provide users with convenient one-stop solutions by recommending bundles of related items that cater to their diverse needs. However, previous research has neglected the interaction between bundle and item views and relied on simplistic methods for predicting user-bundle relationships. To address this limitation, we propose Hybrid Contrastive Learning for Bundle Recommendation (HCLBR). Our approach integrates unsupervised and supervised contrastive learning to enrich user and bundle representations, promoting diversity. By leveraging interconnected views of user-item and user-bundle nodes, HCLBR enhances representation learning for robust recommendations. Evaluation on four public datasets demonstrates the superior performance of HCLBR over state-of-the-art baselines. Our findings highlight the significance of leveraging contrastive learning and interconnected views in bundle recommendation, providing valuable insights for marketing strategies and recommendation system design.
Abstract: Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear application. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may degrade clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns the self-expressions of all views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information for precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity across views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that the CCAC method outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
Funding: Supported by the Fundamental Research Funds for the Central Universities (Nos. NS2019054, NS2020045).
Abstract: To improve the recognition accuracy of similar weather scenarios (SWSs) in the terminal area, a recognition model for SWSs based on contrastive learning (SWS-CL) is proposed. First, a data augmentation method is designed to improve the number and quality of weather scenario samples according to the characteristics of convective weather images. Second, in the pre-trained recognition model of SWS-CL, a loss function is formulated to minimize the distance between the anchor and positive samples and maximize the distance between the anchor and negative samples in the latent space. Finally, the pre-trained SWS-CL model is fine-tuned with labeled samples to improve the recognition accuracy of SWSs. Comparative experiments on weather images of the Guangzhou terminal area show that the proposed data augmentation method can effectively improve the quality of the weather image dataset, and that the proposed SWS-CL model achieves satisfactory recognition accuracy. It is also verified that the fine-tuned SWS-CL model has clear advantages on datasets with sparse labels.
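The anchor/positive/negative objective described in this abstract can be illustrated with a triplet margin loss. This is only a minimal sketch: the abstract does not state the exact form of the SWS-CL loss, so the hinge form, the Euclidean distance, the `margin` value, and the function names `euclidean` and `triplet_loss` are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (an assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical 2-D latent embeddings.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]                                    # distance 1 from anchor
loss_far = triplet_loss(anchor, positive, [3.0, 4.0])    # negative at distance 5
loss_near = triplet_loss(anchor, positive, [1.0, 0.0])   # negative at distance 1
```

With a well-separated negative the loss vanishes, while a negative as close as the positive is penalized by the full margin, which is the pull/push behaviour the abstract describes.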
Funding: Supported by the National Natural Science Foundation of China (61971165) and the Key Research and Development Program of Hubei Province (2020BAB113).
Abstract: Previous deep learning-based super-resolution (SR) methods rely on the assumption that the degradation process is predefined (e.g., bicubic downsampling). Their performance therefore deteriorates if the real degradation is not consistent with this assumption. To deal with real-world scenarios, existing blind SR methods estimate both the degradation and the super-resolved image with an extra loss or an iterative scheme. However, degradation estimation requires more computation and limits SR performance because of accumulated estimation errors. In this paper, we propose a contrastive regularization built upon contrastive learning that exploits the information of blurry images and clear images as negative and positive samples, respectively. Contrastive regularization ensures that the restored image is pulled closer to the clear image and pushed far away from the blurry image in the representation space. Furthermore, instead of estimating the degradation, we extract global statistical prior information to capture the character of the distortion. Considering the coupling between the degradation and the low-resolution image, we embed the global prior into the distortion-specific SR network to make our method adaptive to changes in distortion. We term our distortion-specific network with contrastive regularization CRDNet. Extensive experiments on synthetic and real-world scenes demonstrate that our lightweight CRDNet surpasses state-of-the-art blind super-resolution approaches.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62107014, Jian P.; 62177025, He B.), the Key R&D and Promotion Projects of Henan Province (No. 212102210147, Jian P.), and the Innovative Education Program for Graduate Students at North China University of Water Resources and Electric Power, China (No. YK-2021-99, Guo F.).
Abstract: This paper presents an end-to-end deep learning method to solve geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems, and they cannot be applied to each other's setting. A general geometry problem solver should be able to process problems of various modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn a unified feature representation of text and image, which addresses the heterogeneity between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which can effectively adapt to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, the proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
Funding: Supported by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20211284 and the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps under Grant No. 2020DB005.
Abstract: Person re-identification (ReID) aims to recognize the same person in multiple images from different camera views. Training person ReID models is time-consuming and resource-intensive; thus, cloud computing is an appropriate model training solution. However, the massive personal data required for training contain private information with a significant risk of data leakage in cloud environments, and lead to significant communication overheads. This paper proposes a federated person ReID method with model-contrastive learning (MOON) in an edge-cloud environment, named FRM. Specifically, based on federated partial averaging, MOON warmup is added to correct the local training of individual edge servers and improve the model's effectiveness by calculating and back-propagating a model-contrastive loss, which represents the similarity between local and global models. In addition, we propose a lightweight person ReID network, named multi-branch combined depth space network (MB-CDNet), to reduce the computing resource usage of the edge device when training and testing the person ReID model. MB-CDNet is a multi-branch version of the combined depth space network (CDNet). We add a part branch and a global branch on the basis of CDNet and introduce an attention pyramid to improve the performance of the model. Experimental results on open-access person ReID datasets demonstrate that FRM achieves better performance than existing baselines.
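The model-contrastive loss mentioned in this abstract can be sketched as follows. The formula below follows the commonly cited MOON formulation, in which the current local representation is pulled toward the global model's representation and pushed away from the previous local model's representation. Whether FRM uses exactly this form is an assumption, as are the temperature value and the names `cosine` and `model_contrastive_loss`.

```python
import math

def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def model_contrastive_loss(z_local, z_global, z_prev, temperature=0.5):
    """Assumed MOON-style loss for one sample.

    z_local  : representation from the local model being trained
    z_global : representation from the current global model (positive)
    z_prev   : representation from the previous local model (negative)
    """
    pos = math.exp(cosine(z_local, z_global) / temperature)
    neg = math.exp(cosine(z_local, z_prev) / temperature)
    return -math.log(pos / (pos + neg))

# Hypothetical 2-D representations of the same input under three models.
aligned = model_contrastive_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
drifted = model_contrastive_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

The loss is small when the local representation agrees with the global model and large when it stays close to the previous local model instead, which is how such a term can correct drift in local training on each edge server.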
Funding: Supported by the National Natural Science Foundation of China (NSFC) under grant number 61873274.
Abstract: Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, two challenges remain. First, most real-world systems are multi-relational: entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can be seen as self-supervised signals, which are not fully exploited. This study presents a novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM²S). It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, a contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS), with a graph convolutional network as encoder, is introduced first to capture the intra-relation information with multi-scale structure-level and feature-level self-supervised signals. The structure-level information includes the edge structure and sub-graph structure, and the feature-level information represents the output of different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM²S framework is proposed to jointly learn the various graph relations in an attributed multiplex graph network to achieve a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods, which outperform existing competitive baselines.
Funding: Supported by the Major National Science and Technology Special Projects (2016ZX02301003-004-007) and the Natural Science Foundation of Hebei Province (F2020202067).
Abstract: Some reconstruction-based anomaly detection models for multivariate time series have brought impressive performance advancements but suffer from weak generalization ability and a lack of anomaly identification. These limitations can result in misjudgment by the models, leading to a degradation in overall detection performance. This paper proposes a novel transformer-like anomaly detection model adopting a contrastive learning module and a memory block (CLME) to overcome the above limitations. The contrastive learning module, tailored for time series data, can learn contextual relationships to generate temporal fine-grained representations. The memory block can record normal patterns of these representations through attention-based addressing and reintegration mechanisms. Together, these two modules effectively alleviate the generalization problem. Furthermore, this paper introduces a fusion anomaly detection strategy that comprehensively takes into account the residual and feature spaces. Such a strategy can enlarge the discrepancies between normal and abnormal data, which is more conducive to anomaly identification. The proposed CLME model not only enhances generalization performance but also improves anomaly detection ability. To validate the efficacy of the proposed approach, extensive experiments are conducted on well-established benchmark datasets, including SWaT, PSM, WADI, and MSL. The results demonstrate outstanding performance, with F1 scores of 90.58%, 94.83%, 91.58%, and 91.75%, respectively. These findings affirm the superiority of the CLME model over existing state-of-the-art anomaly detection methodologies in its ability to detect anomalies accurately within complex datasets.
Funding: Supported by the Science and Technology Research Project of Jiangxi Education Department (Project Grant No. GJJ2203306).
Abstract: Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and it necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Funding: Supported by the Natural Science Foundation of Anhui Province (070413138), the foundation of the Key Laboratory of Anhui Province, and the Key Research Foundation from the Education Department of Anhui Province (KJ2009A167).
Abstract: Psychological studies on human subjects show that contrast detection learning promotes learners' sensitivity to visual stimulus contrast. The underlying neural mechanisms remain unknown. In this study, three cats (Felis catus) were trained to perform a contrast detection task monocularly using a two-alternative forced-choice method. The perceptual ability of each cat improved remarkably with learning, as indicated by a significantly increased contrast sensitivity to visual stimuli. The learning effect displayed an evident specificity to the eye employed for learning but could partially transfer to the naive eye, raising the possibility that contrast detection learning might cause neural plasticity both before and after the information from the two eyes is merged in the visual pathway. Further, the contrast sensitivity improvement was evident mainly around the spatial frequency (SF) used for learning, which suggests that the contrast detection learning effect shows, to some extent, SF specificity. This study indicates that cats exhibit a property of contrast detection learning similar to that of human subjects and can be used as an animal model for subsequent investigations of the neural correlates that mediate learning-induced contrast sensitivity improvement in humans.
Funding: Supported by the National Natural Science Foundation of China under Grant 61671219.
Abstract: Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against interclass similarity and intraclass variability of small objects, or against UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features for improving detection accuracy and robustness from the information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. First, we present the rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to make object features similar within the same category and dissimilar across different categories. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to prove the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at https://github.com/cq100/RD-MIM (accessed on 2 August 2024).
Funding: Supported by the Natural Science Foundation of Ningxia Province (No. 2023AAC03316), the Ningxia Hui Autonomous Region Education Department Higher Education Key Scientific Research Project (No. NYG2022051), and the North Minzu University Graduate Innovation Project (YCX23146).
Abstract: Knowledge graphs can help improve recommendation performance and are widely applied in various personalized recommendation domains. However, existing knowledge-aware recommendation methods face challenges such as weak user-item interaction supervisory signals and noise in the knowledge graph. To tackle these issues, this paper proposes a neighbor information contrast-enhanced recommendation method that adds subtle noise to construct contrast views and employs contrastive learning to strengthen supervisory signals and reduce knowledge noise. Specifically, this paper first adopts heterogeneous propagation and knowledge-aware attention networks to obtain multi-order neighbor embeddings of users and items, mining their high-order neighbor information. Next, this paper introduces weak noise following a uniform distribution into the neighbor information to construct neighbor contrast views, effectively reducing the time overhead of view construction. This paper then performs contrastive learning between neighbor views to promote the uniformity of view information and adjust the neighbor structure, thereby reducing the knowledge noise in the knowledge graph. Finally, this paper introduces multi-task learning to mitigate the problem of weak supervisory signals. To validate the effectiveness of our method, experiments are conducted on the MovieLens-1M, MovieLens-20M, Book-Crossing, and Last-FM datasets. The results show that, compared to the best baselines, our method achieves significant improvements in AUC and F1.
Abstract: Fruit infections affect both the yield and the quality of the crop. As a result, an automated recognition system for fruit leaf diseases is important. In artificial intelligence (AI) applications, especially in agriculture, deep learning shows promising disease detection and classification results. Recent AI-based techniques face a few challenges in fruit disease recognition, such as low-resolution images, small datasets for learning models, and irrelevant feature extraction. This work proposes a new fruit leaf disease recognition framework using deep learning features and improved pathfinder optimization. Three fruit types, apple, grape, and citrus, are employed in this work for the validation process. In the first step, a noisy dataset is prepared from the original images so that the designed framework learns better. In the next step, the EfficientNet-B0 deep model is fine-tuned and trained separately on the original and noisy data. After that, features are fused using a serial concatenation approach and then optimized using an improved Path Finder Algorithm (PFA). This algorithm aims to select the best features based on the fitness score and ignore redundant information. The selected features are finally classified using machine learning classifiers such as Medium Neural Network, Wide Neural Network, and Support Vector Machine. The experimental process was conducted on each fruit dataset separately and obtained accuracies of 100%, 99.7%, 99.7%, and 93.4% for apple, grape, citrus fruit, and citrus plant leaves, respectively. A detailed analysis and a comparison with recent techniques show that the proposed framework achieves improved accuracy.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023) of the Shenzhen Science and Technology Innovation Committee; the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Natural Science Foundation of Jiangsu Province (No. BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038 and SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); the Henan Province Science and Technology Research (222102310322); and the Jiangsu Students' Innovation and Entrepreneurship Training Program (202110304096Y).
Abstract: Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy. We therefore develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms baselines. The source code will be available at https://github.com/didid5/MVTL-LSR.
Funding: Supported by the Research Grant Fund from Kwangwoon University in 2023, the National Natural Science Foundation of China under Grant 62311540155, the Taishan Scholars Project Special Funds (tsqn202312035), and the open research foundation of the State Key Laboratory of Integrated Chips and Systems.
Abstract: Wearable wristband systems leverage deep learning to revolutionize hand gesture recognition in daily activities. Unlike existing approaches that often focus on static gestures and require extensive labeled data, the proposed wearable wristband with self-supervised contrastive learning excels at dynamic motion tracking and adapts rapidly across multiple scenarios. It features a four-channel sensing array composed of an ionic hydrogel with hierarchical microcone structures and ultrathin flexible electrodes, resulting in high-sensitivity capacitance output. Through wireless transmission from a Wi-Fi module, the proposed algorithm learns latent features from the unlabeled signals of random wrist movements. Remarkably, only few-shot labeled data are sufficient for fine-tuning the model, enabling rapid adaptation to various tasks. The system achieves a high accuracy of 94.9% in different scenarios, including the prediction of eight-direction commands and air-writing of all numbers and letters. The proposed method facilitates smooth transitions between multiple tasks without the need to modify the structure or undergo extensive task-specific training. Its utility has been further extended to enhance human-machine interaction over digital platforms, such as game controls, calculators, and three-language login systems, offering users a natural and intuitive way of communication.
Abstract: Unsupervised learning methods such as graph contrastive learning have been used for dynamic graph representation learning to eliminate the dependence on labels. However, existing studies neglect positional information when learning discrete snapshots, resulting in insufficient network topology learning. At the same time, due to the lack of appropriate data augmentation methods, it is difficult to capture the evolving patterns of the network effectively. To address these problems, a position-aware and subgraph-enhanced dynamic graph contrastive learning method is proposed for discrete-time dynamic graphs. First, a global snapshot is built from the historical snapshots to express the stable pattern of the dynamic graph, and a random walk is used to obtain the position representation by learning the positional information of the nodes. Second, a new data augmentation method is carried out from the perspectives of short-term changes and long-term stable structures of dynamic graphs. Specifically, subgraph sampling based on snapshots and global snapshots is used to obtain two structural augmentation views, and node structures and evolving patterns are learned by combining a graph neural network, a gated recurrent unit, and an attention mechanism. Finally, the quality of the node representation is improved by combining contrastive learning between the different structural augmentation views with contrastive learning between the structure and position representations. Experimental results on four real datasets show that the proposed method performs better than existing unsupervised methods and is more competitive than the supervised learning method under a semi-supervised setting.
Abstract: A learner's stages of L2 development are connected by his or her L1 and culture. It is accordingly of paramount importance to understand second language learners' culture and learning process and to better assist them through this process when teaching them English. Similarly, inter-language (IL) theory and contrastive rhetoric are affected by factors such as the learner's L1, learning experiences, and culture. This paper discusses the characteristics, constructs, and importance of these two theories, so that language instructors may better understand L2 learning phenomena and devise better methods to help language learners improve their language skills.
Funding: Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: In medical imaging, computer vision researchers are faced with a variety of features for verifying the authenticity of classifiers for an accurate diagnosis. In response to the coronavirus 2019 (COVID-19) pandemic, new testing procedures, medical treatments, and vaccines are being developed rapidly. One potential diagnostic tool is the reverse-transcription polymerase chain reaction (RT-PCR). RT-PCR, typically a time-consuming process, was less sensitive to COVID-19 recognition in the disease's early stages. Here we introduce an optimized deep learning (DL) scheme to distinguish COVID-19-infected patients from normal patients according to computed tomography (CT) scans. In the proposed method, contrast enhancement is used to improve the quality of the original images. A pretrained DenseNet-201 DL model is then trained using transfer learning. Two fully connected layers and an average pool are used for feature extraction. The extracted deep features are then optimized with a Firefly algorithm to select the optimal learning features. Fusing the selected features is important for improving the accuracy of the approach; however, it directly affects the computational cost of the technique. In the proposed method, a new parallel high index technique is used to fuse two optimal vectors; the outcome is then passed on to an extreme learning machine for final classification. Experiments were conducted on a collected database of patients using a 70:30 training:testing ratio. Our results indicated an average classification accuracy of 94.76% with the proposed approach. A comparison of the outcomes with several other DL models demonstrated the effectiveness of our DL method for classifying COVID-19 based on CT scans.
Funding: This research was financially supported in part by the Ministry of Trade, Industry and Energy (MOTIE) and the Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project No. P0016038), and in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2016-0-00312) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).
Abstract: An automated system is proposed for the detection and classification of GI abnormalities. The proposed method operates under two pipeline procedures: (a) segmentation of the bleeding infection region and (b) classification of GI abnormalities by deep learning. First, the bleeding region is segmented using a hybrid approach. A threshold is applied to each channel extracted from the original RGB image. All channels are then merged through mutual information and pixel-based techniques, which yields the segmented image. In the proposed classification task, texture and deep learning features are extracted. The transfer learning (TL) approach is used for the extraction of deep features, and the Local Binary Pattern (LBP) method is used for texture features. An entropy-based feature selection approach is then implemented to select the best features of both the deep learning and texture vectors. The selected optimal features are combined with a serial-based technique, and the resulting vector is fed to the Ensemble Learning Classifier. The experimental process is evaluated on two datasets: Private and KVASIR. The accuracy achieved is 99.8% for the private dataset and 86.4% for the KVASIR dataset. The results indicate that the proposed method is effective in detecting and classifying GI abnormalities and exceeds the other methods in the comparison.
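The Local Binary Pattern texture descriptor named in this abstract can be illustrated with the basic 3x3 variant. This is a minimal sketch only: the abstract does not say which LBP variant, radius, or neighbour ordering the authors use, so the 8-neighbour layout, the clockwise bit order, and the function names `lbp_code` and `lbp_histogram` are assumptions.

```python
def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the pixel at (r, c).

    Each neighbour that is >= the centre value sets one bit.
    Bit order (an assumed convention): clockwise from the top-left neighbour.
    """
    center = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels.

    The histogram is what would serve as the texture feature vector.
    """
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical grayscale patch: bright corners, dark edges.
patch = [[9, 1, 9],
         [1, 5, 1],
         [9, 1, 9]]
code = lbp_code(patch, 1, 1)   # corners set bits 0, 2, 4, 6 -> 85
hist = lbp_histogram(patch)
```

A texture vector built this way could then be concatenated with deep features (the "serial-based" combination the abstract mentions) before feature selection.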
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0012724, The Competency Development Program for Industry Specialist) and the Soonchunhyang University Research Fund.
Abstract: Background: A brain tumor reflects abnormal cell growth. Challenges: Surgery, radiation therapy, and chemotherapy are used to treat brain tumors, but these procedures are painful and costly. Magnetic resonance imaging (MRI) is a non-invasive modality for diagnosing tumors, but scans must be interpreted by an expert radiologist. Methodology: We used deep learning and improved particle swarm optimization (IPSO) to automate brain tumor classification. MRI scan contrast is enhanced by ant colony optimization (ACO); the scans are then used to further train a pretrained deep learning model, via transfer learning (TL), and to extract features from two dense layers. We fused the features of both layers into a single, more informative vector. An IPSO algorithm selected the optimal features, which were classified using a support vector machine. Results: We analyzed high- and low-grade glioma images from the BRATS 2018 dataset; the identification accuracies were 99.9% and 99.3%, respectively. Impact: The accuracy of our method is significantly higher than that of existing techniques; thus, it will help radiologists make diagnoses by providing a "second opinion."