Funding: The National Natural Science Foundation of China (No. 61305058); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB520003); the Natural Science Foundation of Jiangsu Province (No. BK20130471); the Scientific Research Foundation for Advanced Talents by Jiangsu University (No. 13JDG093).
Abstract: A novel hashing method based on multiple heterogeneous features is proposed to improve the accuracy of image retrieval systems. First, it leverages the imbalanced distribution of similar and dissimilar samples in the feature space to boost the performance of each weak classifier in an asymmetric boosting framework. Then, a weak classifier based on a novel linear discriminant analysis (LDA) algorithm, learned from the subspace of heterogeneous features, is integrated into the framework. Finally, the proposed method learns the bits of the code sequentially, using the samples misclassified in each round to learn compact and balanced codes. The heterogeneous information from different modalities is complementary, which leads to much higher performance. Experimental results on two public benchmarks demonstrate that this method is superior to many state-of-the-art methods. In conclusion, the performance of the retrieval system can be improved with the help of multiple heterogeneous features and compact hash codes learned by the imbalanced learning method.
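To illustrate how compact hash codes are typically used at retrieval time, the sketch below ranks database items by Hamming distance to a query code. It does not reproduce the boosting or LDA steps of the abstract; the function names and the representation of codes as Python integers are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two codes stored as non-negative integers."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query, database):
    """Return database indices ordered from nearest to farthest code.

    sorted() is stable, so ties keep their original database order.
    """
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))
```

Because the comparison is an XOR followed by a bit count, shorter and better-balanced codes directly reduce both storage and ranking cost.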
Funding: the National Natural Science Foundation of China (Grant No. 62062001); Ningxia Youth Top Talent Project (2021).
Abstract: In the realm of data privacy protection, federated learning aims to collaboratively train a global model. However, heterogeneous data across clients presents challenges, often resulting in slow convergence and inadequate accuracy of the global model. Using shared feature representations alongside customized classifiers for individual clients has emerged as a promising personalized solution. Nonetheless, previous research has frequently neglected the integration of global knowledge into local representation learning and the synergy between global and local classifiers, thereby limiting model performance. To tackle these issues, this study proposes a hierarchical optimization method for federated learning with feature alignment and the fusion of classification decisions (FedFCD). FedFCD regularizes the relationship between global and local feature representations to achieve alignment and incorporates decision information from the global classifier, facilitating the late fusion of decision outputs from both global and local classifiers. Additionally, FedFCD employs a hierarchical optimization strategy to flexibly optimize model parameters. Through experiments on the Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets, we demonstrate the effectiveness and superiority of FedFCD. For instance, on the CIFAR-100 dataset, FedFCD exhibited a significant improvement in average test accuracy of 6.83% compared to four outstanding personalized federated learning approaches. Furthermore, extended experiments confirm the robustness of FedFCD across various hyperparameter values.
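The two ideas named in the abstract, feature alignment and late fusion of decisions, can be sketched as follows. This is a minimal illustration under stated assumptions, not the FedFCD formulation: the squared-L2 alignment term, the convex fusion weight `alpha`, and all function names are hypothetical choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def alignment_penalty(local_feat, global_feat):
    """Assumed regularizer: squared L2 distance between local and global features."""
    return sum((a - b) ** 2 for a, b in zip(local_feat, global_feat))


def fuse_decisions(local_logits, global_logits, alpha=0.5):
    """Assumed late fusion: convex combination of the two classifiers' probabilities."""
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    return [alpha * pl + (1.0 - alpha) * pg for pl, pg in zip(p_local, p_global)]
```

In such a setup the penalty would be added to each client's local loss, while the fusion step would be applied only at prediction time; setting `alpha=1.0` recovers the purely local classifier.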
Abstract: Spatial heterogeneity refers to the variation in characteristics or features across different locations or areas in space. Spatial data, also known as geo-spatial data or geographic information, is information that explicitly or indirectly belongs to a particular geographic region or location. Focusing on spatial heterogeneity, we present a hybrid machine learning model combining two competitive algorithms: the Random Forest Regressor (RF) and a CNN. The model is fine-tuned using cross-validation for hyper-parameter adjustment and performance evaluation, ensuring robustness and generalization. Our approach integrates global Moran's I for examining global autocorrelation and local Moran's I for assessing local spatial autocorrelation in the residuals. To validate our approach, we implemented the hybrid model on a real-world dataset and compared its performance with that of traditional machine learning models. Results indicate superior performance, with an R-squared of 0.90 for the hybrid model against 0.84 for RF and 0.74 for CNN. This study contributes to a detailed understanding of spatial variations in data by considering the geographical information (longitude and latitude) present in the dataset. Our results, also assessed using the Root Mean Squared Error (RMSE), indicated that the hybrid yielded lower errors, showing a deviation of 53.65% from the RF model and 63.24% from the CNN model. Additionally, the global Moran's I index was observed to be 0.10. This study underscores that the hybrid model was able to correctly predict house prices both in clusters and in dispersed areas.
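For reference, global Moran's I on a set of residuals can be computed directly from its standard definition, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The sketch below is a plain-Python illustration; the function name and the dense list-of-lists weight matrix are assumptions, and how the authors built their spatial weights is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (for example 1 for neighbours and 0 otherwise, with a zero diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in dev)
```

Values near zero, such as the 0.10 reported above, indicate little remaining spatial autocorrelation, whereas values approaching 1 or -1 indicate clustering or alternation, respectively.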
Funding: supported by the National Natural Science Foundation of China (Nos. 92152301, 12072282).
Abstract: Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, while neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm called heterogeneous data-driven aerodynamic modeling is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its generalization capability over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. Given the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis reveals a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which supports the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
Funding: supported by the National Science Foundation of China (U2166209, 52007126); the Science and Technology Project of State Grid Tibet Electric Power Company (52311020009X).
Abstract: Nonintrusive load monitoring (NILM) is an essential technology for energy savings and management. This paper proposes an event-based two-stage NILM method involving multi-dimensional features. First, appliance events are captured using a goodness-of-fit test, and the on-off events are paired. Then multi-dimensional features are extracted to establish a feature library. In the first identification stage, appliance events are divided into several groups according to three features: phase, steady active power, and power peak. In the second identification stage, a "one-against-the-rest" support vector machine (SVM) model is established for each group to precisely identify the appliances. The proposed method is verified on a publicly available dataset; the results show that it has high generalization ability, requires less computation, and needs fewer training samples.
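The event capture and on-off pairing steps can be sketched as below. Note the substitution: the abstract uses a goodness-of-fit test to detect events, whereas this illustration uses a simple step-size threshold on consecutive power samples. The function names, the `threshold` and `tolerance` parameters, and the greedy matching rule are all assumptions.

```python
def detect_events(power, threshold=50.0):
    """Return (index, delta) for every sample-to-sample step of at least `threshold` watts.

    Stand-in for the goodness-of-fit detector mentioned in the abstract.
    """
    events = []
    for i in range(1, len(power)):
        delta = power[i] - power[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events


def pair_on_off(events, tolerance=0.2):
    """Pair each off event with the most recent unmatched on event of similar size.

    Returns (on_index, off_index, on_delta) tuples in the order the off events occur.
    """
    open_on = []
    pairs = []
    for idx, delta in events:
        if delta > 0:
            open_on.append((idx, delta))
            continue
        # Search backwards so nested appliance cycles are matched innermost first.
        for k in range(len(open_on) - 1, -1, -1):
            on_idx, on_delta = open_on[k]
            if abs(on_delta + delta) <= tolerance * on_delta:
                pairs.append((on_idx, idx, on_delta))
                del open_on[k]
                break
    return pairs
```

Each resulting pair delimits one appliance run, from which features such as steady active power and power peak could then be extracted for the two identification stages.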
Funding: This work was supported by the National Natural Science Foundation of China (No. 61672101); the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDDXN004); and the Key Lab of Information Network Security, Ministry of Public Security, China (No. C18601).
Abstract: To effectively detect private information that may be leaked through social networks and avoid unnecessary harm to users, this paper takes microblogs as the research object and studies the detection of privacy disclosure in social networks. First, we perform fast privacy leak detection on the text about to be published, based on the fastText model. When that text contains private information, we fully consider the aggregation effect of private information leaked through different channels and establish a convolutional neural network model based on multi-dimensional features (MF-CNN) to detect privacy disclosure comprehensively and accurately. The experimental results show that the proposed method achieves higher accuracy in privacy disclosure detection and can meet the real-time requirements of detection.
Funding: supported by the Aeronautical Science Foundation of China (No. 20151067003).
Abstract: To obtain the image of the airframe damage region and provide input data for aircraft intelligent maintenance, a multi-dimensional, multi-threshold airframe damage region division method based on correlation optimization is proposed. Based on an analysis of airframe damage features, a multi-dimensional feature entropy is defined to fully fuse the multiple feature information of the image, and the division method is extended to multiple thresholds to refine the damage division and reduce the impact of morphological changes in the region adjacent to the damage. The correlation parameter optimization algorithm solves the low efficiency of the multi-dimensional multi-threshold division method. Finally, the proposed method is compared and verified on instances of airframe damage images. The results show that, compared with the traditional threshold division method, the damage region divided by the proposed method is complete and accurate, with a clear and coherent boundary, and that the method effectively reduces the interference of factors such as uneven luminance, chromaticity deviation, dirt attachment, and image compression. The correlation optimization algorithm is efficient, converges stably, and can meet the requirements of aircraft intelligent maintenance.
Funding: supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2014 ZX03001027).
Abstract: Terminals can draw on various heterogeneous networks to deliver a better quality of service, and signal system recognition and classification contribute substantially to this process. However, under low signal-to-noise ratio (SNR) conditions or time-varying multipath channels, most existing signal recognition algorithms face limitations. In this paper, we present a robust signal recognition method based on the original and the latest updated versions of the extreme learning machine (ELM) to help users switch between networks. The ELM uses signal characteristics to distinguish systems. The strength of this algorithm lies in the random choice of hidden nodes and in the fact that it determines the output weights analytically, which results in lower complexity. Theoretically, the algorithm tends to offer good generalization performance at an extremely fast learning speed. Moreover, we implement the GSM/WCDMA/LTE models in the Matlab environment using the Simulink tools. The simulations reveal that the signals can be recognized successfully, achieving 95% accuracy in a low-SNR (0 dB) environment over a time-varying multipath Rayleigh fading channel.
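The two properties highlighted above, randomly chosen hidden nodes and analytically determined output weights, can be illustrated with a small plain-Python ELM classifier. This is a sketch under stated assumptions rather than the authors' implementation: it uses sigmoid hidden units and a ridge-regularized least-squares solution, beta = (H^T H + ridge * I)^-1 H^T T, in place of the Moore-Penrose pseudo-inverse of the classical ELM, and all names and default values are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hidden_layer(x_rows, w, b):
    """Sigmoid activations of the randomly initialised hidden nodes."""
    return [
        [1.0 / (1.0 + math.exp(-(sum(xi * wi for xi, wi in zip(x, w_j)) + b_j)))
         for w_j, b_j in zip(w, b)]
        for x in x_rows
    ]


def train_elm(x_rows, labels, n_classes, n_hidden=20, ridge=1e-3, seed=0):
    rng = random.Random(seed)
    d = len(x_rows[0])
    # Hidden weights and biases are drawn at random and never trained.
    w = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    h = hidden_layer(x_rows, w, b)
    t = [[1.0 if k == y else 0.0 for k in range(n_classes)] for y in labels]
    # Output weights come from one linear solve instead of iterative training.
    hth = [[sum(r[i] * r[j] for r in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [[sum(r[i] * tr[k] for r, tr in zip(h, t)) for k in range(n_classes)]
           for i in range(n_hidden)]
    return w, b, solve(hth, htt)


def predict_elm(model, x_rows):
    """Return the index of the highest-scoring class for each input row."""
    w, b, beta = model
    scores = [[sum(hi * beta_i[k] for hi, beta_i in zip(row, beta))
               for k in range(len(beta[0]))]
              for row in hidden_layer(x_rows, w, b)]
    return [max(range(len(s)), key=s.__getitem__) for s in scores]
```

In a signal recognition setting the rows of `x_rows` would hold extracted signal characteristics and the labels would index the candidate systems; because training reduces to a single linear solve, the learning cost stays low, which matches the complexity argument in the abstract.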