With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large data volumes, high diversity, and demanding real-time requirements. Traditional security defense methods and tools can no longer cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features. The algorithm emphasizes control over unauthorized users through privacy, integrity, and availability. A user model is established, and a mapping between the user model and the metadata of the data sources is generated. By analyzing the user model and its mapping, a query on the user model can be decomposed into queries on the various heterogeneous data sources, so that heterogeneous data sources can be integrated on the basis of their metadata association features. Customer information is defined and classified, sensitive data is automatically identified and perceived, a behavior audit and analysis platform is built, and user behavior trajectories are analyzed, completing the construction of a machine-learning customer information security defense system. The experimental results show that when the data volume is 5×10^3 bit, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy remains at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and has high air traffic control security. It can not only detect all viruses in user data storage, but also realize integrated virus processing, and further optimize the security defense effect of user big data.
An empirical test of long memory between price and trading volume in China's metals futures market was carried out with the MF-DCCA method. The empirical results show that a long-memory feature with a certain period exists in the price-volume correlation, and further proof is given by analyzing the source of the multifractal feature. The results suggest that it is of practical significance to bring fractal market theory and other nonlinear theories into the analysis and explanation of behavior in the metals futures market.
The fluidity of coal-water slurry (CWS) is crucial for various industrial applications such as long-distance transportation, gasification, and combustion. However, there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity. This paper proposes a method for analyzing fluidity using videos of the CWS dripping process. By integrating the temporal and spatial features of each frame in the video, a multi-cascade classifier for CWS fluidity is established. The classifier distinguishes between four levels (A, B, C, and D) based on the quality of fluidity. The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm. Subsequently, convolutional neural networks (CNN) and long short-term memory (LSTM) are used to further differentiate between the B and C categories, which are prone to confusion. Finally, through detailed comparative experiments, the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution. The proposed method achieves an accuracy of over 90% in determining the fluidity of CWS, serving as a technical reference for future industrial applications.
In the global information era, people acquire more and more information from the Internet, but the quality of search results is strongly degraded by the presence of web spam. Web spam is one of the serious problems for search engines, and many methods have been proposed for spam detection. We exploit the content features of non-spam pages in contrast to those of spam pages. The content features of non-spam pages always possess many statistical regularities, whereas those of spam pages possess very few, because spam pages are generated randomly in order to increase page rank. In this paper, we summarize the distributions of regularities in the content features of non-spam pages, and propose probability formulae for the entropy and for independent n-grams, respectively. Furthermore, we put forward formulae for the correlation of multiple features. Among them, the notable content features may be used as auxiliary information for spam detection.
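The abstract above does not give the paper's formulae, so the following is only a minimal sketch of the two kinds of content feature it names, under stated assumptions: `shannon_entropy` is the textbook Shannon entropy of a page's token distribution, and `independent_ngram_likelihood` follows one definition commonly used in the web-spam literature (the average negative log-probability of the page's n-grams, with probabilities estimated from the page itself). Both function names and the whitespace tokenization in the usage note are hypothetical choices, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits, of the empirical token distribution."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def independent_ngram_likelihood(tokens, n=2):
    """Average negative log-probability of the page's n-grams.

    Assumption: each n-gram's probability is its relative frequency
    among all n-grams of the same page, and n-grams are treated as
    independent of one another.
    """
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    k = len(ngrams)
    return -sum(math.log(counts[g] / k) for g in ngrams) / k
```

For a page given as a string, `tokens = text.lower().split()` is a simple starting point. Highly repetitive pages score low on both features, while pages with many distinct n-grams score high.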
News media profiling helps prevent the spread of fake news at the source and maintain a healthy media and news ecosystem. Most previous works extract features and evaluate media along one dimension independently, ignoring the interconnections between different aspects. This paper proposes a novel news media bias and factuality profiling framework assisted by correlated features. The framework models the relationship and interaction between media bias and factuality, and uses this relationship to assist in predicting profiling results. Our approach extracts features independently while aligning and fusing them through recursive convolution and attention mechanisms, thus harnessing multi-scale interactive information across different dimensions and levels. This method improves the effectiveness of news media evaluation. Experimental results indicate that the proposed framework significantly outperforms existing methods, achieving the best performance in accuracy and F1 score, with an improvement of at least 1% over other methods. The paper further analyzes and discusses the experimental results.
Based on the spatio-temporal correlativity analysis method, automatic identification techniques for data anomalies in gas monitoring at coal mining working faces are presented. The asynchronous correlative characteristics of gas migration along the airflow direction of the working face are qualitatively analyzed. A calculation method for the asynchronous correlation delay step is provided, together with prediction and inversion formulas for how gas concentration changes with time and space after gas emission in the air return roadway. By calculating one hundred and fifty groups of gas sensor data series from a coal mine that are theoretically correlated, the ranges of correlation coefficient values for eight kinds of data anomaly are obtained. A gas monitoring data anomaly identification algorithm based on spatio-temporal correlativity analysis is then presented accordingly. To improve the efficiency of analysis, coding rules for gas sensors that can express spatial topological relations are suggested. The experiments indicate that the methods presented in this article can effectively compensate for the defects of methods based on the monitoring data of a single gas sensor.
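The idea of an asynchronous correlation delay step can be illustrated with a small sketch. The abstract does not state the paper's calculation method, so the approach below is an assumption: it shifts the downstream sensor series by each candidate lag and keeps the lag that maximizes the Pearson correlation with the upstream series. The names `pearson` and `best_delay_step` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(vx * vy)


def best_delay_step(upstream, downstream, max_lag):
    """Return (lag, r): the delay in samples that best aligns the series.

    Assumption: gas reaches the downstream sensor `lag` samples after
    the upstream one, so upstream[t] is compared with downstream[t + lag].
    """
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        a = upstream[:len(upstream) - lag]
        b = downstream[lag:]
        m = min(len(a), len(b))
        if m < 3:
            break  # too few overlapping samples to correlate
        r = pearson(a[:m], b[:m])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

A correlation at the best lag that falls outside the range expected for a healthy sensor pair would then be a candidate anomaly. The threshold ranges themselves come from the paper's data and are not reproduced here.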
Cross-Project Defect Prediction (CPDP) is a method that uses historical data from other source projects to train predictive models for defect prediction in the target project. However, existing CPDP methods only consider linear correlations between features (indicators) of the source and target projects. These models cannot evaluate non-linear correlations between features when they exist, for example, when the data distributions of the source and target projects differ. As a result, the performance of such CPDP models is compromised. This paper proposes a novel CPDP method based on the Synthetic Minority Oversampling Technique (SMOTE) and Deep Canonical Correlation Analysis (DCCA), referred to as S-DCCA. Canonical Correlation Analysis (CCA) is employed to address the issue of non-linear correlations between features of the source and target projects. S-DCCA extends CCA by incorporating the MlpNet model for feature extraction from the dataset. Redundant features are then eliminated by maximizing the correlated feature subset using the CCA loss function. Finally, cross-project defect prediction is achieved through the application of the SMOTE data sampling technique. Area Under Curve (AUC) and F1 score (F1) are used as evaluation metrics. Experiments were conducted on 27 projects from four public datasets to validate the proposed method. The results demonstrate that, on average, the method outperforms all baseline approaches by at least 1.2% in AUC and 5.5% in F1 score, indicating that the proposed method performs favorably.
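The SMOTE step named above can be sketched in a few lines. This is the generic interpolation idea behind SMOTE, written in plain Python for illustration; it is not the S-DCCA pipeline, and the function name `smote`, its parameters, and the fixed random seed are assumptions made for the example.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by interpolation.

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((x - y) ** 2 for x, y in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbour)])
    return synthetic
```

In defect prediction the minority class is usually the defective modules, so the synthetic rows would be appended to the defective samples before training. For real experiments a maintained implementation is a safer choice than this sketch.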
Distributed denial of service (DDoS) attacks are launched more and more frequently and are increasingly destructive. Feature representation, as an important part of DDoS defense technology, directly affects the efficiency of defense. Most DDoS feature extraction methods cannot fully use the information in the original data, so the extracted features lose useful information. In this paper, a DDoS feature representation method based on the deep belief network (DBN) is proposed. We quantify the original data by the size of the network flows, the distribution of IP addresses and ports, and the diversity of packet sizes of different protocols, and train the DBN in an unsupervised manner on these quantified values. Two feedforward neural networks (FFNN) are initialized from the trained DBN, and one of them continues to be trained in a supervised manner. The canonical correlation analysis (CCA) method is used to fuse, layer by layer, the features extracted by the two feedforward neural networks. Experiments show that, compared with other methods, the proposed method extracts better features.
Hashing technology reduces data storage and improves the efficiency of learning systems, which makes it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional single-view methods. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. To describe image content more accurately, we use the deep learning dense convolutional network feature DenseNet to construct multiple views by combining it with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. The algorithm uses the RKCCA method to fuse multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
In recent visual tracking research, correlation filter (CF) based trackers have become popular because of their high speed and considerable accuracy. Previous methods mainly work on extending features and mitigating the boundary effect in order to learn a better correlation filter. However, the related studies are insufficient. By exploring the potential of trackers in these two aspects, this paper proposes a novel adaptive padding correlation filter (APCF) with feature group fusion for robust visual tracking, based on the popular context-aware tracking framework. In the tracker, three feature groups are fused using the weighted sum of the normalized response maps, to alleviate the risk of drift caused by an extreme change in a single feature. Moreover, to improve the adaptability of padding when training filters for different object shapes, the best padding is selected from a preset pool according to the tracking precision over the whole video, where the tracking precision is predicted by a model trained on the sequence features of the first several frames. The sequence features include three traditional features and eight newly constructed features. Extensive experiments demonstrate that the proposed tracker is superior to most state-of-the-art correlation filter based trackers and shows a stable improvement over the basic trackers.
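The fusion step described above, a weighted sum of normalized response maps, can be sketched as follows. The abstract does not say how the maps are normalized or how the weights are chosen, so min-max normalization and caller-supplied weights are assumptions here, and `normalize`, `fuse` and `peak` are hypothetical names.

```python
def normalize(response):
    """Min-max normalize a 2-D response map (list of rows) to [0, 1]."""
    flat = [v for row in response for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in response]
    return [[(v - lo) / (hi - lo) for v in row] for row in response]


def fuse(responses, weights):
    """Weighted sum of normalized response maps of identical shape."""
    if len(responses) != len(weights):
        raise ValueError("one weight per response map is required")
    normed = [normalize(r) for r in responses]
    rows, cols = len(normed[0]), len(normed[0][0])
    total = sum(weights)
    return [
        [sum(w * m[i][j] for w, m in zip(weights, normed)) / total
         for j in range(cols)]
        for i in range(rows)
    ]


def peak(response):
    """Return the (row, col) position of the maximum response."""
    cells = ((i, j) for i, row in enumerate(response) for j in range(len(row)))
    return max(cells, key=lambda ij: response[ij[0]][ij[1]])
```

Normalizing first keeps one feature group with a large raw response scale from dominating the sum, which matches the stated goal of reducing drift caused by a single feature.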
High-precision GPS data observed at the northeast margin of the Qinghai-Xizang (Tibet) block and in the Sichuan-Yunnan GPS monitoring areas in 1991 (1993), 1999 and 2001 revealed that, before the Kunlun earthquake with Ms = 8.1 on November 14, 2001, the dynamic variations of the horizontal movement-deformation field in the north and east marginal tectonic areas of the Qinghai-Xizang (Tibet) block had some correlated features. That is to say, under the general background of inherited movement, the movement intensities in the two areas weakened synchronously and the state of deformation changed when the great earthquake was impending. Analysis in connection with geological structures showed that, before the Kunlun Ms 8.1 earthquake, the correlated variations of movement-deformation on the boundaries of the Qinghai-Xizang (Tibet) block were related to the disturbing stress field caused by extensive and rapid stress-strain accumulation in the late stage of large earthquake preparation. Owing to the occurrence of the large earthquake inside the block, the release of a large amount of strain energy, and the adjustment of the tectonic stress field, stress-strain may be further accumulated, or else released through rupture, in relevant structural positions in the boundary zones where a larger amount of strain energy was accumulated (especially zones not penetrated by historical strong earthquake ruptures).
The paper reveals that the variations in parameters such as u*, the scaling velocity, and θ*, the scaling temperature, during the various phases of the monsoon might be linked with subsynoptic features. The rise in u* is mainly connected with the presence of lower tropospheric cyclonic vorticity over a subsynoptic scale of the site. However, the variations in θ* are mainly linked with the various phases of the monsoon, and θ* shows a sharp rise in the presence of low-level convective cloud. In addition, correlation studies of u and u*, θv and θv*, and θv-θv0 and θv* are undertaken. The correlation between θv and θv* is poor. In the other two cases the correlations are good. Moreover, u/u* has shown good coefficient of variation values within the ξ range.
Undoubtedly, uncooperative or malicious nodes threaten the safety of the Internet of Vehicles (IoV) by destroying routing or data. To this end, researchers have designed node detection mechanisms and trust calculation algorithms based on different feature parameters of IoV, such as communication, data, and energy, to detect and evaluate vehicle nodes. However, it is difficult to assess the trust level of a vehicle node effectively using only message forwarding, data consistency, and energy sufficiency. To resolve these problems, a novel mechanism and a new trust calculation model are proposed in this paper. First, the four-tuple method is adopted to describe the various types of IoV nodes qualitatively. Second, the behavioral features and correlations of the various nodes are analyzed based on route forwarding rate, data forwarding rate, and physical location. Third, double-layer detection feature parameters are designed that can detect both uncooperative and malicious nodes. Fourth, a node correlative detection model with a double-layer structure is established by combining the network layer and the perception layer. We then conducted simulation experiments to verify the accuracy and time of this detection method under different speed-rate topological conditions of IoV. The results show that, compared with methods that consider only energy or communication parameters, the proposed method has obvious advantages in detecting uncooperative and malicious nodes of IoV. In particular, when the double detection feature parameters and the node correlative detection model are combined, detection accuracy is effectively improved and the calculation time of node detection is greatly reduced.
To solve the problem of low tracker robustness under significant appearance changes in complex backgrounds, a novel moving target tracking method based on weighted fusion of hierarchical deep features and correlation filters is proposed. First, multi-layer features are extracted by a deep model pre-trained on massive object recognition datasets. The linearly separable features of the Relu3-1, Relu4-1 and Relu5-4 layers of VGG-Net-19 are especially suitable for target tracking. Then, correlation filters over the hierarchical convolutional features are learned to generate their correlation response maps. Finally, a novel weight adjustment approach is presented to fuse the response maps. The maximum of the final response map gives the location of the target. Extensive experiments on object tracking benchmark datasets demonstrate high robustness and recognition precision compared with several state-of-the-art trackers under different conditions.
Finding effective cancer treatment is a challenge, because the sensitivity of the cancer stems from both intrinsic cellular properties and resistances acquired from prior treatment. Previous research has revealed individual protein markers that are significant to chemosensitivity prediction. Our goal is to find correlated protein markers which are collectively significant to chemosensitivity prediction, to complement the individual markers already reported. To do this, we used the D’ correlation measurement to study the feature selection correlations for chemosensitivity prediction of 118 anticancer agents with putatively known mechanisms of action. Three datasets on the NCI-60 were used in this study: two protein datasets, one previously studied for chemosensitivity prediction and another novel to this topic, and one DNA copy number dataset. To validate our approach, we identified the protein markers that our analysis found to be strongly correlated with the individual protein markers reported in previous studies. Our feature analysis discovered highly correlated protein marker pairs, based on which we found individual protein markers with medical significance. While some of the markers uncovered were consistent with those previously reported, others were original to this work. Using these marker pairs we were able to further correlate the cellular functions associated with them. As an exploratory analysis, we discovered feature selection correlation patterns between and within different drug mechanisms of action for each of our datasets. In conclusion, the highly correlated protein marker pairs and their functions found by our feature analysis are validated by previous studies and are shown to be medically significant, demonstrating for the first time that D’ is an effective measurement of correlation in the context of feature selection.
To obtain images of airframe damage regions and provide input data for intelligent aircraft maintenance, a multi-dimensional, multi-threshold airframe damage region division method based on correlation optimization is proposed. On the basis of airframe damage feature analysis, a multi-dimensional feature entropy is defined to fully fuse the multiple kinds of feature information in the image, and the division method is extended to multiple thresholds to refine the damage division and reduce the impact of morphological changes in the region adjacent to the damage. A correlation parameter optimization algorithm solves the low efficiency of the multi-dimensional multi-threshold division method. Finally, the proposed method is compared and verified on instances of airframe damage images. The results show that, compared with the traditional threshold division method, the damage region divided by the proposed method is complete and accurate, and its boundary is clear and coherent, which effectively reduces interference from factors such as uneven luminance, chromaticity deviation, dirt attachment, and image compression. The correlation optimization algorithm has high efficiency and stable convergence, and can meet the requirements of intelligent aircraft maintenance.
In this paper, we propose a product image retrieval method based on object contour corners, image texture and color. A product image mainly highlights the object, and its background is very simple. Given these characteristics, we represent the object by its contour and detect the corners of the contour to reduce the number of pixels. Every corner is described by its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and the color moments are extracted from the image's HIS color space. Finally, the dynamic time warping method is used to match features of different lengths. To demonstrate the effect of the proposed method, we carry out experiments on the Microsoft product image database and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
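Dynamic time warping, which the abstract uses to match feature sequences of different lengths, can be sketched with the classic dynamic-programming recurrence. The version below assumes scalar features (for example one curvature value per contour corner) and absolute difference as the local cost. Both are illustrative choices, and `dtw_distance` is a hypothetical name rather than the paper's implementation.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning the first
    i elements of a with the first j elements of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost of matching a[i-1], b[j-1]
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] matched to an earlier b element too
                cost[i][j - 1],      # b[j-1] matched to an earlier a element too
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]
```

Because one element may align with several elements of the other sequence, two contours with different numbers of detected corners can still be compared directly, for example `dtw_distance([1, 2, 3], [1, 2, 2, 3])`.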
The dosages of the 5 main reagents in a gold-antimony flotation process (Copper Sulfate, Lead Nitrate, Yellow Medicine, No.2 Oil, and Black Medicine) were recorded together with the corresponding visual features of the froth images (Stability, Gray Scale, Mean R, Mean G, Mean B, Mean Average, Dimension, and Degree Variance). Parameter correlation analysis showed that the correlation among Copper Sulfate, Yellow Medicine, and Black Medicine, as well as the correlation among Gray Scale, Mean R, Mean G, and Mean B, is strong, whereas the correlation among Dimension, Gray Scale, Mean R, Mean G, and Mean B, as well as the correlation between Stability and each dosing parameter, is weak. The analysis also indicates a feasible way to decrease the complexity of the flotation control system by reducing the number of parameters.
The geochemical composition characteristics of light oils from the Tertiary in the west of the Chepaizi uplift in the Junggar basin, northwest China, are distinct from those of biodegraded oils derived from the Permian in the study area and of crude oils from some adjacent oil fields such as the Chepaizi and Xiaoguai oilfields. Oil-source correlation shows that the light oils in the study area have n-alkane and isoprenoid distribution patterns and carbon isotope compositions similar to those of the coal-derived oils from the Jurassic, and display obvious discrepancies in biomarker composition characteristics from the Cretaceous source rock extracts, implying that they are probably mixed oils from the Jurassic coal measures and Cretaceous source rocks. In this study, combining the geochemical data of coal-derived oils from the Jurassic with those of Cretaceous source rocks or crude oils from the Cretaceous, the source and commingling features of the Tertiary crude oils of Well Pai 2 and Well Pai 8 were investigated. The proportion of the two sources in the mixed crude oils was estimated, and the hydrocarbon accumulation pattern of reservoirs in the study area was established.
The world produces vast quantities of high-dimensional multi-semantic data. However, extracting valuable information from such a large amount of high-dimensional, multi-label data is arduous and challenging. Feature selection aims to mitigate the adverse impacts of high dimensionality in multi-label data by eliminating redundant and irrelevant features. The ant colony optimization algorithm has shown encouraging results in multi-label feature selection because of its simplicity, efficiency, and similarity to reinforcement learning. Nevertheless, existing methods do not consider crucial correlation information, such as dynamic redundancy and label correlation. To address these concerns, the paper proposes a multi-label feature selection technique based on the ant colony optimization algorithm (MFACO), focusing on dynamic redundancy and label correlation. First, the dynamic redundancy between the selected feature subset and the candidate features is assessed. Meanwhile, the ant colony optimization algorithm extracts label correlation from the label set, which is then incorporated into the heuristic factor as label weights. Experimental results demonstrate that the proposed strategies effectively enhance the search ability of the ant colony and outperform the other algorithms compared in the paper.
Funding (metadata-association security defense paper): This work was supported by the National Natural Science Foundation of China (U2133208, U20A20161).
Funding (metals futures MF-DCCA paper): Project (13&ZD024) supported by the Major Program of the National Social Science Fund of China; Project (71073177) supported by the National Natural Science Foundation of China; Project (CX2012B107) supported by the Graduate Student Innovation Project of Hunan Province, China; Project (13YJAZH149) supported by the Social Science Fund of the Ministry of Education of China; Project (2011ZK2043) supported by the Key Program of the Soft Science Research Project of Hunan Province, China; Project (12JJ4077) supported by the Natural Science Foundation of Hunan Province, China.
Funding: Supported by the Youth Fund of the National Natural Science Foundation of China (No. 52304311), the National Natural Science Foundation of China (No. 52274282), and the Postdoctoral Fellowship Program of CPSF (No. GZC20233016).
Abstract: The fluidity of coal-water slurry (CWS) is crucial for various industrial applications such as long-distance transportation, gasification, and combustion. However, there is currently a lack of rapid and accurate detection methods for assessing CWS fluidity. This paper proposes a method for analyzing fluidity using videos of CWS dripping processes. By integrating the temporal and spatial features of each frame in the video, a multi-cascade classifier for CWS fluidity is established. The classifier distinguishes between four levels (A, B, C, and D) based on the quality of fluidity. The preliminary classification of A and D is achieved through feature engineering and the XGBoost algorithm. Subsequently, convolutional neural networks (CNN) and long short-term memory (LSTM) are used to further differentiate between the B and C categories, which are prone to confusion. Finally, through detailed comparative experiments, the paper demonstrates the step-by-step design process of the proposed method and the superiority of the final solution. The proposed method achieves an accuracy of over 90% in determining the fluidity of CWS, serving as a technical reference for future industrial applications.
Funding: Supported by the National Science Foundation of China (No. 61170145, 61373081), the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20113704110001), the Technology and Development Project of Shandong (No. 2013GGX10125), and the Taishan Scholar Project of Shandong, China.
Abstract: In the global information era, people acquire more and more information from the Internet, but the quality of search results is strongly degraded by the presence of web spam. Web spam is one of the serious problems for search engines, and many methods have been proposed for spam detection. We exploit the content features of non-spam pages in contrast to those of spam pages. The content features of non-spam pages always possess many statistical regularities, whereas those of spam pages possess very few, because spam pages are generated randomly in order to increase the page rank. In this paper, we summarize the regularity distributions of content features for non-spam pages and propose probability formulae for calculating the entropy and independent n-grams, respectively. Furthermore, we put forward calculation formulae for multi-feature correlation. Among them, the notable content features may be used as auxiliary information for spam detection.
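The entropy-based content feature mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulae, so the function below is only the standard Shannon entropy of a character n-gram distribution; the name `ngram_entropy` and the choice of character-level n-grams are assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_entropy(text: str, n: int = 2) -> float:
    """Shannon entropy (in bits) of the character n-gram distribution of `text`.

    Hypothetical helper: the paper's own entropy formula is not given in the
    abstract, so this is the textbook definition, not the authors' method.
    """
    # Slide a window of width n across the text to collect n-grams.
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    # H = -sum(p * log2(p)) over the observed n-gram probabilities.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(ngram_entropy("the quick brown fox jumps over the lazy dog", n=2))
```

A score like this would be one column in a feature table; how it separates spam from non-spam pages would have to be established empirically on labeled data.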
Funding: Funded by "the Fundamental Research Funds for the Central Universities", No. CUC23ZDTJ005.
Abstract: News media profiling is helpful in preventing the spread of fake news at the source and maintaining a good media and news ecosystem. Most previous works only extract features and evaluate media from one dimension independently, ignoring the interconnections between different aspects. This paper proposes a novel news media bias and factuality profiling framework assisted by correlated features. The framework models the relationship and interaction between media bias and factuality, and uses this relationship to assist in the prediction of profiling results. Our approach extracts features independently while aligning and fusing them through recursive convolution and attention mechanisms, thus harnessing multi-scale interactive information across different dimensions and levels. This method improves the effectiveness of news media evaluation. Experimental results indicate that the proposed framework significantly outperforms existing methods, achieving the best performance in Accuracy and F1 score, with an improvement of at least 1% over other methods. The paper further analyzes and discusses the experimental results.
Funding: Supported by the National Natural Science Foundation of China (40971275, 50811120111).
Abstract: Based on a spatio-temporal correlativity analysis method, automatic identification techniques for data anomaly monitoring of coal mining working face gas are presented. The asynchronous correlative characteristics of gas migration along the airflow direction of the working face are qualitatively analyzed. The calculation method for the asynchronous correlation delay step, and the prediction and inversion formulas for gas concentration changing with time and space after gas emission in the air return roadway, are provided. By calculating one hundred and fifty groups of theoretically correlated gas sensor data series from a coal mine, the range of correlation coefficient values for eight kinds of data anomaly is obtained. The gas monitoring data anomaly identification algorithm based on spatio-temporal correlativity analysis is then presented accordingly. In order to improve the efficiency of analysis, gas sensor coding rules that can express spatial topological relations are suggested. The experiments indicate that the methods presented in this article can effectively compensate for the defects of methods based on monitoring data from a single gas sensor.
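As a rough illustration of the "asynchronous correlation delay step" idea, the sketch below searches for the lag at which an upstream sensor series correlates best with a downstream one. The abstract does not state the authors' calculation, so a plain Pearson correlation over candidate lags is assumed here; `pearson`, `best_delay`, and the `max_lag` parameter are hypothetical names.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)


def best_delay(upstream, downstream, max_lag):
    """Return (lag, r): the delay step (in samples) that maximizes the
    correlation between upstream[t] and downstream[t + lag].

    Assumption: the delay is found by exhaustive search over 0..max_lag.
    """
    best_lag, best_r = 0, -2.0
    for lag in range(max_lag + 1):
        y = downstream[lag:]
        m = min(len(upstream), len(y))
        if m < 3:
            break  # too few overlapping samples to correlate
        r = pearson(upstream[:m], y[:m])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


if __name__ == "__main__":
    up = [0, 1, 4, 2, 7, 3, 5, 1, 0, 2]
    down = [0, 0] + up[:8]  # the same signal arriving two samples later
    print(best_delay(up, down, max_lag=4))
```

Once the delay is known, a correlation coefficient falling outside an expected range at that lag could serve as an anomaly flag, which is the role the abstract assigns to its coefficient ranges.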
Funding: National Natural Science Foundation of China, Grant/Award Number: 61867004; National Natural Science Foundation of China Youth Fund, Grant/Award Number: 41801288.
Abstract: Cross-Project Defect Prediction (CPDP) is a method that uses historical data from other source projects to train predictive models for defect prediction in the target project. However, existing CPDP methods only consider linear correlations between features (indicators) of the source and target projects. These models are not capable of evaluating non-linear correlations between features when they exist, for example, when there are differences in data distributions between the source and target projects. As a result, the performance of such CPDP models is compromised. This paper proposes a novel CPDP method based on the Synthetic Minority Oversampling Technique (SMOTE) and Deep Canonical Correlation Analysis (DCCA), referred to as S-DCCA. Canonical Correlation Analysis (CCA) is employed to address the issue of non-linear correlations between features of the source and target projects. S-DCCA extends CCA by incorporating the MlpNet model for feature extraction from the dataset. The redundant features are then eliminated by maximizing the correlated feature subset using the CCA loss function. Finally, cross-project defect prediction is achieved through the application of the SMOTE data sampling technique. Area Under Curve (AUC) and F1 score (F1) are used as evaluation metrics. Experiments were conducted on 27 projects from four public datasets to validate the proposed method. The results demonstrate that, on average, the method outperforms all baseline approaches by at least 1.2% in AUC and 5.5% in F1 score. This indicates that the proposed method exhibits favorable performance characteristics.
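The SMOTE step named in this abstract can be sketched in a few lines. This is the generic interpolation idea behind SMOTE (a synthetic sample is placed on the segment between a minority sample and one of its nearest minority neighbours), not the S-DCCA pipeline itself; the function name `smote` and the parameters `n_new`, `k`, and `seed` are illustrative assumptions.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic minority samples by interpolation.

    minority: list of equal-length numeric tuples (at least two samples).
    Assumption: Euclidean distance and uniform interpolation, as in the
    basic SMOTE formulation; the paper's exact settings are not stated.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    defective = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    for point in smote(defective, n_new=3):
        print(point)
```

In practice a maintained implementation such as the one in the imbalanced-learn package would normally be preferred over a hand-written version.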
Funding: Supported by the National Natural Science Foundation of Hainan (2018CXTD333, 617048), the National Natural Science Foundation of China (61762033, 61702539), the National Natural Science Foundation of Hunan (2018JJ3611), the Social Development Project of Public Welfare Technology Application of Zhejiang Province (LGF18F020019), the Hainan University Doctor Start Fund Project (kyqd1328), the Hainan University Youth Fund Project (qnjj1444), and State Key Laboratory of Marine Resource Utilization in South China Sea Funding.
Abstract: Distributed denial of service (DDoS) attacks are launched more and more frequently and are increasingly destructive. Feature representation, as an important part of DDoS defense technology, directly affects the efficiency of defense. Most DDoS feature extraction methods cannot fully use the information in the original data, so the extracted features lose useful information. In this paper, a DDoS feature representation method based on a deep belief network (DBN) is proposed. We quantify the original data by the size of the network flows, the distribution of IP addresses and ports, and the diversity of packet sizes of different protocols, and train the DBN in an unsupervised manner on these quantified values. Two feedforward neural networks (FFNN) are initialized from the trained deep belief network, and one of them continues to be trained in a supervised manner. The canonical correlation analysis (CCA) method is used to fuse, layer by layer, the features extracted by the two feedforward neural networks. Experiments show that, compared with other methods, the proposed method can extract better features.
Funding: This work is supported by the National Natural Science Foundation of China (No. 61772561), the Key Research & Development Plan of Hunan Province (No. 2018NK2012), the Science Research Projects of Hunan Provincial Education Department (Nos. 18A174, 18C0262), and the Science & Technology Innovation Platform and Talent Plan of Hunan Province (2017TP1022).
Abstract: Hashing technology reduces data storage and improves the efficiency of the learning system, which has made it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional methods that use a single view. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. In order to describe image content more accurately, we use the deep-learning dense convolutional network (DenseNet) feature and construct multiple views by combining it with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. The algorithm uses the RKCCA method to fuse multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
Funding: Supported by the National Key Research and Development Program of China (2018AAA0103203), the National Natural Science Foundation of China (62073036, 62076031), and the Beijing Natural Science Foundation (4202071).
Abstract: In recent visual tracking research, correlation filter (CF) based trackers have become popular because of their high speed and considerable accuracy. Previous methods mainly work on the extension of features and the solution of the boundary effect to learn a better correlation filter. However, the related studies are insufficient. By exploring the potential of trackers in these two aspects, this paper proposes a novel adaptive padding correlation filter (APCF) with feature group fusion for robust visual tracking, based on the popular context-aware tracking framework. In the tracker, three feature groups are fused using the weighted sum of the normalized response maps, to alleviate the risk of drift caused by an extreme change in a single feature. Moreover, to improve the adaptive ability of padding for filter training on different object shapes, the best padding is selected from a preset pool according to tracking precision over the whole video, where tracking precision is predicted by a model trained on the sequence features of the first several frames. The sequence features include three traditional features and eight newly constructed features. Extensive experiments demonstrate that the proposed tracker is superior to most state-of-the-art correlation filter based trackers and shows a stable improvement over the basic trackers.
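The fusion step described in this abstract, a weighted sum of normalized response maps, can be sketched as follows. The abstract does not say how the maps are normalized or how the weights are chosen, so min-max normalization and fixed weights are assumptions here, and the names `normalize`, `fuse`, and `peak` are illustrative.

```python
def normalize(resp):
    """Min-max normalize a 2-D response map (list of rows) to [0, 1].

    Assumption: min-max scaling; the paper's normalization is not stated.
    """
    flat = [v for row in resp for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in resp]  # flat map: no peak
    return [[(v - lo) / span for v in row] for row in resp]


def fuse(maps, weights):
    """Weighted sum of normalized response maps of identical shape."""
    normed = [normalize(m) for m in maps]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, normed)) for c in range(cols)]
        for r in range(rows)
    ]


def peak(resp):
    """(row, col) of the maximum response, taken as the target location."""
    cells = ((r, c) for r in range(len(resp)) for c in range(len(resp[0])))
    return max(cells, key=lambda rc: resp[rc[0]][rc[1]])


if __name__ == "__main__":
    group_a = [[0, 1], [2, 10]]
    group_b = [[5, 0], [0, 1]]
    group_c = [[0, 0], [0, 4]]
    fused = fuse([group_a, group_b, group_c], weights=[0.5, 0.2, 0.3])
    print(peak(fused))
```

Normalizing before summing keeps a feature group with a large raw response range from dominating the fused map, which matches the stated aim of limiting drift caused by a single feature.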
Abstract: High-precision GPS data observed on the northeast margin of the Qinghai-Xizang (Tibet) block and in the Sichuan-Yunnan GPS monitoring areas in 1991 (1993), 1999 and 2001 reveal that, before the Kunlun earthquake with Ms = 8.1 on November 14, 2001, the dynamic variation features of the horizontal movement-deformation field in the north and east marginal tectonic areas of the Qinghai-Xizang (Tibet) block were correlated. That is to say, under the general background of inherited movement, the movement intensities in the two areas weakened synchronously and the state of deformation changed when the great earthquake was impending. Analysis and study in connection with geological structures showed that, before the Kunlun Ms 8.1 earthquake, the correlated variations of movement-deformation on the boundaries of the Qinghai-Xizang (Tibet) block were related to the disturbing stress field caused by the extensive and rapid stress-strain accumulation in the late stage of large earthquake preparation. Owing to the occurrence of the large earthquake inside the block, the release of a large amount of strain energy, and the adjustment of the tectonic stress field, stress-strain may be further accumulated, or else released through rupture, at relevant structural positions in boundary zones where a larger amount of strain energy had accumulated (especially zones not penetrated by historical strong earthquake ruptures).
Abstract: The paper reveals that the variations in parameters such as u*, the scaling velocity, and θ*, the scaling temperature, during the various phases of the monsoon might be linked with subsynoptic features. The rise in u* is mainly connected with the presence of lower tropospheric cyclonic vorticity over a subsynoptic scale of the site. However, the variation in θ* is mainly linked with the various phases of the monsoon, and θ* shows a sharp rise in the presence of low-level convective cloud. In addition, correlation studies of u and u*, θv and θv*, and θv−θv0 and θv* are undertaken. The correlation between θv and θv* is poor; in the other two cases the correlations are good. Furthermore, u/u* shows good coefficient of variation values within the ξ range.
Funding: This research is supported by the National Natural Science Foundation of China under Grant Nos. 61862040, 61762060 and 61762059. The authors gratefully acknowledge the anonymous reviewers for their helpful comments and suggestions.
Abstract: Uncooperative or malicious nodes threaten the safety of the Internet of Vehicles (IoV) by destroying routing or data. To this end, researchers have designed node detection mechanisms and trust calculation algorithms based on different feature parameters of IoV, such as communication, data, and energy, to detect and evaluate vehicle nodes. However, it is difficult to assess the trust level of a vehicle node effectively from message forwarding, data consistency, and energy sufficiency alone. To resolve these problems, a novel mechanism and a new trust calculation model are proposed in this paper. First, a four-tuple method is adopted to qualitatively describe the various types of IoV nodes. Second, the behavioral features and correlations of the various nodes are analyzed based on route forwarding rate, data forwarding rate, and physical location. Third, double-layer detection feature parameters are designed that are able to detect both uncooperative and malicious nodes. Fourth, a node correlative detection model with a double-layer structure is established by combining the network layer and the perception layer. We then conducted simulation experiments to verify the accuracy and time cost of this detection method under different speed-rate topological conditions of IoV. The results show that, compared with methods that consider only energy or communication parameters, the proposed method has obvious advantages in detecting uncooperative and malicious IoV nodes. In particular, when the double detection feature parameters and the node correlative detection model are combined, detection accuracy is effectively improved and the calculation time of node detection is greatly reduced.
Abstract: To solve the problem of low tracker robustness under significant appearance changes in complex backgrounds, a novel moving target tracking method based on weighted fusion of hierarchical deep features and correlation filters is proposed. First, multi-layer features are extracted by a deep model pre-trained on massive object recognition datasets. The linearly separable features of the Relu3-1, Relu4-1 and Relu5-4 layers of VGG-Net-19 are especially suitable for target tracking. Then, correlation filters over the hierarchical convolutional features are learned to generate their correlation response maps. Finally, a novel weight adjustment approach is presented to fuse the response maps. The maximum of the final response map gives the location of the target. Extensive experiments on object tracking benchmark datasets demonstrate high robustness and recognition precision compared with several state-of-the-art trackers under different conditions.
Abstract: Finding effective cancer treatment is a challenge, because the sensitivity of the cancer stems from both intrinsic cellular properties and resistances acquired from prior treatment. Previous research has revealed individual protein markers that are significant to chemosensitivity prediction. Our goal is to find correlated protein markers which are collectively significant to chemosensitivity prediction, to complement the individual markers already reported. To do this, we used the D’ correlation measurement to study the feature selection correlations for chemosensitivity prediction of 118 anticancer agents with putatively known mechanisms of action. Three datasets on the NCI-60 were used in this study: two protein datasets, one previously studied for chemosensitivity prediction and another novel to this topic, and one DNA copy number dataset. To validate our approach, we identified the protein markers that our analysis found to be strongly correlated with the individual protein markers reported in previous studies. Our feature analysis discovered highly correlated protein marker pairs, based on which we found individual protein markers with medical significance. While some of the markers uncovered were consistent with those previously reported, others were original to this work. Using these marker pairs, we were able to further correlate the cellular functions associated with them. As an exploratory analysis, we discovered feature selection correlation patterns between and within different drug mechanisms of action for each of our datasets. In conclusion, the highly correlated protein marker pairs and their functions found by our feature analysis are validated by previous studies and are shown to be medically significant, demonstrating for the first time that D’ is an effective measurement of correlation in the context of feature selection.
Funding: Supported by the Aeronautical Science Foundation of China (No. 20151067003).
Abstract: In order to obtain the image of the airframe damage region and provide input data for aircraft intelligent maintenance, a multi-dimensional and multi-threshold airframe damage region division method based on correlation optimization is proposed. On the basis of airframe damage feature analysis, the multi-dimensional feature entropy is defined to realize the full fusion of multiple feature information of the image, and the division method is extended to multiple thresholds to refine the damage division and reduce the impact of morphological changes in the region adjacent to the damage. The low efficiency of the multi-dimensional multi-threshold division method is addressed through a correlation parameter optimization algorithm. Finally, the proposed method is compared and verified on instances of airframe damage images. The results show that, compared with the traditional threshold division method, the damage region divided by the proposed method is complete and accurate, and the boundary is clear and coherent, which can effectively reduce the interference of factors such as uneven luminance, chromaticity deviation, dirt attachment, and image compression. The correlation optimization algorithm has high efficiency and stable convergence, and can meet the requirements of aircraft intelligent maintenance.
Funding: Supported by the Major Program of National Natural Science Foundation of China (No. 70890080 and No. 70890083).
Abstract: In this paper, we propose a product image retrieval method based on object contour corners, image texture and color. A product image mainly highlights the object, and its background is very simple. According to these characteristics, we represent the object by its contour and detect the corners of the contour to reduce the number of pixels. Every corner is described by its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and the color moment are extracted from the image's HIS color space. Finally, the dynamic time warping method is used to match features of different lengths. In order to demonstrate the effect of the proposed method, we carry out experiments on the Microsoft product image database and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
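The matching step in this abstract relies on dynamic time warping (DTW) to compare feature sequences of different lengths. Below is a minimal sketch of the classic DTW recurrence on two numeric sequences, for example per-corner curvature values. The absolute difference as local cost and the function name `dtw_distance` are assumptions; the abstract does not specify the authors' cost function or any windowing constraint.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with
    b[:j]; the local cost is |a[i-1] - b[j-1]| (an assumed choice).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


if __name__ == "__main__":
    query = [1, 2, 3]
    candidate = [1, 2, 2, 3]  # same shape, one extra corner
    print(dtw_distance(query, candidate))
```

Because the warping path may repeat elements of either sequence, two contours with different corner counts can still be compared directly, which is the property the abstract uses.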
Funding: This work is supported by the Natural Science Foundation of China (Nos. 61621062, 61773407 and 61872408) and the Hunan Province Science Foundation of China (No. 2016JJ6136).
Abstract: The dosages of the 5 main reagents in the gold-antimony flotation process (Copper Sulfate, Lead Nitrate, Yellow Medicine, No.2 Oil, and Black Medicine) were recorded together with the corresponding visual features of the foam images (Stability, Gray Scale, Mean R, Mean G, Mean B, Mean Average, Dimension, and Degree Variance). Parameter correlation analysis showed that the correlation among Copper Sulfate, Yellow Medicine, and Black Medicine, as well as the correlation among Gray Scale, Mean R, Mean G, and Mean B, is strong, and that the correlation among Dimension, Gray Scale, Mean R, Mean G, and Mean B, as well as the correlation between Stability and each dosing parameter, is weak. The analysis also indicated a feasible way to decrease the complexity of the flotation control system by reducing the number of parameters.
Funding: Supported by the National Basic Research Program of China (2006CB202300).
Abstract: The geochemical composition characteristics of light oils from the Tertiary in the west of the Chepaizi uplift in the Junggar basin, northwest China, are distinct from those of biodegraded oils derived from the Permian in the study area and of crude oils from some adjacent oil fields such as the Chepaizi and Xiaoguai oilfields. Oil source correlation shows that light oils in the study area have n-alkane and isoprenoid distribution patterns and carbon isotope compositions similar to those of the coal-derived oils from the Jurassic, and display an obvious discrepancy in biomarker composition characteristics from the Cretaceous source rock extracts, inferring that they are probably mixed oils from the Jurassic coal measures and Cretaceous source rocks. In this study, combined with the geochemical data of coal-derived oils from the Jurassic and of Cretaceous source rocks or crude oils from the Cretaceous, the source and commingling features of the Tertiary crude oils of Well Pai 2 and Well Pai 8 were investigated. The proportion of the two sources in the mixed crude oils was estimated, and the hydrocarbon accumulation pattern of reservoirs in the study area was established.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62376089, 62302153, 62302154, 62202147) and the Key Research and Development Program of Hubei Province, China (Grant No. 2023BEB024).
Abstract: The world produces vast quantities of high-dimensional multi-semantic data. However, extracting valuable information from such a large amount of high-dimensional, multi-label data is arduous and challenging. Feature selection aims to mitigate the adverse impacts of high dimensionality in multi-label data by eliminating redundant and irrelevant features. The ant colony optimization algorithm has demonstrated encouraging outcomes in multi-label feature selection because of its simplicity, efficiency, and similarity to reinforcement learning. Nevertheless, existing methods do not consider crucial correlation information, such as dynamic redundancy and label correlation. To address these concerns, the paper proposes a multi-label feature selection technique based on the ant colony optimization algorithm (MFACO), focusing on dynamic redundancy and label correlation. First, the dynamic redundancy between the selected feature subset and the candidate features is assessed. Meanwhile, the ant colony optimization algorithm extracts label correlation from the label set, which is then incorporated into the heuristic factor as label weights. Experimental results demonstrate that the proposed strategies can effectively enhance the optimal search ability of the ant colony, outperforming the other algorithms compared in the paper.