Marine oil spill emulsions are difficult to recover, and the damage they cause to the environment is not easy to eliminate. The use of remote sensing to accurately identify oil spill emulsions is highly important for the protection of marine environments. However, the spectrum of oil emulsions changes with water content. Hyperspectral remote sensing and deep learning can use spectral and spatial information to identify different types of oil emulsions. Nonetheless, hyperspectral data can also cause information redundancy, reducing classification accuracy and efficiency, and even overfitting in machine learning models. To address these problems, an oil emulsion deep-learning identification model with spatial-spectral feature fusion is established, and feature bands that can distinguish between crude oil, seawater, water-in-oil emulsion (WO), and oil-in-water emulsion (OW) are filtered based on a standard deviation threshold-mutual information method. Using oil spill airborne hyperspectral data, we conducted identification experiments on oil emulsions in different background waters and under different spatial and temporal conditions, analyzed the transferability of the model, and explored the effects of feature band selection and spectral resolution on the identification of oil emulsions. The results show the following. (1) The standard deviation-mutual information feature selection method is able to effectively extract feature bands that can distinguish between WO, OW, oil slick, and seawater. The number of bands was reduced from 224 to 134 after feature selection on the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) data and from 126 to 100 on the S185 data. (2) With feature selection, the overall accuracy and Kappa of the identification results for the training area are 91.80% and 0.86, respectively, improved by 2.62% and 0.04, and the overall accuracy and Kappa of the identification results for the migration area are 86.53% and 0.80, respectively, improved by 3.45% and 0.05. (3) The oil emulsion identification model has a certain degree of transferability and can effectively identify oil spill emulsions in AVIRIS data from different times and locations, with an overall accuracy of more than 80%, a Kappa coefficient of more than 0.7, and an F1 score of 0.75 or more for each category. (4) As the spectral resolution decreases, the model yields different degrees of misclassification for areas with a mixed distribution of oil slick and seawater or of WO and OW. Based on the above experimental results, we demonstrate that the oil emulsion identification model with spatial-spectral feature fusion achieves a high accuracy rate in identifying oil emulsions using airborne hyperspectral data and can be applied to images under different spatial and temporal conditions. Furthermore, we elucidate the impact of factors such as spectral resolution and background water bodies on the identification process. These findings provide a new reference for future work on automated marine oil spill detection.
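The two-stage band filtering (a standard deviation threshold followed by mutual information ranking) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the histogram-based MI estimator, the threshold value, and the synthetic cube with two informative bands stand in for the paper's actual estimator and the AVIRIS/S185 data.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based MI estimate between one band (x) and class labels (y)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def select_bands(cube, labels, std_thresh=0.05, top_k=4):
    """cube: (pixels, bands); labels: (pixels,) class ids."""
    stds = cube.std(axis=0)
    candidates = np.where(stds > std_thresh)[0]           # std-dev threshold
    mi = [mutual_information(cube[:, b], labels) for b in candidates]
    order = np.argsort(mi)[::-1][:top_k]                  # rank by MI with classes
    return sorted(candidates[order].tolist())

rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=2000)             # WO, OW, oil slick, seawater
cube = rng.normal(0, 0.02, size=(2000, 8))         # 8 synthetic bands
cube[:, 2] += 0.3 * labels                         # informative, high-variance band
cube[:, 5] += 0.2 * (labels == 1)                  # informative band for one class
picked = select_bands(cube, labels, top_k=2)
print(picked)
```

Only bands whose variability clears the threshold are scored by MI, which is the redundancy-reduction idea the abstract describes.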
A novel hashing method based on multiple heterogeneous features is proposed to improve the accuracy of the image retrieval system. First, it leverages the imbalanced distribution of the similar and dissimilar samples in the feature space to boost the performance of each weak classifier in the asymmetric boosting framework. Then, the weak classifier based on a novel linear discriminant analysis (LDA) algorithm, which is learned from the subspace of heterogeneous features, is integrated into the framework. Finally, the proposed method deals with each bit of the code sequentially, utilizing the samples misclassified in each round in order to learn compact and balanced code. The heterogeneous information from different modalities can effectively complement each other, which leads to much higher performance. The experimental results on the two public benchmarks demonstrate that this method is superior to many state-of-the-art methods. In conclusion, the performance of the retrieval system can be improved with the help of multiple heterogeneous features and the compact hash codes learned by the imbalanced learning method.
The success of intelligent transportation systems relies heavily on accurate traffic prediction, in which how to model the underlying spatial-temporal information from traffic data has come under the spotlight. Most existing frameworks typically utilize separate modules for spatial and temporal correlation modeling. However, this stepwise pattern may limit the effectiveness and efficiency of spatial-temporal feature extraction and cause important information to be overlooked at some steps. Furthermore, modeling based on a given spatial adjacency graph (e.g., derived from geodesic distance or approximate connectivity) lacks sufficient guidance from prior information and may not reflect the actual interaction between nodes. To overcome those limitations, our paper proposes a spatial-temporal graph synchronous aggregation (STGSA) model to extract the localized and long-term spatial-temporal dependencies simultaneously. Specifically, a tailored graph aggregation method in the vertex domain is designed to extract spatial and temporal features in one graph convolution process. In each STGSA block, we devise a directed temporal correlation graph to represent the localized and long-term dependencies between nodes, and the potential temporal dependence is further fine-tuned by an adaptive weighting operation. Meanwhile, we construct an elaborated spatial adjacency matrix to represent the road sensor graph by considering both physical distance and node similarity in a data-driven manner. Then, inspired by the multi-head attention mechanism, which can jointly emphasize information from different representation subspaces, we construct a multi-stream module based on the STGSA blocks to capture global information. It projects the embedding input repeatedly with multiple different channels. Finally, the predicted values are generated by stacking several multi-stream modules. Extensive experiments are conducted on six real-world datasets, and numerical results show that the proposed STGSA model significantly outperforms the benchmarks.
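A single vertex-domain aggregation step, the building block the abstract describes, can be sketched as one graph-convolution pass. The row-normalized adjacency, the toy three-node chain, and the identity projection below are illustrative assumptions, not STGSA's learned adjacency or adaptive weighting.

```python
import numpy as np

def graph_aggregate(A, X, W):
    """One vertex-domain aggregation step: each node averages its
    neighbors' features (A_norm @ X) and projects them (... @ W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    A_norm = A_hat / deg                           # row-normalize
    return np.maximum(A_norm @ X @ W, 0.0)         # ReLU activation

# toy road-sensor graph: 3 nodes in a chain 0-1-2
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.array([[1.0, 0.0],    # per-node traffic features
              [0.0, 1.0],
              [1.0, 1.0]])
W = np.eye(2)                # identity projection, kept trivial for clarity
H = graph_aggregate(A, X, W)
print(H)
```

Stacking such steps over both the spatial graph and a directed temporal graph is what lets one convolution process mix spatial and temporal features.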
Due to the diversity and unpredictability of changes in malicious code, studying the traceability of variant families remains challenging. In this paper, we propose a GAN-EfficientNetV2-based method for tracing families of malicious code variants. This method leverages the similarity in layouts and textures between images of malicious code variants from the same source and their original family of malicious code images. The method includes a lightweight classifier and a simulator. The classifier utilizes the enhanced EfficientNetV2 to categorize malicious code images and can be easily deployed on mobile, embedded, and other devices. The simulator utilizes an enhanced generative adversarial network to simulate different variants of malicious code and generates datasets to validate the model's performance. This process helps identify model vulnerabilities and security risks, facilitating model enhancement and development. The classifier achieves 98.61% and 97.59% accuracy on the MMCC dataset and the Malevis dataset, respectively. The simulator's generated images of malicious code variants have an FID value of 155.44 and an IS value of 1.72±0.42. The classifier's accuracy for tracing the family of malicious code variants is as high as 90.29%, surpassing that of mainstream neural network models. This meets the current demand for high generalization and anti-obfuscation abilities in malicious code classification models arising from the rapid evolution of malicious code.
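The malicious-code images such classifiers consume are conventionally produced by reading a binary's bytes as grayscale pixels. Here is a hedged sketch of that conversion; the fixed width and zero-padding policy are assumptions for illustration, not necessarily the preprocessing used in this paper.

```python
import numpy as np

def bytes_to_image(payload: bytes, width: int = 16) -> np.ndarray:
    """Interpret each byte as one grayscale pixel and reshape into a
    fixed-width 2D image, zero-padding the final partial row."""
    buf = np.frombuffer(payload, dtype=np.uint8)
    rows = -(-len(buf) // width)                   # ceiling division
    img = np.zeros(rows * width, dtype=np.uint8)
    img[:len(buf)] = buf
    return img.reshape(rows, width)

sample = bytes(range(40))          # stand-in for a slice of an executable
img = bytes_to_image(sample, width=16)
print(img.shape)
```

Variants from one family tend to share byte layout, which is why their images share the textures the classifier exploits.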
With the growth of the Internet, more and more business is being done online, for example, online offices, online education, and so on. While this makes people's lives more convenient, it also increases the risk of the network being attacked by malicious code. Therefore, it is important to identify malicious code on computer systems efficiently. However, most existing malicious code detection methods have two problems: (1) the ability of the model to extract features is weak, resulting in poor model performance; (2) the large scale of model data leads to difficulties deploying on devices with limited resources. Therefore, this paper proposes a lightweight malicious code identification model, the Lightweight Malicious Code Classification Method Based on Improved SqueezeNet (LCMISNet). In this paper, the MFire lightweight feature extraction module is constructed by proposing a feature slicing module and a multi-size depthwise separable convolution module. The feature slicing module reduces the number of parameters by grouping features. The multi-size depthwise separable convolution module reduces the number of parameters and enhances the feature extraction capability by replacing the standard convolution with depthwise separable convolutions with different kernel sizes. In addition, this paper also proposes a feature splicing module that connects the MFire lightweight feature extraction modules based on feature reuse, and constructs the lightweight model LCMISNet. The malicious code recognition accuracy of LCMISNet on the BIG 2015 dataset and the Malimg dataset reaches 98.90% and 99.58%, respectively. This proves that LCMISNet has powerful malicious code recognition performance. In addition, compared with other network models, LCMISNet has better performance and a lower number of parameters and computations.
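The parameter saving that motivates replacing a standard convolution with a depthwise separable one can be verified with simple arithmetic; the channel counts and kernel size below are illustrative, not LCMISNet's actual configuration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel, then a 1 x 1
    pointwise convolution to mix channels."""
    return c_in * k * k + c_in * c_out

c_in, c_out, k = 64, 128, 3
std = conv_params(c_in, c_out, k)                     # 64*128*9  = 73728
sep = depthwise_separable_params(c_in, c_out, k)      # 576 + 8192 = 8768
print(std, sep, round(std / sep, 1))
```

At these sizes the separable form needs roughly an eighth of the weights, which is the lever a lightweight model like this pulls.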
In recent years, with the massive growth of image data, how to quickly and efficiently match the images required by users has become a challenge. Compared with single-view features, multi-view features describe image information more accurately. The advantages of hashing methods in reducing data storage and improving efficiency also motivate the study of how to apply them effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, this algorithm uses multi-view data with deeper-level image semantics to achieve better retrieval results. The algorithm uses a quantitative hash method to generate binary sequences and uses the hash codes generated from the associated features to construct inverted index files for the database, so as to reduce the memory burden and promote efficient matching. In order to reduce the matching error of hash codes and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
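The inverted multi-index idea, bucketing images by sub-codes of their hash so a query becomes a pair of dictionary probes rather than a linear scan, can be sketched as follows. The 8-bit codes split into two 4-bit sub-codes are a toy assumption; real systems use much longer codes and more index tables.

```python
from collections import defaultdict

def build_inverted_index(codes):
    """Two index tables, one per 4-bit sub-code of an 8-bit hash:
    each table maps a sub-code to the set of image ids holding it."""
    tables = (defaultdict(set), defaultdict(set))
    for img_id, code in enumerate(codes):
        tables[0][code >> 4].add(img_id)       # high-nibble index
        tables[1][code & 0xF].add(img_id)      # low-nibble index
    return tables

def query(tables, code):
    """Candidate set: anything matching either sub-code (union of probes),
    which tolerates a few flipped bits in one half of the code."""
    return tables[0][code >> 4] | tables[1][code & 0xF]

codes = [0b10100001, 0b10100111, 0b01010001]   # toy 8-bit hash codes
tables = build_inverted_index(codes)
print(sorted(query(tables, 0b10100001)))
```

Probing multiple shorter sub-indexes is what reduces the matching error of a single long-code lookup while keeping memory per table small.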
To solve the problems of the AMR-WB+ (Extended Adaptive Multi-Rate-WideBand) semi-open-loop coding mode selection algorithm, features for ACELP (Algebraic Code Excited Linear Prediction) and TCX (Transform Coded eXcitation) classification are investigated. Eleven classifying features in the AMR-WB+ codec are selected, and two novel classifying features, i.e., EFM (Energy Flatness Measurement) and stdEFM (standard deviation of EFM), are proposed. Consequently, a novel semi-open-loop mode selection algorithm based on EFM and selected AMR-WB+ features is proposed. The results of classification and listening tests show that the performance of the novel algorithm is much better than that of the AMR-WB+ semi-open-loop coding mode selection algorithm.
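EFM is defined in the paper itself; as a hedged illustration only, a flatness measurement in the same spirit as spectral flatness, the geometric mean of subframe energies over their arithmetic mean, behaves as below. The exact AMR-WB+ formulation may differ, so treat this as an analogy rather than the proposed feature.

```python
import math

def energy_flatness(energies):
    """Geometric mean / arithmetic mean of subframe energies:
    close to 1 for near-stationary (music-like) content,
    much smaller for transient, speech-like onsets."""
    n = len(energies)
    geo = math.exp(sum(math.log(e) for e in energies) / n)
    arith = sum(energies) / n
    return geo / arith

flat = energy_flatness([1.0, 1.1, 0.9, 1.0])       # near-stationary subframes
peaky = energy_flatness([0.05, 4.0, 0.05, 0.1])    # transient energy burst
print(round(flat, 3), round(peaky, 3))
```

A feature of this kind separates signals suited to TCX (stationary) from those suited to ACELP (transient), which is the classification the mode selector must make.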
To solve the problem that a single feature cannot exploit the multiple features of Android applications in malicious code detection, an Android malicious code detection mechanism based on integrated learning is proposed on the basis of dynamic and static detection. Considering three types of Android behavior characteristics, a three-layer hybrid algorithm was proposed, combined with malicious code detection based on digital signatures to improve detection efficiency. The digital signatures of known malicious code were extracted to form a malicious sample library. Permissions that can reflect Android malicious behavior, API calls, and runtime system call features were also extracted. An expandable hybrid discriminant algorithm was designed for the above three types of features. The algorithm was tested with machine learning methods by constructing the optimal classifier suitable for the above features. Finally, the Android malicious code detection system was designed and implemented based on the multi-layer hybrid algorithm. The experimental results show that the system performs Android malicious code detection based on the combination of signatures and dynamic and static features. Compared with other related work, the system has better performance in execution efficiency and detection rate.
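The layered discrimination described above, a decisive signature lookup first and then a decision over permission, API-call, and system-call features, can be sketched with stand-in rule-based classifiers. The real system trains an optimal classifier per feature type, so everything below (the rules, field names, and threshold) is a simplified assumption.

```python
def detect(app, signature_db, classifiers):
    """Layer 1: a known-malware signature match is decisive.
    Layer 2: majority vote of the per-feature-type classifiers."""
    if app["signature"] in signature_db:
        return "malicious"
    votes = sum(clf(app) for clf in classifiers)
    return "malicious" if votes > len(classifiers) // 2 else "benign"

signature_db = {"deadbeef"}                # library of known-malware signatures
classifiers = [
    lambda a: int("SEND_SMS" in a["permissions"]),     # permission rule (static)
    lambda a: int("getDeviceId" in a["api_calls"]),    # API-call rule (static)
    lambda a: int(a["syscall_rate"] > 100),            # system-call rule (dynamic)
]
app = {"signature": "cafebabe",
       "permissions": ["SEND_SMS", "INTERNET"],
       "api_calls": ["getDeviceId"],
       "syscall_rate": 20}
print(detect(app, signature_db, classifiers))
```

The signature layer keeps known samples cheap to reject, which is where the claimed efficiency gain comes from.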
How to identify topological entities when rebuilding features is a critical problem in Feature-Based Parametric Modeling Systems (FBPMS). In this article, the authors propose a new coding approach to distinguish different entities. The coding mechanism is explained, and some typical examples are presented. Finally, the decoding algorithm is put forward based on set theory.
Two signature systems based on smart cards and fingerprint features are proposed. In one signature system, the cryptographic key is stored in the smart card and is only accessible when the signer's extracted fingerprint features match his stored template. To resist tampering on a public channel, the user's message and the signed message are encrypted with the signer's public key and the user's public key, respectively. In the other signature system, the keys are generated by combining the signer's fingerprint features, check bits, and a memorable key, and there is no matching process or key storage on the smart card. Additionally, there is generally more than one public key in this system; that is, there exist some pseudo public keys besides the real one.
Northeast China, the most important production base for agriculture, forestry, and livestock breeding as well as the old industrial base of the whole country, has played a key role in the construction and development of China's economy. However, after the policy of reform and opening-up was adopted in China, the economic development speed and efficiency of this area became evidently lower than those of the coastal areas and the national average, the so-called 'Northeast Phenomenon' and 'Neo-Northeast Phenomenon'. In terms of those phenomena, this paper first reviews the spatial and temporal features of the regional evolution of this area so as to unveil the profound causes of the 'Northeast Phenomenon' and 'Neo-Northeast Phenomenon'. The paper then makes a further exploration of the status quo of this region and its causes by analyzing its gross economy, industrial structure, product structure, regional eco-categories, etc. At the end of the paper, the authors put forward basic coordinated development strategies for Northeast China, namely revitalizing the area by means of adjustment of economic structure, regional coordination, planning urban and rural areas as a whole, institutional innovation, etc.
An objective identification technique is used to detect regional extreme low temperature events (RELTE) in China during 1960-2009, and their spatial-temporal characteristics are analyzed. The results indicate that the lowest temperatures of RELTE, together with the frequency distribution of their geometric latitude centers, exhibit a double-peak feature. RELTE frequently occurred near 30°N and 42°N before the mid-1980s, but shifted afterwards to 30°N. During 1960-2009, the frequency, intensity, and maximum impacted area of RELTE show overall decreasing trends. Due to the contribution of RELTE with long duration and large spatial range, which account for 10% of the total RELTE, there is a significant turning point in the late 1980s, and a change to a much steadier state after the late 1990s is identified. In addition, the integrated indices of RELTE are classified and analyzed.
In recent years, the accuracy of speech recognition (SR) has been one of the most active areas of research. Although SR systems work reasonably well in quiet conditions, they still suffer severe performance degradation in noisy conditions or over distorted channels. It is necessary to search for more robust feature extraction methods to gain better performance in adverse conditions. This paper investigates the performance of conventional and new hybrid speech feature extraction algorithms, Mel Frequency Cepstrum Coefficients (MFCC), Linear Prediction Coding Coefficients (LPCC), perceptual linear prediction (PLP), and RASTA-PLP, in noisy conditions using a multivariate Hidden Markov Model (HMM) classifier. The behavior of the proposed system is evaluated using the TIDIGIT human voice dataset corpora, recorded from 208 different adult speakers, in both the training and testing processes. The theoretical basis for the speech processing and classifier procedures is presented, and the recognition results are reported in terms of word recognition rate.
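The front end shared by MFCC, LPCC, PLP, and RASTA-PLP begins with pre-emphasis, framing, and windowing. A minimal sketch of those first steps follows; the 25 ms/10 ms frame geometry and the 0.97 pre-emphasis coefficient are common defaults, not necessarily the paper's settings.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160, pre=0.97):
    """Pre-emphasis filter, then overlapping Hamming-windowed frames
    (400 samples / 160-sample hop is 25 ms / 10 ms at 16 kHz)."""
    x = np.append(x[0], x[1:] - pre * x[:-1])          # boost high frequencies
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)              # one window per row

rng = np.random.default_rng(1)
signal = rng.normal(size=16000)        # 1 s of stand-in 16 kHz audio
frames = frame_signal(signal)
print(frames.shape)
```

Each of the compared feature types is then computed per frame (cepstral coefficients, LP coefficients, or perceptually weighted spectra) from this common representation.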
To extract features of fabric defects effectively and reduce the dimension of the feature space, a feature extraction method for fabric defects based on the complex contourlet transform (CCT) and principal component analysis (PCA) is proposed. First, training samples of fabric defect images are decomposed by the CCT. Second, PCA is applied to the obtained low-frequency component and part of the high-frequency components to get a lower-dimensional feature space. Finally, components of testing samples obtained by the CCT are projected onto the feature space, where different types of fabric defects are distinguished by the minimum Euclidean distance method. A large number of experimental results show that, compared with PCA, the method combining the wavelet low-frequency component with PCA (WLPCA), the method combining the contourlet transform with PCA (CPCA), and the method combining wavelet low-frequency and high-frequency components with PCA (WPCA), the proposed method can extract features of common fabric defect types effectively. The recognition rate is greatly improved while the dimension is reduced.
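The PCA projection followed by minimum-Euclidean-distance classification can be sketched as below, with Gaussian synthetic features standing in for the CCT coefficients of two defect types; the component count and cluster parameters are assumptions for illustration.

```python
import numpy as np

def fit_pca(X, n_components=2):
    """Principal axes of the centered training features via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T            # (d, k) projection matrix

def classify(x, mean, proj, class_means):
    """Project a sample into the reduced space, then pick the class
    whose mean is nearest (minimum Euclidean distance rule)."""
    z = (x - mean) @ proj
    return min(class_means, key=lambda c: np.linalg.norm(z - class_means[c]))

rng = np.random.default_rng(0)
X_a = rng.normal(0.0, 0.1, size=(50, 6))        # stand-in CCT features, defect A
X_b = rng.normal(1.0, 0.1, size=(50, 6))        # stand-in CCT features, defect B
mean, proj = fit_pca(np.vstack([X_a, X_b]))
class_means = {c: ((X - mean) @ proj).mean(axis=0)
               for c, X in [("A", X_a), ("B", X_b)]}
print(classify(np.full(6, 1.0), mean, proj, class_means))
```

The dimensionality drops from 6 to 2 here; in the paper the same mechanism compresses the much larger CCT coefficient space.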
Stance detection is the task of identifying attitude toward a standpoint. Previous work on stance detection has focused on feature extraction but ignored the fact that irrelevant features act as noise during higher-level abstraction. Moreover, because the target is not always mentioned in the text, most methods have ignored target information. In order to solve these problems, we propose a neural network ensemble method that combines the temporal dependence modeling of long short-term memory (LSTM) with the excellent extraction performance of convolutional neural networks (CNNs). The method can obtain multi-level features that consider both local and global features. We also introduce attention mechanisms to magnify features related to target information. Furthermore, we employ sparse coding to remove noise and obtain characteristic features. Performance was improved by using sparse coding on the basis of attention employment and feature extraction. We evaluate our approach on the SemEval-2016 Task 6-A public dataset, achieving a performance that exceeds the benchmark and those of participating teams.
The Wake-Up-Word Speech Recognition task (WUW-SR) is computationally very demanding, particularly the feature extraction stage, whose output is decoded with corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR. The state-of-the-art WUW-SR system is based on three different sets of features: Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPC), and Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC). In (Front-End of Wake-Up-Word Speech Recognition System Design on FPGA) [1], we presented an experimental FPGA design and implementation of a novel architecture of a real-time spectrogram extraction processor that generates MFCC, LPC, and ENH_MFCC spectrograms simultaneously. In this paper, the details of converting the three sets of spectrograms, 1) Mel-Frequency Cepstral Coefficients (MFCC), 2) Linear Predictive Coding Coefficients (LPC), and 3) Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC), to their equivalent features are presented. In the WUW-SR system, the recognizer's front-end is located at the terminal, which is typically connected over a data network to the remote back-end recognition (e.g., server). The WUW-SR is shown in Figure 1. The three sets of speech features are extracted at the front-end. These extracted features are then compressed and transmitted to the server via a dedicated channel, where they are subsequently decoded.
This paper addresses an important issue in model combination, that is, model locality. Since a global linear model is usually unable to reflect nonlinearity and to characterize local features, especially in a complex system, we propose a mixture of local feature models to overcome these weaknesses. The basic idea is to split the entire input space into operating domains, and a recently developed feature-based model combination method is applied to build local models for each region. To realize this idea, three steps are required: clustering, local modeling, and model combination, governed by a single objective function. An adaptive fuzzy parametric clustering algorithm is proposed to divide the whole input space into operating regimes, local feature models are created in each individual region by applying the recently developed feature-based model combination method, and finally they are combined into a single mixture model. Correspondingly, a three-stage procedure is designed to optimize the complete objective function, which is actually a hybrid Genetic Algorithm (GA). Our simulation results show that the adaptive fuzzy mixture of local feature models turns out to be superior to global models.
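The three-step idea, cluster the input space into regimes, fit a local model per regime, blend predictions by membership, can be sketched with Gaussian fuzzy memberships and fixed centers in place of the paper's adaptive fuzzy clustering and GA optimization; all values below are illustrative assumptions.

```python
import numpy as np

def membership(x, centers, width=0.5):
    """Normalized Gaussian fuzzy membership of x in each operating regime."""
    w = np.exp(-((x - centers) ** 2) / (2 * width ** 2))
    return w / w.sum()

def mixture_predict(x, centers, local_models):
    """Blend local linear predictions by fuzzy membership weight."""
    w = membership(x, centers)
    preds = np.array([a * x + b for a, b in local_models])
    return float(w @ preds)

# piecewise-linear target: slope +1 near x=0, slope -1 near x=2
centers = np.array([0.0, 2.0])                 # regime centers
local_models = [(1.0, 0.0), (-1.0, 4.0)]       # y = x and y = -x + 4
print(mixture_predict(0.0, centers, local_models),
      mixture_predict(2.0, centers, local_models))
```

Near each center one local model dominates, while between regimes the memberships interpolate smoothly, which is how a mixture captures nonlinearity that a single global linear model cannot.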
Funding: The National Natural Science Foundation of China under contract Nos 61890964 and 42206177; the Joint Funds of the National Natural Science Foundation of China under contract No. U1906217.
Funding: The National Natural Science Foundation of China (No. 61305058); the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB520003); the Natural Science Foundation of Jiangsu Province (No. BK20130471); the Scientific Research Foundation for Advanced Talents of Jiangsu University (No. 13JDG093).
Funding: partially supported by the National Key Research and Development Program of China (2020YFB2104001).
Abstract: The success of intelligent transportation systems relies heavily on accurate traffic prediction, in which how to model the underlying spatial-temporal information from traffic data has come under the spotlight. Most existing frameworks typically utilize separate modules for spatial and temporal correlation modeling. However, this stepwise pattern may limit the effectiveness and efficiency of spatial-temporal feature extraction and cause important information to be overlooked at some steps. Furthermore, modeling based on a given spatial adjacency graph (e.g., derived from the geodesic distance or approximate connectivity) lacks sufficient guidance from prior information and may not reflect the actual interaction between nodes. To overcome these limitations, our paper proposes a spatial-temporal graph synchronous aggregation (STGSA) model to extract the localized and long-term spatial-temporal dependencies simultaneously. Specifically, a tailored graph aggregation method in the vertex domain is designed to extract spatial and temporal features in one graph convolution process. In each STGSA block, we devise a directed temporal correlation graph to represent the localized and long-term dependencies between nodes, and the potential temporal dependence is further fine-tuned by an adaptive weighting operation. Meanwhile, we construct an elaborated spatial adjacency matrix to represent the road sensor graph by considering both physical distance and node similarity in a data-driven manner. Then, inspired by the multi-head attention mechanism, which can jointly emphasize information from different representation subspaces, we construct a multi-stream module based on the STGSA blocks to capture global information. It projects the embedding input repeatedly with multiple different channels. Finally, the predicted values are generated by stacking several multi-stream modules. Extensive experiments are conducted on six real-world datasets, and numerical results show that the proposed STGSA model significantly outperforms the benchmarks.
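The "elaborated spatial adjacency matrix" combining physical distance and data-driven node similarity could be sketched as follows. This is a minimal illustration, not the authors' exact construction: the Gaussian distance kernel, cosine similarity over historical readings, and the blending weight `alpha` are all assumptions for demonstration.

```python
import math

def gaussian_distance_weight(dist_km, sigma=10.0):
    """Gaussian kernel over the physical distance between two sensors."""
    return math.exp(-(dist_km ** 2) / (2 * sigma ** 2))

def cosine_similarity(a, b):
    """Data-driven similarity between two sensors' historical readings."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_adjacency(coords_km, histories, alpha=0.5):
    """Blend distance-based and similarity-based weights into one matrix."""
    n = len(coords_km)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                adj[i][j] = 1.0  # self-loop
                continue
            d = math.dist(coords_km[i], coords_km[j])
            adj[i][j] = (alpha * gaussian_distance_weight(d)
                         + (1 - alpha) * cosine_similarity(histories[i], histories[j]))
    return adj
```

With `alpha=1` this degenerates to a purely geodesic graph; with `alpha=0` it is purely data-driven, which is the trade-off the abstract motivates.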
Funding: This work is supported by the Key Research and Development Program of Heilongjiang Province under Grant No. 2023ZX02C10.
Abstract: Due to the diversity and unpredictability of changes in malicious code, studying the traceability of variant families remains challenging. In this paper, we propose a GAN-EfficientNetV2-based method for tracing families of malicious code variants. This method leverages the similarity in layouts and textures between images of malicious code variants from the same source and their original family of malicious code images. The method includes a lightweight classifier and a simulator. The classifier utilizes the enhanced EfficientNetV2 to categorize malicious code images and can be easily deployed on mobile, embedded, and other devices. The simulator utilizes an enhanced generative adversarial network to simulate different variants of malicious code and generates datasets to validate the model's performance. This process helps identify model vulnerabilities and security risks, facilitating model enhancement and development. The classifier achieves 98.61% and 97.59% accuracy on the MMCC dataset and the Malevis dataset, respectively. The simulator's generated images of malicious code variants have an FID value of 155.44 and an IS value of 1.72±0.42. The classifier's accuracy in tracing the family of malicious code variants is as high as 90.29%, surpassing that of mainstream neural network models. This meets the current demand for high generalization and anti-obfuscation abilities in malicious code classification models due to the rapid evolution of malicious code.
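Image-based malware families such as those above rest on converting a binary into a grayscale matrix. A minimal sketch of that conversion follows; the fixed row width of 256 and zero-padding of the last row are common conventions in malware-image work, not necessarily this paper's exact preprocessing.

```python
def bytes_to_grayscale(data: bytes, width: int = 256):
    """Pack raw binary content into a fixed-width grayscale matrix (0-255).

    Each byte becomes one pixel intensity; the final partial row is
    zero-padded. Variants of the same family tend to yield visually
    similar layouts and textures in this representation.
    """
    rows = []
    for i in range(0, len(data), width):
        row = list(data[i:i + width])
        row += [0] * (width - len(row))  # pad the final partial row
        rows.append(row)
    return rows
```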
Abstract: With the growth of the Internet, more and more business is being done online, for example, online offices, online education, and so on. While this makes people's lives more convenient, it also increases the risk of the network being attacked by malicious code. Therefore, it is important to identify malicious code on computer systems efficiently. However, most existing malicious code detection methods have two problems: (1) the ability of the model to extract features is weak, resulting in poor model performance; (2) the large scale of model data makes deployment difficult on devices with limited resources. Therefore, this paper proposes a lightweight malicious code identification model, the Lightweight Malicious Code Classification Method Based on Improved SqueezeNet (LCMISNet). In this paper, the MFire lightweight feature extraction module is constructed by proposing a feature slicing module and a multi-size depthwise separable convolution module. The feature slicing module reduces the number of parameters by grouping features. The multi-size depthwise separable convolution module reduces the number of parameters and enhances the feature extraction capability by replacing the standard convolution with depthwise separable convolutions with different convolution kernel sizes. In addition, this paper also proposes a feature splicing module to connect the MFire lightweight feature extraction modules based on feature reuse and constructs the lightweight model LCMISNet. The malicious code recognition accuracy of LCMISNet on the BIG 2015 dataset and the Malimg dataset reaches 98.90% and 99.58%, respectively. This proves that LCMISNet has powerful malicious code recognition performance. In addition, compared with other network models, LCMISNet has better performance with fewer parameters and computations.
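The parameter saving from replacing standard convolution with depthwise separable convolution, which underlies the MFire module's lightweight design, can be verified with simple counting arithmetic (biases omitted; the concrete channel sizes below are illustrative, not the paper's):

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel
    mixes all input channels across the full k x k window."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise
    convolution to mix channels."""
    return k * k * c_in + c_in * c_out
```

For a 3x3 kernel with 64 input and 128 output channels, the standard form needs 73,728 weights while the separable form needs 8,768, roughly an 8.4x reduction, which is why stacking several kernel sizes remains affordable.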
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61772561; in part by the Key Research and Development Plan of Hunan Province under Grants 2018NK2012 and 2019SK2022; in part by the Science Research Projects of Hunan Provincial Education Department under Grants 18A174 and 19B584; in part by the Degree & Postgraduate Education Reform Project of Hunan Province under Grant 2019JGYB154; in part by the Postgraduate Excellent Teaching Team Project of Hunan Province under Grant [2019]370-133; in part by the Postgraduate Education and Teaching Reform Project of Central South University of Forestry & Technology under Grant 2019JG013; and in part by the Natural Science Foundation of Hunan Province under Grants 2020JJ4140 and 2020JJ4141.
Abstract: In recent years, with the massive growth of image data, how to match the images required by users quickly and efficiently has become a challenge. Compared with single-view features, multi-view features describe image information more accurately. The advantages of hashing in reducing data storage and improving efficiency also motivate the study of how to apply it effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, this algorithm uses multi-view data with deeper-level image semantics to achieve better retrieval results. The algorithm uses a quantitative hash method to generate binary sequences and uses the hash codes generated from the associated features to construct inverted index files for the database, so as to reduce the memory burden and promote efficient matching. In order to reduce the matching error of hash codes and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
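The inverted multi-index idea can be sketched as follows: a binary hash code is split into equal-width substrings, each substring position gets its own inverted table, and a query retrieves candidates that match it exactly in at least one substring. The 32-bit code length and 4-way split below are illustrative assumptions, not the paper's parameters.

```python
from collections import defaultdict

CHUNKS, BITS = 4, 32

def split_code(code: int):
    """Split a binary hash code into CHUNKS equal-width substrings."""
    w = BITS // CHUNKS
    mask = (1 << w) - 1
    return [(code >> (w * i)) & mask for i in range(CHUNKS)]

def build_inverted_index(codes):
    """One inverted table per substring position: substring -> item ids."""
    tables = [defaultdict(set) for _ in range(CHUNKS)]
    for item_id, code in enumerate(codes):
        for pos, sub in enumerate(split_code(code)):
            tables[pos][sub].add(item_id)
    return tables

def query(tables, code):
    """Candidates sharing at least one substring with the query; by the
    pigeonhole principle this covers every item within Hamming distance
    CHUNKS - 1 of the query code."""
    cands = set()
    for pos, sub in enumerate(split_code(code)):
        cands |= tables[pos].get(sub, set())
    return cands
```

The pigeonhole guarantee is what lets the multi-index structure cut matching error relative to a single exact-match table.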
Abstract: To solve the problems of the AMR-WB+ (Extended Adaptive Multi-Rate-WideBand) semi-open-loop coding mode selection algorithm, features for ACELP (Algebraic Code Excited Linear Prediction) and TCX (Transform Coded eXcitation) classification are investigated. Eleven classifying features in the AMR-WB+ codec are selected, and two novel classifying features, i.e., EFM (Energy Flatness Measurement) and stdEFM (standard deviation of EFM), are proposed. Consequently, a novel semi-open-loop mode selection algorithm based on EFM and the selected AMR-WB+ features is proposed. The results of classification tests and listening tests show that the performance of the novel algorithm is much better than that of the AMR-WB+ semi-open-loop coding mode selection algorithm.
Abstract: To solve the problem that a single feature cannot exploit the multiple features of an Android application in malicious code detection, an Android malicious code detection mechanism based on ensemble learning is proposed on the basis of dynamic and static detection. Considering three types of Android behavior characteristics, a three-layer hybrid algorithm was proposed, combined with malicious code detection based on digital signatures to improve detection efficiency. The digital signatures of known malicious code were extracted to form a malicious sample library. The permissions that can reflect Android malicious behavior, API calls, and runtime system call features were also extracted. An expandable hybrid discriminant algorithm was designed for the above three types of features. The algorithm was tested with machine learning methods by constructing the optimal classifier suitable for these features. Finally, the Android malicious code detection system was designed and implemented based on the multi-layer hybrid algorithm. The experimental results show that the system performs Android malicious code detection based on the combination of signatures with dynamic and static features. Compared with other related work, the system has better performance in execution efficiency and detection rate.
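The signature-library first stage described above can be sketched as a simple digest lookup. The paper does not specify its signature format; SHA-256 stands in here as an assumption, and the `screen_samples` helper is hypothetical.

```python
import hashlib

def file_signature(payload: bytes) -> str:
    """Digest of the sample's raw content, used as its signature."""
    return hashlib.sha256(payload).hexdigest()

def screen_samples(samples, malicious_signatures):
    """First-stage filter: samples whose signature is already in the
    malicious library are flagged immediately; the rest would continue
    to the static/dynamic feature classifiers."""
    known, unknown = [], []
    for name, payload in samples.items():
        if file_signature(payload) in malicious_signatures:
            known.append(name)
        else:
            unknown.append(name)
    return known, unknown
```

Skipping feature extraction for already-known samples is what makes the signature layer improve overall detection efficiency.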
Abstract: How to identify topological entities during feature rebuilding is a critical problem in Feature-Based Parametric Modeling Systems (FBPMS). In this article, the authors propose a new coding approach to distinguish different entities. The coding mechanism is expounded, and some typical examples are presented. Finally, a decoding algorithm is put forward based on set theory.
Funding: This project was supported by the National Science Foundation of China (60763009), the China Postdoctoral Science Foundation (2005038041), and the Hainan Natural Science Foundation (80528).
Abstract: Two signature systems based on smart cards and fingerprint features are proposed. In one signature system, the cryptographic key is stored in the smart card and is only accessible when the signer's extracted fingerprint features match his stored template. To resist tampering on a public channel, the user's message and the signed message are encrypted with the signer's public key and the user's public key, respectively. In the other signature system, the keys are generated by combining the signer's fingerprint features, check bits, and a rememberable key, and there is no matching process or key storage on the smart card. Additionally, there is generally more than one public key in this system; that is, there exist some pseudo public keys in addition to the real one.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40471040).
Abstract: Northeast China, as the most important production base of agriculture, forestry, and livestock breeding as well as the old industrial base of the whole country, has been playing a key role in the construction and development of China's economy. However, after the policy of reform and opening-up was adopted in China, the economic development speed and efficiency of this area turned out to be evidently lower than those of the coastal area and the national average level, which is the so-called 'Northeast Phenomenon' and 'Neo-Northeast Phenomenon'. In terms of these phenomena, this paper first reviews the spatial and temporal features of the regional evolution of this area so as to unveil the profound causes of the 'Northeast Phenomenon' and 'Neo-Northeast Phenomenon'. The paper then makes a further exploration into the status quo of this region and its causes by analyzing its gross economy, industrial structure, product structure, regional eco-categories, etc. At the end of the paper, the authors put forward basic coordinated development strategies for Northeast China; namely, the area can be revitalized by means of adjustment of the economic structure, regional coordination, planning urban and rural areas as a whole, institutional innovation, etc.
基金supported by the Special Scientific Research Projects for Public Interest(No.GYHY201006021 and GYHY201106016)the National Natural Science Foundation of China(No.41205040 and 40930952)
Abstract: An objective identification technique is used to detect regional extreme low temperature events (RELTE) in China during 1960-2009, and their spatial-temporal characteristics are analyzed. The results indicate that the lowest temperatures of RELTE, together with the frequency distribution of the geometric latitude center, exhibit a double-peak feature. RELTE frequently occurred near 30°N and 42°N before the mid-1980s, but shifted afterwards to 30°N. During 1960-2009, the frequency, intensity, and maximum impacted area of RELTE show overall decreasing trends. Due to the contribution of RELTE with long duration and large spatial range, which account for 10% of the total RELTE, there is a significant turning point in the late 1980s. A change to a much more steady state after the late 1990s is identified. In addition, the integrated indices of RELTE are classified and analyzed.
Abstract: In recent years, the accuracy of speech recognition (SR) has been one of the most active areas of research. Although SR systems work reasonably well in quiet conditions, they still suffer severe performance degradation in noisy conditions or over distorted channels. It is necessary to search for more robust feature extraction methods to gain better performance in adverse conditions. This paper investigates the performance of conventional and new hybrid speech feature extraction algorithms, namely Mel Frequency Cepstral Coefficient (MFCC), Linear Prediction Coding Coefficient (LPCC), perceptual linear prediction (PLP), and RASTA-PLP, in noisy conditions using a multivariate Hidden Markov Model (HMM) classifier. The behavior of the proposed system is evaluated using the TIDIGIT human voice dataset corpora, recorded from 208 different adult speakers, in both the training and testing processes. The theoretical basis for the speech processing and classifier procedures is presented, and the recognition results are reported in terms of word recognition rate.
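All four feature families above share the same first front-end steps, which can be sketched briefly; the 0.97 pre-emphasis coefficient and the 25 ms / 10 ms framing at 16 kHz are conventional defaults, not values taken from this paper.

```python
def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies before
    spectral analysis."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len=400, hop=160):
    """Slice the signal into overlapping frames
    (400 samples = 25 ms, 160 samples = 10 ms hop at 16 kHz)."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]
```

From these frames, MFCC, LPCC, PLP, and RASTA-PLP then diverge in how they model the per-frame spectrum.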
Funding: Supported by the National Natural Science Foundation of China (No. 60872065), the Key Laboratory of Textile Science & Technology, Ministry of Education, China (No. P1111), the Key Laboratory of Advanced Textile Materials and Manufacturing Technology, Ministry of Education, China (No. 2010001), and the Priority Academic Program Development of Jiangsu Higher Education Institutions, China.
Abstract: To extract features of fabric defects effectively and reduce the dimension of the feature space, a feature extraction method for fabric defects based on the complex contourlet transform (CCT) and principal component analysis (PCA) is proposed. Firstly, training samples of fabric defect images are decomposed by CCT. Secondly, PCA is applied to the obtained low-frequency component and part of the high-frequency components to get a lower-dimensional feature space. Finally, components of testing samples obtained by CCT are projected onto the feature space, where different types of fabric defects are distinguished by the minimum Euclidean distance method. A large number of experimental results show that, compared with PCA, the method combining the wavelet low-frequency component with PCA (WLPCA), the method combining the contourlet transform with PCA (CPCA), and the method combining wavelet low-frequency and high-frequency components with PCA (WPCA), the proposed method can extract features of common fabric defect types effectively. The recognition rate is greatly improved while the dimension is reduced.
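The final classification step named above, the minimum Euclidean distance method, is straightforward to sketch: each defect class is represented by a centroid in the projected feature space, and a test feature takes the label of its nearest centroid. The class names and two-dimensional features below are purely illustrative.

```python
import math

def min_distance_classify(feature, class_centroids):
    """Assign the label whose centroid is nearest in Euclidean distance."""
    best_label, best_dist = None, float("inf")
    for label, centroid in class_centroids.items():
        d = math.dist(feature, centroid)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```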
Funding: This work is supported by the Fundamental Research Funds for the Central Universities (Grant No. 2572019BH03).
Abstract: Stance detection is the task of identifying attitude toward a standpoint. Previous work on stance detection has focused on feature extraction but ignored the fact that irrelevant features act as noise during higher-level abstraction. Moreover, because the target is not always mentioned in the text, most methods have ignored target information. In order to solve these problems, we propose a neural network ensemble method that combines the timing dependence of long short-term memory (LSTM) networks with the excellent extraction performance of convolutional neural networks (CNNs). The method can obtain multi-level features that consider both local and global features. We also introduce attention mechanisms to magnify target-related features. Furthermore, we employ sparse coding to remove noise and obtain characteristic features. Performance was improved by using sparse coding on the basis of attention employment and feature extraction. We evaluate our approach on the SemEval-2016 Task 6-A public dataset, achieving a performance that exceeds the benchmark and those of the participating teams.
Abstract: The Wake-Up-Word Speech Recognition task (WUW-SR) is computationally very demanding, particularly the stage of feature extraction, which is decoded with corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR. The state-of-the-art WUW-SR system is based on three different sets of features: Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPC), and Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC). In "Front-End of Wake-Up-Word Speech Recognition System Design on FPGA" [1], we presented an experimental FPGA design and implementation of a novel architecture for a real-time spectrogram extraction processor that generates MFCC, LPC, and ENH_MFCC spectrograms simultaneously. In this paper, the details of converting the three sets of spectrograms, 1) Mel-Frequency Cepstral Coefficients (MFCC), 2) Linear Predictive Coding Coefficients (LPC), and 3) Enhanced Mel-Frequency Cepstral Coefficients (ENH_MFCC), to their equivalent features are presented. In the WUW-SR system, the recognizer's front-end is located at the terminal, which is typically connected over a data network to a remote back-end recognizer (e.g., a server). The WUW-SR is shown in Figure 1. The three sets of speech features are extracted at the front-end. These extracted features are then compressed and transmitted to the server via a dedicated channel, where they are subsequently decoded.
Abstract: This paper addresses an important issue in model combination, that is, model locality. Since a global linear model is usually unable to reflect nonlinearity and to characterize local features, especially in a complex system, we propose a mixture of local feature models to overcome these weaknesses. The basic idea is to split the entire input space into operating domains, and a recently developed feature-based model combination method is applied to build local models for each region. To realize this idea, three steps are required: clustering, local modeling, and model combination, governed by a single objective function. An adaptive fuzzy parametric clustering algorithm is proposed to divide the whole input space into operating regimes, local feature models are created in each individual region by applying a recently developed feature-based model combination method, and finally they are combined into a single mixture model. Correspondingly, a three-stage procedure is designed to optimize the complete objective function, which is actually a hybrid Genetic Algorithm (GA). Our simulation results show that the adaptive fuzzy mixture of local feature models turns out to be superior to global models.
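The core mixture idea, local models blended by soft regime memberships, can be sketched in one dimension. The Gaussian membership function, the linear form `y = a*x + b` for local models, and the sharpness parameter `beta` are illustrative assumptions; the paper's actual clustering and combination are learned via the hybrid GA.

```python
import math

def membership(x, centers, beta=1.0):
    """Soft (fuzzy) weights for each operating regime, normalized to
    sum to 1; larger beta gives sharper regime boundaries."""
    w = [math.exp(-beta * (x - c) ** 2) for c in centers]
    s = sum(w)
    return [v / s for v in w]

def mixture_predict(x, centers, local_models, beta=1.0):
    """Blend local linear models y = a*x + b by regime membership."""
    weights = membership(x, centers, beta)
    return sum(w * (a * x + b)
               for w, (a, b) in zip(weights, local_models))
```

Near a regime center the prediction follows that regime's local model; between centers the output interpolates smoothly, which is how the mixture captures nonlinearity a single global linear model cannot.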