In recent years, many unknown protocols have been emerging, bringing severe challenges to network security and network management. Existing unknown protocol recognition methods suffer from weak feature extraction ability and cannot thoroughly mine the discriminating features of protocol data. To address this issue, we propose an unknown application layer protocol recognition method based on deep clustering. Deep clustering, which combines a deep neural network with a clustering algorithm, can automatically extract features from the input and cluster the data based on the extracted features; compared with traditional clustering methods, it achieves higher clustering accuracy. The proposed method uses network-in-network (NIN), channel attention, spatial attention, and Bidirectional Long Short-Term Memory (BLSTM) to construct an autoencoder that extracts the spatial-temporal features of protocol data, and uses an unsupervised clustering algorithm to recognize unknown protocols based on these features. The method first extracts application layer protocol data from the network traffic and transforms the data into a one-dimensional matrix. Second, the autoencoder is pretrained, the protocol data is compressed into a low-dimensional latent space by the autoencoder, and initial clustering is performed with K-Means. Finally, the clustering loss is calculated and the classification model is optimized according to this loss; the classification results are obtained when the model is optimal. Compared with existing unknown protocol recognition methods, the proposed method applies deep clustering to unknown protocols, mining the key features of protocol data and recognizing unknown protocols accurately. Experimental results show that the proposed method can effectively recognize unknown protocols and outperforms other methods. Funding: this work is supported by the National Key R&D Program of China (2017YFB0802900).
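For readers unfamiliar with the pipeline this abstract describes (pretrain an autoencoder, initialize clusters with K-Means in the latent space, then refine the encoder with a clustering loss), the following minimal PyTorch sketch illustrates the idea. The autoencoder here is a plain MLP stand-in for the paper's NIN/attention/BLSTM architecture, and the soft-assignment and target-distribution loss follows the widely used DEC formulation; all layer sizes and the input data are illustrative assumptions.

```python
# Minimal DEC-style deep clustering sketch (PyTorch). The paper's
# NIN/attention/BLSTM autoencoder is replaced by a toy MLP; the
# clustering loss is the standard DEC KL(P || Q) refinement.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.cluster import KMeans

class AutoEncoder(nn.Module):
    def __init__(self, in_dim=784, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def soft_assign(z, centers, alpha=1.0):
    # Student-t kernel between embeddings and cluster centers (DEC's Q).
    d2 = torch.cdist(z, centers) ** 2
    q = (1.0 + d2 / alpha) ** (-(alpha + 1) / 2)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    # Sharpened targets P that emphasize high-confidence assignments.
    w = q ** 2 / q.sum(dim=0)
    return w / w.sum(dim=1, keepdim=True)

x = torch.rand(1024, 784)             # placeholder for protocol-data matrices
ae = AutoEncoder()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

for _ in range(50):                   # 1) pretrain with reconstruction loss
    z, rec = ae(x)
    loss = F.mse_loss(rec, x)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                 # 2) K-Means init in the latent space
    km = KMeans(n_clusters=8, n_init=10).fit(ae.encoder(x).numpy())
centers = torch.tensor(km.cluster_centers_, dtype=torch.float32,
                       requires_grad=True)
opt = torch.optim.Adam(list(ae.encoder.parameters()) + [centers], lr=1e-4)

for _ in range(100):                  # 3) refine encoder with clustering loss
    q = soft_assign(ae.encoder(x), centers)
    p = target_distribution(q).detach()
    loss = F.kl_div(q.log(), p, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()

labels = q.argmax(dim=1)              # cluster index = candidate protocol class
```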
With the rapid growth of internet-based services, the data generated on these services attract attackers who intrude upon networking services and information. Based on the characteristics of these intruders, many researchers have attempted to detect intrusions with the help of automated processes. Since a large volume of data is generated and transferred through the network, security and performance remain an issue. Intrusion Detection Systems (IDS) were developed to detect and prevent intruders and to secure network systems. Performance and loss are still issues because the feature space grows while detecting intruders. In this paper, a deep clustering based CNN is used to detect intruders, with metaheuristic algorithms for feature selection and preprocessing. The proposed system includes three phases: preprocessing, feature selection, and classification. In the first phase, the KDD dataset is preprocessed using binning normalization and an Eigen-PCA based discretization method. In the second phase, feature selection is performed using an Information Gain based Dragonfly Optimizer (IGDFO). Finally, a deep clustering based Convolutional Neural Network (CCNN) classifier optimized with Particle Swarm Optimization (PSO) identifies intrusion attacks efficiently. The clustering loss and network loss are reduced with the optimization algorithm. We evaluate the proposed IDS model on the NSL-KDD dataset in terms of standard evaluation metrics. The experimental results show that the proposed system achieves better performance than existing systems in terms of accuracy, precision, recall, F-measure, and false detection rate. Funding: the third and fourth authors were supported by the Project of Specific Research PrF UHK No. 2101/2021 and the Long-term Development Plan of UHK, University of Hradec Králové, Czech Republic.
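The dragonfly search and the PSO-tuned classifier are too involved for a short example, but the information-gain scoring that seeds the IGDFO selection step is easy to illustrate. The sketch below uses scikit-learn's mutual information estimate as the information-gain measure on a placeholder NSL-KDD-style feature array; the array shapes and the choice of keeping 20 features are assumptions.

```python
# Information-gain scoring of features, the seed of the IGDFO selection
# step (the dragonfly search itself is omitted here). Assumes a tabular
# NSL-KDD-style array X with binary labels y.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((5000, 41))           # 41 features, as in KDD-derived sets
y = rng.integers(0, 2, size=5000)    # 0 = normal, 1 = attack (placeholder)

gain = mutual_info_classif(X, y, random_state=0)
top_k = np.argsort(gain)[::-1][:20]  # keep the 20 most informative features
X_selected = X[:, top_k]
print("selected feature indices:", sorted(top_k.tolist()))
```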
The regulatory role of micro-RNAs (miRNAs) in messenger RNA (mRNA) gene expression has been well understood by biologists for some decades, even though work on specific aspects is still in progress. Clustering is a cornerstone of bioinformatics research, offering a potent computational tool for analyzing the diverse types of data encountered in genomics and related fields. MiRNA clustering plays a pivotal role in deciphering the intricate regulatory roles of miRNAs in biological systems. It uncovers novel biomarkers for disease diagnosis and prognosis, and advances our understanding of gene regulatory networks and pathways implicated in health and disease, as well as drug discovery. Specifically, we have implemented a clustering procedure to find interrelations among miRNAs within clusters, and their relations to diseases. Deep clustering (DC) algorithms signify a departure from traditional clustering methods toward more sophisticated techniques that can uncover intricate patterns and relationships within gene expression data. Deep learning (DL) models have shown remarkable success in various domains, and their application in genomics, especially for tasks like clustering, holds immense promise. The deep convolutional clustering procedure used here differs from other traditional methods, demonstrating unbiased clustering results. In this paper, we apply the procedure to a Multiple Myeloma miRNA dataset publicly available on the GEO platform, as a template for the analysis of a cancer instance, and raise some biological issues.
In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to effectively utilize the inherent prior knowledge in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, a multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering performance. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of the hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that, with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework. Funding: supported in part by the National Natural Science Foundation of China under Grant 62171203, the Jiangsu Province "333 Project" High-Level Talent Cultivation Subsidized Project, the Suzhou Key Supporting Subjects for Health Informatics under Grant SZFCXK202147, the Changshu Science and Technology Program under Grants CS202015 and CS202246, and the Changshu Key Laboratory of Medical Artificial Intelligence and Big Data under Grants CYZ202301 and CS202314.
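As a rough illustration of how Bayesian optimization tunes loss weights of the kind this abstract mentions, the sketch below runs scikit-optimize's Gaussian-process minimizer over three hypothetical loss-weight dimensions. The objective function, weight names, and search ranges are all assumptions; in practice the objective would train the model briefly and return a validation clustering error.

```python
# Bayesian optimization of clustering-loss weights, as a stand-in for the
# paper's hyperparameter tuning. `evaluate_clustering` is a hypothetical
# objective that would train MAS-DSC briefly and return a validation error.
from skopt import gp_minimize
from skopt.space import Real

def evaluate_clustering(params):
    w_self_expr, w_prior, w_recon = params
    # ... train the model with these loss weights, score on validation ...
    # Placeholder: a smooth toy objective so the sketch runs end to end.
    return (w_self_expr - 0.5) ** 2 + (w_prior - 0.1) ** 2 + (w_recon - 1.0) ** 2

space = [Real(1e-3, 1.0, name="w_self_expr", prior="log-uniform"),
         Real(1e-3, 1.0, name="w_prior", prior="log-uniform"),
         Real(1e-2, 10.0, name="w_recon", prior="log-uniform")]

result = gp_minimize(evaluate_clustering, space, n_calls=25, random_state=0)
print("best loss weights:", result.x, "objective:", result.fun)
```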
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear applicability. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may lead to degeneration of the clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns all the self-expressions of multiple views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information to achieve precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the autoencoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity across the multiple views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that the CCAC method outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
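The self-expression layer at the core of such methods learns a coefficient matrix C so that each latent code is reconstructed from the others, Z ≈ CZ with a zero diagonal. The single-view PyTorch sketch below shows only that building block; CCAC's contrastive alignment across views and attentive fusion are omitted, and the sample count, latent size, and regularization weight are assumptions.

```python
# Single-view self-expression layer: learn C so that latent codes are
# reconstructed from each other, Z ~= C @ Z, with diag(C) forced to zero.
# CCAC's multi-view contrastive alignment and attention fusion are omitted.
import torch
import torch.nn as nn

class SelfExpression(nn.Module):
    def __init__(self, n_samples):
        super().__init__()
        self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, z):
        c = self.C - torch.diag(torch.diag(self.C))  # zero self-links
        return c @ z, c

z = torch.randn(200, 32)                 # latent codes from one view's encoder
layer = SelfExpression(n_samples=200)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)

for _ in range(300):
    z_hat, c = layer(z)
    loss = ((z_hat - z) ** 2).sum() + 1.0 * (c ** 2).sum()  # recon + Frobenius
    opt.zero_grad(); loss.backward(); opt.step()

# The affinity fed to spectral clustering is typically |C| + |C|^T.
affinity = (c.abs() + c.abs().T).detach()
```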
At present, the proportion of new energy in the power grid is increasing, and the random fluctuations in power output increase the risk of cascading failures in the power grid. In this paper, we propose a method for identifying high-risk scenarios of cascading faults in new energy power grids based on a deep embedding clustering (DEC) algorithm and apply it to a risk assessment of cascading failures in different operating scenarios for new energy power grids. First, considering the real-time operation status and system structure of new energy power grids, a scenario cascading failure risk indicator is established. Based on this indicator, the risk of cascading failure is calculated for the scenario set, the scenarios are clustered with the DEC algorithm, and the scenarios with the highest indicators are selected as the significant risk scenario set. Simulation results on an example power grid show that our method can effectively identify scenarios with a high risk of cascading failures from a large number of scenarios. Funding: funded by the State Grid Limited Science and Technology Project of China, Grant Number SGSXDK00DJJS2200144.
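Once DEC has grouped the scenarios, the selection step reduces to ranking clusters by their risk indicator and keeping the worst ones. The short sketch below shows one plausible reading of that step; the risk values, cluster labels, and the choice of keeping two clusters are placeholders.

```python
# Selecting the high-risk scenario set after DEC clustering: rank clusters
# by mean cascading-failure risk indicator and keep the worst ones.
import numpy as np

rng = np.random.default_rng(1)
risk = rng.random(1000)                  # per-scenario risk indicator
labels = rng.integers(0, 8, size=1000)   # DEC cluster assignments

cluster_risk = {k: risk[labels == k].mean() for k in np.unique(labels)}
worst = sorted(cluster_risk, key=cluster_risk.get, reverse=True)[:2]
high_risk_set = np.where(np.isin(labels, worst))[0]
print("high-risk clusters:", worst, "scenarios kept:", high_risk_set.size)
```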
Defect detection with electroluminescence (EL) images for photovoltaic (PV) modules has become a standard test procedure during the production, installation, and operation of solar modules. Some typical defect types, such as cracks and finger interruptions, can be recognized with high accuracy. However, due to the complexity of EL images and the limitations of the dataset, it is hard to label all types of defects during the inspection process. Unknown or unlabeled defects create significant difficulties in the practical application of automatic defect detection. To address the problem, we propose an evolutionary algorithm that combines traditional image processing, deep learning, transfer learning, and deep clustering, and that can automatically recognize defects that are unknown or unlabeled in the original dataset as the dataset grows. Specifically, we first propose a deep learning-based feature extractor and defect classifier. Then, unlabeled defects are classified by the deep clustering algorithm and stored separately to update the original database without human intervention. When the number of unknown images reaches a preset value, transfer learning is introduced to retrain the classifier with the updated database. The fine-tuned model can detect new defects with high accuracy. Finally, numerical results confirm that the proposed solution can carry out efficient and accurate defect detection automatically using electroluminescence images.
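A compact sketch may make the self-updating loop concrete: confidently classified EL images keep their predicted label, low-confidence ones accumulate in an unknown pool, and once the pool is large enough it is clustered and the classifier is scheduled for fine-tuning on the new groups. The confidence threshold, trigger size, cluster count, and the sklearn-style predict_proba interface of the classifier are all illustrative assumptions.

```python
# Sketch of the self-updating defect pipeline: route low-confidence EL
# image features to an "unknown" pool, cluster the pool when it is large
# enough, and fine-tune the classifier on the newly labeled groups.
import numpy as np
from sklearn.cluster import KMeans

CONF_THRESHOLD = 0.8      # below this, a prediction is treated as unknown
POOL_TRIGGER = 500        # retrain once this many unknowns accumulate
unknown_pool = []

def process_image(features, classifier):
    proba = classifier.predict_proba(features.reshape(1, -1))[0]
    if proba.max() >= CONF_THRESHOLD:
        return int(proba.argmax())          # confidently known defect type
    unknown_pool.append(features)           # defer to the clustering stage
    if len(unknown_pool) >= POOL_TRIGGER:
        pseudo = KMeans(n_clusters=3, n_init=10).fit_predict(
            np.stack(unknown_pool))
        # ... append (features, new pseudo-label) pairs to the database and
        # fine-tune the classifier via transfer learning ...
        unknown_pool.clear()
    return None                             # unknown for now
```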
As a new mode and means of smart manufacturing, smart cloud manufacturing (SCM) faces great challenges in massive supply and demand, dynamic resource collaboration, and intelligent adaptation. To address these problems, this paper proposes an SCM-oriented dynamic supply-demand (SD) intelligent adaptation model for massive manufacturing services. In this model, a collaborative network model is established based on the properties of the supply and demand sides and their relationships; in addition, an algorithm based on deep graph clustering (DGC) and aligned sampling (AS) is used to divide and conquer the large adaptation domain, addressing the slow computation caused by the high complexity of spatiotemporal search in the collaborative network model. At the same time, an intelligent supply-demand adaptation method driven by quality of service (QoS) is established, in which the experience of adaptation is shared among adaptation subdomains through deep reinforcement learning (DRL) powered by a transfer mechanism, improving the poor adaptation results caused by dynamic uncertainty. The results show that the model and solution proposed in this paper can perform collaborative and intelligent supply-demand adaptation for the massive and dynamic resources in SCM through autonomous learning, and can effectively perform global supply-demand matching and optimal resource allocation. Funding: supported in part by the National Natural Science Foundation of China under Grant 62172235, in part by the Natural Science Foundation of Jiangsu Province of China under Grant BK20191381, and in part by the Primary Research & Development Plan of Jiangsu Province under Grant BE2019742.
Purpose: The aim of this study is to propose a deep neural network (DNN) method that uses side information to improve clustering results for big datasets; the authors also show that applying this information improves clustering performance and increases the speed of network training convergence. Design/methodology/approach: In data mining, semi-supervised learning is an interesting approach because good performance can be achieved with a small subset of labeled data; one reason is that data labeling is expensive, and semi-supervised learning does not need all labels. One type of semi-supervised learning is constrained clustering; this type of learning does not use class labels for clustering. Instead, it uses information about some pairs of instances (side information): these instances may be in the same cluster (must-link [ML]) or in different clusters (cannot-link [CL]). Constrained clustering has been studied extensively; however, few works have focused on constrained clustering for big datasets. In this paper, the authors present a constrained clustering method for big datasets that uses a DNN. The authors inject the constraints (ML and CL) into this DNN to promote clustering performance and call it constrained deep embedded clustering (CDEC). In this manner, an autoencoder is implemented to elicit informative low-dimensional features in the latent space, and the encoder network is then retrained using a proposed Kullback-Leibler divergence objective function, which captures the constraints in order to cluster the projected samples. The proposed CDEC was compared with the adversarial autoencoder, constrained 1-spectral clustering, and autoencoder plus k-means on the well-known MNIST, Reuters-10k, and USPS datasets, and their performance was assessed in terms of clustering accuracy. Empirical results confirmed the statistical superiority of CDEC over the counterparts in terms of clustering accuracy. Findings: First, this is the first DNN-based constrained clustering that uses side information to improve clustering performance without using labels in big, high-dimensional datasets. Second, the authors define a formula to inject side information into the DNN. Third, the proposed method improves clustering performance and network convergence speed. Originality/value: Few works have focused on constrained clustering for big datasets; likewise, studies of DNNs for clustering with a specific loss function that simultaneously extracts features and clusters the data are rare. The method improves the performance of big-data clustering without using labels, which is important because data labeling is expensive and time-consuming, especially for big datasets.
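One plausible way to fold must-link and cannot-link pairs into a DEC-style KL objective is sketched below: ML pairs are penalized when their soft assignments fail to overlap, CL pairs when they do. This is an illustration of the general idea, not the paper's exact formula; the penalty forms, the weight lam, and the random inputs are assumptions.

```python
# A plausible constrained-DEC loss: the usual KL(P || Q) clustering term
# plus penalties that pull ML pairs' soft assignments together and push
# CL pairs' apart. The paper's exact formulation may differ.
import torch
import torch.nn.functional as F

def cdec_loss(q, p, ml_pairs, cl_pairs, lam=0.1):
    # q: (N, K) soft assignments; p: (N, K) DEC target distribution.
    kl = F.kl_div(q.log(), p, reduction="batchmean")
    ml_i, ml_j = ml_pairs[:, 0], ml_pairs[:, 1]
    cl_i, cl_j = cl_pairs[:, 0], cl_pairs[:, 1]
    # ML: assignment overlap should be large; CL: overlap should vanish.
    ml_pen = -(q[ml_i] * q[ml_j]).sum(dim=1).clamp_min(1e-8).log().mean()
    cl_pen = (q[cl_i] * q[cl_j]).sum(dim=1).mean()
    return kl + lam * (ml_pen + cl_pen)

q = torch.softmax(torch.randn(100, 10), dim=1)
p = torch.softmax(torch.randn(100, 10), dim=1)
ml = torch.randint(0, 100, (20, 2))      # pairs known to share a cluster
cl = torch.randint(0, 100, (20, 2))      # pairs known to differ
print(cdec_loss(q, p, ml, cl))
```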
Weather is a key factor affecting the control of air traffic. Accurate recognition and classification of similar weather scenes in the terminal area is helpful for rapid decision-making in air traffic flow management. Current research mostly uses traditional machine learning methods to extract features of weather scenes, and clustering algorithms to divide similar scenes. Inspired by the excellent performance of deep learning in image recognition, this paper proposes a terminal area similar weather scene classification method based on improved deep convolution embedded clustering (IDCEC), which uses the combination of an encoding layer and a decoding layer to reduce the dimensionality of the weather image while retaining useful information to the greatest extent, and then uses the combination of the pre-trained encoding layer and a clustering layer to train the clustering model of the similar scenes in the terminal area. Finally, the terminal area of Guangzhou Airport is selected as the research object, the method proposed in this article is used to classify historical weather data into similar scenes, and the performance is compared with other state-of-the-art methods. The experimental results show that the proposed IDCEC method can identify similar scenes more accurately based on the spatial distribution characteristics and severity of weather; at the same time, compared with the actual flight volume in the Guangzhou terminal area, IDCEC's recognition results of similar weather scenes are consistent with the recognition of experts in the field. Funding: supported by the Fundamental Research Funds for the Central Universities under Grant NS2020045; Y.L.G. received the grant.
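The encoder-decoder dimensionality reduction that IDCEC builds on can be sketched as a small convolutional autoencoder: the encoder compresses a weather image to a low-dimensional code (to which a clustering layer, as in the sketch after the first abstract above, would be attached), and the decoder checks that useful information is retained. Channel counts, image size, and latent dimension below are illustrative assumptions.

```python
# Toy convolutional autoencoder of the kind IDCEC attaches a clustering
# layer to. All sizes are illustrative; real weather imagery would differ.
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

imgs = torch.rand(8, 1, 64, 64)      # stand-in for terminal-area weather images
z, rec = ConvAE()(imgs)
print(z.shape, rec.shape)            # (8, 10), (8, 1, 64, 64)
```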
The National Institute of Standards and Technology (NIST) has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policy (NLACP) to a machine-readable form. To study the automation process, we consider the hierarchical ABAC model as our reference model, since it better reflects the requirements of real-world organizations. Therefore, this paper focuses on the questions of how we can automatically infer the hierarchical structure of an ABAC model given NLACPs, and how we can extract and define the set of authorization attributes based on the resulting structure. To address these questions, we propose an approach built upon recent advancements in natural language processing and machine learning techniques. For such a solution, the lack of appropriate data often poses a bottleneck. Therefore, we decouple the primary contributions of this work into: (1) developing a practical framework to extract the authorization attributes of a hierarchical ABAC system from natural language artifacts, and (2) generating a set of realistic synthetic natural language access control policies (NLACPs) to evaluate the proposed framework. Our experimental results are promising, as we achieved, on average, an F1-score of 0.96 when extracting attribute values of subjects, and 0.91 when extracting the values of objects' attributes from natural language access control policies. Funding: supported by Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
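To give a flavor of what extracting subject and object attributes from a policy sentence involves, the toy spaCy sketch below pulls subject/object heads and their modifiers out of a dependency parse. This is only an illustration of the extraction flavor, not the paper's framework; the example sentence and the chosen dependency relations are assumptions.

```python
# A toy illustration of pulling subject/object candidates out of a natural
# language policy sentence with dependency parsing (spaCy).
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
policy = "A senior nurse can read the medical records of admitted patients."
doc = nlp(policy)

for token in doc:
    if token.dep_ == "nsubj":                       # acting subject
        attrs = [c.text for c in token.children if c.dep_ == "amod"]
        print("subject:", token.text, "attributes:", attrs)
    if token.dep_ == "dobj":                        # protected object
        attrs = [c.text for c in token.children
                 if c.dep_ in ("amod", "compound")]
        print("object:", token.text, "attributes:", attrs)
```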
In some image classification tasks, the similarities among different categories vary, and samples are usually misclassified as highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on a deep convolutional neural network (CNN), which is simple and effective. First, deep feature extractors at different levels are trained using transfer learning, fine-tuning a pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all categories and the specific feature extracted from highly similar categories are fused into a feature vector, and the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the expressive power of the deep features resulting from two-level hierarchical feature learning is strong. Our proposed method effectively increases classification accuracy in comparison with flat multi-class classification methods. Funding: supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
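The fusion step is easy to sketch: one backbone fine-tuned on all categories supplies the general feature, a second fine-tuned on a confusable subset supplies the specific feature, and their penultimate activations are concatenated for a linear classifier. The backbone choice, class count, and omitted fine-tuning loops below are assumptions, not the paper's exact configuration.

```python
# Sketch of two-level feature fusion: a general extractor fine-tuned on all
# categories plus a specific extractor fine-tuned on a group of highly
# similar categories, concatenated before a linear classifier.
import torch
import torch.nn as nn
from torchvision import models

def headless(backbone):
    # Drop the final fc layer so the model emits 512-d penultimate features.
    backbone.fc = nn.Identity()
    return backbone

general = headless(models.resnet18(weights="IMAGENET1K_V1"))
specific = headless(models.resnet18(weights="IMAGENET1K_V1"))
# ... fine-tune `general` on all categories and `specific` on the group of
# highly similar categories identified at the first level ...

classifier = nn.Linear(512 + 512, 257)   # e.g., Caltech-256 (+ clutter class)

x = torch.rand(4, 3, 224, 224)
with torch.no_grad():
    feats = torch.cat([general(x), specific(x)], dim=1)
logits = classifier(feats)
print(logits.shape)                       # (4, 257)
```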