In network traffic classification, it is important to understand the correlation between network traffic and its causal application, protocol, or service group, for example, in facilitating lawful interception, ensuring the quality of service, preventing application choke points, and facilitating malicious behavior identification. In this paper, we review existing network classification techniques, such as port-based identification and those based on deep packet inspection, statistical features in conjunction with machine learning, and deep learning algorithms. We also explain the implementations, advantages, and limitations associated with these techniques. Our review also extends to publicly available datasets used in the literature. Finally, we discuss existing and emerging challenges, as well as future research directions.
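As an illustration of the simplest technique named in this abstract, port-based identification can be sketched as a table lookup. This is a minimal sketch, not the survey's implementation: the table `WELL_KNOWN_PORTS` and the function `classify_by_port` are hypothetical names, and the table holds only a handful of commonly registered ports.

```python
# Hypothetical port table with a few commonly registered ports (illustration only).
WELL_KNOWN_PORTS = {
    ("tcp", 22): "ssh",
    ("tcp", 25): "smtp",
    ("tcp", 53): "dns",
    ("udp", 53): "dns",
    ("tcp", 80): "http",
    ("tcp", 443): "https",
}


def classify_by_port(protocol, src_port, dst_port):
    """Return an application label from the port table, or 'unknown'."""
    proto = protocol.lower()
    # Try the destination port first (usually the server side), then the source port.
    for port in (dst_port, src_port):
        label = WELL_KNOWN_PORTS.get((proto, port))
        if label is not None:
            return label
    return "unknown"


print(classify_by_port("TCP", 51000, 443))
print(classify_by_port("tcp", 51000, 6881))
```

The second call returns "unknown" because neither port is in the table. A lookup of this kind cannot label traffic on dynamic or unregistered ports, which is why the abstract also covers payload-based, statistical, and deep learning methods.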
Attacks on websites and network servers are among the most critical threats in network security. Network behavior identification is one of the most effective ways to identify malicious network intrusions. Analyzing abnormal network traffic patterns and classifying traffic based on labeled network traffic data are among the most effective approaches for network behavior identification. Traditional methods for network traffic classification use algorithms such as Naive Bayes, Decision Tree, and XGBoost. However, network traffic classification, which is required for network behavior identification, generally suffers from low accuracy even with recently proposed deep learning models. To improve network traffic classification accuracy and thereby the network intrusion detection rate, this paper proposes a new network traffic classification model, called ArcMargin, which incorporates metric learning into a convolutional neural network (CNN) to make the CNN model more discriminative. ArcMargin maps network traffic samples from the same category closer together, while samples from different categories are mapped as far apart as possible. The metric learning regularization term is called additive angular margin loss, and it is embedded in the objective function of traditional CNN models. The proposed ArcMargin model is validated on three datasets and compared with several related algorithms. According to a set of classification indicators, the ArcMargin model is shown to perform better in both network traffic classification tasks and open-set tasks. Moreover, in open-set tasks, the ArcMargin model can cluster unknown data classes that do not exist in the training dataset.
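The additive angular margin idea mentioned in this abstract can be sketched for a single sample as follows. This is a sketch under stated assumptions rather than the paper's ArcMargin code: the function names, the margin of 0.5 radians, and the scale of 30 are illustrative choices, and the input is assumed to be the cosine between a normalized embedding and each normalized class weight.

```python
import math


def arc_margin_logits(cosines, target, margin=0.5, scale=30.0):
    """Add an angular margin to the true-class angle, then rescale all cosines.

    cosines: cos(theta_j) between the embedding and each class weight (assumed normalized).
    target:  index of the true class.
    margin, scale: illustrative hyperparameters, not values from the paper.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # keep acos within its domain
        if j == target:
            # Simplification: no special handling when theta + margin exceeds pi.
            c = math.cos(math.acos(c) + margin)
        logits.append(scale * c)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


cosines = [0.8, 0.1, -0.3]
plain_loss = cross_entropy([30.0 * c for c in cosines], 0)
margin_loss = cross_entropy(arc_margin_logits(cosines, 0), 0)
print(plain_loss < margin_loss)
```

The margin makes the true class harder to satisfy, so the same sample incurs a larger loss; training against this stricter target is what pushes same-class samples together and different classes apart.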
The Deep Packet Inspection (DPI) method is a popular method that can accurately identify flow data and its corresponding application. Currently, the DPI method is widely used in common network management systems. However, the major limitation of DPI systems is that their signature library is mainly extracted manually, which makes it hard to efficiently obtain the signatures of new applications. Hence, in this paper, we propose an automatic signature extraction mechanism using Principal Component Analysis (PCA), which is able to extract signatures automatically. In the proposed method, the signatures are expressed as serial consistent sequences constructed from principal components, instead of the normally separated substrings that traditional methods extract from the original data. Extensive experiments based on numerous sets of data have been carried out to evaluate the performance of the proposed scheme, and the results show that the proposed method achieves good performance in terms of accuracy and efficiency.
Moments have been used in all sorts of image-based object classification systems. Many moments have been studied by researchers in the area of object classification, and one of the most preferred is the Zernike moment. In this paper, the performance of object classification using the Zernike moment is explored. A classifier based on neural networks is used in this study. The results indicate that the best performance in identifying the aggregate is 91.4%, obtained with ten orders of the Zernike moment. This encouraging result shows that the Zernike moment is a suitable feature for object classification systems.
There are various heterogeneous networks for terminals to deliver a better quality of service. Signal system recognition and classification contribute a lot to the process. However, in low signal-to-noise ratio (SNR) circumstances or under time-varying multipath channels, the majority of the existing algorithms for signal recognition face limitations. In this series, we present a robust signal recognition method based upon the original and latest updated version of the extreme learning machine (ELM) to help users switch between networks. The ELM uses signal characteristics to distinguish systems. The superiority of this algorithm lies in the random choice of hidden nodes and in the fact that it determines the output weights analytically, which results in lower complexity. Theoretically, the algorithm tends to offer good generalization performance at an extremely fast learning speed. Moreover, we implement the GSM/WCDMA/LTE models in the Matlab environment using the Simulink tools. The simulations reveal that the signals can be recognized with 95% accuracy in a low-SNR (0 dB) environment in the time-varying multipath Rayleigh fading channel.
An Intrusion Detection System (IDS) is a network security mechanism that analyses all users' and applications' traffic and detects malicious activities in real time. The existing IDS methods suffer from lower accuracy and lack the required level of security to prevent sophisticated attacks. This problem can leave the system vulnerable to attacks, which can lead to the loss of sensitive data and potential system failure. Therefore, this paper proposes an Intrusion Detection System using Logistic Tanh-based Convolutional Neural Network Classification (LTH-CNN). Here, the Correlation Coefficient based Mayfly Optimization (CC-MA) algorithm is used to extract the input characteristics for the IDS from the input data. Then, the optimized features are used by the LTH-CNN, which returns the attacked and non-attacked data. After that, the attacked data is stored in the log file and the non-attacked data is mapped to the cyber security and data security phases. To protect the system from cyber-attacks, the source and destination IP addresses are converted into a complex binary format named 1's Complement Reverse Shift Right (CRSR); in the data security phase, the sensed data is converted into an encrypted format using the Senders Public key Exclusive OR Receivers Public Key-Elliptic Curve Cryptography (PXORP-ECC) algorithm to improve data security. The Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset and a real-time sensor are used to train and evaluate the proposed LTH-CNN. The suggested model is evaluated based on accuracy, sensitivity, and specificity, and according to the experimental results it outperforms the existing IDS methods.
In this study, analyses are conducted on the information features of a construction site, a cornfield, and subsidence seeper land in a coal mining area with a synthetic aperture radar (SAR) image of medium resolution. Based on the land cover features of the coal mining area, and on texture feature extraction and a selection method for the gray-level co-occurrence matrix (GLCM) of the SAR image, we propose that the optimum window size for computing the GLCM is an appropriately sized window that can effectively distinguish different types of land cover. Next, a band combination was carried out over the texture feature images and the band-filtered SAR image to obtain a new multi-band image. After the new image is transformed with principal component analysis, classification is conducted selectively on the three principal component bands with the most information. Finally, through training and experimenting with the samples, a better three-layered BP neural network was established to classify the SAR image. The results show that, assisted by texture information, the neural network classification improved the accuracy of SAR image classification by 14.6% compared with classification by maximum likelihood estimation without texture information.
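To make the GLCM texture step concrete, the sketch below counts gray-level co-occurrences inside one window and derives a contrast feature from the counts. It is an illustrative sketch only: the function names, the 4x4 window, the four gray levels, and the horizontal offset are assumptions, not settings reported in the abstract.

```python
def glcm(window, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy) within the window."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[window[r][c]][window[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: squared gray-level difference weighted by co-occurrence probability."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 4x4 window quantized to four gray levels.
window = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(window, levels=4)
print(counts)
print(contrast(counts))
```

Sliding such a window over the image and storing one feature value per position yields a texture band; the window size controls how much spatial context each value summarizes, which is the parameter the abstract discusses.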
This study aimed to assess the role of the National Comprehensive Cancer Network (NCCN) risk classification in predicting biochemical recurrence (BCR) after radical prostatectomy (RP) in Chinese prostate cancer patients. We included a consecutive cohort of 385 patients with prostate cancer who underwent RP at Fudan University Shanghai Cancer Center (Shanghai, China) from March 2011 to December 2014. Gleason grade groups were applied at analysis according to the 2014 International Society of Urological Pathology Consensus. Risk groups were stratified according to the NCCN Clinical Practice Guidelines in Oncology: Prostate Cancer version 1, 2017. All 385 patients were divided into BCR and non-BCR groups. The clinicopathological characteristics were compared using an independent sample t-test, Chi-squared test, and Fisher's exact test. BCR-free survival was compared using the log-rank test and multivariable Cox proportional hazard analysis. During a median follow-up of 48 months (range: 1-78 months), 31 (8.05%) patients experienced BCR. The BCR group had a higher prostate-specific antigen level at diagnosis (46.54 ± 39.58 ng ml⁻¹ vs 21.02 ± 21.06 ng ml⁻¹, P = 0.001), more advanced pT stage (P = 0.002), and a higher pN1 rate (P < 0.001). NCCN risk classification was a significant predictor of BCR (P = 0.0006) and BCR-free survival (P = 0.003) after RP. As NCCN risk level increased, there was a significant decreasing trend in BCR-free survival rate (Ptrend = 0.0002). This study confirmed and validated that NCCN risk classification was a significant predictor of BCR and BCR-free survival after RP.
A technique for wear particle identification using a computer vision system is described. The computer vision system employs LVQ neural networks as a classifier to recognize the surface texture of wear particles in lubricating oil and determine the condition of machines. The recognition process includes four stages: (1) capturing images from ferrographies containing wear particles; (2) digitising the image and extracting features; (3) learning the training data selected from the feature data set; (4) identifying the wear particles and generating the result report of machine condition classification. To verify the proposed technique, the recognition results of several typical classes of wear particles generated at the sliding and rolling surfaces in a diesel engine are presented.
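The learning stage of an LVQ classifier can be illustrated with the basic LVQ1 update rule. The sketch below is an assumption about the variant, since the abstract does not say which LVQ algorithm is used; the function names, the two-dimensional feature vectors, and the labels "sliding" and "rolling" are hypothetical.

```python
def nearest_prototype(prototypes, x):
    """Return the index of the prototype vector closest to x (squared Euclidean distance)."""
    def dist2(vector):
        return sum((v - xi) ** 2 for v, xi in zip(vector, x))
    return min(range(len(prototypes)), key=lambda k: dist2(prototypes[k][0]))


def lvq1_step(prototypes, x, label, lr=0.1):
    """One LVQ1 update: pull the winning prototype toward x if labels match, else push it away."""
    k = nearest_prototype(prototypes, x)
    vector, proto_label = prototypes[k]
    sign = 1.0 if proto_label == label else -1.0
    prototypes[k] = (
        [v + sign * lr * (xi - v) for v, xi in zip(vector, x)],
        proto_label,
    )
    return k


# Hypothetical prototypes: (feature vector, class label).
prototypes = [([0.0, 0.0], "sliding"), ([1.0, 1.0], "rolling")]
lvq1_step(prototypes, [0.2, 0.0], "sliding", lr=0.5)   # matching label: prototype 0 moves toward x
lvq1_step(prototypes, [0.8, 1.0], "sliding", lr=0.5)   # mismatched label: prototype 1 moves away
print(prototypes)
```

After training, a new particle is assigned the label of its nearest prototype, which corresponds to the identification stage listed in the abstract.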
Network traffic classification, which matches network traffic to a specific class at different granularities, plays a vital role in network administration and cyber security. With the rapid development of network communication techniques, more and more network applications adopt encryption during communication, which brings significant challenges to traditional network traffic classification methods. On the one hand, traditional methods mainly depend on matching features at the application layer of the ISO/OSI reference model, which leads to failure in classifying encrypted traffic. On the other hand, machine learning-based methods require features hand-crafted from network traffic data by human experts, which makes it difficult for them to deal with complex network protocols. In this paper, the convolution attention network (CAT) is proposed to overcome those difficulties. As an end-to-end model, CAT takes raw data as input and returns classification results automatically, without feature engineering by human experts. In CAT, an attention mechanism first captures the importance of different bytes in the network traffic. Then, a convolutional neural network (CNN) is used to learn features automatically and feed the output into a softmax function to obtain classification results. This enables CAT to learn enough information from network traffic data and ensures classification accuracy. Extensive experiments on the public encrypted network traffic dataset ISCX2016 demonstrate the effectiveness of the proposed model.
Network traffic classification aims at identifying the application types of network packets. It is important for Internet service providers (ISPs) to manage bandwidth resources and ensure the quality of service for different network applications. However, most classification techniques using machine learning focus only on high flow accuracy and ignore byte accuracy. The classifier obtains low classification performance for elephant flows because of the imbalance between elephant flows and mice flows on the Internet. The elephant flows, however, consume much more bandwidth than mice flows. When the classifier is deployed for traffic policing, the network management system cannot penalize elephant flows and avoid network congestion effectively. This article first explores the factors related to low byte accuracy and then presents a new traffic classification method that improves byte accuracy with the aid of data cleaning. Experiments are carried out on three groups of real-world traffic datasets, and the method is compared with existing work on improving byte accuracy. The experiments show that byte accuracy increased by about 22.31% on average. The method outperforms the existing one in most cases.
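The gap between flow accuracy and byte accuracy described in this abstract can be shown with a small worked example. This is an illustrative sketch: the function name and the synthetic flows are made up, and the two metrics are taken to be the fraction of correctly labeled flows and the fraction of bytes carried by correctly labeled flows.

```python
def flow_and_byte_accuracy(flows):
    """flows: iterable of (true_label, predicted_label, byte_count) tuples."""
    flows = list(flows)
    correct_flows = sum(1 for true, pred, _ in flows if true == pred)
    correct_bytes = sum(size for true, pred, size in flows if true == pred)
    total_bytes = sum(size for _, _, size in flows)
    return correct_flows / len(flows), correct_bytes / total_bytes


# Synthetic example: nine small "mice" flows classified correctly,
# one large "elephant" flow classified incorrectly.
flows = [("web", "web", 1_000)] * 9 + [("video", "web", 991_000)]
flow_acc, byte_acc = flow_and_byte_accuracy(flows)
print(flow_acc, byte_acc)
```

Here nine of ten flows are labeled correctly, yet they carry only 9,000 of 1,000,000 bytes, so a classifier that looks strong on flow accuracy can still mislabel almost all of the traffic volume.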
A procedure has been developed for making voiced, unvoiced, and silence classifications of speech by using a multilayer feedforward network. Speech signals were analyzed sequentially and a feature vector was obtained for each segment. The feature vector served as input to a 3-layer feedforward network in which voiced, unvoiced, and silence classification was made. The network had a 6-12-3 node architecture and was trained using the generalized delta rule for back propagation of error. The performance of the network was evaluated using speech samples from 3 male and 3 female speakers. A speaker-dependent classification rate of 94.7% and a speaker-independent classification rate of 94.3% were obtained. It is concluded that the voiced, unvoiced, and silence classification of speech can be effectively accomplished using a multilayer feedforward network.
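The 6-12-3 architecture in this abstract can be sketched as a forward pass with six inputs, twelve hidden units, and three outputs. The sketch below makes several assumptions that the abstract does not state: sigmoid activations, randomly initialized (untrained) weights, and a placeholder feature vector. It shows only the shape of the computation, not the trained classifier or the back-propagation step.

```python
import math
import random

CLASSES = ("voiced", "unvoiced", "silence")


def init_layer(n_in, n_out, rng):
    """Random weights for one layer; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(weights, inputs):
    """Sigmoid units: each row holds len(inputs) weights followed by one bias."""
    outputs = []
    for row in weights:
        z = sum(w * x for w, x in zip(row, inputs)) + row[-1]
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


rng = random.Random(0)               # fixed seed so the sketch is repeatable
W_HIDDEN = init_layer(6, 12, rng)    # 6 input features -> 12 hidden units
W_OUTPUT = init_layer(12, 3, rng)    # 12 hidden units -> 3 output classes


def classify(features):
    """Return the class with the largest output activation, plus all three activations."""
    hidden = forward_layer(W_HIDDEN, features)
    scores = forward_layer(W_OUTPUT, hidden)
    best = max(range(len(scores)), key=lambda k: scores[k])
    return CLASSES[best], scores


label, scores = classify([0.1, 0.4, 0.3, 0.9, 0.2, 0.5])  # placeholder feature vector
print(label, scores)
```

Because the weights are untrained, the printed label is arbitrary; training would adjust `W_HIDDEN` and `W_OUTPUT` by propagating the output error backward, as the abstract describes.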
In some image classification tasks, the similarity between categories varies, and samples are usually misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve the classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of different levels are trained using the transfer learning method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the expression ability of the deep features resulting from two-level hierarchical feature learning is powerful. Our proposed method effectively increases the classification accuracy in comparison with flat multiple classification methods.
Wire breakages and spark absence are two typical machining failures that occur during wire electric discharge machining (wire-EDM) if appropriate parameter settings are not maintained. Even after several attempts to optimize the process, machining failures cannot be eliminated completely. An offline classification model is presented herein to predict machining failures. The aim of the current study is to develop a multiclass classification model using an artificial neural network (ANN). The training dataset comprises 81 full factorial experiments with three levels of pulse-on time, pulse-off time, servo voltage, and wire feed rate as input parameters. The classes are labeled as normal machining, spark absence, and wire breakage. The model accuracy is tested by conducting 20 confirmation experiments, and the model is found to be 95% accurate in classifying the machining outcomes. The effects of process parameters on the process failures are discussed and analyzed. A microstructural analysis of the machined surface and worn wire surface is conducted. The developed model proved to be an easy and fast solution for verifying and eliminating process failures.
Naive Bayes (NB) is one of the most popular classification methods. It is particularly useful when the dimension of the predictor is high and the data are generated independently. Meanwhile, social network data are becoming increasingly accessible, owing to the fast development of various social network services and websites. By contrast, data generated by a social network are most likely to be dependent. The dependency is mainly determined by their social network relationships. How to extend the classical NB method to social network data thus becomes a problem of great interest. To this end, we propose a network-based naive Bayes (NNB) method, which generalizes the classical NB model to social network data. The key advantage of the NNB method is that it takes the network relationships into consideration. Its computational efficiency makes the NNB method feasible even in large-scale social networks. The statistical properties of the NNB model are theoretically investigated. Simulation studies have been conducted to demonstrate its finite sample performance. A real data example is also analyzed for illustration.
Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult as many applications use dynamic port numbers, masquerading techniques, and encryption to avoid detection. In this article, an approach is presented for online traffic classification relying on the observation of the first n packets of a transmission control protocol (TCP) connection. Its key idea is to use the properties of the observed first ten packets of a TCP connection and the Bayesian network method to build a classifier. This classifier can classify TCP flows dynamically as packets pass through it by deciding whether a TCP flow belongs to a given application. The experimental results show that the proposed approach performs well in online Internet traffic classification and that it is superior to the naive Bayesian method.
Funding: This work was supported by the National Natural Science Foundation of China (61871046).
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61003282; the Beijing Higher Education Young Elite Teacher Project; the China Next Generation Internet (CNGI) Project "Research and Trial on Evolving Next Generation Network Intelligence Capability Enhancement (NICE)"; the National Basic Research Program (973 Program) under Grant No. 2009CB320-505; the National Science and Technology Major Project "Research about Architecture of Mobile Internet" under Grant No. 2011ZX03-002-001-01; and the National High Technology Research and Development Program (863 Program) under Grant No. 2011AA010704.
Funding: Supported by the Ministry of Higher Education Malaysia under Fundamental Research Grant No. 0719.
Funding: Supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (2014 ZX03001027).
Funding: Project 40771143 supported by the National Natural Science Foundation of China and Project 2007AA12Z162 supported by the Hi-tech Research and Development Program of China.
Funding: This study was sponsored by the National Natural Science Foundation of China (No. 81472377) and the Natural Science Foundation of Shanghai (No. 16ZR1406500). The authors also thank Wei-Yi Yang, Cui-Zhu Zhang, and Ying Shen for helping with follow-up of patients.
Abstract: This study aimed to assess the role of the National Comprehensive Cancer Network (NCCN) risk classification in predicting biochemical recurrence (BCR) after radical prostatectomy (RP) in Chinese prostate cancer patients. We included a consecutive cohort of 385 patients with prostate cancer who underwent RP at Fudan University Shanghai Cancer Center (Shanghai, China) from March 2011 to December 2014. Gleason grade groups were applied at analysis according to the 2014 International Society of Urological Pathology Consensus. Risk groups were stratified according to the NCCN Clinical Practice Guidelines in Oncology: Prostate Cancer version 1, 2017. All 385 patients were divided into BCR and non-BCR groups. The clinicopathological characteristics were compared using an independent sample t-test, Chi-squared test, and Fisher's exact test. BCR-free survival was compared using the log-rank test and multivariable Cox proportional hazard analysis. During median follow-up of 48 months (range: 1-78 months), 31 (8.05%) patients experienced BCR. The BCR group had higher prostate-specific antigen level at diagnosis (46.54 ± 39.58 ng ml⁻¹ vs 21.02 ± 21.06 ng ml⁻¹, P = 0.001), more advanced pT stage (P = 0.002), and higher pN1 rate (P < 0.001). NCCN risk classification was a significant predictor of BCR (P = 0.0006) and BCR-free survival (P = 0.003) after RP. As NCCN risk level increased, there was a significant decreasing trend in BCR-free survival rate (P trend = 0.0002). This study confirmed and validated that NCCN risk classification was a significant predictor of BCR and BCR-free survival after RP.
Abstract: A technique for wear particle identification using a computer vision system is described. The computer vision system employs LVQ neural networks as the classifier to recognize the surface texture of wear particles in lubricating oil and determine the condition of machines. The recognition process includes four stages: (1) capturing images from ferrographies containing wear particles; (2) digitising the image and extracting features; (3) learning the training data selected from the feature data set; (4) identifying the wear particles and generating the result report of machine condition classification. To verify the proposed technique, the recognition results for several typical classes of wear particles generated at the sliding and rolling surfaces in a diesel engine are presented.
Funding: This work was supported by the State Grid Science and Technology Project "Research on Key Technologies and Applications of Self-Service Big Data Governance of Power Grid" (5442YD180015).
Abstract: Network traffic classification, which assigns network traffic to a specific class at different granularities, plays a vital role in network administration and cyber security. With the rapid development of network communication techniques, more and more network applications adopt encryption during communication, which poses significant challenges to traditional network traffic classification methods. On the one hand, traditional methods mainly depend on matching features at the application layer of the ISO/OSI reference model, which causes them to fail on encrypted traffic. On the other hand, machine learning-based methods require features hand-crafted from network traffic data by human experts, which makes it difficult for them to deal with complex network protocols. In this paper, the convolution attention network (CAT) is proposed to overcome those difficulties. As an end-to-end model, CAT takes raw data as input and returns classification results automatically, without feature engineering by human experts. In CAT, an attention mechanism first captures the importance of the different bytes of network traffic. Then, a convolutional neural network (CNN) is used to learn features automatically, and its output is fed into a softmax function to obtain classification results. This enables CAT to learn enough information from network traffic data and ensures classification accuracy. Extensive experiments on the public encrypted network traffic dataset ISCX2016 demonstrate the effectiveness of the proposed model.
Funding: Supported by the National Basic Research Program of China (2009CB320505).
Abstract: Network traffic classification aims at identifying the application types of network packets. It is important for Internet service providers (ISPs) to manage bandwidth resources and ensure the quality of service for different network applications. However, most classification techniques using machine learning focus only on high flow accuracy and ignore byte accuracy. Because of the imbalance between elephant flows and mice flows on the Internet, the classifier obtains low classification performance for elephant flows. The elephant flows, however, consume much more bandwidth than mice flows. When the classifier is deployed for traffic policing, the network management system cannot penalize elephant flows and avoid network congestion effectively. This article first explores the factors related to low byte accuracy, and then presents a new traffic classification method that improves byte accuracy with the aid of data cleaning. Experiments are carried out on three groups of real-world traffic datasets, and the method is compared with existing work in terms of improving byte accuracy. The experiments show that byte accuracy increased by about 22.31% on average. The method outperforms the existing one in most cases.
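The gap between flow accuracy and byte accuracy described above can be illustrated with a short sketch. The flow records, class labels, and the function name `flow_and_byte_accuracy` are hypothetical; the definitions assumed here are the usual ones (fraction of flows classified correctly, and fraction of bytes carried by correctly classified flows).

```python
def flow_and_byte_accuracy(flows):
    """Return (flow accuracy, byte accuracy).

    flows: list of (true_label, predicted_label, byte_count) tuples.
    """
    correct_flows = sum(1 for true, pred, _ in flows if true == pred)
    correct_bytes = sum(size for true, pred, size in flows if true == pred)
    total_bytes = sum(size for _, _, size in flows)
    return correct_flows / len(flows), correct_bytes / total_bytes


# Hypothetical trace: four small (mice) flows and one large (elephant) flow.
flows = [
    ("web", "web", 1_000),
    ("web", "web", 2_000),
    ("dns", "dns", 500),
    ("dns", "dns", 500),
    ("p2p", "web", 96_000),  # the single elephant flow is misclassified
]
flow_acc, byte_acc = flow_and_byte_accuracy(flows)
print(f"flow accuracy = {flow_acc:.2f}, byte accuracy = {byte_acc:.2f}")
```

In this made-up trace, four of the five flows are correct, yet the one misclassified elephant flow carries almost all of the bytes, so byte accuracy collapses while flow accuracy stays high.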
Abstract: A procedure has been developed for making voiced, unvoiced, and silence classifications of speech by using a multilayer feedforward network. Speech signals were analyzed sequentially and a feature vector was obtained for each segment. The feature vector served as input to a 3-layer feedforward network in which the voiced, unvoiced, and silence classification was made. The network had a 6-12-3 node architecture and was trained using the generalized delta rule for back propagation of error. The performance of the network was evaluated using speech samples from 3 male and 3 female speakers. A speaker-dependent classification rate of 94.7% and a speaker-independent classification rate of 94.3% were obtained. It is concluded that the voiced, unvoiced, and silence classification of speech can be effectively accomplished using a multilayer feedforward network.
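As a rough illustration of the 6-12-3 architecture mentioned above, the following sketch runs a forward pass through a network with 6 inputs, 12 hidden units, and 3 outputs. The sigmoid activations, the random untrained weights, the class ordering, and the example feature values are all assumptions made for illustration; the abstract does not specify them, and the training step (the generalized delta rule) is not shown.

```python
import math
import random

CLASSES = ("voiced", "unvoiced", "silence")  # assumed output ordering


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (placeholder for trained values)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer(inputs, weights, biases):
    """One fully connected layer with sigmoid units."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


rng = random.Random(0)
w1, b1 = init_layer(12, 6, rng)   # input (6) -> hidden (12)
w2, b2 = init_layer(3, 12, rng)   # hidden (12) -> output (3)


def classify(features):
    """Return the label of the strongest output unit and all output scores."""
    hidden = layer(features, w1, b1)
    scores = layer(hidden, w2, b2)
    return CLASSES[scores.index(max(scores))], scores


# Hypothetical 6-element feature vector for one speech segment.
label, scores = classify([0.1, 0.4, 0.3, 0.9, 0.2, 0.5])
```

With untrained weights the predicted label is meaningless; the sketch only shows the shape of the computation that backpropagation would then fit.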
Funding: Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
Abstract: In some image classification tasks, the degree of similarity between categories varies, and samples are usually misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of the different levels are trained using a transfer learning method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the expression ability of the deep features resulting from two-level hierarchical feature learning is powerful. Our proposed method effectively increases the classification accuracy in comparison with flat multiple classification methods.
Abstract: Wire breakage and spark absence are two typical machining failures that occur during wire electric discharge machining (wire-EDM) if appropriate parameter settings are not maintained. Even after several attempts to optimize the process, machining failures cannot be eliminated completely. An offline classification model is presented herein to predict machining failures. The aim of the current study is to develop a multiclass classification model using an artificial neural network (ANN). The training dataset comprises 81 full factorial experiments with three levels of pulse-on time, pulse-off time, servo voltage, and wire feed rate as input parameters. The classes are labeled as normal machining, spark absence, and wire breakage. The model accuracy is tested by conducting 20 confirmation experiments, and the model is found to be 95% accurate in classifying the machining outcomes. The effects of process parameters on the process failures are discussed and analyzed. A microstructural analysis of the machined surface and worn wire surface is conducted. The developed model proved to be an easy and fast solution for verifying and eliminating process failures.
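The figure of 81 experiments follows from a full factorial design with four factors at three levels each (3^4 = 81). A minimal sketch of enumerating such a design is shown below; the level names are placeholders, since the abstract does not give the actual parameter values.

```python
from itertools import product

# Placeholder level names; the real pulse times, voltages, and feed rates
# are not given in the abstract.
LEVELS = ("low", "medium", "high")
FACTORS = ("pulse_on_time", "pulse_off_time", "servo_voltage", "wire_feed_rate")

# Every combination of one level per factor: 3 ** 4 = 81 runs.
design = [dict(zip(FACTORS, combo)) for combo in product(LEVELS, repeat=len(FACTORS))]

print(len(design))
print(design[0])
```

Each dictionary in `design` corresponds to one experimental run, which would then be labeled with its observed outcome (normal machining, spark absence, or wire breakage) to form a training example.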
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11701560, 11501093, 11631003, 11690012, 71532001 and 11525101), the Fundamental Research Funds for the Central Universities (Grant Nos. 130028613, 130028729 and 2412017FZ030), the Research Funds of Renmin University of China (Grant No. 16XNLF01), the Beijing Municipal Social Science Foundation (Grant No. 17GLC051), the Fund for Building World-Class Universities (Disciplines) of Renmin University of China, China's National Key Research Special Program (Grant No. 2016YFC0207700), and the Center for Statistical Science at Peking University.
Abstract: Naive Bayes (NB) is one of the most popular classification methods. It is particularly useful when the dimension of the predictor is high and the data are generated independently. Meanwhile, social network data are becoming increasingly accessible, owing to the fast development of various social network services and websites. By contrast, data generated by a social network are most likely to be dependent, and the dependency is mainly determined by the social network relationships. How to extend the classical NB method to social network data therefore becomes a problem of great interest. To this end, we propose a network-based naive Bayes (NNB) method, which generalizes the classical NB model to social network data. The key advantage of the NNB method is that it takes the network relationships into consideration. Its computational efficiency makes the NNB method feasible even in large-scale social networks. The statistical properties of the NNB model are theoretically investigated. Simulation studies have been conducted to demonstrate its finite sample performance. A real data example is also analyzed for illustration purposes.
Funding: Supported by the National Basic Research Program of China (2007CB310705), the Hi-Tech Research and Development Program of China (2007AA01Z255), the National Natural Science Foundation of China (60711140087), PCSIRT (IRT0609), and ISTCP (2006DFA 11040) of China.
Abstract: Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult as many applications use dynamic port numbers, masquerading techniques, and encryption to avoid detection. In this article, an approach is presented for online traffic classification that relies on observing the first n packets of a transmission control protocol (TCP) connection. Its key idea is to use the properties of the first ten observed packets of a TCP connection together with a Bayesian network method to build a classifier. This classifier can classify TCP flows dynamically as packets pass through it by deciding whether a TCP flow belongs to a given application. The experimental results show that the proposed approach performs well in online Internet traffic classification and that it is superior to the naive Bayesian method.
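The idea of classifying from only the first n packets can be sketched as a fixed-length feature extraction step. The abstract does not say which packet properties are used, so the signed packet size (positive for client to server, negative for server to client) is purely an assumption here, as are the function name `first_n_features` and the sample flow; the Bayesian network classifier itself is not shown.

```python
def first_n_features(packets, n=10):
    """Build a fixed-length feature vector from the first n packets of a flow.

    packets: list of (direction, size) in arrival order, where direction is
    +1 for client-to-server and -1 for server-to-client (an assumed encoding).
    Flows shorter than n packets are zero-padded.
    """
    signed_sizes = [direction * size for direction, size in packets[:n]]
    return signed_sizes + [0] * (n - len(signed_sizes))


# Hypothetical start of a TCP connection: SYN, SYN-ACK, ACK, request, response.
flow = [(+1, 60), (-1, 60), (+1, 52), (+1, 420), (-1, 1460)]
features = first_n_features(flow, n=10)
print(features)
```

Because the vector is complete as soon as the tenth packet arrives, a classifier fed with it can label a flow while the connection is still in progress, which is what makes the approach suitable for online use.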