Abstract: Novelty detection aims to retrieve new information and filter out redundancy from given sentences that are relevant to a specific topic. In TREC 2003, the authors tried an approach to novelty detection based on semantic distance computation. The motivation is to expand a sentence by introducing semantic information. The semantic distance between sentences is computed by combining WordNet with statistical information. Novelty detection is treated as a binary classification problem: new sentence or not. The feature vector, used in the vector space model for classification, consists of various factors, including the semantic distance from the sentence to the topic and the distance from the sentence to the previous relevant context occurring before it. New sentences are then detected with Winnow and support vector machine classifiers, respectively. Several experiments are conducted to survey the relationship between different factors and performance. The results show that semantic computation is promising for novelty detection. The ratio of new sentence size to relevant size is further studied for different relevant document sizes. It is found that the ratio decreases at a certain rate (about 0.86). Another group of experiments is then performed, supervised with the ratio. It is demonstrated that the ratio helps improve novelty detection performance.
Abstract: The generative adversarial network (GAN) is the most exciting machine learning breakthrough in recent years; it trains the learning model by finding the Nash equilibrium of a two-player zero-sum game. A GAN is composed of a generator and a discriminator, both trained with the adversarial learning mechanism. In this paper, we introduce and investigate the use of GANs for novelty detection. In training, the GAN learns from ordinary data. Then, on previously unknown data, the generator and the discriminator, with the designed decision boundaries, can both be used to separate novel patterns from ordinary patterns. The proposed GAN-based novelty detection method demonstrates competitive performance on the MNIST digit database and the Tennessee Eastman (TE) benchmark process compared with PCA-based novelty detection methods using Hotelling's T^2 and squared prediction error statistics.
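For orientation, the two PCA monitoring statistics named as the baseline are commonly written as below. The notation is an assumption of this note and is not taken from the abstract: x is a mean-centered (and scaled) sample, P holds the loading vectors of the retained principal components, and Λ is the diagonal matrix of the corresponding eigenvalues.

```latex
T^2 = x^\top P \, \Lambda^{-1} P^\top x ,
\qquad
\mathrm{SPE} = \bigl\lVert \left( I - P P^\top \right) x \bigr\rVert^2
```

A sample is typically flagged as novel when either statistic exceeds a control limit estimated from ordinary data. How those limits were set in the paper is not stated in the abstract.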
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52005103, 71801046, 51775112, 51975121), the Guangdong Province Basic and Applied Basic Research Foundation of China (Grant No. 2019B1515120095), the Intelligent Manufacturing PHM Innovation Team Program (Grant Nos. 2018KCXTD029, TDYB2019010), and the MoST International Cooperation Program (6-14).
Abstract: Supervised fault diagnosis typically assumes that all types of machinery failure are known. In practice, however, unknown types of defect, i.e., novelties, may occur, and detecting them is a challenging task. In this paper, a novel fault diagnostic method is developed for both diagnostics and the detection of novelties. To this end, a sparse autoencoder-based multi-head Deep Neural Network (DNN) is presented to jointly learn a shared encoding representation for both unsupervised reconstruction and supervised classification of the monitoring data. The detection of novelties is based on the reconstruction error. Moreover, the computational burden is reduced by directly training the multi-head DNN with the rectified linear unit activation function, instead of performing the pre-training and fine-tuning phases required for classical DNNs. The method is applied to a benchmark bearing case study and to experimental data acquired from a delta 3D printer. The results show that its performance is satisfactory in both novelty detection and fault diagnosis, outperforming other state-of-the-art methods. This research proposes a novel fault diagnostics method that can not only diagnose known types of defect but also detect unknown ones.
Funding: National Natural Science Foundation of China for Excellent Young Scholars under Grant No. 51422801; National Natural Science Foundation of China under Key Program 51338001; Beijing Natural Science Foundation of China under Key Program 8151003.
Abstract: Many studies have indicated that structural strain is significantly influenced by temperature variations, and a good understanding of the effect of temperature on structural strain is essential. A structural health monitoring system has been installed in a typical Tibetan timber building to measure the structural strains and ambient temperature since 2012. This paper presents the correlation between temperature and strain data from the monitored structure. A method combining singular spectrum analysis and polynomial regression is proposed for modeling the temperature-induced strains in the structure. Singular spectrum analysis is applied to smooth the temperature data, and the correlation between the resulting temperature time series and the measured strains is obtained by polynomial regression. The parameters of the singular spectrum analysis and the regression model are selected to minimize the regression error. Results show that the proposed method has good reproduction and prediction capabilities for temperature-induced strains, and that it is accurate and effective in eliminating the effect of temperature from the measured strain. A standardized Novelty Index based on the residual strain is also used for the condition assessment of the structure.
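The abstract does not define how the Novelty Index is standardized. The sketch below assumes one common choice: a z-score of the residual strain (measured strain minus the temperature-induced strain predicted by the regression) relative to a baseline period. The function name `novelty_index`, the sample values, and the alarm threshold of 3 are all hypothetical.

```python
import statistics

def novelty_index(residuals, baseline_residuals):
    """Standardize residual strains against a baseline period (assumed z-score)."""
    mu = statistics.mean(baseline_residuals)
    sigma = statistics.stdev(baseline_residuals)  # sample standard deviation
    return [(r - mu) / sigma for r in residuals]

# Hypothetical residual strains (microstrain) from a healthy reference period
baseline = [-1.0, 0.5, 0.0, 1.0, -0.5]

# Two new residuals: one typical, one far outside the baseline scatter
ni = novelty_index([0.2, 6.0], baseline)

# Assumed alarm rule: flag when the index magnitude exceeds 3
flags = [abs(v) > 3.0 for v in ni]
```

With this rule, only residuals that the temperature model cannot explain raise a flag, which matches the stated goal of removing the temperature effect before assessing the structure.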
Funding: Supported by the National Natural Science Foundation of China (Grant No. 50675219) and the Hu’nan Provincial Science Committee Excellent Youth Foundation of China (Grant No. 08JJ1008).
Abstract: Turbopump condition monitoring is an important means of ensuring the safety of a liquid rocket engine (LRE). Because fault samples are scarce, a monitoring system cannot be trained on all possible condition patterns. It is therefore important to differentiate abnormal or unknown patterns from the normal pattern with novelty detection methods. The one-class support vector machine (OCSVM), which is commonly used for novelty detection, does not cope well with large-scale sample sets. To model the normal pattern of the turbopump with OCSVM and thereby monitor its condition, a monitoring method that integrates OCSVM with incremental clustering is presented. In this method, incremental clustering is used for sample reduction by extracting representative vectors from a large training set. The representative vectors are expected to be distributed uniformly in the object region and to fill it. Training the OCSVM on these representative vectors yields a novelty detector. Applying the method to the turbopump's historical test data shows that the incremental clustering algorithm can extract 91 representative points from more than 36 000 training vectors, and that the OCSVM detector trained on these 91 points can recognize spikes in vibration signals caused by different abnormal events such as vane shedding, rub-impact and sensor faults. Unlike classical recognition methods, this monitoring method does not need fault samples during training. The method resolves the problem of learning from large sample sets and is an alternative for condition monitoring of the LRE turbopump.
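The sample-reduction step can be illustrated with a single-pass, leader-style clustering. This is a minimal sketch and not the paper's algorithm, which the abstract does not specify. The function name `extract_representatives`, the `radius` parameter, and the synthetic grid data are assumptions.

```python
import math

def extract_representatives(samples, radius):
    """Keep a sample as a new representative only if it lies farther than
    `radius` from every representative kept so far (single pass)."""
    representatives = []
    for x in samples:
        if all(math.dist(x, r) > radius for r in representatives):
            representatives.append(tuple(x))
    return representatives

# A dense grid of 10 000 points in the unit square stands in for a large
# training set of normal-condition feature vectors (synthetic data).
samples = [(i / 99, j / 99) for i in range(100) for j in range(100)]

# Representatives end up more than `radius` apart and cover the whole region.
reps = extract_representatives(samples, radius=0.2)
```

The reduced set `reps` could then be passed to a one-class SVM implementation (for example `sklearn.svm.OneClassSVM`) in place of the full training set. A larger `radius` keeps fewer representatives, trading boundary detail for training speed.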
Funding: Financial support by the Federal Ministry for Economic Affairs and Climate Action (BMWK), promotional references 03EN1066A and 03EN3060D; funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101023666.
Abstract: The building sector contributes significantly to climate change. To improve its carbon footprint, applications like model predictive control and predictive maintenance rely on system models. However, the high modeling effort hinders practical application. Machine learning models can significantly reduce this modeling effort. To ensure a machine learning model’s reliability in all operating states, it is essential to know its validity domain. Operating states outside the validity domain might lead to extrapolation, resulting in unpredictable behavior. This paper addresses the challenge of identifying extrapolation in data-driven building energy system models and aims to improve understanding of it. To that end, a novel approach is proposed that calibrates novelty detection algorithms towards the machine learning model. Suitable novelty detection algorithms are identified through a literature review and a benchmark test with 15 candidates. A subset of five algorithms is then evaluated on building energy systems: first on two-dimensional data, displaying the results with a novel visualization scheme, and then on more complex multi-dimensional use cases. The methodology performs well, and the validity domain could be approximated. The visualization allows for a profound analysis and an improved understanding of the fundamental effects behind a machine learning model’s validity domain and the extrapolation regimes.
Funding: Suzhou Municipal Science and Technology Foundation, Key Technologies for Video Objects Intelligent Analysis for Criminal Investigation (SS201109).
Abstract: Purpose – The task of internet intrusion detection is to detect anomalous network connections caused by intrusive activities. Many intrusion detection schemes have been proposed, most of which use both normal and intrusion data to construct classifiers. However, normal data and intrusion data are often seriously imbalanced because intrusive connection data are usually difficult to collect. Internet intrusion detection can be considered a novelty detection problem, that is, the identification of new or unknown data to which a learning system has not been exposed during training. This paper aims to address this issue. Design/methodology/approach – In this paper, a novelty detection-based intrusion detection system is proposed by combining the self-organizing map (SOM) and the kernel auto-associator (KAA) model proposed earlier by the first author. The KAA model generalizes auto-associative networks by training them to recall the inputs through a kernel subspace. For anomaly detection, the SOM organizes the prototypes of samples while the KAA provides a data description for the normal connection patterns. The hybrid SOM/KAA model can also be applied to classify different types of attacks. Findings – Using the KDD CUP 1999 dataset, the performance of the proposed scheme in separating normal connection patterns from intrusive connection patterns was compared with some state-of-the-art novelty detection methods, showing marked improvements in terms of high intrusion detection accuracy and low false positives. Simulations on the classification of attack categories also demonstrate favorable accuracy, comparable to the entries from the KDD CUP 1999 data mining competition. Originality/value – The hybrid model of SOM and KAA can achieve significant results for intrusion detection.
Funding: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101023666.
Abstract: Model predictive control is a promising approach to reduce CO2 emissions in the building sector. However, the vast modeling effort hampers wide-scale practical application. Data-driven process models, like artificial neural networks, are well suited to automating the modeling. However, the underlying data set strongly determines the quality and reliability of artificial neural networks. In general, the validity domain of a machine learning model is limited to the data that was used to train it. Predictions based on system states outside that domain, so-called extrapolations, are unreliable and can negatively influence the control quality. We present a safe operation approach combined with online learning to deal with extrapolation in data-driven model predictive control. The k-nearest neighbor algorithm is used to detect extrapolation and to switch to a robust fallback controller. By continuously retraining the artificial neural networks during operation, we successively increase their validity domain and the control quality. We apply the approach to control a building energy system provided by the BOPTEST framework. We compare controllers based on two data sets, one with extensive system excitation and one with baseline operation. In baseline operation, the system is controlled to a fixed temperature set point. Therefore, the artificial neural networks trained on this data set tend to extrapolate at other operating points. We show that safe operation in combination with online learning significantly improves performance.
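The extrapolation check described above can be sketched as a distance test against the training data. This is only an illustration under stated assumptions: the mean distance to the k nearest training states is used as the score, and the names `knn_distance` and `choose_controller`, the threshold value, and the toy states are all hypothetical rather than taken from the paper.

```python
import math

def knn_distance(query, training_states, k=3):
    """Mean Euclidean distance from `query` to its k nearest training states."""
    dists = sorted(math.dist(query, s) for s in training_states)
    return sum(dists[:k]) / k

def choose_controller(query, training_states, threshold, k=3):
    """Use the data-driven MPC inside the validity domain, else the fallback."""
    if knn_distance(query, training_states, k) > threshold:
        return "fallback"
    return "mpc"

# Toy training data: (room temperature in degC, valve position), 20.0 to 21.0 degC
training_states = [(20.0 + 0.1 * i, 0.5) for i in range(11)]

inside = choose_controller((20.5, 0.5), training_states, threshold=0.5)
outside = choose_controller((25.0, 0.5), training_states, threshold=0.5)

# Online learning: states observed under the fallback controller are added
# to the data set, which widens the region where the MPC is trusted.
training_states += [(24.9, 0.5), (25.0, 0.5), (25.1, 0.5)]
after_update = choose_controller((25.0, 0.5), training_states, threshold=0.5)
```

In this toy setup the state at 25 degC is first handed to the fallback controller and is accepted by the data-driven controller only after nearby states have been added. In practice the features would need scaling before distances are compared.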
Funding: This work was supported by the National Natural Science Foundation of China (31872768, 32171044, and 32100827) and the Zhejiang University K.P. Chao's High Technology Development Foundation.
Abstract: Stimulus-specific adaptation (SSA), defined as a decrease in responses to a common stimulus that only partially generalizes to other rare stimuli, is a widespread phenomenon in the brain that is believed to be related to novelty detection. Although cross-modal sensory processing is also a widespread phenomenon, the interaction between the two phenomena is not well understood. In this study, the thalamic reticular nucleus (TRN), which is regarded as a hub of the attentional system and contains multi-modal neurons, was investigated. The results showed that SSA existed in an interactive oddball stimulation, which mimics stimulation changes from one modality to another. In bimodal integration, SSA to bimodal stimulation was stronger than to visual stimulation alone but similar to auditory stimulation alone, which indicated a limited integrative effect. Collectively, the present results provide evidence for independent cross-modal processing in bimodal TRN neurons.