5 articles found
1. Exploiting Unlabeled Data for Neural Grammatical Error Detection (cited by: 3)
Authors: Zhuo-Ran Liu, Yang Liu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2017, Issue 4, pp. 758-767 (10 pages)
Identifying and correcting grammatical errors in the text written by non-native writers have received increasing attention in recent years. Although a number of annotated corpora have been established to facilitate data-driven grammatical error detection and correction approaches, they are still limited in terms of quantity and coverage because human annotation is labor-intensive, time-consuming, and expensive. In this work, we propose to utilize unlabeled data to train neural network based grammatical error detection models. The basic idea is to cast error detection as a binary classification problem and derive positive and negative training examples from unlabeled data. We introduce an attention-based neural network to capture long-distance dependencies that influence the word being detected. Experiments show that the proposed approach significantly outperforms SVM and convolutional networks with fixed-size context window.
Keywords: unlabeled data; grammatical error detection; neural network
2. Efficient poisoning attacks and defenses for unlabeled data in DDoS prediction of intelligent transportation systems (cited by: 1)
Authors: Zhong Li, Xianke Wu, Changjun Jiang. Security and Safety, 2022, Issue 1, pp. 145-165 (21 pages)
Nowadays, large numbers of smart sensors (e.g., road-side cameras) which communicate with nearby base stations could launch distributed denial of service (DDoS) attack storms in intelligent transportation systems. DDoS attacks disable the services provided by base stations. Thus, in this paper, considering uneven communication traffic flows and privacy preservation, we give a hidden Markov model-based prediction model that utilizes the multi-step characteristic of DDoS with a federated learning framework to predict whether DDoS attacks will happen on base stations in the future. However, in federated learning, we need to consider the problem of poisoning attacks due to malicious participants. Without security protection, poisoning attacks will paralyze intelligent transportation systems. Traditional poisoning attacks mainly apply to classification models with labeled data. In this paper, we propose a reinforcement learning-based poisoning method specifically for poisoning the prediction model with unlabeled data. Besides, previous related defense strategies rely on validation datasets with labeled data in the server. However, this is unrealistic, since the local training datasets are not uploaded to the server for privacy reasons, and our datasets are also unlabeled. Furthermore, we give a validation dataset-free defense strategy based on Dempster-Shafer (D-S) evidence theory that avoids anomaly aggregation to obtain a robust global model for precise DDoS prediction. In our experiments, we simulate 3000 points in combination with the DARPA2000 dataset to carry out evaluations. The results indicate that our poisoning method can successfully poison the global prediction model with unlabeled data in a short time. Meanwhile, we compare our proposed defense algorithm with three popularly used defense algorithms. The results show that our defense method has a high accuracy rate of excluding poisoners and can obtain a high attack prediction probability.
Keywords: poisoning attacks; defenses; multi-step DDoS prediction; unlabeled data; intelligent transportation systems
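The abstract above names Dempster-Shafer (D-S) evidence theory as the basis of the defense but does not say how participant updates are turned into evidence. As a rough illustration only, the sketch below implements the standard Dempster rule of combination for two mass functions. The frame `{"honest", "poisoner"}`, the function name `dempster_combine`, and the example masses are assumptions made for this sketch and are not taken from the paper.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Products of masses are assigned to the intersection of the two
    focal sets; mass landing on the empty set is the conflict K, and
    the remaining masses are renormalised by 1 - K.
    """
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical frame of discernment for one federated-learning participant.
HONEST = frozenset({"honest"})
POISONER = frozenset({"poisoner"})
UNKNOWN = HONEST | POISONER  # mass here expresses ignorance

# Two made-up evidence sources about the same participant.
source_1 = {HONEST: 0.6, POISONER: 0.1, UNKNOWN: 0.3}
source_2 = {HONEST: 0.5, POISONER: 0.2, UNKNOWN: 0.3}
fused = dempster_combine(source_1, source_2)
```

In a defense of this kind, the fused mass on the poisoner hypothesis could serve as a score for excluding a participant from aggregation, but that use is a guess about the design rather than something the abstract states.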
3. Iterative Semi-Supervised Learning Using Softmax Probability (cited by: 1)
Authors: Heewon Chung, Jinseok Lee. Computers, Materials & Continua (SCIE, EI), 2022, Issue 9, pp. 5607-5628 (22 pages)
For classification problems in practice, one of the challenging issues is obtaining enough labeled data for training. Moreover, even if such labeled data has been sufficiently accumulated, most datasets exhibit a long-tailed distribution with heavy class imbalance, which biases the model towards the majority class. To alleviate such class imbalance, semi-supervised learning methods using additional unlabeled data have been considered. However, as a matter of course, the accuracy is much lower than that from supervised learning. In this study, under the assumption that additional unlabeled data is available, we propose iterative semi-supervised learning algorithms, which iteratively correct the labeling of the extra unlabeled data based on softmax probabilities. The results show that the proposed algorithms provide accuracy as high as that from supervised learning. To validate the proposed algorithms, we tested two scenarios: with a balanced unlabeled dataset and with an imbalanced unlabeled dataset. Under both scenarios, our proposed semi-supervised learning algorithms provided higher accuracy than previous state-of-the-art methods. Code is available at https://github.com/HeewonChung92/iterative-semi-learning.
Keywords: semi-supervised learning; class imbalance; iterative learning; unlabeled data
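The idea of iteratively relabeling unlabeled data from softmax probabilities can be illustrated with a generic pseudo-labeling loop. This is a minimal sketch and not the authors' algorithm (their code is at the URL given in the abstract). The toy one-dimensional nearest-centroid "model", the confidence threshold of 0.9, and all function names are assumptions chosen to keep the example self-contained.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_centroids(xs, ys, n_classes):
    """Toy stand-in for training: one mean per class (1-D features).

    Assumes every class has at least one example.
    """
    centroids = []
    for cls in range(n_classes):
        members = [x for x, y in zip(xs, ys) if y == cls]
        centroids.append(sum(members) / len(members))
    return centroids


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each class centroid."""
    return softmax([-(x - c) ** 2 for c in centroids])


def iterative_pseudo_label(labeled_x, labeled_y, unlabeled_x,
                           n_classes=2, threshold=0.9, rounds=5):
    """Relabel the unlabeled pool each round and retrain on confident labels."""
    centroids = fit_centroids(labeled_x, labeled_y, n_classes)
    pseudo = {}  # index into unlabeled_x -> current pseudo label
    for _ in range(rounds):
        new_pseudo = {}
        for i, x in enumerate(unlabeled_x):
            probs = predict_proba(centroids, x)
            best = max(range(n_classes), key=probs.__getitem__)
            if probs[best] >= threshold:  # keep only confident predictions
                new_pseudo[i] = best
        if new_pseudo == pseudo:  # labels stopped changing
            break
        pseudo = new_pseudo
        xs = labeled_x + [unlabeled_x[i] for i in pseudo]
        ys = labeled_y + [pseudo[i] for i in pseudo]
        centroids = fit_centroids(xs, ys, n_classes)
    return centroids, pseudo


centroids, pseudo = iterative_pseudo_label(
    labeled_x=[0.0, 4.0], labeled_y=[0, 1],
    unlabeled_x=[0.2, 0.5, 3.6, 3.9, 2.0],
)
```

Because every round recomputes the pseudo labels from scratch, a label assigned early can be changed or dropped later, which is one simple way to read "iteratively correct the labeling". The point midway between the two classes (2.0) never reaches the threshold and stays unlabeled.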
4. Research and Implementation of Unsupervised Clustering-Based Intrusion Detection
Authors: Luo Min, Zhang Huan-guo, Wang Li-na (School of Computer, Wuhan University, Wuhan 430072, Hubei, China). Wuhan University Journal of Natural Sciences (CAS), 2003, Issue 03A, pp. 803-807 (5 pages)
An unsupervised clustering-based intrusion detection algorithm is discussed in this paper. The basic idea of the algorithm is to form clusters by comparing the distances between instances in unlabeled training data sets. With the clustered data instances, anomalous clusters can be easily identified by the normal cluster ratio, and the identified clusters can be used in real data detection. The benefit of the algorithm is that it does not need labeled training data sets. Experiments on the KDD99 data sets show that this approach can efficiently detect unknown intrusions in real network connections.
Keywords: intrusion detection; data mining; unsupervised clustering; unlabeled data
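The pipeline described in this abstract (cluster unlabeled data by distance, mark clusters as normal or anomalous by a ratio, then classify new data by its nearest cluster) might look roughly like the sketch below. The abstract does not specify the clustering procedure or define the "normal cluster ratio" precisely, so the single-pass fixed-width clustering, the reading of the ratio as the fraction of largest clusters treated as normal, and all names and parameter values here are assumptions.

```python
import math


def cluster_by_distance(points, width):
    """Single-pass clustering: join the nearest cluster if it lies within
    `width` (Euclidean distance to its center), otherwise start a new one."""
    clusters = []  # each cluster: {"center": [...], "members": [...]}
    for point in points:
        nearest, nearest_dist = None, None
        for cluster in clusters:
            dist = math.dist(point, cluster["center"])
            if nearest_dist is None or dist < nearest_dist:
                nearest, nearest_dist = cluster, dist
        if nearest is not None and nearest_dist <= width:
            nearest["members"].append(point)
            count = len(nearest["members"])
            # Recompute the center as the mean of all members.
            nearest["center"] = [sum(axis) / count for axis in zip(*nearest["members"])]
        else:
            clusters.append({"center": list(point), "members": [point]})
    return clusters


def label_clusters(clusters, normal_ratio):
    """Assumed interpretation: the largest `normal_ratio` share of clusters
    is labeled normal, the rest anomalous (attacks are taken to be rare)."""
    ordered = sorted(clusters, key=lambda c: len(c["members"]), reverse=True)
    n_normal = max(1, round(normal_ratio * len(ordered)))
    for rank, cluster in enumerate(ordered):
        cluster["label"] = "normal" if rank < n_normal else "anomaly"
    return ordered


def detect(point, labeled_clusters):
    """Classify a new instance by the label of its nearest cluster center."""
    nearest = min(labeled_clusters, key=lambda c: math.dist(point, c["center"]))
    return nearest["label"]


# Made-up 2-D feature vectors: four similar connections and one outlier.
training = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
model = label_clusters(cluster_by_distance(training, width=1.0), normal_ratio=0.5)
```

The result of single-pass clustering depends on the order of the input and on the choice of `width`, so real use on data such as KDD99 would need feature normalization and parameter tuning that this sketch leaves out.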
5. Learning to select pseudo labels: a semi-supervised method for named entity recognition (cited by: 2)
Authors: Zhen-zhen LI, Da-wei FENG, Dong-sheng LI, Xi-cheng LU. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2020, Issue 6, pp. 903-916 (14 pages)
Deep learning models have achieved state-of-the-art performance in named entity recognition (NER); the good performance, however, relies heavily on substantial amounts of labeled data. In some specific areas such as medical, financial, and military domains, labeled data is very scarce, while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but a large amount of entity information in unlabeled data is neglected, which may be beneficial to the NER task. In this study, we propose a semi-supervised method for NER tasks, which learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework includes three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and creating new labeled data and improving the NER model iteratively. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves comparable performance to those state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.
Keywords: named entity recognition; unlabeled data; deep learning; semi-supervised method