Radio frequency fingerprinting(RFF)is a remarkable lightweight authentication scheme to support rapid and scalable identification in the internet of things(IoT)systems.Deep learning(DL)is a critical enabler of RFF ide...Radio frequency fingerprinting(RFF)is a remarkable lightweight authentication scheme to support rapid and scalable identification in the internet of things(IoT)systems.Deep learning(DL)is a critical enabler of RFF identification by leveraging the hardware-level features.However,traditional supervised learning methods require huge labeled training samples.Therefore,how to establish a highperformance supervised learning model with few labels under practical application is still challenging.To address this issue,we in this paper propose a novel RFF semi-supervised learning(RFFSSL)model which can obtain a better performance with few meta labels.Specifically,the proposed RFFSSL model is constituted by a teacher-student network,in which the student network learns from the pseudo label predicted by the teacher.Then,the output of the student model will be exploited to improve the performance of teacher among the labeled data.Furthermore,a comprehensive evaluation on the accuracy is conducted.We derive about 50 GB real long-term evolution(LTE)mobile phone’s raw signal datasets,which is used to evaluate various models.Experimental results demonstrate that the proposed RFFSSL scheme can achieve up to 97%experimental testing accuracy over a noisy environment only with 10%labeled samples when training samples equal to 2700.展开更多
The limited labeled sample data in the field of advanced security threats detection seriously restricts the effective development of research work.Learning the sample labels from the labeled and unlabeled data has rec...The limited labeled sample data in the field of advanced security threats detection seriously restricts the effective development of research work.Learning the sample labels from the labeled and unlabeled data has received a lot of research attention and various universal labeling methods have been proposed.However,the labeling task of malicious communication samples targeted at advanced threats has to face the two practical challenges:the difficulty of extracting effective features in advance and the complexity of the actual sample types.To address these problems,we proposed a sample labeling method for malicious communication based on semi-supervised deep neural network.This method supports continuous learning and optimization feature representation while labeling sample,and can handle uncertain samples that are outside the concerned sample types.According to the experimental results,our proposed deep neural network can automatically learn effective feature representation,and the validity of features is close to or even higher than that of features which extracted based on expert knowledge.Furthermore,our proposed method can achieve the labeling accuracy of 97.64%~98.50%,which is more accurate than the train-then-detect,kNN and LPA methodsin any labeled-sample proportion condition.The problem of insufficient labeled samples in many network attack detecting scenarios,and our proposed work can function as a reference for the sample labeling tasks in the similar real-world scenarios.展开更多
Sparse coding and supervised dictionary learning have rapidly developed in recent years,and achieved impressive performance in image classification. However, there is usually a limited number of labeled training sampl...Sparse coding and supervised dictionary learning have rapidly developed in recent years,and achieved impressive performance in image classification. However, there is usually a limited number of labeled training samples and a huge amount of unlabeled data in practical image classification,which degrades the discrimination of the learned dictionary. How to effectively utilize unlabeled training data and explore the information hidden in unlabeled data has drawn much attention of researchers. In this paper, we propose a novel discriminative semisupervised dictionary learning method using label propagation(SSD-LP). Specifically, we utilize a label propagation algorithm based on class-specific reconstruction errors to accurately estimate the identities of unlabeled training samples, and develop an algorithm for optimizing the discriminative dictionary and discriminative coding vectors simultaneously.Extensive experiments on face recognition, digit recognition, and texture classification demonstrate the effectiveness of the proposed method.展开更多
This paper describes our approach for the Chinese clinical named entity recognition(CNER) task organized by the 2020 China Conference on Knowledge Graph and Semantic Computing(CCKS) competition. In this task, we need ...This paper describes our approach for the Chinese clinical named entity recognition(CNER) task organized by the 2020 China Conference on Knowledge Graph and Semantic Computing(CCKS) competition. In this task, we need to identify the entity boundary and category labels of six entities from Chinese electronic medical record(EMR). We constructed a hybrid system composed of a semi-supervised noisy label learning model based on adversarial training and a rule post-processing module. The core idea of the hybrid system is to reduce the impact of data noise by optimizing the model results. Besides, we used post-processing rules to correct three cases of redundant labeling, missing labeling, and wrong labeling in the model prediction results. Our method proposed in this paper achieved strict criteria of 0.9156 and relax criteria of 0.9660 on the final test set, ranking first.展开更多
针对双关语样本短缺问题,研究提出了基于伪标签和迁移学习的双关语识别模型(pun detection based on Pseudo-label and transfer learning)。该模型利用上下文语义、音素向量和注意力机制生成伪标签;然后,迁移学习和置信度结合挑选可用...针对双关语样本短缺问题,研究提出了基于伪标签和迁移学习的双关语识别模型(pun detection based on Pseudo-label and transfer learning)。该模型利用上下文语义、音素向量和注意力机制生成伪标签;然后,迁移学习和置信度结合挑选可用的伪标签;最后,将伪标签数据和真实数据混合到网络中进行训练,重复伪标签标记和混合训练过程。一定程度上解决了双关语样本量少且获取困难的问题。使用该模型在SemEval 2017 shared task 7以及Pun of the Day数据集上进行双关语检测实验,结果表明模型性能均优于现有主流双关语识别方法。展开更多
基金supported by Innovation Talents Promotion Program of Shaanxi Province,China(No.2021TD08)。
文摘Radio frequency fingerprinting(RFF)is a remarkable lightweight authentication scheme to support rapid and scalable identification in the internet of things(IoT)systems.Deep learning(DL)is a critical enabler of RFF identification by leveraging the hardware-level features.However,traditional supervised learning methods require huge labeled training samples.Therefore,how to establish a highperformance supervised learning model with few labels under practical application is still challenging.To address this issue,we in this paper propose a novel RFF semi-supervised learning(RFFSSL)model which can obtain a better performance with few meta labels.Specifically,the proposed RFFSSL model is constituted by a teacher-student network,in which the student network learns from the pseudo label predicted by the teacher.Then,the output of the student model will be exploited to improve the performance of teacher among the labeled data.Furthermore,a comprehensive evaluation on the accuracy is conducted.We derive about 50 GB real long-term evolution(LTE)mobile phone’s raw signal datasets,which is used to evaluate various models.Experimental results demonstrate that the proposed RFFSSL scheme can achieve up to 97%experimental testing accuracy over a noisy environment only with 10%labeled samples when training samples equal to 2700.
基金partially funded by the National Natural Science Foundation of China (Grant No. 61272447)National Entrepreneurship & Innovation Demonstration Base of China (Grant No. C700011)Key Research & Development Project of Sichuan Province of China (Grant No. 2018G20100)
文摘The limited labeled sample data in the field of advanced security threats detection seriously restricts the effective development of research work.Learning the sample labels from the labeled and unlabeled data has received a lot of research attention and various universal labeling methods have been proposed.However,the labeling task of malicious communication samples targeted at advanced threats has to face the two practical challenges:the difficulty of extracting effective features in advance and the complexity of the actual sample types.To address these problems,we proposed a sample labeling method for malicious communication based on semi-supervised deep neural network.This method supports continuous learning and optimization feature representation while labeling sample,and can handle uncertain samples that are outside the concerned sample types.According to the experimental results,our proposed deep neural network can automatically learn effective feature representation,and the validity of features is close to or even higher than that of features which extracted based on expert knowledge.Furthermore,our proposed method can achieve the labeling accuracy of 97.64%~98.50%,which is more accurate than the train-then-detect,kNN and LPA methodsin any labeled-sample proportion condition.The problem of insufficient labeled samples in many network attack detecting scenarios,and our proposed work can function as a reference for the sample labeling tasks in the similar real-world scenarios.
基金partially supported by the National Natural Science Foundation for Young Scientists of China(No.61402289)the National Science Foundation of Guangdong Province(No.2014A030313558)
文摘Sparse coding and supervised dictionary learning have rapidly developed in recent years,and achieved impressive performance in image classification. However, there is usually a limited number of labeled training samples and a huge amount of unlabeled data in practical image classification,which degrades the discrimination of the learned dictionary. How to effectively utilize unlabeled training data and explore the information hidden in unlabeled data has drawn much attention of researchers. In this paper, we propose a novel discriminative semisupervised dictionary learning method using label propagation(SSD-LP). Specifically, we utilize a label propagation algorithm based on class-specific reconstruction errors to accurately estimate the identities of unlabeled training samples, and develop an algorithm for optimizing the discriminative dictionary and discriminative coding vectors simultaneously.Extensive experiments on face recognition, digit recognition, and texture classification demonstrate the effectiveness of the proposed method.
基金This work is supported by the National Key R&D Program of China(2020AAA0106400)the National Natural Science Foundation of China(No.61831022,No.61806201)+1 种基金the Key Research Program of the Chinese Academy of Sciences(Grant No.ZDBS-SSW-JSC006)This work is also supported by Beijing Academy of Artificial Intelligence(BAAI).
文摘This paper describes our approach for the Chinese clinical named entity recognition(CNER) task organized by the 2020 China Conference on Knowledge Graph and Semantic Computing(CCKS) competition. In this task, we need to identify the entity boundary and category labels of six entities from Chinese electronic medical record(EMR). We constructed a hybrid system composed of a semi-supervised noisy label learning model based on adversarial training and a rule post-processing module. The core idea of the hybrid system is to reduce the impact of data noise by optimizing the model results. Besides, we used post-processing rules to correct three cases of redundant labeling, missing labeling, and wrong labeling in the model prediction results. Our method proposed in this paper achieved strict criteria of 0.9156 and relax criteria of 0.9660 on the final test set, ranking first.
文摘针对双关语样本短缺问题,研究提出了基于伪标签和迁移学习的双关语识别模型(pun detection based on Pseudo-label and transfer learning)。该模型利用上下文语义、音素向量和注意力机制生成伪标签;然后,迁移学习和置信度结合挑选可用的伪标签;最后,将伪标签数据和真实数据混合到网络中进行训练,重复伪标签标记和混合训练过程。一定程度上解决了双关语样本量少且获取困难的问题。使用该模型在SemEval 2017 shared task 7以及Pun of the Day数据集上进行双关语检测实验,结果表明模型性能均优于现有主流双关语识别方法。