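Abstract: Nuclear power equipment quality texts describe quality defects and related problems that arise during the design, procurement, construction, and commissioning stages of nuclear power equipment. Quality events occur at different frequencies in different stages, and quality texts for the same equipment at different stages share keywords and similar phrasing, so the classification task suffers from class imbalance and semantically coupled descriptions. To address this problem, an improved recurrent pooling network classification model incorporating a regularized feedback focal loss function is proposed. First, BERT (Bidirectional Encoder Representations from Transformers) is used to convert nuclear power equipment quality texts into word vectors. Next, an improved three-layer recurrent pooling network structure is proposed; by adding an intermediate layer and selecting suitable weights, it enlarges the extraction space for parameter training and improves the ability to represent the semantic features of quality defects. Then, a regularized feedback focal loss function is proposed to train the parameters of the classification model: the regularization term makes the gradient of the loss function change more steadily, and the feedback term iteratively adjusts the loss function according to the error between the true and predicted values, which addresses the problem of biased gradients when training on imbalanced samples. Finally, the normalized exponential (softmax) function is used to compute the stage to which each nuclear power equipment quality event belongs. On a real dataset from a nuclear power company and on a public dataset, the proposed model improves the F1 score by 2 and 1 percentage points, respectively, compared with the Fast_Text network. The experimental results indicate that the model achieves high accuracy on text classification tasks.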
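The abstract does not give the form of the regularization and feedback terms, so they are not reproduced here. As a rough illustration of the baseline the method builds on, the following sketch shows the standard focal loss applied to softmax outputs over four stages. The logits, the `gamma` and `alpha` parameters, and the `STAGES` list are illustrative assumptions, not values from the paper.

```python
import math

def softmax(logits):
    """Normalized exponential function: maps raw scores to probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative hyperparameters, not values from the paper.
    The paper's regularization and feedback terms are NOT included.
    """
    probs = softmax(logits)
    p_t = probs[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical stage labels and scores for a single quality text.
STAGES = ["design", "procurement", "construction", "commissioning"]
logits = [2.0, 0.5, 0.1, -1.0]

probs = softmax(logits)
predicted_stage = STAGES[probs.index(max(probs))]

easy = focal_loss(logits, target=0)  # confidently correct sample
hard = focal_loss(logits, target=3)  # sample the model gets badly wrong
cross_entropy_easy = focal_loss(logits, target=0, gamma=0.0)  # gamma=0 gives cross-entropy
```

With `gamma > 0`, the confidently classified sample contributes less than it would under plain cross-entropy, which is how focal loss shifts training effort toward hard, typically minority-class, samples.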
Funding: Supported by the National Key Research and Development Program of China (2018YFB1600600), the National Natural Science Foundation of China (61976034, U1808206), and the Dalian Science and Technology Innovation Fund (2019J12GX035).
Abstract: Data generated from non-Euclidean domains, and applications of its graph representation (with complex relationships and interdependence between objects), have grown exponentially. The sophistication of graph data poses substantial obstacles to existing machine learning algorithms. In this study, we consider a revamped version of a semi-supervised learning algorithm for graph-structured data to address the issue of extending deep learning approaches to represent graph data. Additionally, quantum information theory is applied through Graph Neural Networks (GNNs) to generate Riemannian metrics in closed form for several graph layers. Furthermore, to pre-process the adjacency matrix of graphs, a new formulation is established to incorporate high-order proximities. The proposed scheme shows marked improvements in overcoming the deficiencies of the Graph Convolutional Network (GCN), particularly information loss and imprecise information representation, with acceptable computational overhead. Moreover, the proposed Quantum Graph Convolutional Network (QGCN) significantly strengthens the GCN on semi-supervised node classification tasks. In parallel, it markedly improves generalization by making small random perturbations of the graph during training. Evaluation results on three benchmark datasets, Citeseer, Cora, and PubMed, show the superiority of the proposed model in terms of computational accuracy over the state-of-the-art GCN and three other methods based on the same algorithms in the existing literature.
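The abstract does not state the new adjacency formulation. As a hedged illustration of what "incorporating high-order proximities" can mean, the sketch below combines the standard GCN symmetric normalization, D^(-1/2) (A + I) D^(-1/2), with a hypothetical weighted sum of adjacency powers. The function `high_order_adjacency` and its `weights` are assumptions made for this example and are not the QGCN formula.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def high_order_adjacency(adj, weights=(1.0, 0.5)):
    """Hypothetical mix: weights[0]*A + weights[1]*A^2 + ... (not the paper's formula)."""
    n = len(adj)
    mixed = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # starts at A^1
    for idx, w in enumerate(weights):
        if idx > 0:
            power = matmul(power, adj)  # advance to the next power of A
        for i in range(n):
            for j in range(n):
                mixed[i][j] += w * power[i][j]
    return mixed

def normalize_adjacency(adj):
    """Standard GCN preprocessing: D^(-1/2) (A + I) D^(-1/2)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

# Path graph 0 - 1 - 2: nodes 0 and 2 are two hops apart.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]

first_order = normalize_adjacency(adj)
second_order = normalize_adjacency(high_order_adjacency(adj))
```

In `first_order`, nodes 0 and 2 share no weight because they are not directly connected; in `second_order`, the two-hop path through node 1 gives them a nonzero entry, so one propagation step already mixes information between them.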
Funding: Sponsored by the National Natural Science Foundation Projects (Grant Nos. 60773070, 60736044).
Abstract: To address the poor text classification performance obtained with the traditional mutual information (MI) formula, a feature selection algorithm based on improved mutual information is proposed. The improved algorithm builds on earlier improved MI methods, which enhance the MI value of negative features and account for feature frequency, and adds the concepts of concentration degree and dispersion degree. Formulas embodying concentration degree and dispersion degree are constructed, and the improved mutual information is implemented on this basis. The feature selection algorithm is applied to a text classifier based on Biomimetic Pattern Recognition and compared with several other feature selection methods. The experimental results show that the improved mutual information feature selection method greatly improves performance over traditional mutual information feature selection methods and outperforms information gain. By introducing the concepts of concentration degree and dispersion degree, the improved method greatly improves the performance of the text classification system.
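For context, the traditional MI score that the abstract refers to can be sketched as follows. This is the textbook pointwise form, MI(t, c) = log(P(t, c) / (P(t) P(c))), estimated from document counts; the concentration-degree and dispersion-degree formulas are not given in the abstract and are not reproduced. The toy corpus and the function names are assumptions made for this example.

```python
import math

def mutual_information(n_tc, n_t, n_c, n_docs):
    """Traditional MI from document counts: log(P(t,c) / (P(t) * P(c)))."""
    if n_tc == 0:
        return float("-inf")  # term never co-occurs with the class
    return math.log((n_tc * n_docs) / (n_t * n_c))

def mi_scores(docs, target_class):
    """Score every term against one class; docs is a list of (term_set, label)."""
    n_docs = len(docs)
    n_c = sum(1 for _, label in docs if label == target_class)
    vocabulary = set().union(*(terms for terms, _ in docs))
    scores = {}
    for term in vocabulary:
        n_t = sum(1 for terms, _ in docs if term in terms)
        n_tc = sum(1 for terms, label in docs
                   if term in terms and label == target_class)
        scores[term] = mutual_information(n_tc, n_t, n_c, n_docs)
    return scores

# Hypothetical toy corpus of four labeled documents.
docs = [
    ({"goal", "match", "team"}, "sports"),
    ({"team", "coach", "goal"}, "sports"),
    ({"market", "stock", "team"}, "finance"),
    ({"stock", "bank", "market"}, "finance"),
]

scores = mi_scores(docs, "sports")
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy corpus, "match" appears in a single sports document yet scores as high as "goal", which appears in both. This insensitivity to term frequency is a commonly cited weakness of the traditional formula.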
Funding: This work was supported by the Ministry of Public Security Technology Research Program [Grant No. 2020JSYJC22ok], the Fundamental Research Funds for the Central Universities (No. 2021JKF215), the Open Research Fund of the Public Security Behavioral Science Laboratory, People's Public Security University of China (2020SYS03), and Police and people build/share a smart community (PJ13-201912-0525).
Abstract: With the explosive growth of Internet text information, the task of text classification has become more important. As a part of text classification, Chinese news text classification also plays an important role. In public security work, public opinion news classification is an important topic. Effective and accurate classification of public opinion news is a necessary prerequisite for relevant departments to grasp the state of public opinion and control its trend in time. This paper introduces a combined-convolutional neural network text classification model based on word2vec and improved TF-IDF. First, word vectors are trained with a word2vec model. Then, the weight of each word is calculated using an improved TF-IDF algorithm based on class frequency variance, and the word vectors and weights are combined to construct the text vector representation. Finally, the combined-convolutional neural network is trained and tested on the THUCNews dataset. The results show that this model classifies better than the traditional Text-RNN model, the traditional Text-CNN model, and the word2vec-CNN model. The test accuracy is 97.56%, the accuracy rate is 97%, the recall rate is 97%, and the F1-score is 97%.
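The weighting step can be sketched roughly as follows. This example uses plain TF-IDF rather than the class-frequency-variance variant, whose formula the abstract does not give, and it assumes that "combining" means scaling each word vector by its weight. The corpus, the two-dimensional vectors, and the function names are hypothetical stand-ins for a trained word2vec model.

```python
import math
from collections import Counter

def tfidf_weights(doc_tokens, corpus):
    """Plain TF-IDF per term: (count / doc length) * log(N / document frequency)."""
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    weights = {}
    for term, count in counts.items():
        df = sum(1 for other in corpus if term in other)
        weights[term] = (count / len(doc_tokens)) * math.log(n_docs / df)
    return weights

def weighted_text_matrix(doc_tokens, weights, vectors):
    """Scale each word vector by its weight (an assumed way to combine them)."""
    return [[weights[token] * x for x in vectors[token]]
            for token in doc_tokens if token in vectors]

# Hypothetical tokenized corpus; the document being encoded is corpus[0].
corpus = [
    ["police", "report", "case"],
    ["market", "report", "stock"],
    ["police", "report", "alert"],
]

# Toy 2-dimensional vectors standing in for word2vec embeddings.
vectors = {
    "police": [1.0, 0.0],
    "report": [0.5, 0.5],
    "case": [0.0, 1.0],
}

weights = tfidf_weights(corpus[0], corpus)
matrix = weighted_text_matrix(corpus[0], weights, vectors)
```

Here "report" occurs in every document, so its weight is zero and its row in `matrix` vanishes, while the rarer "case" is emphasized over "police". The resulting matrix, one weighted row per token, is the kind of input a convolutional text classifier would consume.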
Funding: Supported by the National High-Technology Research and Development Program (2005AA122210) and the National Outstanding Youth Foundation (60325104).
Abstract: A Web voice browser based on improved synchronous linear predictive coding (ISLPC), a Text-to-Speech (TTS) algorithm, and Internet applications is proposed. The paper analyzes the features of a TTS system with ISLPC speech synthesis and discusses the design and implementation of an ISLPC TTS-based Web voice browser. The browser integrates Web technology, Chinese information processing, artificial intelligence, and the key technology of Chinese ISLPC speech synthesis. It is a visual and audible web browser that can improve information precision for network users. The evaluation results show that the ISLPC-based TTS model performs better than other browsers in voice quality and in its capability of identifying Chinese characters.