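Text classification can help psychological counseling dialogue systems automatically identify a user's mental state, so that appropriate psychotherapy and mental health interventions can be delivered during a conversation; it has promising applications in psychology. Using the recently proposed Emotional First Aid Dataset, a psychological counseling corpus, this paper constructs three multi-class text classification tasks in turn: type of distress, mental disorder, and tendency toward bodily harm. It proposes a data preprocessing scheme for the corpus, evaluates six deep learning language models, including BERT and RoBERTa, on these tasks, and builds ensemble models that use these models as base learners. Experimental results show that XLNet, RoBERTa, and ERNIE perform notably well on several tasks, and that ensemble learning significantly improves the prediction accuracy of the classifiers, giving good results overall.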
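The abstract does not say how the base learners are combined. As a minimal sketch, assuming hard majority voting over the labels predicted by each fine-tuned model (the function name `majority_vote` and the sample labels are hypothetical, not taken from the paper), the ensemble step could look like this:

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by hard majority voting.

    predictions[m][i] is the label that model m assigns to sample i.
    Ties go to the tied label predicted by the earliest model in the list.
    This is an assumed combination rule, not the paper's stated method.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        # Collect every model's vote for sample i, in model order.
        votes = [model_preds[i] for model_preds in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in model order) that reaches the top count.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical outputs of three base learners on three samples.
bert_preds = ["work", "family", "study"]
roberta_preds = ["work", "study", "study"]
xlnet_preds = ["family", "family", "work"]
ensemble_preds = majority_vote([bert_preds, roberta_preds, xlnet_preds])
```

Soft voting over averaged class probabilities, or a stacked meta-classifier, would be equally plausible readings of "ensemble model"; hard voting is used here only because it needs nothing beyond the predicted labels.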
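How thoroughly domain knowledge is mined and exploited directly affects the performance of domain-specific word sense disambiguation (WSD). This paper proposes a graph-based WSD method built on domain knowledge. The method mines domain knowledge thoroughly: it collects domain-related words for the target domain as text domain knowledge, and obtains sense domain labels for each sense of the target ambiguous word as sense domain knowledge. It builds a disambiguation graph from the domain-related words and the context words of the sentence, and adjusts the graph according to the sense domain knowledge. An improved graph scoring method then scores the importance of each sense node in the disambiguation graph in order to select the correct sense. The method integrates domain knowledge into the graph model effectively and, on the Koeling dataset, achieves the best disambiguation results among comparable studies. The paper also improves several graph scoring methods and reports detailed comparative experiments.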
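The abstract does not name the graph scoring method it improves. As a rough sketch only, the scoring step can be illustrated with plain weighted PageRank, which is swapped in here as a common choice for graph-based WSD and is not the paper's improved method. The graph, its edge weights, and the node names are all hypothetical.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on an undirected graph.

    graph maps each node to a dict {neighbor: weight}; edges must be listed
    symmetrically. Returns a dict {node: score}.
    """
    nodes = list(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_score = {}
        for v in nodes:
            incoming = 0.0
            for u, weight in graph[v].items():
                # Node u spreads its score over its edges in proportion to weight.
                incoming += score[u] * weight / sum(graph[u].values())
            new_score[v] = (1.0 - damping) / n + damping * incoming
        score = new_score
    return score


def add_edge(graph, a, b, weight=1.0):
    """Insert an undirected weighted edge into the adjacency dict."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


# Hypothetical disambiguation graph for the ambiguous word "bank":
# two sense nodes linked to context words and domain-related words.
g = {}
add_edge(g, "bank#finance", "loan")
add_edge(g, "bank#finance", "interest")
add_edge(g, "bank#finance", "deposit")
add_edge(g, "bank#river", "deposit")
add_edge(g, "loan", "interest")

scores = pagerank(g)
sense_nodes = ["bank#finance", "bank#river"]
best_sense = max(sense_nodes, key=lambda s: scores[s])
```

In this sketch, the adjustment by sense domain knowledge that the abstract describes would correspond to changing edge weights or removing edges before scoring; how the paper actually does this is not stated in the abstract.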
This paper proposes an approach for automatically constructing an effective domain ontology. The main idea is to use Formal Concept Analysis to establish the domain ontology automatically. The ontology then serves as the basis for a Naive Bayes classifier in order to demonstrate its effectiveness for document classification. A total of 1752 documents divided into 10 categories are used to assess the ontology, with 1252 documents for training and 500 for testing. The F1-measure is the assessment criterion, and three results are obtained. The average recall of the Naive Bayes classifier is 0.94, so its recall based on the automatically constructed ontology is excellent. The average precision is 0.81, so its precision is good. The average F1-measure over the 10 categories is 0.86, so the classifier is effective in terms of F1-measure. Thus, the automatically constructed domain ontology can indeed serve as the document categories for effective document classification.
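The averages reported above can be read as per-category scores averaged over the 10 categories. A minimal sketch of that computation, assuming unweighted (macro) averaging, which the abstract does not confirm, with hypothetical function names and toy labels:

```python
def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    result = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Guard against empty denominators by scoring them as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        result[c] = (precision, recall, f1)
    return result


def macro_average(per_class):
    """Unweighted mean of precision, recall, and F1 across classes."""
    n = len(per_class)
    return tuple(sum(v[i] for v in per_class.values()) / n for i in range(3))


# Toy example with two categories (not data from the paper).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
macro_p, macro_r, macro_f1 = macro_average(per_class_prf(y_true, y_pred))
```

Because F1 is computed per category before averaging, the averaged F1 need not equal the harmonic mean of the averaged precision and averaged recall.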