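Text classification can help a psychological counseling dialogue system automatically identify a user's mental state, so that appropriate psychotherapy and mental health interventions can be delivered during the conversation; it therefore has good application prospects in psychology. On the recently proposed Emotional First Aid Dataset, a psychological counseling corpus, this paper constructs three multi-class text classification tasks in turn: type of distress, mental disorder, and tendency toward self-harm. It proposes a data preprocessing scheme for the corpus, studies the performance of six deep learning language models, including BERT and RoBERTa, on these tasks, and builds ensemble models that use these models as base learners. Experimental results show that XLNet, RoBERTa, and ERNIE perform notably well on several of the tasks, and that ensemble learning significantly improves the prediction accuracy of the classifiers, yielding good results overall.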
This paper reviews three main perspectives on chunk analysis: the traditional phraseological, psycholinguistic, and corpus linguistic perspectives. The traditional phraseological perspective focuses on the syntactic and semantic aspects of chunks, and its most important criteria for identifying and classifying chunks are compositionality and frozenness/fixedness. The psycholinguistic perspective focuses on the psychological salience of chunks, and its most important criterion for identifying a chunk is whether it is processed as a whole unit. The corpus linguistic perspective focuses on the frequencies of chunks and identifies them on the basis of frequency counts. All three perspectives have tapped into the phenomenon of multi-word combinations and yielded fruitful findings on the quantitative, syntactic, semantic, functional, and psychological features of chunk use; however, each has its pros and cons.