Research on an Automatic Title Information Classification Method Based on the BTM Topic Model
Abstract Extracting keywords from titles and treating the title as the main text for ontology-based automatic classification, so that massive numbers of scientific papers can be classified efficiently and accurately, has become an important topic in library development. Exploiting the semantic associations among words within a text, this paper builds an automatic title information classification method based on the biterm topic model (BTM) that works at two levels of granularity: high-frequency words and latent topics. First, at the fine-grained level, word frequencies are counted to extract the domain's high-frequency words. Next, at the coarse-grained level, BTM analysis yields the topic keywords. The two sets are then merged, with duplicates removed, to obtain the domain's core word set. Finally, an SVM algorithm is used for text classification. The method effectively achieves rapid clustering and automatic association-based classification of knowledge, and provides users with more satisfying knowledge discovery and related extension services.
Authors LIU Aiqin (刘爱琴); DONG Jie (董婕); LIANG Yakun (梁雅琨) (School of Economics and Management, Taiyuan 030006, China; School of Management, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China; School of Chinese Language and Literature, Shanxi University, Taiyuan 030006, China)
Source Shanxi Library Journal (《晋图学刊》), 2023, No. 4, pp. 29-38 (10 pages)
Keywords title classification; biterm topic model (BTM); support vector machine (SVM) algorithm
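The abstract describes a four-step pipeline but gives no implementation details. The following is a minimal sketch, under stated assumptions, of three supporting steps only: counting high-frequency words, extracting the word pairs (biterms) that a BTM is trained on, and merging two keyword lists with duplicates removed. The sample titles, the function names, the `top_k` parameter, and the placeholder `topic_words` list are all hypothetical and are not taken from the paper. BTM inference and SVM training are deliberately left out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, pre-tokenized titles (assumption: word segmentation is already done).
titles = [
    ["topic", "model", "text", "classification"],
    ["svm", "text", "classification"],
    ["library", "title", "classification"],
]

def high_frequency_words(docs, top_k):
    """Fine-grained step: return the top_k most frequent words in the corpus."""
    counts = Counter(word for doc in docs for word in doc)
    return [word for word, _ in counts.most_common(top_k)]

def extract_biterms(doc):
    """Return every unordered word pair co-occurring in one short text.
    Such pairs are the observations a biterm topic model is fitted on."""
    return [tuple(sorted(pair)) for pair in combinations(doc, 2)]

def merge_core_words(freq_words, topic_words):
    """Merge the two keyword lists, dropping duplicates, keeping first-seen order."""
    return list(dict.fromkeys(freq_words + topic_words))

def to_feature_vector(doc, core_words):
    """Binary presence vector over the core word set, usable as classifier input."""
    present = set(doc)
    return [1 if word in present else 0 for word in core_words]

freq_words = high_frequency_words(titles, top_k=2)
biterms = [b for doc in titles for b in extract_biterms(doc)]

# Placeholder (assumption): in the described method these keywords would come
# from BTM inference over `biterms`; here they are simply written by hand.
topic_words = ["topic", "text", "library"]

core_words = merge_core_words(freq_words, topic_words)
vectors = [to_feature_vector(doc, core_words) for doc in titles]

print(core_words)
print(vectors)
```

The resulting vectors could then be passed, together with class labels, to any SVM implementation (for example scikit-learn's `LinearSVC`). The binary encoding is chosen here only for simplicity; the paper's actual feature weighting is not stated in the abstract.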