
Keyphrase Extraction Using Sequential Patterns Mining Algorithm with One-Off and General Gaps Condition (cited by: 3)
Abstract This paper proposes KEING (Keyphrase Extraction using sequentIal patterns with oNe-off and General gaps condition), a supervised keyphrase extraction algorithm. First, each document is treated as a sequence database, and the SPING (Sequential Patterns mIning with oNe-off and General gaps condition) algorithm is used to capture the relations between words and the various forms those relations take; candidate keyphrases are then described by statistical pattern features. Next, a naive Bayes classifier is trained on a large amount of labeled training data. Finally, the classifier identifies keyphrases in test documents. Experiments verify the completeness of SPING and the effectiveness of KEING.

Keyphrases summarize a document, and high-quality keyphrases are important for text summarization, reading, and indexing. However, most studies of keyphrase extraction impose strict limits on the form of patterns and fail to capture the semantic relations between words and phrases. As a result, they cannot extract keyphrases autonomously. This paper proposes KEING, a keyphrase extraction algorithm based on sequential patterns mining with one-off and general gaps condition. By taking the one-off condition and general gaps into account, SPING (Sequential Patterns mIning with oNe-off and General gaps condition) can capture semantic relations between words and phrases more effectively. KEING therefore obtains effective candidate keyphrases and computes their features. A supervised machine learning method is then trained on these features to construct a classification model, which is used to extract keyphrases. Experimental results demonstrate that KEING can effectively extract high-quality keyphrases.
Authors LIU Hui-ting (刘慧婷); LIU Zhi-zhong (刘志中); WANG Li-li (王利利); WU Xin-dong (吴信东) (Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education, Anhui University, Hefei, Anhui 230601, China; School of Computer Science and Technology, Anhui University, Hefei, Anhui 230601, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui 230601, China)
Source Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2019, No. 5, pp. 1121-1128 (8 pages)
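Funding National Key Research and Development Program of China (No. 2016YFB1000901); National Natural Science Foundation of China (No. 61202227); Natural Science Research Project of Anhui Higher Education Institutions (No. KJ2018A0013)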
Keywords general gap; sequential patterns mining; keyphrase extraction; machine learning
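
The abstract names two constraints, a gap condition and a one-off condition, without giving the algorithm itself. The sketch below is not SPING; it only illustrates one common reading of those constraints under stated assumptions: a gap is the number of words skipped between consecutive pattern items and must lie in a non-negative range `[min_gap, max_gap]`, and "one-off" means each position of the word sequence may be used by at most one counted occurrence. The function names `find_occurrence` and `one_off_support` are hypothetical, and the leftmost-first greedy counting is a simple heuristic that may undercount compared with an exact method.

```python
from typing import List, Optional, Sequence, Set


def find_occurrence(seq: Sequence[str], pattern: Sequence[str],
                    min_gap: int, max_gap: int,
                    used: Set[int]) -> Optional[List[int]]:
    """Return the positions of one occurrence of `pattern` in `seq` that
    avoids every position in `used`, or None if there is none.
    Assumption: between consecutive matched positions, the number of
    skipped words must lie in [min_gap, max_gap] (non-negative gaps only)."""

    def extend(prefix: List[int]) -> Optional[List[int]]:
        if len(prefix) == len(pattern):
            return prefix
        item = pattern[len(prefix)]
        if prefix:
            lo = prefix[-1] + 1 + min_gap
            hi = min(prefix[-1] + 1 + max_gap, len(seq) - 1)
        else:
            lo, hi = 0, len(seq) - 1
        # Try candidate positions left to right, backtracking on failure.
        for pos in range(lo, hi + 1):
            if pos not in used and seq[pos] == item:
                found = extend(prefix + [pos])
                if found is not None:
                    return found
        return None

    return extend([])


def one_off_support(seq: Sequence[str], pattern: Sequence[str],
                    min_gap: int, max_gap: int) -> int:
    """Greedily count occurrences of `pattern` so that no sequence
    position is shared by two counted occurrences (one-off condition)."""
    if not pattern:
        return 0
    used: Set[int] = set()
    count = 0
    while True:
        occ = find_occurrence(seq, pattern, min_gap, max_gap, used)
        if occ is None:
            return count
        used.update(occ)  # these positions cannot be reused
        count += 1
```

For example, in the word sequence `data mining of data stream mining`, the pattern `(data, mining)` with gaps in `[0, 1]` is counted twice, because the second occurrence skips the single word `stream`; with gaps restricted to `[0, 0]` only the adjacent occurrence is counted. A count of this kind is one plausible ingredient of the statistical pattern features that the abstract says are passed to the classifier.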