Research on a Method for Extracting New Words from Chinese Text Based on an Improved PrefixSpan Algorithm (cited by: 2)
Abstract: This paper applies the sequential pattern mining algorithm PrefixSpan to the extraction of new words from Chinese text. Used unchanged, PrefixSpan produces sequential patterns that are not contiguous, and the mined patterns often contain one another. The paper modifies the algorithm to address these problems and combines semantic features with statistics to extract new words effectively from a Chinese corpus. Experimental results show that the method identifies new words in specialized domains with high accuracy.
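The abstract names two problems, non-contiguous patterns and patterns that contain one another, without giving the details of the fix. The following is only a minimal sketch of one way such constraints could be imposed, not the authors' implementation. It assumes that each sentence is treated as a sequence of characters, that support is the number of distinct sentences containing a pattern, and that a pattern is dropped when a longer pattern with the same support contains it. The function names `mine_contiguous_patterns` and `drop_contained` and the parameters `min_support` and `max_len` are hypothetical, and the semantic-feature filtering mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def mine_contiguous_patterns(sentences, min_support=2, max_len=6):
    """PrefixSpan-style pattern growth restricted to contiguous character runs.

    A projection is a list of (sentence_index, next_position) pairs, one per
    occurrence of the current prefix. Extending a prefix only looks at the
    character immediately after each occurrence, so patterns stay contiguous.
    Support is the number of distinct sentences (an assumption of this sketch).
    """
    patterns = {}

    def grow(prefix, projection):
        # Group the possible one-character extensions of the prefix.
        extensions = defaultdict(list)
        for sid, pos in projection:
            if pos < len(sentences[sid]):
                extensions[sentences[sid][pos]].append((sid, pos + 1))
        for ch, proj in extensions.items():
            support = len({sid for sid, _ in proj})
            if support < min_support:
                continue  # infrequent prefix: no extension can be frequent
            candidate = prefix + ch
            if len(candidate) >= 2:  # single characters are not word candidates
                patterns[candidate] = support
            if len(candidate) < max_len:
                grow(candidate, proj)

    # The empty prefix "occurs" before every position of every sentence.
    start = [(sid, pos) for sid, s in enumerate(sentences) for pos in range(len(s))]
    grow("", start)
    return patterns


def drop_contained(patterns):
    """Drop a pattern if a longer pattern with the same support contains it."""
    kept = {}
    for p, sup in patterns.items():
        absorbed = any(
            len(q) > len(p) and p in q and patterns[q] == sup for q in patterns
        )
        if not absorbed:
            kept[p] = sup
    return kept


if __name__ == "__main__":
    corpus = ["区块链技术发展", "研究区块链应用", "区块链安全"]
    frequent = mine_contiguous_patterns(corpus, min_support=3)
    print(drop_contained(frequent))
```

On this toy corpus the mining step finds 区块, 块链, and 区块链, each in all three sentences, and the containment filter keeps only 区块链. A real system would still need to check the surviving candidates against an existing lexicon and apply further statistical or semantic filters before calling them new words.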
Source: 《电脑知识与技术》 (Computer Knowledge and Technology), 2018, Issue 3Z, pp. 160-163 (4 pages)
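Funding: National Natural Science Foundation of China (41701537); Scientific Research Project of the Hubei Provincial Department of Education (B2015448)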
Keywords: PrefixSpan; sequential pattern mining; new word extraction; projected database; new word discovery
