Abstract
Purpose: To improve the efficiency of keyword matching in the stock news domain. Methods: An improved forward maximum matching algorithm is proposed. A keyword set is first built from real stock news data; the keyword set is then analyzed to obtain its characteristics; finally, hashing is used to reorganize the algorithm's lexicon and thereby improve matching efficiency. Results: Three schemes were compared: an unoptimized lexicon, a lexicon optimized with a trie index tree, and the proposed method. With the quality of the extracted words held equal, the proposed lexicon optimization required the least running time, and its advantage became more pronounced as the volume of stock news increased. Conclusion: The experimental results show that the proposed scheme effectively improves the algorithm's efficiency while preserving matching quality.
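The abstract does not specify how the hashed lexicon is organized, so the following is only a minimal sketch of forward maximum matching over a hash-based lexicon, not the authors' implementation. It assumes keywords are bucketed by their first character, with each bucket storing a hash set of words and the distinct word lengths in descending order; the names `build_index` and `forward_max_match` and the sample keywords are hypothetical.

```python
def build_index(keywords):
    """Bucket keywords by first character (assumed layout, not the paper's).

    Each bucket holds a hash set of words plus the distinct word lengths,
    sorted longest first, so only plausible lengths are ever tried.
    """
    buckets = {}
    for word in keywords:
        if not word:
            continue
        words, lengths = buckets.setdefault(word[0], (set(), set()))
        words.add(word)
        lengths.add(len(word))
    return {
        head: (words, sorted(lengths, reverse=True))
        for head, (words, lengths) in buckets.items()
    }


def forward_max_match(text, index):
    """Scan left to right, taking the longest keyword that starts at each position."""
    hits = []
    i = 0
    while i < len(text):
        matched = None
        entry = index.get(text[i])
        if entry is not None:
            words, lengths = entry
            for length in lengths:  # longest candidate first
                if i + length > len(text):
                    continue  # candidate would run past the end of the text
                candidate = text[i:i + length]
                if candidate in words:  # average O(1) hash lookup
                    matched = candidate
                    break
        if matched is not None:
            hits.append(matched)
            i += len(matched)  # jump past the matched keyword
        else:
            i += 1  # no keyword starts here; advance one character
    return hits


if __name__ == "__main__":
    # Hypothetical keyword set and headline, for illustration only.
    index = build_index(["股票", "股票市场", "市场", "上证指数"])
    print(forward_max_match("股票市场今日上涨,上证指数收高", index))
```

Bucketing by first character means positions where no keyword can start are skipped after a single dictionary lookup, and trying lengths from longest to shortest preserves the maximum-matching behavior.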
Authors
ZHU Zhong-yuan (朱钟元)
YANG Ying (杨莹)
XUE Xing-si (薛醒思)
ZHAN Xian-yin (詹先银)
WANG Jia-hua (王家华)
FAN Shu-juan (范淑娟)
LIU Yan-ping (刘艳萍)
School of Information Science and Engineering, Fujian University of Technology, Fuzhou 350118, Fujian, China
Source
Journal of Baoji University of Arts and Sciences (Natural Science Edition) (《宝鸡文理学院学报(自然科学版)》)
CAS
2019, No. 1, pp. 58-62 (5 pages)
Funding
College Student Innovation and Entrepreneurship Training Program (201810388078)
Keywords
stock domain
Chinese word segmentation
maximum matching