Abstract
The traditional TF-IDF algorithm relies mainly on word frequency and often ignores word semantics and certain semantically important adverbs. To address this problem, an improved TF-IDF algorithm based on semantic analysis is proposed. The method incorporates word semantics into the word-frequency calculation and improves the similarity computation between antonyms. Experimental results show that, when computing sentence similarity, the method counts the frequency of each word in a sentence according to its semantic direction and also determines the semantic direction of the sentence as a whole. Compared with the traditional algorithm, the accuracy of sentence similarity increased by 5.7%, and the improved TF-IDF algorithm is therefore more realistic.
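To illustrate the limitation the abstract describes, the following is a minimal sketch of the traditional baseline only: plain TF-IDF weights with cosine similarity between sentences. It is not the authors' improved method, whose details are not given here. The function names, the smoothed IDF variant, and the three toy sentences are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one {term: weight} dict per tokenized sentence (baseline TF-IDF)."""
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for s in sentences for term in set(s))
    vectors = []
    for s in sentences:
        tf = Counter(s)
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed variant).
            t: (c / len(s)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus: an original sentence, a negated one, an antonym one.
corpus = [
    "the film is good".split(),
    "the film is not good".split(),
    "the film is bad".split(),
]
vecs = tfidf_vectors(corpus)
sim_negated = cosine(vecs[0], vecs[1])  # "good" vs. "not good"
sim_antonym = cosine(vecs[0], vecs[2])  # "good" vs. "bad"
print(f"negated: {sim_negated:.3f}, antonym: {sim_antonym:.3f}")
```

In this toy setup both opposite-meaning sentences score above 0.5 against the original, and the negated sentence scores higher than the antonym one, because the baseline sees only shared surface words. This is the kind of case that a semantics-aware weighting, such as the one the abstract proposes, is meant to handle.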
Authors
代钰琴
徐鲁强
DAI Yuqun; XU Luqiang (School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang 621010, Sichuan, China)
Source
《西南科技大学学报》
CAS
2019, Issue 1, pp. 67-73 (7 pages)
Journal of Southwest University of Science and Technology
Funding
Supported by the Continuing Education Research and Development Fund of Southwest University of Science and Technology (17sxb103)