Abstract
This paper presents an algorithm for measuring the similarity of two text sentences. It takes contiguous character strings as unit blocks: the larger and more numerous the blocks the two sentences share, the more similar they are, and the smaller and fewer the blocks in which they differ, the more similar they are. The similarity value is computed on this basis. The method can be used to eliminate the fuzzy zone in the confidence interval of traditional similarity values, sharpening the result to an either-or binary logic value.
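The abstract does not give the paper's actual formula, so the following is only a minimal sketch of the general idea, under stated assumptions. The function names `block_similarity` and `is_similar`, the squared-length weighting, and the 0.5 threshold are all hypothetical choices made for illustration. Contiguous blocks are found with Python's standard `difflib.SequenceMatcher`, which is a stand-in and not necessarily the block-matching procedure the author uses.

```python
from difflib import SequenceMatcher


def block_similarity(a: str, b: str) -> float:
    """Illustrative score in [0, 1] built from contiguous blocks.

    Assumption (not from the paper): each block is weighted by the
    square of its length, so one long shared block counts for more
    than several short ones, and one long differing block costs more
    than several short ones.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    same = 0  # weight of blocks the two sentences share
    diff = 0  # weight of blocks in which they differ
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            same += (i2 - i1) ** 2
        else:
            # replace / delete / insert: take the longer side as the block size
            diff += max(i2 - i1, j2 - j1) ** 2
    if same + diff == 0:
        return 1.0  # two empty sentences are treated as identical
    return same / (same + diff)


def is_similar(a: str, b: str, threshold: float = 0.5) -> bool:
    """Collapse the score to a yes/no answer; the threshold is hypothetical."""
    return block_similarity(a, b) >= threshold
```

With this weighting, `"abcdef"` compared with `"abcdxx"` (one shared block of four characters) scores higher than `"abcdef"` compared with `"abxcdx"` (two shared blocks of two characters each), which mirrors the abstract's claim that larger shared blocks indicate greater similarity. Because the comparison works on characters, it applies to Chinese sentences without word segmentation.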
Author
WU Hong-zhou (吴宏洲), China Patent Information Center, Beijing 100088, China
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2016, No. 3, pp. 183-189 (7 pages)
Keywords
Text sentence comparison
Continuous text string
Unit block
Similarity