Abstract
To quickly find questions in a question bank whose stems are duplicated or highly similar, a duplicate-question detection algorithm was designed that uses the Java Excel API together with the Levenshtein distance algorithm to read an Excel question bank directly. In practical use, the Levenshtein algorithm was found to exceed memory limits and to produce out-of-bounds output when reporting detection results. These problems were addressed by splitting strings into segments and adding control statements, which gave good results in practice.
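The similarity check that the abstract refers to can be illustrated with a minimal sketch. It assumes two question stems are compared by Levenshtein distance normalized by the longer length. The class and method names are hypothetical, and the two-row memory layout is a common space-saving variant shown here for illustration. It is not the string segmentation method that the paper describes, whose details the abstract does not give.

```java
// Illustrative sketch only: names are hypothetical and not taken from the paper.
final class LevenshteinSketch {

    // Levenshtein distance computed with two rolling rows instead of a full
    // (a.length + 1) x (b.length + 1) matrix, so memory grows with b.length only.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting i characters of a yields the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 means identical strings (assumed normalization).
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }
}
```

Calling `LevenshteinSketch.similarity(stemA, stemB)` on each pair of stems and flagging pairs above a chosen threshold would give a basic duplicate report. The threshold value and the pairing strategy are left open here because the abstract does not specify them.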
Source
Journal of Dongguan University of Technology (《东莞理工学院学报》)
2014, No. 5, pp. 57-60 (4 pages)