Abstract
Copy detection plays an important role in intellectual property protection and information retrieval. Drawing on the idea of grid computing, this paper presents a grid-based copy detection system for digital documents. The system splits a single massive document corpus into several small and medium-sized corpora, distributes them across the network, and runs detection in parallel on multiple nodes. Simulation experiments on a LAN show that the system can expand the corpus dynamically, shortens detection time, and offers a high performance-to-cost ratio.
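The partition-and-parallel-check idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' implementation: the abstract does not say how fingerprints are computed, so the sketch uses hashed 5-word shingles compared by Jaccard similarity, and a local thread pool stands in for grid nodes. The names `fingerprint`, `partition`, `check_shard`, `detect`, and the `threshold` value are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

K = 5  # shingle length in words (assumed; not given in the abstract)

def fingerprint(text, k=K):
    """Return the set of hashed k-word shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        shingles = [" ".join(words)] if words else []
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.md5(s.encode("utf-8")).hexdigest()[:16] for s in shingles}

def partition(corpus, n_shards):
    """Split a {doc_id: text} corpus into n_shards smaller corpora (round-robin)."""
    shards = [dict() for _ in range(n_shards)]
    for i, (doc_id, text) in enumerate(sorted(corpus.items())):
        shards[i % n_shards][doc_id] = text
    return shards

def check_shard(query_fp, shard):
    """Work done by one node: Jaccard-score every document in its shard."""
    scores = {}
    for doc_id, text in shard.items():
        doc_fp = fingerprint(text)
        union = query_fp | doc_fp
        scores[doc_id] = len(query_fp & doc_fp) / len(union) if union else 0.0
    return scores

def detect(query, shards, threshold=0.3):
    """Check the query against all shards in parallel and merge the results."""
    query_fp = fingerprint(query)
    # Threads stand in for grid nodes in this sketch.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partial = list(pool.map(lambda s: check_shard(query_fp, s), shards))
    merged = {}
    for scores in partial:
        merged.update(scores)
    return sorted(doc_id for doc_id, s in merged.items() if s >= threshold)

corpus = {
    "a": "the quick brown fox jumps over the lazy dog near the river bank",
    "b": "grid computing shares resources across many networked machines at once",
    "c": "completely unrelated text about cooking pasta with tomato sauce tonight",
}
shards = partition(corpus, 2)
query = "the quick brown fox jumps over the lazy dog near the old bridge"
suspects = detect(query, shards)
print(suspects)
```

Because each shard is scored independently, adding a new shard enlarges the corpus without touching existing ones, which is the property the abstract attributes to the grid layout.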
Source
Journal of Communication and Computer (《通讯和计算机(中英文版)》, Chinese-English edition)
2005, No. 12, pp. 32-35 (4 pages)
Keywords
Copy Detection
Grid
Plagiarism
Fingerprint