
Literature Review on Copy and Plagiarism Detections (抄袭剽窃论文识别研究综述)
Cited by: 7
Abstract: Identifying plagiarized papers is an important part of intellectual property protection, and many detection methods and systems have been proposed. This paper surveys the field along the following lines: the definition of copying and plagiarism; text representation (the Vector Space Model, the Generalized Vector Space Model, and Latent Semantic Indexing); the research content of text similarity; methods for computing text similarity (statistics-based methods and methods based on semantic comprehension); the two main families of techniques, digital fingerprinting and word-frequency statistics; and plagiarism detection systems. It classifies and compares the main approaches proposed so far, summarizes the latest research progress, and points out open difficulties and directions for future work.
Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), indexed in CSSCI and the Peking University Core Journals list, 2007, No. 4, pp. 567-573 (7 pages)
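Funding: Jiangxi Provincial Natural Science Foundation project (Application of Program Slicing Techniques in Software Formalization); key project of the Jiangxi Provincial Education Science 11th Five-Year Plan (Research on an Evaluation System for the Research Competitiveness of Jiangxi Universities); Jiangxi Provincial Social Science 11th Five-Year Plan project (Research on Evaluating the Innovativeness of In-School Research Output and the Corresponding Management System Reform); and Jiangxi University of Finance and Economics university-level project (Application of Program Slicing Techniques in Software Formalization…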
Keywords: plagiarism detection; digital fingerprinting; word-frequency statistics
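
The abstract contrasts two families of techniques, word-frequency statistics and digital fingerprinting. The following is a minimal Python sketch of that contrast, not the method of the paper or of any system it surveys. The function names, the choice of 3-word shingles, the use of MD5 as the fingerprint hash, and the whitespace-style tokenizer are all assumptions made for illustration; Chinese text would first need word segmentation, which is not shown.

```python
import hashlib
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative tokenizer)."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Word-frequency approach: cosine of the two term-frequency vectors."""
    freq_a, freq_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fingerprints(text, k=3):
    """Fingerprint approach: hash every run of k consecutive words (assumed k=3)."""
    words = tokenize(text)
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.md5(s.encode("utf-8")).hexdigest() for s in shingles}


def resemblance(text_a, text_b, k=3):
    """Share of fingerprints common to both texts (Jaccard coefficient)."""
    fp_a, fp_b = fingerprints(text_a, k), fingerprints(text_b, k)
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    reordered = "dog lazy the over jumps fox brown quick the"
    print("cosine:", cosine_similarity(original, reordered))
    print("resemblance:", resemblance(original, reordered))
```

In this sketch, reversing the word order of a sentence leaves the word-frequency vectors identical, so the cosine score stays at essentially 1, while no 3-word run survives, so the fingerprint resemblance drops to 0. That difference is one reason the two families are usually treated as complementary.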

