Abstract
To address the unclear ownership of text data and the risk of content privacy leakage, a privacy-preserving method for confirming text data ownership is proposed. An improved similarity hash (Simhash) algorithm combines keyword and context features to generate a digital fingerprint, improving the detection of similar texts. A digital watermark is constructed from the digital fingerprint and identity information and embedded in the original text as proof of ownership. A blockchain reliably records the ownership information of the text data, and a smart contract is designed to compute text similarity, improving the efficiency of ownership confirmation. Experimental results and analysis show that the method effectively detects similar texts while protecting content privacy, and supports tracing and verifying data ownership.
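For orientation, the fingerprint-and-compare idea can be sketched with the standard Simhash scheme. This is a minimal illustration and not the improved algorithm of the paper: the keyword and context weighting is not described in this record, so term frequency is used as a stand-in weight, and the names `simhash` and `hamming_similarity`, the MD5 token hash, and the 64-bit width are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def simhash(tokens, bits=64):
    """Standard Simhash over a token list (assumed weighting: term frequency)."""
    weights = Counter(tokens)
    acc = [0] * bits
    for token, weight in weights.items():
        # Hash each token to a `bits`-wide integer (MD5 is an illustrative choice).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each bit of the token hash votes +weight or -weight on that position.
        for i in range(bits):
            acc[i] += weight if (h >> i) & 1 else -weight
    # The fingerprint keeps a 1 wherever the weighted vote is positive.
    fingerprint = 0
    for i, vote in enumerate(acc):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_similarity(fp_a, fp_b, bits=64):
    """Similarity in [0, 1] derived from the Hamming distance of two fingerprints."""
    distance = bin(fp_a ^ fp_b).count("1")
    return 1 - distance / bits


doc_a = "blockchain records ownership of text data".split()
doc_b = "blockchain records the ownership of text data".split()
print(hamming_similarity(simhash(doc_a), simhash(doc_b)))
```

Only the fingerprints are compared here, so under this sketch a similarity check would not need the raw text. How the similarity computation is actually carried out inside the smart contract is not specified in this record.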
Authors
刘静静
邓浩江
李杨
LIU Jingjing; DENG Haojiang; LI Yang (National Network New Media Engineering Research Center, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
Electronic Design Engineering (《电子设计工程》)
2023, No. 9, pp. 24-28 (5 pages)
Funding
Strategic Priority Research Program of the Chinese Academy of Sciences (Category C) (XDC02070100).
Keywords
text data right confirmation
similarity detection
digital text watermarking
privacy protection
blockchain