Abstract
[Objective] This paper builds a transfer-learning-based classifier that automatically detects privacy-sensitive images on social networks, so that users can be warned before they unintentionally upload content containing private information. [Methods] A Weibo image privacy classification dataset was constructed and annotated. Deep transfer learning was then applied, fine-tuning several pre-trained image models to classify automatically whether Sina Weibo images contain private information. [Results] With the same amount of data, transfer learning improved accuracy by at least 30% compared with non-transfer learning approaches. With transfer learning, most ResNet deep neural network architectures reached an accuracy above 88%. Among them, ResNet50 achieved the highest recall (94.31%), accuracy (90.80%), and F1 score (91.11%), as well as the shortest testing time (148 s). Weighing these metrics together, it is the most suitable model architecture for the current scenario. [Limitations] The amount of annotated data is relatively small and may not cover some types of private information. [Conclusions] This study validates the feasibility of deep transfer learning for classifying private Weibo images. The result can be applied to social media platforms to warn users about the risk of privacy exposure. The annotated Weibo image privacy dataset provides a foundation and a reference benchmark for future research.
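The recall, accuracy, and F1 figures quoted in the abstract are standard binary-classification metrics. The sketch below is an illustration only, not the authors' evaluation code: it shows how these metrics follow from confusion-matrix counts, treating "contains private information" as the positive class. The function names and the example counts are hypothetical. Assuming the reported F1 is the usual harmonic mean of precision and recall on the positive class, the helper `implied_precision` also recovers the precision that a given recall and F1 pair implies; the abstract itself does not report a precision value.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 for the positive class.

    tp/fp/fn/tn are confusion-matrix counts; here "positive" is assumed
    to mean "image contains private information".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given R and F1 as fractions."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("recall and F1 are inconsistent (2R - F1 <= 0)")
    return f1 * recall / denom


# Hypothetical counts, chosen only to exercise the function.
example = binary_metrics(tp=90, fp=10, fn=5, tn=95)

# Precision implied by the recall and F1 reported for ResNet50,
# under the harmonic-mean assumption stated above.
resnet50_precision = implied_precision(recall=0.9431, f1=0.9111)
```

Under that assumption, the implied precision for ResNet50 comes out at roughly 88%, which is a derived value rather than one stated in the source.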
Authors
Wang Shuyi (王树义)
Liu Sai (刘赛)
Ma Zheng (马峥)
Management School, Tianjin Normal University, Tianjin 300387, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
CSSCI
CSCD
PKU Core (北大核心)
2020, Issue 10, pp. 80-92 (13 pages)
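Funding
This work is one of the research outcomes of the National Social Science Fund of China Youth Project "Research on Social Media User Privacy Protection Based on Dynamic Revelation of Information Price" (Grant No. 15CTQ017).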
Keywords
Privacy Protection
Machine Learning
Deep Transfer Learning