Abstract
Deep learning has achieved significant success in various fields of machine learning and engineering. However, its drawbacks have also received considerable attention recently: it suffers from poor interpretability, weak robustness, and difficulty in network training, all of which seriously affect the security and usability of deep neural networks. Consequently, adversarial attacks and interpretability have become focal points of next-generation artificial intelligence research, and exploiting interpretability to improve and optimize model performance has emerged as an active research direction. In this paper, we survey recent work on these topics from a geometric perspective. We reformulate the problems encountered by deep neural networks from the viewpoint of manifold theory, and summarize and explain several effective strategies for improving and optimizing deep networks based on this interpretability. Finally, we analyze the remaining challenges in interpreting deep neural networks via manifold theory and outline possible future directions.
Authors
Mengfei XIA; Zipeng YE; Wang ZHAO; Ran YI; Yongjin LIU (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Source
Scientia Sinica (Informationis) (《中国科学:信息科学》)
Indexed in CSCD and the Peking University Core Journals list (北大核心)
2021, Issue 9, pp. 1411-1437 (27 pages)
Funding
Supported by the National Science Fund for Distinguished Young Scholars (Grant No. 61725204) and the National Key R&D Program of China (Grant No. 2016YFB1001200).
Keywords
deep learning
adversarial attack
interpretability
manifold