Abstract
With advances in science and technology and the spread of information over the Internet, people have access to a much wider range of platforms for learning and entertainment and can follow events from all over the world. When watching foreign competitions, however, viewers often cannot tell the participants apart ("face blindness") because they know little about them. This paper therefore analyzes the facial information of foreign celebrities to help people identify the celebrities they are interested in more quickly. Face recognition plays an important role in many areas of daily life, and many methods have been proposed for it. Among them, the convolutional neural network (CNN) is one of the key technologies currently driving machine learning and performs well in image classification. When only a small data set is available, prediction with a pretrained model is an efficient approach that tends to give good results. This paper performs face recognition with a CNN and with existing pretrained models. To expand the data set, data augmentation is carried out by randomly combining geometric transformations, image blurring, and brightness and contrast adjustment. Augmentation makes the model less sensitive to differences in scale, position, and viewpoint among samples, supports translation and scale invariance, improves robustness, and raises recognition accuracy. In addition, Batch Normalization is added at the input layer so that the inputs to each layer follow the same distribution, which speeds up training and permits a higher learning rate, and an adaptive ReLU together with the RMSProp algorithm is used to speed up convergence and reduce the error rate. The resulting network model reaches an accuracy of 76.6%.
The pretrained models VGG16, VGG19, and ResNet50 are then fitted to the sample data. After extensive parameter tuning, VGG19 performs best, reaching an accuracy of 89.4%.
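The augmentation step described in the abstract (random combinations of geometric transformations, blurring, and brightness and contrast adjustment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 3x3 box blur, the parameter ranges, and the use of plain Python lists for grayscale images are all assumptions made to keep the example dependency-free. A real pipeline would more likely use an image library.

```python
import random

Image = list[list[int]]  # grayscale image, pixel values in 0..255 (assumed format)


def clamp(value: float) -> int:
    """Round to the nearest integer and limit the result to the 0..255 range."""
    return max(0, min(255, int(round(value))))


def flip_horizontal(img: Image) -> Image:
    """Geometric transformation: mirror each row."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Geometric transformation: rotate clockwise by 90 degrees."""
    return [list(col) for col in zip(*img[::-1])]


def box_blur(img: Image) -> Image:
    """Blur with a 3x3 mean filter; border pixels average the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(clamp(sum(neigh) / len(neigh)))
        out.append(row)
    return out


def adjust_brightness_contrast(img: Image, alpha: float, beta: float) -> Image:
    """Scale contrast around mid-gray by alpha, then shift brightness by beta."""
    return [[clamp(alpha * (p - 128) + 128 + beta) for p in row] for row in img]


def random_augment(img: Image, rng: random.Random) -> Image:
    """Apply a random non-empty subset of the operations in random order."""
    # The ranges below are illustrative assumptions, not values from the paper.
    ops = [
        flip_horizontal,
        rotate_90,
        box_blur,
        lambda im: adjust_brightness_contrast(
            im, rng.uniform(0.8, 1.2), rng.uniform(-20, 20)),
    ]
    chosen = rng.sample(ops, rng.randint(1, len(ops)))
    for op in chosen:
        img = op(img)
    return img


def expand_dataset(images: list[Image], copies: int, seed: int = 0) -> list[Image]:
    """Return the original images plus `copies` augmented variants of each."""
    rng = random.Random(seed)
    augmented = [random_augment(img, rng) for img in images for _ in range(copies)]
    return list(images) + augmented
```

For instance, `expand_dataset(images, copies=3)` returns the original images followed by three randomly augmented variants of each, so the data set grows fourfold. Fixing `seed` makes the expansion reproducible.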
Source
《运筹与模糊学》
2023, Issue 3, pp. 2474-2486 (13 pages)
Operations Research and Fuzziology