Abstract
Convolutional neural networks (CNNs) are an important component of artificial intelligence and perform well in fields such as natural language processing and image recognition. Hyperparameter configuration governs the training strategy of a CNN model and plays a critical role in optimizing large CNN models. Existing hyperparameter optimization methods are time-consuming and labor-intensive, traverse the entire hyperparameter space, and are prone to local optima. First, three self-built CNNs of different depths were constructed as optimization objects, with the goal of finding the hyperparameter configuration that maximizes accuracy on the validation set. Second, to optimize the training process of large neural network models and improve model performance, a CNN hyperparameter optimization method based on experimental design was proposed. Finally, to verify the effectiveness of the method, training schemes were constructed according to the uniform design concept to generate hyperparameter combinations, and comparative experiments were conducted against training schemes generated from subjective experience. The results show that the proposed method has advantages in convergence speed, accuracy, and computational efficiency. The method supports efficient training of large CNN models, has good generality, and can be applied to CNN training tasks of different scales.
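As a rough illustration of how a uniform design can produce a small set of hyperparameter combinations, the sketch below builds a design table with the good lattice point construction and maps each level to a candidate value. The abstract does not say which construction, factors, or value ranges the authors used, so the function names (`good_lattice_design`, `map_levels`), the generator vector, and every hyperparameter value here are assumptions chosen only for demonstration.

```python
from math import gcd


def good_lattice_design(n_runs, generators):
    """Build an n_runs x len(generators) table with levels 1..n_runs.

    Entry (i, j) is (i * h_j) mod n_runs, with 0 replaced by n_runs.
    Each generator must be coprime with n_runs so that every column
    is a permutation of 1..n_runs.
    """
    for h in generators:
        if gcd(h, n_runs) != 1:
            raise ValueError(f"generator {h} is not coprime with {n_runs}")
    table = []
    for i in range(1, n_runs + 1):
        row = [(i * h) % n_runs or n_runs for h in generators]
        table.append(row)
    return table


def map_levels(table, factors):
    """Map level k in each column to the k-th candidate value of that factor."""
    names = list(factors)
    return [
        {name: factors[name][level - 1] for name, level in zip(names, row)}
        for row in table
    ]


# Hypothetical factors and candidate values (not taken from the paper).
factors = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2, 3e-2, 1e-1],
    "batch_size": [8, 16, 32, 64, 128, 256, 512],
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
}

# Assumed generator vector (1, 2, 3) for a 7-run, 3-factor design.
table = good_lattice_design(7, generators=[1, 2, 3])
configs = map_levels(table, factors)

for cfg in configs:
    print(cfg)
```

With these assumed settings the design needs 7 training runs, whereas an exhaustive grid over the same three factors would need 7 × 7 × 7 = 343. In practice the generator vector is usually chosen to minimize a discrepancy measure; that selection step is omitted here.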
Authors
XU Hui-zhi (徐慧智)
LU Jia-ming (吕佳明)
School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China
Source
Science Technology and Engineering (《科学技术与工程》)
PKU Core Journal (北大核心)
2024, Issue 28, pp. 12227-12238 (12 pages)
Funding
National Natural Science Foundation of China (62371170).
Keywords
uniform design
hyperparameter optimization
convolutional neural network (CNN)
orthogonal design
machine learning