Funding: Supported by the National Natural Science Foundation of China (62394343), the Major Program of Qingyuan Innovation Laboratory (00122002), the Major Science and Technology Projects of Longmen Laboratory (231100220600), the Shanghai Committee of Science and Technology (23ZR1416000), and Shanghai AI Lab.
Abstract: Optimizing process parameters in polyolefin production can bring significant economic benefits to a plant. However, the optimization is often hampered by small data sets, the high cost of each parameter-verification cycle, and the difficulty of building an optimization model. To address this, we propose a transfer learning Bayesian optimization strategy that improves the efficiency of parameter optimization while minimizing resource consumption. Specifically, we use Gaussian process (GP) regression models to build an integrated model that incorporates data from both source and target grade production tasks. We then measure a similarity weight for each model by comparing the models' predicted trends, and use these weights to accelerate the search for optimal process parameters for the target polyolefin grade. Because similarity measured over a global search space may not capture local similarity characteristics, we further propose a transfer learning optimization method that operates within a local space (LSTL-PBO). The method draws partial data by random sampling from the target task data and uses Bayesian optimization to build its models; restricting attention to a local search space makes the inherent similarities between the source tasks and the target task easier to discern and exploit. We also incorporate a parallel scheme that handles multiple local search spaces simultaneously, exploring different regions of the parameter space in parallel and increasing the chance of finding optimal process parameters. The method is validated on benchmark problems, and the sensitivity of its hyperparameters is discussed. The results show that it significantly improves the efficiency of process parameter optimization, reduces the dependence on source tasks, and enhances robustness, which gives it great potential for process optimization in industrial environments.
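A minimal sketch of the similarity-weighting idea described above: source and target GP models are combined, with each source model weighted by how well its predicted trend agrees with the target observations. The ranking-based agreement measure, the data, and the weighting rule are illustrative assumptions, not the paper's exact LSTL-PBO formulation.

```python
# Sketch: weight source-task GP models by predicted-trend agreement with the
# target observations, then combine predictions to pick the next candidate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def ranking_agreement(model, X, y):
    """Fraction of observation pairs whose ordering the model predicts correctly
    (a simple stand-in for comparing 'predicted trends')."""
    mu = model.predict(X)
    agree, total = 0, 0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            total += 1
            if (mu[i] - mu[j]) * (y[i] - y[j]) > 0:
                agree += 1
    return agree / max(total, 1)

# Hypothetical data: each source task has its own evaluation history; the
# target task has only a handful of evaluated parameter settings.
rng = np.random.default_rng(0)
source_tasks = [(rng.uniform(0, 1, (30, 2)), rng.normal(size=30)) for _ in range(3)]
X_target, y_target = rng.uniform(0, 1, (5, 2)), rng.normal(size=5)

kernel = Matern(nu=2.5)
source_models = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
                 for X, y in source_tasks]
target_model = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_target, y_target)

# Similarity weights from trend agreement on the target data; the target model
# gets full weight, and all weights are normalized to sum to one.
weights = np.array([ranking_agreement(m, X_target, y_target) for m in source_models] + [1.0])
weights = weights / weights.sum()

def ensemble_predict(X):
    """Weighted combination of source and target GP mean predictions."""
    means = np.column_stack([m.predict(X) for m in source_models + [target_model]])
    return means @ weights

X_candidates = rng.uniform(0, 1, (200, 2))
best = X_candidates[np.argmin(ensemble_predict(X_candidates))]
print("Next candidate parameters:", best)
```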
Abstract: Conventional gradient-based full waveform inversion (FWI) is a local optimization: it depends strongly on the initial model and is prone to becoming trapped in local minima. Globally optimal FWI can overcome this limitation and is therefore particularly attractive, but it is currently limited by its huge computational cost. In this paper, we propose a globally optimal FWI framework based on GPU parallel computing, which greatly improves efficiency and is expected to make globally optimal FWI more widely applicable. In this framework, we simplify and recombine the model parameters and optimize the model iteratively. Each iteration contains hundreds of individuals, each individual is independent of the others, and each individual involves forward modeling and a cost-function evaluation. The framework suits a variety of global optimization algorithms; as an example, we test it with the particle swarm optimization algorithm. Both the synthetic and field examples achieve good results, demonstrating the effectiveness of the framework.
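A minimal CPU-side sketch of the parallel-individuals idea in the abstract above: every particle's cost evaluation is independent, so all individuals in an iteration can be evaluated concurrently. The quadratic misfit and the process pool are placeholders for the paper's GPU-based wave-equation forward modeling, not its actual implementation.

```python
# Sketch: particle swarm optimization where all individuals in an iteration are
# evaluated in parallel; the placeholder misfit stands in for FWI forward modeling.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def misfit(model):
    # Placeholder cost function; in FWI this would simulate wavefields (on the
    # GPU in the paper's framework) and compare them with observed data.
    return float(np.sum((model - 2.0) ** 2))

def pso(n_particles=64, n_dims=10, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, n_dims))   # particle positions (model parameters)
    v = np.zeros_like(x)                            # particle velocities
    pbest, pbest_cost = x.copy(), np.full(n_particles, np.inf)
    gbest, gbest_cost = x[0].copy(), np.inf

    with ProcessPoolExecutor() as pool:
        for _ in range(n_iters):
            # Independent cost evaluations, dispatched concurrently.
            costs = np.array(list(pool.map(misfit, x)))
            improved = costs < pbest_cost
            pbest[improved], pbest_cost[improved] = x[improved], costs[improved]
            if costs.min() < gbest_cost:
                gbest_cost, gbest = costs.min(), x[costs.argmin()].copy()
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = x + v
    return gbest, gbest_cost

if __name__ == "__main__":
    best_model, best_cost = pso()
    print("Best cost:", best_cost)
```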
Abstract: Hyperparameter tuning is a key step in developing high-performing machine learning models, but searching large hyperparameter spaces requires extensive computation with standard sequential methods. This work analyzes the performance gains of parallel over sequential hyperparameter optimization. Using scikit-learn's RandomizedSearchCV, the project tuned a Random Forest classifier for fake news detection via randomized search over a hyperparameter grid. Setting n_jobs to -1 enabled full parallelization across CPU cores. The parallel implementation achieved over 5× faster CPU times and 3× faster total run times than sequential tuning. However, test accuracy dropped slightly, from 99.26% sequentially to 99.15% with parallelism, indicating a trade-off between evaluation efficiency and model performance. Still, the substantial computational gains allow more extensive hyperparameter exploration within reasonable time frames, outweighing the small accuracy decrease. Further analysis could better quantify this trade-off across different models, tuning techniques, tasks, and hardware.
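A minimal sketch of the tuning setup described above, with n_jobs=-1 spreading candidate evaluations across all CPU cores. The synthetic dataset and the parameter distributions are placeholders, not the project's fake-news corpus or actual search space.

```python
# Sketch: RandomizedSearchCV over a Random Forest, parallelized with n_jobs=-1.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Placeholder dataset standing in for the fake-news features.
X, y = make_classification(n_samples=2000, n_features=50, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Illustrative parameter distributions.
param_distributions = {
    "n_estimators": [100, 200, 400],
    "max_depth": [None, 10, 20, 40],
    "min_samples_split": [2, 5, 10],
    "max_features": ["sqrt", "log2"],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=20,        # number of sampled parameter settings
    cv=3,
    n_jobs=-1,        # -1 = use every available core; set to 1 for sequential tuning
    random_state=42,
)
search.fit(X_train, y_train)
print("Best params:", search.best_params_)
print("Test accuracy:", search.score(X_test, y_test))
```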
Funding: Supported by the National Basic Research Program (973 Program) of China under Grant Nos. 2010CB428804 and 2011CB309702.
Abstract: The desire for higher spatial and temporal resolution in groundwater system modeling has created a demand for intensive computation and large memory. Parallel computing has played a central role in meeting this demand over the past several decades. This paper reviews parallel algebraic linear solution methods and parallel implementation technologies for groundwater simulation. The aim is to provide guidance that enables groundwater modelers to make sensible choices when developing solution methods, based on the current state of knowledge in parallel computing.
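A small generic illustration of why iterative algebraic solvers parallelize well: in a Jacobi sweep every unknown is updated from the previous iterate only, so all updates are independent and can be vectorized on one node or distributed across many. This is a textbook example, not one of the specific parallel solvers the review surveys.

```python
# Sketch: Jacobi iteration, whose per-unknown updates are fully independent.
import numpy as np

def jacobi(A, b, tol=1e-8, max_iter=10_000):
    """Solve A x = b with Jacobi iteration (A should be diagonally dominant)."""
    D = np.diag(A)
    R = A - np.diagflat(D)
    x = np.zeros_like(b, dtype=float)
    for _ in range(max_iter):
        # One fully parallel sweep: every component of x_new depends only on the
        # previous x, never on other entries of x_new.
        x_new = (b - R @ x) / D
        if np.linalg.norm(x_new - x, ord=np.inf) < tol:
            return x_new
        x = x_new
    return x

# Small diagonally dominant system standing in for a discretized groundwater
# flow matrix.
A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([3.0, 2.0, 3.0])
print(jacobi(A, b))
```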
Abstract: To address the problems of excessive irrelevant and redundant features, unbalanced multi-granularity scanning, insufficient classification performance, and low parallelization efficiency in parallel deep forest algorithms under big-data environments, a parallel deep forest algorithm based on mutual information and mixed weighting (PDF-MIMW) is proposed. First, in the feature-reduction stage, a feature extraction strategy based on mutual information (FE-MI) is proposed, which combines measures of feature importance, interaction, and redundancy to filter the original features and remove excessive irrelevant and redundant ones. Next, in the multi-granularity scanning stage, an improved multi-granularity scanning strategy based on padding (IMGS-P) is proposed, which pads the reduced features and randomly samples the subsequences produced by window scanning, keeping the multi-granularity scanning balanced. Then, in the cascade-forest construction stage, a sub-forest construction strategy based on mixed weighting (SFC-MW) is proposed, which builds weighted sub-forests in parallel on the Spark framework to improve the model's classification performance. Finally, in the class-vector merging stage, a load balancing strategy based on a hybrid particle swarm optimization algorithm (LB-HPSO) is proposed, which optimizes the load distribution across task nodes in the Spark framework, reduces the waiting time during class-vector merging, and improves the model's parallelization efficiency. Experiments show that PDF-MIMW achieves better classification results and higher training efficiency in big-data environments.
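A minimal single-machine sketch of the mutual-information filtering idea behind FE-MI, using scikit-learn rather than Spark. The relevance threshold and the simple correlation-based redundancy rule are illustrative assumptions, not PDF-MIMW's actual importance, interaction, and redundancy measures.

```python
# Sketch: filter features by mutual information with the label, then greedily
# drop features that are highly correlated with ones already kept.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Placeholder dataset standing in for the big-data feature matrix.
X, y = make_classification(n_samples=1000, n_features=40, n_informative=8, random_state=0)

# Relevance: mutual information between each feature and the class label.
relevance = mutual_info_classif(X, y, random_state=0)
candidates = np.argsort(relevance)[::-1]                      # most relevant first
candidates = [i for i in candidates if relevance[i] > 0.01]   # drop near-irrelevant features

# Redundancy: keep a feature only if it is not too correlated with any feature
# already selected (a cheap stand-in for feature-feature mutual information).
selected = []
for i in candidates:
    if all(abs(np.corrcoef(X[:, i], X[:, j])[0, 1]) < 0.9 for j in selected):
        selected.append(i)

print(f"Kept {len(selected)} of {X.shape[1]} features:", selected)
X_reduced = X[:, selected]
```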