Funding: Supported by the Fundamental Research Funds for the Central Universities of China (2009QL05)
Abstract: Roadways excavated in soft rock at great depth are difficult to maintain because of the large deformation of the surrounding rock, which strongly affects the safety and efficiency of deep resource exploitation. During the excavation of a deep soft rock tunnel, the rock wall may be compacted due to large deformation. This paper presents a technique for handling this problem in a two-dimensional (2D) finite element program, large deformation engineering analyses software (LDEAS 1.0). Using the Lagrange multiplier method, the kinematic constraint of non-penetration and the static constraint of Coulomb friction are introduced into the governing equations in incremental displacement form. A numerical example demonstrates the efficiency of the technique. Deformations of a transportation tunnel in inclined soft rock strata at a depth of 1 000 m in the Qishan coal mine, and of a tunnel excavated at three different depths, are analyzed with two models: the additive decomposition model and the polar decomposition model. The deformation of the transportation tunnel is found to be asymmetrical because of the inclination of the rock strata. For extremely soft rock, large deformation converges only with the additive decomposition model. For both models, the deformation of the surrounding rock increases with tunnel depth. At the same depth, the deformation calculated with the additive decomposition model is smaller than that calculated with the polar decomposition model.
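For orientation, the kind of constrained system that a Lagrange multiplier treatment of frictional contact typically leads to can be written as below. This is a generic textbook form, not the formulation used in LDEAS 1.0; every symbol (the tangent stiffness K, the contact matrices G_n and G_t, the initial normal gap g_{n,0}, the friction coefficient mu, and the sign convention for the multipliers) is an assumption introduced here for illustration.

```latex
% Generic incremental contact problem with Lagrange multipliers (illustrative only).
% \Delta\mathbf{u}: incremental nodal displacements, \Delta\mathbf{R}: incremental load,
% \lambda_n, \lambda_t: normal and tangential contact multipliers (assumed notation).
\begin{aligned}
&\mathbf{K}\,\Delta\mathbf{u} + \mathbf{G}_n^{\mathsf T}\boldsymbol{\lambda}_n
  + \mathbf{G}_t^{\mathsf T}\boldsymbol{\lambda}_t = \Delta\mathbf{R}
  && \text{(incremental equilibrium)} \\
&\mathbf{g}_n = \mathbf{g}_{n,0} + \mathbf{G}_n\,\Delta\mathbf{u} \ge \mathbf{0},\quad
  \boldsymbol{\lambda}_n \le \mathbf{0},\quad
  \boldsymbol{\lambda}_n^{\mathsf T}\mathbf{g}_n = 0
  && \text{(non-penetration, kinematic)} \\
&|\lambda_t| \le \mu\,|\lambda_n| \ \text{at each contact pair}
  && \text{(Coulomb friction, static)}
\end{aligned}
```

Under this convention a contact pair sticks while the friction inequality is strict and slips once it holds with equality.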
Abstract: In this paper we discuss policy iteration methods for the approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller "aggregate" Markov decision problem whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
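To make the idea of an aggregate problem concrete, the following is a minimal sketch of hard aggregation, assuming each state maps to exactly one feature value and that states sharing a feature are weighted uniformly. It is a simplified special case, not the paper's full scheme; the names `solve_aggregate`, `improved_policy`, `q_value`, and the data layout (`P[u][i][j]` transition probabilities, `g[u][i]` expected stage costs, `feat[i]` feature index) are hypothetical.

```python
def q_value(i, u, P, g, alpha, r, feat):
    """Expected cost of action u at state i, with r[feat[j]] as cost-to-go."""
    return g[u][i] + alpha * sum(p * r[feat[j]] for j, p in enumerate(P[u][i]))


def solve_aggregate(P, g, alpha, feat, n_agg, iters=500):
    """Value iteration on the aggregate problem (one aggregate state per feature).

    Assumes every feature index in range(n_agg) is taken by at least one state.
    """
    groups = [[i for i, y in enumerate(feat) if y == x] for x in range(n_agg)]
    n_actions = len(P)
    r = [0.0] * n_agg
    for _ in range(iters):
        # Average (uniform disaggregation) of the one-step optimal costs
        # over the original states that belong to each aggregate state.
        r = [
            sum(min(q_value(i, u, P, g, alpha, r, feat) for u in range(n_actions))
                for i in grp) / len(grp)
            for grp in groups
        ]
    return r


def improved_policy(P, g, alpha, r, feat):
    """One-step lookahead policy using the piecewise-constant approximation."""
    n_actions, n_states = len(P), len(feat)
    return [
        min(range(n_actions), key=lambda u: q_value(i, u, P, g, alpha, r, feat))
        for i in range(n_states)
    ]


# Tiny 3-state, 2-action example with made-up numbers.
P = [
    [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]],
    [[0.1, 0.9, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.5]],
]
g = [[1.0, 2.0, 0.5], [2.0, 0.5, 1.0]]
feat = [0, 0, 1]            # states 0 and 1 share a feature
r = solve_aggregate(P, g, 0.9, feat, n_agg=2)
policy = improved_policy(P, g, 0.9, r, feat)
```

The resulting approximation is constant over states with the same feature, so it is nonlinear in the features in general. When every state has its own feature, the sketch reduces to ordinary value iteration on the original problem.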
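Abstract: To address the high dimensionality and tight constraints of the multidimensional knapsack problem (MKP), this paper proposes a self-memorized learn-to-improve model (SML2I). A deep reinforcement learning mechanism selects the operator used during iterative search: the model learns from the current solution and from the solutions visited earlier in the search, and decides whether to apply an improvement strategy or a perturbation strategy to the current solution. On this basis, a hash table is introduced and two effective perturbation operators based on value density are designed. The hash table records the solutions visited during the search, which prevents the model from exploring the same solution repeatedly. The new solutions generated by the value-density-based perturbation strategy are completely different from the previous solutions, so applying the improvement strategy again to a perturbed solution remains effective. Tests on 89 MKP datasets, with comparisons against state-of-the-art solution methods from the literature, verify the feasibility and effectiveness of the SML2I model for solving the MKP.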
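The interplay between the hash table and a value-density perturbation can be sketched as follows. This is an illustrative stand-in only: the abstract does not specify the two perturbation operators or the learned operator-selection policy, so the density definition, the `perturb` operator, and the fixed perturb-then-improve loop in `search` are all assumptions, and every function name is hypothetical. The sketch also assumes each item has positive weight in at least one constraint.

```python
import random


def feasible(x, weights, caps):
    """True if the 0/1 vector x satisfies every capacity constraint."""
    return all(sum(w * xi for w, xi in zip(row, x)) <= c
               for row, c in zip(weights, caps))


def total_value(x, values):
    return sum(v * xi for v, xi in zip(values, x))


def density(j, values, weights, caps):
    """Value per unit of capacity-normalised weight (assumed definition)."""
    return values[j] / sum(row[j] / c for row, c in zip(weights, caps))


def improve(x, values, weights, caps, banned=frozenset()):
    """Greedily add unselected, non-banned items in decreasing density."""
    x = list(x)
    order = sorted(range(len(x)),
                   key=lambda j: density(j, values, weights, caps), reverse=True)
    for j in order:
        if x[j] == 0 and j not in banned:
            x[j] = 1
            if not feasible(x, weights, caps):
                x[j] = 0
    return x


def perturb(x, values, weights, caps, k, rng):
    """Drop up to k selected items, favouring low density (hypothetical operator)."""
    x, dropped = list(x), set()
    for _ in range(k):
        selected = [j for j, xi in enumerate(x) if xi == 1]
        if not selected:
            break
        inv = [1.0 / density(j, values, weights, caps) for j in selected]
        j = rng.choices(selected, weights=inv, k=1)[0]
        x[j] = 0
        dropped.add(j)
    return x, dropped


def search(values, weights, caps, steps=30, k=2, seed=0):
    rng = random.Random(seed)
    x = improve([0] * len(values), values, weights, caps)
    best, visited = x, {tuple(x)}       # hash table of explored solutions
    for _ in range(steps):
        shaken, dropped = perturb(x, values, weights, caps, k, rng)
        # Refill first without the dropped items, then use any leftover capacity.
        cand = improve(shaken, values, weights, caps, banned=dropped)
        cand = improve(cand, values, weights, caps)
        key = tuple(cand)
        if key in visited:              # skip solutions seen before
            continue
        visited.add(key)
        x = cand
        if total_value(x, values) > total_value(best, values):
            best = x
    return best, visited
```

In SML2I the choice between improving and perturbing is made by the learned model; the fixed alternation here only shows where the hash-table lookup and the density-guided perturbation would sit in the loop.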