Abstract
We propose a new method for generating music: Advantage Actor-Critic with a Reward RNN (R-A2C). A recurrent neural network (RNN) with an attention mechanism is first pre-trained on the data and used as a prior policy. A reward RNN that incorporates this prior policy and user feedback is then added to the Advantage Actor-Critic (A2C) model, so that arbitrary user-specified constraints are combined with the style learned by the recurrent network. This encourages the actor to generate music that better matches the user's needs. Experiments show that the model achieves the expected results.
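The abstract does not give the reward formula, so the following is only a minimal sketch of how a reward that mixes a prior policy with user feedback could feed a one-step advantage estimate. The additive form (log-probability under the prior plus a weighted feedback term), the names `combined_reward` and `advantage`, the `weight` parameter, and all numbers are assumptions made for illustration, not the authors' implementation.

```python
import math

def combined_reward(prior_probs, action, user_reward, weight=1.0):
    """Assumed reward: log-probability of the chosen note under the
    prior RNN policy plus a weighted user-feedback term."""
    return math.log(prior_probs[action]) + weight * user_reward

def advantage(reward, value_s, value_next, gamma=0.99):
    """Standard one-step advantage estimate: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s

# Hypothetical values for a three-note vocabulary.
prior_probs = [0.7, 0.2, 0.1]  # prior RNN's next-note distribution
r = combined_reward(prior_probs, action=0, user_reward=1.0, weight=0.5)
adv = advantage(r, value_s=0.2, value_next=0.3)
```

In an A2C update the actor's log-probability gradient would be scaled by `adv`, so notes that are both likely under the prior and rewarded by the user are reinforced.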
Authors
SUN Cheng-ai (孙承爱)
ZHANG Xin-feng (张馨俸)
TIAN Gang (田刚)
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266000, China
Source
Software (《软件》)
2019, No. 7, pp. 96-99 (4 pages)
Funding
National Natural Science Foundation of China, Youth Program (No. 61602279)