Funding: This research was funded by the Deanship of Scientific Research at Princess Nourah Bint Abdulrahman University through the Fast-track Research Funding Program.
Abstract: The stock market is a place where shares of different companies are traded; it brings together buyers and sellers of stocks. In this digital era, analysis and prediction in the stock market have gained an essential role in shaping today's economy. Stock market analysis can be either fundamental or technical, and technical analysis can be performed either with technical indicators or through machine learning techniques. In this paper, we report a system that uses a Reinforcement Learning (RL) network and market sentiments to make stock-trading decisions. The system applies sentiment analysis to daily market news to spot trends in stock prices. The sentiment analysis module generates a unified score as a measure of the sentiment of each day's news. This score is then fed into the RL module as one of its inputs. The RL module outputs decisions as one of three actions: buy, sell, or hold. The objective is to maximize long-term future profit. We used Apple stock data from 2006 to 2016 to interpret how sentiments affect trading. The stock price of a company tends to rise when significant positive news becomes available in the public domain. Our results reveal the influence of market sentiment on stock price forecasting.
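The decision loop described above (a sentiment score fed into an RL module that outputs buy, sell, or hold) can be sketched with tabular Q-learning on synthetic data. Everything here is an illustrative assumption, not the paper's actual implementation: the state is just a discretized sentiment bucket, and the toy data links positive sentiment to positive next-day returns.

```python
# Toy Q-learning trader: state = discretized daily sentiment score,
# actions = buy / sell / hold, reward = realized next-day return.
ACTIONS = ["buy", "sell", "hold"]

def discretize(score, bins=3):
    """Map a sentiment score in [-1, 1] to one of `bins` integer buckets."""
    clipped = max(-1.0, min(1.0, score))
    return min(int((clipped + 1.0) / 2.0 * bins), bins - 1)

def update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard one-step Q-learning update."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

q = {}
for _ in range(200):  # sweep synthetic "days" with known sentiment
    for sentiment in (-0.8, 0.0, 0.8):
        state = discretize(sentiment)
        next_return = 0.01 * sentiment  # toy link: sentiment drives return
        for action in ACTIONS:
            reward = {"buy": next_return,
                      "sell": -next_return,
                      "hold": 0.0}[action]
            update(q, state, action, reward, state)

# Greedy policy per sentiment bucket after training.
policy = {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0))
          for s in range(3)}
```

Under these assumptions the learned policy buys in the high-sentiment bucket and sells in the low-sentiment one, mirroring the qualitative effect the abstract reports.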
Funding: Supported by the National Key Research and Development Program of China (2016YFB0901104).
Abstract: In this paper, a theoretical framework of Multiagent Simulation (MAS) is proposed for strategic bidding in electricity markets using reinforcement learning. It consists of two parts: a MAS system used to simulate the competitive bidding of the actual electricity market, and an adaptive learning-strategy bidding system used to provide agents with more intelligent bidding strategies. An Experience-Weighted Attraction (EWA) reinforcement learning algorithm (RLA) is applied to the MAS model, and a new MAS method for strategic bidding in electricity markets is presented using an Improved EWA (IEWA). From both qualitative and quantitative perspectives, it is compared with three other MAS methods based on Roth-Erev (RE), Q-learning, and EWA. The results show that the MAS method using IEWA outperforms the others. Four MAS models using the four RLAs are built for strategic bidding in electricity markets, and by running all four, the rationality and correctness of the four MAS methods for strategic bidding with reinforcement learning are verified.
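The EWA algorithm at the core of this framework has a standard update rule (Camerer and Ho's formulation): an experience weight decays and grows by one each round, and each strategy's attraction blends its realized payoff (weight 1 if chosen) with its foregone payoff (weight δ if not chosen). The sketch below uses that textbook rule with toy payoffs; it is not the paper's IEWA variant, and the bidding payoffs are made up.

```python
import math

def ewa_update(attractions, experience, chosen, payoffs,
               phi=0.9, delta=0.5, rho=0.9):
    """One EWA step: N(t) = rho*N(t-1) + 1, and each attraction blends
    decayed history with realized (weight 1) or foregone (weight delta)
    payoff, normalized by the new experience weight."""
    new_experience = rho * experience + 1.0
    new_attractions = []
    for j, (a, pi) in enumerate(zip(attractions, payoffs)):
        weight = 1.0 if j == chosen else delta
        new_attractions.append(
            (phi * experience * a + weight * pi) / new_experience)
    return new_attractions, new_experience

def logit_choice_probs(attractions, lam=2.0):
    """Logit response: higher attraction -> higher choice probability."""
    exps = [math.exp(lam * a) for a in attractions]
    total = sum(exps)
    return [e / total for e in exps]

# Two toy bidding strategies paying 1.0 and 3.0. Even if the agent keeps
# choosing the worse strategy 0, the foregone payoff of strategy 1 still
# raises its attraction through delta.
attractions, experience = [0.0, 0.0], 1.0
for _ in range(50):
    attractions, experience = ewa_update(
        attractions, experience, chosen=0, payoffs=[1.0, 3.0])
probs = logit_choice_probs(attractions)
```

The attraction of the higher-payoff strategy ends up larger, so the logit choice rule shifts probability mass toward it, which is the learning behavior the MAS agents rely on.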
Abstract: We show the practicality of two existing meta-learning algorithms, Model-Agnostic Meta-Learning and Fast Context Adaptation Via Meta-learning, using an evolutionary strategy for parameter optimization, and we propose two novel quantum adaptations of those algorithms using continuous quantum neural networks, for learning to trade portfolios of stocks on the stock market. The goal of meta-learning is to train a model on a variety of tasks such that it can solve new learning tasks using only a small number of training samples. In our classical approach, we trained our meta-learning models on a variety of portfolios, each containing 5 Consumer Cyclical stocks randomly sampled from a pool of 60. In our quantum approach, we trained our quantum meta-learning models on a simulated quantum computer with portfolios containing 2 randomly sampled Consumer Cyclical stocks. Our findings suggest that both classical models can learn a new portfolio with 0.01% of the training samples needed to learn the original portfolios, and can achieve performance within 0.1% Return on Investment of the Buy and Hold strategy. We also show that our much smaller quantum meta-learned models, with only 60 model parameters and 25 training epochs, have a learning pattern similar to that of our much larger classical meta-learned models, which have over 250,000 model parameters and 2,500 training epochs. Given these findings, we also discuss the benefits of scaling up our experiments from a simulated quantum computer to a real quantum computer. To the best of our knowledge, we are the first to apply the ideas of both classical and quantum meta-learning to enhance stock trading.
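The "learn an initialization that adapts with few samples" idea behind MAML can be shown on a deliberately tiny problem. The sketch below runs second-order MAML on scalar tasks of the form L(w) = (w - a)^2, with the gradients worked out by hand; the task family, step sizes, and iteration counts are illustrative assumptions, not the authors' trading setup or their evolutionary-strategy optimizer.

```python
# Minimal second-order MAML on scalar tasks with loss L(w) = (w - a)^2.
INNER_LR = 0.1   # alpha: per-task adaptation step size
OUTER_LR = 0.05  # beta: meta step size

def task_grad(w, a):
    """Gradient of the task loss L(w) = (w - a)^2."""
    return 2.0 * (w - a)

def inner_adapt(w, a, steps=1):
    """Few-shot adaptation: plain gradient descent from the meta-init."""
    for _ in range(steps):
        w = w - INNER_LR * task_grad(w, a)
    return w

def maml_train(tasks, w=0.0, meta_iters=500):
    """Meta-training: differentiate the post-adaptation loss w.r.t. the
    initialization, through one inner gradient step."""
    shrink = 1.0 - 2.0 * INNER_LR  # d w' / d w for this quadratic loss
    for _ in range(meta_iters):
        meta_grad = 0.0
        for a in tasks:
            w_adapted = inner_adapt(w, a)
            # chain rule: dL(w')/dw = L'(w') * dw'/dw
            meta_grad += task_grad(w_adapted, a) * shrink
        w -= OUTER_LR * meta_grad / len(tasks)
    return w

# Meta-train on two tasks; the learned init lands between their optima,
# so one inner step moves noticeably toward any new task's optimum.
w_meta = maml_train([1.0, 3.0])
```

For this symmetric task family the meta-learned initialization converges to the midpoint of the task optima, and a single adaptation step on an unseen task (e.g. a = 5.0) strictly reduces that task's loss, which is the few-shot behavior the abstract exploits for new portfolios.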
Funding: Supported in part by the National Key Research and Development Program of China (2016YFB0901100), the National Natural Science Foundation of China (U1766203), the Science and Technology Project of State Grid Corporation of China (Friendly Interaction System of Supply-Demand between Urban Electric Power Customers and Power Grid), and the China Scholarship Council (CSC).
Abstract: Currently, critical peak load caused by residential customers has led utility companies and policymakers to pay more attention to residential demand response (RDR) programs. In typical RDR programs, residential customers react to price- or incentive-based signals, but these reactions can lag behind flexible market situations. Residential customers equipped with smart meters may contribute more DR load if they can participate in DR events proactively. In this paper, we propose a comprehensive market framework in which residential customers can provide proactive RDR actions in a day-ahead market (DAM). We model and evaluate the interactions between generation companies (GenCos), retailers, residential customers, and the independent system operator (ISO) via an agent-based modeling and simulation (ABMS) approach. The simulation framework contains two main procedures: a bottom-up modeling procedure and a reinforcement learning (RL) procedure. The bottom-up modeling procedure models residential load profiles separately by household type to capture the differences in RDR potential in advance, so that residential customers may rationally provide automatic DR actions. Retailers and GenCos optimize their bidding strategies via the RL procedure; a modified optimization approach in this procedure prevents the training results from falling into local optima. The ISO clears the DAM to maximize social welfare via the Karush-Kuhn-Tucker (KKT) conditions. Based on realistic residential data from China, the proposed models and methods are verified and compared in a large multi-scenario test case with 30,000 residential households. Results show that proactive RDR programs and interactions between market entities may yield significant benefits for both the supply and demand sides. The models and methods in this paper may be used by utility companies, electricity retailers, market operators, and policymakers to evaluate the consequences of a proactive RDR and the interactions among multiple entities.
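The ISO's welfare-maximizing clearing step can be illustrated without writing out the KKT system: for a single period with stepwise bids, maximizing social welfare is equivalent to the merit-order intersection of the supply and demand curves. The sketch below uses that equivalence with made-up prices and quantities, and a midpoint pricing convention chosen for the illustration; it is not the paper's multi-entity formulation.

```python
# Toy single-period day-ahead clearing by merit order: match the cheapest
# supply offers against the highest-value demand bids until no further
# trade improves welfare (supply price exceeds demand price).
def clear_market(supply_offers, demand_bids):
    """supply_offers, demand_bids: lists of (price, quantity).
    Returns (cleared_quantity, clearing_price)."""
    supply = sorted(supply_offers)              # cheapest offers first
    demand = sorted(demand_bids, reverse=True)  # highest-value bids first
    cleared, price = 0.0, 0.0
    s_i = d_i = 0
    s_left, d_left = supply[0][1], demand[0][1]
    while s_i < len(supply) and d_i < len(demand):
        s_price, d_price = supply[s_i][0], demand[d_i][0]
        if s_price > d_price:
            break  # no more welfare-improving trades
        traded = min(s_left, d_left)
        cleared += traded
        price = (s_price + d_price) / 2.0  # midpoint pricing convention
        s_left -= traded
        d_left -= traded
        if s_left == 0:
            s_i += 1
            s_left = supply[s_i][1] if s_i < len(supply) else 0.0
        if d_left == 0:
            d_i += 1
            d_left = demand[d_i][1] if d_i < len(demand) else 0.0
    return cleared, price

# Hypothetical offers/bids in ($/MWh, MWh).
cleared_qty, clearing_price = clear_market(
    [(10, 50), (20, 50), (40, 50)],
    [(50, 60), (30, 30), (15, 40)])
```

Here the $40 offer and the $15 bid stay out of the money, so 90 MWh clears; proactive RDR in the paper's framework effectively reshapes the demand side of exactly this kind of curve.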
Abstract: This study provides a systematic analysis of the resource-intensive training of deep reinforcement learning (DRL) agents for simulated low-speed automated driving (AD). In Unity, we established two case studies: garage parking and navigating an obstacle-dense area. Our analysis involves training a path-planning agent with real-time-only sensor information. The study addresses research questions insufficiently covered in the literature, exploring curriculum learning (CL), agent generalization (knowledge transfer), computation distribution (CPU vs. GPU), and mapless navigation. CL proved necessary for the garage scenario and beneficial for obstacle avoidance. It involved adjustments at different stages, including terminal conditions, environment complexity, and reward-function hyperparameters, guided by their evolution over multiple training attempts. Fine-tuning the simulation tick and decision-period parameters was crucial for effective training. Learning high-level concepts such as obstacle avoidance requires training the agent in environments that are sufficiently complex in terms of the number of obstacles. While blogs and forums discuss training machine learning models in Unity, there is still a lack of scientific articles on DRL agents for AD. Since agent development requires considerable training time and involved procedures, there is a growing need to support such research through scientific means. In addition to our findings, we contribute to the R&D community by releasing our environment as open source.
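The staged curriculum described above (raising environment complexity as the agent improves) can be sketched as a simple promotion scheduler. The stage list, success threshold, and window size below are illustrative placeholders, not the study's actual values or the Unity ML-Agents curriculum API.

```python
from collections import deque

class Curriculum:
    """Promote the agent to a harder stage (here, more obstacles) once
    its recent success rate clears a threshold."""

    def __init__(self, stages, threshold=0.8, window=20):
        self.stages = stages          # e.g. obstacle counts per stage
        self.threshold = threshold    # success rate required to promote
        self.recent = deque(maxlen=window)
        self.stage = 0

    @property
    def difficulty(self):
        return self.stages[self.stage]

    def report_episode(self, success):
        """Record an episode outcome; promote when the window is full
        and the success rate is high enough."""
        self.recent.append(1.0 if success else 0.0)
        full = len(self.recent) == self.recent.maxlen
        rate = sum(self.recent) / len(self.recent)
        if full and rate >= self.threshold:
            if self.stage < len(self.stages) - 1:
                self.stage += 1
                self.recent.clear()  # re-measure at the new difficulty

# Hypothetical run: the agent masters each stage in turn.
cur = Curriculum(stages=[2, 5, 10], threshold=0.8, window=10)
for _ in range(10):
    cur.report_episode(True)
stage_after_first_block = cur.difficulty
for _ in range(10):
    cur.report_episode(True)
```

Clearing the window on promotion avoids promoting twice off the same streak; in practice the study also varied terminal conditions and reward hyperparameters per stage, which a real scheduler would bundle into each stage's definition.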
Abstract: Modern battlefields are large and involve many unit types, and applying multi-agent reinforcement learning (MARL) to battlefield simulation can strengthen cooperative decision-making among combat units, thereby improving combat effectiveness. Current applications of MARL in wargaming research and adversarial exercises commonly make two simplifications: the agents are homogeneous, and combat units are densely distributed. Real combat scenarios do not always satisfy these assumptions; they may involve multiple heterogeneous agent types and sparsely distributed combat units. To explore the application of reinforcement learning in a wider range of scenarios, this work improves on both aspects. First, a multi-scale multi-agent amphibious landing environment, M2ALE, is designed and implemented. M2ALE deliberately complicates the two simplified settings above by adding scenarios with multiple heterogeneous agents and sparsely distributed combat units. These two complications aggravate the exploration difficulty and non-stationarity of the multi-agent environment, making it hard to train with common multi-agent algorithms. Second, a heterogeneous multi-agent curriculum learning framework, HMACL, is proposed to address the difficulties of the M2ALE environment. HMACL consists of three modules: 1) a source task generation (STG) module, which generates source tasks to guide agent training; 2) a class policy improvement (CPI) module, which addresses the inherent non-stationarity of the multi-agent system with a Class Based Parameter Sharing strategy that realizes parameter sharing in a heterogeneous multi-agent system; and 3) a trainer module, which obtains source tasks from STG and the latest policy from CPI and trains the current policy with an arbitrary MARL algorithm. HMACL alleviates the exploration difficulty and non-stationarity problems of common MARL algorithms in the M2ALE environment and guides the learning process of the multi-agent system. Experimental results show that HMACL substantially improves the sampling efficiency and final performance of MARL algorithms in the M2ALE environment.
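The Class Based Parameter Sharing idea in the CPI module can be illustrated with a minimal pool that hands out one parameter set per agent class: same-class agents share parameters, different classes do not. The class names and the toy "policy" structure below are hypothetical placeholders, not the framework's actual networks.

```python
class SharedPolicyPool:
    """One parameter set per agent class: agents of the same class share
    parameters (and thus gradients), heterogeneous classes stay separate."""

    def __init__(self):
        self._policies = {}

    def policy_for(self, agent_class):
        """Return the single parameter set for this class, creating it
        lazily on first request."""
        if agent_class not in self._policies:
            self._policies[agent_class] = {"weights": [0.0] * 4}
        return self._policies[agent_class]

# Hypothetical heterogeneous units: two same-class agents get the very
# same parameter object; a different class gets its own.
pool = SharedPolicyPool()
tank_a = pool.policy_for("tank")
tank_b = pool.policy_for("tank")
drone = pool.policy_for("drone")
```

Because same-class agents update one shared set, experience gathered by any of them improves all of them, which is what mitigates non-stationarity relative to fully independent learners while still accommodating heterogeneous unit types.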