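Abstract: The smart manhole cover monitoring system uses sensors and Narrow Band Internet of Things (NB-IoT) technology to trace and supervise manhole covers. Remote intelligent management is achieved through a cloud platform, mobile clients, computer networks, and data communication technology. The system also collects and transmits data: once the data is uploaded to the cloud platform it can be processed, the processed results are fed back to the device side, and the device performs the corresponding action. This achieves intelligent, automated operation that is not limited by distance and responds in the shortest possible time, thereby reaching a "smart" state.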
Funding: Supported by the National Natural Science Foundation of China (71771216, 71701209, 72001214).
Abstract: Most successes in the world are the result of long-term effort. The reward of success is extremely high, but a long investment process is required before it arrives. People who are “myopic” value only short-term rewards and are unwilling to make early-stage investments, so they rarely achieve ultimate success and the high rewards that come with it. Similarly, for a reinforcement learning (RL) model with long-delay rewards, the discount rate determines how “farsighted” the agent is. To enable the trained agent to make a chain of correct choices and finally succeed, this paper first derives mathematically the feasible region of the discount rate that satisfies the agent's “farsightedness” requirement. Next, to avoid the complicated problem of solving implicit equations when choosing feasible solutions, a simple method is explored and verified by theoretical demonstration and mathematical experiments. A series of RL experiments is then designed and carried out to verify the validity of the theory. Finally, the model is extended from the finite process to the infinite process, and the validity of the extended model is verified by theory and experiments. The research not only reveals the significance of the discount rate but also provides a theoretical basis and a practical method for choosing the discount rate in future research.
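As a rough illustration of why the discount rate controls "farsightedness", consider a toy choice that is not taken from the paper: an agent can take a small reward immediately or a large reward after a delay of several steps. With discount rate gamma, the delayed reward is preferred only when gamma**delay * big_reward > small_reward, which gives a lower bound on gamma. The function names and the two-option setup below are assumptions made for illustration; the paper's own feasible region is derived for its own model and may differ.

```python
def min_farsighted_discount(small_reward, big_reward, delay):
    """Lower bound on gamma above which the delayed reward is preferred.

    Toy model (an assumption, not the paper's model): the agent compares
    an immediate reward `small_reward` with a reward `big_reward` that
    arrives `delay` steps later. The delayed option wins when
        gamma ** delay * big_reward > small_reward,
    i.e. gamma > (small_reward / big_reward) ** (1 / delay).
    """
    if not (0 < small_reward < big_reward):
        raise ValueError("expected 0 < small_reward < big_reward")
    if delay < 1:
        raise ValueError("delay must be at least 1 step")
    return (small_reward / big_reward) ** (1.0 / delay)


def prefers_delayed(gamma, small_reward, big_reward, delay):
    """True if the discounted delayed reward beats the immediate one."""
    return gamma ** delay * big_reward > small_reward


if __name__ == "__main__":
    # Hypothetical numbers: reward 1 now versus reward 100 after 10 steps.
    bound = min_farsighted_discount(1.0, 100.0, 10)
    print(f"gamma must exceed {bound:.4f}")
    for gamma in (0.5, 0.9):
        print(gamma, prefers_delayed(gamma, 1.0, 100.0, 10))
```

In this toy setting a longer delay or a smaller ratio between the two rewards pushes the bound toward 1, which matches the intuition in the abstract that long-delay rewards require a sufficiently large discount rate.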