Abstract
Purpose – Decision-making is one of the key technologies for self-driving cars. Existing methods depend heavily on human driving data or rules, which makes it difficult to model policies for different driving situations.

Design/methodology/approach – In this research, a probabilistic decision-making method based on the Markov decision process (MDP) is proposed to deduce the optimal maneuver automatically in a two-lane highway scenario without using any human data. The decision-making problem in a traffic environment is formulated as an MDP by defining its basic elements: states, actions and basic models. The transition and reward models are defined using a complete prediction model of the surrounding cars. An optimal policy is deduced using dynamic programming and evaluated in a two-dimensional simulation environment.

Findings – Results show that, in the given scenario, the self-driving car maintained safety and efficiency under the proposed policy.

Originality/value – This paper presents a framework for deriving a driving policy for self-driving cars without relying on any human driving data or hand-modeled rules.
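As an illustration of the approach summarized above, the following is a minimal sketch of value iteration (one standard dynamic programming method) on a toy two-lane MDP. The abstract does not specify the state space, probabilities or reward values, so everything here is a hypothetical assumption chosen for readability: the state is `(lane, blocked)`, the actions are `keep` and `change`, and the numbers in `transition` and `reward` are invented. It is not the paper's model.

```python
# Toy two-lane MDP solved by value iteration.
# All states, probabilities and rewards are illustrative assumptions.

STATES = [(lane, blocked) for lane in (0, 1) for blocked in (False, True)]
ACTIONS = ["keep", "change"]


def transition(state, action):
    """Return {next_state: probability} for a (state, action) pair."""
    lane, blocked = state
    if action == "change":
        # Assumed: the adjacent lane is blocked by a slow car 20% of the time.
        new_lane = 1 - lane
        return {(new_lane, True): 0.2, (new_lane, False): 0.8}
    if blocked:
        # Assumed: a slow lead car rarely clears the lane on its own.
        return {(lane, True): 0.9, (lane, False): 0.1}
    return {(lane, True): 0.2, (lane, False): 0.8}


def reward(state, action, next_state):
    """Assumed reward: +1 for a free lane, -0.5 per lane change."""
    efficiency = 0.0 if next_state[1] else 1.0
    maneuver_cost = 0.5 if action == "change" else 0.0
    return efficiency - maneuver_cost


def q_value(state, action, values, gamma):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        prob * (reward(state, action, nxt) + gamma * values[nxt])
        for nxt, prob in transition(state, action).items()
    )


def value_iteration(gamma=0.95, tol=1e-8):
    """Iterate the Bellman optimality update until the values converge."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(q_value(s, a, values, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Extract the greedy policy from the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
        for s in STATES
    }
    return values, policy


values, policy = value_iteration()
for s in STATES:
    print(s, policy[s], round(values[s], 3))
```

Under these assumed numbers, the greedy policy changes lanes when the ego lane is blocked and keeps the lane otherwise. A realistic formulation would need a richer state (positions and speeds of surrounding cars) and a collision penalty in the reward.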