Funding: supported in part by the National Natural Science Foundation of China (NSFC, 62125106, 61860206003, and 62088102), in part by the Ministry of Science and Technology of China (2021ZD0109901), and in part by the Provincial Key Research and Development Program of Zhejiang (2021C01016).
Abstract: Anticipating others' actions is innate and essential for humans to navigate and interact well with others in dense crowds. This ability is urgently required for unmanned systems such as service robots and self-driving cars. However, existing solutions struggle to predict pedestrian anticipation accurately, because the influence of group-related social behaviors has not been well considered. While group relationships and group interactions are ubiquitous and significantly influence pedestrian anticipation, their influence is diverse and subtle, making it difficult to quantify explicitly. Here, we propose the group interaction field (GIF), a novel group-aware representation that quantifies pedestrian anticipation as a probability field of pedestrians' future locations and attention orientations. An end-to-end neural network, GIFNet, is tailored to estimate the GIF from explicit multidimensional observations. GIFNet quantifies the influence of group behaviors by formulating a group interaction graph with propagation and graph attention that is adaptive to the group size and dynamic interaction states. The experimental results show that the GIF effectively represents the change in pedestrians' anticipation under the prominent impact of group behaviors and accurately predicts pedestrians' future states. Moreover, the GIF contributes to explaining various predictions of pedestrians' behavior in different social states. The proposed GIF will eventually allow unmanned systems to work in a human-like manner and comply with social norms, thereby promoting harmonious human-machine relationships.
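To make the group interaction graph with propagation and graph attention more concrete, the sketch below shows one way a graph-attention layer can propagate features within pedestrian groups of variable size. This is a minimal illustration in PyTorch under assumed feature dimensions and attention form; the class and parameter names (GroupGraphAttention, proj, attn) are hypothetical and do not describe the authors' GIFNet implementation.

```python
# Hypothetical sketch of graph attention over a group interaction graph.
# Feature sizes and the attention formulation are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupGraphAttention(nn.Module):
    """One graph-attention layer over a pedestrian group of variable size."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (N, in_dim)  per-pedestrian features (e.g. position, velocity, heading)
        # adj: (N, N)       1 where two pedestrians share a group (self-loops included)
        h = self.proj(x)                                    # (N, out_dim)
        N = h.size(0)
        # Attention logits for every ordered (i, j) pair of pedestrians.
        pairs = torch.cat(
            [h.unsqueeze(1).expand(N, N, -1), h.unsqueeze(0).expand(N, N, -1)], dim=-1
        )
        logits = self.attn(pairs).squeeze(-1)               # (N, N)
        logits = logits.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(logits, dim=-1)               # adaptive to group size
        return F.elu(alpha @ h)                             # propagated group features

# Usage: three pedestrians, the first two in one group, the third walking alone.
x = torch.randn(3, 6)
adj = torch.tensor([[1., 1., 0.], [1., 1., 0.], [0., 0., 1.]])
out = GroupGraphAttention(6, 16)(x, adj)                    # (3, 16)
```

Because the attention weights are normalized over each pedestrian's own group members, the same layer handles groups of any size and adapts as group membership changes over time, which is the property the abstract attributes to GIFNet's group interaction graph.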
Abstract: This paper proposes an infrared human action recognition method based on C-L, a hybrid model combining a convolutional neural network (CNN) and a long short-term memory network (LSTM). First, each infrared frame is extracted from the infrared video and preprocessed to obtain the spatial and temporal information of the actions in the video. Next, the frames are fed into the CNN model for spatial feature extraction through convolution and pooling, and into the LSTM model for temporal feature extraction. Finally, the outputs of the two networks are combined through a decision-level score fusion strategy to obtain the classification result. On a self-built infrared human action dataset, comparative experiments were conducted to classify the ten designed actions, and good results were achieved.
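As a rough illustration of the two-stream design and decision-level score fusion described above, the following PyTorch sketch builds a per-frame CNN stream and an LSTM stream and fuses their class scores with a weighted sum. Layer sizes, the fusion weight, the input resolution, and all identifiers are assumptions made for illustration; the original C-L model's architectural details are not given in the abstract.

```python
# Hypothetical sketch of a two-stream CNN + LSTM classifier with decision-level
# score fusion. All layer sizes and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 10  # ten actions in the self-built infrared dataset

class SpatialCNN(nn.Module):
    """Spatial stream: per-frame convolution + pooling, scores averaged over time."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, NUM_CLASSES)

    def forward(self, clip):                                # clip: (B, T, 1, H, W)
        B, T = clip.shape[:2]
        f = self.features(clip.flatten(0, 1)).flatten(1)    # (B*T, 32)
        return self.fc(f).view(B, T, -1).mean(dim=1)        # (B, NUM_CLASSES)

class TemporalLSTM(nn.Module):
    """Temporal stream: frames flattened to vectors and modeled with an LSTM."""
    def __init__(self, frame_dim):
        super().__init__()
        self.lstm = nn.LSTM(frame_dim, 128, batch_first=True)
        self.fc = nn.Linear(128, NUM_CLASSES)

    def forward(self, clip):                                # clip: (B, T, 1, H, W)
        seq = clip.flatten(2)                               # (B, T, H*W)
        _, (h, _) = self.lstm(seq)
        return self.fc(h[-1])                               # (B, NUM_CLASSES)

def fuse_scores(clip, cnn, lstm, w=0.5):
    """Decision-level fusion: weighted sum of the two streams' class probabilities."""
    return w * cnn(clip).softmax(-1) + (1 - w) * lstm(clip).softmax(-1)

# Usage on a dummy 16-frame, 64x64 single-channel infrared clip.
clip = torch.randn(2, 16, 1, 64, 64)
scores = fuse_scores(clip, SpatialCNN(), TemporalLSTM(64 * 64))
pred = scores.argmax(dim=-1)
```

Fusing at the decision (score) level, rather than concatenating features, keeps the two streams independent during training and lets the spatial and temporal evidence be weighted against each other only at classification time, which matches the fusion strategy the abstract describes.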