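To address imprecise semantic segmentation results and coarse saliency maps for indoor scene images, this paper proposes FG-Net (feature regulator and dual-path guidance), a network architecture based on optimized multimodal feature extraction and dual-path guided decoding. Specifically, the feature regulator processes the multimodal features at each stage in sequence: noise filtering, re-weighted representation, differential complementation, and interactive fusion. By strengthening the aggregation of RGB and depth features, it optimizes the multimodal feature representation during feature extraction. The rich cross-modal cues produced by interactive fusion are then introduced in the decoding stage to further exploit the advantages of multimodal features. Combined with a dual-path cooperative guidance structure, the decoding stage fuses multi-scale, multi-level feature information and thereby outputs finer saliency maps. Experiments were conducted on the public datasets NYUD-v2 and SUN RGB-D, reaching 48.5% on the main evaluation metric, mIoU, which is better than other state-of-the-art algorithms. The results show that the algorithm achieves finer semantic segmentation of indoor scene images and exhibits good generalization and robustness.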
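For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection over union is commonly computed from predicted and ground-truth label maps. The function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both maps are assumptions made for illustration; the abstract does not describe the exact evaluation protocol used for FG-Net.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes present in either label map.

    pred, target: flat sequences of integer class ids, one per pixel.
    Assumption: classes that appear in neither map are skipped
    instead of being counted as IoU = 0.
    """
    inter = [0] * num_classes  # pixels where both maps say class c
    union = [0] * num_classes  # pixels where at least one map says class c
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A real evaluation would typically accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores.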
This study addresses the challenges of pedestrian detection and occlusion in AR applications. It uses RGB-D-based skeleton reconstruction to reduce the training overhead of classical pedestrian detection algorithms. To handle occlusion, it uses Azure Kinect for body tracking and integrates a robust occlusion management algorithm, which significantly improves detection efficiency. In experiments, the measured average latency was 204 milliseconds and the detection accuracy reached 97%. The approach has also been applied to build a simple augmented reality game, demonstrating the practical use of the algorithm.