Abstract
A typical approach to describing an image in loop closure detection for visual SLAM (vSLAM) is to extract a set of local patch descriptors and encode them into a co-occurrence vector. The most common patch encoding strategy is the bag-of-visual-words (BoVW) representation, which clusters the local descriptors into a visual vocabulary. With this representation, the distinctiveness of images is difficult to capture because most images contain similar texture information, which may lead to false positives. In this paper, the vocabulary is used as a whole by adopting the Fisher kernel (FK) framework. The new representation describes an image as the gradient vector of the likelihood function. These vectors are efficient to compute, can be compressed with minimal loss of accuracy using product quantization, and perform well in loop closure detection. The proposed method achieves a higher recall rate at 100% precision than state-of-the-art methods, and detection of bidirectional loops is also improved. With the proposed loop closure detection method, vSLAM systems may perceive the environment more efficiently by constructing a globally consistent map, which is potentially valuable for applications such as autonomous driving.
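To make the phrase "gradient vector of the likelihood function" concrete, the following is a minimal sketch of a Fisher vector, not the paper's implementation. It assumes the visual vocabulary is a diagonal-covariance Gaussian mixture model (the usual choice in the Fisher kernel literature; the abstract does not state the model), keeps only the gradient with respect to the component means, and applies plain L2 normalization. The function name `fisher_vector_means` and its parameters are hypothetical, and product quantization is omitted.

```python
import math


def fisher_vector_means(descriptors, weights, means, sigmas):
    """Fisher vector of a descriptor set, gradient w.r.t. GMM means only.

    Assumed inputs (illustrative, not taken from the paper):
      descriptors: list of N local descriptors, each a list of D floats
      weights:     K mixture weights
      means:       K component means, each of length D
      sigmas:      K per-dimension standard deviations, each of length D
    Returns a flat list of K * D floats, L2-normalized when non-zero.
    """
    n, k, d = len(descriptors), len(weights), len(means[0])
    grad = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Log of w_j * N(x | mu_j, diag(sigma_j^2)), up to a shared constant.
        log_p = []
        for j in range(k):
            s = math.log(weights[j])
            for i in range(d):
                z = (x[i] - means[j][i]) / sigmas[j][i]
                s -= 0.5 * z * z + math.log(sigmas[j][i])
            log_p.append(s)
        # Soft assignment (posterior) of x to each component, computed stably.
        m = max(log_p)
        p = [math.exp(v - m) for v in log_p]
        total = sum(p)
        for j in range(k):
            gamma = p[j] / total
            for i in range(d):
                grad[j][i] += gamma * (x[i] - means[j][i]) / sigmas[j][i]
    # Scale by 1 / (N * sqrt(w_j)) and flatten component by component.
    fv = [grad[j][i] / (n * math.sqrt(weights[j]))
          for j in range(k) for i in range(d)]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv
```

Unlike a BoVW histogram, which only counts how many descriptors fall into each visual word, this vector records how the descriptors deviate from every vocabulary component, which is one way to read "the vocabulary is used as a whole". Two images could then be compared by the dot product of their vectors.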
Source
Proceedings of the International Conference of Pioneering Computer Scientists, Engineers and Educators (ICPCSEE)
2022, Issue 1, pp. 219-239 (21 pages)