The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor l...The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally used scene-specific 3D representations or were trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposed a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) was employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensured accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) was designed to resolve relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image was computed, thereby achieving indoor user pose estimation. The proposed indoor localization method was characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrated a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrated the outstanding performance of the proposed indoor localization method.展开更多
To overcome the problem that the confusion between texts limits the precision in text re- trieval, a new text retrieval algorithm that decrease confusion (DCTR) is proposed. The algorithm constructs the searching te...To overcome the problem that the confusion between texts limits the precision in text re- trieval, a new text retrieval algorithm that decrease confusion (DCTR) is proposed. The algorithm constructs the searching template to represent the user' s searching intention through positive and negative training. By using the prior probabilities in the template, the supported probability and anti- supported probability of each text in the text library can be estimated for discrimination. The search- ing result can be ranked according to similarities between retrieved texts and the template. The com- plexity of DCTR is close to term frequency and mversed document frequency (TF-IDF). Its distin- guishing ability to confusable texts could be advanced and the performance of the result would be im- proved with increasing of training times.展开更多
文摘The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally used scene-specific 3D representations or were trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposed a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) was employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensured accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) was designed to resolve relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image was computed, thereby achieving indoor user pose estimation. The proposed indoor localization method was characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrated a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrated the outstanding performance of the proposed indoor localization method.
文摘To overcome the problem that the confusion between texts limits the precision in text re- trieval, a new text retrieval algorithm that decrease confusion (DCTR) is proposed. The algorithm constructs the searching template to represent the user' s searching intention through positive and negative training. By using the prior probabilities in the template, the supported probability and anti- supported probability of each text in the text library can be estimated for discrimination. The search- ing result can be ranked according to similarities between retrieved texts and the template. The com- plexity of DCTR is close to term frequency and mversed document frequency (TF-IDF). Its distin- guishing ability to confusable texts could be advanced and the performance of the result would be im- proved with increasing of training times.