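The Bag of Words algorithm is an effective object recognition algorithm based on semantic feature extraction and representation. Borrowing the strengths of text retrieval algorithms, it organizes an image into a collection of visual words and extracts the semantic features of objects, enabling effective detection and recognition of objects of interest. This article studies the framework and basic components of the Bag of Words algorithm.
This system applies the Bag of Words model to large-scale image retrieval. It extracts SIFT features from images using the OpenCV C library, clusters them with the K-means algorithm, and then represents each image as a normalized Bag of Words vector to perform retrieval over a large image collection. Experiments on the Caltech256 dataset show that the method adopted by the system is effective.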
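The pipeline described above (cluster local descriptors into a vocabulary, turn each image into a normalized word histogram, rank by similarity) can be sketched as follows. This is a minimal illustration, not the system's implementation: toy 2-D points stand in for 128-D SIFT descriptors, the k-means initialization (first k descriptors), the L2 normalization, and the cosine ranking are all assumptions, and the image names are hypothetical.

```python
import math

def nearest(centers, d):
    """Index of the center closest to descriptor d (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centers[i], d)))

def kmeans(descriptors, k, iters=10):
    """Plain Lloyd's k-means; initial centers are the first k descriptors (assumption)."""
    centers = [list(d) for d in descriptors[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def bow_vector(centers, descriptors):
    """L2-normalized histogram of visual-word counts for one image."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(centers, d)] += 1.0
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]

def cosine(u, v):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy "images": descriptors scattered around (0, 0) and (10, 10).
images = {
    "img_a": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.1), (10.0, 10.1)],
    "img_b": [(10.0, 9.9), (9.8, 10.0), (10.1, 10.2), (0.0, 0.2)],
    "img_c": [(0.1, 0.0), (0.0, 0.0), (0.2, 0.2), (9.9, 10.0)],
}
all_desc = [d for descs in images.values() for d in descs]
centers = kmeans(all_desc, k=2)

# Rank the other images against img_a by cosine similarity of BoW vectors.
query = bow_vector(centers, images["img_a"])
ranked = sorted((name for name in images if name != "img_a"),
                key=lambda name: cosine(query, bow_vector(centers, images[name])),
                reverse=True)
print(ranked)
```

In a real system the descriptors would come from a feature extractor and the vocabulary would hold hundreds or thousands of words; the ranking logic stays the same.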
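Abstract: A method is proposed for modeling and matching video content using the "bag of words" model. A dictionary of visual words is built by quantizing the local features of video frames, and each video sub-shot is represented as a set of visual words. On this basis, an inverted index over the visual word phrases of sub-shots is constructed for matching and retrieving video clips. The method preserves the saliency of local features and their relative positions, compresses the video representation effectively, and speeds up video matching and retrieval. Experimental results show that, compared with existing methods, the "bag of words" based video matching method achieves higher retrieval precision and retrieval speed on a large video sample library.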
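The inverted-index idea can be illustrated with a small sketch. It is a simplification under stated assumptions: it indexes single visual words rather than the visual word phrases the abstract mentions, it scores candidates by a plain count of shared words, and the sub-shot ids and word ids are hypothetical.

```python
from collections import Counter, defaultdict

def build_inverted_index(subshots):
    """Map each visual word id to the set of sub-shots that contain it."""
    index = defaultdict(set)
    for shot_id, words in subshots.items():
        for w in set(words):
            index[w].add(shot_id)
    return index

def match(index, query_words):
    """Vote for sub-shots sharing words with the query; most shared words first."""
    votes = Counter()
    for w in set(query_words):
        for shot_id in index.get(w, ()):
            votes[shot_id] += 1
    return votes.most_common()

# Hypothetical sub-shots, each already quantized into visual word ids.
subshots = {
    "s1": [3, 7, 7, 12],
    "s2": [3, 5, 9],
    "s3": [1, 2],
}
index = build_inverted_index(subshots)
results = match(index, [7, 12, 3])
print(results)
```

Only sub-shots that share at least one word with the query are ever touched, which is where the speed-up over exhaustive comparison comes from.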
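Abstract: Dried fruit images carry a large amount of information and are classified with low accuracy and at high time cost. To address this, the Bag of Words model is used to extract representative image features, and a naive Bayes classifier is used to classify the feature matrix. The results show an image classification accuracy of up to 80% with a classification time of about 2 s. Accuracy can be raised further by adding training samples, and applying Bag of Words to dried fruit image recognition and classification is feasible.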
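As a rough illustration of classifying Bag of Words histograms with naive Bayes, the sketch below uses the multinomial variant with Laplace smoothing. The abstract does not say which variant was used, so that choice is an assumption, and the class names and histograms are hypothetical toy values.

```python
import math
from collections import defaultdict

def train_nb(samples, vocab_size, alpha=1.0):
    """Fit multinomial naive Bayes on (histogram, label) pairs.

    Returns {label: (log_prior, [log P(word | label)])} with Laplace smoothing alpha.
    """
    class_counts = defaultdict(int)
    word_counts = defaultdict(lambda: [0.0] * vocab_size)
    for hist, label in samples:
        class_counts[label] += 1
        for i, c in enumerate(hist):
            word_counts[label][i] += c
    model = {}
    for label, n in class_counts.items():
        counts = word_counts[label]
        denom = sum(counts) + alpha * vocab_size
        model[label] = (math.log(n / len(samples)),
                        [math.log((c + alpha) / denom) for c in counts])
    return model

def predict(model, hist):
    """Return the label with the highest log-posterior for a word histogram."""
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(c * l for c, l in zip(hist, log_lik))
    return max(model, key=score)

# Hypothetical training histograms over a 3-word visual vocabulary.
train = [
    ([5, 1, 0], "walnut"),
    ([4, 2, 0], "walnut"),
    ([0, 1, 5], "almond"),
    ([1, 0, 4], "almond"),
]
model = train_nb(train, vocab_size=3)
print(predict(model, [3, 1, 0]), predict(model, [0, 0, 3]))
```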
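Abstract: The Bag of Words algorithm is introduced to wood image recognition, and its implementation for wood identification is described. Feature points are first extracted with SURF and then clustered to obtain cluster centers. Based on these centers, vector histograms are computed for each training tree species and for the image of the species to be identified. A classifier is then chosen to classify the wood images described by these histograms. This is expected to improve the efficiency of wood identification and to give people without specialist wood knowledge a fairly reliable way to identify tree species with reasonable accuracy.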
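Abstract: The Bag of Visual Words (BOV) algorithm is an effective object recognition algorithm based on semantic feature representation. To address the shortcomings of the traditional BOV model, a BOV algorithm based on the target of interest (TOI) is proposed that combines the gray-level and texture features of SAR images. First, TOIs are selected from the training images, their texture features are extracted with a gray-level co-occurrence matrix model and combined with gray-level features to form a set of multidimensional feature vectors, and the bag of visual words is generated using the criteria of highest intra-cluster similarity and greatest data distribution density. Second, test images are classified against the generated bag of visual words with a support vector machine (SVM) classifier, giving effective classification of targets of interest in SAR images. Experimental results show that, compared with the traditional bag of visual words construction algorithm, the proposed algorithm improves classification accuracy and also classifies well when few training images are available.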
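The gray-level co-occurrence matrix used for the texture features can be sketched in a few lines. This is an illustrative computation only: the pixel offset, the non-symmetric counting, and the two statistics shown (contrast and energy) are assumptions, since the abstract does not state which GLCM settings or statistics were used. The 4x4 patch is a toy example.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for pixel offset (dx, dy).

    Entry [i][j] is the fraction of in-bounds pixel pairs whose first pixel
    has gray level i and whose offset neighbour has gray level j.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Sum of p[i][j] * (i - j)^2; larger for rougher textures."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

def energy(p):
    """Sum of squared entries; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)

# Toy 4x4 patch quantized to 4 gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = [contrast(p), energy(p)]
print(features)
```

Statistics like these would be concatenated with gray-level features to form the multidimensional vectors that are then clustered into visual words.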
Funding: Projects (41001260, 61173122, 61573380) supported by the National Natural Science Foundation of China; Project (11JJ5044) supported by the Hunan Provincial Natural Science Foundation of China
Abstract: It is illegal to spread and transmit pornographic images over the Internet, whether real or artificial. Traditional methods are designed to identify real pornographic images and are less effective on artificial ones. As a result, criminals have turned to releasing artificial pornographic images in specific settings, e.g., social networks. To identify artificial pornographic images efficiently, a novel bag-of-visual-words based approach is proposed in this work. Within the bag-of-words (BoW) framework, speeded-up robust features (SURF) are first extracted, a visual vocabulary is then constructed through K-means clustering and images are represented by an improved BoW encoding method, and finally the visual words are fed into a learning machine for training and classification. Unlike the traditional BoW method, the proposed method sets a weight on each visual word according to the number of features its cluster contains. Moreover, a non-binary encoding method and a cross-matching strategy are used to improve the discriminative power of the visual words. Experimental results indicate that the proposed method outperforms the traditional method.
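One possible reading of the cluster-size weighting with a non-binary encoding is sketched below. The abstract does not give the weighting formula, so the choice here (weight proportional to the share of training features in each cluster, accumulated once per occurrence) is purely an assumption for illustration, as are the cluster sizes and word ids.

```python
def cluster_weights(cluster_sizes):
    """Weight each visual word by its cluster's share of all training features.

    Assumed formula: the source only says the weight depends on cluster size.
    """
    total = sum(cluster_sizes)
    return [size / total for size in cluster_sizes]

def encode(word_ids, weights):
    """Non-binary encoding: each occurrence of a word adds that word's weight."""
    vec = [0.0] * len(weights)
    for w in word_ids:
        vec[w] += weights[w]
    return vec

# Hypothetical vocabulary of 3 words whose clusters hold 50, 30 and 20 features.
weights = cluster_weights([50, 30, 20])

# An image whose features were quantized to words 0, 0 and 2.
vec = encode([0, 0, 2], weights)
print(vec)
```

A binary encoding would record only whether each word is present; accumulating per occurrence keeps the frequency information that the weights then rescale.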
Abstract: Imaging and computer vision systems make it possible to study human physiology quantitatively. In contrast, manual interpretation requires a tremendous amount of work, expertise, and processing time. This work presents an algorithm that integrates image processing and machine learning to diagnose diabetic retinopathy from retinal fundus images. The automated method classifies diabetic retinopathy (or its absence) using a dataset collected from publicly available databases such as DRIDB0, DRIDB1, MESSIDOR, STARE and HRF. Our approach uses a bag of words model with Speeded Up Robust Features and demonstrates classification over 180 fundus images containing lesions (hard exudates, soft exudates, microaneurysms, and haemorrhages) and non-lesions, with an accuracy of 94.4%, precision of 94%, recall and F1-score of 94%, and AUC of 95%. Thus, the proposed approach presents a path toward precise, automated diabetic retinopathy diagnosis on a massive scale.
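For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, and F1-score follow from binary predictions. The labels are hypothetical toy values chosen for illustration and have no relation to the study's data or results.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = lesion present, 0 = no lesion.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```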
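Abstract: When a mobile robot maps large outdoor scenes, inaccurate pose estimates from lidar odometry degrade the accuracy of SLAM (simultaneous localization and mapping). To address this, MSW-SLAM (multi-sensor information fusion SLAM based on semantic word bags), a SLAM semantic bag-of-words optimization algorithm based on multi-sensor information fusion, is proposed. Raw lidar observations are introduced into a visual-inertial system, and a sliding window is used for joint nonlinear optimization of multi-source data comprising IMU (inertial measurement unit) measurements, visual features, and lidar point cloud features. Finally, the algorithm exploits the complementary nature of the visual and lidar semantic bags of words for loop closure optimization, further improving the global localization and mapping accuracy of the multi-sensor fusion SLAM system. Experimental results show that, compared with traditional tightly coupled stereo visual-inertial odometry and lidar odometry localization, MSW-SLAM effectively detects loop closures along the trajectory and achieves high-precision global pose graph optimization, and the point cloud map after loop closure detection has good resolution and global consistency.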
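Abstract: Loop closure detection is a basic component of Simultaneous Localization and Mapping (SLAM) algorithms. It associates feature information across views of the same scene and provides globally consistent pose estimates. Loop closure detection based on the Bag of Words (BoW) model has been highly effective in visual SLAM, but for lidar SLAM the mainstream methods cannot recognize loop scenes effectively in real time and usually cannot correct the full six-degree-of-freedom (6-DOF) loop pose. To address these problems, this article proposes a bag-of-words model based on a linear keypoint feature representation for real-time loop closure detection in lidar SLAM. The model is computationally efficient and meets the real-time requirements of autonomous driving. The algorithm also corrects poses stably and can be used for accurate point-to-point matching. The proposed method is embedded in a lidar SLAM algorithm and its loop closure performance is evaluated on public datasets. The results show that the bag-of-words based loop closure detection algorithm outperforms existing mainstream methods in lidar SLAM.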
Funding: Supported by the National Natural Science Foundation of China (61771051, 61675025).
Abstract: Two learning models based on the Zolu function are proposed: Zolu-continuous bags of words (ZL-CBOW) and Zolu-skip-grams (ZL-SG). The Zolu function changes the slope of ReLU in word2vec. The proposed models can process extremely large data sets as well as word2vec does, without increasing complexity. The models also outperform several word embedding methods in both word similarity and syntactic accuracy. ZL-CBOW outperforms CBOW in accuracy by 8.43% on the capital-world training set and by 1.24% on the plural-verbs training set. Moreover, experimental simulations on word similarity and syntactic accuracy show that ZL-CBOW and ZL-SG are superior to LL-CBOW and LL-SG, respectively.