Journal Articles
10 articles found
1. Analyzing the Impact of Scene Transitions on Indoor Camera Localization through Scene Change Detection in Real-Time
Authors: Muhammad S. Alam, Farhan B. Mohamed, Ali Selamat, Faruk Ahmed, AKM B. Hossain. Intelligent Automation & Soft Computing, 2024, No. 3, pp. 417-436 (20 pages)
Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during the image sequence, and this plays a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposes a real-time indoor camera localization system based on a recurrent neural network that detects scene changes during the image sequence. The proposed system is trained on an annotated image dataset and predicts the camera pose in real time. The system mainly improves the localization performance of indoor cameras by predicting the camera pose more accurately. It also recognizes scene changes during the sequence and evaluates the effects of these changes, achieving high accuracy and real-time performance. Scene change detection is performed using visual rhythm together with the proposed recurrent deep architecture, which carries out camera pose prediction and evaluates the impact of scene changes. Overall, this study proposes a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
Keywords: camera pose estimation; indoor camera localization; real-time localization; scene change detection; simultaneous localization and mapping (SLAM)
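To make the recurrent pose-prediction idea in this abstract concrete, here is a minimal PyTorch sketch of an LSTM head that regresses a per-frame camera pose (translation plus quaternion) from per-frame image features. The class name PoseRNN and all dimensions are illustrative assumptions; the paper does not publish this code.

```python
# Hypothetical sketch: a recurrent pose-regression head over per-frame features.
# Names and dimensions are illustrative; this is not the paper's implementation.
import torch
import torch.nn as nn

class PoseRNN(nn.Module):
    def __init__(self, feat_dim=512, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.pose_head = nn.Linear(hidden, 7)    # 3-D translation + unit quaternion

    def forward(self, feats):                    # feats: (batch, seq_len, feat_dim)
        out, _ = self.rnn(feats)
        return self.pose_head(out)               # one pose estimate per frame

# Usage: per-frame CNN features for two 10-frame sequences
feats = torch.randn(2, 10, 512)
print(PoseRNN()(feats).shape)                    # torch.Size([2, 10, 7])
```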
2. YOLOv5ST: A Lightweight and Fast Scene Text Detector
Authors: Yiwei Liu, Yingnan Zhao, Yi Chen, Zheng Hu, Min Xia. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 909-926 (18 pages)
Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices like drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation, including replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating a Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a remarkable 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work represents a significant advancement in scene text detection, striking a balance between speed and accuracy that makes it well-suited for performance-constrained environments.
Keywords: scene text detection; YOLOv5; lightweight; object detection
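As a rough illustration of the depth-wise substitution named in this abstract, the following PyTorch sketch shows a depthwise-separable block of the kind typically used to replace a standard 3x3 convolution in a YOLOv5-style backbone. The class name DWConvBlock and the chosen channel sizes are assumptions, not the authors' implementation.

```python
# Sketch of the depth-wise + point-wise substitution for a standard 3x3 conv,
# commonly used to lighten a YOLOv5-style backbone. Illustrative only.
import torch
import torch.nn as nn

class DWConvBlock(nn.Module):
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.dw = nn.Conv2d(c_in, c_in, k, s, k // 2, groups=c_in, bias=False)  # depth-wise
        self.pw = nn.Conv2d(c_in, c_out, 1, 1, 0, bias=False)                   # point-wise
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pw(self.dw(x))))

x = torch.randn(1, 64, 80, 80)
print(DWConvBlock(64, 128, s=2)(x).shape)   # torch.Size([1, 128, 40, 40])
```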
3. A Modified Method for Scene Text Detection by ResNet (Cited by 2)
Authors: Shaozhang Niu, Xiangxiang Li, Maosen Wang, Yueying Li. Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 2233-2245 (13 pages)
In recent years, images have played a more and more important role in our daily life and social communication. To some extent, the textual information contained in pictures is an important factor in understanding the content of the scenes themselves. The more accurate the text detection of natural scenes is, the more accurate our semantic understanding of the images will be. Thus, scene text detection has become a hot spot in the domain of computer vision. In this paper, we present a modified text detection network based on further research and improvement of the Connectionist Text Proposal Network (CTPN) proposed by previous researchers. To extract deeper features that are less affected by different images, we use a Residual Network (ResNet) to replace the Visual Geometry Group Network (VGGNet) used in the original network. Meanwhile, to enhance the robustness of the model across multiple languages, we train on the multi-lingual scene text detection and script identification dataset (MLT) of the 2017 International Conference on Document Analysis and Recognition (ICDAR2017). In addition, an attention mechanism is used to obtain a more reasonable weight distribution. The proposed model achieves a 0.91 F1-score on the ICDAR2011 test set, about 5% better than CTPN trained on the same data.
Keywords: CTPN; scene text detection; ResNet; attention
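A minimal sketch of the backbone swap described above, assuming a recent torchvision: truncate ResNet-50 at the stride-16 stage so it can stand in for the VGG-16 feature extractor of a CTPN-style detector. The cut-off point is an assumption; the paper does not specify its exact configuration here.

```python
# Sketch: truncating a torchvision ResNet-50 to serve as a CTPN-style backbone
# in place of VGG-16. The stage to cut at is an assumption, not the paper's choice.
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50(weights=None)
# keep conv1 ... layer3 (stride-16 features); drop layer4, avgpool and fc
feature_extractor = nn.Sequential(*list(backbone.children())[:-3])

x = torch.randn(1, 3, 224, 224)
feat = feature_extractor(x)
print(feat.shape)   # torch.Size([1, 1024, 14, 14]) -- 1/16 resolution, as CTPN expects
```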
4. Label Enhancement for Scene Text Detection
Authors: MEI Junjun, GUAN Tao, TONG Junwen. ZTE Communications, 2022, No. 4, pp. 89-95 (7 pages)
Segmentation-based scene text detection has drawn a great deal of attention, as it can describe text instances with arbitrary shapes based on pixel-level prediction. However, most segmentation-based methods suffer from complex post-processing to separate text instances that are close to each other, resulting in considerable time consumption during the inference procedure. In this paper, a label enhancement method is proposed to construct two kinds of training labels for segmentation-based scene text detection. The label distribution learning (LDL) method is used to overcome the problem brought by pure shrunk text labels, which might result in suboptimal detection performance. Experimental results on three benchmarks demonstrate that the proposed method consistently improves performance without sacrificing inference speed.
Keywords: label enhancement; scene text detection; semantic segmentation
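One plausible way to implement the label enhancement idea, sketched under the assumption that soft labels are built by decaying the shrunk text kernel with a distance transform; the helper soft_text_label and the decay rate are hypothetical, not the paper's construction.

```python
# Assumed construction: soften a binary shrunk-text mask into a label distribution
# by decaying the positive label with distance from the shrunk kernel.
import numpy as np
from scipy.ndimage import distance_transform_edt

def soft_text_label(shrunk_mask, decay=0.2):
    """shrunk_mask: HxW binary array, 1 inside the shrunk text kernel."""
    dist = distance_transform_edt(1 - shrunk_mask)   # distance to the kernel
    soft = np.exp(-decay * dist)                     # 1 inside, decaying outside
    return soft.astype(np.float32)

mask = np.zeros((8, 8), dtype=np.uint8)
mask[3:5, 2:6] = 1
print(soft_text_label(mask).round(2))
```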
5. Automatic Mid-Level Concepts Clustering for Violent Movie Scenes Detection
Authors: Terumasa AOKI, Shinichi GOTO. Journal of Mathematics and System Science, 2014, No. 9, pp. 609-619 (11 pages)
This paper presents a novel system for violent scene detection based on machine learning over visual and audio features. MKL (Multiple Kernel Learning) is applied so that the multimodality of videos can be fully exploited. The largest feature of our system is that mid-level concept clustering is proposed and implemented in order to learn mid-level concepts implicitly. With this algorithm, our system does not need manually tagged annotations. The whole system is trained on the dataset from the MediaEval 2013 Affect Task and evaluated with its official metric. The obtained results outperform the task's best score.
Keywords: multimedia analysis; video processing; violence scenes detection; MediaEval; machine learning; MKL (Multiple Kernel Learning)
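As a simplified stand-in for the multiple kernel learning step, the sketch below fuses a visual and an audio RBF kernel with fixed weights and trains a precomputed-kernel SVM on toy data. Real MKL learns the kernel weights; the 0.6/0.4 weights and feature sizes here are arbitrary assumptions.

```python
# Simplified stand-in for MKL: combine visual and audio RBF kernels with fixed
# weights and train an SVM on the precomputed kernel. Real MKL learns the weights.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_vis, X_aud = rng.normal(size=(60, 128)), rng.normal(size=(60, 40))
y = rng.integers(0, 2, size=60)              # 1 = violent scene, 0 = non-violent (toy labels)

K = 0.6 * rbf_kernel(X_vis) + 0.4 * rbf_kernel(X_aud)   # fused multimodal kernel
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.score(K, y))                       # training accuracy on toy data
```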
6. Adaptive Multi-Scale HyperNet with Bi-Direction Residual Attention Module for Scene Text Detection
Authors: Junjie Qu, Jin Liu, Chao Yu. Journal of Information Hiding and Privacy Protection, 2021, No. 2, pp. 83-89 (7 pages)
Scene text detection is an important step in a scene text reading system. Two problems remain in existing text detection methods: (1) the small receptive field of shallow convolutional layers is not sufficiently sensitive to the target area in the image; (2) the large receptive field of deep convolutional layers loses a lot of spatial feature information. Therefore, detecting scene text remains a challenging issue. In this work, we design an effective text detector named Adaptive Multi-Scale HyperNet (AMSHN) to improve text detection performance. Specifically, AMSHN enhances the sensitivity to target semantics in shallow features with a new attention mechanism, which strengthens the regions of interest in the image and weakens the regions of no interest. In addition, it reduces the loss of spatial features by fusing features along multiple paths, which significantly improves text detection performance. Experimental results on the Robust Reading Challenge on Reading Chinese Text on Signboard (ReCTS) dataset show that the proposed method achieves state-of-the-art results, demonstrating the ability of our detector in both specialized and general applications.
Keywords: deep learning; scene text detection; attention mechanism
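The attention idea in this abstract, strengthening regions of interest in shallow features and weakening the rest, can be illustrated with a minimal residual spatial-attention gate in PyTorch. The class SpatialAttention is a generic sketch, not the paper's bi-direction residual attention module.

```python
# Minimal residual spatial-attention gate over a shallow feature map: emphasize
# likely text regions, suppress the rest. Generic idea only, not the paper's module.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        attn = torch.sigmoid(self.score(x))   # (B, 1, H, W) attention map in [0, 1]
        return x + x * attn                   # residual re-weighting of the features

x = torch.randn(1, 64, 100, 100)
print(SpatialAttention(64)(x).shape)          # torch.Size([1, 64, 100, 100])
```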
7. Label distribution learning for scene text detection (Cited by 1)
Authors: Haoyu MA, Ningning LU, Junjun MEI, Tao GUAN, Yu ZHANG, Xin GENG. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 6, pp. 5-12 (8 pages)
Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved texts. However, existing methods usually need complex post-processing stages to process ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use the label distribution learning method to handle the label ambiguity of text annotations, which achieves good performance without an additional post-processing stage. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.
Keywords: scene text detection; multi-task learning; label distribution learning
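A hedged sketch of the training-side counterpart: a pixel-wise KL divergence between the predicted text-probability map and a soft label map, which is the usual loss shape in label distribution learning. The exact formulation used by the authors may differ; ldl_loss below is an assumed form.

```python
# Assumed form: pixel-wise KL divergence between a predicted text-probability map
# and a soft label map, the typical loss used in label distribution learning.
import torch
import torch.nn.functional as F

def ldl_loss(logits, soft_label, eps=1e-6):
    """logits, soft_label: (B, 1, H, W); soft_label values in (0, 1)."""
    p = torch.sigmoid(logits).clamp(eps, 1 - eps)
    q = soft_label.clamp(eps, 1 - eps)
    # KL over the two-class (text / non-text) distribution at every pixel
    kl = q * torch.log(q / p) + (1 - q) * torch.log((1 - q) / (1 - p))
    return kl.mean()

logits = torch.randn(2, 1, 32, 32)
soft = torch.rand(2, 1, 32, 32)
print(ldl_loss(logits, soft).item())
```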
8. A Character Flow Framework for Multi-Oriented Scene Text Detection (Cited by 1)
Authors: Wen-Jun Yang, Bei-Ji Zou, Kai-Wen Li, Shu Liu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 3, pp. 465-477 (13 pages)
Scene text detection plays a significant role in various applications, such as object recognition, document management, and visual navigation. The instance segmentation based method has been mostly used in existing research due to its advantages in dealing with multi-oriented texts. However, a large number of non-text pixels exist in the labels during model training, leading to text mis-segmentation. In this paper, we propose a novel multi-oriented scene text detection framework, which includes two main modules: character instance segmentation (one instance corresponds to one character), and character flow construction (one character flow corresponds to one word). We use a feature pyramid network (FPN) to predict character and non-character instances with arbitrary directions. A joint network of FPN and bidirectional long short-term memory (BLSTM) is developed to explore the context information among isolated characters, which are finally grouped into character flows. Extensive experiments are conducted on the ICDAR2013, ICDAR2015, MSRA-TD500 and MLT datasets to demonstrate the effectiveness of our approach. The F-measures are 92.62%, 88.02%, 83.69% and 77.81%, respectively.
Keywords: multi-oriented scene text detection; character instance segmentation; character flow; feature pyramid network (FPN); bidirectional long short-term memory (BLSTM)
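To illustrate the FPN+BLSTM grouping stage, the sketch below runs a bidirectional LSTM over per-character embeddings and predicts, for each character, whether it links to the next one in a character flow. The class CharFlowLinker, its dimensions, and the link head are illustrative assumptions, not the published model.

```python
# Sketch of the context stage: a bidirectional LSTM over per-character embeddings
# followed by a link/no-link head. Dimensions and the head are assumptions.
import torch
import torch.nn as nn

class CharFlowLinker(nn.Module):
    def __init__(self, char_dim=256, hidden=128):
        super().__init__()
        self.blstm = nn.LSTM(char_dim, hidden, batch_first=True, bidirectional=True)
        self.link = nn.Linear(2 * hidden, 1)       # link / no-link to the next character

    def forward(self, chars):                       # chars: (B, num_chars, char_dim)
        ctx, _ = self.blstm(chars)
        return torch.sigmoid(self.link(ctx)).squeeze(-1)

chars = torch.randn(1, 12, 256)                     # embeddings of 12 detected characters
print(CharFlowLinker()(chars).shape)                # torch.Size([1, 12])
```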
9. A Method of Text Extremum Region Extraction Based on Joint-Channels (Cited by 1)
Authors: Xueming Qiao, Weiyi Zhu, Dongjie Zhu, Liang Kong, Yingxue Xia, Chunxu Lin, Zhenhao Guo, Yiheng Sun. Journal on Artificial Intelligence, 2020, No. 1, pp. 29-37 (9 pages)
Natural scene recognition has important significance and value in the fields of image retrieval, autonomous navigation, human-computer interaction and industrial automation. First, non-text content takes up a relatively high proportion of a natural scene image; second, natural scene images have cluttered backgrounds and complex lighting conditions, angles, fonts and colors. Therefore, how to extract text extremum regions efficiently from complex and varied natural scene images plays an important role in natural scene text recognition. In this paper, a Text extremum region Extraction algorithm based on Joint-Channels (TEJC) is proposed. On the one hand, it addresses the problem that the maximally stable extremal region (MSER) algorithm is only suitable for gray images and has difficulty processing color images. On the other hand, it addresses the problem that the MSER algorithm has high complexity and low accuracy when extracting the most stable extremal regions. The proposed algorithm is tested and evaluated on the ICDAR dataset, and the experimental results show its superiority.
Keywords: feature extraction; scene text detection; scene text feature extraction; extreme region
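The joint-channel idea can be approximated with OpenCV by running MSER on each color channel separately and pooling the candidate regions, as sketched below; this is only the baseline the TEJC algorithm improves on, not the TEJC algorithm itself.

```python
# Rough baseline: run MSER on each colour channel of a BGR image and pool the
# candidate regions. The TEJC algorithm itself is not reproduced here.
import cv2
import numpy as np

def joint_channel_mser(bgr_image):
    mser = cv2.MSER_create()
    candidates = []
    for c in range(3):                               # B, G and R channels
        regions, _ = mser.detectRegions(bgr_image[:, :, c])
        candidates.extend(regions)
    return candidates

img = np.random.randint(0, 256, (120, 200, 3), dtype=np.uint8)   # toy image
print(len(joint_channel_mser(img)), "candidate regions")
```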
10. JudPriNet: Video transition detection based on semantic relationship and Monte Carlo sampling
Authors: Bo Ma, Jinsong Wu, Wei Qi Yan. Intelligent and Converged Networks (EI), 2024, No. 2, pp. 134-146 (13 pages)
Video understanding and content boundary detection are vital stages in video recommendation. However, previous content boundary detection methods require collecting information including location, cast, action, and audio, and if any of these elements is missing, the results may be adversely affected. To address this issue and effectively detect transitions in video content, in this paper we introduce a video classification and boundary detection method named JudPriNet. The focus of this paper is on objects in videos along with their labels, enabling automatic scene detection in video clips and establishing semantic connections among local objects in the images. As a significant contribution, JudPriNet presents a framework that maps labels to a "Continuous Bag of Visual Words" model to cluster labels and generates new standardized labels as video-type tags, which facilitates automatic classification of video clips. Furthermore, JudPriNet employs a Monte Carlo sampling method to classify video clips, treating the features of video clips as elements within the framework. The proposed method seamlessly integrates video and textual components without compromising training and inference speed. Through experimentation, we have demonstrated that JudPriNet, with its semantic connections, is able to effectively classify videos alongside textual content. Our results indicate that, compared with several other detection approaches, JudPriNet excels in high-level content detection without disrupting the integrity of the video content, outperforming existing methods.
Keywords: video scene detection; Monte Carlo; object detection; Continuous Bag-of-Words
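As a hypothetical illustration of the label-clustering step, the sketch below embeds per-clip object labels with a CBOW word2vec model (gensim) and clusters the clip embeddings into scene-type tags with k-means. The toy labels, model parameters, and the use of gensim and scikit-learn are all assumptions, not the authors' pipeline.

```python
# Hypothetical pipeline: embed per-clip object labels with a CBOW word2vec model
# and cluster the clip embeddings into scene-type tags. Not the authors' code.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

clip_labels = [
    ["person", "car", "traffic_light"],
    ["person", "bicycle", "road"],
    ["sofa", "tv", "person"],
    ["bed", "lamp", "person"],
]
w2v = Word2Vec(clip_labels, vector_size=16, window=3, min_count=1, sg=0)  # sg=0 -> CBOW

# represent each clip by the mean embedding of its labels, then cluster into scene tags
clip_vecs = [sum(w2v.wv[l] for l in labels) / len(labels) for labels in clip_labels]
tags = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(clip_vecs)
print(tags)   # e.g. [0 0 1 1]: street-like clips vs. indoor-like clips
```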