Journal Articles
14 articles found
1. Automatic Video Segmentation Algorithm by Background Model and Color Clustering
Authors: 沙芸, 王军, 刘玉树. Journal of Beijing Institute of Technology (EI, CAS), 2003, No. S1, pp. 134-138.
In order to detect objects in video efficiently, an automatic, real-time video segmentation algorithm based on a background model and color clustering is proposed. The algorithm consists of four phases: background restoration, moving object extraction, moving object region clustering, and post-processing. The threshold for background restoration is not given in advance; it is obtained automatically. A new object region clustering algorithm based on the background model and color clustering is proposed to remove significant noise, and an efficient shadow elimination method is also used. The approach was compared with other methods on pixel error ratio, and the experimental results indicate that the algorithm is correct and efficient.
Keywords: video segmentation; background restoration; object region clustering
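The abstract does not specify how the background-restoration threshold is derived; below is a minimal sketch of the general idea, assuming an Otsu-style automatic threshold on the frame/background difference (the function name and OpenCV usage are illustrative, not the authors' implementation):

```python
import cv2

def segment_foreground(frame, background):
    # Absolute difference between the current frame and the restored background.
    diff = cv2.absdiff(frame, background)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    # Otsu derives the binarization threshold from the histogram automatically,
    # so no threshold needs to be given in advance.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```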
2. Objective Performance Evaluation of Video Segmentation Algorithms with Ground-Truth (cited: 1)
Authors: 杨高波, 张兆扬. Journal of Shanghai University (English Edition) (CAS), 2004, No. 1, pp. 70-74.
While the development of particular video segmentation algorithms has attracted considerable research interest, relatively little effort has been devoted to providing a methodology for evaluating their performance. In this paper, we propose a methodology to objectively evaluate video segmentation algorithms with ground truth, based on computing the deviation of segmentation results from the reference segmentation. Four different metrics, based respectively on misclassified pixels, edges, relative foreground area, and relative position, are combined to address spatial accuracy. Temporal coherency is evaluated using the difference in spatial accuracy between successive frames. The experimental results show the feasibility of our approach. Moreover, it is computationally more efficient than previous methods. It can be applied to provide an offline ranking among different segmentation algorithms and to optimally set the parameters of a given algorithm.
Keywords: video object segmentation; performance evaluation; MPEG-4
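A sketch of the kind of metrics the abstract describes: spatial accuracy from pixel misclassification against the ground truth, and temporal coherency as the change in spatial accuracy between successive frames (the paper combines three further spatial metrics not shown here):

```python
import numpy as np

def spatial_accuracy(mask, gt):
    # Fraction of pixels classified identically to the reference segmentation.
    return 1.0 - np.logical_xor(mask, gt).mean()

def temporal_coherency(acc_prev, acc_curr):
    # Difference of spatial accuracy between successive frames.
    return abs(acc_curr - acc_prev)
```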
3. Coarse-to-Fine Video Instance Segmentation With Factorized Conditional Appearance Flows (cited: 1)
Authors: Zheyun Qin, Xiankai Lu, Xiushan Nie, Dongfang Liu, Yilong Yin, Wenguan Wang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 5, pp. 1192-1208.
We introduce a novel method using a new generative model that automatically learns effective representations of the target and background appearance to detect, segment, and track each instance in a video sequence. Unlike current discriminative tracking-by-detection solutions, our proposed hierarchical structural embedding learning can predict higher-quality masks with accurate boundary details over spatio-temporal space via normalizing flows. We formulate the instance inference procedure as hierarchical spatio-temporal embedding learning across time and space. Given a video clip, our method first coarsely locates pixels belonging to a particular instance with a Gaussian distribution and then builds a novel mixing distribution to refine the instance boundary by fusing hierarchical appearance embedding information in a coarse-to-fine manner. For the mixing distribution, we estimate the distribution parameters with factorized conditional normalizing flows to improve segmentation performance. Comprehensive qualitative, quantitative, and ablation experiments are performed on three representative video instance segmentation benchmarks (YouTube-VIS19, YouTube-VIS21, and OVIS), and the effectiveness of the proposed method is demonstrated. More impressively, the superior performance of our model on an unsupervised video object segmentation dataset (DAVIS19) proves its generalizability. Our algorithm implementations are publicly available at https://github.com/zyqin19/HEVis.
Keywords: embedding learning; generative model; normalizing flows; video instance segmentation (VIS)
4. Automatic Video Segmentation Based on Information Centroid and Optimized SaliencyCut
Authors: Hui-Si Wu, Meng-Shu Liu, Lu-Lu Yin, Ping Li, Zhen-Kun Wen, Hon-Cheng Wong. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, No. 3, pp. 564-575.
We propose an automatic video segmentation method based on an optimized SaliencyCut equipped with information centroid (IC) detection, following the level-balance principle from physics. Unlike existing methods, the IC provides image information of another dimension to enhance video segmentation accuracy. Specifically, our IC is computed from the information-level balance principle in the image and serves as an information pivot that aggregates all the image information to a point. To effectively enhance the saliency of the target object and suppress the background, we combine the color and coordinate information of the image when calculating the local IC and the global IC. Saliency maps for all frames in the video are then calculated based on the detected IC. By applying IC smoothing to the optimized saliency detection, we can further correct unsatisfactory saliency maps where sharp variations of colors or motions exist in complex videos. Finally, we obtain segmentation results from the IC-based saliency maps and the optimized SaliencyCut. Our method is evaluated on the DAVIS dataset, which consists of various kinds of challenging videos, and compared with state-of-the-art methods. Convincing visual results and statistical comparisons demonstrate its advantages and robustness for automatic video segmentation.
Keywords: automatic video segmentation; information centroid; saliency detection; optimized SaliencyCut
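As an illustration of the centroid idea, here is a minimal sketch of an information centroid as the weighted balance point of an image; the paper's actual IC combines color and coordinate information, which this simplification omits:

```python
import numpy as np

def information_centroid(weight):
    # Weighted "balance point" of the image: all information aggregated
    # to a single pivot, per the level-balance analogy.
    h, w = weight.shape
    ys, xs = np.mgrid[0:h, 0:w]
    total = weight.sum() + 1e-8
    cy = (ys * weight).sum() / total
    cx = (xs * weight).sum() / total
    return cy, cx
```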
5. Non-interactive automatic video segmentation of moving targets
Authors: Yu Zhou, An-wen Shen, Jin-bang Xu. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2012, No. 10, pp. 736-749.
Extracting moving targets from video accurately is of great significance in the field of intelligent transport, and is closely related to video segmentation and matting. In this paper, we propose a non-interactive automatic segmentation method for extracting moving targets. First, motion in the video is detected with orthogonal Gaussian-Hermite moments and the Otsu algorithm, and the detected pixels are treated as foreground seeds. Second, background seeds are generated by applying a distance transformation to the foreground seeds. Third, the foreground and background seeds are treated as extra constraints, and a mask is generated using graph cuts or a closed-form solution. Comparison showed that the closed-form solution based on soft segmentation performs better and that the extra constraint has a larger impact on the result than the other parameters. Experiments demonstrated that the proposed method can effectively extract moving targets from video in real time.
Keywords: video segmentation; auto-generated seeds; cost function; alpha matte
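A hedged sketch of the seed-generation steps as described: Otsu-thresholded motion as foreground seeds and a distance transform for background seeds (bg_margin is an assumed parameter, not from the paper):

```python
import numpy as np
import cv2

def generate_seeds(motion_map, bg_margin=25):
    # Normalize the motion response and binarize it with Otsu to get
    # foreground seeds.
    m = cv2.normalize(motion_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, fg = cv2.threshold(m, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Pixels sufficiently far from any foreground seed become background seeds.
    dist = cv2.distanceTransform(255 - fg, cv2.DIST_L2, 3)
    bg = (dist > bg_margin).astype(np.uint8) * 255
    return fg, bg
```

The seeds would then serve as the extra hard constraints for the graph-cuts or closed-form matting stage.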
6. Scribble-Supervised Video Object Segmentation (cited: 1)
Authors: Peiliang Huang, Junwei Han, Nian Liu, Jun Ren, Dingwen Zhang. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2022, No. 2, pp. 339-353.
Recently, video object segmentation has received great attention in the computer vision community. Most existing methods rely heavily on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt at video object segmentation with scribble-level supervision, which can spare large amounts of human labor for collecting manual annotations. However, conventional network architectures and learning objectives do not work well under this scenario because the supervision is highly sparse and incomplete. To address this issue, this paper introduces two novel elements for learning the video object segmentation model. The first is a scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The second is a scribble-supervised loss, which optimizes the unlabeled pixels and dynamically corrects inaccurately segmented areas during training. To evaluate the proposed method, we run experiments on two video object segmentation benchmarks, YouTube-VOS and DAVIS-2017. We first generate scribble annotations from the original per-pixel annotations, then train our model and compare its test performance with baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the performance of methods requiring dense per-pixel annotations.
Keywords: convolutional neural networks (CNNs); scribble; self-attention; video object segmentation; weakly supervised
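For orientation, scribble supervision is often implemented as a partial cross-entropy that scores only the annotated pixels; the sketch below shows this standard form, not the paper's full loss, which additionally optimizes unlabeled pixels during training (ignore label 255 is an assumption):

```python
import torch.nn.functional as F

def partial_scribble_loss(logits, scribble, ignore_index=255):
    # logits: (N, C, H, W); scribble: (N, H, W) with unannotated pixels
    # set to ignore_index so they contribute no gradient.
    return F.cross_entropy(logits, scribble, ignore_index=ignore_index)
```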
7. High-Movement Human Segmentation in Video Using Adaptive N-Frames Ensemble
Authors: Yong-Woon Kim, Yung-Cheol Byun, Dong Seog Han, Dalia Dominic, Sibu Cyriac. Computers, Materials & Continua (SCIE, EI), 2022, No. 12, pp. 4743-4762.
A wide range of camera apps and online video conferencing services support changing the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that Deep Learning (DL) is a suitable option for human segmentation and that an ensemble of multiple DL-based segmentation models can improve the result. However, these approaches are less effective when applied directly to image segmentation in a video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in video using an ensemble of multiple DL models. In contrast to an ensemble that executes multiple DL models for every video frame, the proposed AFE approach executes only a single DL model on the current frame, and combines the segmentation outputs of previous frames into the final output when the frame difference is below a particular threshold. Our method builds on the N-Frames Ensemble (NFE) method, which ensembles the segmentations of the current and previous video frames; however, NFE is suitable neither for segmenting fast-moving objects nor for videos with low frame rates. The proposed AFE approach addresses these limitations. Our experiments use three human segmentation models: Fully Convolutional Network (FCN), DeepLabv3, and Mediapipe. We evaluated our approach on 1711 single-person videos of the TikTok50f dataset, a reconstructed version of the publicly available TikTok dataset obtained by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models, a Two-Models Ensemble, and the NFE models. The results show that AFE is suitable for both low-movement and high-movement human segmentation in video.
Keywords: high-movement human segmentation; artificial intelligence; deep learning; ensemble; video instance segmentation
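A sketch of the adaptive ensembling logic described in the abstract: one model per frame, with previous masks ensembled only when the inter-frame difference is small (the round-robin model selection and the threshold value are assumptions):

```python
import numpy as np

def afe_step(frame, prev_frame, prev_masks, models, step, diff_thresh=8.0):
    # Run a single model on the current frame; round-robin selection is
    # an assumption here, not necessarily the paper's policy.
    mask = models[step % len(models)](frame).astype(float)
    if prev_frame is not None and prev_masks:
        diff = np.abs(frame.astype(float) - prev_frame.astype(float)).mean()
        # Low inter-frame change: ensemble with the previous frames' masks.
        if diff < diff_thresh:
            mask = np.mean([mask] + prev_masks, axis=0)
    return mask > 0.5
```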
8. An Efficient Attention-Based Strategy for Anomaly Detection in Surveillance Video
Authors: Sareer Ul Amin, Yongjun Kim, Irfan Sami, Sangoh Park, Sanghyun Seo. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 9, pp. 3939-3958.
In today's technological world, surveillance cameras generate an immense amount of video data from various sources, making its scrutiny difficult for computer vision specialists. Searching these massive video records for anomalous events manually is impractical, since such events occur infrequently and with low probability in real-world monitoring systems. Intelligent surveillance is therefore a requirement of the modern day, as it enables the automatic identification of normal and aberrant behavior using artificial intelligence and computer vision technologies. In this article, we introduce an efficient attention-based deep learning approach for anomaly detection in surveillance video (ADSV). At the input of the ADSV, a shot boundary detection technique is used to segment prominent frames. Next, a Lightweight Convolutional Neural Network (LWCNN) receives the segmented frames and extracts spatial and temporal information from its intermediate layers. Spatial and temporal features are then learned from a series of frames for each anomalous activity using Long Short-Term Memory (LSTM) cells and an attention network; to detect motion and action, the LWCNN receives chronologically sorted frames. Finally, anomalous activity in the video is identified using the trained ADSV model. Extensive experiments are conducted on complex and challenging benchmark datasets, and the results, compared with state-of-the-art methodologies, show a significant improvement, demonstrating the efficiency of our ADSV method.
Keywords: attention-based anomaly detection; video shot segmentation; video surveillance; computer vision; deep learning; smart surveillance system; violence detection; attention model
9. Evaluating quality of motion for unsupervised video object segmentation
Authors: Cheng Guanjun, Song Huihui. Optoelectronics Letters (EI), 2024, No. 6, pp. 379-384.
Current mainstream unsupervised video object segmentation (UVOS) approaches typically incorporate optical flow as motion information to locate the primary objects in coherent video frames. However, they fuse appearance and motion information without evaluating the quality of the optical flow. When poor-quality optical flow interacts with the appearance information, it introduces significant noise and degrades overall performance. To alleviate this issue, we first employ a quality evaluation module (QEM) to evaluate the optical flow. We then select high-quality optical flow as motion cues to fuse with the appearance information, which prevents poor-quality optical flow from diverting the network's attention. Moreover, we design an appearance-guided fusion module (AGFM) to better integrate appearance and motion information. Extensive experiments on several widely used datasets, including DAVIS-16, FBMS-59, and YouTube-Objects, demonstrate that the proposed method outperforms existing methods.
Keywords: evaluating quality of motion for unsupervised video object segmentation
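A minimal sketch of the gating idea: motion features are fused only when an estimated flow-quality score clears a threshold (the [0, 1] score range and tau are assumptions; the paper's QEM and AGFM are learned modules, not this fixed rule):

```python
def fuse_features(appearance, motion, flow_quality, tau=0.5):
    # Discard unreliable motion cues entirely below the quality threshold;
    # otherwise blend the two modalities, weighting motion by its quality.
    if flow_quality < tau:
        return appearance
    return (1.0 - flow_quality) * appearance + flow_quality * motion
```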
10. Global video object segmentation with spatial constraint module
Authors: Yadang Chen, Duolin Wang, Zhiguo Chen, Zhi-Xin Yang, Enhua Wu. Computational Visual Media (SCIE, EI, CSCD), 2023, No. 2, pp. 385-400.
We present a lightweight and efficient semi-supervised video object segmentation network based on the space-time memory framework. To some extent, our method addresses two difficulties of traditional video object segmentation: the per-frame computation time is too long, and the current frame's segmentation should draw on more information from past frames. The algorithm uses a global context (GC) module to achieve high-performance, real-time segmentation; the GC module effectively integrates multi-frame image information without increasing memory use and processes each frame in real time. Moreover, since the prediction mask of the previous frame is helpful for segmenting the current frame, we feed it into a spatial constraint module (SCM), which constrains the regions of segments in the current frame. The SCM effectively alleviates mismatching of similar targets while consuming few additional resources. We also added a refinement module to the decoder to improve boundary segmentation. Our model achieves state-of-the-art results on various datasets, scoring 80.1% on YouTube-VOS 2018 and a J&F score of 78.0% on DAVIS 2017, while taking 0.05 s per frame on the DAVIS 2016 validation dataset.
Keywords: video object segmentation; semantic segmentation; global context (GC) module; spatial constraint
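A sketch of the spatial-constraint idea: the previous frame's mask, dilated by a margin, bounds where the current frame may predict foreground (the margin value and the multiplicative gating are assumptions; the paper's SCM is a learned module):

```python
import numpy as np
import cv2

def spatial_constraint(prob_map, prev_mask, margin=15):
    # Dilate the previous frame's mask and keep foreground probability
    # only inside this band, suppressing look-alike targets elsewhere.
    kernel = np.ones((margin, margin), np.uint8)
    allowed = cv2.dilate(prev_mask.astype(np.uint8), kernel)
    return prob_map * allowed
```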
11. Full-duplex strategy for video object segmentation
Authors: Ge-Peng Ji, Deng-Ping Fan, Keren Fu, Zhe Wu, Jianbing Shen, Ling Shao. Computational Visual Media (SCIE, EI, CSCD), 2023, No. 1, pp. 155-175.
Previous video object segmentation approaches mainly focus on simplex solutions linking appearance and motion, limiting effective feature collaboration between these two cues. In this work, we study a novel and efficient full-duplex strategy network (FSNet) to address this issue, adopting a better mutual-restraint scheme between motion and appearance that allows cross-modal features to be exploited during the fusion and decoding stages. Specifically, we introduce a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding sub-spaces. To improve the model's robustness and update inconsistent features in the spatio-temporal embeddings, we adopt a bidirectional purification module after the RCAM. Extensive experiments on five popular benchmarks show that our FSNet is robust to various challenging scenarios (e.g., motion blur and occlusion) and compares well to leading methods for both video object segmentation and video salient object detection. The project is publicly available at https://github.com/GewelsJI/FSNet.
Keywords: video object segmentation (VOS); video salient object detection (V-SOD); visual attention
12. Video Polyp Segmentation: A Deep Learning Perspective (cited: 5)
Authors: Ge-Peng Ji, Guobao Xiao, Yu-Cheng Chou, Deng-Ping Fan, Kai Zhao, Geng Chen, Luc Van Gool. Machine Intelligence Research (EI, CSCD), 2022, No. 6, pp. 531-549.
We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era. Over the years, progress in VPS has been hampered by the lack of a large-scale dataset with fine-grained segmentation annotations. To address this issue, we first introduce a high-quality, frame-by-frame annotated VPS dataset, named SUN-SEG, which contains 158,690 colonoscopy video frames from the well-known SUN database, with additional annotations of diverse types: attribute, object mask, boundary, scribble, and polygon. Second, we design a simple but efficient baseline, named PNS+, which consists of a global encoder, a local encoder, and normalized self-attention (NS) blocks. The global and local encoders receive an anchor frame and multiple successive frames to extract long-term and short-term spatial-temporal representations, which are then progressively refined by two NS blocks. Extensive experiments show that PNS+ achieves the best performance and real-time inference speed (170 fps), making it a promising solution for the VPS task. Third, we extensively evaluate 13 representative polyp/object segmentation models on our SUN-SEG dataset and provide attribute-based comparisons. Finally, we discuss several open issues and suggest possible research directions for the VPS community. Our project and dataset are publicly available at https://github.com/GewelsJI/VPS.
Keywords: video polyp segmentation (VPS); dataset; self-attention; colonoscopy; abdomen
13. Video Structure Analysis
Authors: 张松海, 张一飞, 陈韬, Peter M. Hall, Ralph Martin. Tsinghua Science and Technology (SCIE, EI, CAS), 2007, No. 6, pp. 714-718.
Video structure analysis is a basic requirement for most content-based video editing and processing systems. This paper presents a fast video structure analysis method based on image segmentation in each frame, with region matching between frames. The structure analysis decomposes the video into several moving objects, including information about their colors, positions, shapes, movements, and lifetimes. The method also supports user interaction to improve the results. The results show that this method is fast and stable and can complete video analysis interactively.
Keywords: video structure analysis; video segmentation; image segmentation; region matching
14. Two-dimensional entropy model for video shot partitioning (cited: 1)
Authors: Zhu SongHao, Liu YunCai. Science in China (Series F), 2009, No. 2, pp. 183-194.
A shot presents a contiguous action recorded by an uninterrupted camera operation, and frames within a shot keep spatio-temporal coherence. Segmenting a serial video stream into meaningful shots is the first step in video analysis and content-based video understanding. In this paper, a novel scheme based on improved two-dimensional entropy is proposed to partition video into shots. First, shot transition candidates are detected using a two-pass algorithm: a coarse searching pass followed by a fine searching pass. Second, using the two-dimensional entropy characteristics of the image, correctly detected transition candidates are classified into different transition types, while falsely detected shot breaks are identified and removed. Finally, the boundary of a gradual transition is precisely located by incorporating the two-dimensional entropy characteristics into the gradual transition. A large number of video sequences are used to test the system's performance, with promising results.
Keywords: video shot segmentation; two-dimensional entropy model; coarse-to-fine algorithm; content-based indexing and retrieval
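For reference, the classic two-dimensional image entropy on which such models build is the Shannon entropy of the joint distribution of each pixel's gray level and its local neighborhood mean; a minimal sketch of this standard form (the paper uses an improved variant):

```python
import numpy as np

def two_dimensional_entropy(gray, bins=64):
    # Joint histogram of each pixel's gray level and its 3x3
    # neighborhood mean, then Shannon entropy of that distribution.
    pad = np.pad(gray.astype(float), 1, mode="edge")
    h, w = gray.shape
    nbr = sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    hist, _, _ = np.histogram2d(gray.ravel(), nbr.ravel(),
                                bins=bins, range=[[0, 256], [0, 256]])
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

A shot-boundary candidate can then be flagged wherever this entropy signal changes sharply between consecutive frames.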