787 articles found
Objective Performance Evaluation of Video Segmentation Algorithms with Ground-Truth (cited by: 1)
1
Authors: 杨高波, 张兆扬. Journal of Shanghai University (English Edition), CAS, 2004, No. 1, pp. 70-74 (5 pages)
While the development of particular video segmentation algorithms has attracted considerable research interest, relatively little effort has been devoted to providing a methodology for evaluating their performance. In this paper, we propose a methodology to objectively evaluate video segmentation algorithms against ground truth, based on computing the deviation of segmentation results from the reference segmentation. Four different metrics, based on pixel classification, edges, relative foreground area, and relative position respectively, are combined to assess spatial accuracy. Temporal coherency is evaluated using the difference in spatial accuracy between successive frames. The experimental results show the feasibility of our approach. Moreover, it is computationally more efficient than previous methods. It can be applied to provide an offline ranking among different segmentation algorithms and to optimally set the parameters for a given algorithm.
Keywords: video segmentation algorithm; objective performance evaluation; image processing; image coding
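The abstract above names its metrics but does not give their formulas. As a rough illustration only, the following Python sketch computes two plausible spatial measures (a pixel misclassification rate and a relative foreground area error) against a reference mask, and a temporal coherency measure as the change in spatial error between successive frames. The function names and exact definitions are assumptions for illustration, not the paper's metrics.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = foreground, 0 = background


def misclassification_rate(result: Mask, reference: Mask) -> float:
    """Fraction of pixels whose label differs from the reference mask."""
    total = 0
    wrong = 0
    for row_res, row_ref in zip(result, reference):
        for p, q in zip(row_res, row_ref):
            total += 1
            wrong += int(p != q)
    return wrong / total


def relative_area_error(result: Mask, reference: Mask) -> float:
    """Absolute foreground-area difference, relative to the reference area."""
    area_res = sum(map(sum, result))
    area_ref = sum(map(sum, reference))
    if area_ref == 0:
        # Hypothetical convention: empty reference, any foreground is a full error.
        return float(area_res > 0)
    return abs(area_res - area_ref) / area_ref


def temporal_coherency(errors: List[float]) -> List[float]:
    """Absolute change in spatial error between successive frames."""
    return [abs(b - a) for a, b in zip(errors, errors[1:])]


# Toy data: the same reference for two frames of a hypothetical sequence.
reference = [[0, 1, 1], [0, 1, 1]]
frame_a = [[0, 1, 1], [0, 1, 0]]  # one foreground pixel missed
frame_b = [[0, 1, 1], [0, 1, 1]]  # identical to the reference
errors = [misclassification_rate(f, reference) for f in (frame_a, frame_b)]
```

A ranking of algorithms would then compare these per-frame errors (and their frame-to-frame variation) across methods on the same sequence.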
Automatic Video Segmentation Algorithm by Background Model and Color Clustering
2
Authors: 沙芸, 王军, 刘玉树. Journal of Beijing Institute of Technology, EI, CAS, 2003, No. S1, pp. 134-138 (5 pages)
In order to detect objects in video efficiently, an automatic, real-time video segmentation algorithm based on a background model and color clustering is proposed. The algorithm consists of four phases: background restoration, moving object extraction, moving object region clustering, and post-processing. The threshold for background restoration is not given in advance; it is obtained automatically. And a new object region ...
Keywords: video segmentation; background restoration; object region; cluster
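Video Segmentation by Acoustic Analysis
3
Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》, 2010, No. 10, pp. 33-36 (4 pages)
Keywords: video segmentation; acoustic analysis; television channels; silence detection; hierarchical structure; video recording; automatic segmentation; reuse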
Improved C-V Level Set Algorithm and its Application in Video Segmentation
4
Authors: Jinsheng XIAO, Benshun YI, Xiaoxiao QIU. International Journal of Communications, Network and System Sciences, 2009, No. 5, pp. 453-458 (6 pages)
Image segmentation methods based on the level set model have wide potential application because of their excellent segmentation results. However, their computational complexity restricts their application in video segmentation. To improve the speed of image segmentation, this paper presents a new level set initialization method based on the Chan-Vese level set model. After a simple iteration, the outlines of objects can be separated out. Experiments show that the method is simple and efficient, with good separation results. The improved Chan-Vese method can be applied to video segmentation.
Keywords: image segmentation; level set; C-V model; video segmentation
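For context, the Chan-Vese (C-V) model that this abstract builds on is commonly stated as the minimisation of the energy below. The notation follows the usual textbook form and is not taken from the paper; the paper's contribution, per the abstract, is the initialization of the level set rather than a change to this energy.

```latex
F(c_1, c_2, C) = \mu \,\mathrm{Length}(C) + \nu \,\mathrm{Area}(\mathrm{inside}(C))
  + \lambda_1 \int_{\mathrm{inside}(C)} \lvert u_0(x, y) - c_1 \rvert^2 \, dx \, dy
  + \lambda_2 \int_{\mathrm{outside}(C)} \lvert u_0(x, y) - c_2 \rvert^2 \, dx \, dy
```

Here $u_0$ is the image, $C$ the evolving contour, $c_1$ and $c_2$ the mean intensities inside and outside $C$, and $\mu, \nu, \lambda_1, \lambda_2 \ge 0$ are weights.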
Automated neurosurgical video segmentation and retrieval system
5
Authors: Engin Mendi, Songul Cecen, Emre Ermisoglu, Coskun Bayrak. Journal of Biomedical Science and Engineering, 2010, No. 6, pp. 618-624 (7 pages)
Medical video repositories play important roles in many health-related areas such as medical imaging, medical research and education, medical diagnostics, and the training of medical professionals. Due to the increasing availability of digital video data, indexing, annotating, and retrieving the information are crucial. Since performing these processes is both computationally expensive and time-consuming, automated systems are needed. In this paper, we present a medical video segmentation and retrieval research initiative. We describe the key components of the system, including the video segmentation engine, the image retrieval engine, and the image quality assessment module. The aim of this research is to provide an online tool for indexing, browsing, and retrieving neurosurgical videotapes. This tool will allow people to retrieve the specific information they are interested in from a long videotape instead of looking through the entire content.
Keywords: video processing; video summarization; video segmentation; image retrieval; image quality assessment
Video segmentation based on area selection
6
International English Education Research, 2013, No. 12, pp. 168-169 (2 pages)
Keywords: English teaching; teaching methods; reading instruction; extracurricular reading; English grammar
Coarse-to-Fine Video Instance Segmentation With Factorized Conditional Appearance Flows (cited by: 1)
7
Authors: Zheyun Qin, Xiankai Lu, Xiushan Nie, Dongfang Liu, Yilong Yin, Wenguan Wang. IEEE/CAA Journal of Automatica Sinica, SCIE, EI, CSCD, 2023, No. 5, pp. 1192-1208 (17 pages)
We introduce a novel method using a new generative model that automatically learns effective representations of the target and background appearance to detect, segment, and track each instance in a video sequence. Unlike current discriminative tracking-by-detection solutions, our proposed hierarchical structural embedding learning can predict higher-quality masks with accurate boundary details over spatio-temporal space via normalizing flows. We formulate the instance inference procedure as hierarchical spatio-temporal embedding learning across time and space. Given a video clip, our method first coarsely locates pixels belonging to a particular instance with a Gaussian distribution and then builds a novel mixing distribution to refine the instance boundary by fusing hierarchical appearance embedding information in a coarse-to-fine manner. For the mixing distribution, we use factorized conditional normalizing flows to estimate the distribution parameters and improve segmentation performance. Comprehensive qualitative, quantitative, and ablation experiments on three representative video instance segmentation benchmarks (YouTube-VIS19, YouTube-VIS21, and OVIS) demonstrate the effectiveness of the proposed method. More impressively, the superior performance of our model on an unsupervised video object segmentation dataset (DAVIS19) demonstrates its generalizability. Our algorithm implementations are publicly available at https://github.com/zyqin19/HEVis.
Keywords: embedding learning; generative model; normalizing flows; video instance segmentation (VIS)
Non-interactive automatic video segmentation of moving targets
8
Authors: Yu ZHOU, An-wen SHEN, Jin-bang XU. Journal of Zhejiang University-Science C (Computers and Electronics), SCIE, EI, 2012, No. 10, pp. 736-749 (14 pages)
Extracting moving targets from video accurately is of great significance in the field of intelligent transport. To some extent, it is related to video segmentation or matting. In this paper, we propose a non-interactive automatic segmentation method for extracting moving targets. First, the motion knowledge in the video is detected with orthogonal Gaussian-Hermite moments and the Otsu algorithm, and this knowledge is treated as foreground seeds. Second, the background seeds are generated with a distance transformation based on the foreground seeds. Third, the foreground and background seeds are treated as extra constraints, and a mask is then generated using graph cut methods or closed-form solutions. Comparison showed that the closed-form solution based on soft segmentation performs better and that the extra constraint has a larger impact on the result than other parameters. Experiments demonstrated that the proposed method can effectively extract moving targets from video in real time.
Keywords: video segmentation; auto-generated seeds; cost function; alpha matte
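The first step of the abstract above thresholds a motion response with the Otsu algorithm. The sketch below is a plain-Python version of the standard Otsu threshold (maximising between-class variance over a histogram). The input values are made up, and how the paper feeds Gaussian-Hermite moment responses into Otsu is not described in the abstract, so treat the data and the `otsu_threshold` name as illustrative assumptions.

```python
from typing import Sequence


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximises between-class variance.

    Values <= t form the low class and values > t the high class.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the low class
    sum0 = 0  # sum of values in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # low class still empty
        w1 = total - w0
        if w1 == 0:
            break     # high class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical motion-response magnitudes: three static and three moving pixels.
response = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(response)
foreground_seeds = [v > t for v in response]
```

In the pipeline described by the abstract, pixels above the threshold would serve as foreground seeds for the later graph cut or closed-form matting step.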
Automatic Video Segmentation Based on Information Centroid and Optimized SaliencyCut
9
Authors: Hui-Si Wu, Meng-Shu Liu, Lu-Lu Yin, Ping Li, Zhen-Kun Wen, Hon-Cheng Wong. Journal of Computer Science & Technology, SCIE, EI, CSCD, 2020, No. 3, pp. 564-575 (12 pages)
We propose an automatic video segmentation method based on an optimized SaliencyCut equipped with information centroid (IC) detection according to the level balance principle in physical theory. Unlike existing methods, the IC provides image information from another dimension to enhance video segmentation accuracy. Specifically, our IC is implemented based on the information-level balance principle in the image, and is denoted as the information pivot obtained by aggregating all the image information to a point. To effectively enhance the saliency value of the target object and suppress the background area, we also combine the color and coordinate information of the image when calculating the local IC and the global IC in the image. Saliency maps for all frames in the video are then calculated based on the detected IC. By applying IC smoothing to enhance the optimized saliency detection, we can further correct unsatisfactory saliency maps, where sharp variations of colors or motions may exist in complex videos. Finally, we obtain the segmentation results based on IC-based saliency maps and the optimized SaliencyCut. Our method is evaluated on the DAVIS dataset, which consists of different kinds of challenging videos. Comparisons with state-of-the-art methods are also conducted to evaluate our method. Convincing visual results and statistical comparisons demonstrate its advantages and robustness for automatic video segmentation.
Keywords: automatic video segmentation; information centroid; saliency detection; optimized SaliencyCut
Scribble-Supervised Video Object Segmentation (cited by: 1)
10
Authors: Peiliang Huang, Junwei Han, Nian Liu, Jun Ren, Dingwen Zhang. IEEE/CAA Journal of Automatica Sinica, SCIE, EI, CSCD, 2022, No. 2, pp. 339-353 (15 pages)
Recently, video object segmentation has received great attention in the computer vision community. Most existing methods rely heavily on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt to achieve video object segmentation with scribble-level supervision, which can save a large amount of the human labor needed to collect manual annotations. However, conventional network architectures and learning objective functions do not work well in this scenario because the supervision information is highly sparse and incomplete. To address this issue, this paper introduces two novel elements for learning the video object segmentation model. The first is the scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The other is the scribble-supervised loss, which can optimize the unlabeled pixels and dynamically correct inaccurately segmented areas during the training stage. To evaluate the proposed method, we run experiments on two video object segmentation benchmark datasets, YouTube-VOS (YouTube video object segmentation) and DAVIS-2017 (densely annotated video segmentation). We first generate the scribble annotations from the original per-pixel annotations. Then, we train our model and compare its test performance with the baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the performance of methods that require dense per-pixel annotations.
Keywords: convolutional neural networks (CNNs); scribble; self-attention; video object segmentation; weakly supervised
Integrating Audio-Visual Features and Text Information for Story Segmentation of News Video (cited by: 1)
11
Authors: Liu Hua-yong, Zhou Dong-ru (School of Computer, Wuhan University, Wuhan 430072, Hubei, China). Wuhan University Journal of Natural Sciences, CAS, 2003, No. 04A, pp. 1070-1074 (5 pages)
Video data are composed of multimodal information streams, including visual, auditory, and textual streams, so this paper describes an approach to story segmentation for news video using multimodal analysis. The proposed approach detects the topic-caption frames and integrates them with silence clip detection results and shot segmentation results to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of approaches that use only image analysis techniques. On test data with 135 400 frames, when the boundaries between news stories are detected, an accuracy rate of 85.8% and a recall rate of 97.5% are obtained. The experimental results show that the approach is valid and robust.
Keywords: video data; multimodal analysis; shot boundary detection; scene segmentation
High-Movement Human Segmentation in Video Using Adaptive N-Frames Ensemble
12
Authors: Yong-Woon Kim, Yung-Cheol Byun, Dong Seog Han, Dalia Dominic, Sibu Cyriac. Computers, Materials & Continua, SCIE, EI, 2022, No. 12, pp. 4743-4762 (20 pages)
A wide range of camera apps and online video conferencing services support changing the background in real time for aesthetic, privacy, and security reasons. Numerous studies show that deep learning (DL) is a suitable option for human segmentation, and that an ensemble of multiple DL-based segmentation models can improve the segmentation result. However, these approaches are not as effective when directly applied to image segmentation in a video. This paper proposes an Adaptive N-Frames Ensemble (AFE) approach for high-movement human segmentation in a video using an ensemble of multiple DL models. In contrast to an ensemble that executes multiple DL models simultaneously for every single video frame, the proposed AFE approach executes only a single DL model on the current video frame. It combines the segmentation outputs of previous frames for the final segmentation output when the frame difference is less than a particular threshold. Our method builds on the idea of the N-Frames Ensemble (NFE) method, which uses an ensemble of the image segmentations of the current video frame and previous video frames. However, NFE is not suitable for segmenting fast-moving objects in a video, nor for videos with low frame rates. The proposed AFE approach addresses these limitations of the NFE method. Our experiment uses three human segmentation models, namely Fully Convolutional Network (FCN), DeepLabv3, and Mediapipe. We evaluated our approach using 1711 videos of the TikTok50f dataset with a single-person view. The TikTok50f dataset is a reconstructed version of the publicly available TikTok dataset, obtained by cropping, resizing, and dividing it into videos of 50 frames each. This paper compares the proposed AFE with single models and the Two-Models Ensemble, as well as the NFE models. The experimental results show that the proposed AFE is suitable for low-movement as well as high-movement human segmentation in a video.
Keywords: high-movement human segmentation; artificial intelligence; deep learning; ensemble; video instance segmentation
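The gating idea in the abstract above (run one model per frame, and only blend with earlier masks while the frame difference stays below a threshold) can be sketched as follows. This is a toy under stated assumptions: frames are flattened grayscale lists, the frame difference is a mean absolute difference, masks are fused by strict majority vote, and `adaptive_ensemble`, `toy_model`, and the threshold value are hypothetical names and settings, not the paper's implementation.

```python
from collections import deque
from typing import Callable, Deque, List

Frame = List[int]  # flattened grayscale frame
Mask = List[int]   # flattened binary mask


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def majority_vote(masks: List[Mask]) -> Mask:
    """Per-pixel strict majority; ties go to background."""
    return [int(2 * sum(px) > len(masks)) for px in zip(*masks)]


def adaptive_ensemble(
    frames: List[Frame],
    model: Callable[[Frame], Mask],
    n: int = 3,
    threshold: float = 10.0,
) -> List[Mask]:
    history: Deque[Mask] = deque(maxlen=n)  # masks of recent low-motion frames
    outputs: List[Mask] = []
    prev = None
    for frame in frames:
        mask = model(frame)  # a single model call per frame
        if prev is not None and mean_abs_diff(frame, prev) >= threshold:
            history.clear()  # large motion: earlier masks are stale
        history.append(mask)
        outputs.append(majority_vote(list(history)))
        prev = frame
    return outputs


def toy_model(frame: Frame) -> Mask:
    """Stand-in for a segmentation network: bright pixels are 'person'."""
    return [int(p > 128) for p in frame]


frames = [[200, 0, 0, 0], [200, 0, 0, 0], [0, 0, 0, 200]]
masks = adaptive_ensemble(frames, toy_model)
```

In the third frame the subject jumps across the image; because the history is cleared, the output follows the new position instead of being outvoted by the two older masks, which is the failure mode the abstract attributes to the plain N-Frames Ensemble.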
AUTOMATIC SEGMENTATION OF VIDEO OBJECT PLANES IN MPEG-4 BASED ON SPATIO-TEMPORAL INFORMATION
13
Authors: Xia Jinxiang, Huang Shunji. Journal of Electronics (China), 2004, No. 3, pp. 206-212 (7 pages)
Segmentation of semantic Video Object Planes (VOPs) from a video sequence is key to the MPEG-4 standard with content-based video coding. In this paper, an approach for automatic Segmentation of VOPs Based on Spatio-Temporal Information (SBSTI) is proposed. The processing results demonstrate the good performance of the algorithm.
Keywords: MPEG-4; image segmentation; VOP; spatial information; SBSTI
Segment-based traffic smoothing algorithm for VBR video stream (cited by: 1)
14
Authors: LIU Yun-qiang, YU Song-yu, WANG Xiang-wen, ZHOU Jun. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE, EI, CAS, CSCD, 2006, No. 4, pp. 543-548 (6 pages)
Because of the burstiness of variable bit rate (VBR) video traffic, transmission of VBR video shows high fluctuation in its bandwidth requirement. Traffic smoothing algorithms are very efficient in reducing the burstiness of a VBR video stream by transmitting data at a series of fixed rates. In this paper we propose a novel segment-based bandwidth allocation algorithm that dynamically adjusts the segmentation boundary and changes the transmission rate at the latest possible point, so that each video segment is extended as long as possible and the number of rate changes is as small as possible while the peak rate is kept low. Simulation results showed that our approach has a small bandwidth requirement, high bandwidth utilization, and low computation cost.
Keywords: video smoothing; VBR; segmentation; filtering
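The abstract above does not spell out the constraints a smoothed schedule must satisfy. A common simplified model, assumed here for illustration, is that in each time slot the cumulative data sent must cover the frame about to be played (no underflow) without exceeding what the client buffer can hold (no overflow). The Python sketch below only checks a given piecewise-constant plan against that model and counts its rate changes; it does not reproduce the paper's segment boundary selection, and `is_feasible` and `rate_changes` are hypothetical helper names.

```python
from typing import List


def is_feasible(frame_sizes: List[int], rates: List[int], buffer_size: int) -> bool:
    """Check a per-slot transmission plan under a simplified buffer model.

    In slot k the server sends rates[k] units, then the client consumes frame k.
    """
    consumed = 0  # data already played out before this slot
    sent = 0      # cumulative data sent so far
    for size, rate in zip(frame_sizes, rates):
        sent += rate
        if sent > consumed + buffer_size:
            return False  # client buffer would overflow
        if sent < consumed + size:
            return False  # frame k has not fully arrived: underflow
        consumed += size
    return True


def rate_changes(rates: List[int]) -> int:
    """Number of slots where the transmission rate differs from the previous slot."""
    return sum(1 for a, b in zip(rates, rates[1:]) if a != b)


frame_sizes = [4, 10, 2, 8]  # bursty toy VBR trace, 24 units in total
unsmoothed = frame_sizes     # send each frame exactly when it is needed
constant = [6, 6, 6, 6]      # average rate, ignores the burst at frame 1
smoothed = [7, 7, 5, 5]      # two constant-rate segments
```

With a client buffer of 10 units, the constant plan underflows at the large second frame, while the two-segment plan is feasible with a peak rate of 7 and a single rate change, compared with a peak of 10 and three changes for the unsmoothed plan.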
An Efficient Attention-Based Strategy for Anomaly Detection in Surveillance Video
15
Authors: Sareer Ul Amin, Yongjun Kim, Irfan Sami, Sangoh Park, Sanghyun Seo. Computer Systems Science & Engineering, SCIE, EI, 2023, No. 9, pp. 3939-3958 (20 pages)
In the present technological world, surveillance cameras generate an immense amount of video data from various sources, making its scrutiny tough for computer vision specialists. It is difficult to search for anomalous events manually in these massive video records since they happen infrequently and with a low probability in real-world monitoring systems. Therefore, intelligent surveillance is a requirement of the modern day, as it enables the automatic identification of normal and aberrant behavior using artificial intelligence and computer vision technologies. In this article, we introduce an efficient attention-based deep-learning approach for anomaly detection in surveillance video (ADSV). At the input of the ADSV, a shot boundary detection technique is used to segment prominent frames. Next, the Lightweight Convolution Neural Network (LWCNN) model receives the segmented frames to extract spatial and temporal information from the intermediate layer. Following that, spatial and temporal features are learned using Long Short-Term Memory (LSTM) cells and an attention network from a series of frames for each anomalous activity in a sample. To detect motion and action, the LWCNN receives chronologically sorted frames. Finally, the anomalous activity in the video is identified using the proposed trained ADSV model. Extensive experiments are conducted on complex and challenging benchmark datasets. In addition, the experimental results have been compared to state-of-the-art methodologies, and a significant improvement is attained, demonstrating the efficiency of our ADSV method.
Keywords: attention-based anomaly detection; video shots segmentation; video surveillance; computer vision; deep learning; smart surveillance system; violence detection; attention model
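Application of Cross-Network-Segment Integration Technology in Multi-Level Video Conferencing Systems
16
Authors: 赵士达, 马蕴玢, 朱宏, 孙选超, 杨朝, 赵博宇. 《华南地震》, 2024, No. 1, pp. 105-110 (6 pages)
This paper describes how the emergency video conferencing system of the Tianjin Earthquake Agency is connected to the video conferencing system of the China Earthquake Administration, the Tianjin Municipal Government video system, and the Tianjin Emergency Management Bureau video system. Based on the current state of earthquake emergency video conferencing systems, it analyzes the architectural characteristics of multi-type, multi-level, multi-network-segment video conferencing systems, and focuses on the application of a multi-segment, multi-source video forwarding optimization technique in video conferencing system integration. With this technique, the Tianjin earthquake emergency video conferencing system was fully connected to the video conferencing systems of all relevant organizations.
Keywords: video conferencing system; video fusion; cross network segment; video forwarding; cascading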
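Street-Scene Video Instance Segmentation with Anchor Frame Calibration and Spatial Position Information Compensation
17
Authors: 张印辉, 赵崇任, 何自芬, 杨宏宽, 黄滢. 《电子学报》, EI, CAS, CSCD, PKU Core, 2024, No. 1, pp. 94-106 (13 pages)
Video instance segmentation of street scenes is one of the key problems in autonomous driving research and can provide a basis for decisions in environment perception and path planning for vehicles in street scenes. Existing methods apply single-receptive-field sampling to anchor frames with multiple aspect ratios, which leads to insufficient edge feature extraction, and the high levels of the feature pyramid lack detailed spatial position information. To address these problems, this paper proposes the Anchor frame calibration and Spatial position information compensation for Video Instance Segmentation (AS-VIS) network. First, an anchor frame calibration module is added to the three branches of the prediction head to perform multi-type receptive field sampling matched to the anchor frame aspect ratios, which addresses insufficient extraction of object edges. Second, a multi-receptive-field downsampling module is designed to fuse the features sampled with the various receptive fields, which addresses information loss during downsampling. Finally, the multi-receptive-field downsampling module is used to embed the object-region activation feature maps of the lower levels of the feature pyramid into the higher levels, compensating spatial position information and addressing the lack of detailed spatial position information in high-level features. A street-scene video dataset is extracted from the Youtube-VIS standard library, comprising 329 training videos and 53 validation videos. Quantitative comparison with the detection and segmentation accuracy of YolactEdge shows that anchor frame calibration improves average precision by 8.63% and 5.09% respectively, the feature pyramid with spatial position information compensation improves average precision by 7.76% and 4.75% respectively, and AS-VIS as a whole improves average precision by 9.26% and 6.46% respectively. The proposed method achieves simultaneous instance-level detection, tracking, and segmentation of street-scene video sequences and provides an effective theoretical basis for environment perception in autonomous vehicles.
Keywords: street scene; video instance segmentation; anchor frame calibration; spatial information compensation; autonomous driving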
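Video Segmentation Combining Absorbing Markov Chains and Skeleton Mapping
18
Authors: 梁云, 张宇晴, 郑晋图, 张勇. 《软件学报》, EI, CSCD, PKU Core, 2024, No. 3, pp. 1552-1568 (17 pages)
Because challenges such as severe occlusion and drastic deformation persist together, accurate and robust video segmentation has become one of the hot topics in computer vision. This paper builds a video segmentation method that combines absorbing Markov chains and skeleton mapping, generating accurate object contours progressively through three stages: pre-segmentation, optimization, and refinement. In the pre-segmentation stage, regions of interest of the target are obtained with a Siamese network and a region proposal network; an absorbing Markov chain is built over the superpixels in these regions, and foreground/background labels of the superpixels are computed. The absorbing Markov chain can flexibly and effectively perceive and propagate target features, and can produce an initial pre-segmentation of the target object in complex scenes. In the optimization stage, a short-term spatio-temporal cue model and a long-term spatio-temporal cue model are designed to capture the short-term variation patterns and long-term stable features of the target, which are then used to optimize the superpixel labels and reduce errors caused by similar objects and noise. In the refinement stage, to reduce edge burrs and discontinuities in the optimized result, an algorithm for automatically generating foreground and background skeletons from superpixel labels and positions is proposed, and an encoder-decoder skeleton mapping network is built to learn pixel-level object contours, yielding the final accurate video segmentation result. Extensive experiments on standard datasets show that the proposed method outperforms existing mainstream video segmentation methods and produces segmentation results with higher region similarity and contour accuracy.
Keywords: video segmentation; absorbing Markov chain; long-term/short-term spatio-temporal cues; skeleton mapping network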
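For readers unfamiliar with absorbing Markov chains, the standard quantities such a chain provides are summarised below. This is the textbook formulation; how the paper assigns superpixels to transient and absorbing states and turns these quantities into foreground/background labels is not given in the abstract, so the final remark is an assumption.

```latex
P = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}, \qquad
N = (I - Q)^{-1}, \qquad
B = N R, \qquad
\mathbf{t} = N \mathbf{1}
```

Here $Q$ holds transition probabilities among transient states, $R$ those from transient to absorbing states, $N$ is the fundamental matrix, $B_{ij}$ is the probability that transient state $i$ is eventually absorbed in absorbing state $j$, and $t_i$ is the expected number of steps before absorption. A labelling scheme could, for example, compare absorption probabilities or absorbed times with respect to foreground and background seed states.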
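Image Segmentation of Air Traffic Controllers for Intelligent Video Surveillance
19
Authors: 王超, 董杰, 陈含露. 《安全与环境学报》, CAS, CSCD, PKU Core, 2024, No. 1, pp. 206-212 (7 pages)
To address the low accuracy and poor robustness of air traffic controller detection and segmentation in complex scenes, a controller image segmentation model based on Mask Region-based Convolutional Neural Networks (Mask R-CNN), called ATC Mask R-CNN (ATC Mask Region-based Convolutional Neural Networks), is proposed. First, an ATC Monitor Image Dataset (AMID) is built and used for model training and testing. Second, a Bottleneck Attention Module (BAM) is introduced into the backbone network to strengthen controller feature extraction, and an improved Soft Non-maximum Suppression (Soft-NMS) algorithm replaces the NMS algorithm for candidate box selection, improving detection and segmentation of occluded targets. Finally, controller image segmentation experiments are carried out on AMID. The results show that ATC Mask R-CNN achieves a precision of 96.49%, a recall of 95.62%, and an average precision of 88.84%, demonstrating the effectiveness of the method. Compared with Mask R-CNN, ATC Mask R-CNN effectively reduces the adverse effects of complex scenes, is better suited to controller work scenes, and can provide technical support for automated safety management in control rooms.
Keywords: safety engineering; intelligent video surveillance; complex scenes; air traffic controller; instance segmentation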
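The abstract above replaces NMS with an improved Soft-NMS but does not describe the improvement. As background, the sketch below implements the standard Gaussian Soft-NMS in plain Python: instead of deleting boxes that overlap the current best box, it decays their scores according to the overlap, so an occluded neighbour can survive with a reduced score. The `sigma` and `score_min` values and the toy boxes are assumptions, and this is the baseline technique rather than the paper's variant.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: List[Box],
    scores: List[float],
    sigma: float = 0.5,
    score_min: float = 0.001,
) -> List[Tuple[Box, float]]:
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of removing them."""
    remaining = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while remaining:
        best = max(remaining, key=lambda bs: bs[1])
        remaining.remove(best)
        kept.append(best)
        decayed = []
        for box, score in remaining:
            # Higher overlap with the selected box means a stronger penalty.
            score *= math.exp(-iou(best[0], box) ** 2 / sigma)
            if score >= score_min:
                decayed.append((box, score))
        remaining = decayed
    return kept


# Two heavily overlapping detections (e.g. one person occluding another) and a distant one.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

Hard NMS with a typical IoU threshold would drop the second box outright; here it is kept with a lower score and ranked after the distant box, which is the behaviour that helps with occluded targets.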
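Real-Time Video Instance Segmentation Based on a Dynamic Sampling Dual Deformable Network
20
Authors: 宋一然, 周千寓, 邵志文, 易冉, 马利庄. 《浙江大学学报(工学版)》, EI, CAS, CSCD, PKU Core, 2024, No. 2, pp. 247-256 (10 pages)
To make better use of the temporal information contained in video frames and to increase the inference speed of video instance segmentation, a dynamic sampling dual deformable network (DSDDN) is proposed. DSDDN uses a dynamic sampling strategy that adapts to the similarity between the previous and subsequent frames. For frames with high similarity, the method skips inference on the current frame and performs only a simple transfer computation from the previous frame's segmentation. For frames with low similarity, the method dynamically aggregates video frames over a larger temporal span as input to enhance the information of the current frame. Within the Transformer structure, the method additionally uses two deformable operations to avoid the exponential computational cost of attention-based methods. A carefully designed tracking head and loss function are provided to optimize the complex network. On the YouTube-VIS dataset, the method achieves an average inference precision of 39.1% and an inference speed of 40.2 frames/s, verifying that it strikes a good balance between accuracy and inference speed on real-time video segmentation.
Keywords: video; real-time inference; instance segmentation; dynamic network; dual deformable network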
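The skip branch of the dynamic sampling strategy described above can be illustrated with a small sketch: when the current frame is very similar to the previous one, the previous mask is reused instead of running the network. Everything concrete here is an assumption for illustration: the similarity measure, the `skip_above` threshold, reusing the mask unchanged (the abstract mentions a simple transfer computation whose details are not given), and the omission of the multi-frame aggregation used for dissimilar frames.

```python
from typing import Callable, List, Optional, Tuple

Frame = List[int]  # flattened grayscale frame
Mask = List[int]   # flattened binary mask


def similarity(a: Frame, b: Frame) -> float:
    """1.0 for identical frames, decreasing with mean absolute difference."""
    mad = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - mad / 255.0


def segment_with_skipping(
    frames: List[Frame],
    model: Callable[[Frame], Mask],
    skip_above: float = 0.9,
) -> Tuple[List[Mask], List[bool]]:
    masks: List[Mask] = []
    inferred: List[bool] = []  # True where the model was actually run
    last_mask: Optional[Mask] = None
    prev: Optional[Frame] = None
    for frame in frames:
        if last_mask is not None and prev is not None and similarity(frame, prev) >= skip_above:
            masks.append(last_mask)  # reuse; a real system would warp or refine it
            inferred.append(False)
        else:
            last_mask = model(frame)
            masks.append(last_mask)
            inferred.append(True)
        prev = frame
    return masks, inferred


def toy_model(frame: Frame) -> Mask:
    """Stand-in for the segmentation network: bright pixels are foreground."""
    return [int(p > 128) for p in frame]


frames = [[100, 100], [101, 100], [0, 255]]
masks, inferred = segment_with_skipping(frames, toy_model)
```

In this toy run the model is invoked on the first and third frames only, which is where the speed-up of such a strategy would come from on slowly changing video.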