Journal Articles
577 articles found
Advance on large scale near-duplicate video retrieval (Cited by 1)
1
Authors: Ling Shen, Richang Hong, Yanbin Hao. 《Frontiers of Computer Science》, SCIE EI CSCD, 2020, No. 5, pp. 1-24 (24 pages)
Emerging Internet services and applications attract increasing numbers of users to diverse video-related activities, such as video searching, downloading, and sharing. These routine operations lead to an explosive growth of online video volume and inevitably give rise to massive near-duplicate content. Near-duplicate video retrieval (NDVR) has therefore long been a hot topic. The primary purpose of this paper is to present a comprehensive survey and an updated review of advances in large-scale NDVR to supply guidance for researchers. Specifically, we summarize and compare the definitions of near-duplicate videos (NDVs) in the literature, analyze the relationship between NDVR and related research topics theoretically, describe its generic framework in detail, and investigate existing state-of-the-art NDVR systems. Finally, we present the development trends and research directions of this topic.
Keywords: near-duplicate videos, video retrieval, feature representation, video signature, indexing, similarity measurement
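The generic NDVR framework this survey describes (feature representation → video signature → indexing → similarity measurement) can be sketched as a toy pipeline. The mean-intensity frame feature, 8-bucket quantization, and Jaccard similarity below are illustrative assumptions, not what any particular surveyed system prescribes:

```python
# Toy NDVR pipeline: frame features -> video signature -> inverted index
# -> similarity measurement over candidate videos.

def frame_signature(frame, buckets=8):
    """Quantize a frame's mean intensity into one of `buckets` symbols."""
    mean = sum(frame) / len(frame)
    return min(int(mean * buckets / 256), buckets - 1)

def video_signature(frames):
    """A video signature: the ordered sequence of frame symbols."""
    return tuple(frame_signature(f) for f in frames)

def jaccard(sig_a, sig_b):
    """Set-based similarity between two signatures (order-insensitive)."""
    a, b = set(sig_a), set(sig_b)
    return len(a & b) / len(a | b) if a | b else 1.0

def build_index(signatures):
    """Inverted index: symbol -> set of videos containing it (for pruning)."""
    index = {}
    for vid, sig in signatures.items():
        for sym in set(sig):
            index.setdefault(sym, set()).add(vid)
    return index

def query(index, signatures, q_sig, threshold=0.5):
    """Collect candidates sharing any symbol, then score them exactly."""
    candidates = set().union(*(index.get(s, set()) for s in set(q_sig)))
    return sorted(v for v in candidates
                  if jaccard(signatures[v], q_sig) >= threshold)
```

Real systems replace the toy feature with robust global or local descriptors and the set index with locality-sensitive hashing, but the candidate-pruning-then-verification shape is the same.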
A Sentence Retrieval Generation Network Guided Video Captioning
2
Authors: Ou Ye, Mimi Wang, Zhenhua Yu, Yan Fu, Shun Yi, Jun Deng. 《Computers, Materials & Continua》, SCIE EI, 2023, No. 6, pp. 5675-5696 (22 pages)
Currently, video captioning models based on an encoder-decoder mainly rely on a single video input source. The content of video captions is limited, since few studies have employed external corpus information to guide caption generation, which is not conducive to accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is used to extract the 2D features, 3D features, and object features of video data, respectively. These features are decoded to generate textual sentences that conform to the video content for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve sentences in an external corpus that are semantically similar to the above textual sentences, and candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on the GPT-2 network structure. The model introduces a designed random selector to randomly select predicted words with a high probability in the corpus, which is used to guide and generate textual sentences more in line with natural human language expression. The proposed method is compared with several existing works by experiments. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD, and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. The proposed method can thus generate video captions with richer semantics than several state-of-the-art approaches.
Keywords: video captioning, encoder-decoder, sentence retrieval, external corpus, RS, GPT-2 network model
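The sentence-screening step, retrieving corpus sentences semantically similar to a generated caption and keeping those above a similarity threshold, can be sketched as follows. The toy bag-of-words embedding stands in for the sentence-transformer model the paper uses; the vocabulary, threshold, and top-k are illustrative assumptions:

```python
# Rank corpus sentences by cosine similarity to a generated caption and
# screen candidates with a threshold, as in ED-SRG's retrieval stage.
import math
from collections import Counter

def embed(sentence):
    """Toy embedding: a bag-of-words term-frequency vector."""
    return Counter(sentence.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def screen_candidates(caption, corpus, threshold=0.4, top_k=3):
    """Keep the top_k corpus sentences scoring above the threshold."""
    q = embed(caption)
    scored = [(cosine(q, embed(s)), s) for s in corpus]
    scored = [(sc, s) for sc, s in scored if sc >= threshold]
    return [s for _, s in sorted(scored, reverse=True)[:top_k]]
```

With a real sentence-transformer, `embed` would return a dense vector, but the screening logic is unchanged.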
Real-time and Automatic Close-up Retrieval from Compressed Videos
3
Authors: Ying Weng, Jianmin Jiang. 《International Journal of Automation and Computing》, EI, 2008, No. 2, pp. 198-201 (4 pages)
This paper proposes a thorough scheme, by virtue of a camera zooming descriptor with a two-level threshold, to automatically retrieve close-ups directly from Moving Picture Experts Group (MPEG) compressed videos based on camera motion analysis. A new algorithm for fast camera motion estimation in the compressed domain is presented. In the retrieval process, camera-motion-based semantic retrieval is built. To improve the coverage of the proposed scheme, close-up retrieval in all kinds of videos is investigated. Extensive experiments illustrate that the proposed scheme provides promising retrieval results under real-time and automatic application scenarios.
Keywords: camera motion analysis, close-up retrieval, Moving Picture Experts Group (MPEG), compressed videos
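One plausible reading of a two-level-threshold zoom test is: a high threshold detects a strong zoom-in, a lower threshold checks that the zoom is sustained, and a sustained strong zoom-in is taken to end in a close-up. This is a sketch under that assumption; the thresholds, run length, and per-frame zoom factors (which the paper estimates from MPEG motion vectors in the compressed domain) are all illustrative:

```python
# Two-level-threshold close-up test over a sequence of per-frame zoom
# ratios (> 1 means zooming in): require a peak above `high` inside a run
# of at least `min_run` frames that all stay above `low`.

def is_closeup(zoom_factors, high=1.5, low=1.1, min_run=3):
    run = 0
    peak_seen = False
    for z in zoom_factors:
        if z >= low:
            run += 1
            peak_seen = peak_seen or z >= high
            if peak_seen and run >= min_run:
                return True
        else:
            run, peak_seen = 0, False   # zoom interrupted: reset the run
    return False
```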
Automated neurosurgical video segmentation and retrieval system
4
Authors: Engin Mendi, Songul Cecen, Emre Ermisoglu, Coskun Bayrak. 《Journal of Biomedical Science and Engineering》, 2010, No. 6, pp. 618-624 (7 pages)
Medical video repositories play important roles in many health-related areas such as medical imaging, medical research and education, medical diagnostics, and the training of medical professionals. Due to the increasing availability of digital video data, indexing, annotating, and retrieving the information are crucial. Since these processes are both computationally expensive and time consuming, automated systems are needed. In this paper, we present a medical video segmentation and retrieval research initiative. We describe the key components of the system, including the video segmentation engine, image retrieval engine, and image quality assessment module. The aim of this research is to provide an online tool for indexing, browsing, and retrieving neurosurgical videotapes. This tool will allow people to retrieve the specific information they are interested in from a long video tape instead of looking through the entire content.
Keywords: video processing, video summarization, video segmentation, image retrieval, image quality assessment
Semantic-Based Video Retrieval Survey
5
Authors: Shaimaa Toriah Mohamed Toriah, Atef Zaki Ghalwash, Aliaa A. A. Youssif. 《Journal of Computer and Communications》, 2018, No. 8, pp. 28-44 (17 pages)
There is tremendous growth of digital data due to the stunning progress of the digital devices that facilitate capturing it. Digital data include images, text, and video. Video represents a rich source of information. Thus, there is an urgent need to retrieve, organize, and automate videos. Video retrieval is a vital process in multimedia applications such as video search engines, digital museums, and video-on-demand broadcasting. In this paper, the different approaches to video retrieval are outlined and briefly categorized. Moreover, the different methods that bridge the semantic gap in video retrieval are discussed in more detail.
Keywords: semantic video retrieval, concept detectors, context-based concept fusion, semantic gap
Similar Video Retrieval via Order-Aware Exemplars and Alignment
6
Authors: Teruki Horie, Masato Uchida, Yasuo Matsuyama. 《Journal of Signal and Information Processing》, 2018, No. 2, pp. 73-91 (19 pages)
In this paper, we present machine learning algorithms and systems for similar video retrieval, where the query is itself a video. For the similarity measurement, exemplars, or representative frames in each video, are extracted by unsupervised learning. For this learning, we chose order-aware competitive learning. After obtaining a set of exemplars for each video, the similarity is computed. Because the numbers and positions of the exemplars differ from video to video, we use a similarity computing method called M-distance, which generalizes existing global and local alignment methods applied to the exemplars. To represent each frame in the video, this paper adopts the Frame Signature of the ISO/IEC standard so that the total system, along with its graphical user interface, becomes practical. Experiments on the detection of inserted plagiaristic scenes showed excellent precision-recall curves, with precision values very close to 1. Thus, the proposed system can work as a plagiarism detector for videos. In addition, this method can be regarded as the structuring of unstructured data via numerical labeling by exemplars. Finally, further sophistication of this labeling is discussed.
Keywords: similar video retrieval, exemplar learning, M-distance, sequence alignment, data structuring
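Scoring two videos by aligning their exemplar sequences can be sketched with plain global alignment (Needleman-Wunsch) over exemplar feature vectors, in the spirit of the M-distance the paper builds on; the match score, gap penalty, and toy 2-D exemplars are illustrative assumptions, and the paper's actual M-distance generalizes both global and local alignment:

```python
# Global alignment score between two exemplar sequences: matched exemplars
# contribute a similarity that falls off with distance, and unmatched
# exemplars pay a gap penalty.

def match_score(a, b, scale=1.0):
    """Similarity of two exemplars: 1 at identity, lower with distance."""
    d = sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return 1.0 - d / scale

def align(seq_a, seq_b, gap=-0.5):
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):           # aligning against an empty prefix
        dp[i][0] = dp[i - 1][0] + gap   # costs one gap per exemplar
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + match_score(seq_a[i - 1], seq_b[j - 1]),
                dp[i - 1][j] + gap,     # skip an exemplar of seq_a
                dp[i][j - 1] + gap)     # skip an exemplar of seq_b
    return dp[n][m]
```

A higher score means a better exemplar-level match; identical videos score the number of exemplars, and truncation costs one gap per missing exemplar.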
Dynamic Hyperlinker: Innovative Solution for 3D Video Content Search and Retrieval
7
Authors: Mohammad Rafiq Swash, Amar Aggoun, Obaidullah Abdul Fatah, Bei Li. 《Journal of Computer and Communications》, 2016, No. 6, pp. 10-23 (14 pages)
Recently, 3D display technology and content creation tools have undergone rigorous development, and as a result they have been widely adopted by home and professional users. 3D digital repositories are growing and becoming ubiquitously available. However, searching and visualizing 3D content remains a great challenge. In this paper, we propose and present the development of a novel approach for creating hypervideos that eases 3D content search and retrieval: the dynamic hyperlinker for the 3D content search and retrieval process. It advances 3D multimedia navigability and searchability by creating dynamic links for selectable and clickable objects in the video scene while the user consumes the 3D video clip. The proposed system involves 3D video processing, such as detecting/tracking clickable objects, annotating objects, and metadata engineering, including a 3D content descriptive protocol. Such a system attracts attention from both home and professional users, more specifically broadcasters and digital content providers. The experiment is conducted on full-parallax holoscopic 3D videos, also known as integral images.
Keywords: holoscopic 3D image, integral image, 3D video, 3D display, video search and retrieval, hyperlinker, hypervideo
Sign Language Video Retrieval Based on Trajectory
8
Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》 (Communications and Computer, Chinese-English edition), 2010, No. 9, pp. 32-35 (4 pages)
Keywords: content-based video retrieval, sign language, edit distance, distance algorithm, color histogram, string, correction method, memory space
Sign Video Retrieval under Complex Background
9
Authors: Shilin Zhang, Mei Gu. 《通讯和计算机(中英文版)》 (Communications and Computer, Chinese-English edition), 2010, No. 8, pp. 14-19 (6 pages)
Keywords: video retrieval system, complex background, hidden Markov model, HMM, sign language recognition, search problem, dynamic characteristics, motion features
Video Retrieval Using Color and Spatial Information of Human Appearance
10
Authors: Sofina Yakhu, Nikom Suvonvorn. 《通讯和计算机(中英文版)》 (Communications and Computer, Chinese-English edition), 2012, No. 6, pp. 636-643 (8 pages)
Keywords: content-based video retrieval, appearance color, spatial information, human-centered, video surveillance system, target search, video data, VR system
Inference and retrieval of soccer event
11
Authors: SUN Xing-hua, YANG Jing-yu. 《通讯和计算机(中英文版)》 (Communications and Computer, Chinese-English edition), 2007, No. 3, pp. 18-32 (15 pages)
Keywords: soccer match, video extraction, context, Bayesian network, user-defined
Research on Laser Video Image Retrieval with Key-Frame Extraction Based on the Mean Square Deviation of Mutual Information
12
Authors: Hu Xiu, Wang Shu'ai. 《激光杂志》 (Laser Journal), CAS, PKU Core, 2024, No. 3, pp. 145-149 (5 pages)
To ensure that laser video image retrieval results contain no repetitive redundant images, a laser video image retrieval method based on key-frame extraction via the mean square deviation of mutual information is proposed. The key-frame extraction method takes maximizing the mean square deviation of the mutual information of laser video image colors as the criterion for setting the cluster centers of key frames, and extracts non-repetitive key frames by clustering. A key-frame-based retrieval method then uses the extracted key frames as the core matching content, retrieving laser video images whose key frames show significant similarity to those required, thereby completing the retrieval. Experimental results show that with this method the redundancy of the extracted key frames is only 0.01, the MAP of the retrieval results reaches 0.98, and no repetitive redundant images appear in the results.
Keywords: mutual information, mean square deviation, key-frame extraction, laser video, image retrieval, clustering algorithm
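The mutual-information computation underlying such a key-frame criterion can be sketched as MI between the color distributions of two frames, estimated from a joint histogram of co-located quantized pixel values. The 4-level quantization and the redundancy rule below (drop a frame whose MI with the last kept key frame is high, i.e. largely redundant with it) are illustrative assumptions, not the paper's exact clustering procedure:

```python
# Mutual information between two frames' quantized intensities, plus a
# simple redundancy filter that keeps a frame only when it shares little
# information with the previously kept key frame.
import math

def mutual_information(frame_a, frame_b, levels=4):
    """frames: equal-length lists of 0-255 intensities."""
    q = lambda v: min(v * levels // 256, levels - 1)
    n = len(frame_a)
    joint = {}
    for x, y in zip(frame_a, frame_b):
        key = (q(x), q(y))
        joint[key] = joint.get(key, 0) + 1
    px, py = {}, {}
    for (i, j), c in joint.items():
        px[i] = px.get(i, 0) + c
        py[j] = py.get(j, 0) + c
    mi = 0.0
    for (i, j), c in joint.items():
        p_xy = c / n
        mi += p_xy * math.log2(p_xy / ((px[i] / n) * (py[j] / n)))
    return mi

def key_frames(frames, max_mi=0.5):
    """Keep a frame only if its MI with the last kept frame is low."""
    kept = [frames[0]]
    for f in frames[1:]:
        if mutual_information(kept[-1], f) <= max_mi:
            kept.append(f)
    return kept
```

Identical frames yield maximal MI (their shared entropy) and are filtered out, which is exactly the "no repetitive redundant key frames" property the abstract measures.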
Design of a TV Video Retrieval System Fusing Multimodal Information (Cited by 1)
13
Author: Zhang Yuyan. 《电视技术》 (Video Engineering), 2024, No. 4, pp. 40-42 (3 pages)
With the rapid development of the Internet and digital technology, video data has become increasingly important online, and user demand for video retrieval has grown accordingly. To address the limitations of traditional video retrieval methods in mining video content, a TV video retrieval system fusing multimodal information is designed. Deep learning is used to extract image and text information from videos, a retrieval model is built to fuse the information, and a storage server is set up for the fused information, enabling more accurate and comprehensive video retrieval.
Keywords: multimodal information, TV video, retrieval system
NewsVideoCAR: A Content-Based News Video Browsing and Retrieval System (Cited by 3)
14
Authors: Xiong Hua, Lao Songyang, Wu Lingqi, Li Hengfeng, Wu Lingda, Li Guohui. 《计算机工程》 (Computer Engineering), CAS CSCD, PKU Core, 2000, No. 11, pp. 73-75 (3 pages)
This paper introduces the architecture of the NewsVideoCAR system, the basic ideas behind its core techniques, and the key points of its browsing interface design.
Keywords: NewsVideoCAR, TV news programs, program browsing and retrieval system
Query-Aware Dual Contrastive Learning Network for Cross-Modal Retrieval
15
Authors: Yin Mengran, Liang Meiyu, Yu Yang, Cao Xiaowen, Du Junping, Xue Zhe. 《软件学报》 (Journal of Software), EI CSCD, PKU Core, 2024, No. 5, pp. 2120-2132 (13 pages)
Recently, the new task of video corpus moment retrieval (VCMR) has been proposed: retrieving, from an unsegmented video corpus, a short video segment corresponding to a query sentence. The key to existing cross-modal video-text retrieval work lies in aligning and fusing features of different modalities. However, simply performing cross-modal alignment and fusion cannot ensure that semantically similar data from the same modality stay close in the joint feature space, nor does it consider the semantics of the query. To address these problems, a query-aware cross-modal dual contrastive learning network (QACLN) for multimodal video segment retrieval is proposed, which obtains unified semantic representations of different modalities by combining inter-modal and intra-modal contrastive learning. Specifically, a query-aware cross-modal semantic fusion strategy is proposed that adaptively fuses multimodal features of a video, such as its visual-modality and subtitle-modality features, according to the perceived query semantics, yielding a query-aware multimodal joint representation of the video. In addition, an inter-modal and intra-modal dual contrastive learning mechanism for videos and queries is proposed to strengthen semantic alignment and fusion across modalities, improving the discriminability and semantic consistency of the data representations. Finally, 1D convolutional boundary regression and cross-modal semantic similarity computation are used to accomplish moment localization and video retrieval. Extensive experiments show that the proposed QACLN outperforms the baseline methods.
Keywords: cross-modal semantic fusion, cross-modal retrieval, video moment localization, contrastive learning
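The inter-modal contrastive term can be sketched as an InfoNCE loss that pulls each video representation toward its paired query and pushes it away from the other queries in the batch. This is pure Python on toy vectors; the temperature, the 2-D embeddings, and the omission of the intra-modal term (the same loss applied within one modality) are simplifications for illustration:

```python
# InfoNCE over a batch of (video, query) pairs: for each video, the paired
# query is the positive and all other queries are negatives.
import math

def info_nce(video_vecs, query_vecs, temperature=0.1):
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)
    loss = 0.0
    for i, v in enumerate(video_vecs):
        logits = [cos(v, q) / temperature for q in query_vecs]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        loss += log_denom - logits[i]   # -log softmax at the matched pair
    return loss / len(video_vecs)
```

Correctly aligned pairs yield a lower loss than mismatched ones, which is what drives the representations of a video and its query together during training.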
Pano Video: Camera Motion Modeling and a Method for Estimating Camera Motion Parameters from Video (Cited by 6)
16
Authors: Zhang Maojun, Hu Xiaofeng, Ku Xishu. 《中国图象图形学报(A辑)》 (Journal of Image and Graphics, Series A), CSCD, 1997, No. 8, pp. 623-628 (6 pages)
By modeling camera translation, rotation, and zoom, and combining the motion models with a method based on pixel brightness changes, the camera motion parameters are estimated from video. With the estimated parameters, a video can be stitched into a panorama, which has broad applications in video compression and retrieval. Experiments show that the method can be successfully applied to video compression and video retrieval in video conferencing systems.
Keywords: camera, motion estimation, panorama, video compression, multimedia technology
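The parameter-fitting step can be sketched with a zoom-plus-pan model x' = s·x + tx, y' = s·y + ty: given point correspondences, the zoom factor s and pan (tx, ty) follow from least squares in closed form. Real systems derive the correspondences from pixel-brightness constancy, and the full model also covers rotation; the normal equations and toy points below are an illustrative reduction:

```python
# Least-squares fit of zoom s and pan (tx, ty) from correspondences
# (x, y) -> (u, v), minimizing sum ||s*(x, y) + (tx, ty) - (u, v)||^2.

def fit_zoom_pan(points, moved):
    n = len(points)
    sx = sum(x for x, y in points); sy = sum(y for x, y in points)
    su = sum(u for u, v in moved);  sv = sum(v for u, v in moved)
    spp = sum(x * x + y * y for x, y in points)
    spq = sum(x * u + y * v for (x, y), (u, v) in zip(points, moved))
    # Normal equations:
    #   s*spp + tx*sx + ty*sy = spq
    #   s*sx  + n*tx          = su
    #   s*sy  + n*ty          = sv
    denom = spp - (sx * sx + sy * sy) / n
    s = (spq - (sx * su + sy * sv) / n) / denom
    tx = (su - s * sx) / n
    ty = (sv - s * sy) / n
    return s, tx, ty
```

A fitted s noticeably above 1 signals zoom-in, below 1 zoom-out, and a dominant (tx, ty) signals panning, which is the classification such panorama builders rely on.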
Design of a Key Information Retrieval System for MOOC Videos
17
Authors: Zhao Bocheng, Bao Lantian, Yang Zhesen, Cao Xuan, Miao Qiguang. 《计算机科学》 (Computer Science), CSCD, PKU Core, 2024, No. 10, pp. 79-85 (7 pages)
With the rapid development of Internet technology, online education platforms such as MOOCs have attracted growing attention. As an innovative form of education, MOOCs effectively break through the geographical boundaries of traditional education and enable global sharing of high-quality educational resources. Through MOOCs, learners can choose courses according to personal interest, flexibly arrange their learning time and pace, and conveniently review material. However, current MOOC platforms still face great challenges in locating the timestamps of specific knowledge points within lecture videos, forcing users to repeatedly drag the progress bar to find the relevant segment when studying key concepts. To address this, a MOOC video knowledge extraction algorithm based on an attention model with multiple bipartite matching is proposed. The main components of the framework are subtitle text recognition and generation, subtitle text segmentation, a knowledge-point extraction model, and a knowledge-point retrieval module. Experimental results show that, compared with current knowledge-point extraction models, the proposed model achieves state-of-the-art performance on some key metrics across the Inspec, NUS, Krapivin, SemEval, and KP20k datasets, demonstrating the system's potential and value in practical applications.
Keywords: online education, MOOC, video retrieval, keyphrase generation, knowledge point localization
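Once keyphrases have been extracted per subtitle segment, the retrieval module's final step, locating a knowledge point, reduces to an inverted lookup from phrase to segment timestamps. The segment data and the substring-match rule below are illustrative assumptions standing in for the paper's learned extraction model:

```python
# Inverted phrase index over subtitle segments: a query jumps straight to
# the start times of the segments whose keyphrases match it.

def build_phrase_index(segments):
    """segments: list of (start_seconds, [keyphrases])."""
    index = {}
    for start, phrases in segments:
        for p in phrases:
            index.setdefault(p.lower(), []).append(start)
    return index

def locate(index, query):
    """Return sorted start times of segments whose keyphrases match."""
    q = query.lower()
    hits = {t for phrase, times in index.items() if q in phrase
            for t in times}
    return sorted(hits)
```

This is the piece that removes the "drag the progress bar" problem: the learner's query maps directly to seek positions in the video.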
Research on Efficient and Intelligent Retrieval of Surveillance Video
18
Authors: Sun Jingyao, Li Yixin. 《无线互联科技》 (Wireless Internet Technology), 2024, No. 9, pp. 27-30, 54 (5 pages)
With the gradual maturation of surveillance equipment and computer vision technology, surveillance video is widely used in security, traffic management, criminal investigation, and other fields, and retrieving surveillance video accurately and efficiently has become a difficult problem. After comparing and studying the strengths and limitations of semantic-based, object-based, and color-based surveillance video retrieval systems, as well as retrieval based on deep convolutional neural networks, this paper looks ahead to future trends in intelligent surveillance video retrieval.
Keywords: video retrieval, surveillance video, video indexing
Design of an Audio/Video Retrieval System Based on Mature AI Services
19
Author: Cheng Tong. 《无线互联科技》 (Wireless Internet Technology), 2024, No. 3, pp. 41-44 (4 pages)
The era of artificial intelligence has arrived, with new AI algorithms, models, and services emerging continuously. How to make high-quality use of these mature AI services and realize their economic value is a question that must be considered. Taking audio/video retrieval as the application scenario and drawing on AI services, this paper presents the requirements analysis, functional design, and system architecture of an audio/video retrieval system based on mature AI services.
Keywords: artificial intelligence, audio/video retrieval, system design, AI services
Visual polysemy and synonymy:toward near-duplicate image retrieval
20
Authors: Manni DUAN, Xiuqing WU. 《Frontiers of Electrical and Electronic Engineering in China》, CSCD, 2010, No. 4, pp. 419-429 (11 pages)
Near-duplicate image retrieval aims to retrieve images that are duplicates or near duplicates of a query image. One of the most popular and practical methods in near-duplicate image retrieval is based on the bag-of-words (BoW) model. However, the fundamental deficiency of current BoW methods is the gap between visual words and an image's semantic meaning. A similar problem also plagues text retrieval, where a prevalent remedy is to eliminate textual synonymy and polysemy and thereby improve overall performance. Our proposed approach borrows ideas from text retrieval and tries to overcome these deficiencies of the BoW model by treating the semantic gap problem as visual synonymy and polysemy issues. We use visual synonymy in a very general sense to describe the fact that many different visual words refer to the same visual meaning. By visual polysemy, we refer to the general fact that most visual words have more than one distinct meaning. To eliminate visual synonymy, we present an extended similarity function to implicitly expand query visual words. To eliminate visual polysemy, we use visual patterns and show that the most efficient way of using them is to merge the visual word vector with the visual pattern vector and obtain the similarity score by the cosine function. In addition, we observe a high possibility that duplicate visual words occur in an adjacent area. Therefore, we modify the traditional Apriori algorithm to mine quantitative patterns, defined as patterns containing duplicate items. Experiments show that quantitative patterns improve mean average precision (MAP) significantly.
Keywords: near-duplicate image retrieval, bag-of-words (BoW) model, visual synonymy, visual polysemy, extended similarity function, query expansion, visual pattern
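The fusion rule the abstract describes, merging the visual word vector with the visual pattern vector and scoring by cosine, can be sketched directly. The toy 3-word/2-pattern vectors and the optional pattern weight are illustrative assumptions:

```python
# Cosine similarity over the concatenation [word vector ; w * pattern
# vector], so that shared visual patterns raise the score of images whose
# word histograms alone would tie.
import math

def merged_cosine(words_a, patterns_a, words_b, patterns_b, w=1.0):
    u = list(words_a) + [w * p for p in patterns_a]
    v = list(words_b) + [w * p for p in patterns_b]
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Two images with identical word histograms but disjoint pattern vectors score below two images that also share patterns, which is how pattern evidence disambiguates polysemous visual words.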