4 articles found
Trends in Event Understanding and Caption Generation/Reconstruction in Dense Video: A Review
1
Authors: Ekanayake Mudiyanselage Chulabhaya Lankanatha Ekanayake, Abubakar Sulaiman Gezawa, Yunqi Lei. Computers, Materials & Continua (SCIE, EI), 2024, No. 3, pp. 2941-2965 (25 pages)
Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. It has been used to help visually impaired people understand video content, and it also plays an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and overlapping events. Deep learning is reshaping computer vision (CV) and natural language processing (NLP). There are hundreds of deep learning models, datasets, and evaluation methods that can help close gaps in current research. This article addresses that gap by evaluating state-of-the-art approaches, focusing on deep learning and machine learning for video captioning in dense environments. It reviews classic machine learning techniques, then presents deep learning models and details of benchmark datasets with their respective domains. It also reviews evaluation metrics, including Bilingual Evaluation Understudy (BLEU), Metric for Evaluation of Translation with Explicit Ordering (METEOR), Word Mover's Distance (WMD), and Recall-Oriented Understudy for Gisting Evaluation (ROUGE), with their pros and cons. Finally, the article lists future directions and proposes work on context enhancement using key scene extraction with object detection in particular frames, in particular improving the context of video descriptions by detecting key frames through morphological image analysis. Additionally, the paper discusses a novel approach involving sentence reconstruction and context improvement through key frame object detection, which incorporates the fusion of large language models for refining results. The final results arise from enhancing the text generated by the proposed model: improving the predicted text and isolating objects using various keyframes. These keyframes identify dense events occurring in the video sequence.
Keywords: video description; video to text; video caption; sentence reconstruction
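Of the metrics named in the abstract above, BLEU is built on clipped (modified) n-gram precision. The following is a minimal sketch of that single ingredient only, not the full BLEU score (no brevity penalty, no geometric mean over n-gram orders) and not code from the reviewed paper. The function name `clipped_ngram_precision` and the whitespace tokenization in the usage note are assumptions made for illustration.

```python
from collections import Counter


def clipped_ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision of a candidate token list against one reference.

    Each candidate n-gram is credited at most as many times as it occurs
    in the reference, which stops a caption from scoring well by repeating
    a single matching word.
    """
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate)
    ref_counts = ngrams(reference)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0  # candidate shorter than n tokens
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched / total
```

For example, scoring the candidate `"the the the the the the the".split()` against the reference `"the cat is on the mat".split()` credits "the" only twice, because the reference contains it twice.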
A Deep Learning Approach to Mesh Segmentation (cited by: 1)
2
Authors: Abubakar Sulaiman Gezawa, Qicong Wang, [+1 author], Haruna Chiroma, Yunqi Lei. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 5, pp. 1745-1763 (19 pages)
In the shape analysis community, decomposing a 3D shape into meaningful parts has become a topic of interest. 3D model segmentation is widely used in tasks such as shape deformation, partial shape matching, skeleton extraction, shape correspondence, shape annotation, and texture mapping. Numerous approaches have attempted to provide better segmentation solutions; however, most previous techniques used handcrafted features, which usually focus on a particular attribute of 3D objects and are therefore difficult to generalize. In this paper, we propose a three-stage approach that uses a multi-view recurrent neural network to automatically segment a 3D shape into visually meaningful sub-meshes. The first stage involves normalizing and scaling a 3D model to fit within the unit sphere and rendering the object from different views. Different viewpoints, however, are not necessarily associated with one another, and a 3D region can yield entirely different outcomes depending on the viewpoint. To address this, we run each view through a CNN (with shared weights) and a Bolster block to create a probability boundary map. The Bolster block models the area relationships between different views, which helps to improve and refine the data. In stage two, the feature maps generated in the previous step are correlated using a recurrent neural network to obtain compatible fine-detail responses for each view. Finally, a fully connected layer is used to return coherent edges, which are then back-projected onto the 3D object to produce the final segmentation. Experiments on the Princeton Segmentation Benchmark dataset show that the proposed method is effective for mesh segmentation tasks.
Keywords: deep learning; mesh segmentation; 3D shape; shape features
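The first stage described in the abstract above scales a 3D model to fit within the unit sphere. A minimal sketch of one common way to do this follows. It assumes the model is centered on its vertex centroid before scaling, which the abstract does not specify, and the function name `normalize_to_unit_sphere` is hypothetical.

```python
import math


def normalize_to_unit_sphere(vertices):
    """Translate vertices to their centroid and scale them into the unit sphere.

    `vertices` is a non-empty list of [x, y, z] coordinates. Centering on the
    centroid is an assumption; a bounding-box centre would work the same way.
    """
    count = len(vertices)
    centroid = [sum(v[k] for v in vertices) / count for k in range(3)]
    centered = [[v[k] - centroid[k] for k in range(3)] for v in vertices]
    # The farthest vertex from the centre defines the scale factor.
    radius = max(math.sqrt(sum(c * c for c in p)) for p in centered)
    if radius == 0.0:
        return centered  # degenerate model: all vertices coincide
    return [[c / radius for c in p] for p in centered]
```

After this step the farthest vertex lies on the sphere of radius 1, so renderings from different views share a common scale.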
An Efficient Solving Algorithm for SVD Recommendation Models with Implicit Feedback (cited by: 2)
3
Authors: 蔡剑平, 雷蕴奇, [+2 authors], 陈明明, 王宁, 张双越. 《中国科学:信息科学》 (Scientia Sinica Informationis; CSCD, PKU Core), 2020, No. 10, pp. 1544-1558 (15 pages)
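As an important component of recommender systems, collaborative filtering has become one of today's mainstream recommendation methods. Latent-factor-based collaborative filtering commonly uses SVD recommendation models to analyze user preferences. In recent years, as research on SVD recommendation models has deepened, a class of SVD recommendation models with implicit feedback, such as SVD++ and TrustSVD, has been proposed. These models mine useful information from limited data sources more effectively and achieve good results, and have therefore attracted wide attention. However, the existing literature mostly focuses on model design and lacks efficient solving algorithms designed specifically for SVD recommendation models with implicit feedback. To this end, this paper first studies a general gradient-solving framework for SVD recommendation models; then, taking the SVD++ model as the entry point, it designs an efficient solving algorithm, BCDSVD++, based on the block gradient descent method, and resolves two key problems: capacity matrix inversion and optimized processing of sparse data. Experiments show that the proposed BCDSVD++ algorithm solves more efficiently and converges better than the traditional parallel gradient descent method.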
Keywords: SVD recommendation model; implicit feedback; SVD++; block coordinate descent; collaborative filtering
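For orientation, the SVD++ model named in the abstract above predicts a rating from a global mean, user and item biases, and a user vector augmented with implicit-feedback factors. The sketch below shows only that standard prediction rule. It is not the BCDSVD++ solver, whose details the abstract does not give, and the function and parameter names are assumptions made for illustration.

```python
import math


def svdpp_predict(mu, b_u, b_i, p_u, q_i, y_items):
    """Standard SVD++ rating prediction for one (user, item) pair.

    mu      : global mean rating
    b_u     : user bias
    b_i     : item bias
    p_u     : explicit user factor vector
    q_i     : item factor vector
    y_items : implicit-feedback factor vectors y_j for the items the user
              has interacted with (may be empty)
    """
    user_vec = list(p_u)
    if y_items:
        # Implicit feedback term, normalized by |N(u)|^(-1/2).
        scale = 1.0 / math.sqrt(len(y_items))
        for k in range(len(user_vec)):
            user_vec[k] += scale * sum(y[k] for y in y_items)
    return mu + b_u + b_i + sum(q * u for q, u in zip(q_i, user_vec))
```

With an empty `y_items` list the rule reduces to the biased SVD prediction, which is why SVD++ is usually described as SVD extended with implicit feedback.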
Face recognition by decision fusion of two-dimensional linear discriminant analysis and local binary pattern (cited by: 1)
4
Authors: Qicong WANG, Binbin WANG, [+4 authors], Xinjie HAO, Lisheng CHEN, Jingmin CUI, Rongrong JI, Yunqi LEI. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 6, pp. 1118-1129 (12 pages)
To investigate the robustness of face recognition algorithms under complicated variations in illumination, facial expression, and posture, the advantages and disadvantages of seven typical algorithms for extracting global and local features are studied through experiments on the Olivetti Research Laboratory database and three other databases (subsets for illumination, expression, and posture, constructed by selecting images from several existing face databases). Based on these experimental results, two face recognition schemes based on the decision fusion of two-dimensional linear discriminant analysis (2DLDA) and local binary pattern (LBP) are proposed in this paper to raise recognition rates. In addition, the face is partitioned nonuniformly for its LBP histograms to improve performance. Our experimental results show that the two kinds of features, 2DLDA and LBP, are complementary, and verify the effectiveness of the proposed fusion algorithms.
Keywords: face recognition; global feature; local feature; linear discriminant analysis; local binary pattern; decision fusion
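The LBP histograms mentioned in the abstract above are built from per-pixel binary codes. The following is a minimal sketch of the basic 3x3 LBP operator and a histogram over one image region, assuming a clockwise bit order starting at the top-left neighbour and a "greater than or equal" threshold. Both conventions, and the function names, are illustrative assumptions; the paper's nonuniform face partitioning and its fusion with 2DLDA are not shown.

```python
def lbp_code(patch):
    """8-bit LBP code for the centre pixel of a 3x3 patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting from the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (row, col) in enumerate(offsets):
        if patch[row][col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(region):
    """256-bin histogram of LBP codes over the interior pixels of a region."""
    hist = [0] * 256
    for r in range(1, len(region) - 1):
        for c in range(1, len(region[0]) - 1):
            patch = [row[c - 1:c + 2] for row in region[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

In a block-based scheme, one such histogram would be computed per face region and the histograms concatenated into a single feature vector.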