Scene 3-D Reconstruction System in Scattering Medium
1
Authors: Zhuoyifan Zhang, Lu Zhang, Liang Wang, Haoming Wu. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 3405-3420 (16 pages)
Research on neural radiance fields for novel view synthesis has experienced explosive growth with the development of new models and extensions. The NeRF (Neural Radiance Fields) algorithm, suitable for underwater scenes or scattering media, is also evolving. Existing underwater 3D reconstruction systems still face challenges such as long training times and low rendering efficiency. This paper proposes an improved underwater 3D reconstruction system to achieve rapid and high-quality 3D reconstruction. First, we enhance underwater videos captured by a monocular camera to correct the image quality degradation caused by the physical properties of the water medium and ensure consistency in enhancement across frames. Then, we perform keyframe selection to optimize resource usage and reduce the impact of dynamic objects on the reconstruction results. After pose estimation using COLMAP, the selected keyframes undergo 3D reconstruction using neural radiance fields (NeRF) based on multi-resolution hash encoding for model construction and rendering. In terms of image enhancement, our method has been optimized in certain scenarios, demonstrating effectiveness in image enhancement and better continuity between consecutive frames of the same data. In terms of 3D reconstruction, our method achieved a peak signal-to-noise ratio (PSNR) of 18.40 dB and a structural similarity (SSIM) of 0.6677, indicating a good balance between operational efficiency and reconstruction quality.
Keywords: Underwater scene reconstruction; image enhancement; NeRF
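The PSNR figure quoted in the abstract above follows the standard definition, 10·log10(MAX²/MSE). A minimal sketch, assuming 8-bit pixel values held in flat Python lists; the function name `psnr` is illustrative and is not taken from the paper:

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A real evaluation would average this over rendered and ground-truth frames; how the authors aggregate their scores is not stated in the abstract.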
YOLOv5ST:A Lightweight and Fast Scene Text Detector
2
Authors: Yiwei Liu, Yingnan Zhao, Yi Chen, Zheng Hu, Min Xia. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 909-926 (18 pages)
Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices like drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation, including replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating a Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at the intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work strikes a balance between speed and accuracy in scene text detection, making it well suited for performance-constrained environments.
Keywords: scene text detection; YOLOv5; lightweight; object detection
Towards complex scenes: A deep learning-based camouflaged people detection method for snapshot multispectral images
3
Authors: Shu Wang, Dawei Zeng, Yixuan Xu, Gonghan Yang, Feng Huang, Liqiong Chen. Defence Technology (防务技术) (SCIE, EI, CAS, CSCD), 2024, No. 4, pp. 269-281 (13 pages)
Camouflaged people are extremely expert in actively concealing themselves by effectively utilizing cover and the surrounding environment. Despite advancements in optical detection capabilities through imaging systems, including spectral, polarization, and infrared technologies, there is still no effective real-time method for accurately detecting small-sized, highly effective camouflaged people in complex real-world scenes. This study proposes a snapshot multispectral image-based camouflaged detection model, multispectral YOLO (MS-YOLO), which utilizes the SPD-Conv and SimAM modules to effectively represent targets and suppress background interference by exploiting the spatial-spectral target information. In addition, the study constructs the first real-shot multispectral camouflaged people dataset (MSCPD), which encompasses diverse scenes, target scales, and attitudes. To minimize information redundancy, MS-YOLO selects an optimal subset of 12 bands with strong feature representation and minimal inter-band correlation as input. In experiments on the MSCPD, MS-YOLO achieves a mean Average Precision of 94.31% and real-time detection at 65 frames per second, which confirms the effectiveness and efficiency of our method in detecting camouflaged people in various typical desert and forest scenes. Our approach offers valuable support for improving the perception capabilities of unmanned aerial vehicles in detecting enemy forces and rescuing personnel on the battlefield.
Keywords: Camouflaged people detection; Snapshot multispectral imaging; Optimal band selection; MS-YOLO; Complex remote sensing scenes
A Survey of Crime Scene Investigation Image Retrieval Using Deep Learning
4
Authors: Ying Liu, Aodong Zhou, Jize Xue, Zhijie Xu. Journal of Beijing Institute of Technology (EI, CAS), 2024, No. 4, pp. 271-286 (16 pages)
The crime scene investigation (CSI) image is a key evidence carrier during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of the existing works on one-class and multi-class CSI image retrieval based on deep learning. According to the research, based on their technical functionalities and implementation methods, CSI image retrieval is roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Keywords: crime scene investigation (CSI) image; image retrieval; deep learning
A Dual Domain Robust Reversible Watermarking Algorithm for Frame Grouping Videos Using Scene Smoothness
5
Authors: Yucheng Liang, Ke Niu, Yingnan Zhang, Yifei Meng. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 5143-5174 (32 pages)
The proposed robust reversible watermarking algorithm addresses the compatibility challenges between robustness and reversibility in existing video watermarking techniques by leveraging scene smoothness for frame grouping videos. Grounded in the H.264 video coding standard, the algorithm first employs traditional robust watermark stitching technology to embed watermark information in the low-frequency coefficient domain of the U channel. Subsequently, it utilizes histogram migration techniques in the high-frequency coefficient domain of the U channel to embed auxiliary information, enabling successful watermark extraction and lossless recovery of the original video content. Experimental results demonstrate the algorithm's strong imperceptibility, with each embedded frame in the experimental videos achieving a mean peak signal-to-noise ratio of 49.3830 dB and a mean structural similarity of 0.9996. Compared with the three comparison algorithms, the performance of the two experimental indexes is improved by 7.59% and 0.4% on average. At the same time, the proposed algorithm has strong robustness to both offline and online attacks. In the face of offline attacks, the average normalized correlation coefficient between the extracted watermark and the original watermark is 0.9989, and the average bit error rate is 0.0089. In the face of online attacks, the normalized correlation coefficient between the extracted watermark and the original watermark is 0.8840, and the mean bit error rate is 0.2269. Compared with the three comparison algorithms, the performance of the two experimental indexes is improved by 1.27% and 18.16% on average, highlighting the algorithm's robustness. Furthermore, the algorithm exhibits low computational complexity, with the mean encoding and the mean decoding time differentials during experimental video processing being 3.934 and 2.273 s, respectively, underscoring its practical utility.
Keywords: Robust reversible watermarking; scene smoothness; dual-domain; U channel; H.264 encoding standard
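The robustness figures in the abstract above are a normalized correlation coefficient and a bit error rate between the original and extracted watermarks. A minimal sketch of both, assuming binary watermarks stored as lists of 0/1 and one common definition of normalized correlation; the abstract does not say which exact formula the authors use, and the function names are illustrative:

```python
import math


def bit_error_rate(original, extracted):
    """Fraction of watermark bits that differ after extraction."""
    if len(original) != len(extracted) or not original:
        raise ValueError("watermarks must be non-empty and of equal length")
    errors = sum(1 for a, b in zip(original, extracted) if a != b)
    return errors / len(original)


def normalized_correlation(original, extracted):
    """Inner product of the two watermarks divided by the product of their norms."""
    if len(original) != len(extracted) or not original:
        raise ValueError("watermarks must be non-empty and of equal length")
    numerator = sum(a * b for a, b in zip(original, extracted))
    denominator = math.sqrt(sum(a * a for a in original)) * math.sqrt(
        sum(b * b for b in extracted)
    )
    if denominator == 0:
        raise ValueError("an all-zero watermark has no defined correlation")
    return numerator / denominator
```

Under these definitions a perfectly recovered watermark gives a correlation of 1 and a bit error rate of 0, which is the direction the reported 0.9989 and 0.0089 point in.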
Intelligent Sensing and Control of Road Construction Robot Scenes Based on Road Construction
6
Authors: Zhongping Chen, Weigong Zhang. Structural Durability & Health Monitoring (EI), 2024, No. 2, pp. 111-124 (14 pages)
Automatic control technology is the basis for improving road construction robots. According to the characteristics and functions of the construction equipment, this research takes positioning acquisition and real-world monitoring as the perception inputs. The process uses RTK-GNSS positioning perception technology, projecting the left side of the earth with the Gauss-Krueger projection method and then carrying out the Cartesian conversion based on the characteristics of the drawing. The steering control system is the core of the electric-drive unmanned module. On the basis of an analysis of the composition of the steering system of unmanned engineering vehicles, models of key steering components such as the direction and torque sensors and the drive motor are established, a joint simulation model of the unmanned engineering vehicle is built, and the steering controller is designed using the PID method. The simulation results show that the control method can meet the construction path demand for automatic steering. Path planning first formulates the construction area with preset values and corrects the steering angle during driving with the PID algorithm, thereby realizing construction-based path planning; the results show that the method keeps the straight-path error within 10 cm and the curve error within 20 cm. With the collaboration of the various modules, the automatic construction simulation results of this robot show that the designed path and control method are effective.
Keywords: scene perception; remote control technology; Cartesian coordinate system; construction robot; highway construction
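The abstract above relies on a PID controller for steering-angle correction. A minimal discrete-time sketch of the textbook PID law follows; the gains, the class name `PID`, and the toy plant in `simulate_heading_correction` are assumptions for illustration and do not reproduce the paper's vehicle model or tuning:

```python
class PID:
    """Textbook discrete PID: u = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first sample

    def update(self, error, dt):
        if dt <= 0:
            raise ValueError("dt must be positive")
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_heading_correction(pid, heading_error, steps, dt):
    """Toy plant (assumption): the heading error shrinks at a rate equal to the command."""
    history = [heading_error]
    for _ in range(steps):
        command = pid.update(heading_error, dt)
        heading_error -= command * dt
        history.append(heading_error)
    return history
```

For example, `simulate_heading_correction(PID(1.0, 0.0, 0.0), 1.0, 10, 0.1)` shrinks the error by a factor of 0.9 per step under this toy plant; a real steering loop would replace the plant with the vehicle's sensor feedback.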
Ground target localization of unmanned aerial vehicle based on scene matching
7
Authors: ZHANG Yan, CHEN Yukun, HUANG He, TANG Simi, LI Zhi. High Technology Letters (EI, CAS), 2024, No. 3, pp. 231-243 (13 pages)
In order to improve target localization precision, accuracy, execution efficiency, and application range of the unmanned aerial vehicle (UAV) based on scene matching, a ground target localization method for unmanned aerial vehicle based on scene matching (GTLUAVSM) is proposed. The suggested approach entails completing scene matching through a feature matching algorithm. Then, multi-sensor registration is optimized by robust estimation based on homologous registration. Finally, basemap generation and model solution are utilized to improve basemap correspondence and accomplish aerial image positioning. Theoretical evidence and experimental verification demonstrate that GTLUAVSM can improve localization accuracy, speed, and precision while minimizing reliance on task equipment.
Keywords: scene matching; basemap adjustment; feature registration; random sample consensus (RANSAC); unmanned aerial vehicle (UAV)
The Fusion of Temporal Sequence with Scene Priori Information in Deep Learning Object Recognition
8
Authors: Yongkang Cao, Fengjun Liu, Xian Wang, Wenyun Wang, Zhaoxin Peng. Open Journal of Applied Sciences, 2024, No. 9, pp. 2610-2627 (18 pages)
For some important object recognition applications such as intelligent robots and unmanned driving, images are collected consecutively and are associated with one another, and the scenes have steady prior features. Yet existing technologies do not take full advantage of this information. In order to take object recognition further than existing algorithms in these applications, an object recognition method that fuses temporal sequence with scene priori information is proposed. This method first employs YOLOv3 as the basic algorithm to recognize objects in single-frame images, then the DeepSort algorithm to establish associations among potential objects recognized in images at different moments, and finally the confidence fusion method and temporal boundary processing method designed herein to fuse, at the decision level, temporal sequence information with scene priori information. Experiments using public datasets and self-built industrial scene datasets show that, due to the expansion of information sources, the quality of single-frame images has less impact on the recognition results, whereby the object recognition is greatly improved. It is presented herein as a widely applicable framework for the fusion of information under multiple classes. Any object recognition algorithm that outputs object class, location information, and recognition confidence at the same time can be integrated into this information fusion framework to improve performance.
Keywords: Computer Vision; Object Recognition; Deep Learning; Consecutive scene; Information Fusion
Analyzing the Impact of Scene Transitions on Indoor Camera Localization through Scene Change Detection in Real-Time
9
Authors: Muhammad S. Alam, Farhan B. Mohamed, Ali Selamat, Faruk Ahmed, AKM B. Hossain. Intelligent Automation & Soft Computing, 2024, No. 3, pp. 417-436 (20 pages)
Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during the image sequence and plays a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposed a real-time indoor camera localization system based on a recurrent neural network that detects scene change during the image sequence. The proposed system is trained on an annotated image dataset and predicts the camera pose in real time. The system mainly improved the localization performance of indoor cameras by more accurately predicting the camera pose. It also recognizes the scene changes during the sequence and evaluates the effects of these changes. This system achieved high accuracy and real-time performance. The scene change detection process was performed using visual rhythm and the proposed recurrent deep architecture, which performed camera pose prediction and scene change impact evaluation. Overall, this study proposed a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
Keywords: Camera pose estimation; indoor camera localization; real-time localization; scene change detection; simultaneous localization and mapping (SLAM)
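Research on 3D Digital Watershed Modeling Based on ArcScene (cited by: 3)
10
Authors: 李鑫龙, 杨东旭, 王璐, 季月, 林浩, 吕国辉. 《安徽农业科学》 (Journal of Anhui Agricultural Sciences) (CAS), 2015, No. 22, pp. 363-365 (3 pages)
A three-dimensional digital watershed gives an intuitive display of the geographic, natural, and ecological environment around a watershed and plays an important supporting role in economic development and resource use within the watershed. This study proposes a 3D digital watershed modeling method based on ArcScene. High-resolution DOM imagery obtained by aerial photogrammetry is overlaid with DEM data to build the digital watershed model and to preview it as a multi-view 3D fly-through animation. The method allows rapid, intuitive preview and analysis of the terrain, landforms, and cultural features of a watershed, providing important technical support for the design, construction, and renovation of water conservancy, transportation, and public facilities in the watershed.
Keywords: ArcScene; laser point cloud; DOM; DEM; 3D watershed modeling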
Eye movements during inspecting pictures of natural scenes for information to verify sentences
11
Authors: 陈庆荣, 蒋志杰. Journal of Southeast University (English Edition) (EI, CAS), 2010, No. 3, pp. 444-447 (4 pages)
As eye tracking can be used to record moment-to-moment changes of eye movements as people inspect pictures of natural scenes and comprehend information, this paper attempts to use eye-movement technology to investigate how the order of presentation and the characteristics of information affect the semantic mismatch effect in the picture-sentence paradigm. A 3 (syntax) × 2 (semantic relation) factorial design is adopted, with syntax and semantic relations as within-participant variables. The experiment finds that the semantic mismatch is most likely to increase cognitive load, as people have to spend more time, including first-pass time, regression path duration, and total fixation duration. Double negation does not significantly increase the processing difficulty of pictures and information. Experimental results show that people can extract the special syntactic strategy from long-term memory to process pictures and sentences with different semantic relations. It enables readers to comprehend double negation as affirmation. These results demonstrate that the constituent comparison model may not be a general model regarding other languages.
Keywords: natural scene; semantic mismatch; double negation; eye movement
Suitable Region for Flue-cured Tobacco (Nicotiana glauca L.) Planting Based on Spatial Scene Similarity
12
Authors: 董钧祥, 郭旦怀, 邵小东. Agricultural Science & Technology (CAS), 2012, No. 9, pp. 1947-1949, 1981 (4 pages)
[Objective] The aim was to establish a model based on spatial scene similarity, for which soil, slope, transport, water conservancy, light, and social and economic factors in suitable planting areas were all considered. A new suitable planting area for flue-cured tobacco was determined by comparison and analysis against an excellent area. [Method] A total of thirty natural factors, classified into nine categories, were chosen from Longpeng Town (LP) and Shaochong Town (SC) in Shiping County, Honghe Hani and Yi Autonomous Prefecture. [Result] By weight, the factors from high to low were as follows: soil > light > elevation > slope > water conservancy > transport > baking facility > planting plans over the years > others. The similarity of geographical conditions in the area was 0.8943, which indicated that the planting conditions in the two regions are similar. If farmer population per unit area, farmland quantity per farmer, labor force in every household, activity in planting flue-cured tobacco, and the work of local instructors were considered, the weights of the different factors were as follows: farmer population per unit area > farmland quantity per farmer > farmers' activity in planting flue-cured tobacco > educational background > labor force in every household > instructor > population of farmers' children attending school. The similarity of geographical conditions was 0.7031, which indicated that it is non-natural factors that influence the yield and quality of flue-cured tobacco. [Conclusion] According to the analysis of suitable planting areas for flue-cured tobacco based on assessment of spatial scene similarity, the similarity of growing conditions in two spatial scenes can be analyzed and evaluated, which would promote further exploration of the factors influencing tobacco production and their effects.
Keywords: Similarity of spatial scene; Planting of flue-cured tobacco; Suitable region
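Research on Key Technologies for Dynamic Updating of Real-Scene 3D Data Based on Linear Feature Boundary Lines
13
Authors: 李佳佳, 吴昊, 朱理想. 《山西建筑》 (Shanxi Architecture), 2025, No. 2, pp. 191-194, 198 (5 pages)
The Real-Scene 3D China initiative is an important measure for implementing the Digital China, Safe China, and digital economy strategies. It is an important new type of national infrastructure for the new era and a foundation supporting ecological civilization and economic and social development. Since the start of the 14th Five-Year Plan, regions across the country have carried out real-scene 3D construction projects of considerable scale and accumulated real-scene 3D base data at different scales. How to make full use of these existing results while continuously ensuring the timeliness and usability of real-scene 3D data has become the next problem in urgent need of a solution. Targeting the scene feature attributes of real-scene 3D data, this paper proposes a scheme for dynamic updating and scene fusion of real-scene 3D data based on linear feature boundary lines. Starting from the existing scene data, the linear feature boundary line nearest to the target update area is taken as the actual update extent, and cutting, stitching, color adjustment, and fusion are applied to achieve seamless fusion of multi-temporal real-scene 3D data.
Keywords: real-scene 3D; dynamic updating; data fusion; oblique photography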
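Ecological Landscape Development Based on ArcScene and SketchUp Interactive Modeling: A Case Study of the Guanju Riverside in Miaoqian Town, Dangyang City
14
Authors: 冯德鸿, 雷奥林, 池晓霞. 《测绘与空间地理信息》 (Geomatics & Spatial Information Technology), 2022, No. 8, pp. 57-59 (3 pages)
Taking the Guanju riverside in Miaoqian Town, Dangyang City as an example, this paper introduces interactive modeling with ArcScene and SketchUp in two forms: one based on the SketchUp Esri plug-in and one based on the 3D editor in ArcScene. It compares the advantages, disadvantages, and scope of application of the two techniques, uses aerial-survey terrain point features carrying elevation values to simulate the riverside terrain, and refines the scene to complete the ecological landscape scene development.
Keywords: ArcScene; SketchUp; interactive modeling; terrain simulation; scene refinement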
Deep Scalogram Representations for Acoustic Scene Classification (cited by: 5)
15
Authors: Zhao Ren, Kun Qian, Zixing Zhang, Vedhas Pandit, Alice Baird, Bjorn Schuller. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2018, No. 3, pp. 662-669 (8 pages)
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach presented firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
Keywords: Acoustic scene classification (ASC); (bidirectional) gated recurrent neural networks ((B)GRNNs); convolutional neural networks (CNNs); deep scalogram representation; spectrogram representation
A Study of Video Scenes Clustering Based on Shot Key Frames (cited by: 1)
16
Authors: CAI Bo, ZHANG Lu, ZHOU Dong-ru. Wuhan University Journal of Natural Sciences (EI, CAS), 2005, No. 6, pp. 966-970 (5 pages)
In digital video analysis, browsing, retrieval, and query, the shot alone is incapable of meeting needs. A scene is a cluster of a series of shots, which partially meets the above demands. In this paper, an algorithm for video scene clustering based on shot key frame sets is proposed. We use χ² histogram matching and twin histogram comparison for shot detection. A method is presented for key frame set extraction based on the distance between non-adjacent frames; furthermore, the minimum distance between key frame sets is computed as the distance between shots, and scenes are eventually clustered according to the distance between shots. Experiments with this algorithm show satisfactory performance in correctness and computing speed.
Keywords: shot; scene; key frame; clustering
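The shot detection step in the abstract above compares frame histograms with a chi-square measure. A minimal sketch, assuming the common symmetric variant of the chi-square distance and a single fixed threshold; the abstract does not give the authors' exact formula or their twin-comparison thresholds, so `chi_square_distance`, `detect_shot_boundaries`, and the threshold value are illustrative only:

```python
def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance: sum of (a - b)^2 / (a + b) over non-empty bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return total


def detect_shot_boundaries(histograms, threshold):
    """Return indices i where frame i differs from frame i - 1 by more than threshold."""
    return [
        i
        for i in range(1, len(histograms))
        if chi_square_distance(histograms[i - 1], histograms[i]) > threshold
    ]
```

With frame histograms `[[10, 0], [10, 0], [0, 10], [0, 10]]` and a threshold of 5, the only boundary reported is at index 2, where the histogram content flips.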
Semantic segmentation method of road scene based on Deeplabv3+ and attention mechanism (cited by: 6)
17
Authors: BAI Yanqiong, ZHENG Yufu, TIAN Hong. Journal of Measurement Science and Instrumentation (CAS, CSCD), 2021, No. 4, pp. 412-422 (11 pages)
In the study of automatic driving, understanding the road scene is key to improving driving safety. Semantic segmentation divides the image at the pixel level into different areas associated with semantic categories, helping vehicles perceive and obtain information about the surrounding road environment, which improves driving safety. Deeplabv3+ is a currently popular semantic segmentation model. During its semantic segmentation tasks, small targets are missed and similar objects are easily misjudged, which leads to rough segmentation boundaries and reduces semantic accuracy. This study focuses on this issue: based on the Deeplabv3+ network structure and combined with the attention mechanism to increase the weight of the segmentation area, it proposes an improved Deeplabv3+ fusion attention mechanism for road scene semantic segmentation. First, a group of parallel position attention and channel attention modules is introduced at the Deeplabv3+ encoding end to capture more spatial context information and high-level semantic information. Then, at the decoding end, an attention mechanism is introduced to restore the spatial detail information, and the data are normalized in order to accelerate the convergence of the model. The segmentation effects of models with different attention mechanisms are compared and tested on the CamVid and Cityscapes datasets. The experimental results show that the mean Intersection over Union of the improved model on the two datasets is boosted by 6.88% and 2.58%, respectively, which is better than using Deeplabv3+ alone. This method does not significantly increase the amount of network calculation and complexity, and has a good balance of speed and accuracy.
Keywords: autonomous driving; road scene; semantic segmentation; Deeplabv3+; attention mechanism
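The mean Intersection over Union reported in the abstract above is the per-class ratio of overlapping pixels to combined pixels, averaged over classes. A minimal sketch, assuming predictions and ground truth are flat lists of integer class labels and that classes absent from both are skipped; the function name `mean_iou` is illustrative and benchmark protocols such as those for CamVid or Cityscapes may differ in detail:

```python
def mean_iou(predicted, target, num_classes):
    """Average per-class IoU over the classes present in either label map."""
    if len(predicted) != len(target):
        raise ValueError("label maps must have the same number of pixels")
    ious = []
    for cls in range(num_classes):
        # Pixels where both maps agree on this class.
        intersection = sum(1 for p, t in zip(predicted, target) if p == cls and t == cls)
        # Pixels where at least one map assigns this class.
        union = sum(1 for p, t in zip(predicted, target) if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the inputs")
    return sum(ious) / len(ious)
```

For `predicted = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, class 0 scores 1/2 and class 1 scores 2/3, giving a mean of 7/12.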
Advanced Feature Fusion Algorithm Based on Multiple Convolutional Neural Network for Scene Recognition (cited by: 5)
18
Authors: Lei Chen, Kanghu Bo, Feifei Lee, Qiu Chen. Computer Modeling in Engineering & Sciences (SCIE, EI), 2020, No. 2, pp. 505-523 (19 pages)
Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. We propose in this paper an advanced feature fusion algorithm using Multiple Convolutional Neural Network (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied for scene recognition. Firstly, we split training images in two directions and apply them to three deep CNN models, and then extract features from the last fully connected (FC) layer and probabilistic layer of each model. Finally, feature vectors are fused with different fusion strategies in groups and forwarded into a SoftMax classifier. Our proposed algorithm is evaluated on three scene datasets for scene recognition. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
Keywords: scene recognition; deep feature fusion; multiple convolutional neural network
Seabed scene simulation and its realization in extending Vega (cited by: 3)
19
Authors: Zhi-ming Song, Feng-ju Kang, Kai Tang, Yan-jun Chu. Journal of Marine Science and Application, 2003, No. 2, pp. 40-45 (6 pages)
Realistic simulation of underwater scenes is always difficult because of the special and complex visual effects in underwater space. The seabed is an important part of the underwater environment. This paper describes methods for seabed scene simulation based on OpenGL. They include construction of fluctuant terrain based on the random sinusoidal algorithm, simulation of the seabed flicker effect by means of circular texture mapping, and generation of the turbidity effect by using fog techniques. For applications based on Vega, the leading high-level 3D development environment, underwater scene simulation is still difficult since there is no module for it. Based on an analysis of the Vega software and research on seabed scene simulation methods, a Vega extension module named 'Underwater Space' was created by developing a module class and extending the Lynx interface. The module class was designed by developing a DLL written in C++. Lynx was extended by developing a keyword configuration file, a GUI configuration file, and a Lynx plug-in DLL. The problem that Vega cannot simulate underwater space is preliminarily resolved. The results show that this module is efficient and easy to use, and that the seabed scene images are vivid.
Keywords: scene simulation; seabed; OpenGL; Vega module; circular texture; fog; flicker; turbidity
A Modified Method for Scene Text Detection by ResNet (cited by: 2)
20
Authors: Shaozhang Niu, Xiangxiang Li, Maosen Wang, Yueying Li. Computers, Materials & Continua (SCIE, EI), 2020, No. 12, pp. 2233-2245 (13 pages)
In recent years, images have played an increasingly important role in our daily life and social communication. To some extent, the textual information contained in pictures is an important factor in understanding the content of the scenes themselves. The more accurate the text detection of natural scenes is, the more accurate our semantic understanding of the images will be. Thus, scene text detection has become a hot spot in the domain of computer vision. In this paper, we present a modified text detection network based on further research into and improvement of the Connectionist Text Proposal Network (CTPN) proposed by previous researchers. To extract deeper features that are less affected by different images, we use Residual Network (ResNet) to replace the Visual Geometry Group Network (VGGNet) used in the original network. Meanwhile, to enhance the robustness of the models to multiple languages, we train on the multi-lingual scene text detection and script identification datasets (MLT) of the 2017 International Conference on Document Analysis and Recognition (ICDAR2017). In addition, the attention mechanism is used to obtain a more reasonable weight distribution. We found that the proposed models achieve an F1-score of 0.91 on the ICDAR2011 test set, better than CTPN trained on the same datasets by about 5%.
Keywords: CTPN; scene text detection; ResNet; attention