13,353 articles found
Towards complex scenes: A deep learning-based camouflaged people detection method for snapshot multispectral images
1
Authors: Shu Wang, Dawei Zeng, Yixuan Xu, Gonghan Yang, Feng Huang, Liqiong Chen. Defence Technology (防务技术), SCIE EI CAS CSCD, 2024, Issue 4, pp. 269-281 (13 pages)
Camouflaged people are highly adept at actively concealing themselves by making effective use of cover and the surrounding environment. Despite advances in optical detection capabilities through imaging systems, including spectral, polarization, and infrared technologies, there is still no effective real-time method for accurately detecting small, highly effectively camouflaged people in complex real-world scenes. This study proposes a snapshot multispectral image-based camouflaged people detection model, multispectral YOLO (MS-YOLO), which uses the SPD-Conv and SimAM modules to represent targets effectively and suppress background interference by exploiting spatial-spectral target information. The study also constructs the first real-shot multispectral camouflaged people dataset (MSCPD), which covers diverse scenes, target scales, and attitudes. To minimize information redundancy, MS-YOLO selects as input an optimal subset of 12 bands with strong feature representation and minimal inter-band correlation. In experiments on the MSCPD, MS-YOLO achieves a mean Average Precision of 94.31% and real-time detection at 65 frames per second, which confirms the effectiveness and efficiency of the method in detecting camouflaged people in typical desert and forest scenes. The approach offers valuable support for improving the perception capabilities of unmanned aerial vehicles in detecting enemy forces and rescuing personnel on the battlefield.
Keywords: Camouflaged people detection; Snapshot multispectral imaging; Optimal band selection; MS-YOLO; Complex remote sensing scenes
YOLOv5ST:A Lightweight and Fast Scene Text Detector
2
Authors: Yiwei Liu, Yingnan Zhao, Yi Chen, Zheng Hu, Min Xia. Computers, Materials & Continua, SCIE EI, 2024, Issue 4, pp. 909-926 (18 pages)
Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices such as drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation: replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating a Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work strikes a balance between speed and accuracy in scene text detection, making it well suited to performance-constrained environments.
Keywords: scene text detection; YOLOv5; lightweight; object detection
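The YOLOv5ST abstract above reports mAP at IoU thresholds of 0.5 and 0.5:0.95. As a rough illustration of the IoU quantity behind those thresholds, here is a minimal sketch assuming axis-aligned boxes given as `(x1, y1, x2, y2)`. The function name `box_iou` and the box layout are illustrative assumptions, not taken from the paper.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Under the usual convention, a prediction counts as a match at the 0.5 threshold when its IoU with a ground-truth box is at least 0.5; the 0.5:0.95 figure averages over a range of stricter thresholds.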
Scene 3-D Reconstruction System in Scattering Medium
3
Authors: Zhuoyifan Zhang, Lu Zhang, Liang Wang, Haoming Wu. Computers, Materials & Continua, SCIE EI, 2024, Issue 8, pp. 3405-3420 (16 pages)
Research on neural radiance fields for novel view synthesis has grown explosively with the development of new models and extensions. NeRF (Neural Radiance Fields) algorithms suited to underwater scenes or scattering media are also evolving. Existing underwater 3D reconstruction systems still face challenges such as long training times and low rendering efficiency. This paper proposes an improved underwater 3D reconstruction system to achieve rapid, high-quality 3D reconstruction. First, we enhance underwater videos captured by a monocular camera to correct the image quality degradation caused by the physical properties of the water medium and to keep the enhancement consistent across frames. Then, we perform keyframe selection to optimize resource usage and reduce the impact of dynamic objects on the reconstruction results. After pose estimation using COLMAP, the selected keyframes undergo 3D reconstruction using NeRF based on multi-resolution hash encoding for model construction and rendering. In terms of image enhancement, our method has been optimized for certain scenarios and shows effective enhancement with better continuity between consecutive frames of the same data. In terms of 3D reconstruction, our method achieved a peak signal-to-noise ratio (PSNR) of 18.40 dB and a structural similarity (SSIM) of 0.6677, indicating a good balance between operational efficiency and reconstruction quality.
Keywords: Underwater scene reconstruction; image enhancement; NeRF
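The reconstruction abstract above reports quality as PSNR in dB. PSNR is conventionally computed from the mean squared error between a reference image and a reconstructed one. The sketch below assumes 8-bit pixel values (peak 255) passed as flat sequences of equal length; the function name `psnr_db` and these conventions are assumptions for illustration, since the abstract does not state how the metric was computed.

```python
import math


def psnr_db(reference, rendered, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper: assumes flat sequences of pixel values and a peak of 255.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values mean the rendered image is closer to the reference.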
A Dual Domain Robust Reversible Watermarking Algorithm for Frame Grouping Videos Using Scene Smoothness
4
Authors: Yucheng Liang, Ke Niu, Yingnan Zhang, Yifei Meng. Computers, Materials & Continua, SCIE EI, 2024, Issue 6, pp. 5143-5174 (32 pages)
The proposed robust reversible watermarking algorithm addresses the compatibility challenges between robustness and reversibility in existing video watermarking techniques by leveraging scene smoothness for frame grouping videos. Grounded in the H.264 video coding standard, the algorithm first employs traditional robust watermark stitching technology to embed watermark information in the low-frequency coefficient domain of the U channel. It then uses histogram migration techniques in the high-frequency coefficient domain of the U channel to embed auxiliary information, enabling successful watermark extraction and lossless recovery of the original video content. Experimental results demonstrate the algorithm's strong imperceptibility, with each embedded frame in the experimental videos achieving a mean peak signal-to-noise ratio of 49.3830 dB and a mean structural similarity of 0.9996. Compared with the three comparison algorithms, these two indexes improve by 7.59% and 0.4% on average. The proposed algorithm is also strongly robust to both offline and online attacks. Under offline attacks, the average normalized correlation coefficient between the extracted watermark and the original watermark is 0.9989, and the average bit error rate is 0.0089. Under online attacks, the normalized correlation coefficient between the extracted watermark and the original watermark is 0.8840, and the mean bit error rate is 0.2269. Compared with the three comparison algorithms, these two indexes improve by 1.27% and 18.16% on average, highlighting the algorithm's robustness. Furthermore, the algorithm exhibits low computational complexity, with mean encoding and mean decoding time differentials during experimental video processing of 3.934 and 2.273 s, respectively, underscoring its practical utility.
Keywords: Robust reversible watermarking; scene smoothness; dual-domain; U channel; H.264 encoding standard
A Survey of Crime Scene Investigation Image Retrieval Using Deep Learning
5
Authors: Ying Liu, Aodong Zhou, Jize Xue, Zhijie Xu. Journal of Beijing Institute of Technology, EI CAS, 2024, Issue 4, pp. 271-286 (16 pages)
Crime scene investigation (CSI) images are a key carrier of evidence during criminal investigation, and CSI image retrieval can help the police obtain criminal clues. With the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and datasets provide effective support for CSI retrieval performance in this process. However, there is a lack of systematic research on CSI image retrieval methods and datasets. We therefore present an overview of existing work on one-class and multi-class CSI image retrieval based on deep learning. Based on technical functionality and implementation method, CSI image retrieval is roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Keywords: crime scene investigation (CSI) image; image retrieval; deep learning
Intelligent Sensing and Control of Road Construction Robot Scenes Based on Road Construction
6
Authors: Zhongping Chen, Weigong Zhang. Structural Durability & Health Monitoring, EI, 2024, Issue 2, pp. 111-124 (14 pages)
Automatic control technology is the basis for improving road construction robots. According to the characteristics and functions of the construction equipment, the research takes positioning acquisition and real-world monitoring as input-type perception. The process uses RTK-GNSS positional perception technology: positions are projected with the Gauss-Krueger projection method and then converted to Cartesian coordinates based on the characteristics of the drawing. The steering control system is the core of the electric drive unmanned module. Based on an analysis of the composition of the steering system of unmanned engineering vehicles, models of the key steering components, such as the direction and torque sensors and the drive motor, are established, a joint simulation model of the unmanned engineering vehicle is built, and the steering controller is designed using the PID method. The simulation results show that the control method can meet the construction path demand for automatic steering. Path planning first defines the construction area with preset values and corrects the steering angle during driving with the PID algorithm to realize construction-based path planning. The results show that the method keeps the straight-path error within 10 cm and the curve error within 20 cm. With the collaboration of the various modules, the automatic construction simulation results of this robot show that the designed path and control method are effective.
Keywords: scene perception; remote control technology; Cartesian coordinate system; construction robot; highway construction
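The abstract above describes a steering controller designed with the PID method. As a generic illustration of that technique, here is a minimal sketch of a discrete PID update. The class name, the gains, and the choice of error signal are all hypothetical; the paper's actual controller structure and tuning are not given in the abstract.

```python
class PID:
    """Textbook discrete PID controller (illustrative sketch, not the paper's design)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first step

    def step(self, error, dt):
        """Return the control output for the current error and time step dt (seconds)."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a steering context, `error` could be the lateral or heading deviation from the planned path, and the output a steering angle correction; that mapping is an assumption here.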
Ground target localization of unmanned aerial vehicle based on scene matching
7
Authors: ZHANG Yan, CHEN Yukun, HUANG He, TANG Simi, LI Zhi. High Technology Letters, EI CAS, 2024, Issue 3, pp. 231-243 (13 pages)
To improve the target localization precision, accuracy, execution efficiency, and application range of unmanned aerial vehicles (UAVs) based on scene matching, a ground target localization method for unmanned aerial vehicles based on scene matching (GTLUAVSM) is proposed. The approach first completes scene matching through a feature matching algorithm. Then, multi-sensor registration is optimized by robust estimation based on homologous registration. Finally, basemap generation and model solution are used to improve basemap correspondence and accomplish aerial image positioning. Theoretical evidence and experimental verification demonstrate that GTLUAVSM can improve localization accuracy, speed, and precision while minimizing reliance on task equipment.
Keywords: scene matching; basemap adjustment; feature registration; random sample consensus (RANSAC); unmanned aerial vehicle (UAV)
Analyzing the Impact of Scene Transitions on Indoor Camera Localization through Scene Change Detection in Real-Time
8
Authors: Muhammad S. Alam, Farhan B. Mohamed, Ali Selamat, Faruk Ahmed, AKM B. Hossain. Intelligent Automation & Soft Computing, 2024, Issue 3, pp. 417-436 (20 pages)
Real-time indoor camera localization is a significant problem in indoor robot navigation and surveillance systems. The scene can change during the image sequence, and such changes play a vital role in the localization performance of robotic applications in terms of accuracy and speed. This research proposes a real-time indoor camera localization system based on a recurrent neural network that detects scene changes during the image sequence. The system is trained on an annotated image dataset and predicts the camera pose in real time. It improves the localization performance of indoor cameras mainly by predicting the camera pose more accurately. It also recognizes scene changes during the sequence and evaluates their effects. The system achieved high accuracy and real-time performance. Scene change detection was performed using visual rhythm and the proposed recurrent deep architecture, which also performed camera pose prediction and scene change impact evaluation. Overall, this study proposes a novel real-time localization system for indoor cameras that detects scene changes and shows how they affect localization performance.
Keywords: Camera pose estimation; indoor camera localization; real-time localization; scene change detection; simultaneous localization and mapping (SLAM)
The Fusion of Temporal Sequence with Scene Priori Information in Deep Learning Object Recognition
9
Authors: Yongkang Cao, Fengjun Liu, Xian Wang, Wenyun Wang, Zhaoxin Peng. Open Journal of Applied Sciences, 2024, Issue 9, pp. 2610-2627 (18 pages)
In some important object recognition applications, such as intelligent robots and unmanned driving, images are collected consecutively and are associated with one another, and the scenes have steady prior features. Yet existing technologies do not take full advantage of this information. To take object recognition in these applications further than existing algorithms, an object recognition method that fuses temporal sequence with scene priori information is proposed. The method first employs YOLOv3 as the basic algorithm to recognize objects in single-frame images, then uses the DeepSort algorithm to associate potential objects recognized in images from different moments, and finally applies the confidence fusion method and temporal boundary processing method designed herein to fuse temporal sequence information with scene priori information at the decision level. Experiments using public datasets and self-built industrial scene datasets show that, because the information sources are expanded, the quality of single-frame images has less impact on the recognition results, so object recognition is greatly improved. The method is presented as a widely applicable framework for fusing information under multiple classes. Any object recognition algorithm that outputs object class, location information, and recognition confidence at the same time can be integrated into this information fusion framework to improve performance.
Keywords: Computer Vision; Object Recognition; Deep Learning; Consecutive Scene; Information Fusion
Research on 3D Digital Watershed Modeling Based on ArcScene (cited by: 3)
10
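Authors: 李鑫龙, 杨东旭, 王璐, 季月, 林浩, 吕国辉. 安徽农业科学 (Journal of Anhui Agricultural Sciences), CAS, 2015, Issue 22, pp. 363-365 (3 pages)
A 3D digital watershed gives an intuitive display of information about the geographic, natural, and ecological environment surrounding a watershed, and it plays an important supporting role in economic development and resource use within the watershed. This study proposes a 3D digital watershed modeling method based on ArcScene. High-resolution DOM imagery and DEM data obtained by aerial photogrammetry are overlaid and analyzed to build the digital watershed model and to preview it as a multi-view 3D fly-through animation. The method allows fast, intuitive preview and analysis of the terrain, landforms, and human features of the watershed, and it provides important technical support for the design, construction, and renovation of water conservancy, transportation, and public facilities in the watershed.
Keywords: ArcScene; laser point cloud; DOM; DEM; 3D watershed modeling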
Eye movements during inspecting pictures of natural scenes for information to verify sentences
11
Authors: 陈庆荣, 蒋志杰. Journal of Southeast University (English Edition), EI CAS, 2010, Issue 3, pp. 444-447 (4 pages)
As eye tracking can record moment-to-moment changes in eye movements as people inspect pictures of natural scenes and comprehend information, this paper uses eye-movement technology to investigate how the order of presentation and the characteristics of information affect the semantic mismatch effect in the picture-sentence paradigm. A 3 (syntax) × 2 (semantic relation) factorial design is adopted, with syntax and semantic relation as within-participant variables. The experiment finds that semantic mismatch is most likely to increase cognitive load, as people spend more time on it, including first-pass time, regression path duration, and total fixation duration. Double negation does not significantly increase the processing difficulty of pictures and information. Experimental results show that people can retrieve a special syntactic strategy from long-term memory to process pictures and sentences with different semantic relations, which enables readers to comprehend double negation as affirmation. These results demonstrate that the constituent comparison model may not be a general model that extends to other languages.
Keywords: natural scene; semantic mismatch; double negation; eye movement
Suitable Region for Flue-cured Tobacco (Nicotiana glauca L.) Planting Based on Spatial Scene Similarity
12
Authors: 董钧祥, 郭旦怀, 邵小东. Agricultural Science & Technology, CAS, 2012, Issue 9, pp. 1947-1949, 1981 (4 pages)
[Objective] The aim was to establish a model based on spatial scene similarity that considers soil, slope, transport, water conservancy, light, and socio-economic factors in suitable planting areas. A new suitable planting area for flue-cured tobacco was determined by comparison with, and analysis of, an excellent existing area. [Method] A total of thirty natural factors, classified into nine categories, were chosen from Longpeng Town (LP) and Shaochong Town (SC) in Shiping County, Honghe Hani and Yi Autonomous Prefecture. [Result] Ranked by weight from high to low, the factors were: soil > light > elevation > slope > water conservancy > transport > baking facility > planting plans over the years > others. The similarity of geographical conditions between the areas was 0.8943, which indicated that the planting conditions in the two regions are similar. When farmer population per unit area, farmland quantity per farmer, labor force per household, farmers' activity in planting flue-cured tobacco, and the work of local instructors were considered, the weights of the factors ranked as follows: farmer population per unit area > farmland quantity per farmer > farmers' activity in planting flue-cured tobacco > educational background > labor force per household > instructor > number of farmers' children attending school. The similarity of geographical conditions was then 0.7031, which indicated that it is non-natural factors that influence the yield and quality of flue-cured tobacco. [Conclusion] By analyzing suitable planting areas for flue-cured tobacco through assessment of spatial scene similarity, the similarity of growing conditions in two spatial scenes can be analyzed and evaluated, which would promote further exploration of the influencing factors and their effects on tobacco production.
Keywords: Similarity of spatial scene; Planting of flue-cured tobacco; Suitable region
Traffic Scene Captioning with Multi-Stage Feature Enhancement
13
Authors: Dehai Zhang, Yu Ma, Qing Liu, Haoxing Wang, Anquan Ren, Jiashu Liang. Computers, Materials & Continua, SCIE EI, 2023, Issue 9, pp. 2901-2920 (20 pages)
Traffic scene captioning technology automatically generates one or more sentences describing a traffic scene by analyzing the content of the input traffic scene images, supporting road safety while providing an important decision-making function for sustainable transportation. To provide a comprehensive and reasonable description of complex traffic scenes, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. In general, the model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during encoding, which enables the model to learn more detailed content in the traffic scene image. Second, a scene knowledge graph is applied during decoding, and the semantic features it provides are used to enhance the features learned by the decoder again, so that the model can learn the attributes of objects in the traffic scene and the relationships between objects and thus generate more reasonable captions. This paper reports extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics. The results show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a score of 129.0 on the CIDEr-D metric, which also indicates that the proposed model can provide a more reasonable and comprehensive description of the traffic scene.
Keywords: Traffic scene captioning; sustainable transportation; feature enhancement; encoder-decoder structure; multi-level granularity; scene knowledge graph
3D scene graph prediction from point clouds
14
Authors: Fanfan WU, Feihu YAN, Weimin SHI, Zhong ZHOU. Virtual Reality & Intelligent Hardware, EI, 2022, Issue 1, pp. 76-88 (13 pages)
Background: In this study, we propose a novel 3D scene graph prediction approach for scene understanding from point clouds. Methods: The approach automatically organizes the entities of a scene in a graph, where objects are nodes and their relationships are modeled as edges. More specifically, we employ the DGCNN to capture the features of objects and their relationships in the scene. A Graph Attention Network (GAT) is introduced to exploit latent features obtained from the initial estimation to further refine the object arrangement in the graph structure. A loss function modified from cross entropy with a variable weight is proposed to solve the multi-category problem in the prediction of objects and predicates. Results: Experiments reveal that the proposed approach performs favorably against state-of-the-art methods in terms of predicate classification and relationship prediction and achieves comparable performance on object classification prediction. Conclusions: The 3D scene graph prediction approach can form an abstract description of the scene space from point clouds.
Keywords: scene understanding; 3D scene graph; Point cloud; DGCNN; GAT
Preliminary Study on Program Setting Path of Park City Street Scene Construction
15
Authors: YANG Maomao, LI Yali, WU Yulan, YAN Qi, LIU Shiliang. Journal of Landscape Research, 2022, Issue 5, pp. 88-92, 96 (6 pages)
In the context of park city construction, the urban street space system and scene construction are the most important forms in which the value transformation framework is presented. Most outdoor activities of urban residents take place in urban street space, which constitutes various scenes. Scene construction includes not only the material space as the carrier, but also the behaviors and activities of the participants, the time and path of "program setting", and the opening and closing of events. Scene construction is an effective approach explored during the construction of the park city demonstration area, which is currently practicing the new development concept. However, there have been no reports on the specific theories, methods, processes, and systems of scene construction, and in particular there is no summary of patterns or induction of methods for program setting in scene construction. Therefore, from the perspective of building a park city demonstration area that implements the new development concept, the paper discusses the concept, connotation, process, and modularity of program setting in street space scene construction, in order to provide a theoretical basis for the scene construction and "value transformation" of the park city street space system.
Keywords: Park city; Street space; Program setting; scene construction; scene module
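Ecological Landscape Development Based on ArcScene and SketchUp Interactive Modeling: A Case Study of the Guanju Riverside in Miaoqian Town, Dangyang City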
16
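Authors: 冯德鸿, 雷奥林, 池晓霞. 测绘与空间地理信息 (Geomatics & Spatial Information Technology), 2022, Issue 8, pp. 57-59 (3 pages)
Taking the Guanju riverside in Miaoqian Town, Dangyang City, as an example, this paper focuses on interactive modeling between ArcScene and SketchUp through two techniques: one based on the SketchUp Esri plug-in and one based on the 3D Editor in ArcScene. It compares the advantages, disadvantages, and scope of application of the two techniques. It also uses point features from aerial terrain data with elevation values to simulate the terrain of the riverside, then refines the scene to complete the development of the ecological landscape scene.
Keywords: ArcScene; SketchUp; interactive modeling; terrain simulation; scene refinement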
Constructing Multiple Scene Graphs in Distributed Environment
17
Authors: XIA Rui, WANG Guo-ping, LI Sheng, WANG Heng. Computer Aided Drafting, Design and Manufacturing, 2015, Issue 1, pp. 16-21 (6 pages)
A scene graph is an infrastructure of a virtual reality system that organizes the virtual scene with abstraction. It supports the rendering engine and should be integrated effectively, on demand, into a real-time system in which large quantities of scene objects and resources can be manipulated and managed with high flexibility and reliability. We present a new scheme of multiple scene graphs that accommodates the features of the rendering engine and of distributed systems. On that basis, other functions, such as block query, interactive editing, permission management, instance response, "redo", and "undo", are implemented to satisfy various requirements. The design is also compatible with the popular C/S architecture and has good concurrent performance. Above all, it is convenient to use for further development. Experimental results, including response time, demonstrate its good performance.
Keywords: multiple scene graphs; authority; scene editing; distributed systems
Deep Scalogram Representations for Acoustic Scene Classification (cited by: 5)
18
Authors: Zhao Ren, Kun Qian, Zixing Zhang, Vedhas Pandit, Alice Baird, Bjorn Schuller. IEEE/CAA Journal of Automatica Sinica, SCIE EI CSCD, 2018, Issue 3, pp. 662-669 (8 pages)
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach first transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; second, the spectrograms or scalograms are sent into pre-trained convolutional neural networks; third, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% is obtained from bidirectional gated recurrent neural networks when fusing the spectrogram and the bump scalogram, an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms can improve classification accuracy when fused with a spectrogram-based system.
Keywords: Acoustic scene classification (ASC); (bidirectional) gated recurrent neural networks ((B) GRNNs); convolutional neural networks (CNNs); deep scalogram representation; spectrogram representation
Semantic segmentation method of road scene based on Deeplabv3+ and attention mechanism (cited by: 6)
19
Authors: BAI Yanqiong, ZHENG Yufu, TIAN Hong. Journal of Measurement Science and Instrumentation, CAS CSCD, 2021, Issue 4, pp. 412-422 (11 pages)
In the study of automatic driving, understanding the road scene is key to improving driving safety. Semantic segmentation divides the image at the pixel level into areas associated with semantic categories, helping vehicles perceive and obtain information about the surrounding road environment, which improves driving safety. Deeplabv3+ is a currently popular semantic segmentation model. In its semantic segmentation tasks, small targets are missed and similar objects are easily misjudged, which leads to rough segmentation boundaries and reduces semantic accuracy. To address this issue, this study builds on the Deeplabv3+ network structure and combines it with the attention mechanism to increase the weight of the segmentation area, and proposes a road scene semantic segmentation method that fuses an improved Deeplabv3+ with the attention mechanism. First, a group of parallel position attention and channel attention modules is introduced at the Deeplabv3+ encoding end to capture more spatial context information and high-level semantic information. Then, at the decoding end, an attention mechanism is introduced to restore spatial detail information, and the data are normalized to accelerate the convergence of the model. The segmentation effects of models with different attention-introducing mechanisms are compared and tested on the CamVid and Cityscapes datasets. The experimental results show that the mean Intersection over Union of the improved model on the two datasets is boosted by 6.88% and 2.58%, respectively, which is better than using Deeplabv3+. The method does not significantly increase the amount of network computation or complexity, and it has a good balance of speed and accuracy.
Keywords: autonomous driving; road scene; semantic segmentation; Deeplabv3+; attention mechanism
Advanced Feature Fusion Algorithm Based on Multiple Convolutional Neural Network for Scene Recognition (cited by: 5)
20
Authors: Lei Chen, Kanghu Bo, Feifei Lee, Qiu Chen. Computer Modeling in Engineering & Sciences, SCIE EI, 2020, Issue 2, pp. 505-523 (19 pages)
Scene recognition is a popular open problem in the computer vision field. Among the many methods proposed in recent years, Convolutional Neural Network (CNN) based approaches achieve the best performance in scene recognition. In this paper, we propose an advanced feature fusion algorithm using Multiple Convolutional Neural Network (Multi-CNN) for scene recognition. Unlike existing works that usually use an individual convolutional neural network, a fusion of multiple different convolutional neural networks is applied to scene recognition. First, we split the training images in two directions and feed them to three deep CNN models, and then extract features from the last fully connected (FC) layer and the probabilistic layer of each model. Finally, the feature vectors are fused in groups with different fusion strategies and forwarded into a SoftMax classifier. The proposed algorithm is evaluated on three scene datasets for scene recognition. The experimental results demonstrate the effectiveness of the proposed algorithm compared with other state-of-the-art approaches.
Keywords: scene recognition; deep feature fusion; multiple convolutional neural network