Journal Articles
16 articles found
1. A Concise and Varied Visual Features-Based Image Captioning Model with Visual Selection
Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Naveed Ahmed, Mohammed Ali Alshara
Journal: Computers, Materials & Continua (SCIE, EI), 2024, No. 11, pp. 2873-2894 (22 pages)
Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, providing image captioning models with the capability of selecting the most relevant visual features from the input image and attending to them can significantly improve the utilization of these features. Consequently, this leads to enhanced captioning network performance. In light of this, we present an image captioning framework that efficiently exploits the extracted representations of the image. Our framework comprises three key components: the Visual Feature Detector module (VFD), the Visual Feature Visual Attention module (VFVA), and the language model. The VFD module is responsible for detecting a subset of the most pertinent features from the local visual features, creating an updated visual features matrix. Subsequently, the VFVA directs its attention to the visual features matrix generated by the VFD, resulting in an updated context vector employed by the language model to generate an informative description. Integrating the VFD and VFVA modules introduces an additional layer of processing for the visual features, thereby contributing to enhancing the image captioning model's performance. Using the MS-COCO dataset, our experiments show that the proposed framework competes well with state-of-the-art methods, effectively leveraging visual representations to improve performance. The implementation code can be found here: https://github.com/althobhani/VFDICM (accessed on 30 July 2024).
Keywords: visual attention; image captioning; visual feature detector; visual feature visual attention
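To make the select-then-attend mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract. It is an illustrative approximation, not the authors' released implementation: the module names, tensor dimensions, and the learned top-k scoring rule are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureSelector(nn.Module):
    """Scores local visual features and keeps the top-k most relevant ones,
    in the spirit of the VFD module described in the abstract."""
    def __init__(self, feat_dim, k):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, 1)   # learned relevance score per region
        self.k = k

    def forward(self, feats):                       # feats: (B, N, D)
        scores = self.scorer(feats).squeeze(-1)     # (B, N)
        topk = scores.topk(self.k, dim=1).indices   # (B, k)
        idx = topk.unsqueeze(-1).expand(-1, -1, feats.size(-1))
        return feats.gather(1, idx)                 # (B, k, D) selected features

class VisualAttention(nn.Module):
    """Additive attention over the selected features, conditioned on the
    language model's hidden state, producing a context vector."""
    def __init__(self, feat_dim, hid_dim, att_dim):
        super().__init__()
        self.w_v = nn.Linear(feat_dim, att_dim)
        self.w_h = nn.Linear(hid_dim, att_dim)
        self.w_a = nn.Linear(att_dim, 1)

    def forward(self, feats, hidden):               # feats: (B, k, D), hidden: (B, H)
        e = self.w_a(torch.tanh(self.w_v(feats) + self.w_h(hidden).unsqueeze(1)))
        alpha = F.softmax(e, dim=1)                 # attention weights (B, k, 1)
        return (alpha * feats).sum(dim=1)           # context vector (B, D)

# Toy usage with random tensors: 36 region features per image, 512-d each.
feats = torch.randn(2, 36, 512)
hidden = torch.randn(2, 256)
selected = FeatureSelector(512, k=10)(feats)
context = VisualAttention(512, 256, 128)(selected, hidden)
print(context.shape)   # torch.Size([2, 512])
```

In a full captioning model, the selected subset would feed the attention at every decoding step of the language model.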
2. An investigation of the visual features of urban street vitality using a convolutional neural network (Cited: 2)
Authors: Yi Qi, Sonam Chodron Drolma, Xiang Zhang, Jing Liang, Haibing Jiang, Jiangang Xu, Tianhua Ni
Journal: Geo-Spatial Information Science (SCIE, CSCD), 2020, No. 4, pp. 341-351 (11 pages)
As a well-known urban landscape concept used to describe urban space quality, urban street vitality is a subjective human perception of the urban environment that is difficult to evaluate directly from the physical space. This study utilized a modern machine learning computer vision algorithm in the urban built environment to simulate the process that starts with the visual perception of the urban street landscape and ends with the human reaction to street vitality. By analyzing the optimized trained model, we tried to identify the visual features of urban street vitality and evaluate their importance. A region around Mochou Lake in Nanjing, China, was set as our study area. Seven investigators surveyed the area and recorded their evaluation scores of each site's vitality level, together with a corresponding picture taken on site. A total of 370 picture-score pairs from 231 valid survey sites were used to train a convolutional neural network. After optimization, a deep neural network model with 43 layers, including 11 convolutional ones, was created. Heat maps were then used to identify the features that lead to high vitality score outputs. The spatial distributions of different types of feature entities were also analyzed to help identify the spatial effects. The study found that visual features including humans, construction sites, shop fronts, and roadside/walking pavements are vital ones that correspond to the vitality of the urban street. The consistency of these critical features with traditional urban vitality features indicates that the model had learned useful knowledge from the training process. Applying the trained model in urban planning practice can help to improve the city environment so that it better attracts residents' activities and communication.
Keywords: Urban street vitality; visual feature; convolutional neural network; Nanjing; China
3. Visual learning graph convolution for multi-grained orange quality grading (Cited: 1)
Authors: GUAN Zhi-bin, ZHANG Yan-qi, CHAI Xiu-juan, CHAI Xin, ZHANG Ning, ZHANG Jian-hua, SUN Tan
Journal: Journal of Integrative Agriculture (SCIE, CAS, CSCD), 2023, No. 1, pp. 279-291 (13 pages)
The quality of oranges is grounded on their appearance and diameter. Appearance refers to the skin's smoothness and surface cleanliness; diameter refers to the transverse diameter size. They are visual attributes that visual perception technologies can automatically identify. Nonetheless, current orange quality assessment needs to address two issues: 1) there are no image datasets for orange quality grading; 2) it is challenging to effectively learn the fine-grained and distinct visual semantics of oranges from diverse angles. This study collected 12,522 images from 2,087 oranges for multi-grained grading tasks. In addition, it presented a visual learning graph convolution approach for multi-grained orange quality grading, including a backbone network and a graph convolutional network (GCN). The backbone network's object detection, data augmentation, and feature extraction can remove extraneous visual information. The GCN was utilized to learn the topological semantics of orange feature maps. Finally, evaluation results proved that the recognition accuracies for diameter size, appearance, and fine-grained orange quality were 99.50%, 97.27%, and 97.99%, respectively, indicating that the proposed approach is superior to others.
Keywords: GCN; multi-view; fine-grained; visual feature; appearance; diameter size
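The graph-convolution step over region features can be illustrated with a short, hedged PyTorch sketch. The normalized-adjacency propagation below is a standard GCN layer; the node features, graph topology, and dimensions are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step, X' = ReLU(D^-1/2 (A + I) D^-1/2 X W), applied to
    nodes built from feature-map regions; an illustrative stand-in for the GCN
    branch mentioned in the abstract."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):                       # x: (N, in_dim), adj: (N, N)
        a_hat = adj + torch.eye(adj.size(0))         # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)      # D^-1/2 from node degrees
        norm_adj = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(norm_adj @ self.weight(x))

# Toy graph: 6 region nodes with 128-d backbone features, fully connected topology.
x = torch.randn(6, 128)
adj = torch.ones(6, 6) - torch.eye(6)
out = SimpleGCNLayer(128, 64)(x, adj)
print(out.shape)   # torch.Size([6, 64])
```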
4. Video-Based Deception Detection with Non-Contact Heart Rate Monitoring and Multi-Modal Feature Selection
Authors: Yanfeng Li, Jincheng Bian, Yiqun Gao, Rencheng Song
Journal: Journal of Beijing Institute of Technology (EI, CAS), 2024, No. 3, pp. 175-185 (11 pages)
Deception detection plays a crucial role in criminal investigation. Videos contain a wealth of information regarding apparent and physiological changes in individuals, and thus can serve as an effective means of deception detection. In this paper, we investigate video-based deception detection considering both apparent visual features, such as eye gaze, head pose and facial action units (AU), and non-contact heart rate detected by the remote photoplethysmography (rPPG) technique. Multiple wrapper-based feature selection methods combined with the K-nearest neighbor (KNN) and support vector machine (SVM) classifiers are employed to screen the most effective features for deception detection. We evaluate the performance of the proposed method on both a self-collected physiological-assisted visual deception detection (PV3D) dataset and a public Bag-of-Lies (BOL) dataset. Experimental results demonstrate that the SVM classifier with symbiotic organisms search (SOS) feature selection yields the best overall performance, with an area under the curve (AUC) of 83.27% and accuracy (ACC) of 83.33% for PV3D, and an AUC of 71.18% and ACC of 70.33% for BOL. This demonstrates the stability and effectiveness of the proposed method in video-based deception detection tasks.
Keywords: deception detection; apparent visual features; remote photoplethysmography; non-contact heart rate; feature selection
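A wrapper-based feature selection loop of the kind described here can be sketched with scikit-learn. The sketch below uses forward sequential selection around an SVM as a stand-in for the paper's symbiotic organisms search; the synthetic data and feature counts are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: rows are video samples, columns are concatenated gaze, head-pose,
# AU and rPPG heart-rate features (all synthetic here).
X, y = make_classification(n_samples=200, n_features=40, n_informative=8, random_state=0)

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Wrapper selection: candidate subsets are scored by cross-validating the classifier.
selector = SequentialFeatureSelector(svm, n_features_to_select=10,
                                     direction="forward", cv=5)
selector.fit(X, y)

X_sel = selector.transform(X)
acc = cross_val_score(svm, X_sel, y, cv=5).mean()
print("selected feature indices:", np.flatnonzero(selector.get_support()))
print("cross-validated accuracy: %.3f" % acc)
```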
5. Video Concept Detection Based on Multiple Features and Classifiers Fusion (Cited: 1)
Authors: Dong Yuan, Zhang Jiwei, Zhao Nan, Chang Xiaofu, Liu Wei
Journal: China Communications (SCIE, CSCD), 2012, No. 8, pp. 105-121 (17 pages)
The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the problem of the semantic gap: low-level features extracted by computers always fail to coincide with high-level concepts interpreted by humans. In this paper, we present a generic scheme for the detection of video semantic concepts based on machine learning over multiple visual features. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. Then we combine the different features and sub-systems with both classifier-level and kernel-level fusion, which contributes to a more robust system. Our proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which proves that our concept detection engine develops a generic model and performs well on both object and scene type concepts.
Keywords: concept detection; visual feature extraction; kernel-based learning; classifier fusion
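Classifier-level (late) fusion of per-feature kernel classifiers can be illustrated with a brief scikit-learn sketch. The two feature blocks, the RBF kernels, and the averaging rule are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 60 columns split into two "feature blocks" (say, a global
# colour descriptor and a local texture descriptor) for one semantic concept.
X, y = make_classification(n_samples=400, n_features=60, n_informative=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
blocks = [slice(0, 30), slice(30, 60)]

# One kernel classifier per feature block; classifier-level (late) fusion then
# averages the per-concept confidence scores.
scores = []
for blk in blocks:
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    clf.fit(X_tr[:, blk], y_tr)
    scores.append(clf.predict_proba(X_te[:, blk])[:, 1])

fused = np.mean(scores, axis=0)
print("fused average precision: %.3f" % average_precision_score(y_te, fused))
```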
6. Hard exudates referral system in eye fundus utilizing speeded up robust features (Cited: 1)
Authors: Syed Ali Gohar Naqvi, Hafiz Muhammad Faisal Zafar, Ihsanul Haq
Journal: International Journal of Ophthalmology (English edition) (SCIE, CAS), 2017, No. 7, pp. 1171-1174 (4 pages)
In this paper, a referral system to assist medical experts in the screening/referral of diabetic retinopathy is suggested. The system has been developed through the sequential use of different existing mathematical techniques: speeded up robust features (SURF), K-means clustering and visual dictionaries (VD). Three databases are mixed to test the working of the system when the sources are dissimilar. When experiments were performed, an area under the curve (AUC) of 0.9343 was attained. The results acquired from the system are promising.
Keywords: referral system; speeded up robust features; eye fundus; visual dictionaries
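The SURF + K-means + visual dictionary pipeline is essentially a bag-of-visual-words representation. The sketch below illustrates that pipeline with OpenCV and scikit-learn; ORB stands in for SURF (which requires the opencv-contrib xfeatures2d module), and the file names and dictionary size are hypothetical.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def local_descriptors(img_path, detector):
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    _, desc = detector.detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 32))

# ORB stands in for SURF here; SURF itself lives in opencv-contrib's xfeatures2d.
detector = cv2.ORB_create(nfeatures=500)
train_paths = ["fundus_001.jpg", "fundus_002.jpg"]      # hypothetical file names

# 1) Pool descriptors from training images and learn a visual dictionary with K-means.
all_desc = np.vstack([local_descriptors(p, detector) for p in train_paths]).astype(np.float32)
kmeans = KMeans(n_clusters=64, n_init=10, random_state=0).fit(all_desc)

# 2) Represent an image as a histogram over the dictionary (bag of visual words);
#    this histogram is what a downstream referral classifier would consume.
def bovw_histogram(img_path):
    desc = local_descriptors(img_path, detector).astype(np.float32)
    if len(desc) == 0:
        return np.zeros(kmeans.n_clusters)
    words = kmeans.predict(desc)
    hist, _ = np.histogram(words, bins=np.arange(kmeans.n_clusters + 1))
    return hist / hist.sum()

print(bovw_histogram("fundus_003.jpg"))
```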
7. Webpage Matching Based on Visual Similarity
Authors: Mengmeng Ge, Xiangzhan Yu, Lin Ye, Jiantao Shi
Journal: Computers, Materials & Continua (SCIE, EI), 2022, No. 5, pp. 3393-3405 (13 pages)
With the rapid development of the Internet, the types of webpages are more abundant than in previous decades. However, people are facing more and more significant network security risks and enormous losses caused by phishing webpages, which imitate the interface of real webpages and deceive the victims. To better identify and distinguish phishing webpages, a visual feature extraction method and a visual similarity algorithm are proposed. First, the visual feature extraction method improves the Vision-based Page Segmentation (VIPS) algorithm to extract the visual blocks and calculate their signatures by perceptual hash technology. Second, the visual similarity algorithm establishes a one-to-one correspondence based on the visual blocks' coordinates and thresholds. Then the weights are assigned according to the tree structure, and the similarity of the visual blocks is calculated on the basis of the Hamming distance between their visual features. Further, the visual similarity of webpages is generated by integrating the similarity and weight of different visual blocks. Finally, multiple pairs of phishing webpages and legitimate webpages are evaluated to verify the feasibility of the algorithm. The experimental results achieve excellent performance and demonstrate that our method can achieve 94% accuracy.
Keywords: Web security; visual feature; perceptual hash; visual similarity
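The hash-and-compare step can be illustrated with a small Python sketch. The average hash below is a simple stand-in for the perceptual hash signature described in the abstract, and the screenshot file names are hypothetical.

```python
import numpy as np
from PIL import Image

def average_hash(img_path, hash_size=8):
    """A simple perceptual hash: downscale, grayscale, threshold at the mean
    (a stand-in for the pHash signature of a visual block)."""
    img = Image.open(img_path).convert("L").resize((hash_size, hash_size), Image.LANCZOS)
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming_similarity(hash_a, hash_b):
    """Similarity in [0, 1] derived from the Hamming distance between two hashes."""
    distance = np.count_nonzero(hash_a != hash_b)
    return 1.0 - distance / hash_a.size

# Hypothetical screenshots of two corresponding visual blocks from the compared pages.
score = hamming_similarity(average_hash("block_login_a.png"),
                           average_hash("block_login_b.png"))
print("visual block similarity: %.2f" % score)
```

A full matcher would compute such a score for each pair of corresponding visual blocks and then combine the scores using the tree-structure weights described in the abstract.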
8. On Lemon Defect Recognition with Visual Feature Extraction and Transfers Learning
Authors: Yizhi He, Tiancheng Zhu, Mingxuan Wang, Hanqing Lu
Journal: Journal of Data Analysis and Information Processing, 2021, No. 4, pp. 233-248 (16 pages)
Applying machine learning to lemon defect recognition can improve the efficiency of lemon quality detection. This paper proposes a deep learning-based classification method with visual feature extraction and transfer learning to recognize defect lemons (i.e., green and mold defects). First, data enhancement and brightness compensation techniques are used for data preprocessing. Visual feature extraction is used to quantify the defects and determine the feature variables as the basis for classification. Then we construct a convolutional neural network with an embedded Visual Geometry Group 16-based (VGG16-based) network using transfer learning. The proposed model is compared with many benchmark models such as K-nearest Neighbor (KNN) and Support Vector Machine (SVM). Results show that the proposed model achieves the highest accuracy (95.44%) on the testing data set. The research provides a new solution for lemon defect recognition.
Keywords: machine learning; visual feature extraction; convolutional neural networks; transfer learning
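A VGG16-based transfer-learning classifier of the kind described can be sketched in a few lines of Keras. This is an illustrative setup, not the paper's network: the frozen backbone, the small dense head, the three class labels (good / green defect / mold defect) and the "lemon_dataset/train" directory are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Frozen VGG16 backbone pretrained on ImageNet, plus a small classification head
# for the three lemon classes assumed here (good / green defect / mold defect).
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Hypothetical directory of lemon images organized in one sub-folder per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "lemon_dataset/train", image_size=(224, 224), batch_size=32)
train_ds = train_ds.map(lambda x, y: (tf.keras.applications.vgg16.preprocess_input(x), y))
model.fit(train_ds, epochs=5)
```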
9. SPATIAL TRAJECTORY PREDICTION OF VISUAL SERVOING
Authors: Wang Gang, Qi Hui
Journal: Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2003, No. 1, pp. 7-9, 12 (4 pages)
Target tracking is one typical application of visual servoing technology. It is still a difficult task to track a high-speed target with current visual servo systems, and improvement of the visual servoing scheme is strongly required. A position-based visual servo parallel system is presented for tracking a target at high speed. A local Frenet frame is assigned to each sampling point of the spatial trajectory. Position estimation is formed from the differential features of intrinsic geometry, and orientation estimation is formed by homogeneous transformation. The time spent on searching and processing can be greatly reduced by shifting the window according to feature location prediction. The simulation results have demonstrated the ability of the system to track a spatially moving object.
Keywords: Robot visual servo; Pose estimation; Feature location prediction; Target tracking
10. Visual-feature-assisted mobile robot localization in a long corridor environment (Cited: 1)
Authors: Gengyu GE, Yi ZHANG, Wei WANG, Lihe HU, Yang WANG, Qin JIANG
Journal: Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2023, No. 6, pp. 876-889 (14 pages)
Localization plays a vital role in the mobile robot navigation system and is a fundamental capability for autonomous movement. In an indoor environment, the current mainstream localization scheme uses two-dimensional (2D) laser light detection and ranging (LiDAR) to build an occupancy grid map with simultaneous localization and mapping (SLAM) technology; it then locates the robot based on the known grid map. However, such solutions work effectively only in those areas with salient geometrical features. For areas with repeated, symmetrical, or similar structures, such as a long corridor, the conventional particle filtering method will fail. To solve this crucial problem, this paper presents a novel coarse-to-fine paradigm that uses visual features to assist mobile robot localization in a long corridor. First, the mobile robot is remote-controlled to move from the starting position to the end along a middle line. In the moving process, a grid map is built using the laser-based SLAM method. At the same time, a visual map consisting of special images, which are keyframes, is created according to a keyframe selection strategy. The keyframes are associated with the robot's poses through timestamps. Second, a moving strategy is proposed, based on the extracted range features of the laser scans, to decide on an initial rough position. This is vital for the mobile robot because it gives instructions on where the robot needs to move to adjust its pose. Third, the mobile robot captures images in a proper perspective according to the moving strategy and matches them with the image map to achieve a coarse localization. Finally, an improved particle filtering method is presented to achieve fine localization. Experimental results show that our method is effective and robust for global localization. The localization success rate reaches 98.8% while the average moving distance is only 0.31 m. In addition, the method works well when the mobile robot is kidnapped to another position in the corridor.
Keywords: Mobile robot; Localization; Simultaneous localization and mapping (SLAM); Corridor environment; Particle filter; Visual features
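The coarse-to-fine idea of letting visual keyframes disambiguate a corridor before particle-filter refinement can be shown with a 1-D toy sketch in Python. Everything below (keyframe poses, corridor length, motion and sensor noise) is an illustrative assumption, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D toy: a visual keyframe map (poses recorded during the SLAM stage) disambiguates
# a 20 m corridor, and a particle filter refines the robot's position.
keyframe_positions = np.array([2.0, 6.0, 10.0, 14.0, 18.0])   # hypothetical keyframe poses
true_pos, n_particles = 7.3, 500
particles = rng.uniform(0.0, 20.0, n_particles)
weights = np.full(n_particles, 1.0 / n_particles)

for step in range(15):
    # Motion update: the robot advances ~0.3 m per step with odometry noise.
    true_pos += 0.3
    particles += 0.3 + rng.normal(0.0, 0.05, n_particles)

    # "Visual" measurement: image retrieval identifies the best-matching keyframe,
    # and the offset to that keyframe is observed with noise.
    kf = int(np.argmin(np.abs(keyframe_positions - true_pos)))
    z = true_pos - keyframe_positions[kf] + rng.normal(0.0, 0.1)
    expected = particles - keyframe_positions[kf]
    weights *= np.exp(-0.5 * ((z - expected) / 0.2) ** 2)
    weights /= weights.sum()

    # Resampling keeps the particle set focused on likely positions.
    idx = rng.choice(n_particles, n_particles, p=weights)
    particles, weights = particles[idx], np.full(n_particles, 1.0 / n_particles)

print("true: %.2f m  estimate: %.2f m" % (true_pos, particles.mean()))
```

The toy only mirrors the measurement-update role of the visual map; in the paper the keyframe match also drives the moving strategy of the coarse stage.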
11. Historical Arabic Images Classification and Retrieval Using Siamese Deep Learning Model
Authors: Manal M. Khayyat, Lamiaa A. Elrefaei, Mashael M. Khayyat
Journal: Computers, Materials & Continua (SCIE, EI), 2022, No. 7, pp. 2109-2125 (17 pages)
Classifying the visual features in images to retrieve a specific image is a significant problem within the computer vision field, especially when dealing with historical faded colored images. Thus, there have been many efforts to automate the classification operation and retrieve similar images accurately. To reach this goal, we developed a VGG19 deep convolutional neural network to extract the visual features from the images automatically. Then, the distances among the extracted feature vectors are measured and a similarity score is generated using a Siamese deep neural network. The Siamese model was first built and trained from scratch but did not generate high evaluation metrics; thus, we re-built it from the VGG19 pre-trained deep learning model to generate higher evaluation metrics. Afterward, three different distance metrics combined with the Sigmoid activation function were experimented with, looking for the most accurate method for measuring the similarities among the retrieved images. The highest evaluation parameters were generated using the Cosine distance metric. Moreover, a Graphics Processing Unit (GPU) was utilized to run the code instead of running it on the Central Processing Unit (CPU). This step optimized the execution further since it expedited both the training and the retrieval time efficiently. After extensive experimentation, we reached a satisfactory solution, recording F-scores of 0.98 and 0.99 for the classification and the retrieval, respectively.
Keywords: visual features vectors; deep learning models; distance methods; similar image retrieval
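Extracting VGG19 features and scoring image pairs with a cosine measure, as in the retrieval stage described here, can be sketched briefly with TensorFlow/Keras. The shared ImageNet-pretrained backbone and the raw cosine score stand in for the trained Siamese similarity head, and the image file names are hypothetical.

```python
import numpy as np
import tensorflow as tf

# VGG19 backbone (ImageNet weights) used as a shared feature extractor, as in a
# Siamese setup; the raw cosine score stands in for the trained similarity head.
backbone = tf.keras.applications.VGG19(include_top=False, weights="imagenet",
                                       pooling="avg", input_shape=(224, 224, 3))

def embed(img_path):
    img = tf.keras.utils.load_img(img_path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img)[np.newaxis]
    x = tf.keras.applications.vgg19.preprocess_input(x)
    return backbone.predict(x, verbose=0)[0]        # 512-d feature vector

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical query and archive images.
score = cosine_similarity(embed("query_manuscript.jpg"), embed("archive_0421.jpg"))
print("similarity score: %.3f" % score)
```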
12. Book Retrieval Method Based on QR Code and CBIR Technology
Authors: Qiuyan Wang, Haibing Dong
Journal: Journal on Artificial Intelligence, 2019, No. 2, pp. 101-110 (10 pages)
Applying mature and cutting-edge information technology to library information retrieval is the development trend of library information management. In order to realize the rapid retrieval of massive book information, this paper proposes a book retrieval method combining a QR code with image retrieval technology. This method analyzes the visual features of book images, designs a book image retrieval method based on boundary contour and regional pixel distribution features, and realizes associated retrieval of book information combined with the QR code, so as to improve the efficiency of book retrieval. The experimental results show that books can be retrieved effectively through the boundary contour and regional pixel distribution features, book information can be displayed through the QR code, and readers can be provided with fast and intelligent retrieval services for massive book collections.
Keywords: Book retrieval; image retrieval; QR code; visual features
13. Correlation Analysis of Control Parameters of Flotation Process
Authors: Yanpeng Wu, Xiaoqi Peng, Nur Mohammad
Journal: Journal on Internet of Things, 2019, No. 2, pp. 63-69 (7 pages)
The dosages of five main reagents in the gold-antimony flotation process (Copper Sulfate, Lead Nitrate, Yellow Medicine, No. 2 Oil, and Black Medicine), together with the corresponding visual features of foam images (Stability, Gray Scale, Mean R, Mean G, Mean B, Mean Average, Dimension, and Degree Variance), were recorded. Parameter correlation analysis showed that the correlation among Copper Sulfate, Yellow Medicine, and Black Medicine, as well as the correlation among Gray Scale, Mean R, Mean G, and Mean B, is strong, while the correlation of Dimension with Gray Scale, Mean R, Mean G, and Mean B, as well as the correlation between Stability and each dosing parameter, is weak. This also indicates a feasible way to decrease the complexity of the flotation control system by reducing some parameters.
Keywords: visual features; flotation process; correlation analysis
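The kind of pairwise correlation screening described here takes only a few lines with pandas. The data below are synthetic stand-ins (two columns are made deliberately correlated with MeanR so the filter has something to report); column names follow the abstract, and the 0.7 threshold is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the logged flotation data: reagent dosages and visual
# features of foam images (values are random, not measurements from the paper).
rng = np.random.default_rng(0)
n = 200
mean_r = rng.normal(130, 12, n)
df = pd.DataFrame({
    "CopperSulfate":  rng.normal(5.0, 1.0, n),
    "LeadNitrate":    rng.normal(3.0, 0.5, n),
    "YellowMedicine": rng.normal(2.0, 0.4, n),
    "GrayScale":      0.8 * mean_r + rng.normal(0, 4, n),   # deliberately correlated
    "MeanR":          mean_r,
    "MeanG":          0.9 * mean_r + rng.normal(0, 5, n),   # deliberately correlated
    "Stability":      rng.normal(0.8, 0.1, n),
})

# Pearson correlation matrix; strongly correlated pairs are candidates for dropping
# when simplifying the control model (each pair appears twice, the matrix is symmetric).
corr = df.corr(method="pearson")
off_diag = corr.abs().where(~np.eye(len(corr), dtype=bool))
strong = off_diag.stack().loc[lambda s: s > 0.7]
print(corr.round(2))
print("strongly correlated pairs:")
print(strong)
```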
14. Robust Local Light Field Synthesis via Occlusion-aware Sampling and Deep Visual Feature Fusion
Authors: Wenpeng Xing, Jie Chen, Yike Guo
Journal: Machine Intelligence Research (EI, CSCD), 2023, No. 3, pp. 408-420 (13 pages)
Novel view synthesis has attracted tremendous research attention recently for its applications in virtual reality and immersive telepresence. Rendering a locally immersive light field (LF) based on arbitrary large baseline RGB references is a challenging problem that lacks efficient solutions with existing novel view synthesis techniques. In this work, we aim at truthfully rendering local immersive novel views/LF images based on large baseline LF captures and a single RGB image in the target view. To fully explore the precious information from source LF captures, we propose a novel occlusion-aware source sampler (OSS) module which efficiently transfers the pixels of source views to the target view's frustum in an occlusion-aware manner. An attention-based deep visual fusion module is proposed to fuse the revealed occluded background content with a preliminary LF into a final refined LF. The proposed source sampling and fusion mechanism not only helps to provide information for occluded regions from varying observation angles, but also proves to be able to effectively enhance the visual rendering quality. Experimental results show that our proposed method is able to render high-quality LF images/novel views with sparse RGB references and outperforms state-of-the-art LF rendering and novel view synthesis methods.
Keywords: Novel view synthesis; light field (LF) imaging; multi-view stereo; occlusion sampling; deep visual feature (DVF) fusion
15. Heterogeneous data-driven aerodynamic modeling based on physical feature embedding (Cited: 1)
Authors: Weiwei ZHANG, Xuhao PENG, Jiaqing KOU, Xu WANG
Journal: Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2024, No. 3, pp. 1-6 (6 pages)
Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, while neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm, called heterogeneous data-driven aerodynamic modeling, is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its strong generalization capability over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. Given the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis has revealed a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which shows the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
Keywords: Transonic flow; Data-driven modeling; Feature embedding; Heterogeneous data; Feature visualization
16. Structured Computational Modeling of Human Visual System for No-reference Image Quality Assessment
Authors: Wen-Han Zhu, Wei Sun, Xiong-Kuo Min, Guang-Tao Zhai, Xiao-Kang Yang
Journal: International Journal of Automation and Computing (EI, CSCD), 2021, No. 2, pp. 204-218 (15 pages)
Objective image quality assessment (IQA) plays an important role in various visual communication systems, as it can automatically and efficiently predict the perceived quality of images. The human eye is the ultimate evaluator of visual experience, thus the modeling of the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. The traditional model based on black-box fitting has low interpretability and is difficult to use to guide experience optimization effectively, while the model based on physiological simulation is hard to integrate into practical visual communication services due to its high computational complexity. To bridge the gap between signal distortion and visual experience, in this paper we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. According to the mechanism of the human brain, we divide visual signal processing into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which conduct pixel information processing, primitive information processing and global image information processing, respectively. Natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that our proposed metric is highly competitive with or outperforms the state-of-the-art NR IQA measures.
Keywords: Image quality assessment (IQA); no-reference (NR); structural computational modeling; human visual system; visual feature extraction
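The final aggregation step, regressing a quality score from the concatenated layer features with support vector regression, can be sketched with scikit-learn. The synthetic features and MOS targets below are stand-ins for the NSS, deep and free-energy features the abstract describes, and the SVR hyperparameters are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in: each row concatenates the NSS-based, deep and free-energy
# features described in the abstract; the target is the subjective quality score (MOS).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
mos = X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=300)

# SVR aggregates the layer features into a single predicted quality score.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
pred = cross_val_predict(model, X, mos, cv=5)

# SROCC, the usual IQA criterion for monotonic agreement with subjective scores.
rho, _ = spearmanr(pred, mos)
print("SROCC: %.3f" % rho)
```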