1,218 articles found
1. Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation
Authors: Shiwen Song, Rui Zhang, Min Hu, Feiyao Huang. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 5243-5271 (29 pages)
Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. Currently, with the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model to pinpoint the key information in the image, resulting in a significant improvement in the model’s performance. In this paper, a dataset for fine-grained recognition of ships based on visible and near-infrared multi-modality remote sensing images is first proposed, named Dataset for Multimodal Fine-grained Recognition of Ships (DMFGRS). It includes 1,635 pairs of visible and near-infrared remote sensing images divided into 20 categories, collated from digital orthophoto models provided by commercial remote sensing satellites. DMFGRS provides two types of annotation format files, as well as segmentation mask images corresponding to the ship targets. Then, a Multimodal Information Cross-Enhancement Network (MICE-Net), which fuses features of visible and near-infrared remote sensing images, is proposed. In the network, a dual-branch feature extraction and fusion module is designed to obtain more expressive features. The Feature Cross Enhancement Module (FCEM) achieves the fusion enhancement of the two modal features by making channel attention and spatial attention work cross-functionally on the feature map. A benchmark is established by evaluating state-of-the-art object recognition algorithms on DMFGRS. In experiments on DMFGRS, MICE-Net reached a precision, recall, mAP@0.5 and mAP@0.5:0.95 of 87%, 77.1%, 83.8% and 63.9%, respectively. Extensive experiments demonstrate that the proposed MICE-Net achieves superior performance on DMFGRS. Built on the lightweight YOLO network, the model has excellent generalizability and thus good potential for application in real-life scenarios.
Keywords: multi-modality dataset; ship recognition; fine-grained recognition; attention mechanism
2. Fine-Grained Action Recognition Based on Temporal Pyramid Excitation Network (cited by: 1)
Authors: Xuan Zhou, Jianping Yi. Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 2103-2116 (14 pages)
Mining more discriminative temporal features to enrich temporal context representation is considered the key to fine-grained action recognition. Previous action recognition methods use a fixed spatiotemporal window to learn local video representations. However, these methods fail to capture complex motion patterns because of their limited receptive field. To solve these problems, this paper proposes a lightweight Temporal Pyramid Excitation (TPE) module to capture short-, medium-, and long-term temporal context. In this method, the Temporal Pyramid (TP) module effectively expands the temporal receptive field of the network by using multi-temporal kernel decomposition without significantly increasing the computational cost. In addition, the Multi Excitation module emphasizes temporal importance to enhance temporal feature representation learning. TPE can be integrated into ResNet50 to build a compact video learning framework, TPENet. Extensive validation experiments on several challenging benchmark datasets (Something-Something V1, Something-Something V2, UCF-101, and HMDB51) demonstrate that the method achieves a favorable balance between computation and accuracy.
Keywords: fine-grained action recognition; temporal pyramid excitation module; temporal receptive; multi-excitation module
3. Attention Guided Food Recognition via Multi-Stage Local Feature Fusion
Authors: Gonghui Deng, Dunzhi Wu, Weizhen Chen. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 1985-2003 (19 pages)
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. To address these complexities, our study introduces a model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model’s capacity for feature extraction. Furthermore, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets shows that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network and also surpass the performance of most contemporary food image recognition methods. These results underscore the efficacy of the proposed model for food image recognition.
Keywords: fine-grained image recognition; food image recognition; attention mechanism; local feature fusion
4. Recognition and rejection of foreign eggs of different colors in Barn Swallows
Authors: Kui Yan, Wei Liang. Avian Research (SCIE, CSCD), 2024, No. 3, pp. 374-378 (5 pages)
Brood parasitic birds lay eggs in the nests of other birds, and the parasitized hosts can reduce the cost of raising unrelated offspring through the recognition of parasitic eggs. Hosts can adopt vision-based cognitive mechanisms to recognize foreign eggs by comparing the colors of foreign and host eggs. However, there is currently no uniform conclusion as to whether this comparison involves single or multiple threshold decision rules. In this study, we tested both hypotheses by adding model eggs of different colors to the nests of Barn Swallows (Hirundo rustica) of two geographical populations breeding in Hainan and Heilongjiang Provinces in China. Results showed that Barn Swallows rejected more white model eggs (moderately mimetic to their own eggs) and blue model eggs (highly non-mimetic eggs with shorter reflectance spectrum) than red model eggs (highly non-mimetic eggs with longer reflectance spectrum). There was no difference in the rejection rate of model eggs between the two populations of Barn Swallows, and clutch size was not a factor affecting egg recognition. Our results are consistent with the single rejection threshold model. This study provides strong experimental evidence that the color of model eggs can have an important effect on egg recognition in Barn Swallows, opening up new avenues to uncover the evolution of cuckoo egg mimicry and explore the cognitive mechanisms underlying the visual recognition of foreign eggs by hosts.
Keywords: Barn Swallow; egg color; Hirundo rustica; multiple rejection threshold; single rejection threshold; visual recognition system
5. Fine-grained Ship Image Recognition Based on BCNN with Inception and AM-Softmax
Authors: Zhilin Zhang, Ting Zhang, Zhaoying Liu, Peijie Zhang, Shanshan Tu, Yujian Li, Muhammad Waqas. Computers, Materials & Continua (SCIE, EI), 2022, No. 10, pp. 1527-1539 (13 pages)
The fine-grained ship image recognition task aims to identify various classes of ships. However, small inter-class differences, large intra-class differences, and a lack of training samples make the task difficult. To enhance the accuracy of fine-grained ship image recognition, we design a recognition network based on a bilinear convolutional neural network (BCNN) with Inception and additive margin Softmax (AM-Softmax). This network improves the BCNN in two respects. First, introducing Inception branches into the BCNN enhances its ability to extract comprehensive features from ships. Second, by adding margin values to the decision boundary, the AM-Softmax function can better extend the inter-class differences and reduce the intra-class differences. In addition, as there are few publicly available datasets for fine-grained ship image recognition, we construct a Ship-43 dataset containing 47,300 ship images belonging to 43 categories. Experimental results on the Ship-43 dataset demonstrate that our method effectively improves the accuracy of ship image recognition, which is 4.08% higher than that of the BCNN model. Moreover, comparison results on three other public fine-grained datasets (Cub, Cars, and Aircraft) further validate the effectiveness of the proposed method.
Keywords: fine-grained ship image recognition; Inception; AM-Softmax; BCNN
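The margin idea in the abstract above can be illustrated with a minimal sketch of an additive-margin softmax loss for a single sample. This is the generic AM-Softmax formulation, not the authors' network; the function name, the scale `s = 30.0` and the margin `m = 0.35` are assumptions chosen for illustration.

```python
import math

def am_softmax_loss(feature, class_weights, label, s=30.0, m=0.35):
    """Additive-margin softmax loss for one sample (illustrative sketch).

    feature:       list of floats, the embedding of the sample
    class_weights: one weight vector per class
    label:         index of the true class
    s, m:          scale and margin (hypothetical defaults)
    """
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    f = unit(feature)
    # Cosine similarity between the normalized feature and each class weight.
    cosines = [sum(a * b for a, b in zip(f, unit(w))) for w in class_weights]
    # Subtract the margin from the true-class cosine only, then scale.
    logits = [s * (c - m) if j == label else s * c
              for j, c in enumerate(cosines)]
    # Numerically stable log-sum-exp for the cross-entropy.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]
```

With `m = 0` this reduces to a scaled cosine softmax; a positive margin makes the same sample incur a larger loss, which pushes training toward wider gaps between classes.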
6. METHODS OF VISUAL RECOGNITION, POSITIONING AND ORIENTATING OF 3D SIMPLE GEOMETRIC WORKPIECE
Authors: 王向军, 王以忠, 叶声华. Transactions of Tianjin University (EI, CAS), 1998, No. 2, pp. 36-40 (5 pages)
Methods of visual recognition, positioning and orientating for simple 3D geometric workpieces are presented in this paper. The principle and operating process of multiple orientation run length coding, based on general orientation run length coding, and the visual recognition method are described in detail. A method of positioning and orientating based on the moment of inertia of the workpiece binary image is also described. It has been applied in research on a flexible automatic coordinate measuring system formed by integrating computer aided design, computer vision and computer aided inspection planning with a coordinate measuring machine. The results show that integrating computer vision with the measurement system is a feasible and effective approach to improving its flexibility and automation.
Keywords: automatic measurement; visual recognition; visual positioning; visual orientating; coordinate measuring machine
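As an illustration of moment-based positioning and orientating on a binary image, the following minimal sketch uses the standard image-moment formulation: the centroid gives the position and the second-order central moments give the principal-axis angle. The abstract does not give the paper's exact procedure, so the function name and this particular formula are assumptions.

```python
import math

def centroid_and_orientation(binary):
    """Return ((cx, cy), theta) for a binary image given as rows of 0/1.

    theta is the principal-axis angle in radians, measured from the x-axis
    in image coordinates (y grows downward).
    """
    pts = [(x, y) for y, row in enumerate(binary)
           for x, v in enumerate(row) if v]
    if not pts:
        raise ValueError("image contains no foreground pixels")
    n = len(pts)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    # Second-order central moments (the "moments of inertia" of the shape).
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), theta
```

For a horizontal bar of pixels the angle is 0; for pixels along the main diagonal it is pi/4 in image coordinates.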
7. Visualization of flatness pattern recognition based on T-S cloud inference network (cited by: 2)
Authors: 张秀玲, 赵亮, 臧佳音, 樊红敏. Journal of Central South University (SCIE, EI, CAS, CSCD), 2015, No. 2, pp. 560-566 (7 pages)
Flatness pattern recognition is the key to flatness control. The accuracy of present flatness pattern recognition is limited and shape defects cannot be reflected intuitively. To improve it, a novel method via a T-S cloud inference network optimized by a genetic algorithm (GA) is proposed. The T-S cloud inference network is constructed from a T-S fuzzy neural network and the cloud model, so both the speed of fuzzy logic and the cloud model's handling of uncertainty in data processing are taken into account. Moreover, GA possesses a good parallel design structure and global optimization characteristics. Compared with the simulated recognition results of the traditional BP algorithm, GA is more accurate and effective. In addition, virtual reality technology is introduced into the field of shape control through LabVIEW and MATLAB mixed programming, and a virtual flatness pattern recognition interface is designed. Therefore, engineering analysis data and the actual model are combined, and shape defects can be seen more vividly and intuitively.
Keywords: pattern recognition; T-S cloud inference network; cloud model; mixed programming; virtual reality; visual recognition
8. Baseline Isolated Printed Text Image Database for Pashto Script Recognition
Authors: Arfa Siddiqu, Abdul Basit, Waheed Noor, Muhammad Asfandyar Khan, M. Saeed H. Kakar, Azam Khan. Intelligent Automation & Soft Computing (SCIE), 2023, No. 7, pp. 875-885 (11 pages)
Optical character recognition for right-to-left and cursive languages such as Arabic is challenging and has received little attention from researchers in the past compared with Latin-script languages. Moreover, the absence of a standard publicly available dataset for several low-resource languages, including Pashto, has remained a hurdle in the advancement of language processing. Recognizing that a clean dataset is the fundamental and core requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding. With a view to complete and fully autonomous recognition of the cursive Pashto script, the first achievement of this research is a clean and standard dataset for the isolated characters of the Pashto script. In this paper, a database of isolated Pashto characters for forty-four letters using various font styles is introduced. To overcome the shortage of font styles, the graphical software Inkscape was used to generate sufficient image data samples for each character. The dataset has been pre-processed and reduced in dimensions to 32×32 pixels, and further converted into binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on GitHub and Kaggle in both pixel and Comma Separated Values (CSV) formats.
Keywords: text-image database; optical character recognition (OCR); Pashto isolated characters; visual recognition; autonomous language understanding; deep learning; convolutional neural network (CNN)
9. Design of an Intelligent Robotic Excavator Based on Binocular Visual Recognition Technique (cited by: 1)
Authors: ZHANG Xin, LIU Jing, WEN Huai-xing. International Journal of Plant Engineering and Management, 2009, No. 1, pp. 48-51 (4 pages)
Research on intelligent robotic excavators has become a focus both at home and abroad, and this type of excavator is becoming increasingly important in application. In this paper, we develop a control system that enables an intelligent robotic excavator to perform excavating operations autonomously. It can recognize the excavating targets by itself, plan the operation automatically based on the original parameters, and finish all the tasks. Experimental results indicate the validity of the control system in terms of real-time performance and precision. The intelligent robotic excavator can remarkably reduce labor intensity and enhance working efficiency.
Keywords: excavating robot; binocular visual recognition; distributed control system; trajectory tracing
10. Preliminary study on visual recognition under low visibility conditions caused by artificial dynamic smog
Authors: Xu-Hong Zhang, Zhe-Yi Chen, Bin-Bin Su, Karunanedi Soobraydoo, Hao-Ran Wu, Qin-Zhuan Ren, Lu Sun, Fan Lyu, Jun Jiang. International Journal of Ophthalmology (English edition) (SCIE, CAS), 2018, No. 11, pp. 1821-1828 (8 pages)
AIM: To quantitatively evaluate the effect of a simulated smog environment on human visual function by psychophysical methods. METHODS: The smog environment was simulated in a 40×40×60 cm³ glass chamber filled with a PM2.5 aerosol, and 14 subjects with normal visual function were examined by psychophysical methods with the smog box placed in front of their eyes. The transmission of light through the smog box, an indication of the percentage concentration of smog, was determined with a luminance meter. Visual function under different smog concentrations was evaluated by E-visual acuity, crowded E-visual acuity and contrast sensitivity. RESULTS: E-visual acuity, crowded E-visual acuity and contrast sensitivity were all impaired with a decrease in the transmission rate (TR) according to power functions, with invariable exponents of -1.41, -1.62 and -0.7, respectively, and R² values of 0.99 for E and crowded E-visual acuity and 0.96 for contrast sensitivity. Crowded E-visual acuity decreased faster than E-visual acuity. There was a good correlation between the TR, extinction coefficient and visibility under heavy-smog conditions. CONCLUSION: Increases in smog concentration have a strong effect on visual function.
Keywords: visual recognition; low visibility conditions; artificial smog
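The RESULTS above describe power-function relationships between transmission rate (TR) and each visual measure. A minimal sketch of how such an exponent and R² could be estimated, assuming a model of the form y = a · TR^b fitted by least squares in log-log space. The function name and any data passed to it are hypothetical; the study's own fitting procedure is not stated in the abstract.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by linear regression on (log x, log y).

    Returns (a, b, r2), where r2 is computed in log-log space.
    All xs and ys must be positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope = power-law exponent
    log_a = my - b * mx                # intercept = log of the prefactor
    ss_res = sum((v - (log_a + b * u)) ** 2 for u, v in zip(lx, ly))
    ss_tot = sum((v - my) ** 2 for v in ly)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot
```

Fitting in log-log space weights relative rather than absolute errors, which is one common choice for power-law data; a nonlinear fit on the original scale could give slightly different estimates.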
11. Deep Learning-Based Approach for Arabic Visual Speech Recognition
Authors: Nadia H. Alsulami, Amani T. Jamal, Lamiaa A. Elrefaei. Computers, Materials & Continua (SCIE, EI), 2022, No. 4, pp. 85-108 (24 pages)
Lip-reading technologies are progressing rapidly following the breakthrough of deep learning. They play a vital role in many applications, such as human-machine communication and security. In this paper, we propose an effective lip-reading recognition model for Arabic visual speech recognition built on deep learning algorithms. The collected Arabic visual dataset contains 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. First, we extract keyframes from our dataset. Second, we produce Concatenated Frame Images (CFIs) that represent the utterance sequence in a single image. Finally, VGG-19 is employed for visual feature extraction in our proposed model. We examined different numbers of keyframes (10, 15, and 20) to compare two approaches in the proposed model: (1) the VGG-19 base model and (2) the VGG-19 base model with batch normalization. The results show that the second approach achieves greater accuracy: 94% for digit recognition, 97% for phrase recognition, and 93% for digit and phrase recognition on the test dataset. Therefore, our proposed model is superior to models based on CFIs input.
Keywords: convolutional neural network; deep learning; lip reading; transfer learning; visual speech recognition
12. Presence, Physiological Arousal, and Visual Recognition in 3D TV
Authors: Soyoung Bae, Chris Eller, Annie Lang. 通讯和计算机(中英文版), 2012, No. 5, pp. 560-567 (8 pages)
Keywords: visual recognition; television; 3D; physiology; film clips; digital camera; cinema; 2D
13. Audio-visual emotion recognition with multilayer boosted HMM
Authors: 吕坤, 贾云得, 张欣. Journal of Beijing Institute of Technology (EI, CAS), 2013, No. 1, pp. 89-93 (5 pages)
Emotion recognition has become an important task in modern human-computer interaction. A multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition is presented in this paper. A modified Baum-Welch algorithm is proposed for component HMM learning, and adaptive boosting (AdaBoost) is used to train ensemble classifiers for different layers (cues). Except for the first layer, the initial weights of training samples in the current layer are decided by the recognition results of the ensemble classifier in the upper layer. Thus the training procedure using the current cue can focus more on the samples that were difficult under the previous cue. Our MBHMM classifier combines these ensemble classifiers and takes advantage of the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data collected in Wizard of Oz scenarios and labeled under two types of emotion category sets demonstrate that our approach is effective and promising.
Keywords: emotion recognition; audio-visual fusion; Baum-Welch algorithm; multilayer boosted HMM; Wizard of Oz scenario
14. Paper-based biosensors based on multiple recognition modes for visual detection of microbially contaminated food
Authors: Jie Li, Keren Chen, Yuan Su, Longjiao Zhu, Hongxing Zhang, Wentao Xu, Xiangyang Li. Journal of Future Foods, 2024, No. 1, pp. 61-70 (10 pages)
Microbially contaminated food can cause serious health hazards and economic losses; therefore, sensitive, rapid, and highly specific visual detection is called for. Traditional detection of microorganisms is complex and time-consuming and cannot meet current testing demands. The emergence of paper-based biosensors has provided an effective method for efficient and visual detection of microorganisms, owing to their high speed, all-in-one design, low cost, and convenience. This review focuses on 5 biomarkers, namely nucleic acids, proteins, lipopolysaccharides, metabolites, and whole microorganisms. The recognition methods are summarized in 5 forms: immunological recognition, aptamer recognition, nucleic acid amplification-mediated recognition, DNAzyme recognition, and clustered regularly interspaced short palindromic repeats mediated recognition. In addition, we thoroughly summarize the applications of paper-based biosensors in the detection of microorganisms. Through the exploration of different biomarkers, identification methods, and applications, we hope to provide a reference for the development of paper-based biosensors and their application in safeguarding the food chain.
Keywords: paper-based biosensor; microorganism; multiple recognition; biomarker; visual detection
15. Place recognition based on saliency for topological localization (cited by: 2)
Authors: 王璐, 蔡自兴. Journal of Central South University of Technology (EI), 2006, No. 5, pp. 536-541 (6 pages)
A new place recognition system based on salient visual regions is presented for mobile robot navigation in unknown environments. The system uses a monocular camera to acquire omni-directional images of the environment in which the robot is located. Salient local regions are detected from these images using a center-surround difference method, which computes opponencies of color and texture among multi-scale image spaces. They are then organized using a hidden Markov model (HMM) to form the vertices of a topological map. Localization, which is place recognition in our system, can thus be converted to evaluation of the HMM. Experimental results show that the saliency detection is robust to changes of scale, 2D rotation, viewpoint, etc. The created topological map is smaller and a higher recognition rate is obtained.
Keywords: visual saliency; place recognition; mobile robot localization; hidden Markov model
16. Behavior recognition based on the fusion of 3D-BN-VGG and LSTM network (cited by: 4)
Authors: Wu Jin, Min Yu, Shi Qianwen, Zhang Weihua, Zhao Bo. High Technology Letters (EI, CAS), 2020, No. 4, pp. 372-382 (11 pages)
To address the low accuracy, heavy computation and complex logic of deep learning algorithms in behavior recognition, a behavior recognition method based on the fusion of a 3-dimensional batch normalization visual geometry group (3D-BN-VGG) network and a long short-term memory (LSTM) network is designed. In this network, 3D convolutional layers extract the spatial-domain and time-domain features of a video sequence at the same time, and multiple small convolution kernels are stacked to replace large convolution kernels, so the neural network is deepened and the number of network parameters is reduced. In addition, the batch normalization algorithm is added to the 3-dimensional convolutional network to improve the training speed. The output of the fully connected layer is then sent to the LSTM network as feature vectors to extract the sequence information. This method, which directly uses the output of the whole base level without passing through the fully connected layer, reduces the parameters of the whole fusion network to 15324485, nearly twice as much as those of 3D-BN-VGG. Finally, the proposed network achieves 96.5% and 74.9% accuracy on UCF-101 and HMDB-51, respectively, and the algorithm has a calculation speed of 1066 fps and an acceleration ratio of 1, a significant advantage in speed.
Keywords: behavior recognition; deep learning; 3-dimensional batch normalization visual geometry group (3D-BN-VGG); long short-term memory (LSTM) network
17. Using Speech Recognition in Learning Primary School Mathematics via Explain, Instruct and Facilitate Techniques (cited by: 1)
Authors: Ab Rahman Ahmad, Sami M. Halawani, Samir K. Boucetta. Journal of Software Engineering and Applications, 2014, No. 4, pp. 233-255 (23 pages)
The application of Information and Communication Technologies has transformed traditional Teaching and Learning over the past decade into a computer-based era. This evolution has resulted from the emergence of digital systems and has greatly affected global education and socio-cultural development. Multimedia has been absorbed into the education sector to produce a new learning concept that combines educational and entertainment approaches. This research is concerned with the application of Window Speech Recognition and the Microsoft Visual Basic 2008 Integrated/Interactive Development Environment in developing a Multimedia-Assisted Courseware prototype for Primary School Mathematics content, namely single digits and addition. The Teaching and Learning techniques Explain, Instruct and Facilitate are proposed; these can be viewed as an instructor-centered strategy, instructor-learner dual communication, and learners' active participation, respectively. The prototype is called M-EIF and relies only on users' voices; hence Window Speech Recognition must be activated before a test run.
Keywords: Explain, Instruct and Facilitate techniques; multimedia-assisted courseware; primary school mathematics; Visual Natural Language; Window Speech Recognition
18. A Vision-Based Fingertip-Writing Character Recognition System (cited by: 1)
Authors: Ching-Long Shih, Wen-Yo Lee, Yu-Te Ku. Journal of Computer and Communications, 2016, No. 4, pp. 160-168 (9 pages)
This paper presents a vision-based fingertip-writing character recognition system. The overall system is implemented with a CMOS image camera on an FPGA chip. A blue cover is mounted on the tip of a finger to simplify fingertip detection and to enhance recognition accuracy. For each character stroke, 8 sample points (including start and end points) are recorded. The 7 tangent angles between consecutive sampled points are also recorded as features. In addition, 3 feature angles are extracted: the angles of the triangle formed by the start point, the end point and the average of all 8 sampled points. Based on these key feature angles, a simple template-matching K-nearest-neighbor classifier is applied to distinguish each character stroke. Experimental results show that the system can successfully recognize fingertip-written character strokes of digits and lowercase letters with an accuracy of almost 100%. Overall, the proposed fingertip-writing recognition system provides an easy-to-use and accurate visual character input method.
Keywords: visual character recognition; fingertip detection; template matching; K-nearest-neighbor classifier; FPGA
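The stroke features described in the abstract above (7 tangent angles plus 3 triangle angles from 8 sampled points) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' FPGA implementation: the function names, the use of Euclidean distance, the 1-nearest-neighbor rule, and ignoring angle wraparound are all assumptions.

```python
import math

def _angle_at(vertex, p, q):
    """Interior angle (radians) at `vertex` between the rays to p and q."""
    ax, ay = p[0] - vertex[0], p[1] - vertex[1]
    bx, by = q[0] - vertex[0], q[1] - vertex[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return 0.0  # degenerate triangle: treat the angle as zero
    cosine = (ax * bx + ay * by) / (na * nb)
    return math.acos(max(-1.0, min(1.0, cosine)))

def stroke_features(points):
    """Return 10 angles for a stroke sampled at exactly 8 (x, y) points."""
    if len(points) != 8:
        raise ValueError("expected exactly 8 sample points")
    # 7 tangent angles between consecutive sample points.
    tangents = [math.atan2(y2 - y1, x2 - x1)
                for (x1, y1), (x2, y2) in zip(points, points[1:])]
    # 3 angles of the triangle: start point, end point, mean of all points.
    start, end = points[0], points[-1]
    mean = (sum(x for x, _ in points) / 8, sum(y for _, y in points) / 8)
    triangle = [_angle_at(start, end, mean),
                _angle_at(end, start, mean),
                _angle_at(mean, start, end)]
    return tangents + triangle

def nearest_template(features, templates):
    """Return the label of the template closest to `features` (1-NN)."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return min(templates, key=lambda label: dist(features, templates[label]))
```

Here `templates` is a hypothetical dictionary mapping a stroke label to a stored feature list; a fuller classifier would keep several templates per label and vote among the k nearest.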
19. Relative attribute based incremental learning for image recognition (cited by: 3)
Author: Emrah Ergul. CAAI Transactions on Intelligence Technology, 2017, No. 1, pp. 1-11 (11 pages)
In this study, we propose an incremental learning approach based on machine-machine interaction via relative attribute feedback that exploits comparative relationships among top-level image categories. One machine acts as 'Student (S)' with initially limited information, and it endeavors to capture the task domain gradually by questioning its mentor on a pool of unlabeled data. The other machine is 'Teacher (T)', with the implicit knowledge for helping S learn the class models. T initiates relative attributes as a communication channel by randomly sorting the classes in attribute space in an unsupervised manner. S starts modeling the categories at this intermediate level using only a limited number of labeled data. Thereafter, it first selects an entropy-based sample from the pool of unlabeled data and triggers the conversation by sending the selected image with its believed class in a query. Since T already knows the ground truth labels, it not only decides whether the belief is true or false, but also provides attribute-based feedback to S in each case, without revealing the true label of the query sample if the belief is false. The number of training data is thus increased virtually by dropping the falsely predicted sample back into the unlabeled pool. Next, S updates the attribute space, which in fact has an impact on T's future responses, and then the category models are updated concurrently for the next run. We evaluate the weakly supervised algorithm on real-world datasets of faces and natural scenes in comparison with direct attribute prediction and semi-supervised learning approaches, and a noteworthy performance increase is achieved.
Keywords: image classification; incremental learning; relative attribute; visual recognition
20. Visual learning graph convolution for multi-grained orange quality grading (cited by: 1)
Authors: GUAN Zhi-bin, ZHANG Yan-qi, CHAI Xiu-juan, CHAI Xin, ZHANG Ning, ZHANG Jian-hua, SUN Tan. Journal of Integrative Agriculture (SCIE, CAS, CSCD), 2023, No. 1, pp. 279-291 (13 pages)
The quality of oranges is judged by their appearance and diameter. Appearance refers to the skin’s smoothness and surface cleanliness; diameter refers to the transverse diameter size. Both are visual attributes that visual perception technologies can identify automatically. Nonetheless, current orange quality assessment needs to address two issues: 1) there are no image datasets for orange quality grading; 2) it is challenging to effectively learn the fine-grained and distinct visual semantics of oranges from diverse angles. This study collected 12522 images of 2087 oranges for multi-grained grading tasks. In addition, it presents a visual learning graph convolution approach for multi-grained orange quality grading, comprising a backbone network and a graph convolutional network (GCN). The backbone network’s object detection, data augmentation, and feature extraction can remove extraneous visual information. The GCN is used to learn the topological semantics of orange feature maps. Finally, evaluation results show that the recognition accuracies for diameter size, appearance, and fine-grained orange quality were 99.50%, 97.27%, and 97.99%, respectively, indicating that the proposed approach is superior to others.
Keywords: GCN; multi-view; fine-grained; visual feature; appearance; diameter size