Journal Articles
10 articles found
1. Design of an Intelligent Robotic Excavator Based on Binocular Visual Recognition Technique (Cited by 1)
Authors: ZHANG Xin, LIU Jing, WEN Huai-xing. International Journal of Plant Engineering and Management, 2009, Issue 1, pp. 48-51 (4 pages).
Research on intelligent, robotic excavators has become a focus both at home and abroad, and this type of excavator is becoming increasingly important in practical applications. In this paper, we develop a control system that enables an intelligent robotic excavator to perform excavating operations autonomously: it recognizes excavating targets by itself, plans the operation automatically from the original parameters, and completes all tasks. Experimental results demonstrate the real-time performance and precision of the control system. The intelligent robotic excavator can remarkably reduce labor intensity and enhance working efficiency.
Keywords: excavating robot; binocular visual recognition; distributed control system; trajectory tracing
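The abstract does not detail how the binocular recognition step localizes an excavating target, so the following is only a generic sketch of stereo triangulation with a calibrated, rectified camera pair; the focal length, baseline, and pixel coordinates in the example are assumed values, not taken from the paper.

```python
# Minimal stereo-triangulation sketch (not the paper's implementation):
# recover a target point's 3D position from matched pixel coordinates
# in a calibrated, rectified binocular camera pair.

def triangulate(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Return (X, Y, Z) in metres for a point seen at column u_left in the
    left image and u_right in the right image (same row v after rectification)."""
    disparity = u_left - u_right           # pixels; must be > 0 for a valid match
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at infinity or bad match")
    Z = focal_px * baseline_m / disparity  # depth along the optical axis
    X = (u_left - cx) * Z / focal_px       # lateral offset
    Y = (v - cy) * Z / focal_px            # vertical offset
    return X, Y, Z

# Example with assumed calibration values (focal length 800 px, 30 cm baseline,
# principal point at the centre of a 1280x960 sensor).
print(triangulate(700.0, 660.0, 500.0, focal_px=800.0, baseline_m=0.30, cx=640.0, cy=480.0))
```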
2. Preliminary study on visual recognition under low visibility conditions caused by artificial dynamic smog
Authors: Xu-Hong Zhang, Zhe-Yi Chen, Bin-Bin Su, Karunanedi Soobraydoo, Hao-Ran Wu, Qin-Zhuan Ren, Lu Sun, Fan Lyu, Jun Jiang. International Journal of Ophthalmology (English edition), SCIE/CAS indexed, 2018, Issue 11, pp. 1821-1828 (8 pages).
AIM: To quantitatively evaluate the effect of a simulated smog environment on human visual function by psychophysical methods. METHODS: The smog environment was simulated in a 40×40×60 cm³ glass chamber filled with a PM2.5 aerosol, and 14 subjects with normal visual function were examined by psychophysical methods with the foggy smog box placed in front of their eyes. The transmission of light through the smog box, an indication of the percentage concentration of smog, was measured with a luminance meter. Visual function under different smog concentrations was evaluated by E-visual acuity, crowded E-visual acuity and contrast sensitivity. RESULTS: E-visual acuity, crowded E-visual acuity and contrast sensitivity were all impaired with a decrease in the transmission rate (TR) according to power functions, with exponents of -1.41, -1.62 and -0.7, respectively, and R² values of 0.99 for E- and crowded E-visual acuity and 0.96 for contrast sensitivity. Crowded E-visual acuity decreased faster than E-visual acuity. There was a good correlation between the TR, the extinction coefficient and visibility under heavy-smog conditions. CONCLUSION: Increases in smog concentration have a strong effect on visual function.
Keywords: visual recognition; low visibility conditions; artificial smog
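The reported power-law relation between visual function and the transmission rate (TR) can be reproduced numerically. The sketch below fits y = a·TR^b by a log-log linear regression; the exponent -1.41 comes from the abstract, while the scale factor and the synthetic data points are assumptions made only for illustration.

```python
import numpy as np

# A power-law relation of the form  acuity = a * TR**b  (TR = transmission rate)
# can be fitted by ordinary least squares in log-log space.  The exponents
# reported in the abstract are -1.41 (E-acuity), -1.62 (crowded E-acuity) and
# -0.7 (contrast sensitivity); the scale factor "a" below is an assumption
# used only to generate illustrative data.

def fit_power_law(tr, y):
    """Return (a, b) such that y ~= a * tr**b, via a log-log linear fit."""
    b, log_a = np.polyfit(np.log(tr), np.log(y), deg=1)
    return np.exp(log_a), b

tr = np.linspace(0.2, 1.0, 9)       # transmission rates (fractions)
acuity = 1.0 * tr ** -1.41          # synthetic E-acuity-like data
a_hat, b_hat = fit_power_law(tr, acuity)
print(f"fitted scale a = {a_hat:.2f}, exponent b = {b_hat:.2f}")  # ~1.00, ~-1.41
```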
3. Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies (Cited by 1)
Authors: Yang Wu, Ding-Heng Wang, Xiao-Tong Lu, Fan Yang, Man Yao, Wei-Sheng Dong, Jian-Bo Shi, Guo-Qi Li. Machine Intelligence Research, EI/CSCD indexed, 2022, Issue 5, pp. 366-411 (46 pages).
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial demand. Modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted recognition performance on many concrete tasks, with the help of large amounts of training data and new, powerful computation resources. Although recognition accuracy is usually the first concern for new progress, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly needed by the entire community. While general surveys on the efficiency issue exist from various perspectives, as far as we are aware, scarcely any of them has focused systematically on visual recognition, and thus it is unclear which advances are applicable to it and what else should be considered. In this survey, we review recent advances, with suggestions on possible new directions for improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only from the model but also from the data point of view (which is not the case in existing surveys) and focus on four typical data types (images, video, points, and events). This survey attempts to provide a systematic summary that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.
Keywords: visual recognition; deep neural networks (DNNs); brain-inspired methodologies; network compression; dynamic inference; survey
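As one concrete example of the network-compression techniques such a survey covers, the following is a minimal, generic sketch of magnitude-based weight pruning; it is not a method taken from the paper, and the layer size and sparsity level are arbitrary.

```python
import numpy as np

# Illustrative magnitude pruning: weights whose absolute value falls below a
# percentile threshold are zeroed, producing a sparse layer that cheaper
# sparse kernels can exploit.

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude `sparsity` fraction of `weights`."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)   # a dense layer's weight matrix
w_sparse = magnitude_prune(w, sparsity=0.9)
print("non-zero fraction:", np.count_nonzero(w_sparse) / w_sparse.size)  # ~0.1
```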
4. Visual recognition of melamine in milk via selective metallo-hydrogel formation
Authors: Xiaoling Bao, Jianhong Liu, Qingshu Zheng, Wei Pei, Yimei Yang, Yanyun Dai, Tao Tu. Chinese Chemical Letters, SCIE/CAS/CSCD indexed, 2019, Issue 12, pp. 2266-2270 (5 pages).
A series of novel six-coordinated terpyridine zinc complexes, bearing ammonium salts and a thymine fragment at the two terminals, has been designed and synthesized; these complexes function as highly sensitive visual sensors for melamine detection via selective metallo-hydrogel formation. After full characterization by various techniques, the complementary triple hydrogen bonding between the thymine fragment and melamine, as well as π-π stacking interactions, is thought to be responsible for the selective metallo-hydrogel formation. In light of possible interference from milk ingredients (proteins, peptides and amino acids) and legal/illegal additives (urine, sugars and vitamins), a series of control experiments was carried out. This visual recognition is highly selective: no gelation was observed with the selected milk ingredients or additives. Remarkably, the newly developed protocol enables convenient and highly selective visual recognition of melamine at a concentration as low as 10 ppm in raw milk without any tedious pretreatment.
Keywords: hydrogen-bonding interaction; pincer zinc complex; melamine; metallo-hydrogel; visual recognition
5. Baseline Isolated Printed Text Image Database for Pashto Script Recognition
Authors: Arfa Siddiqu, Abdul Basit, Waheed Noor, Muhammad Asfandyar Khan, M. Saeed H. Kakar, Azam Khan. Intelligent Automation & Soft Computing, SCIE indexed, 2023, Issue 7, pp. 875-885 (11 pages).
Optical character recognition for right-to-left, cursive languages such as Arabic is challenging and has received little attention from researchers compared with Latin languages. Moreover, the absence of a standard, publicly available dataset for several low-resource languages, including Pashto, has remained a hurdle in the advancement of language processing. Realizing that a clean dataset is the fundamental and core requirement of character recognition, this research begins with dataset generation and aims at a system capable of complete language understanding, keeping in view fully autonomous recognition of the cursive Pashto script. The first achievement of this research is a clean, standard dataset for the isolated characters of the Pashto script. In this paper, a database of isolated Pashto characters for forty-four alphabets using various font styles is introduced. To overcome the shortage of font styles, the graphics software Inkscape has been used to generate sufficient image samples for each character. The dataset has been pre-processed, reduced to 32×32 pixels, and converted into a binary format with a black background and white text so that it resembles the Modified National Institute of Standards and Technology (MNIST) database. The benchmark database is publicly available for further research on the standard GitHub and Kaggle database servers, in both pixel and Comma Separated Values (CSV) formats.
Keywords: text-image database; optical character recognition (OCR); Pashto; isolated characters; visual recognition; autonomous language understanding; deep learning; convolutional neural network (CNN)
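A hedged sketch of the kind of preprocessing the abstract describes (resize to 32×32, binarize to white text on a black background, export to CSV in MNIST-like form); it is not the authors' exact pipeline, and the input file name and label value are hypothetical placeholders.

```python
import csv
import numpy as np
from PIL import Image

def to_mnist_like(path: str, threshold: int = 128) -> np.ndarray:
    """Grayscale, resize to 32x32, and binarize so the glyph is white on black."""
    img = Image.open(path).convert("L").resize((32, 32))
    arr = np.asarray(img)
    binary = np.where(arr < threshold, 255, 0)   # dark ink -> white on black
    return binary.astype(np.uint8)

# Append one sample as a CSV row: label first, then 1024 pixel values.
# "alif_sample.png" and the label 0 are hypothetical placeholders.
with open("pashto_isolated.csv", "a", newline="") as f:
    writer = csv.writer(f)
    pixels = to_mnist_like("alif_sample.png").flatten()
    writer.writerow([0, *pixels.tolist()])
```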
6. Deep Learning-Based Approach for Arabic Visual Speech Recognition
Authors: Nadia H. Alsulami, Amani T. Jamal, Lamiaa A. Elrefaei. Computers, Materials & Continua, SCIE/EI indexed, 2022, Issue 4, pp. 85-108 (24 pages).
Lip-reading technologies have progressed rapidly following the breakthrough of deep learning, and they play a vital role in many applications, such as human-machine communication and security. In this paper, we propose an effective lip-reading recognition model for Arabic visual speech recognition based on deep learning algorithms. The Arabic visual datasets that have been collected contain 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. Firstly, we extract keyframes from our dataset. Secondly, we produce Concatenated Frame Images (CFIs) that represent each utterance sequence in a single image. Finally, VGG-19 is employed for visual feature extraction in the proposed model. We examined different numbers of keyframes (10, 15, and 20) and compared two approaches: (1) the VGG-19 base model and (2) the VGG-19 base model with batch normalization. The results show that the second approach achieves greater accuracy: 94% for digit recognition, 97% for phrase recognition, and 93% for combined digit and phrase recognition on the test dataset. Therefore, our proposed model is superior to models based on CFI input.
Keywords: convolutional neural network; deep learning; lip reading; transfer learning; visual speech recognition
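The pipeline outlined in the abstract (keyframes tiled into a Concatenated Frame Image, then a VGG-19 base with a batch-normalized head) can be sketched as follows; the grid size, layer widths, and number of classes are assumptions, not the paper's exact configuration.

```python
import numpy as np
import tensorflow as tf

def make_cfi(keyframes, grid=(2, 5), size=(224, 224)):
    """Tile grid[0]*grid[1] keyframes (H, W, 3 uint8 arrays) into one image (a CFI)."""
    rows = [np.hstack(keyframes[r * grid[1]:(r + 1) * grid[1]]) for r in range(grid[0])]
    cfi = np.vstack(rows).astype(np.float32)
    return tf.image.resize(cfi, size).numpy()

def build_model(num_classes=10):
    """Frozen VGG-19 base plus a batch-normalized classification head (transfer learning)."""
    base = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                       input_shape=(224, 224, 3))
    base.trainable = False
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.BatchNormalization()(x)   # the "with batch normalization" variant
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

frames = [np.zeros((120, 120, 3), dtype=np.uint8) for _ in range(10)]  # 10 dummy keyframes
cfi = make_cfi(frames)
model = build_model()
print(model.predict(cfi[None, ...]).shape)        # (1, 10)
```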
7. Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning
Authors: Nada Faisal Aljohani, Emad Sami Jaha. Computer Systems Science & Engineering, SCIE/EI indexed, 2023, Issue 9, pp. 3037-3058 (22 pages).
The continuing advances in deep learning have paved the way for several challenging ideas. One such idea is visual lip-reading, which has recently drawn much research interest. Lip-reading, often referred to as visual speech recognition, is the ability to understand and predict spoken speech based solely on lip movements, without using sound. Due to the lack of research on visual speech recognition for the Arabic language in general, and its absence in Quranic research, this work aims to fill that gap. This paper introduces a new publicly available Arabic lip-reading dataset containing 10490 videos captured from multiple viewpoints and comprising data samples at the letter level (i.e., single letters (single alphabets) and Quranic disjoined letters) and at the word level, based on the content and context of the book Al-Qaida Al-Noorania. This research uses visual speech recognition to recognize spoken Arabic letters (Arabic alphabets), Quranic disjoined letters, and Quranic words, mainly phonetically as they are recited in the Holy Quran according to the Quranic study aid entitled Al-Qaida Al-Noorania. This study could further validate the correctness of pronunciation and, subsequently, assist people in correctly reciting the Quran. Furthermore, a detailed description of the created dataset and its construction methodology is provided. This new dataset is used to train an effective pre-trained deep learning CNN model through transfer learning for lip-reading, achieving accuracies of 83.3%, 80.5%, and 77.5% on words, disjoined letters, and single letters, respectively, with an extended analysis of the results. Finally, the experimental outcomes, different research aspects, and dataset collection consistency and challenges are discussed, and several promising directions for future work are identified.
Keywords: visual speech recognition; lip-reading; deep learning; Quranic Arabic dataset; Tajwid
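One preparatory step for a video lip-reading dataset like the one described is sampling a fixed number of frames from each clip so that every sample has the same temporal length before it is fed to a CNN. The sketch below is a generic illustration, not the authors' procedure; the file name and frame count are hypothetical.

```python
import cv2
import numpy as np

def sample_frames(video_path: str, num_frames: int = 16, size=(112, 112)):
    """Evenly sample num_frames frames from a clip and resize them to a fixed size."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.resize(frame, size))
    cap.release()
    return np.stack(frames) if frames else np.empty((0, *size, 3), dtype=np.uint8)

clip = sample_frames("alif_speaker01_view1.mp4")   # hypothetical file name
print(clip.shape)                                  # e.g. (16, 112, 112, 3)
```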
8. A Vision-Based Fingertip-Writing Character Recognition System (Cited by 1)
Authors: Ching-Long Shih, Wen-Yo Lee, Yu-Te Ku. Journal of Computer and Communications, 2016, Issue 4, pp. 160-168 (9 pages).
This paper presents a vision-based fingertip-writing character recognition system. The overall system is implemented with a CMOS image camera on an FPGA chip. A blue cover is mounted on the fingertip to simplify fingertip detection and to enhance recognition accuracy. For each character stroke, 8 sample points (including the start and end points) are recorded, and the 7 tangent angles between consecutive sampled points are recorded as features. In addition, 3 feature angles are extracted: the angles of the triangle formed by the start point, the end point and the average of all 8 sampled points. Based on these key feature angles, a simple template-matching K-nearest-neighbor classifier is applied to distinguish each character stroke. Experimental results show that the system can successfully recognize fingertip-written strokes of digits and lower-case letters with an accuracy of almost 100%. Overall, the proposed fingertip-writing recognition system provides an easy-to-use and accurate visual character input method.
Keywords: visual character recognition; fingertip detection; template matching; K-nearest-neighbor classifier; FPGA
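The feature scheme in the abstract (7 tangent angles between the 8 sampled stroke points plus the 3 angles of the triangle formed by the start, end, and average points, classified by template matching) can be sketched as follows; details the abstract leaves open (angle units, distance metric, template labels) are assumptions.

```python
import math

def angle(p, q):
    """Tangent angle of the segment from p to q, in radians."""
    return math.atan2(q[1] - p[1], q[0] - p[0])

def interior_angles(a, b, c):
    """The three interior angles of triangle a-b-c."""
    def ang(v, u, w):            # angle at vertex u
        a1 = math.atan2(v[1] - u[1], v[0] - u[0])
        a2 = math.atan2(w[1] - u[1], w[0] - u[0])
        d = abs(a1 - a2) % (2 * math.pi)
        return min(d, 2 * math.pi - d)
    return [ang(b, a, c), ang(a, b, c), ang(a, c, b)]

def stroke_features(points):     # points: 8 (x, y) tuples per stroke
    tangents = [angle(points[i], points[i + 1]) for i in range(7)]
    mean = (sum(p[0] for p in points) / 8, sum(p[1] for p in points) / 8)
    return tangents + interior_angles(points[0], points[-1], mean)

def classify(points, templates):  # templates: {label: feature list}, 1-nearest neighbour
    feats = stroke_features(points)
    def dist(t):
        return sum((f - g) ** 2 for f, g in zip(feats, t))
    return min(templates, key=lambda label: dist(templates[label]))

# Usage with a hypothetical two-template dictionary:
line = [(i, 0) for i in range(8)]                # a horizontal stroke
diag = [(i, i) for i in range(8)]                # a diagonal stroke
templates = {"dash": stroke_features(line), "slash": stroke_features(diag)}
print(classify([(i, i + 0.1) for i in range(8)], templates))   # -> "slash"
```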
9. Application of Multivariate Reinforcement Learning Engine in Optimizing the Power Generation Process of Domestic Waste Incineration
Authors: Tao Ning, Dunli Chen. Journal of Electronic Research and Application, 2023, Issue 5, pp. 30-41 (12 pages).
Garbage incineration is an ideal method for the harmless and resource-oriented treatment of urban domestic waste. However, current domestic waste incineration power plants often face challenges in maintaining consistent steam production and controlling high operational costs. This article capitalizes on the technical advantages of big-data artificial intelligence, taking optimization of the power generation process of domestic waste incineration as its entry point, and adopts four main engine modules: the Alibaba Cloud reinforcement learning algorithm engine, the operating parameter prediction engine, the anomaly recognition engine, and the video visual recognition algorithm engine. The reinforcement learning algorithm extracts the operational parameters of each incinerator to obtain a control benchmark. The operating parameter prediction algorithm builds prediction models for drum pressure, primary steam flow, NOx, SO2, and HCl to achieve short-term prediction of operational parameters, ultimately improving control performance. The anomaly recognition algorithm develops a thickness identification model for the material layer in the drying section, allowing rapid and effective assessment of feed material thickness to ensure uniformity control. Meanwhile, the visual recognition algorithm identifies flame images and assesses the combustion status and the location of the combustion fire line within the furnace; this real-time understanding of flame conditions guides adjustments to the grate and air volume. Integrating AI technology into the waste incineration sector enables the environmental protection industry to leverage big data, which has practical significance for optimizing the harmless and resource-oriented treatment of urban domestic waste, reducing operational costs, and increasing efficiency.
Keywords: multivariable reinforcement learning engine; waste incineration power generation; visual recognition algorithm
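The proprietary Alibaba Cloud engines are not described in enough detail to reproduce, so the following is only a generic sketch of the "operating parameter prediction" idea: a simple autoregressive model predicting the next value of a boiler signal such as primary steam flow from its recent readings. The synthetic series and window length are assumptions.

```python
import numpy as np

def fit_ar(series: np.ndarray, k: int = 5) -> np.ndarray:
    """Least-squares coefficients so that x[t] ~= coeffs @ x[t-k:t]."""
    X = np.stack([series[i:i + k] for i in range(len(series) - k)])
    y = series[k:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

rng = np.random.default_rng(1)
steam_flow = 50 + np.cumsum(rng.normal(scale=0.2, size=500))   # synthetic signal
coeffs = fit_ar(steam_flow)
next_pred = coeffs @ steam_flow[-5:]                            # one-step-ahead forecast
print(f"predicted next steam-flow value: {next_pred:.2f}")
```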
10. HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks (Cited by 1)
Authors: Amany M. Sarhan, Nada M. Elshennawy, Dina M. Ibrahim. Computers, Materials & Continua, SCIE/EI indexed, 2021, Issue 8, pp. 1531-1549 (19 pages).
Lip reading is typically regarded as visually interpreting a speaker's lip movements during speaking, i.e., the task of decoding text from the speaker's mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems to understand a speaker by capturing a video of the speaker and feeding it into the proposed model to obtain the corresponding subtitles. Deep learning technologies make it easier to extract a large number of different features, which can then be converted into letter probabilities to obtain accurate results. Recently proposed lip-reading methods are based on sequence-to-sequence architectures designed for neural machine translation and audio speech recognition. In this paper, however, a deep convolutional neural network model called the hybrid lip-reading (HLR-Net) model is developed for lip reading from video. The proposed model includes three stages, namely preprocessing, encoder, and decoder stages, which produce the output subtitle. Inception, gradient, and bidirectional GRU layers are used to build the encoder, while attention, fully-connected, and activation-function layers are used to build the decoder, which performs connectionist temporal classification (CTC). Compared with three recent models, namely the LipNet model, the lip-reading model with cascaded attention (LCANet), and the attention-CTC (A-ACA) model, on the GRID corpus dataset, the proposed HLR-Net model achieves significant improvements: a CER of 4.9%, a WER of 9.7%, and a Bleu score of 92% for unseen speakers, and a CER of 1.4%, a WER of 3.3%, and a Bleu score of 99% for overlapped speakers.
Keywords: lip-reading; visual speech recognition; deep neural network; connectionist temporal classification
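A hedged sketch of an HLR-Net-style sentence-level lip reader: per-frame convolutions, a bidirectional GRU encoder, and a linear output layer trained with connectionist temporal classification (CTC). This is not the authors' exact architecture; the vocabulary size, frame dimensions, and layer widths are assumptions.

```python
import numpy as np
import tensorflow as tf

T, H, W, C = 20, 50, 100, 3   # frames per clip, frame height/width/channels (assumed)
VOCAB = 28                    # 26 letters + space + CTC blank (assumed)

frames_in = tf.keras.Input(shape=(T, H, W, C))
x = tf.keras.layers.TimeDistributed(
        tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu"))(frames_in)
x = tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalAveragePooling2D())(x)
x = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(128, return_sequences=True))(x)
logits = tf.keras.layers.Dense(VOCAB)(x)          # linear logits for CTC
model = tf.keras.Model(frames_in, logits)

# One dummy loss evaluation, showing how CTC aligns frame-level logits with an
# unsegmented character sequence ("hello" encoded as integer ids, a=0).
video = np.zeros((1, T, H, W, C), dtype=np.float32)
labels = np.array([[7, 4, 11, 11, 14]], dtype=np.int32)
loss = tf.nn.ctc_loss(labels=labels,
                      logits=model(video),
                      label_length=np.array([5], dtype=np.int32),
                      logit_length=np.array([T], dtype=np.int32),
                      logits_time_major=False,
                      blank_index=VOCAB - 1)
print(float(tf.reduce_mean(loss)))
```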