Journal Articles
7 articles found
1. Automatic Speaker Recognition Using Mel-Frequency Cepstral Coefficients Through Machine Learning (Cited by: 1)
Authors: Uğur Ayvaz, Hüseyin Gürüler, Faheem Khan, Naveed Ahmed, Taegkeun Whangbo, Abdusalomov Akmalbek Bobomirzaevich. Computers, Materials & Continua (SCIE, EI), 2022, Issue 6, pp. 5511-5521 (11 pages).
Automatic speaker recognition (ASR) systems belong to the field of human-machine interaction, and scientists have long used feature extraction and feature matching methods to analyze and synthesize speech signals. One of the most commonly used methods for feature extraction is Mel-Frequency Cepstral Coefficients (MFCCs). Recent research shows that MFCCs process voice signals with high accuracy; they represent a sequence of voice-signal-specific features. This experimental analysis distinguishes Turkish speakers by extracting MFCCs from speech recordings. Since human perception of sound is not linear, after the filterbank step of the MFCC method we converted the obtained log filterbanks into decibel (dB) feature-based spectrograms without applying the Discrete Cosine Transform (DCT). A new dataset was created by converting each spectrogram into a 2-D array. Several learning algorithms were implemented with 10-fold cross-validation to detect the speaker. The highest accuracy, 90.2%, was achieved using a Multi-Layer Perceptron (MLP) with the tanh activation function. The most important output of this study is the inclusion of the human voice as a new feature set.
Keywords: automatic speaker recognition, human voice recognition, spatial pattern recognition, MFCCs, spectrogram, machine learning, artificial intelligence
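The pipeline this abstract describes (log mel filterbanks kept in dB rather than DCT-compressed, flattened into 2-D arrays, then an MLP with tanh under 10-fold cross-validation) can be sketched in a few lines. This is a minimal illustration assuming librosa and scikit-learn; the file list, labels, and the fixed 200-frame crop are placeholders, not the paper's setup.

```python
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

def db_spectrogram(path, sr=16000, n_mels=40):
    """Log mel filterbank energies in dB; the DCT step of standard
    MFCC extraction is deliberately skipped, as in the paper."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)            # 2-D dB spectrogram

wav_paths = ["spk01_a.wav", "spk02_a.wav"]     # hypothetical recordings
speaker_labels = [0, 1]                        # hypothetical speaker IDs

# Crop each spectrogram to a fixed width and flatten to a feature vector.
X = np.stack([db_spectrogram(p)[:, :200].ravel() for p in wav_paths])
y = np.array(speaker_labels)

clf = MLPClassifier(hidden_layer_sizes=(128,), activation="tanh", max_iter=500)
print(cross_val_score(clf, X, y, cv=10).mean())   # 10-fold CV as in the study
```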
2. Dynamic Audio-Visual Biometric Fusion for Person Recognition
Authors: Najlaa Hindi Alsaedi, Emad Sami Jaha. Computers, Materials & Continua (SCIE, EI), 2022, Issue 4, pp. 1283-1311 (29 pages).
Biometric recognition refers to the process of recognizing a person's identity using physiological or behavioral modalities such as face, voice, fingerprint, and gait. Such biometric modalities are mostly used in recognition tasks either separately, as in unimodal systems, or jointly, as in multimodal systems with two or more modalities. Multimodal systems can usually enhance recognition performance over unimodal systems by integrating the biometric data of multiple modalities at different fusion levels. Despite this enhancement, in real-life applications some factors degrade multimodal systems' performance, such as occlusion, face pose, and noise in voice data. In this paper, we propose two algorithms that effectively apply dynamic fusion at the feature level based on the data quality of the multimodal biometrics. The proposed algorithms attempt to minimize the negative influence of confusing and low-quality features by either exclusion or weight reduction to achieve better recognition performance. The proposed dynamic fusion was achieved using face and voice biometrics, where face features were extracted using principal component analysis (PCA) and Gabor filters separately, whilst voice features were extracted using Mel-Frequency Cepstral Coefficients (MFCCs). Here, the quality assessment of face images is mainly based on the existence of occlusion, whereas the assessment of voice data quality is substantially based on the calculation of the signal-to-noise ratio (SNR) in the presence of noise. To evaluate the performance of the proposed algorithms, several experiments were conducted using two combinations of three different databases: the AR database and the Extended Yale Face Database B for face images, and the VOiCES database for voice data. The obtained results show that both proposed dynamic fusion algorithms attain improved performance and offer more advantages in identification and verification over not only the standard unimodal algorithms but also the multimodal algorithms using standard fusion methods.
Keywords: biometrics, dynamic fusion, feature fusion, identification, multimodal biometrics, occluded face recognition, quality-based recognition, verification, voice recognition
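The dynamic-fusion idea above, excluding or down-weighting voice features as their estimated quality drops, reduces to a small amount of glue code at the feature level. The sketch below is an assumption-laden illustration: the SNR thresholds, the linear weighting rule, and the stand-in feature vectors are mine, not the paper's.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from signal and noise power estimates."""
    return 10 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))

def dynamic_fuse(face_feats, voice_feats, snr, snr_floor=5.0, snr_ceil=30.0):
    """Feature-level fusion with an SNR-dependent weight on the voice part;
    below snr_floor the voice features are excluded entirely."""
    if snr <= snr_floor:
        w = 0.0                                        # exclusion branch
    else:
        w = min(1.0, (snr - snr_floor) / (snr_ceil - snr_floor))
    return np.concatenate([face_feats, w * voice_feats])

face = np.random.rand(100)     # stand-in PCA/Gabor face feature vector
voice = np.random.rand(39)     # stand-in MFCC voice feature vector
fused = dynamic_fuse(face, voice, snr=12.0)
```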
3. Development of Voice Control Algorithm for Robotic Wheelchair Using MIN and LSTM Models
Author: Mohsen Bakouri. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 2441-2456 (16 pages).
In this work, we developed and implemented a voice control algorithm to steer smart robotic wheelchairs (SRW) using a neural network technique. The technique uses a network-in-network (NIN) and long short-term memory (LSTM) structure integrated with a built-in voice recognition algorithm. An Android smartphone application was designed and configured with the proposed method, and a Wi-Fi hotspot was used to connect the software and hardware components of the system in offline mode. To operate and guide the SRW, the proposed design employs five voice commands (yes, no, left, right, and stop) via a Raspberry Pi and DC motors. Ten native Arabic speakers trained and validated an English speech corpus to determine the method's overall effectiveness. The design was evaluated in both indoor and outdoor environments to determine its time response and performance. The results showed that the system classified the five voice commands with an accuracy of 98.2%. Another finding from the real-time test was that the root-mean-square deviation (RMSD) for the indoor/outdoor maneuvering nodes was 2.2 × 10⁻⁵ for latitude and 2.4 × 10⁻⁵ for longitude.
Keywords: network in network, long short-term memory, voice recognition, wheelchair
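The NIN-plus-LSTM command classifier maps naturally onto a small Keras model. A minimal sketch follows, with the NIN front end simplified to a 1x1 Conv1D block; the MFCC input shape, layer widths, and training call are illustrative guesses rather than the paper's architecture.

```python
import tensorflow as tf

NUM_COMMANDS = 5                                      # yes, no, left, right, stop
model = tf.keras.Sequential([
    tf.keras.Input(shape=(98, 13)),                   # (frames, MFCC coefficients)
    tf.keras.layers.Conv1D(64, 1, activation="relu"), # NIN-style 1x1 convolution
    tf.keras.layers.LSTM(64),                         # temporal modeling
    tf.keras.layers.Dense(NUM_COMMANDS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_mfccs, train_labels, epochs=20)     # hypothetical data
```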
4. Research on Voiceprint Recognition of Camouflage Voice Based on Deep Belief Network (Cited by: 3)
Authors: Nan Jiang, Ting Liu. International Journal of Automation and Computing (EI, CSCD), 2021, Issue 6, pp. 947-962 (16 pages).
The problem of disguised voice recognition based on deep belief networks is studied. A hybrid feature extraction algorithm based on formants, Gammatone frequency cepstrum coefficients (GFCC), and their difference coefficients is proposed to extract more discriminative speaker features from the original voice data. Using the mixed features as the model input, a disguised voice library is constructed, and a disguised voice recognition model based on a deep belief network is proposed. A dropout strategy is introduced to prevent overfitting, which effectively addresses the shortcomings of traditional Gaussian mixture models, such as insufficient modeling ability and low discrimination. Experimental results show that the proposed disguised voice recognition method better fits the feature distribution and significantly improves the classification effect and recognition rate.
Keywords: disguised voice recognition, deep belief network, feature extraction, Gammatone frequency cepstrum coefficients (GFCC), dropout
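The hybrid-feature idea, base coefficients augmented with their difference coefficients, and the dropout regularization both fit in a short sketch. Here random arrays stand in for a real formant/GFCC front end, and a dropout-regularized MLP stands in for the deep belief network; all sizes are illustrative assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

def with_deltas(feats):
    """Stack features with first- and second-order difference coefficients."""
    d1 = np.diff(feats, axis=1, prepend=feats[:, :1])
    d2 = np.diff(d1, axis=1, prepend=d1[:, :1])
    return np.concatenate([feats, d1, d2], axis=0)

num_speakers = 30                       # hypothetical voice-library size
model = nn.Sequential(
    nn.Linear(3 * 64, 256), nn.ReLU(),
    nn.Dropout(0.5),                    # dropout against overfitting, as in the paper
    nn.Linear(256, 256), nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, num_speakers),
)

feats = np.random.rand(64, 100)         # stand-in GFCC-like features (coeffs x frames)
x = torch.tensor(with_deltas(feats).mean(axis=1), dtype=torch.float32)
logits = model(x)                       # unnormalized speaker scores
```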
5. A 3D Aerogel Wearable Pressure Sensor with Ultrahigh Sensitivity, Wide Working Range, and Low Detection Limit for Voice Recognition and Physiological Signal Monitoring (Cited by: 1)
Authors: 尚成硕, 何翔天, 李晓迪, 刘泽瑞, 宋玉祥, 张玉林, 李旭, 鲁勇, 丁小康, 刘婷, 张纪才, 徐福建. Science China Materials (SCIE, EI, CAS, CSCD), 2023, Issue 5, pp. 1911-1922 (12 pages).
With the rapid development of smart electronic devices, demand is growing for wearable pressure sensors that simultaneously offer ultrahigh sensitivity, a wide working range, and a low detection limit. This work develops a piezoresistive pressure sensor based on an ultralight (29.5 mg cm⁻³), elastic 3D chitosan/MXene (CS/MXene) composite aerogel. Owing to the strong electrostatic attraction between CS and MXene, a CS/MXene aerogel with good mechanical properties is obtained by one-step freeze-drying, without any additional chemical treatment. The sensor exhibits sensitivities of 709.38 kPa⁻¹ in the low-pressure region (<1 kPa) and 252.37 kPa⁻¹ in the high-pressure region (1-20 kPa), the highest values reported to date for aerogel pressure sensors of this type over this pressure range. It also shows a fast response time (<120 ms), an ultralow detection limit of 1.4 Pa, and good stability with almost no degradation after 10,000 cycles. These properties allow the sensor not only to detect large pressure signals such as limb movement and spatial pressure distribution, but also to accurately capture subtle signals such as pulse and speech. Such a multifunctional flexible pressure sensor greatly broadens the application scope of wearable electronics in voice recognition, health monitoring, human-machine interaction, and many other fields.
Keywords: wearable electronics, CS/MXene aerogel, piezoresistive pressure sensor, voice recognition, health monitoring
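The two sensitivities quoted above follow the usual piezoresistive definition S = δ(ΔR/R₀)/δP, i.e., the slope of relative resistance change versus pressure in each regime. The sketch below shows that computation on made-up data points chosen only to land near the reported slopes; they are not the paper's measurements.

```python
import numpy as np

pressure_kpa = np.array([0.2, 0.5, 0.8, 2.0, 5.0, 10.0])      # applied pressure
delta_r_over_r0 = np.array([140, 355, 570, 900, 1660, 2920])  # relative resistance change

low = pressure_kpa < 1.0                                      # low-pressure regime
s_low, _ = np.polyfit(pressure_kpa[low], delta_r_over_r0[low], 1)
s_high, _ = np.polyfit(pressure_kpa[~low], delta_r_over_r0[~low], 1)
print(f"sensitivity: {s_low:.1f} kPa^-1 (<1 kPa), {s_high:.1f} kPa^-1 (1-20 kPa)")
```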
6. Flexible electronic eardrum (Cited by: 2)
Authors: Yang Gu, Xuewen Wang, Wen Gu, Yongjin Wu, Tie Li, Ting Zhang. Nano Research (SCIE, EI, CAS, CSCD), 2017, Issue 8, pp. 2683-2691 (9 pages).
Flexible mechanosensors with high sensitivity and fast response speed may advance wearable and implantable healthcare devices, such as real-time heart rate, pulse, and respiration monitors. In this paper, we introduce a novel flexible electronic eardrum (EE) based on single-walled carbon nanotubes, polyethylene, and polydimethylsiloxane with micro-structured pyramid arrays. The EE device shows high sensitivity, a high signal-to-noise ratio (approximately 55 dB), and a fast response time (76.9 μs) in detecting and recording sound within a frequency range of 20-13,000 Hz. The mechanism of sound detection is investigated, and the sensitivity is shown to be determined by the micro-structure, thickness, and strain state. We also demonstrate that the device can distinguish human voices. This unprecedented performance of the flexible electronic eardrum has implications for many applications, such as implantable acoustic bioelectronics and personal voice recognition.
Keywords: electronic eardrum (EE), pressure sensor, carbon nanotube, voice recognition
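Response times like the 76.9 μs quoted here are typically read off a recorded step response. A hedged sketch of that estimate follows, using a synthetic first-order step and an assumed 1 MHz acquisition rate; neither comes from the paper.

```python
import numpy as np

def rise_time(signal, fs):
    """10%-90% rise time (s) of a step response sampled at fs (Hz)."""
    final = signal[-1]
    t10 = np.argmax(signal >= 0.1 * final) / fs
    t90 = np.argmax(signal >= 0.9 * final) / fs
    return t90 - t10

fs = 1_000_000                              # assumed 1 MHz acquisition
t = np.arange(0, 0.001, 1 / fs)
step = 1 - np.exp(-t / 3e-5)                # synthetic first-order step response
print(f"rise time: {rise_time(step, fs) * 1e6:.1f} us")
```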
7. Artificial intelligence-based medical image segmentation for 3D printing and naked eye 3D visualization
Authors: Guang Jia, Xunan Huang, Sen Tao, Xianghuai Zhang, Yue Zhao, Hongcai Wang, Jie He, Jiaxue Hao, Bo Liu, Jiejing Zhou, Tanping Li, Xiaoling Zhang, Jinglong Gao. Intelligent Medicine, 2022, Issue 1, pp. 48-53 (6 pages).
Image segmentation for 3D printing and 3D visualization has become an essential component in many fields of medical research, teaching, and clinical practice, and it requires sophisticated computerized quantification and visualization tools. Recently, with the development of artificial intelligence (AI) technology, tumors and organs can be detected quickly and accurately and automatically contoured from medical images. This paper introduces a platform-independent, multi-modality image registration, segmentation, and 3D visualization program named AIMIS3D (artificial intelligence-based medical image segmentation for 3D printing and naked-eye 3D visualization). The YOLOv3 algorithm, after proper training, was used to recognize the prostate in T2-weighted MRI images. Prostate cancer and bladder cancer were segmented from MRI images using U-Net. CT images of osteosarcoma were loaded into the platform to segment the lumbar spine, osteosarcoma, vessels, and local nerves for 3D printing. Breast displacement during each radiation therapy session was quantitatively evaluated by automatically identifying the position of a 3D-printed plastic breast bra. Brain vessels were segmented from multimodality MRI images using model-based transfer learning for 3D printing and naked-eye 3D visualization in the AIMIS3D platform.
Keywords: medical image segmentation, artificial intelligence, tumor segmentation, 3D printing, voice recognition, gesture recognition
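The U-Net used for the MRI segmentation described here is an encoder-decoder with skip connections; a compact PyTorch sketch of that shape follows. Depth, channel counts, and the single-channel input are illustrative, and this is not the AIMIS3D platform's actual model.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    def __init__(self, classes=2):
        super().__init__()
        self.enc = block(1, 32)                    # encoder stage
        self.down = nn.MaxPool2d(2)
        self.mid = block(32, 64)                   # bottleneck
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)                   # 64 = skip (32) + upsampled (32)
        self.head = nn.Conv2d(32, classes, 1)      # per-pixel class logits

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([e, self.up(m)], dim=1))   # skip connection
        return self.head(d)

logits = TinyUNet()(torch.randn(1, 1, 256, 256))   # -> (1, 2, 256, 256)
```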