Abstract: Distributed speech recognition (DSR) applications have certain quality-of-service (QoS) requirements in terms of latency, packet loss rate, and so on. To deliver quality-guaranteed DSR applications over wireline or wireless links, some QoS mechanisms must be provided. We put forward an RTP/RSVP transmission scheme with a DSR-specific payload and QoS parameters, obtained by modifying the existing WAP protocol stack. The simulation results show that this scheme provides adequate network bandwidth to sustain the real-time transport of DSR data over either wireline or wireless channels.
Abstract: Traffic sign recognition (TSR, also called road sign recognition, RSR) is one of the Advanced Driver Assistance System (ADAS) components in modern cars. To address the two most important concerns, real-time operation and resource efficiency, we propose a high-efficiency hardware implementation for TSR. We divide the TSR procedure into two stages, detection and recognition. In the detection stage, under the assumption that most German traffic signs are red or blue with circular, triangular, or rectangular shapes, we use a normalized RGB color transform and single-pass connected component labeling (CCL) to find potential traffic signs efficiently. For single-pass CCL, our contribution is to eliminate the "merge-stack" operations by recording the connectivity relations between regions in the scan phase and updating the labels in the iterating phase. In the recognition stage, the histogram of oriented gradients (HOG) is used to generate the descriptor of each sign, and the signs are classified with a support vector machine (SVM). In the HOG module, we analyze the minimum number of bits required at different recognition rates. The proposed method achieves a 96.61% detection rate and a 90.85% recognition rate when tested on the GTSDB dataset. Our hardware implementation reduces the storage needed for CCL and simplifies the HOG computation. The main CCL storage is reduced by 20% compared with the most advanced design under typical conditions. Using TSMC 90 nm technology, the proposed design operates at a 105 MHz clock rate and processes 135 fps at an image size of 1360 × 800. The chip size is about 1 mm² and the power consumption is close to 8 mW. This work is therefore resource efficient and meets the real-time requirement.
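As an illustration of the normalized RGB color transform named in the abstract above, the following is a minimal software sketch, not the authors' hardware design. Each channel is divided by the sum R + G + B so that the result depends on color rather than brightness. The function names and the 0.5 thresholds are assumptions chosen for the example; the abstract does not state the actual thresholds.

```python
def normalized_rgb(r, g, b):
    """Return the chromaticity (r', g', b'), which sums to 1 (all zeros for black)."""
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (r / total, g / total, b / total)


def color_mask(image, red_thresh=0.5, blue_thresh=0.5):
    """Mark pixels whose normalized red or blue share exceeds a threshold.

    `image` is a list of rows of (R, G, B) tuples. The thresholds are
    illustrative assumptions, not values from the paper.
    """
    mask = []
    for row in image:
        mask_row = []
        for (r, g, b) in row:
            nr, _, nb = normalized_rgb(r, g, b)
            mask_row.append(1 if nr > red_thresh or nb > blue_thresh else 0)
        mask.append(mask_row)
    return mask


image = [
    [(200, 30, 30), (90, 90, 90)],  # bright red, grey
    [(50, 8, 8), (20, 30, 200)],    # dark red, blue
]
mask = color_mask(image)
print(mask)
```

The bright and dark red pixels both pass the test because normalization removes most of the brightness difference. The resulting binary mask is the kind of input a connected component labeling stage would then group into candidate regions.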
Abstract: Humans, as intricate beings driven by a multitude of emotions, possess a remarkable ability to decipher and respond to socio-affective cues. However, many individuals and machines struggle to interpret such nuanced signals, including variations in tone of voice. This paper explores the potential of intelligent technologies to bridge this gap and improve the quality of conversations. In particular, the authors propose a real-time processing method that captures and evaluates emotions in speech using a terminal device such as the Raspberry Pi computer. Furthermore, the authors provide an overview of the current research landscape in speech emotion recognition and describe their methodology, which involves analyzing audio files from well-known emotional speech databases. To aid comprehension, the authors present visualizations of these audio files in situ, employing dB-scaled Mel spectrograms generated with TensorFlow and Matplotlib. The authors use a support vector machine kernel and a convolutional neural network with transfer learning to classify emotions. The classification accuracies achieved are 70% and 77%, respectively, demonstrating the efficacy of the approach when executed on an edge device rather than on a server. The system can evaluate pure emotion in speech and provide corresponding visualizations of the speaker's emotional state in less than one second on a Raspberry Pi. These findings pave the way for more effective and emotionally intelligent human-machine interactions in various domains.
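The "dB-scaled Mel spectrogram" mentioned above rests on two small conversions: mapping frequency to the Mel scale and mapping power to decibels. The sketch below shows both using only the Python standard library. It assumes the commonly used HTK-style Mel formula; the authors' actual TensorFlow settings (number of bands, frequency range, reference level) are not given in the abstract, so every parameter here is an assumption.

```python
import math


def hz_to_mel(freq_hz):
    """HTK-style Mel scale (an assumed convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_to_db(power, ref=1.0, floor=1e-10):
    """Convert a power value to decibels, clamping to avoid log(0)."""
    return 10.0 * math.log10(max(power, floor) / ref)


def mel_band_edges(n_bands, f_min, f_max):
    """Return n_bands + 2 edge frequencies equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Hypothetical settings: 4 bands between 0 Hz and 8 kHz.
edges = mel_band_edges(n_bands=4, f_min=0.0, f_max=8000.0)
print([round(e, 1) for e in edges])
print(power_to_db(100.0))
```

The band edges are equally spaced in Mel but grow wider in Hz, which is why a Mel spectrogram gives low frequencies more resolution than high ones.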
Abstract: With the advancement of technology and the growth in user demands, gesture recognition plays a pivotal role in the field of human-computer interaction. Among the various sensing devices, Time-of-Flight (ToF) sensors are widely used because of their low cost. This paper explores the implementation of a human hand posture recognition system using ToF sensors and residual neural networks. First, the paper reviews typical applications of human hand recognition. Second, it designs a hand gesture recognition system around a VL53L5 ToF sensor. Data preprocessing is then carried out, followed by training of the constructed residual neural network. The recognition results are then analyzed, indicating that gesture recognition based on the residual neural network achieves an accuracy of 98.5% in a 5-class classification scenario. Finally, the paper discusses existing issues and future research directions.
Funding: Financially supported by the National Key Research and Development Program of China (No. 2019YFC1805400).
Abstract: As a new technical means that can detect abnormal signs of water inrush in advance and give an early warning, the automatic monitoring and early warning of water inrush in mines has been widely valued in recent years. Because many factors affect water inrush and the water inrush mechanism is complicated, many factors may show precursory abnormal changes shortly before an inrush. At present, existing monitoring and early warning systems mainly use a few monitoring indicators, such as groundwater level, water inflow, and temperature, and issue water inrush warnings based on the abnormal change of a single factor. There are relatively few multi-factor comprehensive early-warning identification models. Based on an analysis of the abnormal changes of precursor factors in multiple water inrush cases, 11 measurable and effective indicators covering the groundwater flow field, the hydrochemical field, and the temperature field are proposed. Finally, taking the Hengyuan coal mine as an example, 6 indicators with long-term monitoring data sequences were selected to establish a single-index hierarchical early-warning recognition model, a multi-factor linear recognition model, and a comprehensive intelligent early-warning recognition model. The results show that the correct rate of early warning can reach 95.2%.
Abstract: Recognition of dynamic hand gestures in real time is a difficult task because the system cannot know in advance when or where a gesture starts and ends in a video stream. Many researchers have been working on vision-based gesture recognition because of its many applications. This paper proposes a deep learning architecture that combines a 3D Convolutional Neural Network (3D-CNN) and a Long Short-Term Memory (LSTM) network. The proposed architecture extracts spatial-temporal information from input video sequences while avoiding extensive computation. The 3D-CNN is used to extract spectral and spatial features, which are then passed to the LSTM network that carries out the classification. The proposed model is a lightweight architecture with only 3.7 million trainable parameters. The model has been evaluated on 15 classes from the publicly available 20BN-jester dataset. It was trained on 2000 video clips per class, which were separated into 80% training and 20% validation sets. Accuracies of 99% and 97% were achieved on training and testing data, respectively. We further show that the combination of 3D-CNN with LSTM gives superior results compared with MobileNetv2+LSTM.
Funding: The National Key Research and Development Program of China (No. 2020YFC2004002, 2020YFC2004003); the National Natural Science Foundation of China (No. 61871213, 61673108, 61571106).
Abstract: Because of the excellent performance of the Transformer in sequence learning tasks such as natural language processing, an improved Transformer-like model suitable for speech emotion recognition is proposed. To alleviate the prohibitive time consumption and memory footprint caused by the softmax inside the multi-head attention unit of the Transformer, a new linear self-attention algorithm is proposed. The original exponential function is replaced by a Taylor series expansion. Based on the associative property of matrix products, the time and space complexity of the softmax operation with respect to the input length is reduced from O(N²) to O(N), where N is the sequence length. Experimental results on emotional corpora in two languages show that the proposed linear attention algorithm achieves performance similar to the original scaled dot-product attention, while the training time and memory cost are reduced by half. Furthermore, the improved model obtains more robust performance on speech emotion recognition than the original Transformer.
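To make the associativity argument in the abstract above concrete, the sketch below compares two ways of computing attention once exp(x) is replaced by a Taylor approximation. The abstract does not give the exact expansion used, so this example assumes the first-order form exp(x) ≈ 1 + x; the function names are illustrative and this is not the authors' implementation. The quadratic version forms all N × N weights, while the linear version first accumulates three summaries over the keys and values and then reuses them for every query.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(Q, K, V):
    """Reference form: builds every pairwise weight 1 + q.k, O(N^2) in length."""
    out = []
    for q in Q:
        weights = [1.0 + dot(q, k) for k in K]
        norm = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / norm
                    for c in range(len(V[0]))])
    return out


def linear_attention(Q, K, V):
    """Same result via associativity: sum_j (1 + q.k_j) v_j = sum_j v_j + q^T (sum_j k_j v_j^T)."""
    d, dv, n = len(K[0]), len(V[0]), len(K)
    v_sum = [sum(v[c] for v in V) for c in range(dv)]   # sum_j v_j
    k_sum = [sum(k[r] for k in K) for r in range(d)]    # sum_j k_j
    kv = [[sum(k[r] * v[c] for k, v in zip(K, V)) for c in range(dv)]
          for r in range(d)]                            # sum_j k_j v_j^T
    out = []
    for q in Q:
        norm = n + dot(q, k_sum)
        out.append([(v_sum[c] + sum(q[r] * kv[r][c] for r in range(d))) / norm
                    for c in range(dv)])
    return out


# Small made-up inputs; dot products are kept small so all weights stay positive.
Q = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
K = [[0.2, 0.1], [-0.1, 0.4], [0.3, 0.3]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
slow = quadratic_attention(Q, K, V)
fast = linear_attention(Q, K, V)
print(slow)
print(fast)
```

The summaries `v_sum`, `k_sum`, and `kv` cost one pass over the sequence and have a size independent of N, so the total work grows linearly with sequence length. One caveat of the first-order form is that 1 + q·k can become negative for large negative dot products, so inputs would need to be scaled or a higher-order expansion used.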
Abstract: Iris recognition offers universality, a high degree of uniqueness, and a need for only moderate user cooperation. This makes iris recognition systems indispensable in emerging security and authentication mechanisms. An iris recognition system based on vector quantization (VQ) techniques is proposed, and its performance is compared with that of the discrete cosine transform (DCT). The proposed system does not need any pre-processing or segmentation of the iris. We have tested the Linde-Buzo-Gray (LBG) algorithm, Kekre's proportionate error (KPE) algorithm, and Kekre's fast codebook generation (KFCG) algorithm for clustering. The proposed VQ-based method using KFCG requires 99.99% fewer computations than the full 2-dimensional DCT. Further, the KFCG method performs better, with an accuracy of 89.10%, outperforming the DCT, which gives an accuracy of around 66.10%.
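For readers unfamiliar with codebook generation, here is a minimal sketch of the generic LBG procedure named in the abstract above: start from the centroid of all training vectors, split every codevector into a perturbed pair, refine with nearest-neighbor reassignment, and repeat until the codebook reaches the target size. It does not show KPE or KFCG, and the additive perturbation, iteration count, and toy 2-D vectors are assumptions for illustration rather than the paper's settings.

```python
def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codevector closest to vec (this index is the VQ code)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting; `size` should be a power of two."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codevector into a pair shifted by +eps and -eps.
        codebook = [[c + s * eps for c in cv] for cv in codebook for s in (1, -1)]
        for _ in range(iters):
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # Recompute centroids; keep the old codevector if its cluster is empty.
            codebook = [centroid(cl) if cl else cv
                        for cl, cv in zip(clusters, codebook)]
    return codebook


# Toy training vectors forming two well-separated groups.
vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook = lbg(vectors, size=2)
codes = [nearest(v, codebook) for v in vectors]
print(sorted(codebook))
print(codes)
```

Each input vector is then represented only by the index of its nearest codevector, which is where the computational saving of a VQ-based feature comes from compared with a full transform.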
Funding: Supported by the Program for Changjiang Scholars and Innovative Research Team (2008); the Program for New Century Excellent Talents in University (NCET-09-0045); the National Natural Science Foundation of China (60773044, 61004059); the Natural Science Foundation of Beijing (4101001).
Abstract: Recognizing traffic signs, especially the common circular traffic signs, is an essential task in implementing advanced driver assistance systems. To recognize circular traffic signs with high accuracy and robustness, a novel approach is proposed that uses the so-called improved constrained binary fast radial symmetry (ICBFRS) detector and a pseudo-Zernike moments based support vector machine (PZM-SVM) classifier. In the detection stage, the scene image containing the traffic signs is converted into the Lab color space for color segmentation. The ICBFRS detector then efficiently captures the position and scale of sign candidates within the scene by detecting the centers of circles. In the classification stage, once the candidates are cropped out of the image, pseudo-Zernike moments are used to represent the features of the extracted pictogram, which are then fed into a support vector machine to classify the different traffic signs. Experimental results under different lighting conditions indicate that the proposed method achieves robust detection and high classification accuracy.
Abstract: As multimedia data sharing increases, data security on mobile devices and its mechanisms have become critical. Biometrics combines the physiological and behavioral qualities of an individual to validate their identity in real time. These include physiological attributes such as fingerprint, face, iris, palm print, finger knuckle print, and deoxyribonucleic acid (DNA), and behavioral qualities such as gait, voice, signature, or keystroke. The main goal of this paper is to design a robust framework for automatic face recognition. Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) are employed for face recognition. We also propose a modified Gabor Wavelet Transform for SIFT/SURF (GWT-SIFT/GWT-SURF) to increase the recognition accuracy for human faces. The proposed scheme is composed of three steps. First, the entropy of the image is removed using the Discrete Wavelet Transform (DWT). Second, the computational complexity of SIFT/SURF is reduced. Third, the authentication accuracy is increased by the proposed GWT-SIFT/GWT-SURF algorithm. A comparative analysis of the proposed scheme is carried out on the real-time Olivetti Research Laboratory (ORL) and Poznan University of Technology (PUT) databases. Compared with the traditional SIFT/SURF methods, we verify that GWT-SIFT achieves the better accuracy of 99.32%, and that the better approach is GWT-SURF, since its run time for 100 images is 3.4 seconds, compared with 4.9 seconds for GWT-SIFT.
Funding: Funded by the Natural Science Foundation of Fujian Province (No. 2018J01063); the Project of Deep Learning Based Underwater Cultural Relics Recognization (No. 38360041); the Project of the State Administration of Cultural Relics (No. 2018300).
Abstract: Seabed sediment recognition is vital for the exploitation of marine resources. Side-scan sonar (SSS) is an excellent tool for acquiring imagery of seafloor topography. Combined with ocean surface sampling, it provides detailed and accurate images of marine substrate features. Most processing of SSS imagery relies on a limited number of sampling stations and requires manual interpretation to complete the classification of seabed sediment imagery. In complex sea areas, small targets are often lost during manual interpretation because of the large amount of information. To date, studies on the automatic recognition of seabed sediments are still few. This paper proposes a seabed sediment recognition method based on You Only Look Once version 5 and SSS imagery to perform real-time sediment classification and localization with better accuracy, particularly on small targets, and at faster speeds. We used methods such as changing the dataset size, number of epochs, and optimizer, and adding multiscale training, to overcome the challenges of a small sample and low accuracy. With these methods, we improved the mean average precision by 8.98% and the F1 score by 11.12% compared with the original method. In addition, the detection speed was approximately 100 frames per second, which is faster than that of previous methods. This speed enabled us to achieve real-time seabed sediment recognition from SSS imagery.
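As a reminder of what the F1 score reported above measures, the short sketch below computes precision, recall, and F1 from detection counts, and converts a frame rate into a per-frame time budget. The counts are invented solely for illustration and are not results from the paper; only the figure of roughly 100 frames per second comes from the abstract.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean (F1) from counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def frame_budget_ms(fps):
    """Time available to process one frame at the given frame rate."""
    return 1000.0 / fps


# Hypothetical counts: 80 correct detections, 20 false alarms, 20 misses.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
budget = frame_budget_ms(100)
print(p, r, f1, budget)
```

Because F1 is a harmonic mean, it drops sharply when either precision or recall is low, which makes it a stricter summary than accuracy for detectors that miss small targets.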
Funding: Supported by the National Natural Science Foundation of China (61471194); the Fundamental Research Funds for the Central Universities; the Science and Technology on Avionics Integration Laboratory and Aeronautical Science Foundation of China (20155552050); the CASC (China Aerospace Science and Technology Corporation) Aerospace Science and Technology Innovation Foundation Project; the Nanjing University of Aeronautics and Astronautics Graduate School Innovation Base (Laboratory) Open Foundation Program (kfjj20151505).
Abstract: The traditional oriented FAST and rotated BRIEF (ORB) algorithm suffers from unstable and repeated keypoints, and it does not possess scale invariance. To deal with these drawbacks, a modified ORB (MORB) algorithm is proposed. To improve the precision of matching and tracking, this paper puts forward an MOK algorithm that fuses MORB and Kanade-Lucas-Tomasi (KLT) tracking. A Kalman filter is used to predict the object's state in the next frame, which reduces the size of the search window and improves the real-time performance of object tracking. The experimental results show that the MOK algorithm can accurately track objects that deform or appear against background clutter, exhibiting higher robustness and accuracy on diverse datasets. The MOK algorithm also has good real-time performance, with the average frame rate reaching 90.8 fps.
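The idea of using a Kalman prediction to shrink the search window can be sketched as follows. This is a generic constant-velocity filter run independently on the x and y coordinates of the object's center, not the MOK algorithm itself; the class name, noise parameters, and window half-size are all assumptions made for the example.

```python
class AxisKalman:
    """1-D constant-velocity Kalman filter (state: position, velocity)."""

    def __init__(self, pos, vel=0.0, p=100.0, q=0.01, r=1.0):
        self.x = [pos, vel]
        self.P = [[p, 0.0], [0.0, p]]  # state covariance
        self.q, self.r = q, r          # process and measurement noise (assumed)

    def predict(self, dt=1.0):
        pos, vel = self.x
        self.x = [pos + dt * vel, vel]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r           # innovation variance for H = [1, 0]
        k0, k1 = a / s, c / s    # Kalman gain
        y = z - self.x[0]        # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def search_window(kx, ky, half_size):
    """Center a square search window on the predicted position for the next frame."""
    cx, cy = kx.predict(), ky.predict()
    return (cx - half_size, cy - half_size, cx + half_size, cy + half_size)


# Synthetic track: the object moves 5 px right and 2 px down per frame.
kx, ky = AxisKalman(0.0), AxisKalman(0.0)
for t in range(1, 10):
    kx.predict()
    ky.predict()
    kx.update(5.0 * t)
    ky.update(2.0 * t)
window = search_window(kx, ky, half_size=20)
print(window)
```

After a few frames the filter has learned the velocity, so the window for frame 10 is centered close to (50, 20) and the feature matcher only needs to examine that region instead of the whole image.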
Funding: Supported by the National Natural Science Foundation of China (Nos. 30972263, 30771644); the Natural Science Foundation of Hunan Province (No. 09jj6037).
Abstract: The myosin heavy chain (MyHC) is one of the major structural and contractile proteins of muscle. We have isolated the cDNA clone encoding MyHC of the grass carp, Ctenopharyngodon idella. The sequence comprises 5 934 bp, including a 5 814 bp open reading frame encoding an amino acid sequence of 1 937 residues. The deduced amino acid sequence showed 69% homology to rabbit fast skeletal MyHC and 73%–76% homology to the MyHCs from the mandarin fish, walleye pollack, white croaker, chum salmon, and carp. The putative sequences of subfragment-1 and the light meromyosin region showed 61.4%–80% homology to the corresponding regions of other fish MyHCs. The tissue-specific and developmental stage-specific expression of the MyHC gene was analyzed by quantitative real-time PCR. The MyHC gene showed its highest expression in the muscles compared with the kidney, spleen, and intestine. Developmentally, there was a gradual increase in MyHC mRNA expression from the neural formation stage to the tail bud stage. The highest expression was detected in hatching larvae. Our work on the MyHC gene from the grass carp provides useful information for fish molecular biology and fish genomics.