Funding: Zamil S. Alzamil would like to thank the Deanship of Scientific Research at Majmaah University for supporting this work under Project No. R-2022-172.
Abstract: Building an automatic fish recognition and detection system for large-scale fish classes is helpful for marine researchers and scientists because there are large numbers of fish species. However, such systems are difficult to build owing to the lack of data, class imbalance problems, and the large number of classes. To address these issues, we propose a transfer learning-based technique that uses Efficient-Net, pre-trained on the ImageNet dataset and fine-tuned on the QuT Fish Database, a large-scale dataset. Furthermore, prior to the activation layer, we use Global Average Pooling (GAP) instead of a dense layer, with the aim of averaging the prediction results while retaining more information than a dense layer. To check the validity of our model, we evaluate it on the validation set, where it achieves satisfactory results. For the localization task, we propose an architecture that consists of a localization-aware block, which captures localization information for better prediction, and residual connections to handle the over-fitting problem. The residual connections help a layer combine missing information with the relevant information. In addition, we use class weights and Focal Loss (FL) to handle class imbalance and reduce false predictions. The class weights assign smaller weights to classes with fewer instances and larger weights to classes with more instances. For localization, the qualitative assessment shows that we achieve 57% mean Intersection over Union (IoU) on the testing data, and the classification results show 75% precision, 70% recall, 78% accuracy, and a 74% F1-score for 468 fish species.
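As an illustration of how class weights and Focal Loss can be combined, the following is a minimal NumPy sketch, not the authors' implementation. The function names, the focusing parameter `gamma=2.0`, and the example weight vector are assumptions; the abstract does not state how the weights or hyperparameters were chosen.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def weighted_focal_loss(logits, labels, class_weights, gamma=2.0):
    """Mean class-weighted focal loss for multi-class classification.

    logits:        (n_samples, n_classes) raw scores
    labels:        (n_samples,) integer class indices
    class_weights: (n_classes,) per-class weights (hypothetical values here)
    gamma:         focusing parameter; gamma = 0 gives weighted cross-entropy
    """
    probs = softmax(logits)
    # Probability the model assigns to the true class of each sample.
    p_true = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    # (1 - p_true)^gamma down-weights samples that are already well classified.
    per_sample = -class_weights[labels] * (1.0 - p_true) ** gamma * np.log(p_true)
    return per_sample.mean()


if __name__ == "__main__":
    logits = np.array([[2.0, 0.0, -1.0], [0.1, 0.2, 0.3]])
    labels = np.array([0, 2])
    weights = np.array([1.0, 1.0, 1.0])  # made-up weights, for illustration only
    print(weighted_focal_loss(logits, labels, weights))
```

With `gamma=0` and uniform weights the function reduces to ordinary cross-entropy, which makes it easy to sanity-check against a known baseline.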
Funding: This research was supported by Suranaree University of Technology, Thailand, Grant Number: BRO7-709-62-12-03.
Abstract: Astrocytoma IV, or glioblastoma, is one of the most fatal and dangerous types of brain tumor. Early detection of a brain tumor increases the survival rate and helps reduce the fatality rate. Expert radiologists use various imaging modalities for diagnosis, and Magnetic Resonance Imaging (MRI) is considered a better option for detecting brain tumors because it is non-invasive and provides better visualization of the brain region. One of the challenges is to correctly identify the tumorous region in the MRI scans. Manual segmentation by medical experts is time-consuming and prone to error. To overcome this issue, automatic segmentation is performed for quick and accurate results. The proposed approach aims to capture inter-slice information and reduce outliers. Deep learning-based brain tumor segmentation techniques have proved best among the available segmentation techniques. However, deep learning may miss some preliminary information when segmenting MRI images. Because MRI scans are volumetric, 3D U-Net-based models are used, but they are complex. Combining multiple 2D U-Net predictions in the axial, sagittal, and coronal views helps to capture inter-slice information, and this approach may reduce the system complexity. Moreover, Conditional Random Fields (CRF) reduce the false positives in the predictions and improve the segmentation results. The model is applied to the Brain Tumor Segmentation (BraTS) 2019 dataset, and cross-validation is performed to check the accuracy of the results. The proposed approach achieves a Dice Similarity Score (DSC) of 0.77 on Enhancing Tumor (ET), 0.90 on Whole Tumor (WT), and 0.84 on Tumor Core (TC), with a reduced Hausdorff Distance (HD) of 3.05 on ET, 5.12 on WT, and 3.89 on TC.
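The combination of per-view predictions and the Dice score can be sketched as follows. This is a hedged illustration only: the abstract does not say how the three 2D U-Net outputs are combined, so simple probability averaging with a 0.5 threshold is an assumption, the function names are hypothetical, and the CRF post-processing step is omitted.

```python
import numpy as np


def fuse_views(prob_axial, prob_sagittal, prob_coronal, threshold=0.5):
    """Average three per-voxel probability volumes and threshold the mean.

    Each input is assumed to be a (depth, height, width) array of tumor
    probabilities already mapped back onto the same voxel grid.
    Averaging is an assumed fusion rule, not one confirmed by the abstract.
    """
    mean_prob = (prob_axial + prob_sagittal + prob_coronal) / 3.0
    return mean_prob >= threshold


def dice_score(pred, target, eps=1e-7):
    """Dice similarity: 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 8, 8)  # toy volume, far smaller than a real MRI scan
    views = [rng.random(shape) for _ in range(3)]
    fused = fuse_views(*views)
    reference = rng.random(shape) > 0.5  # random stand-in for a ground-truth mask
    print(dice_score(fused, reference))
```

Averaging lets a voxel that one view misclassifies be outvoted by the other two, which is one plausible way the multi-view design could reduce outliers.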
Funding: This research was supported by Suranaree University of Technology, Thailand, Grant Number: BRO7-709-62-12-03.
Abstract: Advances in deep learning have greatly improved the performance of speech recognition systems, and most recent systems are based on the Recurrent Neural Network (RNN). The RNN works well on short sequences but suffers from the vanishing gradient problem on long sequences. Transformer networks have addressed this issue and have shown state-of-the-art results on sequential and speech-related data. In speech recognition, the input audio is generally converted into an image using a Mel-spectrogram to represent frequencies and intensities. The image is then classified by a machine learning model to generate a transcript. However, the audio frequency in the image has low resolution, causing inaccurate predictions. This paper presents a novel end-to-end binary-view transformer-based architecture for speech recognition to cope with the frequency resolution problem. First, the input audio signal is transformed into a 2D image using a Mel-spectrogram. Second, modified universal transformers use multi-head attention to derive contextual information and different speech-related features. A feedforward neural network is then deployed for classification. The proposed system produces robust results on Google's speech command dataset, with an accuracy of 95.16% and minimal loss. The binary-view transformer counters the over-fitting problem by deploying a multi-view mechanism to diversify the input data, and multi-head attention captures multiple contexts from the data's feature map.
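To make the multi-head attention step concrete, here is a minimal NumPy sketch of standard scaled dot-product multi-head self-attention over a sequence of spectrogram frames. It is not the paper's modified universal transformer: the function name, the random projection matrices, and the toy dimensions are all assumptions for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Softmax along one axis, with max-subtraction for numerical stability."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Standard multi-head self-attention.

    x:                  (seq_len, d_model) input, e.g. one row per spectrogram frame
    w_q, w_k, w_v, w_o: (d_model, d_model) projection matrices
    Returns the (seq_len, d_model) output and the (num_heads, seq_len, seq_len)
    attention weights.
    """
    seq_len, d_model = x.shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    def split_heads(t):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x @ w_v)
    # Each head scores every frame against every other frame.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    context = weights @ v  # (num_heads, seq_len, d_head)
    # Concatenate the heads again and apply the output projection.
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames, d_model, heads = 10, 16, 4  # toy sizes, chosen arbitrarily
    x = rng.standard_normal((frames, d_model))
    w_q, w_k, w_v, w_o = (rng.standard_normal((d_model, d_model)) for _ in range(4))
    out, attn = multi_head_self_attention(x, w_q, w_k, w_v, w_o, heads)
    print(out.shape, attn.shape)
```

Each head works on its own slice of the projected features, which is how the mechanism can attend to several contexts in the same feature map at once.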
Abstract: The exponential growth of population in developing countries like India calls for a focus on innovative technologies in the agricultural process to meet future crises. One of the vital tasks is crop yield prediction at an early stage; it is one of the most challenging tasks in precision agriculture because it demands a deep understanding of the growth pattern and its highly nonlinear parameters. Environmental parameters such as rainfall, temperature, and humidity, and management practices such as fertilizers, pesticides, and irrigation, are very dynamic and vary from field to field. In the proposed work, data were collected from paddy fields in 28 districts across a wide spectrum of Tamil Nadu over a period of 18 years. The statistical model Multi Linear Regression was used as a benchmark for crop yield prediction, which yielded an accuracy of 82% owing to its wide-ranging input data. Therefore, machine learning models, namely Back Propagation Neural Network (BPNN), Support Vector Machine, and General Regression Neural Network (GRNN), were developed on the given data set to obtain improved accuracy. Results show that GRNN has the greatest accuracy, 97% (R² = 0.97), with a normalized mean square error (NMSE) of 0.03. Hence GRNN can be used for crop yield prediction in diversified geographical fields.
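A General Regression Neural Network predicts a Gaussian-kernel-weighted average of the training targets, which can be sketched in a few lines of NumPy. The sketch below is illustrative only: the smoothing parameter `sigma`, the toy one-feature data, and the NMSE definition (mean squared error divided by the variance of the observed values, under which R² = 1 - NMSE) are assumptions, since the abstract does not define them.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """GRNN prediction: Gaussian-kernel-weighted mean of the training targets.

    x_train: (n_train, n_features), y_train: (n_train,),
    x_query: (n_query, n_features), sigma: kernel width (assumed value).
    """
    # Squared Euclidean distance between every query and every training point.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    # Subtracting the row minimum leaves the weighted mean unchanged but
    # guarantees at least one kernel value equals 1, avoiding 0/0 underflow.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    return (kernel @ y_train) / kernel.sum(axis=1)


def nmse(y_true, y_pred):
    """Mean squared error normalized by the variance of y_true (assumed definition)."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data; the study's paddy-field data are not reproduced here.
    x = rng.uniform(0.0, 10.0, size=(200, 1))
    y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(200)
    pred = grnn_predict(x[:150], y[:150], x[150:], sigma=0.5)
    print(nmse(y[150:], pred), r_squared(y[150:], pred))
```

GRNN has a single tunable parameter, `sigma`: small values make the prediction follow the nearest training points, while large values pull it toward the overall mean of the targets.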