Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their growing model size and computational cost make them difficult to deploy on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise separable convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational cost. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that processes the input feature map with a parallel multi-resolution branch architecture in order to extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
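The parameter saving and the enlarged field of view described in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' DDSC layer: it counts weights for a standard convolution versus a depthwise separable one (bias terms ignored) and computes the span covered by a dilated kernel. The layer sizes and function names are illustrative assumptions, not values from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


# Illustrative layer sizes (assumed, not taken from the paper).
k, c_in, c_out = 3, 128, 256
print(standard_conv_params(k, c_in, c_out))        # 294912
print(depthwise_separable_params(k, c_in, c_out))  # 33920
print(effective_kernel_size(k, 2))                 # 5
```

Dilation changes the span of the kernel but not its number of weights, which is why a dilated depthwise kernel can see a larger context at no extra parameter cost.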
Funding: This work was funded by the foundation of the Liaoning Educational Committee under Grant No. 2019LNJC03.
Abstract: As the use of facial attributes continues to expand, research into facial age estimation is also developing. Because face images are easily affected by factors such as illumination and occlusion, estimating age from faces is a challenging process. Given the complexity of the environment and the limited computing ability of devices, this paper proposes a face age estimation algorithm based on a lightweight convolutional neural network. To improve face age estimation based on the Soft Stagewise Regression Network (SSR-Net) and facial images, this paper employs the Center Symmetric Local Binary Pattern (CSLBP) method to obtain a feature image and then combines the face image and the feature image as the network input. Adding feature images to the convolutional neural network can improve accuracy as well as increase the robustness of the network model. Experimental results on the IMDB-WIKI and MORPH 2 datasets show that the proposed lightweight convolutional neural network method reduces model complexity and increases the accuracy of face age estimation.
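To make the CSLBP feature image mentioned above concrete, here is a minimal sketch of the commonly used center-symmetric local binary pattern with 8 neighbours at radius 1. Each interior pixel receives a 4-bit code from its four opposite neighbour pairs. The neighbour ordering, the default threshold of 0, and the function name `cs_lbp` are assumptions for illustration; the abstract does not state the paper's exact settings.

```python
def cs_lbp(image, threshold=0):
    """Return CS-LBP codes (0..15) for the interior pixels of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels are skipped,
    so the result is two rows and two columns smaller than the input.
    """
    # Eight neighbours in circular order; offsets[i] and offsets[i + 4] are opposite.
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1),
               (0, -1), (-1, -1), (-1, 0), (-1, 1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            code = 0
            for i in range(4):
                dr1, dc1 = offsets[i]
                dr2, dc2 = offsets[i + 4]
                diff = image[r + dr1][c + dc1] - image[r + dr2][c + dc2]
                # Set bit i when the pair difference exceeds the threshold.
                if diff > threshold:
                    code |= 1 << i
            row_codes.append(code)
        codes.append(row_codes)
    return codes


patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(cs_lbp(patch))  # [[15]]
```

Because only four pairs are compared, the code has 16 possible values rather than the 256 of the basic LBP, which keeps the extra input channel compact.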
Funding: Supported by the National Key Research and Development Program sub-topics [2018YFF0213606-03 (Mu Y., Hu T.L., Gong H., Li S.J. and Sun Y.H.), http://www.most.gov.cn]; the Jilin Province Science and Technology Development Plan, focusing on research and development projects [20200402006NC (Mu Y., Hu T.L., Gong H. and Li S.J.), http://kjt.jl.gov.cn]; the Science and Technology Support Project for Key Industries in Southern Xinjiang [2018DB001 (Gong H. and Li S.J.), http://kjj.xjbt.gov.cn]; and the Key Technology R&D Project of the Changchun Science and Technology Bureau of Jilin Province [21ZGN29 (Mu Y., Bao H.P., Wang X.B.), http://kjj.changchun.gov.cn].
Abstract: In the field of agricultural information, the identification and prediction of rice leaf disease have always been a focus of research, and deep learning (DL) technology is currently a hot research topic in the field of pattern recognition. The research and development of high-efficiency, high-quality and low-cost automatic identification methods for rice diseases that can replace humans is an important means of dealing with the current situation from a technical perspective. This paper focuses on the problem of the very large number of parameters in Convolutional Neural Network (CNN) models and proposes a recognition model that combines a multi-scale convolution module with a neural network model based on Visual Geometry Group (VGG). The accuracy and loss on the training set and the test set are used to evaluate the performance of the model. The test accuracy of this model is 97.1%, an increase of 5.87% over VGG. Furthermore, the memory requirement is 26.1M, only 1.6% of that of VGG. Experimental results show that this model performs better in terms of accuracy, recognition speed and memory size.
Funding: This work was financially supported by the Basic Public Welfare Research Project of Zhejiang Province (Grant No. LGN20E050007).
Abstract: In viticulture, there is an increasing demand for automatic winter grapevine pruning devices, for which detection of the pruning location in vineyard images is a necessary task that can be automated with computer vision methods. In this study, a novel 2D grapevine winter pruning location detection method was proposed for automatic winter pruning with a Y-shaped cultivation system. The method can be divided into four steps. First, the vineyard image was segmented by thresholding the two-times-red-minus-green-minus-blue (2R−G−B) channel and the S channel. Second, the grapevine skeleton was extracted by the Improved Enhanced Parallel Thinning Algorithm (IEPTA). Third, the structure of each grapevine was found by judging the angle and distance relationships between branches. Fourth, bounding boxes were obtained from these grapevines, and a pre-trained MobileNetV3_small×0.75 was used to classify each bounding box and finally find the pruning location. In the detection experiment, the method achieved a precision of 98.8% and a recall of 92.3% for bud detection, an accuracy of 83.4% for pruning location detection, and a total time of 0.423 s. The results therefore indicate that the proposed 2D pruning location detection method has decent robustness as well as high precision, and could guide automatic devices to winter prune efficiently.
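The first step above, thresholding the 2R−G−B channel, can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's implementation: the threshold value, the pixel data, and the function names are hypothetical, and the S-channel test is left out because the abstract does not say how the two masks are combined.

```python
def excess_red(pixel):
    """Compute 2R - G - B for one (R, G, B) pixel."""
    r, g, b = pixel
    return 2 * r - g - b


def segment_by_excess_red(image, threshold):
    """Return a binary mask: 1 where 2R - G - B exceeds `threshold`, else 0.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    return [[1 if excess_red(p) > threshold else 0 for p in row]
            for row in image]


# Hypothetical pixels: a reddish-brown cane pixel and a greenish background pixel.
image = [[(200, 80, 60), (60, 120, 70)]]
print(segment_by_excess_red(image, threshold=40))  # [[1, 0]]
```

The resulting mask would then feed the skeleton extraction step; choosing the threshold in practice depends on the lighting and camera, which the abstract does not specify.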
Funding: Project supported by the Directorate of Advanced Studies, Research & Technological Development, University of Engineering and Technology Taxila (No. UET/ASRTD/RG-1002-3).
Abstract: Automated analysis of sports video summarization is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective video summarization framework based on shot classification and replay detection for field sports videos. Accurate shot classification is mandatory to better structure the input video for further processing, i.e., key events or replay detection. Therefore, we present a lightweight convolutional neural network based method for shot classification. Then we analyze each shot for replay detection and specifically detect the successive batch of logo transition frames that identify the replay segments in the sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train an extreme learning machine to classify frames as replay or non-replay. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.
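Once each frame has been labelled by a classifier, finding a "successive batch" of logo transition frames amounts to grouping consecutive positive labels into runs. The sketch below shows one simple way to do that. It is an assumption about the post-processing, not the authors' procedure: the per-frame labels, the `min_length` parameter, and the function name `find_runs` are all hypothetical.

```python
def find_runs(flags, min_length=1):
    """Return (start, end) index pairs, end exclusive, for runs of truthy flags
    that span at least `min_length` consecutive frames."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a new run begins here
        elif not flag and start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(flags) - start >= min_length:
        runs.append((start, len(flags)))
    return runs


# Hypothetical per-frame labels: 1 = classified as a logo transition frame.
labels = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1]
print(find_runs(labels, min_length=3))  # [(1, 4), (8, 12)]
```

Requiring a minimum run length discards isolated positive frames, such as the single frame at index 6 here, which are more likely to be misclassifications than real transitions.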