Funding: This research was supported by the National Natural Science Foundation of China (No. 62276086), the National Key R&D Program of China (No. 2022YFD2000100), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LTGN23D010002).
Abstract: Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked, so intelligent picking machines must operate more delicately. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet), using a dataset from a tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, in which small to medium-sized targets account for 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results show that our model achieves an accuracy of 82% in recognizing tea shoots, outperforming the other models. Through ablation experiments, we found that ResNet50, the PointRend strategy, and the Feature Pyramid Network (FPN) architecture improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrate that the proposed multi-scale and point selection strategy optimizes feature extraction for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can segment shoots in unstructured environments, distinguishing individual tea shoots and completely extracting shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
Funding: Supported by the Tianjin Municipal Natural Science Foundation of China (Grant No. 19JCJQJC61600) and the Hebei Provincial Natural Science Foundation of China (Grant Nos. F2020202051, F2020202053).
Abstract: Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features, because dense features receive excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is built for the human visual attention mechanism, in which sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of line feature initialization. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in the absolute trajectory error on the KITTI and EuRoC datasets. Although PLWM-VO takes more time than ORB-VO, online test results indicate that it satisfies the real-time demand. The proposed algorithm not only significantly improves the environmental adaptability of visual odometry but also quantitatively demonstrates the superiority of the human visual attention mechanism.
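The abstract above names the sum of absolute differences (SAD) algorithm but does not say how it is applied to line features. As a rough illustration only, the sketch below shows generic SAD window matching along one row of a rectified stereo pair. The function names, the window parameterization and the toy images are assumptions made for this example, not the PLWM-VO implementation.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally shaped 2-D patches."""
    if len(patch_a) != len(patch_b) or any(
        len(ra) != len(rb) for ra, rb in zip(patch_a, patch_b)
    ):
        raise ValueError("patches must have the same shape")
    return sum(
        abs(a - b) for ra, rb in zip(patch_a, patch_b) for a, b in zip(ra, rb)
    )


def best_disparity(left, right, row, col, half, max_disp):
    """Return (disparity, cost) minimising SAD for the window centred at (row, col).

    Hypothetical helper: assumes rectified images stored as lists of rows and
    that the full window around (row, col) lies inside the left image.
    """

    def window(img, r, c):
        return [line[c - half:c + half + 1] for line in img[r - half:r + half + 1]]

    ref = window(left, row, col)
    best_d, best_cost = 0, None
    for d in range(max_disp + 1):
        c = col - d  # candidate column in the right image
        if c - half < 0:
            break  # the window would leave the image
        cost = sad(ref, window(right, row, c))
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d, best_cost


# Toy example: the right image is the left image shifted two pixels to the left.
left_img = [[0, 0, 1, 5, 9, 3, 0]] * 3
right_img = [[1, 5, 9, 3, 0, 0, 0]] * 3
match = best_disparity(left_img, right_img, row=1, col=4, half=1, max_disp=3)
```

In this toy case the lowest cost occurs at a disparity of two pixels, where the windows are identical and the SAD is zero.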
Abstract: Background: The use of micro-expression recognition to recognize human emotions is one of the most critical challenges in human-computer interaction applications. In recent years, cross-database micro-expression recognition (CDMER) has emerged as a significant challenge in micro-expression recognition and analysis. Because the training and testing data in CDMER come from different micro-expression databases, CDMER is more challenging than conventional micro-expression recognition. Methods: In this paper, an adaptive spatio-temporal attention neural network (ASTANN) using an attention mechanism is presented to address this challenge. To this end, the micro-expression databases SMIC and CASME II are first preprocessed using an optical flow approach, which extracts motion information among video frames that represents the discriminative features of micro-expressions. After preprocessing, a novel adaptive framework with a spatio-temporal attention module is designed to assign spatial and temporal weights that enhance the most discriminative features. The deep neural network then extracts the cross-domain feature, in which the second-order statistics of the sample features in the source domain are aligned with those in the target domain by minimizing the correlation alignment (CORAL) loss, so that the source and target databases share similar distributions. Results: To evaluate the performance of ASTANN, experiments were conducted on the SMIC and CASME II databases under the standard experimental evaluation protocol of CDMER. The experimental results demonstrate that ASTANN outperformed other methods in the relevant cross-database tasks. Conclusions: Extensive experiments were conducted on benchmark tasks, and the results show that ASTANN has superior performance compared with other approaches. This demonstrates the superiority of our method in solving the CDMER problem.
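To make the second-order alignment concrete, here is a minimal sketch of the CORAL loss in its standard published form: the squared Frobenius norm of the difference between the source and target feature covariance matrices, scaled by 1/(4d²), where d is the feature dimension. How ASTANN weights or schedules this loss is not stated in the abstract. The function names and the toy feature batches are assumptions for this example.

```python
def covariance(features):
    """Unbiased covariance matrix of a batch given as a list of feature rows."""
    n, d = len(features), len(features[0])
    if n < 2:
        raise ValueError("need at least two samples")
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[i] * r[k] for r in centered) / (n - 1) for k in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """CORAL loss: ||C_s - C_t||_F^2 / (4 d^2) for two feature batches."""
    d = len(source[0])
    if len(target[0]) != d:
        raise ValueError("source and target must share the feature dimension")
    cov_s, cov_t = covariance(source), covariance(target)
    frob_sq = sum(
        (cov_s[i][k] - cov_t[i][k]) ** 2 for i in range(d) for k in range(d)
    )
    return frob_sq / (4 * d * d)


# Toy batches (illustrative values only): variance lies on different axes.
src = [[0.0, 0.0], [2.0, 0.0]]
tgt = [[0.0, 0.0], [0.0, 2.0]]
loss = coral_loss(src, tgt)
```

Because the loss depends only on covariances, adding a constant offset to every target feature leaves it unchanged. It penalizes differences in second-order statistics, not differences in means.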
Funding: The National Natural Science Foundation of China under contract No. 62275228; the S&T Program of Hebei under contract Nos. 19273901D and 20373301D; the Hebei Natural Science Foundation under contract No. F2020203066.
Abstract: Marine life is very sensitive to changes in pH. Even slight changes can cause ecosystems to collapse. Therefore, understanding the future pH of seawater is of great significance for the protection of the marine environment. At present, methods for monitoring seawater pH are mature. However, effective methods for accurately predicting future changes are still lacking. To address this, a bidirectional gated recurrent neural network with multi-headed self-attention, based on improved complete ensemble empirical mode decomposition with adaptive noise combined with phase space reconstruction (ICPBGA), is proposed for seawater pH prediction. To verify the validity of this model, pH data from two monitoring sites in the coastal sea area of Beihai, China are used. The ICPBGA model is also compared with other strong models for predicting chaotic time series, with root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and coefficient of determination (R²) as performance evaluation indicators. The R² values of the ICPBGA model at Sites 1 and 2 are above 0.9, and its prediction errors are the smallest. The results show that the ICPBGA model has a wide range of applicability and the most satisfactory prediction effect. The prediction method in this paper can be further extended to predict other marine environmental indicators.
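The four evaluation indicators named above have standard definitions, sketched below in plain Python. The function name and the sample values are assumptions for illustration. The numbers are made up and are not measurements or predictions from the Beihai sites.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for two equal-length series."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("series must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when a true value is zero; pH readings are not.
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Illustrative values only (hypothetical observed and predicted pH).
observed = [8.0, 8.1, 8.2, 8.3]
predicted = [8.0, 8.2, 8.2, 8.2]
metrics = regression_metrics(observed, predicted)
```

RMSE, MAE and MAPE measure error, so lower is better. R² measures the share of variance explained, so values closer to 1 are better, which is why the abstract reports R² above 0.9 alongside the smallest errors.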
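Abstract: Existing cross-domain face liveness detection algorithms generalize poorly because their feature extraction is prone to overfitting and lacks feature aggregation. To address this problem, a single-side adversarial network algorithm for cross-domain face liveness detection is proposed. Grouped convolution fused with an improved inverted residual structure replaces ordinary convolution, reducing the number of network parameters while strengthening the representation of fine-grained facial features. An adaptive feature normalization module is introduced to emphasize image regions that carry face liveness information and de-emphasize irrelevant background regions, which effectively avoids overfitting to liveness information and strengthens liveness detection across different source domains. A channel attention module is introduced on the basis of NetVLAD. Serving as a branch of the feature aggregation network, it learns the semantic information of local facial features in different source domains and effectively improves the generalization of liveness classification across source domains. A network fusing the two modules is designed to improve the accuracy of cross-domain face liveness detection in unseen scenarios. Experimental results on the OULU-NPU, CASIA-FASD, MSU-MFSD, and Idiap Replay-Attack datasets show that the algorithm performs well on the cross-dataset tests O&C&M to I, O&C&I to M, I&C&M to O, and O&M&I to C. In particular, the performance metrics on O&C&I to M and O&M&I to C improve by 0.99 and 0.5 percentage points, respectively.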