Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61672466, 62011530130) and the Joint Fund of Zhejiang Provincial Natural Science Foundation (No. LSZ19F010001).
Abstract: Scale variation is a major challenge in multi-person pose estimation. In scenes where persons appear at various distances, models tend to perform better on larger-scale persons, while performance on smaller-scale persons often falls short of expectations. Balancing performance across persons of different scales is therefore difficult. This paper proposes a new multi-person pose estimation model, FSANet, to improve performance in complex scenes. The model uses the High-Resolution Network (HRNet) as its backbone and feeds the outputs of the last stage's four branches into the dilated convolution-based (DCB) module. The DCB module employs a parallel structure that incorporates dilated convolutions with different rates to expand the receptive field of each branch. Subsequently, the attention operation-based (AOB) module performs attention operations at both the branch and channel levels to enhance high-frequency features and reduce the influence of noise. Finally, predictions are made using the heatmap representation. The model can handle images with diverse scales and more complex semantic information. Experimental results show that FSANet achieves competitive results on the MSCOCO and MPII datasets, validating the effectiveness of the proposed approach.
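As a rough illustration of why dilated convolutions with different rates enlarge the receptive field, the sketch below computes the span covered by a dilated kernel and applies a one-dimensional dilated convolution. It is a minimal sketch, not the DCB module itself: the function names and the rates 1, 2, and 3 are assumptions chosen for illustration, since the abstract does not list the rates used.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation] for i, w in enumerate(kernel)))
    return out


# Hypothetical rates for three parallel branches (not taken from the paper).
rates = [1, 2, 3]
spans = {d: effective_kernel_size(3, d) for d in rates}
```

With a 3-tap kernel, the same number of weights covers 3, 5, and 7 input positions as the rate grows, which is the effect the parallel branches rely on.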
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFC1402003), the National Natural Science Foundation of China (No. 41671436), and the Innovation Project of LREIS (No. O88RAA01YA).
Abstract: Sanduao is an important sea-breeding bay in Fujian, South China, and holds a high economic status in aquaculture. Quickly and accurately obtaining information such as the distribution area, quantity, and aquaculture area is important for breeding area planning, production value estimation, ecological surveys, and storm surge prevention. However, as the aquaculture area expands, the seawater background becomes increasingly complex and spectral characteristics differ dramatically, making it difficult to delineate the aquaculture area. In this study, we applied a deep-learning Richer Convolutional Features (RCF) network model to a high-resolution GF-2 remote-sensing satellite image to extract the aquaculture area. We then used aquaculture density as an index to assess the vulnerability of aquaculture areas in Sanduao. The results demonstrate that this method does not require land and water to be separated in advance and achieves good extraction in areas with more sediment and waves, with an extraction accuracy above 93%, making it suitable for large-scale aquaculture area extraction. The vulnerability assessment indicates that aquaculture density in the eastern part of Sanduao is considerably high, reaching a higher vulnerability level than other parts.
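A density index of this kind can be sketched as the fraction of extracted aquaculture pixels per grid tile. The example below is only an assumed formulation: the tile size, the binary mask, and the function name `block_density` are hypothetical, and the abstract does not state how density was computed or which thresholds define the vulnerability levels.

```python
def block_density(mask, block):
    """Fraction of aquaculture pixels (value 1) in each block x block tile."""
    rows, cols = len(mask), len(mask[0])
    densities = []
    for r in range(0, rows, block):
        row_out = []
        for c in range(0, cols, block):
            # Collect the pixels of one tile, clipping at the image border.
            tile = [mask[i][j]
                    for i in range(r, min(r + block, rows))
                    for j in range(c, min(c + block, cols))]
            row_out.append(sum(tile) / len(tile))
        densities.append(row_out)
    return densities


# Hypothetical 4 x 4 extraction result: 1 = aquaculture, 0 = water.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
density = block_density(mask, 2)
```

Tiles with higher values would be candidates for a higher vulnerability level under whatever thresholds the assessment adopts.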
Funding: This research was funded by the College Student Innovation and Entrepreneurship Training Program (grant numbers 2021055Z and S202110082031) and the Special Project for Cultivating Scientific and Technological Innovation Ability of College and Middle School Students in Hebei Province (grant number 2021H011404).
Abstract: Capturing real facial expressions is the primary task in generating realistic three-dimensional animation of virtual characters. Because facial expressions are diverse and backgrounds are complex, facial landmarks recognized by existing strategies suffer from deviations and low accuracy. This paper therefore proposes a facial expression capture method based on a two-stage neural network that combines an improved multi-task cascaded convolutional network (MTCNN) with a high-resolution network. First, the convolution operation of the traditional MTCNN is improved. Face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution replaces the original convolutions in the second stage to enhance the network's feature extraction ability, which further rejects a large number of false candidates. The model locates faces and outputs more accurate facial candidate windows for better landmark recognition. The images cropped after face detection are then fed into the high-resolution network. Multi-scale feature fusion is realized by connecting multi-resolution streams in parallel, yielding rich high-resolution heatmaps of facial landmarks. Finally, changes in the recognized facial landmarks are tracked in real time. Expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, achieving synchronized facial expression animation. Extensive experiments on the WFLW database demonstrate the superiority of the proposed method in accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual spaces.
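Turning landmark heatmaps into coordinates is commonly done by taking the peak of each heatmap and scaling it back to image resolution. The sketch below shows that generic decoding step under stated assumptions: the stride of 4 and the function names are illustrative and are not taken from the paper, which may use a more refined decoding.

```python
def decode_heatmap(heatmap):
    """Return (x, y, score) of the highest-scoring cell in a 2-D heatmap."""
    best = (0, 0, heatmap[0][0])
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best[2]:
                best = (x, y, score)
    return best


def decode_landmarks(heatmaps, stride=4):
    """Map each landmark's heatmap peak back to input-image pixel coordinates.

    `stride` is the assumed ratio between input and heatmap resolution.
    """
    return [(x * stride, y * stride, s) for x, y, s in map(decode_heatmap, heatmaps)]


# Two hypothetical 2 x 2 heatmaps, one per landmark.
heatmaps = [
    [[0.1, 0.2], [0.9, 0.3]],
    [[0.0, 0.8], [0.1, 0.1]],
]
landmarks = decode_landmarks(heatmaps)
```

Tracking these coordinates from frame to frame gives the landmark changes from which expression parameters can be derived.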
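Abstract: To address the large parameter counts and high computational complexity of high-resolution human pose estimation networks, a lightweight Sandglass Coordinate Attention Network (SCANet) based on the High-Resolution Network (HRNet) is proposed for human pose estimation. First, the Sandglass module and the Coordinate Attention (CoordAttention) module are introduced. Building on these, two lightweight modules are constructed, the Sandglass Coordinate Attention bottleneck (SCAneck) module and the Sandglass Coordinate Attention basic (SCAblock) module, which capture long-range dependencies along the spatial directions of the feature map and precise positional information while reducing the model's parameter count and computational complexity. Experimental results show that, at the same image resolution and environment configuration, SCANet reduces the parameter count by 52.6% and the computational complexity by 60.6% relative to HRNet on the COCO (Common Objects in COntext) validation set; on the MPII (Max Planck Institute for Informatics) validation set, it reduces the parameter count and computational complexity by 52.6% and 61.1%, respectively. Compared with common human pose estimation networks such as the Stacked Hourglass Network (Hourglass), the Cascaded Pyramid Network (CPN), and SimpleBaseline, SCANet still predicts human keypoints with high accuracy while using fewer parameters and less computation.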
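The idea of capturing dependencies along each spatial direction can be illustrated with the directional pooling used by coordinate attention: a feature map is averaged along its width to give one descriptor per row, and along its height to give one per column. The sketch below is a simplification under stated assumptions: it handles a single channel, omits the learned convolutions of the real module, and the function names are hypothetical rather than SCANet's.

```python
import math


def directional_pool(feature):
    """Average one channel (an H x W list) along width and along height."""
    h, w = len(feature), len(feature[0])
    pooled_h = [sum(row) / w for row in feature]  # one value per row
    pooled_w = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]  # one per column
    return pooled_h, pooled_w


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def coordinate_gate(feature):
    """Reweight each position by gates derived from its row and column descriptors.

    Simplification: the learned transforms between pooling and gating are omitted.
    """
    pooled_h, pooled_w = directional_pool(feature)
    return [[feature[i][j] * sigmoid(pooled_h[i]) * sigmoid(pooled_w[j])
             for j in range(len(pooled_w))]
            for i in range(len(pooled_h))]


feature = [[1.0, 3.0], [5.0, 7.0]]
row_desc, col_desc = directional_pool(feature)
gated = coordinate_gate(feature)
```

Because each gate depends on a whole row or column, a position is influenced by distant positions along that direction while its own row and column indices are preserved.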
Funding: National Key Research and Development Program of China (No. 2017YFC0405806).
Abstract: Deep convolutional neural networks have made great progress in semantic segmentation. Because the geometry of the convolution kernel is fixed, standard convolutional neural networks have a limited ability to model geometric transformations. A deformable convolution is therefore introduced to enhance the adaptability of convolutional networks to spatial transformations. In addition, deep convolutional neural networks cannot adequately segment local objects at the output layer because of the pooling layers in the architecture. To overcome this shortcoming, the coarse segmentation predictions from the network's output layer are processed by fully connected conditional random fields to improve segmentation quality. The proposed method can easily be trained end-to-end using standard backpropagation. Finally, the proposed method is tested on the ISPRS dataset. The results show that it effectively overcomes the influence of the complex structure of the segmented objects and obtains state-of-the-art accuracy on the ISPRS Vaihingen 2D semantic labeling dataset.
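A deformable convolution relaxes the fixed kernel geometry by shifting each kernel tap by an offset and reading the input at the resulting fractional position with bilinear interpolation. The sketch below computes one output value of a 3 x 3 kernel in this way. It is a minimal illustration, not the paper's network: in practice the offsets are predicted by a learned layer, whereas here they are passed in as plain arguments, and the function names are assumptions.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2-D grid at a fractional (y, x); positions outside the grid count as 0."""
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0

    def at(i, j):
        inside = 0 <= i < len(image) and 0 <= j < len(image[0])
        return image[i][j] if inside else 0.0

    return ((1 - dy) * (1 - dx) * at(y0, x0) + (1 - dy) * dx * at(y0, x0 + 1)
            + dy * (1 - dx) * at(y0 + 1, x0) + dy * dx * at(y0 + 1, x0 + 1))


def deformable_response(image, center, weights, offsets):
    """One output value of a 3x3 kernel whose taps are shifted by per-tap offsets."""
    cy, cx = center
    grid = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]  # regular 3x3 layout
    return sum(w * bilinear_sample(image, cy + gy + oy, cx + gx + ox)
               for w, (gy, gx), (oy, ox) in zip(weights, grid, offsets))


image = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
zero_offsets = [(0.0, 0.0)] * 9
regular = deformable_response(image, (1, 1), [1.0] * 9, zero_offsets)
```

With all offsets at zero the result equals an ordinary 3 x 3 convolution at that position; non-zero offsets let the sampling pattern follow the shape of the object.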
Funding: Science and Technology Program of State Grid Corporation of China (No. 5211TZ1900S6).
Abstract: Production safety is of great significance to the development of enterprises and society. Because of the particular nature of the electric power environment, accidents often cause great losses. It is therefore important to improve safety supervision and protection in the electric power environment. In this paper, we simulate an actual electric power operation scenario using monitoring equipment and propose a real-time method for detecting illegal actions based on human body key points, so that safe behavior can be enforced in real time. In this method, the human body key points in video frames are first extracted by the high-resolution network and then classified in real time by a spatial-temporal graph convolutional network. Experimental results show that the method can effectively detect illegal actions in the simulated scene.
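The spatial half of a graph convolution over skeleton key points can be sketched as neighbourhood averaging along the bones of the skeleton. The example below shows only that aggregation step for a single frame, under stated assumptions: the three-joint skeleton, the row normalisation, and the function names are illustrative, and the learned weights and the temporal convolution of a full spatial-temporal graph convolutional network are omitted.

```python
def normalized_adjacency(num_joints, bones):
    """Row-normalised adjacency with self-loops for an undirected skeleton graph."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)] for i in range(num_joints)]
    for a, b in bones:
        adj[a][b] = adj[b][a] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def spatial_aggregate(adj, joints):
    """Replace each joint's feature vector by the mean over itself and its neighbours."""
    dim = len(joints[0])
    return [[sum(adj[i][k] * joints[k][d] for k in range(len(joints))) for d in range(dim)]
            for i in range(len(joints))]


# Hypothetical chain of three joints with (x, y) coordinates as features.
bones = [(0, 1), (1, 2)]
joints = [[0.0, 0.0], [2.0, 0.0], [4.0, 2.0]]
adj = normalized_adjacency(3, bones)
mixed = spatial_aggregate(adj, joints)
```

Stacking such steps across consecutive frames is what lets a classifier reason about how the pose changes over time.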
Funding: National Natural Science Foundation of China (No. 61975015) and the Research and Innovation Project for Graduate Students at Zhongyuan University of Technology (No. YKY2024ZK14).
Abstract: Human posture estimation is a prominent research topic in human-computer interaction, motion recognition, and other intelligent applications. However, the high accuracy in key point localization that these applications require is at odds with the low detection accuracy of human posture detection models in practical scenarios. To address this issue, a human pose estimation network called AT-HRNet is proposed, which combines convolutional self-attention and cross-dimensional feature transformation. AT-HRNet adaptively captures significant feature information from various regions and aggregates it through convolutional operations within the local receptive field. The residual structures TripNeck and TripBlock of the high-resolution network are designed to further refine the key point locations; in them, the attention weight is adjusted by a cross-dimensional interaction to obtain more features. To validate the effectiveness of this network, AT-HRNet was evaluated on the COCO2017 dataset. The results show that AT-HRNet outperforms HRNet, improving mAP by 3.2%, AP75 by 4.0%, and AP^M by 3.9%. This suggests that AT-HRNet can offer more beneficial solutions for human posture estimation.
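The AP figures reported for COCO keypoint evaluation are built on object keypoint similarity (OKS); AP75, for example, counts a prediction as a match only when its OKS reaches 0.75. The sketch below computes OKS for one person in the COCO style. The keypoint coordinates, the area, and the per-keypoint `sigmas` are hypothetical values chosen for illustration, not figures from the paper or the official constants.

```python
import math


def object_keypoint_similarity(pred, truth, visible, area, sigmas):
    """COCO-style OKS between predicted and ground-truth keypoints of one person."""
    total, count = 0.0, 0
    for (px, py), (tx, ty), v, sigma in zip(pred, truth, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        dist_sq = (px - tx) ** 2 + (py - ty) ** 2
        k_sq = (2 * sigma) ** 2  # per-keypoint tolerance
        total += math.exp(-dist_sq / (2 * area * k_sq))
        count += 1
    return total / count if count else 0.0


# Hypothetical person with three keypoints, the last one unlabeled.
truth = [(10.0, 10.0), (20.0, 10.0), (15.0, 30.0)]
visible = [2, 2, 0]
sigmas = [0.05, 0.05, 0.1]
pred = [(12.0, 10.0), (20.0, 10.0), (0.0, 0.0)]
oks = object_keypoint_similarity(pred, truth, visible, area=400.0, sigmas=sigmas)
```

Because the distance is normalised by the object area, the same pixel error costs more on a small person than on a large one, which is why scale-specific scores such as AP^M are reported separately.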
Funding: Supported by the National Natural Science Foundation of China (No. 52108010).
Abstract: Increasingly mature computer vision (CV) technology, represented by convolutional neural networks (CNNs), and the availability of high-resolution remote sensing images (HR-RSIs) provide opportunities to accurately measure the evolution of natural and artificial environments on Earth at a large scale. Based on the advanced CNN method high-resolution net (HRNet) and multi-temporal HR-RSIs, a framework is proposed for monitoring a green evolution of courtyard buildings characterized by their courtyards being roofed (CBR). The proposed framework consists of an expert module focusing on scene analysis, a CV module for automatic detection, an evaluation module containing thresholds, and an output module for data analysis. Using this framework, changes in the adoption of different CBR technologies (CBRTs), including light-translucent CBRTs (LT-CBRTs) and non-light-translucent CBRTs (NLT-CBRTs), in 24 villages in southern Hebei were identified from 2007 to 2021. The evolution of CBRTs followed an inverse S-curve, and villages differed in evolution stage, adoption ratio, and development speed. LT-CBRTs are the dominant type but are being replaced and surpassed by NLT-CBRTs in some villages, reflecting villages' different preferences for technology type. The proposed research framework provides a reference for monitoring the evolution of vernacular buildings, and the identified evolution laws make it possible to trace and predict the adoption of different CBRTs in a particular village. This work lays a foundation for future exploration of the occurrence and development mechanism of the CBR phenomenon and provides an important reference for the optimization and promotion of CBRTs.
Funding: Supported by the Shanxi Hundred People Plan of China.
Abstract: Single image super-resolution has attracted increasing attention and has a wide range of applications in satellite imaging, medical imaging, computer vision, security surveillance imaging, remote sensing, object detection, and recognition. Recently, deep learning techniques have emerged and blossomed, producing state-of-the-art results in many domains. Owing to their capability in feature extraction and mapping, they are very helpful for predicting high-frequency details lost in low-resolution images. In this paper, we give an overview of recent advances in deep learning-based models and methods that have been applied to single image super-resolution tasks. We also summarize, compare, and discuss various past and present models for comprehensive understanding, and finally outline open problems and possible directions for future research.
Funding: Supported by Aalto University and the Henan Provincial Key Laboratory of Hydrosphere and Watershed Water Security. Additional support was provided by the National Natural Science Foundation of China (42361144001, 72304112, 72074136, and 72104129) and the Key Program of International Cooperation, Bureau of International Cooperation, the Chinese Academy of Sciences (131551KYSB20210030).
Abstract: Given that it was a once-in-a-century emergency event, the confinement measures related to the coronavirus disease 2019 (COVID-19) pandemic caused diverse disruptions and changes in life and work patterns. These changes significantly affected water consumption both during and after the pandemic, with direct and indirect consequences for biodiversity. However, a holistic evaluation of these responses has been lacking. Here, we propose a novel framework to study the impacts of this unique global emergency event by embedding an environmentally extended, supply-constrained global multi-regional input-output (MRIO) model into the drivers-pressure-state-impact-response (DPSIR) framework. This framework allowed us to develop scenarios related to COVID-19 confinement measures to quantify country- and sector-specific changes in freshwater consumption and the associated changes in biodiversity for the period 2020-2025. The results suggest progressively diminishing impacts due to the rollout of COVID-19 vaccines and the socio-economic system's self-adjustment to the new normal. In 2020, the confinement measures were estimated to decrease global water consumption by about 5.7% on average across all scenarios compared with the baseline level with no confinement measures. This decrease is in turn estimated to reduce the related pressure on biodiversity by around 5%. Given the interdependencies and interactions across global supply chains, even countries and sectors that were not directly affected by the COVID-19 shocks experienced significant impacts: our results indicate that supply chain propagation contributed, on average, 79% of the total estimated decrease in water consumption and 84% of the reduction in biodiversity loss. Our study demonstrates that the MRIO-enhanced DPSIR framework can help quantify resource pressures and the resultant environmental impacts across supply chains when facing a global emergency event. Further, we recommend the development of more locally based water conservation measures, to mitigate the effects of trade disruptions, and the explicit inclusion of water resources in post-pandemic recovery schemes. In addition, innovations that help conserve natural resources are essential for maintaining environmental gains in the post-pandemic world.
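How a shock in one sector propagates through supply chains to water use can be illustrated with a basic environmentally extended input-output calculation. The sketch below uses the standard demand-driven Leontief relation x = (I - A)^-1 y for two sectors, which is a simpler relative of the supply-constrained MRIO model named in the abstract, not that model itself. The coefficient matrix, final demand, and water intensities are hypothetical numbers.

```python
def leontief_output(a, final_demand):
    """Solve x = (I - A)^-1 y for a two-sector economy using the 2x2 closed form."""
    m = [[1 - a[0][0], -a[0][1]], [-a[1][0], 1 - a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]
    return [inv[i][0] * final_demand[0] + inv[i][1] * final_demand[1] for i in range(2)]


def water_use(a, final_demand, water_intensity):
    """Total water use: per-unit water intensity times each sector's total output."""
    output = leontief_output(a, final_demand)
    return sum(w * x for w, x in zip(water_intensity, output))


# Hypothetical technical coefficients, final demand, and water intensities.
a = [[0.2, 0.3], [0.1, 0.4]]
intensity = [0.5, 2.0]
baseline = water_use(a, [100.0, 50.0], intensity)
shocked = water_use(a, [90.0, 50.0], intensity)  # demand for sector 0 falls by 10%
relative_change = (shocked - baseline) / baseline
```

Although only sector 0 is shocked, the output of sector 1 also falls because it supplies inputs to sector 0, which is the kind of indirect effect the study attributes to supply chain propagation.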