Funding: Supported by the Medical Special Cultivation Project of Anhui University of Science and Technology (Grant No. YZ2023H2B013), the Anhui Provincial Key Research and Development Project (Grant No. 2022i01020015), and the Open Project of Key Laboratory of Conveyance Equipment (East China Jiaotong University), Ministry of Education (KLCE2022-01).
Abstract: Human pose is estimated using a transformer-based, multi-branch, multidimensionally guided three-dimensional (3D) method that accounts for self-occlusion, ill-posedness, and the lack of depth data in per-frame 3D pose estimation when mapping from two-dimensional (2D) to 3D. First, by examining the relationship between the movements of different bones in the human body, four virtual skeletons are proposed to strengthen the cyclic constraints on limb joints. Then, multiple parameters describing the skeleton are fused and projected into a high-dimensional space. A multi-branch network extracts motion features between bones as well as overall motion features to mitigate drift error in the estimates. Furthermore, the estimated relative depth is projected into 3D space and the error is calculated against real 3D data, forming a loss function together with the relative depth error. This article adopts the average joint pixel error as the primary performance metric. Compared with the benchmark approach, the results indicate an improvement in average precision of 1.8 mm on the Human3.6M dataset.
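A minimal sketch of how a per-joint error of this kind is typically averaged. It assumes the metric is the mean Euclidean distance between predicted and ground-truth 3D joints, which is common practice on Human3.6M but is not confirmed by the abstract. The function name `mean_joint_error` and the two-joint pose are hypothetical.

```python
import math


def mean_joint_error(pred, truth):
    """Mean Euclidean distance between matching 3D joints (same units as the input)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


# Hypothetical two-joint pose, coordinates in millimetres (illustration only).
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mean_joint_error(pred, truth)
print(error_mm)
```

Under this assumption, a 1.8 mm improvement would mean that the average of these per-joint distances drops by 1.8 mm relative to the baseline.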
Funding: Supported by the National Science Foundation of China (Grant Nos. 52068049 and 51908266), the Science Fund for Distinguished Young Scholars of Gansu Province (No. 21JR7RA267), and the Hongliu Outstanding Young Talents Program of Lanzhou University of Technology.
Abstract: The accumulation of defects on wind turbine blade surfaces can lead to irreversible damage and degrade the aerodynamic performance of the blades. To address the challenge of detecting and quantifying these surface defects, a blade surface defect detection and quantification method based on an improved DeepLabv3+ deep learning model is proposed. First, to address the limited robustness of the original DeepLabv3+ model, an improved defect detection method is proposed that uses MobileNetV2 as the backbone feature extraction network. Second, by using pre-trained weights from transfer learning together with a freeze training strategy, both the training speed and the training accuracy of the model are significantly improved. Finally, based on the segmented blade surface defect images, a method for quantifying blade defects is proposed; it is combined with image stitching algorithms to achieve overall quantification and risk assessment of the entire blade. Test results show that the improved DeepLabv3+ model reduces training time by approximately 43.03% compared with the original model, while achieving mAP and MIoU values of 96.87% and 96.93%, respectively. Moreover, it is robust in detecting different surface defects on blades against different backgrounds. The defect quantification method enables precise quantification of different defects and facilitates assessment of the risk levels associated with defect measurements across the entire blade. The method enables non-contact, long-distance, high-precision detection and quantification of surface defects, providing a reference for assessing surface defects on wind turbine blades.
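For readers unfamiliar with the MIoU figure quoted above, the following sketch shows the usual definition: the intersection over union is computed per class and then averaged. The flat label lists and the function name `mean_iou` are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union across classes that appear in pred or truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in either label list")
    return sum(ious) / len(ious)


# Hypothetical flattened 4-pixel masks: 0 = background, 1 = defect.
pred = [0, 0, 1, 1]
truth = [0, 1, 1, 1]
miou = mean_iou(pred, truth, num_classes=2)
print(round(miou, 4))
```

Here the background class scores 1/2 and the defect class 2/3, so the mean is 7/12.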
Abstract: This paper deals with a binocular 3-D computer vision system based on the hierarchical matching of edge features. The Frei and Chen operator is used to extract the edges. The average gradients of an image obtained by two isotropic operators are non-uniformly quantized and thresholded by angle. Edge features are extracted after passing through a pre-emphasis transfer function that equalizes the effect of noise. Binary edge images are decomposed into a pyramid structure, which is stored and searched using Illiffe's location method. Corresponding points are used to determine the range data by triangulation based on an improved Trivedi's formula. In calibration, the authors set the optical axes of the two cameras parallel to simplify the calculation. A third-order Householder transform is used to solve the compatible coupled equations.
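With parallel optical axes, range recovery reduces to the textbook disparity relation Z = f·B/d. The sketch below uses that standard relation, not the improved Trivedi's formula, whose details the abstract does not give. The focal length, baseline, and pixel coordinates are hypothetical values.

```python
def triangulate_parallel(x_left, x_right, y, focal_px, baseline):
    """Recover (X, Y, Z) from a matched point pair in a parallel-axis stereo rig.

    Image coordinates are in pixels relative to the principal point;
    the result is in the units of `baseline`.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = focal_px * baseline / disparity  # depth from similar triangles
    return (x_left * z / focal_px, y * z / focal_px, z)


# Hypothetical rig: 800 px focal length, 0.1 m baseline.
point = triangulate_parallel(x_left=120.0, x_right=80.0, y=40.0,
                             focal_px=800.0, baseline=0.1)
print(point)
```

Setting the axes parallel is what makes the disparity purely horizontal, which is why a single subtraction suffices in this sketch.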
Abstract: This paper theoretically analyzes the coordinate frames of a 3D vision scanning system, establishes a mathematical model of the scanning process, and derives both the relationship between the general non-orthonormal sensor coordinate system and the machine coordinate system and the coordinate transformation matrix for the extrinsic calibration of the system.
Abstract: Single-camera mobile-vision coordinate measurement is one of the primary methods of 3D-coordinate vision measurement, and coded targets play an important role in such systems. A multifunctional coded target and its recognition algorithm are developed, which enable automatic matching of feature points, calculation of the camera's initial exterior orientation, and a space scale factor constraint in the measurement system. The uniqueness and scalability of the coding are guaranteed by the rational arrangement of code bits. Coded targets are recognized through a cross-ratio invariance constraint, spatial coordinate transformation of feature points based on a spatial pose estimation algorithm, recognition of code bits, and computation of coding values. The experimental results demonstrate the uniqueness of the coding form and the reliability of recognition.
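The cross-ratio constraint relies on the fact that the cross-ratio of four collinear points is unchanged by a projective transformation, so it can be checked in the image without knowing the camera pose. The sketch below illustrates that invariance on a 1D line; the point positions and the transformation coefficients are hypothetical, and the abstract does not specify how the authors apply the constraint.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (AC * BD) / (BC * AD) of four collinear points given as 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x):
    """A hypothetical 1D projective transformation x -> (2x + 1) / (x + 3)."""
    return (2.0 * x + 1.0) / (x + 3.0)


# Hypothetical positions of four feature points along a line on the target.
points = [0.0, 1.0, 2.0, 4.0]
before = cross_ratio(*points)
after = cross_ratio(*(projective_map(x) for x in points))
print(before, after)
```

Both values equal 1.5 up to floating-point rounding, which is the property that lets a detector accept or reject candidate point groups by comparing against the designed ratio.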
Funding: Support from the University-Private Matching Fund (UniPRIMA) from the Research Management Centre, UniMAP, and Walta Engineering Sdn. Bhd.
Abstract: The fruit industry is one of the largest businesses in Malaysia, and most fruits go through a peeling process well before becoming the final product, such as bottled juice or canned slices. Current industrial fruit peeling techniques are passive and inefficient: they cut away parts of the pulp along with the peel, leading to losses. To avoid this, a multi-axis CNC fruit peeler can be used to peel the outer layer precisely, guided by a 3D virtual model of the fruit. In this work, a new cost-effective 3D image reconstruction method was developed that converts 36 fruit images, captured by a normal RGB camera at every 10 degrees of fruit rotation about a fixed axis, into a 3D model. The point cloud data extracted with edge detection were passed to Blender 3D software for meshing with different approaches. The vertical link frame meshing method developed in this research showed qualitative similarity between the output and the scanned fruit, with a processing time of less than 50 seconds.
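One way to picture the 36-view capture is as a set of edge profiles swept around the rotation axis. The sketch below assumes an orthographic view, a vertical rotation axis, and profiles already expressed as (radius, height) pairs; none of these details are stated in the abstract, and `profiles_to_point_cloud` is a hypothetical helper, not the authors' pipeline.

```python
import math


def profiles_to_point_cloud(profiles, step_deg=10.0):
    """Place each view's (radius, height) edge points at that view's rotation angle."""
    cloud = []
    for k, profile in enumerate(profiles):
        theta = math.radians(k * step_deg)  # view k was taken after k * step_deg of rotation
        for r, z in profile:
            cloud.append((r * math.cos(theta), r * math.sin(theta), z))
    return cloud


# Hypothetical edge profile reused for all 36 views (a perfectly symmetric fruit).
profile = [(1.0, 0.0), (2.0, 1.0), (1.0, 2.0)]
cloud = profiles_to_point_cloud([profile] * 36)
print(len(cloud))
```

With 36 views of 3 edge points each, the cloud holds 108 points, which a mesher can then connect ring by ring.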
Funding: Supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-2018-0-01426), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation). In addition, supported by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University. This work has also been supported by the Princess Nourah bint Abdulrahman University Researchers Supporting Project (Number PNURSP2022R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. Also supported by the Taif University Researchers Supporting Project (Number TURSP-2020/115), Taif University, Taif, Saudi Arabia.
Abstract: In the past two decades, there has been a great deal of work on computer vision technology, spanning tasks from basic filtering to image classification. The major research areas of this field include object detection and object recognition. Moreover, wireless communication technologies are now widely adopted and have changed the way education is delivered. The traditional system has changed in several phases. First, the blackboard has been replaced by projectors and other digital screens so that people can understand concepts better through visualization. Second, computer labs in schools are now more common than ever. Third, online classes have become a reality. However, moving to online education or e-learning is not without challenges. Perceiving three-dimensional (3D) structure from a two-dimensional (2D) image is one of the demanding tasks: humans perceive it easily, but building 3D models manually in software is time-consuming. Therefore, we propose a method for improving the efficiency of e-learning. Our proposed system consists of two-and-a-half-dimensional (2.5D) feature extraction using machine learning and image processing. These features are then used to generate a 3D mesh using an ellipsoidal deformation method. After that, 3D bounding box estimation is applied. Our results show that there is a need to move to 3D virtual reality (VR) with haptic sensors in the field of e-learning for a better understanding of real-world objects, giving people more information than traditional or simple online education tools. We compare our results with the ShapeNet dataset to check the accuracy of our proposed method. Our proposed system achieved an accuracy of 90.77% on the plane class, 85.72% on the chair class, and 72.14% on the car class. The mean accuracy of our method is 70.89%.
Funding: Supported by the National Key R&D Program of China (Grant Nos. 2020YFB1709901 and 2020YFB1709904), the National Natural Science Foundation of China (Grant Nos. 51975495 and 51905460), the Guangdong Provincial Basic and Applied Basic Research Foundation (Grant No. 2021A1515012286), and the Guiding Funds of Central Government for Supporting the Development of the Local Science and Technology (Grant No. 2022L3049).
Abstract: Fast and accurate measurement of the volume of earthmoving materials is of great significance for the real-time evaluation of loader operation efficiency and the realization of autonomous operation. Existing volume measurement methods, such as total station-based methods, cannot measure the volume in real time, while the bucket-based method suffers from poor universality. In this study, a fast method for estimating a loader's shovel load volume by 3D reconstruction of material piles is proposed. First, a dense stereo matching method (QORB–MAPM) is proposed by integrating the improved quadtree ORB algorithm (QORB) and the maximum a posteriori probability model (MAPM), which achieves fast matching of feature points and dense 3D reconstruction of material piles. Second, the 3D point cloud models of the material piles before and after shoveling are registered and segmented to obtain the 3D point cloud model of the shoveling area, and the Alpha-shape algorithm based on Delaunay triangulation is used to estimate the volume of that model. Finally, a shovel loading volume measurement experiment was conducted under loose-soil working conditions. The results show that the proposed shovel loading volume estimation method (QORB–MAPM VE) achieves higher estimation accuracy and shorter calculation time in volume estimation and bucket fill factor estimation, and that it has significant theoretical research and engineering application value.
Funding: Supported by the Key Project of National Natural Science Foundation of China-Civil Aviation Joint Fund under Grant No. U2033212.
Abstract: Mortar pumpability is essential in the construction industry, but estimating it manually requires much labor and often causes material waste. This paper proposes an effective method that combines a 3-dimensional convolutional neural network (3D CNN) with a 2-dimensional convolutional long short-term memory network (ConvLSTM2D) to classify mortar pumpability automatically. Experimental results show that, on a dataset built by collecting the corresponding mortar image sequences, the proposed model reaches an accuracy of 100% with fast convergence. This work demonstrates the feasibility of using computer vision and deep learning for mortar pumpability classification.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1G1A1099559).
Abstract: Industries and communities worldwide are currently concerned with building, expanding, and exploring the assets and resources found in the oceans and seas. More precisely, for stock assessment, archaeology, and surveillance, several cameras are installed undersea to collect videos. However, these large videos require a lot of time and memory to process in order to extract relevant information. Hence, an accurate and efficient automated system is needed to replace this manual video assessment procedure. From this perspective, we present a complete framework for video summarization and object detection in underwater videos. We employ a perceived motion energy (PME) method to first extract the keyframes, followed by an object detection model, namely YoloV3, to perform object detection in underwater videos. The issues of blurriness and low contrast in underwater images are also addressed in the presented approach by applying an image enhancement method. Furthermore, the suggested framework for underwater video summarization and object detection has been evaluated on the publicly available Brackish dataset. The proposed framework shows good performance and can therefore assist marine researchers and scientists in the fields of underwater archaeology, stock assessment, and surveillance.
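Abstract: EDGE is a data-enhanced mobile communication technology based on GSM/GPRS networks, commonly referred to as 2.75G. In 2005, the once-neglected EDGE won over 24 operators in succession, including Telefonica Movistar, Orange, and Brasil Telecom GSM. As of July, EDGE had been taken up by 170 operators worldwide. However, calls for 3G in the Chinese market are growing louder by the day, and the industry has studied 3G thoroughly, from standards to products to licenses. The Chinese mobile market is therefore different for EDGE than the markets of other countries. Agere Systems, a world-renowned supplier of complete mobile communication solutions, has now launched its X115 solution based on the Vision architecture, expressing Agere Systems' view of the mobile market.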