Deep learning, especially through convolutional neural networks (CNNs) such as the U-Net 3D model, has revolutionized fault identification from seismic data, representing a significant leap over traditional methods. Our review traces the evolution of CNNs, emphasizing the adaptation and capabilities of the U-Net 3D model in automating seismic fault delineation with unprecedented accuracy. We find: 1) The transition from basic neural networks to sophisticated CNNs has enabled remarkable advancements in image recognition, which are directly applicable to analyzing seismic data. The U-Net 3D model, with its innovative architecture, exemplifies this progress by providing a method for detailed and accurate fault detection with reduced manual interpretation bias. 2) The U-Net 3D model has demonstrated its superiority over traditional fault identification methods in several key areas: it has enhanced interpretation accuracy, increased operational efficiency, and reduced the subjectivity of manual methods. 3) Despite these achievements, challenges remain, such as the need for effective data preprocessing, the acquisition of high-quality annotated datasets, and model generalization across different geological conditions. Future research should therefore focus on developing more complex network architectures and innovative training strategies to further refine fault identification performance. Our findings confirm the transformative potential of deep learning, particularly CNNs like the U-Net 3D model, in the geosciences, and advocate for its broader integration to revolutionize geological exploration and seismic analysis.
Esophageal disease is a common disorder of the digestive system that can severely affect the quality of life and prognosis of patients. Esophageal stenting is an effective treatment that has been widely used in clinical practice. However, esophageal stents of different types and parameters have varying adaptability and effectiveness for patients, and they need to be individually selected according to the patient’s specific situation. The purpose of this study was to provide a reference for clinical doctors to choose suitable esophageal stents. We used 3D printing technology to fabricate esophageal stents with different ratios of thermoplastic polyurethane (TPU)/poly-ε-caprolactone (PCL) polymer, and established an artificial neural network model that could predict the radial force of esophageal stents based on the TPU content, the PCL content, and the printing parameters. We selected three optimal ratios for mechanical performance tests and evaluated the biomechanical effects of stents with different ratios on esophageal implantation, swallowing, and stent migration processes through finite element numerical simulation and in vitro simulation tests. The results showed that polymer stents with different ratios had different mechanical properties, affecting the effectiveness of stent expansion treatment and the possibility of postoperative complications of stent implantation.
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further augment the weights of crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
When checking ice shape calculation software, its accuracy is judged based on the proximity between the calculated ice shape and the typical test ice shape. Therefore, determining the typical test ice shape becomes the key task of icing wind tunnel tests. In the icing wind tunnel test of the tail wing model of a large amphibious aircraft, in order to obtain an accurate typical test ice shape, the Romer Absolute Scanner is used to obtain the 3D point cloud data of the ice shape on the tail wing model. Then, the batch-learning self-organizing map (BLSOM) neural network is used to obtain the 2D average ice shape along the model direction based on the 3D point cloud data of the ice shape, while its tolerance band is calculated using the probabilistic statistical method. The results show that the combination of the 2D average ice shape and its tolerance band can represent the 3D characteristics of the test ice shape effectively, and can be used as the typical test ice shape for comparative analysis with the calculated ice shape.
In this paper, the complete process of constructing a 3D digital core with a fully convolutional neural network is described in detail. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone based on a small number of CT images. The Hamming distance, together with the Minkowski functions for porosity, average volume specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction, is used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functions and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but can also restore the real characteristics of the pore distribution in sandstone, which offers a new way to characterize the internal microstructure of rocks.
Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying using visual information, primarily lip movements. In this study, we created a custom dataset for Indian English linguistics and categorized it into three main categories: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using the mel-frequency cepstral coefficient, and classification was performed using a one-dimensional convolutional neural network. Visual feature extraction uses Dlib, and visual speech is then classified using a long short-term memory (LSTM) recurrent neural network. Finally, integration was performed using a deep convolutional network. The audio speech of Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, using testing data from 200 epochs. The training accuracy for visual speech recognition using the Indian English dataset was 77.48% and the test accuracy was 76.19% using 60 epochs. After integration, the accuracies of audiovisual speech recognition using the Indian English dataset for training and testing were 94.67% and 91.75%, respectively.
In this research, a method called ANNMG is presented to integrate Artificial Neural Networks and Geostatistics for optimum mineral reserve evaluation. The name ANNMG simply means Artificial Neural Network Model integrated with Geostatistics. In this procedure, the Artificial Neural Network was trained, tested, and validated using assay values obtained from exploratory drillholes. Next, the validated model was used to generalize mineral grades at known and unknown sampled locations inside the drilling region, respectively. Finally, the reproduced and generalized assay values were combined and fed to geostatistics in order to develop a geological 3D block model. The regression analysis revealed that the predicted sample grades were in close proximity to the actual sample grades. The generalized grades from the ANNMG show that this process could be used to complement exploration activities, thereby reducing drilling requirements. It could also be an effective mineral reserve evaluation method that could produce an optimum block model for mine design.
In robot-assisted surgery projects, researchers should be able to perform fast 3D reconstruction. Usually, 2D images acquired with common diagnostic equipment such as UT, CT, and MRI are not sufficient or complete enough for an accurate 3D reconstruction. Some interpolation methods exist for approximating non-valued voxels, but they consume a large amount of execution time. A novel algorithm based on the generalized regression neural network (GRNN) is introduced, which can interpolate unknown voxels quickly and reliably. The GRNN interpolation is used to produce new 2D images between each two succeeding ultrasonic images. It is shown that combining GRNN with the image distance transformation can produce higher-quality 3D shapes. The results of this method are compared in practice with other interpolation methods. The comparison shows that this method can decrease the overall time consumption of online 3D reconstruction.
The aim of this study is to develop a reliable method to determine optical constants for 3D-nanonetwork Si thin films manufactured using a pulsed-laser ablation technique, a method that can also be applied to other materials synthesized by this technique. An analytical method was introduced to calculate optical constants from reflectance and transmittance spectra. Optical band gaps for this novel material and other important insights into its physical properties were derived from the optical constants. The existing optimization methods described in the literature were found to be complex and prone to errors when determining optical constants of opaque materials where only reflectance data is available. A supervised deep learning algorithm was developed to accurately predict optical constants from the reflectance spectrum alone. The hybrid method introduced in this study proved to be effective, with an accuracy of 95%.
Construction 3D printing is changing the construction industry, but because of its immaturity, many problems remain to be solved. One of the major problems is the study of materials for construction 3D printing. Because printed buildings are very different from traditional buildings, there are special requirements for printing materials. Based on environmental and cost considerations, recycled concrete is a well-suited choice of printing material. In order to study and develop construction 3D printing materials, it is necessary to predict their properties. As one of the most effective artificial intelligence algorithms, the artificial neural network can deal with multi-parameter and nonlinear problems, and it can provide a useful reference for predicting the performance of recycled concrete for 3D printing. However, since there are many types of neural networks and many parameters, it is difficult to select the optimal neural network with excellent prediction performance. In this paper, by comparing different types of neural networks and statistically analyzing the distribution of the root-mean-square error (RMSE) and the coefficient of determination (R2) of these neural networks, we determine the best-performing one among four neural networks and select it to predict the performance of 3D printing concrete.
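The two selection metrics named in the abstract above, RMSE and the coefficient of determination, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code; the function names and the sample strength values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted values (illustrative only).
measured = [30.0, 35.0, 40.0, 45.0]
predicted = [31.0, 34.0, 41.0, 44.0]

print(rmse(measured, predicted))       # lower is better
print(r_squared(measured, predicted))  # closer to 1 is better
```

When comparing candidate networks, a lower RMSE together with an R2 closer to 1 on held-out data would indicate the better predictor.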
In this paper, a classification method based on neural networks is presented for the recognition of 3D objects. The objective of this paper is to classify a query object against objects in a database, which leads to recognition of the former. The 3D objects in this database are transformations of other objects by one element of the overall transformation set. The set of transformations considered in this work is the general affine group.
A new 3D surface contouring and ranging system based on digital fringe projection and the phase-shifting technique is presented. Using the phase-shifting technique, a point cloud with high spatial resolution and limited accuracy can be generated. Stereo-pair images obtained from two cameras can be used to compute the 3D world coordinates of a point using the traditional active triangulation approach, yet the camera calibration is crucial. A neural network is a well-known approach to approximating a nonlinear system without an explicit physical model; in this work, it is used to train the stereo vision application system to calculate 3D world coordinates so that the camera calibration can be bypassed. The training set for the neural network consists of a variety of stereo-pair images and the corresponding 3D world coordinates. The picture element correspondence problem is solved by using projected color-coded fringes with different orientations. Color imbalance is completely eliminated by the new color-coded method. Once the high-accuracy correspondence of 2D images with 3D points is acquired, a high-precision 3D point cloud can be recognized by the well-trained net. The obvious advantage of this approach is that high spatial resolution can be obtained by the phase-shifting technique, and high-accuracy 3D object point coordinates are achieved by the well-trained net, which is independent of the camera model and works for any type of camera. Experiments verified the performance of the method.
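To address the problems that convolutional neural networks face in hyperspectral image feature extraction and classification, namely insufficient extraction of spatial-spectral features and the large parameter count and high computational complexity caused by too many network layers, a lightweight convolutional model is proposed that combines a fast three-dimensional convolutional neural network (3D-CNN) with depthwise separable convolution (DSC). The method first applies incremental principal component analysis (IPCA) to reduce the dimensionality of the input data as a preprocessing step. Next, the pixels fed to the model are split into small overlapping 3D patches, ground labels are formed on these patches based on the center pixel, and 3D kernels are used for convolution to produce continuous 3D feature maps that preserve the spatial-spectral features. The 3D-CNN extracts spatial and spectral features simultaneously, and depthwise separable convolution is then added to the 3D convolution to extract spatial features again, which enriches the spatial-spectral features while reducing the number of parameters, thereby reducing computation time; classification accuracy also improves. The proposed model is validated on the public Indian Pines, Salinas Scene, and University of Pavia datasets and compared with other classical classification methods. The experimental results show that the method not only greatly reduces the number of learnable parameters and lowers model complexity, but also delivers good classification performance, with overall accuracy (OA), average accuracy (AA), and the Kappa coefficient all reaching above 99%.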
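The parameter saving attributed to depthwise separable convolution in the abstract above can be illustrated with a simple count. The sketch below assumes a generic layer (bias terms ignored) and hypothetical kernel and channel sizes; it is not the architecture from the paper.

```python
def conv3d_params(kernel, in_ch, out_ch):
    """Weights in a standard 3D convolution (bias ignored)."""
    kd, kh, kw = kernel
    return kd * kh * kw * in_ch * out_ch


def separable_conv3d_params(kernel, in_ch, out_ch):
    """Weights in a depthwise separable convolution (bias ignored):
    one spatial kernel per input channel, then a 1x1x1 pointwise mix."""
    kd, kh, kw = kernel
    depthwise = kd * kh * kw * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise


# Hypothetical layer: 3x3x3 kernel, 32 input channels, 64 output channels.
standard = conv3d_params((3, 3, 3), 32, 64)
separable = separable_conv3d_params((3, 3, 3), 32, 64)
print(standard, separable, standard / separable)
```

For this hypothetical layer, the separable form needs roughly one nineteenth of the weights, which is the kind of reduction a lightweight model relies on.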
AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. In order to solve the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function based on the 2D fully convolutional network were modified. For this network, the correlations of corresponding positions among adjacent images in space are ignored. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation in the retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, the Kappa coefficient is 98.47%, and the F1 score of retinal fluid is 95.50%. CONCLUSION: The OCT image segmentation algorithm based on deep learning is primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.
In the railway system, fasteners provide damping, maintain the track distance, and adjust the track level. Therefore, routine maintenance and inspection of fasteners are important to ensure the safe operation of track lines. Currently, assessment methods for fastener tightness include manual observation, acoustic wave detection, and image detection. These methods have limitations such as low accuracy and efficiency and susceptibility to interference and misjudgment, and accurate, stable, and fast detection methods are lacking. Targeting the small deformation and large elastic change of fasteners from full loosening to full tightening, this study proposes high-precision surface-structured light technology for fastener detection, fastener deformation feature extraction based on the center-line projection distance, and a fastener tightness regression method based on neural networks. First, the method uses a 3D camera to obtain a fastener point cloud and then segments the elastic rod area based on iterative closest point algorithm registration. Principal component analysis is used to calculate the normal vector of the segmented elastic rod surface and extract the points on the centerline of the elastic rod. These points are projected onto the upper surface of the bolt to calculate the projection distance. Subsequently, the mapping relationship between the projection distance sequence and fastener tightness is established, and the influence of each parameter on the fastener tightness prediction is analyzed. Finally, a fastener detection scene was set up in the track experimental base, data were collected, and the algorithm was verified. The results showed that the RMSE between the fastener tightness regression value produced by the algorithm and the actual measured value was 0.2196 mm, a significant improvement over other tightness detection methods, realizing an effective fastener tightness regression.
As neural radiance fields continue to advance in 3D content representation, the copyright issues surrounding 3D models oriented towards implicit representation become increasingly pressing. In response to this challenge, this paper treats the embedding and extraction of neural radiance field watermarks as inverse problems of image transformations and proposes a scheme for protecting neural radiance field copyrights using invertible neural network watermarking. Leveraging 2D image watermarking technology for 3D scene protection, the scheme embeds watermarks within the training images of neural radiance fields through the forward process in invertible neural networks and extracts them from images rendered by neural radiance fields through the reverse process, thereby ensuring copyright protection for both the neural radiance fields and the associated 3D scenes. However, challenges such as information loss during rendering and deliberate tampering necessitate the design of an image quality enhancement module to increase the scheme’s robustness. This module restores distorted images through neural network processing before watermark extraction. Additionally, embedding watermarks in each training image enables watermark information extraction from multiple viewpoints. Our proposed watermarking method achieves a PSNR (Peak Signal-to-Noise Ratio) value exceeding 37 dB for images containing watermarks and 22 dB for recovered watermarked images, as evaluated on the Lego, Hotdog, and Chair datasets, respectively. These results demonstrate the efficacy of our scheme in enhancing copyright protection.
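The PSNR figure quoted in the abstract above follows the standard definition, 10·log10(MAX² / MSE). A minimal sketch in plain Python is given below; the flat pixel lists and the 8-bit peak value of 255 are assumptions for illustration, not details from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized
    pixel sequences; returns infinity for identical inputs."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-bit pixel values before and after watermark embedding.
original = [0, 128, 255, 64]
watermarked = [1, 127, 255, 66]
print(psnr(original, watermarked))
```

A higher PSNR means the watermarked image is closer to the original, so the watermark is less visible.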
In computer vision, 3D object recognition is one of the most important tasks for many real-world applications. Three-dimensional convolutional neural networks (CNNs) have demonstrated their advantages in 3D object recognition. In this paper, we propose to use the principal curvature directions of 3D objects (using a CAD model) to represent the geometric features as inputs for the 3D CNN. Our framework, namely CurveNet, learns perceptually relevant salient features and predicts object class labels. Curvature directions incorporate complex surface information of a 3D object, which helps our framework to produce more precise and discriminative features for object recognition. Multitask learning is inspired by sharing features between two related tasks, where we consider pose classification as an auxiliary task to enable our CurveNet to better generalize object label classification. Experimental results show that our proposed framework using curvature vectors performs better than voxels as an input for 3D object classification. We further improved the performance of CurveNet by combining two networks with both curvature directions and voxels of a 3D object as the inputs. A Cross-Stitch module was adopted to learn effective shared features across multiple representations. We evaluated our methods using three publicly available datasets and achieved competitive performance in the 3D object recognition task.
Background: Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications, particularly in visual recognition tasks such as image and video analyses. There is a growing interest in applying this technology to diverse applications in medical image analysis. Automated three-dimensional breast ultrasound is a vital tool for detecting breast cancer, and computer-assisted diagnosis software developed based on deep learning can effectively assist radiologists in diagnosis. However, the network model is prone to overfitting during training, owing to challenges such as insufficient training data. This study attempts to solve the problem caused by small datasets and improve model detection performance. Methods: We propose a breast cancer detection framework based on deep learning (a transfer learning method based on cross-organ cancer detection) and a contrastive learning method based on the Breast Imaging Reporting and Data System (BI-RADS). Results: When using cross-organ transfer learning and BI-RADS-based contrastive learning, the average sensitivity of the model increased by a maximum of 16.05%. Conclusion: Our experiments have demonstrated that the parameters and experiences of cross-organ cancer detection can be mutually referenced, and that the contrastive learning method based on BI-RADS can improve the detection performance of the model.
Because behavior recognition is based on video frame sequences, this paper proposes a behavior recognition algorithm that combines a 3D residual convolutional neural network (R3D) and long short-term memory (LSTM). First, the residual module is extended to three dimensions so that it can extract features in the time and space domains at the same time. Second, the size of the pooling layer window is changed to preserve the integrity of the time-domain features; at the same time, to overcome the difficulty of network training and the over-fitting problem, a batch normalization (BN) layer and a dropout layer are added. After that, because the global average pooling (GAP) layer is affected by the size of the feature map and the network cannot be further deepened, a convolution layer and a max-pooling layer are added to the R3D network. Finally, because LSTM has the ability to memorize information and can extract more abstract temporal features, the LSTM network is introduced into the R3D network. Experimental results show that the R3D+LSTM network achieves a 91% recognition rate on the UCF-101 dataset.
Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. Popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively, while the latter lacks inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks provide only limited performance improvement, or even poorer segmentation performance than the pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. Firstly, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines this information to exclude similar regions and detect multiple tumours. Finally, the multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method achieves a Dice score of 82.4% vs. 76.6% and a 95% Hausdorff distance (HD95) of 10.66 vs. 11.54 mm on KiTS19, as well as a Dice score of 80.2% vs. 78.4% and an HD95 of 9.632 vs. 12.17 mm on LiTS.
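The Dice score reported in the abstract above measures the overlap between a predicted mask and a reference mask: twice the intersection divided by the sum of the two mask sizes. The sketch below uses flattened binary masks with hypothetical values; returning 1.0 when both masks are empty is an assumed convention, not something stated in the paper.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient for two flattened binary masks (0/1 values)."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical flattened masks: 1 marks a tumour voxel.
predicted_mask = [1, 1, 0, 0, 1, 0]
reference_mask = [1, 0, 0, 1, 1, 0]
print(dice_score(predicted_mask, reference_mask))
```

A score of 1 means perfect overlap and 0 means none, so the percentages in the abstract correspond to this value multiplied by 100.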
Funding (esophageal stent study): Nanning Technology and Innovation Special Program (20204122) and Research Grant for 100 Talents of Guangxi Plan.
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. U20A20197 and 62306187, and by the Foundation of Ministry of Industry and Information Technology TC220H05X-04.
Abstract: In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images, where pixels are distributed regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods focus on either local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for the subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further augment the weights of crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of point cloud semantic segmentation.
Funding: Supported by the AG600 project of AVIC General Huanan Aircraft Industry Co., Ltd.
Abstract: When checking ice shape calculation software, its accuracy is judged by how close the calculated ice shape is to the typical test ice shape. Therefore, determining the typical test ice shape becomes the key task of icing wind tunnel tests. In the icing wind tunnel test of the tail wing model of a large amphibious aircraft, in order to obtain an accurate typical test ice shape, the Romer Absolute Scanner is used to obtain the 3D point cloud data of the ice shape on the tail wing model. Then, the batch-learning self-organizing map (BLSOM) neural network is used to obtain the 2D average ice shape along the model direction based on the 3D point cloud data of the ice shape, while its tolerance band is calculated using the probabilistic statistical method. The results show that the combination of the 2D average ice shape and its tolerance band can represent the 3D characteristics of the test ice shape effectively, and can be used as the typical test ice shape for comparative analysis with the calculated ice shape.
Funding: The National Natural Science Foundation of China (No. 41274129); Chuan Qing Drilling Engineering Company's Scientific Research Project: Seismic detection technology and application of complex carbonate reservoir in Sulige Majiagou Formation; the 2018 Central Supporting Local Co-construction Fund (No. 80000-18Z0140504); and the Construction and Development of Universities in 2019-Joint Support for Geophysics (Double First-Class center, 80000-19Z0204).
Abstract: In this paper, the complete process of constructing a 3D digital core with a fully convolutional neural network is described in detail. A large number of sandstone computed tomography (CT) images are used as training input for a fully convolutional neural network model. This model is used to reconstruct the three-dimensional (3D) digital core of Berea sandstone based on a small number of CT images. The Hamming distance, together with the Minkowski functionals for porosity, average volume specific surface area, average curvature, and connectivity of both the real core and the digital reconstruction, is used to evaluate the accuracy of the proposed method. The results show that the reconstruction achieved relative errors of 6.26%, 1.40%, 6.06%, and 4.91% for the four Minkowski functionals and a Hamming distance of 0.04479. This demonstrates that the proposed method can not only reconstruct the physical properties of real sandstone but can also restore the real characteristics of pore distribution in sandstone, offering a new way to characterize the internal microstructure of rocks.
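Illustrative sketch (not from the paper): the two evaluation quantities named in the abstract above can be written in a few lines. This assumes the reported Hamming distance is normalized by the number of voxels and that the cores are binary pore/solid arrays flattened to lists; the function names `normalized_hamming` and `relative_error` are hypothetical.

```python
def normalized_hamming(real_voxels, recon_voxels):
    """Fraction of voxels whose pore/solid label differs (assumed normalization)."""
    if len(real_voxels) != len(recon_voxels):
        raise ValueError("volumes must contain the same number of voxels")
    mismatches = sum(1 for a, b in zip(real_voxels, recon_voxels) if a != b)
    return mismatches / len(real_voxels)


def relative_error(reference, estimate):
    """Relative error of a scalar descriptor (e.g. porosity) against the real core."""
    return abs(estimate - reference) / abs(reference)


# Toy example: 1 = pore, 0 = solid (made-up values, not data from the paper).
real = [1, 0, 1, 1]
recon = [1, 1, 1, 0]
distance = normalized_hamming(real, recon)   # two of four voxels differ
porosity_error = relative_error(0.20, 0.21)  # hypothetical porosities
```

A small normalized Hamming distance means the two volumes agree voxel by voxel, while the relative errors compare aggregate descriptors, so the two measures are complementary.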
Abstract: Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying using visual information, primarily lip movements. In this study, we created a custom dataset for Indian English linguistics and categorized it into three main categories: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using the mel-frequency cepstral coefficient, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified using a long short-term memory (LSTM) recurrent neural network. Finally, integration was performed using a deep convolutional network. The audio speech of Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, using testing data from 200 epochs. The training accuracy for visual speech recognition using the Indian English dataset was 77.48% and the test accuracy was 76.19% using 60 epochs. After integration, the accuracies of audiovisual speech recognition using the Indian English dataset for training and testing were 94.67% and 91.75%, respectively.
Funding: The management of Sierra Rutile Company for providing the drillhole dataset used in this study; the Japanese Ministry of Education Science and Technology (MEXT) Scholarship for academic funding.
Abstract: In this research, a method called ANNMG is presented to integrate Artificial Neural Networks and Geostatistics for optimum mineral reserve evaluation. ANNMG stands for Artificial Neural Network Model integrated with Geostatistics. In this procedure, the Artificial Neural Network was trained, tested and validated using assay values obtained from exploratory drillholes. Next, the validated model was used to reproduce mineral grades at sampled locations and to generalize them to unsampled locations inside the drilling region. Finally, the reproduced and generalized assay values were combined and fed to geostatistics in order to develop a geological 3D block model. The regression analysis revealed that the predicted sample grades were in close proximity to the actual sample grades. The generalized grades from the ANNMG show that this process could be used to complement exploration activities, thereby reducing the drilling requirement. It could also be an effective mineral reserve evaluation method that could produce an optimum block model for mine design.
Abstract: In robot-assisted surgery projects, researchers should be able to perform fast 3D reconstruction. Usually, 2D images acquired with common diagnostic equipment such as UT, CT and MRI are not sufficient or complete enough for an accurate 3D reconstruction. Some interpolation methods exist for approximating voxels that have no value, but they consume a large amount of execution time. A novel algorithm based on a generalized regression neural network (GRNN) is introduced, which can interpolate unknown voxels quickly and reliably. The GRNN interpolation is used to produce new 2D images between each two succeeding ultrasonic images. It is shown that combining GRNN with image distance transformation can produce higher quality 3D shapes. The results of this method are compared in practice with those of other interpolation methods. The comparison shows that this method can decrease the overall time consumption of online 3D reconstruction.
Funding: The support of the Natural Sciences and Engineering Research Council of Canada (NSERC). A special note of appreciation for the help received in using PUMA by Dr Ernesto G. Birgin from the University of São Paulo.
Abstract: The aim of this study is to develop a reliable method to determine optical constants for 3D-nanonetwork Si thin films manufactured using a pulsed-laser ablation technique that can be applied to other materials synthesized by this technique. An analytical method was introduced to calculate optical constants from reflectance and transmittance spectra. Optical band gaps for this novel material and other important insights on the physical properties were derived from the optical constants. The existing optimization methods described in the literature were found to be complex and prone to errors when determining optical constants of opaque materials where only reflectance data is available. A supervised Deep Learning Algorithm was developed to accurately predict optical constants from the reflectance spectrum alone. The hybrid method introduced in this study proved to be effective, with an accuracy of 95%.
Abstract: Construction 3D printing is changing the construction industry, but because of its immaturity, there are still many problems to be solved. One of the major problems is the study of materials for construction 3D printing. Because printed buildings are very different from traditional buildings, there are special requirements for printing materials. Based on environmental and cost considerations, recycled concrete is a well-suited printing material. In order to study and develop construction 3D printing materials, it is necessary to predict their properties. As one of the most effective artificial intelligence algorithms, the artificial neural network can deal with multi-parameter and nonlinear problems, and it can provide a useful reference for predicting the performance of recycled concrete for 3D printing. However, since there are many types and parameters of neural networks, it is difficult to select the optimal neural network with excellent prediction performance. In this paper, by comparing different types of neural networks and statistically analyzing the distribution of the root-mean-square error (RMSE) and the coefficient of determination (R²) of these neural networks, we determine the best performer among four neural networks and finally select a suitable one to predict the performance of 3D printing concrete.
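Illustrative sketch (not from the paper): the two statistics the abstract above uses to rank candidate networks follow their standard definitions. The function names and the toy strength values are assumptions made for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) when all measured values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted values (arbitrary units).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
error = rmse(measured, predicted)     # every prediction is off by exactly 1
fit = r_squared(measured, predicted)  # low, because of the constant offset
```

Collecting both values over repeated training runs gives the distributions that such a comparison would analyze: a lower RMSE and an R² closer to 1 indicate a better predictor.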
Abstract: In this paper, a classification method based on neural networks is presented for the recognition of 3D objects. The objective is to classify a query object against the objects in a database, which leads to recognition of the former. The 3D objects in this database are obtained from other objects by applying one element of the overall set of transformations. The set of transformations considered in this work is the general affine group.
Funding: Supported by the Eleventh Five-Year Pre-research Project of China.
Abstract: A new 3D surface contouring and ranging system based on digital fringe projection and the phase-shifting technique is presented. Using the phase-shifting technique, a point cloud with high spatial resolution and limited accuracy can be generated. Stereo-pair images obtained from two cameras can be used to compute the 3D world coordinates of a point using the traditional active triangulation approach, yet camera calibration is crucial. A neural network is a well-known approach for approximating a nonlinear system without an explicit physical model. In this work, it is used to train the stereo vision application system to calculate 3D world coordinates such that camera calibration can be bypassed. The training set for the neural network consists of a variety of stereo-pair images and the corresponding 3D world coordinates. The picture element correspondence problem is solved by using projected color-coded fringes with different orientations. Color imbalance is completely eliminated by the new color-coded method. Once the high-accuracy correspondence of 2D images with 3D points is acquired, a high-precision 3D point cloud can be recognized by the well-trained net. The obvious advantage of this approach is that high spatial resolution can be obtained by the phase-shifting technique and high-accuracy 3D object point coordinates are achieved by the well-trained net, which is independent of the camera model and works for any type of camera. Experiments verified the performance of the method.
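Abstract: To address the insufficient extraction of spatial-spectral features, and the large parameter counts and high computational complexity caused by excessive network depth, when convolutional neural networks are used for hyperspectral image feature extraction and classification, a lightweight convolutional model is proposed that combines a fast three-dimensional convolutional neural network (3D-CNN) with depthwise separable convolution (DSC). The method first applies incremental principal component analysis (IPCA) to reduce the dimensionality of the input data. Next, the input pixels are divided into small overlapping 3D patches, the ground-truth label of each patch is taken from its center pixel, and convolution with 3D kernels produces continuous 3D feature maps that preserve spatial-spectral features. The 3D-CNN extracts spatial and spectral features simultaneously, and depthwise separable convolution is then added to the 3D convolution to extract spatial features again, enriching the spatial-spectral features while reducing the number of parameters, which shortens computation time and also improves classification accuracy. The proposed model is validated on the public Indian Pines, Salinas Scene, and University of Pavia datasets and compared with other classical classification methods. Experimental results show that the method not only greatly reduces the number of learnable parameters and the model complexity but also achieves good classification performance, with overall accuracy (OA), average accuracy (AA), and the Kappa coefficient all exceeding 99%.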
Funding: Supported by National Science Foundation of China (No. 81800878); Interdisciplinary Program of Shanghai Jiao Tong University (No. YG2017QN24); Key Technological Research Projects of Songjiang District (No. 18sjkjgg24); Bethune Langmu Ophthalmological Research Fund for Young and Middle-aged People (No. BJ-LM2018002J).
Abstract: AIM: To explore a segmentation algorithm based on deep learning to achieve accurate diagnosis and treatment of patients with retinal fluid. METHODS: A two-dimensional (2D) fully convolutional network for retinal segmentation was employed. In order to solve the category imbalance in retinal optical coherence tomography (OCT) images, the network parameters and loss function of the 2D fully convolutional network were modified. This network, however, ignores the correlations of corresponding positions among spatially adjacent images. Thus, we proposed a three-dimensional (3D) fully convolutional network for segmentation of retinal OCT images. RESULTS: The algorithm was evaluated according to segmentation accuracy, Kappa coefficient, and F1 score. For the 3D fully convolutional network proposed in this paper, the overall segmentation accuracy rate is 99.56%, the Kappa coefficient is 98.47%, and the F1 score of retinal fluid is 95.50%. CONCLUSION: OCT image segmentation algorithms based on deep learning are primarily founded on the 2D convolutional network. The 3D network architecture proposed in this paper reduces the influence of category imbalance, realizes end-to-end segmentation of volume images, and achieves optimal segmentation results. The segmentation maps are practically the same as the manual annotations of doctors, and can provide doctors with more accurate diagnostic data.
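Illustrative sketch (not from the paper): the three metrics reported above can be derived from a binary confusion matrix. This assumes a two-class setting (fluid vs. background) with per-pixel labels flattened to lists, and uses the standard definition of Cohen's kappa; the function name `segmentation_scores` is hypothetical.

```python
def segmentation_scores(y_true, y_pred, positive=1):
    """Return (accuracy, Cohen's kappa, F1 of the positive class).

    Assumes both classes occur, so that no denominator is zero.
    """
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = n - tp - fp - fn

    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the marginal class frequencies.
    chance = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - chance) / (1.0 - chance)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, kappa, f1


# Toy imbalanced example: 2 fluid pixels among 8 (made-up labels).
truth = [1, 1, 0, 0, 0, 0, 0, 0]
guess = [1, 0, 0, 0, 0, 0, 0, 1]
accuracy, kappa, f1 = segmentation_scores(truth, guess)
```

In this toy case accuracy stays at 0.75 while kappa and F1 are much lower, which is why kappa and a per-class F1 are informative alongside accuracy when the classes are imbalanced.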
Funding: Supported by Fundamental Research Funds for the Central Universities of China (Grant No. 2023JBMC014).
Abstract: In the railway system, fasteners have the functions of damping, maintaining the track distance, and adjusting the track level. Therefore, routine maintenance and inspection of fasteners are important to ensure the safe operation of track lines. Currently, assessment methods for fastener tightness include manual observation, acoustic wave detection, and image detection. These methods have limitations such as low accuracy and efficiency and susceptibility to interference and misjudgment, and an accurate, stable, and fast detection method is lacking. Targeting the small deformation and large elastic change of fasteners from full loosening to full tightening, this study proposes high-precision surface-structured light technology for fastener detection, fastener deformation feature extraction based on the center-line projection distance, and a fastener tightness regression method based on neural networks. First, the method uses a 3D camera to obtain a fastener point cloud and then segments the elastic rod area based on iterative closest point algorithm registration. Principal component analysis is used to calculate the normal vector of the segmented elastic rod surface and extract the points on the centerline of the elastic rod. Each point is projected onto the upper surface of the bolt to calculate the projection distance. Subsequently, the mapping relationship between the projection distance sequence and fastener tightness is established, and the influence of each parameter on the fastener tightness prediction is analyzed. Finally, a fastener detection scene was set up in the track experimental base, data were collected, and the algorithm was verified. The results show that the root-mean-square error (RMSE) between the fastener tightness regression value produced by the algorithm and the actual measured value was 0.2196 mm, a significant improvement over other tightness detection methods, realizing effective fastener tightness regression.
Funding: Supported by the National Natural Science Foundation of China, with Fund Numbers 62272478 and 62102451; the National Defense Science and Technology Independent Research Project (Intelligent Information Hiding Technology and Its Applications in a Certain Field); and the Science and Technology Innovation Team Innovative Research Project "Research on Key Technologies for Intelligent Information Hiding", with Fund Number ZZKY20222102.
Abstract: As neural radiance fields continue to advance in 3D content representation, the copyright issues surrounding 3D models oriented towards implicit representation become increasingly pressing. In response to this challenge, this paper treats the embedding and extraction of neural radiance field watermarks as inverse problems of image transformations and proposes a scheme for protecting neural radiance field copyrights using invertible neural network watermarking. Leveraging 2D image watermarking technology for 3D scene protection, the scheme embeds watermarks within the training images of neural radiance fields through the forward process in invertible neural networks and extracts them from images rendered by neural radiance fields through the reverse process, thereby ensuring copyright protection for both the neural radiance fields and associated 3D scenes. However, challenges such as information loss during rendering processes and deliberate tampering necessitate the design of an image quality enhancement module to increase the scheme’s robustness. This module restores distorted images through neural network processing before watermark extraction. Additionally, embedding watermarks in each training image enables watermark information extraction from multiple viewpoints. Our proposed watermarking method achieves a PSNR (Peak Signal-to-Noise Ratio) value exceeding 37 dB for images containing watermarks and 22 dB for recovered watermarked images, as evaluated on the Lego, Hotdog, and Chair datasets. These results demonstrate the efficacy of our scheme in enhancing copyright protection.
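Illustrative sketch (not from the paper): PSNR, the quality measure quoted above, is computed from the mean squared error between two images. This assumes 8-bit pixel values (peak value 255) flattened to lists; the function name `psnr` and the sample pixels are hypothetical.

```python
import math


def psnr(original, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up pixel values: each pixel is off by 10 grey levels.
cover = [100, 100]
marked = [110, 90]
quality_db = psnr(cover, marked)
```

Higher values mean the watermarked image is closer to the original, so a figure above 37 dB corresponds to a much smaller per-pixel error than in this toy case.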
Funding: This paper was partially supported by a project of the Shanghai Science and Technology Committee (18510760300), the Anhui Natural Science Foundation (1908085MF178), and the Anhui Excellent Young Talents Support Program Project (gxyqZD2019069).
Abstract: In computer vision, 3D object recognition is one of the most important tasks for many real-world applications. Three-dimensional convolutional neural networks (CNNs) have demonstrated their advantages in 3D object recognition. In this paper, we propose to use the principal curvature directions of 3D objects (using a CAD model) to represent the geometric features as inputs for the 3D CNN. Our framework, namely CurveNet, learns perceptually relevant salient features and predicts object class labels. Curvature directions incorporate complex surface information of a 3D object, which helps our framework to produce more precise and discriminative features for object recognition. Multitask learning is inspired by sharing features between two related tasks, where we consider pose classification as an auxiliary task to enable our CurveNet to better generalize object label classification. Experimental results show that our proposed framework using curvature vectors performs better than voxels as an input for 3D object classification. We further improved the performance of CurveNet by combining two networks with both curvature direction and voxels of a 3D object as the inputs. A Cross-Stitch module was adopted to learn effective shared features across multiple representations. We evaluated our methods using three publicly available datasets and achieved competitive performance in the 3D object recognition task.
Funding: Macao Polytechnic University Grant (RP/FCSD-01/2022, RP/FCA-05/2022); Science and Technology Development Fund of Macao (0105/2022/A).
Abstract: Background: Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications, particularly in visual recognition tasks such as image and video analyses. There is a growing interest in applying this technology to diverse applications in medical image analysis. Automated three-dimensional breast ultrasound is a vital tool for detecting breast cancer, and computer-assisted diagnosis software developed based on deep learning can effectively assist radiologists in diagnosis. However, the network model is prone to overfitting during training, owing to challenges such as insufficient training data. This study attempts to solve the problem caused by small datasets and improve model detection performance. Methods: We propose a breast cancer detection framework based on deep learning (a transfer learning method based on cross-organ cancer detection) and a contrastive learning method based on the Breast Imaging Reporting and Data System (BI-RADS). Results: When using cross-organ transfer learning and BI-RADS-based contrastive learning, the average sensitivity of the model increased by a maximum of 16.05%. Conclusion: Our experiments have demonstrated that the parameters and experiences of cross-organ cancer detection can be mutually referenced, and that the contrastive learning method based on BI-RADS can improve the detection performance of the model.
Funding: Supported by the Shaanxi Province Key Research and Development Project (No. 2021GY-280); the Shaanxi Province Natural Science Basic Research Program (No. 2021JM-459); the National Natural Science Foundation of China (No. 61772417).
Abstract: Because behavior recognition is based on video frame sequences, this paper proposes a behavior recognition algorithm that combines a 3D residual convolutional neural network (R3D) and long short-term memory (LSTM). First, the residual module is extended to three dimensions, so that it can extract features in the time and space domains at the same time. Second, the integrity of the time-domain features is preserved by changing the window size of the pooling layer. At the same time, to overcome the difficulty of network training and the over-fitting problem, a batch normalization (BN) layer and a dropout layer are added. Third, because the global average pooling (GAP) layer is affected by the size of the feature map, the network cannot be further deepened, so a convolution layer and a max-pooling layer are added to the R3D network. Finally, because LSTM can memorize information and extract more abstract temporal features, the LSTM network is introduced into the R3D network. Experimental results show that the R3D+LSTM network achieves a 91% recognition rate on the UCF-101 dataset.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018YFE0206900, the National Natural Science Foundation of China under Grant No. 61871440, and the CAAI-Huawei Mind-Spore Open Fund.
Abstract: Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. Popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively, while the latter lacks inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks provide only limited performance improvement, or even poorer segmentation performance than the pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. Firstly, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines the above information to exclude similar regions and detect multiple tumours. Finally, the multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method provides Dice (82.4% vs. 76.6%) and 95% Hausdorff distance (HD95) (10.66 vs. 11.54 mm) on KiTS19, as well as Dice (80.2% vs. 78.4%) and HD95 (9.632 vs. 12.17 mm) on LiTS.
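Illustrative sketch (not from the paper): the Dice score quoted above measures the overlap between a predicted mask and a reference mask. This assumes binary masks flattened to lists and the common convention that two empty masks score 1.0; the function name `dice` is hypothetical. HD95 is not covered here.

```python
def dice(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0  # assumed convention: both masks empty counts as perfect overlap
    return 2.0 * intersection / total


# Made-up masks: one of two predicted voxels overlaps the reference.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
overlap = dice(predicted, reference)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap, so the reported gains of a few percentage points correspond to a modest but consistent increase in overlapping volume.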