Funding: National Key R&D Program of China (No. 2022ZD0118401).
Abstract: Deep learning-based methods have been successfully applied to semantic segmentation of optical remote sensing images. However, as more remote sensing data becomes available, comprehensively utilizing multi-modal remote sensing data to break through the performance bottleneck of single-modal interpretation is a new challenge. In addition, semantic segmentation and height estimation in remote sensing data are two strongly correlated tasks, but existing methods usually study them separately, which leads to high computational resource overhead. To this end, we propose a Multi-Task learning framework for Multi-Modal remote sensing images (MM_MT). Specifically, we design a Cross-Modal Feature Fusion (CMFF) method, which aggregates complementary information from different modalities to improve the accuracy of semantic segmentation and height estimation. In addition, a dual-stream multi-task learning method is introduced for Joint Semantic Segmentation and Height Estimation (JSSHE), extracting common features in a shared network to save time and resources and then learning task-specific features in two task branches. Experimental results on the public multi-modal remote sensing image dataset Potsdam show that, compared to training the two tasks independently, multi-task learning saves 20% of training time and achieves competitive performance, with an mIoU of 83.02% for semantic segmentation and an accuracy of 95.26% for height estimation.
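The mIoU reported for semantic segmentation is the per-class intersection-over-union averaged over classes. A minimal sketch of that metric follows, assuming labels arrive as flat sequences of integer class ids; the function name `mean_iou` and the convention of skipping classes absent from both labelings are illustrative choices, not details taken from the paper.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in either labeling."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1
    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both labelings
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1.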
Funding: This research work is supported by the Deputyship of Research & Innovation, Ministry of Education in Saudi Arabia (Grant Number 758).
Abstract: Visual motion segmentation (VMS) is a key part of many intelligent crowd systems. It can be used to determine the flow behavior through a crowd and to spot unusual, life-threatening incidents such as crowd stampedes and crashes, which pose a serious risk to public safety and have resulted in numerous fatalities over the past few decades. Trajectory clustering has become one of the most popular methods in VMS. However, complex data, such as a large number of samples and parameters, make it difficult for trajectory clustering to produce accurate motion segmentation results. This study introduces a spatial-angular stacked sparse autoencoder model (SA-SSAE) with l2-regularization and softmax, a deep learning method for visual motion segmentation that groups similar motion patterns into the same cluster. The proposed model can extract meaningful high-level features using only spatial-angular features obtained from refined tracklets (a.k.a. 'trajectories'). We adopt l2-regularization and sparsity regularization, which can learn sparse representations of features, to guarantee the sparsity of the autoencoders. We employ the softmax layer to map the data points into accurate cluster representations. One of the main advantages of the SA-SSAE framework is that it can manage VMS even when individuals move around randomly. This framework helps cluster the motion patterns effectively with higher accuracy. We put forward a new dataset of 21 crowd videos with manual ground truth. Experiments conducted on two crowd benchmarks demonstrate that the proposed model groups trajectories more accurately than the traditional clustering approaches used in previous studies. The proposed SA-SSAE framework achieved a 0.11 improvement in accuracy and a 0.13 improvement in the F-measure compared with the best current method on the CUHK dataset.
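The abstract names l2-regularization and sparsity regularization but does not give the cost function. The sketch below shows one common way to combine them in a sparse autoencoder, assuming a KL-divergence sparsity penalty on mean hidden activations; the penalty form, the function names, and the default values of `rho`, `lam`, and `beta` are assumptions made for illustration, not the authors' settings.

```python
import math
from typing import Sequence


def kl_sparsity(rho: float, rho_hat: Sequence[float]) -> float:
    """Sum over hidden units of KL(rho || rho_hat_j) for Bernoulli activations."""
    total = 0.0
    for r in rho_hat:
        r = min(max(r, 1e-8), 1.0 - 1e-8)  # clamp so both logarithms stay finite
        total += rho * math.log(rho / r)
        total += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
    return total


def sparse_ae_cost(
    recon_error: float,
    weights: Sequence[Sequence[float]],
    mean_activations: Sequence[float],
    rho: float = 0.05,   # assumed target mean activation, must lie in (0, 1)
    lam: float = 1e-4,   # assumed l2 weight-decay coefficient
    beta: float = 3.0,   # assumed weight of the sparsity penalty
) -> float:
    """Reconstruction error + l2 weight decay + KL sparsity penalty."""
    l2_term = 0.5 * lam * sum(w * w for row in weights for w in row)
    return recon_error + l2_term + beta * kl_sparsity(rho, mean_activations)
```

The sparsity term vanishes when every hidden unit's mean activation equals `rho` and grows as units become more active on average, which is what pushes the learned representation toward sparsity.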
Funding: Supported in part by the Technology Innovation 2030 under Grant 2022ZD0211700.
Abstract: Automated tissue segmentation of the lumbar spine is vital for the analysis of spinal and disc diseases. Because the target is continuous and concentrated in location, edge features are abundant, and individuals differ, conventional automatic segmentation methods perform poorly. Since deep learning has proven successful in medical image segmentation over the past few years, it has been applied to this task in a number of ways. The multi-scale and multi-modal features of lumbar tissues, however, are rarely explored by deep learning methods. Because of the limited availability of medical images, it is crucial to effectively fuse data from various acquisition modes for model training to alleviate the problem of insufficient samples. In this paper, we propose a novel multi-modality hierarchical fusion network (MHFN) for improving lumbar spine segmentation by learning robust feature representations from multi-modality magnetic resonance images. An adaptive group fusion module (AGFM) is introduced to fuse features from the various modes and extract potentially valuable cross-modality features. Furthermore, to combine cross-modality features from low to high levels, we design a hierarchical fusion structure based on AGFM. Experimental results on multi-modality MR images of the lumbar spine show that AGFM is more effective than the other feature fusion methods. To further evaluate segmentation accuracy, we compare our network with baseline fusion structures. Compared to the baseline fusion structures (input-level: 76.27%, layer-level: 78.10%, decision-level: 79.14%), our network was able to segment fractured vertebrae more accurately (85.05%).
Abstract: In this paper, an automatic ECG signal segmentation algorithm based on the ECG fractal dimension trajectory is put forward. First, the ECG signal is analyzed; then the fractal dimension trajectory of the ECG signal is constructed according to the fractal dimension trajectory construction algorithm; finally, the ECG feature points are obtained and automatic ECG segmentation is realized using the features of the ECG fractal dimension trajectory together with the frequency-domain characteristics of the ECG. MATLAB simulation of the algorithm showed that constructing the ECG fractal dimension trajectory makes the location of each ECG component clearly visible and yields a high sub-ECG segmentation success rate, providing a basis for accurately identifying the various components of the ECG signal.
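The abstract does not say which fractal dimension estimator or window settings the construction algorithm uses. As a rough illustration of the idea of a fractal dimension trajectory, the sketch below slides a window over the signal and computes the Katz fractal dimension in each window; the choice of the Katz estimator, the unit sampling step, and the names `katz_fd` and `fd_trajectory` are all assumptions, not the paper's method.

```python
import math
from typing import List, Sequence


def katz_fd(window: Sequence[float]) -> float:
    """Katz fractal dimension of a waveform sampled at unit time steps."""
    n = len(window) - 1  # number of steps along the curve
    if n < 2:
        raise ValueError("window needs at least three samples")
    # Total curve length in the (sample index, amplitude) plane.
    length = sum(math.hypot(1.0, window[i + 1] - window[i]) for i in range(n))
    # Extent: largest distance from the first sample to any later sample.
    extent = max(math.hypot(i, window[i] - window[0]) for i in range(1, n + 1))
    denom = math.log10(n) + math.log10(extent / length)
    if denom <= 0.0:
        return math.nan  # degenerate window: the Katz estimate is not meaningful
    return math.log10(n) / denom


def fd_trajectory(signal: Sequence[float], window: int = 32, step: int = 1) -> List[float]:
    """Fractal dimension of each sliding window, in order of window start."""
    return [
        katz_fd(signal[start:start + window])
        for start in range(0, len(signal) - window + 1, step)
    ]
```

A straight-line window gives a value of about 1, while rapidly oscillating windows give larger values, so sharp components of a waveform would be expected to stand out as peaks in the trajectory.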
Funding: Supported by the Basic Research Projects of Liaoning Provincial Department of Education (LJKQZ20222459).
Abstract: To address the problem that existing trajectory compression algorithms ignore the importance of starting-point features in trajectory segmentation, a study was conducted on the preprocessing of trajectory time series. First, an improved algorithm was proposed based on the segmentation algorithm GRASP-UTS (Greedy Randomized Adaptive Search Procedure for Unsupervised Trajectory Segmentation). While taking trajectory coverage into account, this algorithm introduces adaptive parameter adjustment to segment long-term trajectory data reasonably and identifies an optimal starting point for segmentation. Then the compression efficiency of typical offline and online algorithms, such as the Douglas-Peucker algorithm and the Sliding Window algorithm and its enhancements, was compared before and after segmentation. The experimental findings show that the Adaptive Parameters GRASP-UTS segmentation approach leads to higher fitting precision in trajectory time series compression and improved algorithm efficiency after segmentation. Additionally, the compression performance of the Improved Sliding Window algorithm after segmentation demonstrates its suitability for trajectories of varying scales, providing reasonable compression accuracy.
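Of the compression algorithms compared in this study, Douglas-Peucker is the standard offline baseline. A minimal recursive sketch for planar points is given below for orientation; it is the textbook algorithm only, it does not include the GRASP-UTS segmentation step, and the `Point` alias and function names are illustrative.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def _perp_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (to a itself if a == b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / norm


def douglas_peucker(points: List[Point], epsilon: float) -> List[Point]:
    """Keep only the points needed to stay within epsilon of the original path."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]  # every interior point is close enough to the chord
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the split point duplicated at the join
```

In a segment-then-compress pipeline of the kind the abstract describes, a function like this would be applied to each trajectory segment separately rather than to the whole trajectory at once.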
Abstract: In vehicle trajectory application systems, it is often necessary to detect whether a vehicle deviates from its specified route. In traditional route deviation detection, the planned trajectory is defined by the driver through mobile phone navigation software, which mainly serves as a driving aid. This paper presents a method of vehicle trajectory deviation detection. First, the manager customizes the planned trajectory; big data technologies are then used to match the planned trajectory against the actual vehicle trajectory and measure the deviation between them. Finally, the method provides the manager with real-time supervision of the vehicle's route. The results show that this method can detect vehicle trajectory deviation quickly and accurately and has practical application value.
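The abstract does not describe how the planned and actual trajectories are matched. One simple way to express the core check is to flag every track point whose distance to the planned polyline exceeds a threshold. The sketch below does that under stated assumptions: coordinates are already projected to a planar system in consistent units, the threshold is supplied by the caller, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter of p onto the segment, clamped to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def distance_to_route(p: Point, route: Sequence[Point]) -> float:
    """Distance from p to the nearest segment of the planned route."""
    if len(route) < 2:
        raise ValueError("route needs at least two points")
    return min(point_to_segment(p, route[i], route[i + 1]) for i in range(len(route) - 1))


def deviating_points(track: Sequence[Point], route: Sequence[Point], threshold: float) -> List[int]:
    """Indices of track points farther than threshold from the planned route."""
    return [i for i, p in enumerate(track) if distance_to_route(p, route) > threshold]
```

A production system would likely add map matching and a time tolerance so that a single noisy GPS fix does not raise an alert; those refinements are outside this sketch.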
Funding: Supported by the Scientific Research Deanship at the University of Ha'il, Saudi Arabia, through project number RG-23137.
Abstract: The segmentation of head and neck (H&N) tumors in dual Positron Emission Tomography/Computed Tomography (PET/CT) imaging is a critical task in medical imaging, providing essential information for diagnosis, treatment planning, and outcome prediction. Motivated by the need for more accurate and robust segmentation methods, this study addresses key research gaps in the application of deep learning techniques to multimodal medical images. Specifically, it investigates the limitations of existing 2D and 3D models in capturing complex tumor structures and proposes an innovative 2.5D UNet Transformer model as a solution. The primary research questions guiding this study are: (1) How can the integration of convolutional neural networks (CNNs) and transformer networks enhance segmentation accuracy in dual PET/CT imaging? (2) What are the comparative advantages of 2D, 2.5D, and 3D model configurations in this context? To answer these questions, we aimed to develop and evaluate advanced deep-learning models that leverage the strengths of both CNNs and transformers. Our proposed methodology involved a comprehensive preprocessing pipeline, including normalization, contrast enhancement, and resampling, followed by segmentation using 2D, 2.5D, and 3D UNet Transformer models. The models were trained and tested on three diverse datasets: HeckTor2022, AutoPET2023, and SegRap2023. Performance was assessed using metrics such as Dice Similarity Coefficient, Jaccard Index, Average Surface Distance (ASD), and Relative Absolute Volume Difference (RAVD). The findings demonstrate that the 2.5D UNet Transformer model consistently outperformed the 2D and 3D models across most metrics, achieving the highest Dice and Jaccard values, indicating superior segmentation accuracy. For instance, on the HeckTor2022 dataset, the 2.5D model achieved a Dice score of 81.777 and a Jaccard index of 0.705, surpassing other model configurations. The 3D model showed strong boundary delineation performance but exhibited variability across datasets, while the 2D model, although effective, generally underperformed compared to its 2.5D and 3D counterparts. Compared to related literature, our study confirms the advantages of incorporating additional spatial context, as seen in the improved performance of the 2.5D model. This research fills a significant gap by providing a detailed comparative analysis of different model dimensions and their impact on H&N segmentation accuracy in dual PET/CT imaging.
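Two of the overlap metrics named in this abstract, the Dice Similarity Coefficient and the Jaccard Index, can be computed directly from a pair of binary masks. The sketch below assumes flattened masks of 0/1 values and returns both scores on a 0 to 1 scale; the function name and the convention of scoring two empty masks as 1.0 are illustrative assumptions.

```python
from typing import Sequence, Tuple


def dice_and_jaccard(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """Dice coefficient and Jaccard index of two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection
    if union == 0:
        return 1.0, 1.0  # assumption: two empty masks count as perfect agreement
    dice = 2.0 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard
```

For a single pair of masks the two scores are tied by J = D / (2 - D), so they always rank individual predictions in the same order; averages taken over many cases need not satisfy that identity.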