This study presents a single-class and multi-class instance segmentation approach applied to ancient Palmyrene inscriptions, employing two state-of-the-art deep learning algorithms, namely YOLOv8 and Roboflow 3.0. The goal is to contribute to the preservation and understanding of historical texts, showcasing the potential of modern deep learning methods in archaeological research. Our research culminates in several key findings and scientific contributions. We comprehensively compare the performance of YOLOv8 and Roboflow 3.0 in the context of Palmyrene character segmentation; this comparative analysis focuses mainly on the strengths and weaknesses of each algorithm in this context. We also created and annotated an extensive dataset of Palmyrene inscriptions, a crucial resource for further research in the field. The dataset is used to train and evaluate the segmentation models. We employ comparative evaluation metrics to quantitatively assess the segmentation results, ensuring the reliability and reproducibility of our findings, and we present custom visualization tools for predicted segmentation masks. Our study advances the state of the art in semi-automatic reading of Palmyrene inscriptions and establishes a benchmark for future research. The availability of the Palmyrene dataset and the insights into algorithm performance contribute to the broader understanding of historical text analysis.
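As a rough illustration of the YOLOv8 side of such a pipeline (a generic sketch, not code from the study), the Ultralytics API can train and run a segmentation model on a custom dataset; the dataset YAML path, model size, and training settings below are placeholder assumptions.

```python
# Minimal sketch: train and run a YOLOv8 instance-segmentation model with the
# Ultralytics API. Paths, epochs, and image size are illustrative assumptions,
# not the settings used in the study.
from ultralytics import YOLO

# Start from a pretrained segmentation checkpoint.
model = YOLO("yolov8n-seg.pt")

# "palmyrene.yaml" is a hypothetical dataset description file listing the
# train/val image folders and the character class names.
model.train(data="palmyrene.yaml", epochs=100, imgsz=640)

# Run inference on a new inscription photo; each result carries boxes,
# per-instance masks, and class ids that can be passed to a visualizer.
results = model.predict("inscription.jpg", conf=0.25)
for r in results:
    if r.masks is not None:
        print(r.masks.data.shape)  # (num_instances, H, W) binary masks
```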
The real-time detection and instance segmentation of strawberries constitute fundamental components in the development of strawberry harvesting robots. Real-time identification of strawberries in an unstructured environment is a challenging task. Current instance segmentation algorithms for strawberries suffer from issues such as poor real-time performance and low accuracy. To this end, the present study proposes an Efficient YOLACT (E-YOLACT) algorithm for strawberry detection and segmentation based on the YOLACT framework. The key enhancements of the E-YOLACT encompass the development of a lightweight attention mechanism, pyramid squeeze shuffle attention (PSSA), for efficient feature extraction. Additionally, an attention-guided context-feature pyramid network (AC-FPN) is employed instead of FPN to optimize the architecture's performance. Furthermore, a feature-enhanced model (FEM) is introduced to enhance the prediction head's capabilities, while efficient fast non-maximum suppression (EF-NMS) is devised to improve non-maximum suppression. The experimental results demonstrate that the E-YOLACT achieves a Box-mAP and Mask-mAP of 77.9 and 76.6, respectively, on the custom dataset. Moreover, it exhibits an impressive category accuracy of 93.5%. Notably, the E-YOLACT also demonstrates a remarkable real-time detection capability with a speed of 34.8 FPS. The method proposed in this article presents an efficient approach for the vision system of a strawberry-picking robot.
The precise detection and segmentation of tumor lesions are very important for lung cancer computer-aided diagnosis. However, in PET/CT (Positron Emission Tomography/Computed Tomography) lung images, the lesion shapes are complex, the edges are blurred, and the sample numbers are unbalanced. To solve these problems, this paper proposes a Multi-branch Cross-scale Interactive Feature fusion Transformer model (MCIF-Transformer Mask RCNN) for PET/CT lung tumor instance segmentation. The main contributions of this paper are as follows: Firstly, the ResNet-Transformer backbone network is used to extract global and local features in lung images; the pixel dependence relationship is established in local and non-local fields to improve the model's perception ability. Secondly, the Cross-scale Interactive Feature Enhancement auxiliary network is designed to provide the shallow features to the deep features, and the cross-scale interactive feature enhancement module (CIFEM) is used to enhance the attention to fine-grained features. Thirdly, the Cross-scale Interactive Feature fusion FPN network (CIF-FPN) is constructed to realize bidirectional interactive fusion between deep and shallow features, and the low-level features are enhanced with deep semantic features. Finally, 4 ablation experiments, 3 detection comparison experiments, 3 segmentation comparison experiments, and 6 comparison experiments with two-stage and single-stage instance segmentation networks are conducted on PET/CT lung medical image datasets. The results show that the APdet, APseg, ARdet and ARseg indexes are improved by 5.5%, 5.15%, 3.11% and 6.79% compared with Mask RCNN (ResNet50). Based on the above research, the precise detection and segmentation of the lesion region are realized in this paper. This method has positive significance for the detection of lung tumors.
Dynamic Simultaneous Localization and Mapping (SLAM) in visual scenes is currently a major research area in fields such as robot navigation and autonomous driving. However, in the face of complex real-world environments, current dynamic SLAM systems struggle to achieve precise localization and map construction. With the advancement of deep learning, there has been increasing interest in the development of deep learning-based dynamic SLAM visual odometry in recent years, and more researchers are turning to deep learning techniques to address the challenges of dynamic SLAM. Compared to dynamic SLAM systems based on deep learning methods such as object detection and semantic segmentation, dynamic SLAM systems based on instance segmentation can not only detect dynamic objects in the scene but also distinguish different instances of the same type of object, thereby reducing the impact of dynamic objects on the SLAM system's positioning. This article not only introduces traditional dynamic SLAM systems based on mathematical models but also provides a comprehensive analysis of existing instance segmentation algorithms and dynamic SLAM systems based on instance segmentation, comparing and summarizing their advantages and disadvantages. Through comparisons on datasets, it is found that instance segmentation-based methods have significant advantages in accuracy and robustness in dynamic environments. However, the real-time performance of instance segmentation algorithms hinders the widespread application of dynamic SLAM systems. In recent years, the rapid development of single-stage instance segmentation methods has brought hope for the widespread application of dynamic SLAM systems based on instance segmentation. Finally, possible future research directions and improvement measures are discussed for reference by relevant professionals.
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked and need more delicate operations in intelligent picking machines. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities and better generalization, and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet) using a dataset from a tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, in which the proportion of small to medium-sized targets is 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing tea shoots, showing better performance than the other models. Through ablation experiments, we found that ResNet50, the PointRend strategy, and the Feature Pyramid Network (FPN) architecture improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrate that the proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform shoot segmentation in unstructured environments, realizing the individual distinction of tea shoots and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
Instance segmentation plays an important role in image processing. The Deep Snake algorithm based on contour iteration deforms an initial bounding box into an instance contour end-to-end, which can improve the performance of instance segmentation, but has defects such as slow segmentation speed and sub-optimal initial contours. To solve these problems, a real-time instance segmentation algorithm based on contour learning was proposed. Firstly, ShuffleNet V2 was used as the backbone network, and the receptive field of the model was expanded by using a 5×5 convolution kernel. Secondly, a lightweight up-sampling module, multi-stage aggregation (MSA), performs residual fusion of multi-layer features, which not only improves segmentation speed but also extracts effective features more comprehensively. Thirdly, a contour initialization method for network learning was designed, and a global contour feature aggregation mechanism was used to return a coarse contour, which solves the problem of excessive error between a manually initialized contour and the real contour. Finally, the Snake deformation module was used to iteratively optimize the coarse contour to obtain the final instance contour. The experimental results showed that the proposed method improved the instance segmentation accuracy on the Semantic Boundaries Dataset (SBD), Cityscapes and KINS datasets, and the average precision reached 55.8 on SBD. Compared with Deep Snake, the model parameters were reduced by 87.2%, the computation was reduced by 78.3%, and the segmentation speed reached 39.8 frame·s−1 when performing instance segmentation on a 512×512 pixel image on a 2080Ti GPU. The proposed method can reduce resource consumption and realize instance segmentation tasks quickly and accurately, and is therefore more suitable for embedded platforms with limited resources.
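To make the contour-iteration idea concrete, the toy sketch below (not the paper's implementation) samples an initial contour from a bounding box and refines it over a few iterations with a stand-in offset predictor; in the real model the per-vertex offsets come from the learned Snake deformation module.

```python
# Toy illustration of contour-based instance segmentation: sample an initial
# contour from a box, then iteratively deform it with predicted offsets.
# `predict_offsets` is a stand-in for the learned deformation network.
import numpy as np

def box_to_contour(x1, y1, x2, y2, n_points=128):
    """Sample n_points evenly along the perimeter of a bounding box."""
    t = np.linspace(0.0, 4.0, n_points, endpoint=False)
    pts = np.empty((n_points, 2))
    for i, s in enumerate(t):
        side, frac = int(s), s - int(s)
        if side == 0:
            pts[i] = (x1 + frac * (x2 - x1), y1)  # top edge
        elif side == 1:
            pts[i] = (x2, y1 + frac * (y2 - y1))  # right edge
        elif side == 2:
            pts[i] = (x2 - frac * (x2 - x1), y2)  # bottom edge
        else:
            pts[i] = (x1, y2 - frac * (y2 - y1))  # left edge
    return pts

def predict_offsets(contour):
    """Placeholder for the deformation network: shrink toward the centroid."""
    center = contour.mean(axis=0)
    return 0.1 * (center - contour)

def refine_contour(contour, n_iters=3):
    """Iteratively apply predicted per-vertex offsets (the 'snake' loop)."""
    for _ in range(n_iters):
        contour = contour + predict_offsets(contour)
    return contour

initial = box_to_contour(10, 10, 110, 60)
final = refine_contour(initial)
print(initial.shape, final.shape)  # (128, 2) (128, 2)
```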
Autonomous driving technology has made a lot of outstanding achievements with deep learning, and the vehicle detection and classification algorithm has become one of the critical technologies of autonomous driving systems. Vehicle instance segmentation can perform instance-level semantic parsing of vehicle information, which is more accurate and reliable than object detection. However, the existing instance segmentation algorithms still have the problems of poor mask prediction accuracy and low detection speed. Therefore, this paper proposes an advanced real-time instance segmentation model named FIR-YOLACT, which fuses ICIoU (Improved Complete Intersection over Union) and Res2Net into the YOLACT algorithm. Specifically, the ICIoU function can effectively solve the degradation problem of the original CIoU loss function and improve the training convergence speed and detection accuracy. The Res2Net module fused with the ECA (Efficient Channel Attention) Net is added to the model's backbone network, which improves the multi-scale detection capability and mask prediction accuracy. Furthermore, the Cluster NMS (Non-Maximum Suppression) algorithm is introduced in the model's bounding box regression to enhance the performance of detecting similarly occluded objects. The experimental results demonstrate the superiority of FIR-YOLACT over the baseline methods and the effectiveness of all components. The processing speed reaches 28 FPS, which meets the demands of real-time vehicle instance segmentation.
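Since Cluster NMS is a concrete, matrix-form procedure, a small NumPy sketch may help; this is a generic rendition of the published Cluster-NMS idea, not the FIR-YOLACT code, and the box and score arrays are illustrative.

```python
# Generic sketch of Cluster-NMS: iterate a matrix-form suppression mask until
# it is stable, so chained suppressions are resolved without a sequential loop
# over boxes. Thresholds and inputs are illustrative.
import numpy as np

def iou_matrix(boxes):
    """Pairwise IoU for boxes given as (x1, y1, x2, y2)."""
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area[:, None] + area[None, :] - inter)

def cluster_nms(boxes, scores, iou_thr=0.5, max_iters=200):
    """Return indices of kept boxes (into the original arrays)."""
    if len(boxes) == 0:
        return np.array([], dtype=int)
    order = np.argsort(-scores)
    boxes = boxes[order]
    X = np.triu(iou_matrix(boxes), k=1)   # only higher-scoring boxes suppress
    keep = np.ones(len(boxes), dtype=bool)
    for _ in range(max_iters):
        C = X * keep[:, None]             # zero rows of already-suppressed boxes
        new_keep = C.max(axis=0) < iou_thr
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return order[keep]

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(cluster_nms(boxes, scores))  # [0 2]: the overlapping lower-scoring box is removed
```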
We introduce a novel method using a new generative model that automatically learns effective representations of the target and background appearance to detect, segment and track each instance in a video sequence. Unlike current discriminative tracking-by-detection solutions, our proposed hierarchical structural embedding learning can predict higher-quality masks with accurate boundary details over spatio-temporal space via normalizing flows. We formulate the instance inference procedure as hierarchical spatio-temporal embedded learning across time and space. Given a video clip, our method first coarsely locates pixels belonging to a particular instance with a Gaussian distribution and then builds a novel mixing distribution to refine the instance boundary by fusing hierarchical appearance embedding information in a coarse-to-fine manner. For the mixing distribution, we utilize a factorized conditional normalizing flow to estimate the distribution parameters and improve the segmentation performance. Comprehensive qualitative, quantitative, and ablation experiments are performed on three representative video instance segmentation benchmarks (i.e., YouTube-VIS19, YouTube-VIS21, and OVIS), and the effectiveness of the proposed method is demonstrated. More impressively, the superior performance of our model on an unsupervised video object segmentation dataset (i.e., DAVIS19) proves its generalizability. Our algorithm implementations are publicly available at https://github.com/zyqin19/HEVis.
Sentiment classification is a useful tool to classify reviews in terms of sentiments and attitudes towards a product or service. Existing studies heavily rely on sentiment classification methods that require fully annotated inputs. However, there is limited labelled text available, making the acquisition of fully annotated input costly and labour-intensive. Lately, semi-supervised methods have emerged, as they require only partially labelled input but perform comparably to supervised methods. Nevertheless, some works reported that the performance of the semi-supervised model degraded after adding unlabelled instances into training. The literature also shows that not all unlabelled instances are equally useful; thus, identifying the informative unlabelled instances is beneficial in training a semi-supervised model. To achieve this, an informative score is proposed and incorporated into semi-supervised sentiment classification. The evaluation compares a semi-supervised method without the informative score against one with it. By using the informative score in the instance selection strategy to identify informative unlabelled instances, semi-supervised models perform better compared to models that do not incorporate informative scores into their training. Although the performance of semi-supervised models incorporating an informative score does not surpass the supervised models, the results are still promising, as the differences in performance are subtle, with a small gap of 2% to 5%, while the number of labelled instances used is greatly reduced from 100% to 40%. The best result of the proposed instance selection strategy is achieved when combining the informative score with a baseline confidence score at a 0.5:0.5 ratio using only 40% labelled data.
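A minimal sketch of the selection step described above, assuming the informative score and the model's confidence score are both available per unlabelled instance and are blended at a fixed ratio (0.5:0.5 here); the scoring values and helper names are placeholders, not the paper's definitions.

```python
# Minimal sketch: select unlabelled instances by a weighted combination of an
# informative score and a baseline confidence score (0.5:0.5), then add the
# top-scoring instances (with their pseudo-labels) to the training pool.
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    confidence: float   # model's confidence in its pseudo-label, in [0, 1]
    informative: float  # informative score, in [0, 1] (placeholder definition)
    pseudo_label: int

def combined_score(c: Candidate, w_informative: float = 0.5) -> float:
    """Blend the two scores at the given ratio (0.5:0.5 by default)."""
    return w_informative * c.informative + (1.0 - w_informative) * c.confidence

def select_instances(candidates, k):
    """Pick the k unlabelled instances with the highest combined score."""
    return sorted(candidates, key=combined_score, reverse=True)[:k]

pool = [
    Candidate("great battery life", 0.92, 0.40, 1),
    Candidate("arrived late but works", 0.55, 0.85, 1),
    Candidate("terrible support", 0.88, 0.70, 0),
]
for c in select_instances(pool, k=2):
    print(round(combined_score(c), 2), c.text)
```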
Osteoporosis is a systemic disease characterized by low bone mass, impaired bone microstructure, increased bone fragility, and a higher risk of fractures. It commonly affects postmenopausal women and the elderly. Orthopantomography, also known as panoramic radiography, is a widely used imaging technique in dental examinations due to its low cost and easy accessibility. Previous studies have shown that the mandibular cortical index (MCI) derived from orthopantomography can serve as an important indicator of osteoporosis risk. To address this, this study proposes a parallel Transformer network based on multiple instance learning. By introducing parallel modules that alleviate optimization issues and integrating multiple-instance learning with the Transformer architecture, our model effectively extracts information from image patches. Our model achieves an accuracy of 86% and an AUC score of 0.963 on an osteoporosis dataset, which demonstrates its promising and competitive performance.
Mature soybean phenotyping is an important process in soybean breeding; however, the manual process is time-consuming and labor-intensive. Therefore, a novel approach that is rapid, accurate and highly precise is required to obtain the phenotypic data of soybean stems, pods and seeds. In this research, we propose a mature soybean phenotype measurement algorithm called Soybean Phenotype Measure-instance Segmentation (SPM-IS). SPM-IS is based on a feature pyramid network, Principal Component Analysis (PCA) and instance segmentation. We also propose a new method that uses PCA to locate and measure the length and width of a target object via image instance segmentation. After 60,000 iterations, the maximum mean Average Precision (mAP) of the mask and box was able to reach 95.7%. The correlation coefficients R² between the manual measurements and the SPM-IS measurements of the pod length, pod width, stem length, complete main stem length, seed length and seed width were 0.9755, 0.9872, 0.9692, 0.9803, 0.9656, and 0.9716, respectively. The correlation coefficients R² between the manual counts and the SPM-IS counts of pods, stems and seeds were 0.9733, 0.9872, and 0.9851, respectively. The above results show that SPM-IS is a robust measurement and counting algorithm that can reduce labor intensity, improve efficiency and speed up the soybean breeding process.
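As an illustration of the PCA-based measurement idea (a generic sketch under the assumption that an instance mask is available as a binary array, not the SPM-IS code), the principal axes of the mask's pixel coordinates give the object's orientation, and the extents of the pixels projected onto those axes give its length and width in pixels.

```python
# Generic sketch: measure the length and width of a segmented object by
# projecting its mask pixels onto the principal axes found with PCA.
import numpy as np

def mask_length_width(mask: np.ndarray):
    """mask: 2D boolean array for one instance. Returns (length, width) in pixels."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    centered = pts - pts.mean(axis=0)
    # Eigenvectors of the covariance matrix are the principal axes.
    cov = np.cov(centered, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    proj = centered @ eigvecs                 # coordinates in the principal-axis frame
    extents = proj.max(axis=0) - proj.min(axis=0)
    length, width = np.sort(extents)[::-1]    # major axis = length, minor = width
    return float(length), float(width)

# Toy elongated mask: a 5-pixel-tall, 40-pixel-wide bar.
mask = np.zeros((20, 60), dtype=bool)
mask[8:13, 10:50] = True
print(mask_length_width(mask))  # roughly (39.0, 4.0)
```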
A hybrid feature selection and classification strategy was proposed based on the simulated annealing genetic algorithm and multiple instance learning (MIL). The band selection method was based on subspace decomposition, which combines the simulated annealing algorithm with the genetic algorithm by choosing different cross-over and mutation probabilities, as well as mutation individuals. Then MIL was combined with image segmentation, clustering and support vector machine algorithms to classify hyperspectral images. The experimental results show that the proposed method achieves a high classification accuracy of 93.13% with small training samples and overcomes the weaknesses of the conventional methods.
Recently, image representations derived from convolutional neural networks (CNN) have achieved promising performance for instance retrieval, and they outperform traditional hand-crafted image features. However, most existing CNN-based features are designed to describe entire images, and thus they are less robust to background clutter. This paper proposes a region of interest (RoI)-based deep convolutional representation for instance retrieval. It first detects the regions of interest (RoIs) in an image, and then extracts a set of RoI-based CNN features from the fully-connected layer of the CNN. The proposed RoI-based CNN feature describes the patterns of the detected RoIs, so that visual matching can be performed at the image-region level to effectively identify target objects in cluttered backgrounds. Moreover, we test the performance of the proposed RoI-based CNN feature when it is extracted from different convolutional layers or fully-connected layers. Also, we compare the performance of the RoI-based CNN feature with those of state-of-the-art CNN features on two instance retrieval benchmarks. Experimental results show that the proposed RoI-based CNN feature outperforms the state-of-the-art CNN features for instance retrieval.
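A minimal sketch of region-level retrieval in this spirit (not the paper's pipeline): crop each detected RoI, run it through a CNN backbone, and use the pooled features as L2-normalized descriptors matched by cosine similarity. The backbone choice, box coordinates, and input sizes below are assumptions.

```python
# Minimal sketch: per-RoI CNN descriptors for instance retrieval. RoI boxes are
# assumed to come from any detector; the backbone and sizes are illustrative.
import torch
import torchvision
from torchvision import transforms

# Backbone truncated before the classifier; its output is a 2048-d descriptor.
# Weights are omitted here; in practice a pretrained model would be loaded.
backbone = torchvision.models.resnet50(weights=None)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def roi_descriptors(image: torch.Tensor, boxes):
    """image: float tensor (3, H, W) in [0, 1]; boxes: list of (x1, y1, x2, y2)."""
    crops = [preprocess(image[:, y1:y2, x1:x2]) for (x1, y1, x2, y2) in boxes]
    feats = backbone(torch.stack(crops))               # (num_rois, 2048)
    return torch.nn.functional.normalize(feats, dim=1)

# Toy usage: match RoIs of a query image against RoIs of a database image.
query = torch.rand(3, 480, 640)
db = torch.rand(3, 480, 640)
q_feats = roi_descriptors(query, [(50, 40, 250, 240)])
d_feats = roi_descriptors(db, [(10, 10, 200, 200), (300, 100, 500, 300)])
similarity = q_feats @ d_feats.T                        # cosine similarities
print(similarity.shape)                                 # torch.Size([1, 2])
```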
A recommender system is a tool to suggest items to users from the extensive history of the users' feedback. Although it is an emerging research area of interest to both academia and industry, it suffers from sparsity, scalability, and cold-start problems. This paper addresses the sparsity and scalability problems of a model-based collaborative recommender system based on an ensemble learning approach and an enhanced clustering algorithm for movie recommendations. An effective movie recommendation system is proposed using the Classification and Regression Tree (CART) algorithm, an enhanced Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm and a truncation method. New hyperparameter tuning is added to the BIRCH algorithm to enhance the cluster formation process, and the proposed algorithm is named enhanced BIRCH. The proposed model yields quality movie recommendations to new users using gradient boosting classification with broad coverage. The proposed model is tested on the MovieLens dataset, and the performance is evaluated by means of Mean Absolute Error (MAE), precision, recall and f-measure. The experimental results showed the superiority of the proposed model in movie recommendation compared to the existing models. The proposed model obtained MAE values of 0.52 and 0.57 on the MovieLens 100k and 1M datasets. Further, the proposed model obtained a precision of 0.83, recall of 0.86 and f-measure of 0.86 on the MovieLens 100k dataset, which are effective compared to the existing models in movie recommendation.
Deep Web sources contain a large amount of high-quality, query-related structured data. One of the challenges in the Deep Web is extracting the result schemas of Deep Web sources. To address this challenge, this paper describes a novel approach that extracts both the result data and the result schema of a Web database. The approach first models the query interface of a Deep Web source and fills it in with a specific query instance. Then the result pages of the Deep Web source are formatted into a tree structure to retrieve subtrees that contain elements of the query instance. Next, the result schema of the Deep Web source is extracted by matching the subtree nodes with the query instance, in which a two-phase schema extraction method is adopted to obtain a more accurate result schema. Finally, experiments on real Deep Web sources show the utility of our approach, which provides high precision and recall.
This paper presents a novel ontology mapping approach based on rough set theory and instance selection. In this approach, the construction of a rough set-based inference instance base is described, in which instance selection (involving similarity distance, clustering set and redundancy degree) and discernibility matrix-based feature reduction are introduced, respectively; an ontology mapping approach based on the multi-dimensional attribute value joint distribution is then proposed. The core of this mapping approach is the overlapping of the inference instance space. Only valuable instances and important attributes are selected into the ontology mapping based on the multi-dimensional attribute value joint distribution, so the subsequent mapping efficiency is improved. The time complexity of the discernibility matrix-based method and the accuracy of the mapping approach are evaluated by an application example and a series of analyses and comparisons.
Search-based software engineering has mainly dealt with automated test data generation by metaheuristic search techniques. Similarly, we try to generate test data (i.e., problem instances) that expose the worst case of algorithms by such a technique. In this paper, in terms of non-functional testing, we re-define the worst case of several algorithms. Using genetic algorithms (GAs), we illustrate the strategies corresponding to each type of instance. We adopt three problems as examples: the sorting problem, the 0/1 knapsack problem (0/1KP), and the travelling salesperson problem (TSP). For several algorithms solving these problems, we could find the worst-case instances successfully; the success of the results is assessed with a statistical approach and a comparison to the results of random testing. The examples we tried provide informative guidelines for the use of genetic algorithms in generating worst-case instances, where the worst case is defined in terms of algorithm performance.
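As a small, self-contained illustration of the idea (a sketch of a generic GA, not the paper's setup), the snippet below evolves permutations that maximize the number of comparisons made by a naive quicksort, i.e., it searches for near-worst-case inputs of a fixed size.

```python
# Sketch: use a simple genetic algorithm to evolve permutations that maximize
# the comparison count of a naive quicksort (first element as pivot), i.e.,
# approximate worst-case inputs for the algorithm. Parameters are illustrative.
import random

def quicksort_comparisons(arr):
    """Return the number of comparisons a naive quicksort performs on arr."""
    if len(arr) <= 1:
        return 0
    pivot, rest = arr[0], arr[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(left) + quicksort_comparisons(right)

def crossover(a, b):
    """Order crossover: keep a prefix of `a`, fill the rest in `b`'s order."""
    cut = random.randint(1, len(a) - 1)
    head = a[:cut]
    return head + [x for x in b if x not in head]

def mutate(p, rate=0.2):
    """Swap two random positions with the given probability."""
    p = p[:]
    if random.random() < rate:
        i, j = random.sample(range(len(p)), 2)
        p[i], p[j] = p[j], p[i]
    return p

def evolve_worst_case(n=30, pop_size=40, generations=200):
    pop = [random.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=quicksort_comparisons, reverse=True)
        parents = pop[: pop_size // 2]                  # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    best = max(pop, key=quicksort_comparisons)
    return best, quicksort_comparisons(best)

best, cost = evolve_worst_case()
print(cost, "comparisons; already-sorted input gives", quicksort_comparisons(list(range(30))))
```

For this quicksort variant an already-sorted input attains the true worst case of n(n-1)/2 comparisons, so the GA's best fitness can be checked against that bound.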
A method that applies a clustering technique to reduce the number of samples of large data sets using input-output clustering is proposed. The proposed method clusters the output data into groups and clusters the input data in accordance with the groups of output data. Then, a set of prototypes is selected from the clustered input data, and the inessential data can ultimately be discarded from the data set. The proposed method can reduce the effect of outliers because only the prototypes are used. This method is applied to reduce the data set in regression problems. Two standard synthetic data sets and three standard real-world data sets are used for evaluation. The root-mean-square errors of support vector regression models trained with the original data sets and with the corresponding instance-reduced data sets are compared. From the experiments, the proposed method provides good results on the reduction and the reconstruction of the standard synthetic and real-world data sets. The numbers of instances of the synthetic data sets are decreased by 25%-69%. The reduction rates for the real-world data sets of the automobile miles per gallon and the 1990 census in CA are 46% and 57%, respectively. The reduction rate of 96% is very good for the electrocardiogram (ECG) data set because of the redundant and periodic nature of ECG signals. For all of the data sets, the regression results are similar to those from the corresponding original data sets. Therefore, the regression performance of the proposed method is good while only a fraction of the data is needed in the training process.
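A compact sketch of the input-output clustering idea described above (a generic rendition with assumed cluster counts, not the paper's exact procedure): the targets are clustered first, the inputs are clustered within each target group, and the resulting cluster centers serve as the reduced training set.

```python
# Generic sketch of input-output clustering for instance reduction: cluster the
# outputs, then cluster the inputs within each output group, and keep only the
# resulting prototypes for regression training. Cluster counts are illustrative.
import numpy as np
from sklearn.cluster import KMeans

def reduce_instances(X, y, n_output_groups=5, n_prototypes_per_group=10, seed=0):
    """Return prototype inputs and outputs selected via input-output clustering."""
    out_groups = KMeans(n_clusters=n_output_groups, random_state=seed, n_init=10)
    labels = out_groups.fit_predict(y.reshape(-1, 1))
    X_proto, y_proto = [], []
    for g in range(n_output_groups):
        Xg, yg = X[labels == g], y[labels == g]
        k = min(n_prototypes_per_group, len(Xg))
        km = KMeans(n_clusters=k, random_state=seed, n_init=10).fit(Xg)
        X_proto.append(km.cluster_centers_)
        # Prototype target = mean output of the samples assigned to each center.
        for c in range(k):
            y_proto.append(yg[km.labels_ == c].mean())
    return np.vstack(X_proto), np.array(y_proto)

# Toy data: y = sum of inputs plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X.sum(axis=1) + 0.1 * rng.normal(size=1000)
Xr, yr = reduce_instances(X, y)
print(X.shape, "->", Xr.shape)   # (1000, 3) -> (50, 3)
```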
HTTP Adaptive Streaming (HAS) of video content is becoming an integral part of the Internet and accounts for most of today's network traffic. Video compression technology plays a vital role in efficiently utilizing network channels, but encoding videos into multiple representations with selected encoding parameters is a significant challenge. Video encoding is a computationally intensive and time-consuming operation that requires high-performance resources provided by on-premise infrastructures or public clouds. In turn, public clouds, such as Amazon Elastic Compute Cloud (EC2), provide hundreds of computing instances optimized for different purposes and clients' budgets. Thus, there is a need for algorithms and methods for optimized computing instance selection for specific tasks such as video encoding and transcoding operations. Additionally, the encoding speed directly depends on the selected encoding parameters and the complexity characteristics of the video content. In this paper, we first benchmarked the video encoding performance of Amazon EC2 spot instances using multiple x264 codec encoding parameters and video sequences of varying complexity. Then, we proposed a novel fast approach to optimize the selection of Amazon EC2 spot instances and minimize video encoding costs. Furthermore, we evaluated how the optimized selection of EC2 spot instances can affect the encoding cost. The results show that our approach, on average, can reduce the encoding costs by at least 15.8% and up to 47.8% when compared to a random selection of EC2 spot instances.
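A toy sketch of the cost side of such a selection (all instance names, prices, and speeds below are made-up placeholders, not benchmark results from the paper): given a measured or predicted encoding speed per instance type and its spot price, the expected cost of encoding a video is price × duration / speed, and the cheapest instance can be picked directly.

```python
# Toy sketch: pick the EC2 spot instance type that minimizes the expected cost
# of encoding one video, given per-instance encoding speed and spot price.
# All numbers are made-up placeholders, not measurements from the paper.
from dataclasses import dataclass

@dataclass
class InstanceProfile:
    name: str
    spot_price_per_hour: float  # USD/hour (placeholder)
    encoding_speed: float       # encoded video seconds per wall-clock second

def encoding_cost(profile: InstanceProfile, video_duration_s: float) -> float:
    """Expected cost = price per wall-clock second x encoding time."""
    encode_time_s = video_duration_s / profile.encoding_speed
    return (profile.spot_price_per_hour / 3600.0) * encode_time_s

def cheapest_instance(profiles, video_duration_s):
    return min(profiles, key=lambda p: encoding_cost(p, video_duration_s))

profiles = [
    InstanceProfile("c5.xlarge",  0.068, 1.8),
    InstanceProfile("c5.2xlarge", 0.136, 3.9),
    InstanceProfile("m5.xlarge",  0.077, 1.5),
]
video_duration = 600.0  # a 10-minute source video
best = cheapest_instance(profiles, video_duration)
print(best.name, round(encoding_cost(best, video_duration), 4), "USD")
```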