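Abstract: Accurate segmentation of pelvic CT images is an important step in the clinical diagnosis and surgical planning of pelvic bone diseases. Current 2D pelvic segmentation methods lose spatial information when they process three-dimensional medical images slice by slice; to address this problem, an improved 3D U-Net network is proposed for automatic 3D segmentation of pelvic CT images. The experimental data are the pelvic CT images of 1184 patients from the public dataset CTPelvic1K, labeled for four parts: the sacrum, left hip bone, right hip bone, and lumbar spine. Building on the 3D U-Net backbone and combining it with a self-attention mechanism, a 3D multiclass segmentation model, 3D Trans U-Net, is proposed, and 3D U-Net, V-Net, and Attention U-Net are trained with transfer learning as control experiments. The experimental results show that, on the test set, 3D Trans U-Net achieves Dice coefficients of 97.99%, 96.70%, 97.96%, 97.95%, and 96.89% for the whole pelvic region, sacrum, left hip bone, right hip bone, and lumbar spine, respectively, and that its Dice coefficient, Hausdorff distance, and other evaluation metrics are all better than those of the existing classic networks 3D U-Net, V-Net, and Attention U-Net. The improved 3D Trans U-Net therefore segments the different parts of the pelvis well and provides an effective technical approach for the precise treatment of pelvic bone diseases.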
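The Dice coefficient reported for each pelvic part can be illustrated with a minimal sketch. The function name, the label codes, and the toy label maps below are assumptions made for illustration; they are not taken from the paper or from CTPelvic1K.

```python
def dice_coefficient(pred, truth, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) over the voxels carrying `label`."""
    in_pred = [p == label for p in pred]
    in_truth = [t == label for t in truth]
    overlap = sum(1 for a, b in zip(in_pred, in_truth) if a and b)
    total = sum(in_pred) + sum(in_truth)
    # Convention (an assumption): a label absent from both maps scores 1.0.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Flattened toy label maps; 0 = background, 1 and 2 are hypothetical bone labels.
pred = [0, 1, 1, 2, 2, 2, 0, 1]
truth = [0, 1, 1, 1, 2, 2, 0, 0]

for label in (1, 2):
    print(label, round(dice_coefficient(pred, truth, label), 3))
```

A real 3D volume would be handled the same way after flattening, or with array operations in a numerical library.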
Abstract: Datasets commonly contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous, which limits their applicability to such mixed data. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. The proposed method uses the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that the screening procedure possesses the sure screening and ranking consistency properties. We conduct simulation studies and real data analyses to demonstrate its finite-sample performance. In summary, the proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.
Funding: supported by the National Natural Science Foundation of China (61433004, 61473069); IAPI Fundamental Research Funds (2013ZCX14); the Development Project of Key Laboratory of Liaoning Province; and the Enterprise Postdoctoral Fund Projects of Liaoning Province.
Abstract: Since the efficiency of photovoltaic (PV) power is closely related to the weather, many PV enterprises install weather instruments to monitor the working state of the PV power system. With the development of soft measurement technology, the instrumental method seems obsolete and involves high cost. This paper proposes a novel method for predicting the type of weather from PV power data and partial meteorological data. With this method, the weather types are deduced by data analysis instead of by weather instruments. Better fault detection is obtained by using support vector machines (SVM) and comparing the predicted and the actual weather. The weather prediction model is established by a direct SVM for training multiclass predictors. Although SVM is suitable for classification, the classification results depend on the type of kernel, the kernel parameters, and the soft margin coefficient, which are difficult to choose. In this paper, these parameters are optimized by the particle swarm optimization (PSO) algorithm so that good prediction results can be achieved. The prediction results show that the method is feasible and effective.
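The idea of tuning SVM hyperparameters with PSO can be sketched as follows. This is only an illustration under stated assumptions: the objective `surrogate_error` is a made-up stand-in for a cross-validated SVM error over (log10 C, log10 gamma), the swarm settings are commonly used defaults rather than the paper's values, and the kernel type is not searched.

```python
import random


def pso(objective, bounds, n_particles=30, n_iter=150, seed=0):
    """Minimise `objective` over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    w, c1, c2 = 0.729, 1.49445, 1.49445  # inertia and acceleration weights (assumed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move and clip to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def surrogate_error(params):
    """Hypothetical validation error, lowest at log10 C = 1 and log10 gamma = -2."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


search_box = [(-2.0, 4.0), (-5.0, 1.0)]  # assumed ranges for log10 C and log10 gamma
best_params, best_error = pso(surrogate_error, search_box)
print(best_params, best_error)
```

In practice the objective would train and validate an SVM for each candidate, which is why a population-based search with few evaluations per step is attractive.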
Abstract: In this study, a salting-out assisted liquid-liquid extraction method combined with high performance liquid chromatography with diode array detection (SALLE-HPLC-DAD) was developed and validated for the simultaneous analysis of carbaryl, atrazine, propazine, chlorothalonil, dimethametryn and terbutryn in environmental water samples. Parameters affecting the extraction efficiency, such as the type and volume of extraction solvent, sample volume, salt type and amount, centrifugation speed and time, and sample pH, were optimized. Under the optimum extraction conditions the method was linear over the range of 10 - 100 μg/L (carbaryl), 8 - 100 μg/L (atrazine), 7 - 100 μg/L (propazine) and 9 - 100 μg/L (chlorothalonil, terbutryn and dimethametryn), with correlation coefficients (R²) between 0.99 and 0.999. Limits of detection and quantification ranged from 2.0 to 2.8 μg/L and 6.7 to 9.5 μg/L, respectively. The extraction recoveries obtained for ground, lake and river waters ranged from 75.5% to 106.6%, with intra-day and inter-day relative standard deviations lower than 3.4% for all the target analytes. None of the target analytes was detected in these samples. The proposed SALLE-HPLC-DAD method is therefore simple, rapid, cheap and environmentally friendly for the determination of the aforementioned herbicide, insecticide and fungicide residues in environmental water samples.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to extend them effectively to multiclass classification is still an ongoing research topic. A multiclass classifier, named DAGSVM^light, is constructed by combining the SVM^light algorithm with the directed acyclic graph SVM (DAGSVM) method. A new method is proposed to select the working set, which is identical to the working set selected by the SVM^light approach. Experimental results indicate that DAGSVM^light is competitive with DAGSMO and is more suitable for practical use. It may be an especially useful tool for large-scale multiclass classification problems and, owing to its good performance, may lead to more widespread use of SVMs in the engineering community.
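How a decision DAG combines pairwise binary classifiers at prediction time can be sketched as below. The pairwise rule `nearer_first` and the class centres are hypothetical stand-ins for trained binary SVMs; the sketch shows only the DAG evaluation, not SVM^light training or working-set selection.

```python
def dag_predict(x, classes, prefers_first):
    """Walk a decision DAG: each pairwise test rules out exactly one class."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefers_first(x, first, last):
            remaining.pop()     # the classifier rejects `last`
        else:
            remaining.pop(0)    # the classifier rejects `first`
    return remaining[0]


# Stand-in for trained binary SVMs: vote for the class whose centre is nearer to x.
centres = {"a": 0.0, "b": 5.0, "c": 10.0}


def nearer_first(x, i, j):
    return abs(x - centres[i]) <= abs(x - centres[j])


print(dag_predict(4.2, ["a", "b", "c"], nearer_first))  # prints "b"
```

With K classes the walk evaluates K - 1 binary classifiers per sample, although K(K - 1)/2 of them are trained.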
Abstract: In this study, a miniaturized analytical technique based on high-density-solvent dispersive liquid-liquid microextraction (HD-DLLME) was developed for the extraction of trace residues of multiclass pesticides, including three s-triazine herbicides, two organophosphate insecticides and two organochlorine fungicides, from environmental water and sugarcane juice samples. The analytical method was validated and found to offer good linearity (R² ≥ 0.991); repeatability varied from 0.73% to 5.28%, reproducibility varied from 1.14% to 8.74%, and limits of detection ranged from 0.005 to 0.02 μg/L. The accuracy of the optimized method was also evaluated, and recoveries ranged from 80.39% to 114.05%. Application of the method to environmental waters and sugarcane juice samples indicated the presence of trace residues of ametryn in the lake water and sugarcane juice samples. Atrazine and ametryn were also detected in irrigation water.
Abstract: It is quite common for both categorical and continuous covariates to appear in the data. However, most feature screening methods for ultrahigh-dimensional classification assume that the covariates are continuous, and applicable feature screening methods are very limited. To handle this non-trivial situation, we propose a model-free feature screening procedure for ultrahigh-dimensional multi-classification with both categorical and continuous covariates. The proposed method uses Gini impurity to evaluate the prediction power of covariates. Under certain regularity conditions, it is proved that the proposed screening procedure possesses the sure screening and ranking consistency properties. We demonstrate the finite-sample performance of the proposed procedure by simulation studies and illustrate it with a real data analysis.
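The use of Gini impurity to rank covariates can be illustrated with a minimal sketch. The score below (impurity of the response minus the weighted impurity within each covariate level) is one plausible reading, not necessarily the paper's exact index; the function names and toy data are assumptions, and a continuous covariate would first need to be sliced into levels, for example by quantiles.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_reduction(x, y):
    """Impurity of y minus the weighted impurity of y within each level of x."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    n = len(y)
    within = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(y) - within


def screen(features, y, keep):
    """Return the names of the `keep` covariates with the largest reduction."""
    scores = {name: gini_reduction(col, y) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Toy three-class response with one informative and one uninformative covariate.
y = [0, 0, 1, 1, 2, 2]
features = {
    "informative": ["a", "a", "b", "b", "c", "c"],
    "noise": ["u", "v", "u", "v", "u", "v"],
}
print(screen(features, y, keep=1))  # prints ['informative']
```

Because the score is computed one covariate at a time, the cost grows linearly in the number of covariates, which is what makes marginal screening feasible in ultrahigh dimensions.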
Funding: supported by the National Key R&D Program of China (2022YFB4701502); the "Leading Goose" R&D Program of Zhejiang (2023C01177); the Key Research Project of Zhejiang Lab (2021NB0AL03); and the Key R&D Project on Agriculture and Social Development in Hangzhou City (Asian Games) (20230701 A05).
Abstract: Digital display instrument identification is a crucial approach for automating the collection of digital display data. In this study, we propose a digital display area detection algorithm, CTPNpro, to address the problem of recognizing multiclass digital display instruments. We developed a multiclass digital display instrument recognition algorithm by combining it with a character recognition network constructed from a convolutional neural network and a bidirectional variable-length long short-term memory (LSTM) network. First, the digital display region detection network CTPNpro was designed on the basis of the CTPN architecture by introducing feature fusion and a residual structure. Next, the digital display instrument identification network was constructed from a convolutional neural network using a two-way LSTM and connectionist temporal classification (CTC) of indefinite length. Finally, an automatic calibration system for digital display instruments was built, and a multiclass digital display instrument dataset was constructed by sampling in the system. We compared the performance of the CTPNpro algorithm with other methods on this dataset to validate the effectiveness and robustness of the proposed algorithm.
Funding: Supported by GSU Molecular Basis of Disease Graduate Fellow, 2011-2012.
Abstract: Imbalanced data is a common and serious problem in many biomedical classification tasks. It biases the training of classifiers and lowers the prediction accuracy for minority classes. This problem has attracted a lot of research interest in the past decade. Unfortunately, most research efforts concentrate only on 2-class problems. In this paper, we study a new method of formulating a multiclass Support Vector Machine (SVM) problem for imbalanced biomedical data to improve classification performance. The proposed method applies a cost-sensitive approach and a ramp loss function to the Crammer and Singer multiclass SVM formulation. Experimental results on multiple biomedical datasets show that the proposed solution can effectively address the problem when the datasets are noisy and highly imbalanced.
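The two ingredients named above can be sketched for a single sample. The ramp loss is written here in its usual form, a hinge loss clipped at 1 - s, and the Crammer and Singer margin is the true-class score minus the best competing score. The clipping level s = -1, the per-class costs, and the function names are assumptions for illustration, not the paper's settings.

```python
def ramp_loss(margin, s=-1.0):
    """Hinge loss max(0, 1 - margin), clipped at 1 - s to limit the effect of outliers."""
    hinge = max(0.0, 1.0 - margin)
    return min(hinge, 1.0 - s)


def multiclass_margin(scores, y):
    """Crammer-Singer margin: score of the true class minus the best rival score."""
    rival = max(v for k, v in enumerate(scores) if k != y)
    return scores[y] - rival


def cost_sensitive_ramp(scores, y, class_cost):
    """Ramp loss of one sample, weighted by the cost assigned to its true class."""
    return class_cost[y] * ramp_loss(multiclass_margin(scores, y))


# Hypothetical costs: class 0 is the minority class and is weighted more heavily.
class_cost = [5.0, 1.0, 1.0]
print(cost_sensitive_ramp([0.2, 1.0, -0.5], 0, class_cost))
```

Clipping bounds the loss that a single noisy sample can contribute, while the class costs push the decision boundary away from the minority classes.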
Funding: partially funded by the National Natural Science Foundation of China (No. 61272447); the National Entrepreneurship & Innovation Demonstration Base of China (No. C700011); and the Key Research & Development Project of Sichuan Province of China (No. 2018G20100).
Abstract: Botnets based on the Domain Generation Algorithm (DGA) mechanism pose great challenges to the main current detection methods because of their strong concealment and robustness. However, the complexity of DGA families and the imbalance of samples continue to impede research on DGA detection. In existing work, the sample size of each DGA family is regarded as the most important determinant of the resampling proportion; thus, differences in the characteristics of the various samples are ignored, and the optimal resampling effect is not achieved. In this paper, a Long Short-Term Memory-based Property and Quantity Dependent Optimization (LSTM.PQDO) method is proposed. This method takes advantage of LSTM to automatically mine the comprehensive features of DGA domain names. It iterates the resampling proportion, based on a comprehensive consideration of the original number and characteristics of the samples, to heuristically search for a better solution around the initial solution in the right direction; thus, dynamic optimization of the resampling proportion is realized. The experimental results show that the LSTM.PQDO method achieves better performance than existing models in overcoming the difficulties of unbalanced datasets; moreover, it can serve as a reference for sample resampling tasks in similar scenarios.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61802085 and 61563012); the Guangxi Provincial Natural Science Foundation, China (Nos. 2021GXNSFAA220074 and 2020GXNSFAA159038); the Guangxi Key Laboratory of Embedded Technology and Intelligent System Foundation, China (No. 2018A-04); and the Guangxi Key Laboratory of Trusted Software Foundation, China (No. kx202011).
Abstract: Since traditional machine learning methods are sensitive to skewed distributions and do not consider the characteristics of multiclass imbalance problems, the skewed distribution of multiclass data poses a major challenge to machine learning algorithms. To tackle such issues, we propose a new splitting criterion for decision trees based on the one-against-all-based Hellinger distance (OAHD). OAHD includes two crucial elements. First, the one-against-all scheme is integrated into the computation of the Hellinger distance, thereby extending the Hellinger distance decision tree to cope with the multiclass imbalance problem. Second, for the multiclass imbalance problem, the distribution and the number of distinct classes are taken into account, and a modified Gini index is designed. Moreover, we give theoretical proofs for the properties of OAHD, including skew insensitivity and the ability to seek a purer node in the decision tree. Finally, we collect 20 public real-world imbalanced data sets from the Knowledge Extraction based on Evolutionary Learning (KEEL) repository and the University of California, Irvine (UCI) repository. Experimental and statistical results show that OAHD significantly improves performance compared with five other well-known decision trees in terms of Precision, F-measure, and multiclass area under the receiver operating characteristic curve (MAUC). Moreover, the Friedman and Nemenyi tests are used to establish the statistical advantage of OAHD over the five other decision trees.
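The one-against-all use of the Hellinger distance as a split score can be sketched as follows. The binary distance follows the standard Hellinger distance decision tree definition; averaging the per-class distances is an assumption made here for illustration, and the paper's actual aggregation and its modified Gini index are not reproduced.

```python
from math import sqrt


def hellinger_binary(x, is_pos):
    """Hellinger distance between the feature-value distributions of the two groups."""
    n_pos = sum(is_pos)
    n_neg = len(is_pos) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0
    total = 0.0
    for value in set(x):
        # Within-class proportions of this feature value.
        p = sum(1 for xi, f in zip(x, is_pos) if xi == value and f) / n_pos
        q = sum(1 for xi, f in zip(x, is_pos) if xi == value and not f) / n_neg
        total += (sqrt(p) - sqrt(q)) ** 2
    return sqrt(total)


def one_against_all_hellinger(x, y):
    """Average the binary distance over 'class c versus the rest' (assumed aggregation)."""
    classes = sorted(set(y))
    return sum(hellinger_binary(x, [yi == c for yi in y]) for c in classes) / len(classes)


y = [0, 0, 1, 1]
print(one_against_all_hellinger(["l", "l", "r", "r"], y))  # split separates the classes
print(one_against_all_hellinger(["l", "r", "l", "r"], y))  # split carries no information
```

Only within-class proportions enter the distance, so class priors do not affect the score; this is the intuition behind the skew insensitivity mentioned in the abstract.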
Funding: This work was partially supported by the Key R&D Program of Zhejiang (Grant No. 2022C01018); the National Natural Science Foundation of China (Grant Nos. U21B2001 and 61973273); the Zhejiang Provincial Natural Science Foundation of China (Grant Nos. LY21F030017 and LR19F030001); and the Major Key Project of PCL (Grant Nos. PCL2022A03, PCL2021A02, and PCL2021A09).
Abstract: Precisely understanding the business relationships between autonomous systems (ASes) is essential for studying the Internet structure. To date, many inference algorithms, which mainly focus on peer-to-peer (P2P) and provider-to-customer (P2C) binary classification, have been proposed to classify AS relationships and have achieved excellent results. However, business-based sibling relationships and structure-based exchange relationships have become an increasingly non-negligible part of the Internet market in recent years. Existing algorithms often have difficulty inferring these relationships because of their high similarity to P2P or P2C relationships. In this study, we focus on the multiclassification of AS relationships for the first time. We first summarize the differences between AS relationships in terms of structural and attribute features, and the reasons why multiclass relationships are difficult to infer. We then introduce new features and propose a graph convolutional network (GCN) framework, AS-GCN, to solve this multiclassification problem in complex scenarios. The proposed framework considers the global network structure and local link features concurrently. Experiments on real Internet topological data validate the effectiveness of AS-GCN. The proposed method achieves comparable results on the binary classification task and outperforms a series of baselines on the more difficult multiclassification task, with overall metrics above 95%.
Abstract: Feature extraction is the most critical step in the classification of multispectral images. Classification accuracy is mainly influenced by the feature sets selected to classify the image. In the past, handcrafted feature sets were used, which do not adapt to different image domains. To overcome this, an evolutionary learning method is developed to automatically learn spatial-spectral features for classification. For feature selection, a modified Firefly Algorithm (FA) is proposed that achieves maximum classification accuracy with a reduced feature set. To extract the most efficient features from the data set, we use the 3-D discrete wavelet transform, which decomposes the multispectral image in all three dimensions. To select spatial and spectral features, we study three window-based approaches, namely overlapping window (OW-3DFS), non-overlapping window (NW-3DFS), and adaptive window cube (AW-3DFS), as well as a pixel-based technique. A fivefold Multiclass Support Vector Machine (MSVM) is used for classification. Experiments conducted on the Madurai LISS IV multispectral image showed that the adaptive window approach increases the classification accuracy.
Abstract: Quantum computing is a promising new approach to tackling complex real-world computational problems by harnessing the principles of quantum mechanics. The inherent parallelism and exponential computational power of quantum systems hold the potential to outpace classical counterparts in solving complex optimization problems, which are pervasive in machine learning. Quantum Support Vector Machine (QSVM) is a quantum machine learning algorithm inspired by the classical Support Vector Machine (SVM) that exploits quantum parallelism to efficiently classify data points in high-dimensional feature spaces. We provide a comprehensive overview of the underlying principles of QSVM, elucidating how different quantum feature maps and quantum kernels enable the manipulation of quantum states to perform classification tasks. Through a comparative analysis, we reveal the quantum advantage achieved by these algorithms in terms of speedup and solution quality. As a case study, we explored the potential of quantum paradigms in the context of a real-world problem: classifying pancreatic cancer biomarker data. The Support Vector Classifier (SVC) algorithm was employed for the classical approach, while the QSVM algorithm was executed on a quantum simulator provided by the Qiskit quantum computing framework. The classical approach and the quantum-based techniques reported similar accuracy. This uniformity suggests that these methods effectively captured similar underlying patterns in the dataset. Remarkably, quantum implementations exhibited substantially reduced execution times, demonstrating the potential of quantum approaches in enhancing classification efficiency. This affirms the growing significance of quantum computing as a transformative tool for augmenting machine learning paradigms and also underscores the potency of quantum execution for computational acceleration.
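Abstract: Accurate segmentation of pelvic CT images is a very important step in the clinical diagnosis and surgical planning of pelvic bone diseases. Existing 2D pelvic segmentation methods lose spatial information when they slice three-dimensional medical images, so an improved 3D U-Net network is proposed to achieve automatic 3D segmentation of pelvic CT images. The experimental data are pelvic CT images of 1184 patients from the public CTPelvic1K dataset, with labels for four regions: the sacrum, left hip bone, right hip bone, and lumbar spine. Building on the 3D U-Net backbone and combining it with a self-attention mechanism, a 3D multiclass segmentation model, 3D Trans U-Net, is proposed; 3D U-Net, V-Net, and Attention U-Net are trained with transfer learning as comparison experiments. The experimental results show that, on the test set, 3D Trans U-Net achieves Dice coefficients of 97.99%, 96.70%, 97.96%, 97.95%, and 96.89% for the whole pelvic region, sacrum, left hip bone, right hip bone, and lumbar spine, respectively; its Dice coefficient, Hausdorff distance, and other evaluation metrics are all better than those of the classic networks 3D U-Net, V-Net, and Attention U-Net. The improved 3D Trans U-Net therefore segments the different parts of the pelvis well and provides an effective technical approach for the precise treatment of pelvic bone diseases.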
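The Dice coefficient reported above measures the overlap between a predicted region and the reference region: twice the number of shared voxels divided by the total number of voxels in both regions. The sketch below computes it per label on flattened label volumes. The integer label codes and the toy arrays are hypothetical and are not taken from CTPelvic1K.

```python
def dice_coefficient(pred, truth, label):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for one label.

    `pred` and `truth` are equal-length sequences of voxel labels
    (a 3D volume flattened to 1D).
    """
    pred_n = sum(1 for p in pred if p == label)
    truth_n = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    if pred_n + truth_n == 0:
        return 1.0  # label absent from both volumes: treated as perfect agreement
    return 2.0 * overlap / (pred_n + truth_n)


# Hypothetical label codes: 0 = background, 1 = sacrum, 2 = left hip bone.
pred = [0, 1, 1, 2, 2, 2]
truth = [0, 1, 2, 2, 2, 2]
for label in (1, 2):
    print(label, dice_coefficient(pred, truth, label))
```

A score for the whole pelvis can be obtained in the same way by first mapping every non-background label to a single foreground label in both volumes.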