Funding: Supported by Universiti Sains Malaysia (USM) and the School of Computer Sciences, USM.
Abstract: Feature selection is a crucial technique in text classification: by reducing the dimensionality of the dataset, it improves the efficiency and effectiveness of classifiers and other machine learning techniques. It involves eliminating irrelevant, redundant, and noisy features to streamline the classification process. Various methods, from single feature selection techniques to ensemble filter-wrapper methods, have been used in the literature. Metaheuristic algorithms have become popular because they can handle the complexity of the optimization problem and the continuous influx of text documents. Feature selection is inherently multi-objective, balancing feature relevance and accuracy against the reduction of redundant features. This research presents a two-fold objective for feature selection. The first objective is to identify the top-ranked features using an ensemble of three multi-univariate filter methods: Information Gain (Infogain), Chi-Square (Chi^2), and Analysis of Variance (ANOVA). This aims to maximize feature relevance while minimizing redundancy. The second objective is to reduce the number of selected features and increase accuracy through a hybrid approach combining Artificial Bee Colony (ABC) and Genetic Algorithms (GA). This hybrid method operates in a wrapper framework to identify the most informative subset of text features. A Support Vector Machine (SVM) was employed as the performance evaluator for the proposed model, which was tested on two high-dimensional multiclass datasets. The experimental results demonstrated that the ensemble filter combined with the ABC+GA hybrid approach is a promising solution for text feature selection, offering superior performance compared to other existing feature selection algorithms.
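The abstract does not say how the three filter rankings are combined. As a minimal sketch of one common option, assuming mean-rank aggregation (an assumption, not the authors' stated rule), the ensemble step could look like the following. The function names `ranks` and `ensemble_top_k` and all score values are hypothetical and purely illustrative.

```python
def ranks(scores):
    """Return the rank of each feature (1 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    rank = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        rank[index] = position
    return rank


def ensemble_top_k(score_lists, k):
    """Average the per-filter ranks and return the indices of the k best features.

    Mean-rank aggregation is an assumed combination rule for illustration only.
    """
    per_filter = [ranks(scores) for scores in score_lists]
    n_features = len(score_lists[0])
    mean_rank = [
        sum(r[i] for r in per_filter) / len(per_filter) for i in range(n_features)
    ]
    return sorted(range(n_features), key=lambda i: mean_rank[i])[:k]


# Hypothetical filter scores for five terms (not taken from the paper).
info_gain = [0.30, 0.05, 0.22, 0.01, 0.18]
chi_square = [12.0, 1.5, 9.0, 0.2, 10.0]
anova_f = [8.0, 0.9, 7.5, 0.1, 3.0]

selected = ensemble_top_k([info_gain, chi_square, anova_f], k=3)
print(selected)
```

In a filter-wrapper pipeline of the kind described, the indices in `selected` would then form the reduced search space handed to the wrapper stage.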
Funding: Supported by the Open Fund of the State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (Southwest Petroleum University) (PL N1303), the Open Fund of the State Key Laboratory of Marine Geology (Tongji University) (MGK1412), the Foundation of the Graduate Innovation Center in NUAA (kfjj201430), and the Fundamental Research Funds for the Central Universities.
Abstract: An improved preprocessed Yaroslavsky filter (IPYF) is proposed to avoid nick effects and obtain a better denoising result when the noise variance is unknown. Unlike its predecessors, it calculates the similarity between two pixels from shearlet features. The feature vector consists of the initial denoised results from non-subsampled shearlet transform hard thresholding (NSST-HT) and the NSST coefficients, which helps allocate the averaging weights more reasonably. When the noise variance is estimated correctly, NSST-HT provides good denoised results as the initial estimate, and the high-frequency coefficients contribute large weights that preserve textures. When the noise variance is estimated incorrectly, the low-frequency coefficients greatly mitigate the nick effect in cartoon regions, making the IPYF more robust than the original PYF. Detailed experimental results show that the IPYF is a very competitive method when peak signal-to-noise ratio (PSNR), computing time, visual quality, and method noise are considered together.
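For orientation, the following is a minimal sketch of the classical Yaroslavsky neighborhood filter on a 1D signal, in which each sample is replaced by an average of its spatial neighbors weighted by intensity similarity. It is not the IPYF: the shearlet-based feature vector described in the abstract is replaced here by the raw sample value, and the function name `yaroslavsky_1d` and the parameters `radius` and `h` are assumptions chosen for illustration.

```python
import math


def yaroslavsky_1d(signal, radius, h):
    """Classical Yaroslavsky filter on a 1D signal.

    Each output sample is a weighted mean over a spatial window of half-width
    `radius`; the weight of neighbor j for sample i is
    exp(-(signal[j] - signal[i])**2 / h**2), so dissimilar neighbors count little.
    """
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        weights = [
            math.exp(-((signal[j] - signal[i]) ** 2) / (h * h)) for j in range(lo, hi)
        ]
        weighted_sum = sum(w * signal[j] for w, j in zip(weights, range(lo, hi)))
        out.append(weighted_sum / sum(weights))
    return out


# Hypothetical noisy step edge (illustrative values only).
noisy_step = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 1.0]
denoised = yaroslavsky_1d(noisy_step, radius=2, h=0.3)
print(denoised)
```

Because the weights depend on value similarity, samples on either side of the step are averaged almost entirely with their own side, which is why this family of filters preserves edges. The IPYF changes what "similarity" is measured on, not this averaging structure.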
Funding: Joint Research Project between Chongqing University and National University of Singapore (No. ARF-151-000-014-112); the Basic Research & Applied Basic Research Program of Chongqing University (No. 71341103); Natural Science Foundation of Chongqing S & T Committee (No. CSTC, 2006BB5240).
Abstract: The diagnosis and treatment of breast cancer have improved during the last decade; however, breast cancer is still a leading cause of death among women worldwide. Early detection and accurate diagnosis of the disease have been demonstrated to be an approach to long-term patient survival. In an attempt to develop a reliable diagnostic method for breast cancer, we integrated support vector machine (SVM), k-nearest neighbor, and probabilistic neural network into a composite machine learning approach to detect malignant breast tumours. The approach uses a set of indicators consisting of age and ten cellular features from fine-needle aspiration of the breast, which were ranked by signal-to-noise ratio to identify the determinants that distinguish benign breast tumours from malignant ones. The method significantly improved the diagnosis, with a sensitivity of 94.04%, a specificity of 97.37%, and an overall accuracy of up to 96.24% when SVM was used with the sigmoid kernel function under 5-fold cross-validation. The results suggest that SVM is a promising methodology that could be further developed into a practical adjunct tool to help discern benign from malignant breast tumours and thus reduce the incidence of misdiagnosis.
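The three reported figures follow the standard confusion-matrix definitions. The sketch below shows how they are computed; the function name `diagnostic_metrics` and the counts are hypothetical and do not reproduce the study's data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: malignant cases classified correctly/incorrectly.
    tn/fp: benign cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)  # fraction of malignant cases detected
    specificity = tn / (tn + fp)  # fraction of benign cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not from the paper).
sens, spec, acc = diagnostic_metrics(tp=90, fn=10, tn=180, fp=20)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
```

Under k-fold cross-validation, these quantities are typically computed per fold and averaged, or computed once from the counts pooled over all folds; the abstract does not say which was used.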
Funding: Supported by the National Basic Research Program of China (Grant No. 2006CB303106) and the Doctoral Subject Special Scientific Research Fund of the Ministry of Education of China (Grant No. 20070335074).
Abstract: This paper presents a variant of the Haar-like feature used in the Viola and Jones detection framework, called the scattered rectangle feature, based on a common-component analysis of local region features. Three common components (feature filter, feature structure, and feature form) are extracted without regard to the details of the region features studied, which casts new light on region feature design for specific applications and requirements: modifying one or more components of a feature to improve it, or combining components of existing features to form a new one. The scattered rectangle feature follows the former route, freeing the feature structure component of the Haar-like feature from the geometric adjacency rule. This yields a richer representation that explores many more orientations than horizontal, vertical, and diagonal, as well as misaligned, detached, and non-rectangular shape information that is unreachable by the Haar-like feature. The training results of the two face detectors in the experiments illustrate the benefits of the scattered rectangle feature empirically; a comparison of ROC curves under a rigid and objective detection criterion on the MIT+CMU upright face test set shows that the cascade based on scattered rectangle features outperforms the one based on Haar-like features.
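As background for the baseline being extended, here is a minimal sketch of a standard two-rectangle Haar-like feature computed with an integral image, where the two rectangles are adjacent. It does not implement the scattered rectangle feature itself; the function names and the toy image are assumptions for illustration.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four integral-image lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left half minus right half of the window; `width` is assumed even."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Toy 3x4 image with a vertical edge (illustrative values only).
image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [1, 1, 5, 5],
]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, top=0, left=0, height=3, width=4)
print(feature)
```

Since `rect_sum` accepts any rectangle position, a difference of two non-adjacent rectangles costs the same number of lookups as an adjacent pair. This is consistent with relaxing the adjacency rule as the abstract describes, although the exact structure of the scattered rectangle feature is not specified here.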
Funding: Financial support for this work was provided by the Deanship of Scientific Research (DSR) of King Fahd University of Petroleum and Minerals under Grant No. FT121004.
Abstract: A simple configuration for the generation of a switchable dual-wavelength fiber ring laser is presented. The proposed configuration employs a short twin-core photonic crystal fiber acting as a Mach–Zehnder interferometer at room temperature. A polarization controller is further utilized to enable switchable dual-wavelength operation.