BACKGROUND Synchronous liver metastasis (SLM) is a significant contributor to morbidity in colorectal cancer (CRC). There are no effective predictive algorithms for adverse SLM events at the time of CRC diagnosis. AIM To explore the risk factors for SLM in CRC and construct a visual prediction model based on gray-level co-occurrence matrix (GLCM) features collected from magnetic resonance imaging (MRI). METHODS Our study retrospectively enrolled 392 patients with CRC from Yichang Central People's Hospital from January 2015 to May 2023. Patients were randomly divided into a training group and a validation group (3:7). The clinical parameters and GLCM features extracted from MRI were included as candidate variables. The prediction model was constructed using a generalized linear regression model, a random forest model (RFM), and an artificial neural network model. Receiver operating characteristic curves and decision curves were used to evaluate the prediction model. RESULTS Among the 392 patients, 48 had SLM (12.24%). We obtained fourteen GLCM imaging features for variable screening of the SLM prediction models. Inverse difference, mean sum, sum entropy, sum variance, sum of squares, energy, and difference variance were listed as candidate variables, and the prediction efficiency (area under the curve) of the subsequent RFM in the training set and internal validation set was 0.917 [95% confidence interval (95%CI): 0.866-0.968] and 0.909 (95%CI: 0.858-0.960), respectively. CONCLUSION A predictive model combining GLCM image features with machine learning can predict SLM in CRC. This model can assist clinicians in making timely and personalized clinical decisions.
Sentiment analysis is a fine-grained analysis task that aims to identify the sentiment polarity of a specified sentence. Existing methods for Chinese sentiment analysis only consider sentiment features from a single pole and scale, and thus cannot fully exploit sentiment feature information, which makes their performance less than ideal. To resolve this problem, the authors propose a new method, GP-FMLNet, that integrates both glyph and phonetic information, and design a novel feature matrix learning process for phonetic features with which to model words that have the same pinyin information but different glyph information. The method addresses the problem of misspelled words influencing sentiment polarity prediction results. Specifically, the authors iteratively mine character, glyph, and pinyin features from the input comment sentences. Then, the authors use soft attention and matrix compound modules to model the phonetic features, which enables the model to keep focusing on the dynamic-setting words in various positions and to discard the influence of the deceptive-setting ones. Experiments on six public datasets show that the proposed model fully utilises the glyph and phonetic information and improves on the performance of existing Chinese sentiment analysis algorithms.
In recent years, automatic identification of butterfly species has attracted more and more attention in different areas. Because most butterfly larvae are pests, this research is not only meaningful for the popularization of science but also important to agricultural production and the environment. Texture, as a notable feature, is widely used in digital image recognition technology; for describing texture, an extremely effective method, the gray-level co-occurrence matrix (GLCM), has been proposed and used in automatic identification systems. However, in most existing works the GLCM is computed over the whole image, which is likely to miss some important features in local areas. To solve this problem, this paper presents a new method based on GLCM features extracted from three image blocks, together with a weight-based k-nearest neighbor (KNN) search algorithm used for classifier design. With this method, a butterfly classification system works on ten butterfly species that are hard to identify by shape features. The final identification accuracy is 98%.
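The abstract above does not say how its weight-based KNN assigns weights, so the following is only a minimal sketch under one common assumption: each of the k nearest neighbours votes with weight inversely proportional to its Euclidean distance. The function name `weighted_knn` and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def weighted_knn(query, samples, k=3):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest samples.

    `samples` is a list of (feature_vector, label) pairs.
    Inverse-distance weighting is an assumption; the paper's scheme may differ.
    """
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        # Closer neighbours contribute more; the small constant avoids division by zero.
        votes[label] += 1.0 / (math.dist(query, features) + 1e-9)
    return max(votes, key=votes.get)

# Toy example: one very close "a" sample outweighs two distant "b" samples.
toy_samples = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"), ((1.0, 0.1), "b")]
toy_label = weighted_knn((0.05, 0.0), toy_samples, k=3)
```

In the toy example an unweighted majority vote over the same three neighbours would pick "b", which illustrates why weighting can matter when classes are unevenly sampled near the query.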
A leukocyte recognition system, as part of a differential blood counter system, is very important in the field of hematology. In this paper, the proposed system aims to automatically classify the white blood cells (leukocytes) in a given microscopic image. The classification of leukocytes is performed based on a combination of color and texture features of the blood cell images. The developed system classifies the leukocytes into one of five categories (neutrophils, eosinophils, basophils, lymphocytes, and monocytes). In the preprocessing stage, the system starts by converting the microscopic images from Red Green Blue (RGB) color space to Hue Saturation Value (HSV) color space. Next, the system splits the Hue and Saturation features from the Value feature. For both the Hue and Saturation features, the system processes their color information using the Feature Selection method and the Window Cropping method, while the Value feature is processed for its texture information using the Co-occurrence Matrix method. The final recognition stage is performed using the Euclidean distance method. The combination of the Feature Selection and Co-occurrence Matrix methods gives the best overall recognition accuracy for classifying leukocyte images.
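Two of the steps named above, the RGB-to-HSV conversion and the Euclidean-distance recognition stage, can be sketched with the Python standard library. This is an illustrative sketch only: the feature (mean hue and mean saturation), the class prototypes, and the names `hs_feature` and `nearest_prototype` are assumptions, not the paper's Feature Selection or Window Cropping methods.

```python
import colorsys
import math

def hs_feature(pixels):
    """Return (mean hue, mean saturation) for a list of 8-bit (R, G, B) pixels.

    Hue is averaged linearly, which ignores its circular nature; this is a
    simplification that is only reasonable away from the red wrap-around.
    """
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    n = len(hsv)
    return (sum(h for h, _, _ in hsv) / n, sum(s for _, s, _ in hsv) / n)

def nearest_prototype(feature, prototypes):
    """Return the label whose prototype is closest to `feature` in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(feature, prototypes[label]))

# Hypothetical prototypes in (hue, saturation) space; the values are made up.
toy_prototypes = {"class_a": (0.66, 0.8), "class_b": (0.9, 0.3)}
toy_feature = hs_feature([(0, 0, 255)])  # a single pure-blue pixel
toy_class = nearest_prototype(toy_feature, toy_prototypes)
```

The Value channel is deliberately left out here because the abstract assigns it to the texture (co-occurrence matrix) branch rather than the color branch.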
Offshore carbon dioxide (CO₂) geological storage (OCGS) represents a significant strategy for addressing climate change by curtailing greenhouse gas emissions. Nonetheless, the risk of CO₂ leakage poses a substantial concern associated with this technology. This study introduces an innovative approach for establishing OCGS leakage scenarios, involving four pivotal stages, namely, interaction matrix establishment, risk matrix evaluation, cause–effect analysis, and scenario development, which has been implemented in the Pearl River Estuary Basin in China. The initial phase encompassed the establishment of an interaction matrix for OCGS systems based on features, events, and processes. Subsequent risk matrix evaluation and cause–effect analysis identified key system components, specifically CO₂ injection and faults/features. Building upon this analysis, two leakage risk scenarios were successfully developed, accompanied by the corresponding mitigation measures. In addition, this study introduces the application of scenario development to risk assessment, including scenario numerical simulation and quantitative assessment. Overall, this research positively contributes to the sustainable development and safe operation of OCGS projects and holds potential for further refinement and broader application to diverse geographical environments and project requirements. This comprehensive study provides valuable insights into the establishment of OCGS leakage scenarios and demonstrates their practical application to risk assessment, laying the foundation for promoting the sustainable development and safe operation of ocean CO₂ geological storage projects while proposing possibilities for future improvements and broader applications to different contexts.
Due to the non-stationary characteristics of vibration signals acquired from rolling element bearing faults, time-frequency analysis is often applied to describe the local information of these unstable signals. However, it is difficult to classify the high-dimensional feature matrix directly, because its dimensions are too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) and non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, a supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and the adaptive clustering of NMF. The localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, nine kinds of bearing faults are tested on a test bench. The proposed method can effectively identify the fault severity and the different fault types. Moreover, NMF yields 99.3% mean accuracy, which is much superior to that of the artificial neural network (ANN). This research presents a simple and practical solution to the fault diagnosis problem of rolling element bearings in a high-dimensional feature space.
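The factorization step can be illustrated with a minimal sketch of plain (unsupervised) NMF using the standard Lee-Seung multiplicative updates for the Frobenius objective. This is not the supervised NMF mapping described above; the matrix `toy_v` is a made-up non-negative matrix standing in for vectorised STFT magnitudes, and all function names are hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h

# Made-up rank-2 non-negative matrix (rows: frequency bins, columns: samples).
toy_v = [[1, 0, 2, 0], [0, 3, 0, 1], [2, 0, 4, 0], [0, 6, 0, 2]]
toy_w, toy_h = nmf(toy_v, rank=2)
```

In a clustering reading of NMF, each column of `toy_h` could be assigned to the component with the largest coefficient; whether that matches the paper's recognition rule is not stated in the abstract.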
AIM: To develop an automatic tool for screening diabetic retinopathy (DR) in diabetic patients. METHODS: We extracted textures from the eye fundus images of each diabetic subject using the grey level co-occurrence matrix method and trained a Bayesian model based on these textures. The receiver operating characteristic (ROC) curve was used to estimate the sensitivity and specificity of the Bayesian model. RESULTS: A total of 1000 eye fundus images from diabetic patients were included, of which 298 eyes were diagnosed as DR by two ophthalmologists. The Bayesian model was trained on a training dataset using four extracted textures: contrast, entropy, angular second moment, and correlation. The Bayesian model achieved a sensitivity of 0.949 and a specificity of 0.928 in the validation dataset. The area under the ROC curve was 0.938, and 10-fold cross validation showed an average accuracy rate of 93.5%. CONCLUSION: Textures extracted by the grey level co-occurrence matrix can be useful information for DR diagnosis, and a trained Bayesian model based on these textures can be an effective tool for DR screening among diabetic patients.
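The four textures named above (contrast, entropy, angular second moment, correlation) have standard Haralick-style definitions on a normalised co-occurrence matrix. The sketch below computes them in pure Python for a single pixel offset; the symmetric counting, the horizontal offset, the base-2 logarithm, and the 4x4 test image are illustrative choices, not details taken from the study.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Return the normalised, symmetric co-occurrence matrix for offset (dx, dy).

    `image` is a list of rows of integer grey levels in range(levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, entropy, angular second moment and correlation of a symmetric GLCM."""
    idx = range(len(p))
    contrast = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    entropy = -sum(p[i][j] * math.log2(p[i][j]) for i in idx for j in idx if p[i][j] > 0)
    asm = sum(p[i][j] ** 2 for i in idx for j in idx)
    mu = sum(i * p[i][j] for i in idx for j in idx)  # row and column means coincide
    var = sum((i - mu) ** 2 * p[i][j] for i in idx for j in idx)
    corr = (sum((i - mu) * (j - mu) * p[i][j] for i in idx for j in idx) / var
            if var > 0 else 0.0)
    return {"contrast": contrast, "entropy": entropy, "asm": asm, "correlation": corr}

# Small 4-level test image; a real fundus image would first be quantised to `levels`.
toy_image = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 2, 2, 2],
             [2, 2, 3, 3]]
toy_glcm = glcm(toy_image, levels=4)
toy_features = texture_features(toy_glcm)
```

A feature vector like `toy_features`, computed per image, is the kind of input a Bayesian classifier could then be trained on.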
BACKGROUND The most important consideration in determining treatment strategies for undifferentiated early gastric cancer (UEGC) is the risk of lymph node metastasis (LNM). Therefore, identifying a potential biomarker that predicts LNM is quite useful in determining treatment. AIM To develop a machine learning (ML)-based integral procedure to construct the LNM gray-level co-occurrence matrix (GLCM) prediction model. METHODS We retrospectively selected 526 cases of UEGC confirmed through pathological examination after radical gastrectomy without endoscopic treatment in four tertiary hospitals between January 2015 and December 2021. We extracted GLCM-based features from grayscale images and applied ML to the classification of candidate predictive variables. The robustness and clinical utility of each model were evaluated based on the following factors: receiver operating characteristic (ROC) curve, decision curve analysis, and clinical impact curve. RESULTS GLCM-based feature extraction significantly correlated with LNM. The top 7 GLCM-based factors included inertia value 0° (IV_0), inertia value 45° (IV_45), inverse gap 0° (IG_0), inverse gap 45° (IG_45), inverse gap full angle (IG_all), Haralick 30° (Haralick_30), Haralick full angle (Haralick_all), and Entropy. The areas under the ROC curve (AUCs) of the random forest classifier (RFC) model, support vector machine, eXtreme gradient boosting, artificial neural network, and decision tree ranged from 0.805 [95% confidence interval (CI): 0.258-1.352] to 0.925 (95%CI: 0.378-1.472) in the training set and from 0.794 (95%CI: 0.237-1.351) to 0.912 (95%CI: 0.355-1.469) in the testing set, respectively. The RFC model (training set: AUC: 0.925, 95%CI: 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469) that incorporates Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45 had the highest predictive accuracy. CONCLUSION The evaluation results indicate that the method of selecting radiological and textural features becomes more effective in discriminating LNM in UEGC patients. Additionally, the ML-based prediction model developed using the RFC can be used to derive treatment options and identify LNM, which can hence improve clinical outcomes.
In recent years, binary image steganography has developed so rapidly that research on binary image steganalysis has become more important for information security. Most state-of-the-art binary image steganographic schemes locate the flippable pixels in order to minimize the embedding distortion. For this reason, the stego images generated by these schemes maintain visual quality, and it is hard for a steganalyzer to capture the embedding trace in the spatial domain. However, distortion maps can be calculated for cover and stego images, and the difference between them is significant. In this paper, a novel binary image steganalytic scheme is proposed, which is based on the distortion level co-occurrence matrix. The proposed scheme first generates the corresponding distortion maps for cover and stego images. Then the co-occurrence matrix is constructed on the distortion level maps to represent the features of cover and stego images. Finally, a support vector machine with a Gaussian kernel is used to classify the features. Experimental results demonstrate that, compared with prior steganalytic methods, the proposed scheme can effectively detect stego images.
Melanoma is one of the lethal and rare types of skin cancer. It is curable at an initial stage, and the patient can then survive easily. It is very difficult to screen all skin lesion patients because of the cost of treatment. Clinicians require a reliable method for choosing the right treatment based on dermoscopic clinical features such as lesion borders, pigment networks, and the color of the melanoma. These challenges call for an automated system to classify the clinical features of melanoma and non-melanoma disease. Trained clinicians can overcome issues such as low contrast, lesions varying in size and color, and the presence of several objects like hair, reflections, air bubbles, and oils on almost all images. Active contour is one of the suitable methods for the segmentation of irregular shapes, although it has some drawbacks. An entropy and morphology-based automated mask selection is proposed for the active contour method. The proposed method can improve the overall segmentation along the boundary of melanoma images. In this study, features are extracted to perform classification on different texture scales using the gray level co-occurrence matrix (GLCM) and local binary pattern (LBP). Four different moments are extracted in six different color spaces (HSV, Lin RGB, YIQ, YCbCr, XYZ, and CIE L*a*b*), and the global information from the different color channels is then combined. Therefore, hybrid fused features, with texture features as local features, color features as global features, and shape features, together with an artificial neural network (ANN) as the classifier, are proposed for the categorization of malignant and non-malignant lesions. Experiments were carried out on the Dermis, DermQuest, and PH2 datasets. The results of the proposed method showed superiority in comparison with existing state-of-the-art techniques.
Since the efficiency of treatment of thyroid disorder depends on the risk of malignancy, indeterminate follicular neoplasm (FN) images should be classified. The diagnosis has so far been made by visual interpretation by experienced pathologists. However, it is difficult to separate the favor benign type from the borderline type. Thus, this paper presents a classification approach based on a 3D nuclei model to classify the favor benign and borderline types of follicular thyroid adenoma (FTA) in cytological specimens. The proposed method utilizes a 3D gray level co-occurrence matrix (GLCM) and a random forest classifier. It was applied to 22 data sets of FN images. Furthermore, the use of 3D GLCM was compared with 2D GLCM to evaluate the classification results. In the experiments, the proposed system achieved a classification accuracy of 95.45%, and 3D GLCM gave better classification accuracy than 2D GLCM. Consequently, the proposed method may help pathologists as a prescreening tool.
By employing the elastic and elastic-plastic finite element method (FEM), the effects of matrix features on the stress transfer mechanisms of short fiber composites are studied. In the calculation, variations in matrix modulus, yield strength, and hardening modulus are considered. It is concluded that large deformation of the matrix is harmful to the improvement of the mechanical performance of the composites.
Based on the stability and inequality of texture features between coal and rock, this study used digital image analysis to propose a coal–rock interface detection method. Using the gray level co-occurrence matrix, twenty-two texture features were extracted from the images of coal and rock. The dimension of the feature space was reduced to four by feature selection, according to a separability criterion based on inter-class mean difference and within-class scatter. The experimental results show that the optimized features were effective in improving the separability of the samples and reducing the time complexity of the algorithm. In the optimized low-dimensional feature space, the coal–rock classifier was set up using the Fisher discriminant method. The performance of the classifier was evaluated using 10-fold cross-validation, and an average recognition rate of 94.12% was obtained. The results of comparative experiments show that the identification performance of the proposed method was superior to that of texture description methods based on the gray histogram and the gradient histogram.
With the development of satellite technology, satellite imagery of the earth's surface makes it possible to survey surface resources and track the dynamic changes of the earth with high efficiency and low consumption. As an important tool for satellite remote sensing image processing, remote sensing image classification has become a hot topic. According to the natural texture characteristics of remote sensing images, this paper combines different texture features with the Extreme Learning Machine and proposes a new remote sensing image classification algorithm. The experimental tests are carried out on the standard test datasets SAT-4 and SAT-6. Our results show that the proposed method is a simpler and more efficient remote sensing image classification algorithm. It achieves 99.434% recognition accuracy on SAT-4, which is 1.5% higher than the 97.95% accuracy achieved by DeepSat. At the same time, the recognition accuracy on SAT-6 reaches 99.5728%, which is 5.6% higher than DeepSat's 93.9%.
A large database is desired for machine learning (ML) technology to make accurate predictions of materials' physicochemical properties based on their molecular structure. When a large database is not available, the development of a proper featurization method based on the physicochemical nature of the target properties can improve the predictive power of ML models with a smaller database. In this work, we show that two new featurization methods, volume occupation spatial matrix and heat contribution spatial matrix, can improve the accuracy in predicting energetic materials' crystal density (ρ_crystal) and solid phase enthalpy of formation (H_f,solid) using a database containing 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm^3 and 24.67 kcal/mol to 0.035 g/cm^3 and 9.66 kcal/mol, respectively. Leave-one-out cross-validation shows that the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_crystal and H_f,solid of CHON-based molecules in the 150-million-molecule PubChem database, and 56 candidates with competitive detonation performance and reasonable chemical structures are identified. With further improvement in the future, spatial matrices have the potential to become multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Identifying point correspondences between views is an important task. A new feature matching algorithm for weakly calibrated stereo images of curved scenes is proposed, based on purely geometric constraints. After initial correspondences are built via the epipolar constraint, many point-to-point image mappings called homographies are set up to predict the matching position of feature points. To refine the predictions and reject false correspondences, four schemes are proposed. Extensive experiments on simulated data as well as on real images of scenes of varying depths show that the proposed method is effective and robust.
To accelerate the selection process of feature subsets in rough set theory (RST), an ensemble elitist roles based quantum game (EERQG) algorithm is proposed for feature selection. Firstly, the multilevel elitist roles based dynamics equilibrium strategy is established, in which both the immigration and emigration of elitists are self-adaptive so as to balance exploration and exploitation in feature selection. Secondly, the utility matrix of trust margins is introduced into the model of multilevel elitist roles to enhance the various elitist roles' performance in searching for the optimal feature subsets, so that win-win utility solutions for feature selection can be attained. Meanwhile, a novel ensemble quantum game strategy is designed as an intriguing exhibiting structure to perfect the dynamics equilibrium of multilevel elitist roles. Finally, the ensemble manner of multilevel elitist roles is employed to achieve the global minimal feature subset, which greatly improves feasibility and effectiveness. Experimental results show that the proposed EERQG algorithm is superior to existing feature selection algorithms.
A fast feature ranking algorithm for classification in the presence of high dimensionality and small sample size is proposed. The basic idea is that the important features force the data points of the same class to maintain their intrinsic neighbor relations, whereas neighboring points of different classes no longer stick to one another. Applying this assumption, an optimization problem weighting each feature is derived. The algorithm does not involve dense matrix eigen-decomposition, which can be computationally expensive. Extensive experiments are conducted to validate the significance of the selected features using the Yale, Extended YaleB, and PIE datasets. The thorough evaluation shows that, using a one-nearest neighbor classifier, the recognition rates with the 100-500 leading features selected by the algorithm distinctly outperform those with features selected by the baseline feature selection algorithms, while with a support vector machine the features selected by the algorithm show a less prominent improvement. Moreover, the experiments demonstrate that the proposed algorithm is particularly efficient for the multi-class face recognition problem.
Mean shift is a widely used clustering algorithm in image segmentation. However, the segmentation results are not as good as expected on textured surfaces because of the influence of the textures. Therefore, an approach based on the wavelet transform (WT), the co-occurrence matrix (COM), and mean shift is proposed in this paper. First, WT and COM are employed to extract the optimal resolution approximation of the original image as the feature image. Then, mean shift is applied to the feature image to obtain better detection results. Finally, experiments show that this approach is effective.
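The clustering step named above can be sketched in one dimension with a flat (uniform) kernel: each point is repeatedly moved to the mean of the data inside its window until it settles on a mode. The function names, the flat kernel, and the toy values (standing in for per-pixel feature-image values) are assumptions for illustration; the paper's wavelet and co-occurrence preprocessing is not reproduced here.

```python
def mean_shift_1d(points, bandwidth, iterations=50, tol=1e-6):
    """Shift every point to the local mode of `points` using a flat kernel."""
    modes = []
    for x in points:
        for _ in range(iterations):
            # Mean of all data points within `bandwidth` of the current position.
            window = [p for p in points if abs(p - x) <= bandwidth]
            new_x = sum(window) / len(window)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes

def label_modes(modes, bandwidth):
    """Group modes closer than half a bandwidth into the same cluster label."""
    centres, labels = [], []
    for m in modes:
        for k, c in enumerate(centres):
            if abs(m - c) <= bandwidth / 2:
                labels.append(k)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels

# Toy feature values forming two well-separated groups.
toy_points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
toy_centres, toy_labels = label_modes(mean_shift_1d(toy_points, bandwidth=1.0), 1.0)
```

The bandwidth is the only tuning parameter; it controls how far apart two groups must be before they are reported as separate segments.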
Collaborative filtering, as one of the most popular techniques, plays an important role in recommendation systems. However, when the user-item rating matrix is sparse, its performance degrades. Recently, domain-specific recommendation approaches have been developed to address this problem. The basic idea is to partition the users and items into overlapping domains and then perform recommendation in each domain independently. Here, a domain means a group of users having similar preferences for a group of products. However, these domain-specific methods, which consist of two sequential steps, ignore the mutual benefit of domain segmentation and recommendation. Hence, this paper presents a unified framework that simultaneously performs recommendation and makes use of the domain information underlying the rating matrix. Based on matrix factorization, the proposed model learns both the user preferences of multiple domains and preference selection vectors that select relevant features for each group of products. Besides, local context information from the user-item rating matrix is utilized to enhance the new framework. Experimental results on two widely used datasets, Ciao and Epinions, demonstrate the effectiveness of the proposed model.
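As background for the matrix factorization that the framework builds on, the sketch below fits plain latent-factor matrix factorization to a sparse list of ratings by stochastic gradient descent. It is a baseline only: the domain partitioning, preference selection vectors, and local context terms of the proposed model are not included, and the function names, hyperparameters, and toy ratings are all hypothetical.

```python
import math
import random

def predict(p, q, user, item):
    """Predicted rating: dot product of the user and item latent vectors."""
    return sum(a * b for a, b in zip(p[user], q[item]))

def rmse(ratings, p, q):
    """Root mean squared error over observed (user, item, rating) triples."""
    return math.sqrt(sum((r - predict(p, q, u, i)) ** 2 for u, i, r in ratings)
                     / len(ratings))

def factorize(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.02,
              epochs=200, seed=0):
    """Learn user factors P and item factors Q from observed ratings only."""
    rng = random.Random(seed)
    p = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(p, q, u, i)
            for k in range(rank):
                pu, qi = p[u][k], q[i][k]
                # Gradient step on squared error with L2 regularisation.
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return p, q

# Toy sparse ratings: (user index, item index, rating on a 1-5 scale).
toy_ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
               (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]
toy_p, toy_q = factorize(toy_ratings, n_users=4, n_items=3)
```

Unobserved user-item pairs can then be scored with `predict`, which is where sparsity hurts: users or items with few ratings get poorly constrained latent vectors.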
Funding (GP-FMLNet Chinese sentiment analysis study): Science and Technology Innovation 2030 "New Generation Artificial Intelligence" major project, Grant/Award Number: 2020AAA0108703.
Funding: the Yunnan Applied Basic Research Projects (No. 2016FD039) and the Talent Cultivation Project in Yunnan Province (No. KKSY201503063).
Abstract: In recent years, automatic identification of butterfly species has attracted more and more attention in different areas. Because most of their larvae are pests, this research is not only meaningful for the popularization of science but also important to agricultural production and the environment. Texture as a notable feature is widely used in digital image recognition technology; for describing the texture, an extremely effective method, the gray-level co-occurrence matrix (GLCM), has been proposed and used in automatic identification systems. However, according to most of the existing works, GLCM is computed over the whole image, which likely misses some important features in local areas. To solve this problem, this paper presents a new method based on the GLCM features extracted from three image blocks, and a weight-based k-nearest neighbor (KNN) search algorithm used for classifier design. With this method, a butterfly classification system works on ten butterfly species which are hard to identify by shape features. The final identification accuracy is 98%.
Abstract: A leukocyte recognition system, as part of a differential blood counter system, is very important in the field of hematology. In this paper, the proposed system aims to automatically classify the white blood cells (leukocytes) on a given microscopic image. The classification of leukocytes is performed based on the combination of color and texture features of the blood cell images. The developed system classifies the leukocytes into one of five categories (neutrophils, eosinophils, basophils, lymphocytes, and monocytes). In the preprocessing stage, the system starts with converting the microscopic images from Red Green Blue (RGB) color space to Hue Saturation Value (HSV) color space. Next, the system splits the Hue and Saturation features from the Value feature. For both Hue and Saturation features, the system processes their color information using the Feature Selection method and the Window Cropping method, while the Value feature is processed by its texture information using the Co-occurrence Matrix method. The final recognition stage is performed using the Euclidean distance method. The combination of the Feature Selection and Co-occurrence Matrix methods gives the best overall recognition accuracies for classifying leukocyte images.
Abstract: Offshore carbon dioxide (CO2) geological storage (OCGS) represents a significant strategy for addressing climate change by curtailing greenhouse gas emissions. Nonetheless, the risk of CO2 leakage poses a substantial concern associated with this technology. This study introduces an innovative approach for establishing OCGS leakage scenarios, involving four pivotal stages, namely, interactive matrix establishment, risk matrix evaluation, cause–effect analysis, and scenario development, which has been implemented in the Pearl River Estuary Basin in China. The initial phase encompassed the establishment of an interaction matrix for OCGS systems based on features, events, and processes. Subsequent risk matrix evaluation and cause–effect analysis identified key system components, specifically CO2 injection and faults/features. Building upon this analysis, two leakage risk scenarios were successfully developed, accompanied by the corresponding mitigation measures. In addition, this study introduces the application of scenario development to risk assessment, including scenario numerical simulation and quantitative assessment. Overall, this research positively contributes to the sustainable development and safe operation of OCGS projects and holds potential for further refinement and broader application to diverse geographical environments and project requirements. This comprehensive study provides valuable insights into the establishment of OCGS leakage scenarios and demonstrates their practical application to risk assessment, laying the foundation for promoting the sustainable development and safe operation of ocean CO2 geological storage projects while proposing possibilities for future improvements and broader applications to different contexts.
Funding: Supported by Shaanxi Provincial Overall Innovation Project of Science and Technology, China (Grant No. 2013KTCQ01-06).
Abstract: Due to the non-stationary characteristics of vibration signals acquired from rolling element bearing faults, time-frequency analysis is often applied to describe the local information of these unstable signals. However, it is difficult to classify the high-dimensional feature matrix directly because its dimensions are too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) with non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, the supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and the adaptive clustering of NMF. The localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, tests on 9 kinds of bearing fault are performed on a test bench. The proposed method can effectively identify the fault severity and different fault types. Moreover, in comparison with the artificial neural network (ANN), NMF yields 99.3% mean accuracy, which is much superior to ANN. This research presents a simple and practical solution for the fault diagnosis problem of rolling element bearings in high-dimensional feature space.
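Note: the factorization step named in this abstract can be illustrated with plain NMF. The sketch below is a minimal illustration only, assuming the standard unsupervised multiplicative-update rules for the Frobenius objective; it is not the supervised NMF mapping the authors describe, and the small matrix `tfd` is a hypothetical stand-in for an STFT magnitude matrix (rows as frequency bins, columns as time frames). All function names are illustrative.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Frobenius norm of v - w*h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (m x n) into w (m x rank) and h (rank x n)."""
    rng = random.Random(seed)
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in v]
    h = [[rng.uniform(0.1, 1.0) for _ in v[0]] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(len(h[0]))]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(len(w))]
    return w, h

# Hypothetical 4 x 4 "time-frequency" matrix built from two spectral patterns.
tfd = [[1, 0, 2, 0], [2, 0, 4, 0], [0, 3, 0, 1], [0, 6, 0, 2]]
w, h = nmf(tfd, rank=2)
# Each time frame can be assigned to the basis with the largest activation.
frame_cluster = [max(range(2), key=lambda k: h[k][j]) for j in range(4)]
```

The columns of `w` play the role of parts-based spectral patterns and the rows of `h` their activations over time; taking the largest activation per column is one simple way to read a cluster label out of the factorization.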
Funding: Supported by the Priming Scientific Research Foundation for the Junior Researcher in Beijing Tongren Hospital, Capital Medical University.
Abstract: AIM: To develop an automatic tool for screening diabetic retinopathy (DR) in diabetic patients. METHODS: We extracted textures from eye fundus images of each diabetes subject using the grey level co-occurrence matrix method and trained a Bayesian model based on these textures. The receiver operating characteristic (ROC) curve was used to estimate the sensitivity and specificity of the Bayesian model. RESULTS: A total of 1000 eye fundus images from diabetic patients were included, of which 298 eyes were diagnosed as DR by two ophthalmologists. The Bayesian model was trained using four extracted textures, including contrast, entropy, angular second moment and correlation, using a training dataset. The Bayesian model achieved a sensitivity of 0.949 and a specificity of 0.928 in the validation dataset. The area under the ROC curve was 0.938, and the 10-fold cross validation method showed that the average accuracy rate is 93.5%. CONCLUSION: Textures extracted by grey level co-occurrence can be useful information for DR diagnosis, and a trained Bayesian model based on these textures can be an effective tool for DR screening among diabetic patients.
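Note: the four textures named in this abstract (contrast, entropy, angular second moment, correlation) follow directly from a normalized co-occurrence matrix. The sketch below is a minimal illustration, assuming a symmetric GLCM for a single pixel offset and an image already quantized to integer gray levels; the function names and the toy image are hypothetical, and the authors' actual offsets, quantization and preprocessing are not stated in the abstract.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count both directions so the matrix is symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, entropy, angular second moment and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    asm = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    # Correlation is undefined for a constant image; report 0.0 in that case.
    correlation = cov / math.sqrt(var_i * var_j) if var_i > 0 and var_j > 0 else 0.0
    return {"contrast": contrast, "entropy": entropy, "asm": asm, "correlation": correlation}

# Toy 4 x 4 image with gray levels 0..3, horizontal neighbor offset.
image = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
features = texture_features(glcm(image, levels=4))
```

In practice one such feature vector would be computed per image (often averaged over several offsets and angles) and then passed to the classifier.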
Funding: Supported by the General Project-Social Development Field of Shaanxi Province Science and Technology Department, No. 2021SF-313; Innovation Capability Support Plan of Shaanxi Science and Technology Department-Science and Technology Innovation Team, No. 2020TD-048.
Abstract: BACKGROUND The most important consideration in determining treatment strategies for undifferentiated early gastric cancer (UEGC) is the risk of lymph node metastasis (LNM). Therefore, identifying a potential biomarker that predicts LNM is quite useful in determining treatment. AIM To develop a machine learning (ML)-based integral procedure to construct the LNM gray-level co-occurrence matrix (GLCM) prediction model. METHODS We retrospectively selected 526 cases of UEGC confirmed through pathological examination after radical gastrectomy without endoscopic treatment in four tertiary hospitals between January 2015 and December 2021. We extracted GLCM-based features from grayscale images and applied ML to the classification of candidate predictive variables. The robustness and clinical utility of each model were evaluated based on the following factors: receiver operating characteristic curve (ROC), decision curve analysis, and clinical impact curve. RESULTS GLCM-based feature extraction significantly correlated with LNM. The top 7 GLCM-based factors included inertia value 0° (IV_0), inertia value 45° (IV_45), inverse gap 0° (IG_0), inverse gap 45° (IG_45), inverse gap full angle (IG_all), Haralick 30° (Haralick_30), Haralick full angle (Haralick_all), and Entropy. The areas under the ROC curve (AUCs) of the random forest classifier (RFC) model, support vector machine, eXtreme gradient boosting, artificial neural network, and decision tree ranged from 0.805 [95% confidence interval (CI): 0.258-1.352] to 0.925 (95%CI: 0.378-1.472) in the training set and from 0.794 (95%CI: 0.237-1.351) to 0.912 (95%CI: 0.355-1.469) in the testing set, respectively. The RFC (training set: AUC: 0.925, 95%CI: 0.378-1.472; testing set: AUC: 0.912, 95%CI: 0.355-1.469) model that incorporates Entropy, Haralick_all, Haralick_30, IG_all, IG_45, IG_0, and IV_45 had the highest predictive accuracy. CONCLUSION The evaluation results indicate that the method of selecting radiological and textural features becomes more effective in discriminating LNM in UEGC patients. Additionally, the ML-based prediction model developed using the RFC can be used to derive treatment options and identify LNM, which can hence improve clinical outcomes.
Funding: This work is supported by the National Natural Science Foundation of China (No. U1736118), the Natural Science Foundation of Guangdong (No. 2016A030313350), the Special Funds for Science and Technology Development of Guangdong (No. 2016KZ010103), the Key Project of Scientific Research Plan of Guangzhou (No. 201804020068), the Fundamental Research Funds for the Central Universities (No. 16lgjc83 and No. 17lgjc45), and the Science and Technology Planning Project of Guangdong Province (Grant No. 2017A040405051).
Abstract: In recent years, binary image steganography has developed so rapidly that research on binary image steganalysis has become more important for information security. Most state-of-the-art binary image steganographic schemes find the flippable pixels to minimize the embedding distortions. For this reason, the stego images generated by the previous schemes maintain visual quality, and it is hard for a steganalyzer to capture the embedding trace in the spatial domain. However, the distortion maps can be calculated for cover and stego images, and the difference between them is significant. In this paper, a novel binary image steganalytic scheme is proposed, which is based on the distortion level co-occurrence matrix. The proposed scheme first generates the corresponding distortion maps for cover and stego images. Then the co-occurrence matrix is constructed on the distortion level maps to represent the features of cover and stego images. Finally, a support vector machine based on the Gaussian kernel is used to classify the features. Compared with the prior steganalytic methods, experimental results demonstrate that the proposed scheme can effectively detect stego images.
Abstract: Melanoma is one of the lethal and rare types of skin cancer. It is curable at an initial stage and the patient can survive easily. It is very difficult to screen all skin lesion patients due to costly treatment. Clinicians require a correct method for the right treatment based on dermoscopic clinical features such as lesion borders, pigment networks, and the color of melanoma. These challenges require an automated system to classify the clinical features of melanoma and non-melanoma disease. Trained clinicians can overcome issues such as low contrast, lesions varying in size and color, and the existence of several objects like hair, reflections, air bubbles, and oils on almost all images. Active contour is one of the suitable methods, with some drawbacks, for the segmentation of irregular shapes. An entropy and morphology-based automated mask selection is proposed for the active contour method. The proposed method can improve the overall segmentation along with the boundary of melanoma images. In this study, features have been extracted to perform the classification on different texture scales like the gray level co-occurrence matrix (GLCM) and local binary pattern (LBP). Four different moments are extracted in six different color spaces (HSV, Lin RGB, YIQ, YCbCr, XYZ, and CIE L*a*b), and the global information from the different color channels is then combined. Therefore, hybrid fused features, with texture features as local, color features as global, shape features, and an artificial neural network (ANN) as classifier, have been proposed for the categorization of malignant and non-malignant lesions. Experiments were carried out on the datasets Dermis, DermQuest, and PH2. The results of our advanced method showed superiority in contrast with the existing state-of-the-art techniques.
Abstract: Since the efficiency of treatment of thyroid disorder depends on the risk of malignancy, indeterminate follicular neoplasm (FN) images should be classified. The diagnosis process has been done by visual interpretation of experienced pathologists. However, it is difficult to separate the favor benign from borderline types. Thus, this paper presents a classification approach based on a 3D nuclei model to classify favor benign and borderline types of follicular thyroid adenoma (FTA) in cytological specimens. The proposed method utilized a 3D gray level co-occurrence matrix (GLCM) and a random forest classifier. It was applied to 22 data sets of FN images. Furthermore, the use of 3D GLCM was compared with 2D GLCM to evaluate the classification results. In the experiments, the proposed system achieved a classification accuracy of 95.45%. The use of 3D GLCM was better than 2D GLCM according to the accuracy of classification. Consequently, the proposed method may help a pathologist as a prescreening tool.
Abstract: By employing the elastic and elastic-plastic finite element method (FEM), the effects of matrix features on the stress transfer mechanisms of short fiber composites are studied. In the calculation, the variations in matrix modulus, yield strength and hardening modulus are considered. It is concluded that large deformation of the matrix is harmful to the improvement of the mechanical performance of the composites.
Funding: the National Natural Science Foundation of China (No. 51134024/E0422) for the financial support.
Abstract: Based on the stability and inequality of texture features between coal and rock, this study used the digital image analysis technique to propose a coal–rock interface detection method. By using the gray level co-occurrence matrix, twenty-two texture features were extracted from the images of coal and rock. The dimension of the feature space was reduced to four by feature selection, which was carried out according to a separability criterion based on inter-class mean difference and within-class scatter. The experimental results show that the optimized features were effective in improving the separability of the samples and reducing the time complexity of the algorithm. In the optimized low-dimensional feature space, the coal–rock classifier was set up using the Fisher discriminant method. Using the 10-fold cross-validation technique, the performance of the classifier was evaluated, and an average recognition rate of 94.12% was obtained. The results of comparative experiments show that the identification performance of the proposed method was superior to the texture description methods based on the gray histogram and gradient histogram.
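Note: a separability criterion built from inter-class mean difference and within-class scatter can be sketched per feature as a Fisher-style score. The sketch below is a minimal illustration under that assumption: the exact criterion used by the authors is not given in the abstract, and the function names and toy samples are hypothetical.

```python
from statistics import fmean, pvariance

def fisher_score(class_a, class_b):
    """Squared mean difference divided by the summed within-class variances (one feature)."""
    scatter = pvariance(class_a) + pvariance(class_b)
    gap = (fmean(class_a) - fmean(class_b)) ** 2
    return gap / scatter if scatter > 0 else float("inf")

def rank_features(samples_a, samples_b, keep):
    """Return the indices of the `keep` highest-scoring features and all scores."""
    n_features = len(samples_a[0])
    scores = [
        fisher_score([s[k] for s in samples_a], [s[k] for s in samples_b])
        for k in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda k: scores[k], reverse=True)
    return order[:keep], scores

# Hypothetical texture-feature vectors (three features per sample) for two classes.
coal = [[1.0, 5.0, 0.2], [1.2, 3.0, 0.9], [0.8, 4.0, 0.5]]
rock = [[3.0, 4.5, 0.4], [3.2, 3.5, 0.7], [2.8, 4.0, 0.6]]
selected, scores = rank_features(coal, rock, keep=1)
```

Features with a large mean gap and small spread within each class score highest; keeping only those reduces the dimension before a discriminant classifier is trained.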
Funding: This work was supported in part by the national science foundation project of P.R. China under Grant No. 61701554, State Language Commission Key Project (ZDl135-39), First class courses (Digital Image Processing: KC2066), MUC 111 Project, and Ministry of Education Collaborative Education Project (201901056009, 201901160059, 201901238038).
Abstract: With the development of satellite technology, the satellite imagery of the earth’s surface and the whole surface makes it possible to survey surface resources and master the dynamic changes of the earth with high efficiency and low consumption. As an important tool for satellite remote sensing image processing, remote sensing image classification has become a hot topic. According to the natural texture characteristics of remote sensing images, this paper combines different texture features with the Extreme Learning Machine, and proposes a new remote sensing image classification algorithm. The experimental tests are carried out on the standard test datasets SAT-4 and SAT-6. Our results show that the proposed method is a simpler and more efficient remote sensing image classification algorithm. It also achieves 99.434% recognition accuracy on SAT-4, which is 1.5% higher than the 97.95% accuracy achieved by DeepSat. At the same time, the recognition accuracy on SAT-6 reaches 99.5728%, which is 5.6% higher than DeepSat’s 93.9%.
Funding: support from the Ministry of Education (MOE) Singapore Tier 1 (RG8/20).
Abstract: A large database is desired for machine learning (ML) technology to make accurate predictions of materials' physicochemical properties based on their molecular structure. When a large database is not available, the development of a proper featurization method based on the physicochemical nature of target properties can improve the predictive power of ML models with a smaller database. In this work, we show that two new featurization methods, volume occupation spatial matrix and heat contribution spatial matrix, can improve the accuracy in predicting energetic materials' crystal density (ρ_crystal) and solid phase enthalpy of formation (H_f,solid) using a database containing 451 energetic molecules. Their mean absolute errors are reduced from 0.048 g/cm³ and 24.67 kcal/mol to 0.035 g/cm³ and 9.66 kcal/mol, respectively. By leave-one-out cross-validation, the newly developed ML models can be used to determine the performance of most kinds of energetic materials except cubanes. Our ML models are applied to predict ρ_crystal and H_f,solid of CHON-based molecules of the 150 million sized PubChem database, and 56 candidates with competitive detonation performance and reasonable chemical structures are screened out. With further improvement in the future, spatial matrices have the potential of becoming multifunctional ML simulation tools that could provide even better predictions in wider fields of materials science.
Funding: the Ph.D. Programs Foundation of Ministry of Education of China (20040248046).
Abstract: The identification of the correspondences of points of views is an important task. A new feature matching algorithm for weakly calibrated stereo images of curved scenes is proposed, based on mere geometric constraints. After initial correspondences are built via the epipolar constraint, many point-to-point image mappings called homographies are set up to predict the matching position for feature points. To refine the predictions and reject false correspondences, four schemes are proposed. Extensive experiments on simulated data as well as on real images of scenes of varying depths show that the proposed method is effective and robust.
Funding: supported by the National Natural Science Foundation of China (61139002, 61171132, 61300167), the Natural Science Foundation of Jiangsu Education Department (12KJB520013), the Open Project Program of Jiangsu Provincial Key Laboratory of Computer Information Processing Technology, the Qing Lan Project of Jiangsu Province, and the Starting Foundation for Doctoral Scientific Research, Nantong University (14B20).
Abstract: To accelerate the selection process of feature subsets in the rough set theory (RST), an ensemble elitist roles based quantum game (EERQG) algorithm is proposed for feature selection. Firstly, the multilevel elitist roles based dynamics equilibrium strategy is established, and both immigration and emigration of elitists are able to be self-adaptive to balance between exploration and exploitation for feature selection. Secondly, the utility matrix of trust margins is introduced to the model of multilevel elitist roles to enhance various elitist roles' performance of searching the optimal feature subsets, and the win-win utility solutions for feature selection can be attained. Meanwhile, a novel ensemble quantum game strategy is designed as an intriguing exhibiting structure to perfect the dynamics equilibrium of multilevel elitist roles. Finally, the ensemble manner of multilevel elitist roles is employed to achieve the global minimal feature subset, which will greatly improve the feasibility and effectiveness. Experimental results show the proposed EERQG algorithm has superiority compared to the existing feature selection algorithms.
Funding: Supported by the National Natural Science Foundation of China (71001072) and the Natural Science Foundation of Guangdong Province (9451806001002294).
Abstract: A fast feature ranking algorithm for classification in the presence of high dimensionality and small sample size is proposed. The basic idea is that the important features force the data points of the same class to maintain their intrinsic neighbor relations, whereas neighboring points of different classes no longer stick to one another. Applying this assumption, an optimization problem weighting each feature is derived. The algorithm does not involve the dense matrix eigen-decomposition, which can be computationally expensive in time. Extensive experiments are conducted to validate the significance of selected features using the Yale, Extended YaleB and PIE datasets. The thorough evaluation shows that, using a one-nearest neighbor classifier, the recognition rates using 100-500 leading features selected by the algorithm distinctively outperform those with features selected by the baseline feature selection algorithms, while using a support vector machine, features selected by the algorithm show less prominent improvement. Moreover, the experiments demonstrate that the proposed algorithm is particularly efficient for the multi-class face recognition problem.
Funding: Project (No. 035115039) supported by the Scientific Committee of Shanghai, China.
Abstract: Mean shift is a widely used clustering algorithm in image segmentation. However, the segmentation results are not as good as expected when dealing with textured surfaces, due to the influence of the textures. Therefore, an approach based on wavelet transform (WT), co-occurrence matrix (COM) and mean shift is proposed in this paper. First, WT and COM are employed to extract the optimal resolution approximation of the original image as the feature image. Then, mean shift is used on the feature image to obtain better detection results. Finally, experiments are done to show that this approach is effective.
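Note: the clustering step named in this abstract can be sketched as follows. This is a minimal illustration, assuming a flat (uniform) kernel and feature vectors given as tuples; the bandwidth, the toy one-dimensional features and the mode-merging rule are hypothetical choices, and the wavelet and co-occurrence stages of the paper are not reproduced.

```python
import math

def mean_shift(points, bandwidth, iterations=50, tol=1e-6):
    """Shift every point to the mean of its neighbors until it stops moving (flat kernel)."""
    modes = [tuple(p) for p in points]
    for _ in range(iterations):
        moved = 0.0
        shifted = []
        for m in modes:
            # Neighbors are always taken from the original data, not from the shifted modes.
            near = [p for p in points if math.dist(m, p) <= bandwidth]
            centre = tuple(sum(axis) / len(near) for axis in zip(*near))
            moved = max(moved, math.dist(m, centre))
            shifted.append(centre)
        modes = shifted
        if moved < tol:
            break
    return modes

def label_modes(modes, bandwidth):
    """Give the same label to modes that converged within half a bandwidth of each other."""
    centres, labels = [], []
    for m in modes:
        for k, c in enumerate(centres):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(k)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels, centres

# Hypothetical one-dimensional feature values for six pixels of a feature image.
pixels = [(1.0,), (1.2,), (0.9,), (5.0,), (5.1,), (4.8,)]
labels, centres = label_modes(mean_shift(pixels, bandwidth=1.0), bandwidth=1.0)
```

Each pixel ends up labeled by the density mode it converges to, which is why the quality of the feature image fed into mean shift matters so much on textured surfaces.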
Funding: supported in part by the Humanity & Social Science general project of Ministry of Education under Grant No. 14YJAZH046, the National Science Foundation of China under Grant No. 61402304, and the Beijing Educational Committee Science and Technology Development Plan under Grant No. KM201610028015.
Abstract: Collaborative filtering, as one of the most popular techniques, plays an important role in recommendation systems. However, when the user-item rating matrix is sparse, its performance degrades. Recently, domain-specific recommendation approaches have been developed to address this problem. The basic idea is to partition the users and items into overlapping domains, and then perform recommendation in each domain independently. Here, a domain means a group of users having similar preference for a group of products. However, these domain-specific methods, consisting of two sequential steps, ignore the mutual benefit of domain segmentation and recommendation. Hence, a unified framework is presented in this paper to simultaneously realize recommendation and make use of the domain information underlying the rating matrix. Based on matrix factorization, the proposed model learns both user preferences of multiple domains and preference selection vectors to select relevant features for each group of products. Besides, local context information is utilized from the user-item rating matrix to enhance the new framework. Experimental results on two widely used datasets, namely Ciao and Epinions, demonstrate the effectiveness of the proposed model.
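Note: the matrix factorization that this framework builds on can be sketched on a sparse rating matrix as follows. This is a minimal illustration of plain regularized factorization trained by stochastic gradient descent on the observed entries only; the domain partition and preference selection vectors of the proposed model are not reproduced, and the toy ratings, rank, learning rate and regularization weight are hypothetical.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit ratings[(u, i)] ~ dot(P[u], Q[i]) using only the observed entries."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():
            err = r - sum(a * b for a, b in zip(P[u], Q[i]))
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on the squared error with L2 regularization.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating of user u for item i, including unobserved pairs."""
    return sum(a * b for a, b in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root-mean-square error over the observed ratings."""
    sq = [(r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items()]
    return (sum(sq) / len(sq)) ** 0.5

# Hypothetical sparse ratings: 4 users, 3 items, 8 of 12 entries observed.
ratings = {(0, 0): 5, (0, 1): 4, (1, 0): 4, (1, 2): 1,
           (2, 1): 1, (2, 2): 5, (3, 0): 5, (3, 2): 1}
P, Q = factorize(ratings, n_users=4, n_items=3)
guess = predict(P, Q, 1, 1)  # a pair that was never rated
```

Because the loss only visits observed pairs, sparsity limits how well the factors are constrained, which is the weakness the domain-aware extension in the abstract is meant to address.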