Big data is becoming increasingly important because of the enormous volume of information generated and stored in recent years, and it poses a challenge to data mining techniques and data management. Based on the geometric explosion of information in the era of big data, this paper studies possible approaches to balancing the maximum value of information against its privacy, and lays out a Nine-Cells information matrix and a hierarchical classification. Furthermore, the paper uses rough set theory, proceeding from the two dimensions of value and privacy, to establish an information classification method and put forward countermeasures for information security. Taking spam messages as an example, massive volumes of spam messages can be classified, and a targeted hierarchical management strategy is then put forward. The paper proposes a personal information index system, an information management platform, and possible solutions for protecting information security and utilizing information value in the age of big data.
We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier can shed the burden of large dictionaries and complex segmentation processing and thus be independent of domain and time. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchical classification of Chinese text documents based on N-grams achieves satisfactory performance and outperforms other traditional Chinese text classifiers.
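As an illustration of why N-gram features avoid dictionaries and word segmentation, the following is a minimal sketch of character N-gram extraction. It is not the authors' implementation; the function names and the sample string are assumptions made for this example.

```python
from collections import Counter


def char_ngrams(text, n=2):
    """Return overlapping character n-grams of `text`, ignoring whitespace.

    No dictionary or word segmenter is involved: every run of n adjacent
    characters becomes a feature, including runs that straddle word
    boundaries.
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def ngram_profile(text, n=2):
    """Count n-gram occurrences; a hypothetical feature vector for a classifier."""
    return Counter(char_ngrams(text, n))


# Hypothetical sample text ("data mining, data classification").
profile = ngram_profile("数据挖掘 数据分类", n=2)
```

A hierarchical classifier would then be trained on such profiles at each level of the category tree; that part is omitted here because the abstract does not describe it.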
With the deterioration of the environment, it is imperative to protect coastal wetlands. Using multi-source remote sensing data and object-based hierarchical classification is an effective way to classify them. An object-based hierarchical classification using remote sensing indices (OBH-RSI) is proposed to achieve fine classification of coastal wetlands. First, the original categories are divided into four groups according to their characteristics. Second, the training and test maps of each group are extracted according to the remote sensing indices. Third, the four groups are passed through the classifier in order. Finally, the results of the four groups are combined to produce the final classification map. Experimental results on the Yellow River Delta dataset show that the overall accuracy, average accuracy, and kappa coefficient of the proposed strategy all exceed 94%.
This study proposes a weighted sampling hierarchical classification learning method based on an efficient backbone network model to address the high cost, low accuracy, and time-consuming nature of traditional tea disease recognition methods. The method enhances feature extraction by conducting hierarchical classification learning based on the EfficientNet model, effectively alleviating the impact of the high similarity between tea diseases on the model’s classification performance. To better handle the small number and uneven distribution of tea disease samples, the study introduces a weighted sampling scheme to optimize data processing, which not only alleviates the overfitting caused by too few samples but also balances the probability of drawing samples from imbalanced classes. Experimental results show that the proposed method is effective in identifying healthy tea leaves and four common tea leaf diseases (tea algal spot disease, tea white spot disease, tea anthracnose disease, and tea leaf blight disease). After the “weighted sampling hierarchical classification learning method” was applied to train 7 different efficient backbone networks, the accuracy of most of them improved. The EfficientNet-B1 model proposed in this study achieved an accuracy of 99.21% with this learning method, higher than EfficientNet-B2 (98.82%) and MobileNet-V3 (98.43%). In addition, to put the identification results to practical use, the study developed a mini-program that runs on WeChat. Users can quickly obtain accurate identification results, together with the corresponding disease descriptions and prevention methods, through simple operations. This intelligent tool for identifying tea diseases can serve as an auxiliary tool for farmers, consumers, and researchers and has practical value.
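The abstract does not specify the weighting scheme. One common way to balance the probability of drawing samples from imbalanced classes is inverse class-frequency weighting, sketched below as an assumption. The class counts and function names are hypothetical.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample the weight 1 / (size of its class).

    Every class then carries the same total weight, so rare classes are
    drawn as often as common ones.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample(items, labels, k, seed=0):
    """Draw k items with replacement using the inverse-frequency weights."""
    rng = random.Random(seed)
    return rng.choices(items, weights=inverse_frequency_weights(labels), k=k)


# Hypothetical imbalanced set: 90 healthy leaves, 10 anthracnose leaves.
labels = ["healthy"] * 90 + ["anthracnose"] * 10
weights = inverse_frequency_weights(labels)
batch = weighted_sample(list(range(len(labels))), labels, k=32)
```

Sampling with replacement means rare-class images are repeated within an epoch, which is why such schemes are usually paired with data augmentation.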
Hierarchical Text Classification (HTC) aims to match text to hierarchical labels. Existing methods overlook two critical issues. First, some texts cannot be fully matched to leaf-node labels and should be classified to the correct parent node rather than forced to a leaf node as the final classification target. Second, error propagation occurs when a misclassification at a parent node propagates down the hierarchy, ultimately leading to inaccurate predictions at the leaf nodes. To address these limitations, we propose an uncertainty-guided, depth-aware HTC model called DepthMatch. Specifically, we design an uncertainty-based early stopping strategy that identifies incomplete matches between text and labels and classifies such texts to the corresponding parent-node labels. This approach allows the classification depth to be determined dynamically by leveraging evidence to quantify and accumulate uncertainty. Experimental results show that DepthMatch outperforms recent strong baselines on four commonly used public datasets: WOS (Web of Science), RCV1-V2 (Reuters Corpus Volume I), AAPD (Arxiv Academic Paper Dataset), and BGC. Notably, on the BGC dataset, it improves Micro-F1 and Macro-F1 scores by at least 1.09% and 1.74%, respectively.
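The abstract does not give DepthMatch's formulas. The sketch below only illustrates the general idea of stopping at a parent node when evidence is weak. It assumes the subjective-logic uncertainty u = K / Σ(e_k + 1) used in evidential classification, a fixed threshold, and hypothetical labels and evidence values. It checks each level separately and does not accumulate uncertainty across levels as the paper describes.

```python
def uncertainty(evidence):
    """Uncertainty u = K / sum(e_k + 1) for K classes with evidence e_k >= 0.

    With no evidence at all u equals 1; it shrinks as evidence grows.
    """
    return len(evidence) / sum(e + 1.0 for e in evidence)


def predict_path(level_evidence, threshold=0.5):
    """Descend the hierarchy level by level and stop early when unsure.

    level_evidence: list of dicts (label -> evidence), ordered root to leaf,
    each already conditioned on the parent chosen above it (an assumption
    that keeps the sketch short).
    """
    path = []
    for scores in level_evidence:
        if uncertainty(list(scores.values())) > threshold:
            break  # incomplete match: keep the parent label as the answer
        path.append(max(scores, key=scores.get))
    return path


# Hypothetical two-level example: confident at level 1, unsure at level 2.
levels = [{"CS": 9.0, "Medical": 1.0}, {"ML": 0.5, "Networks": 0.4}]
path = predict_path(levels)
```

Here the text is assigned to the parent label only, because the second-level evidence is too weak to commit to a leaf.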
Landform elements with varying morphologies and spatial arrangements are recognized as feature indicators for landform classification and play a critical role in geomorphological studies. Differential geometry methods have been extensively applied in prior landform element research, but their efficacy in differentiating similar morphological characteristics remains inadequate. To reduce reliance on geomorphometric variables and increase awareness of landform patterns, the geomorphons method was developed in a previous study; it produces a landform reclassification map based on a lookup table. In addition, to address the problem of feature similarity, hierarchical classification has been proposed and used effectively for terrain recognition through the analysis of fuzzy gradient features. Combining the advantages of these two approaches, this study proposes a hierarchical framework for landform element pattern recognition that considers both morphology and hierarchy. First, the local triplet patterns derived from geomorphons are enhanced by setting a flatness threshold and then used for primary landform element recognition. Then, because geomorphic units with the same morphology have different spatial analytical scales, the landform elements left unidentified are determined under the principle of scale adaptation by calculating spatial correlation and entropy information. To verify the effectiveness of the proposed method, sampling points were randomly selected from NASADEM data and validated against a real 3D terrain model. Quantitative results of landform element pattern recognition show that the approach reaches an average accuracy above 77%. Additionally, it delineates local details more effectively than geomorphons in visual assessment, resulting in a 7% accuracy improvement at the overall scale.
This study assesses the projected changes in the climate zoning of Côte d’Ivoire using the hierarchical classification of principal components (HCPC) method applied to the daily precipitation data of an ensemble of 14 CORDEX-AFRICA simulations under the RCP4.5 and RCP8.5 scenarios. The results indicate three climate zones in Côte d’Ivoire (the coast, the centre, and the north) over the historical period (1981-2005). Moreover, the CORDEX simulations project an extension of the surface area of the drier climate zones and a reduction of the wetter zones, associated with the appearance of an intermediate climate zone whose surface area varies from 77,560 km² to 134,960 km² depending on the period and the scenario. These results highlight the potential impacts of climate change on the delimitation of the climate zones of Côte d’Ivoire under these greenhouse gas emission scenarios. They imply a reduction in the surface area suitable for the production of cash crops such as cocoa and coffee, which could hinder the country’s economy and development, both of which rely mainly on these cash crops.
A vast amount of information has been produced in recent years, which poses a huge challenge to information management. Making better use of big data is of theoretical and practical significance for effectively handling and managing messages. In this paper, we propose a nine-rectangle-grid information model based on information value and privacy, and then present information use policies based on rough set theory. Recurrent neural networks are employed to classify OTT messages. The content of user interest is incorporated into the classification process during the annotation of OTT messages, yielding a reliable trained classification model. Experimental results show that the proposed method achieves accurate classification performance and can therefore be used for effective distribution and control of OTT messages.
This paper proposes a hierarchical word domain assignment algorithm to automatically build domain dictionaries from a Machine-Readable Dictionary (MRD). The word domain assignment process can be divided into three steps: 1) constructing the hierarchical structure; 2) training the classifier; 3) assigning word domains. Compared with traditional methods, the hierarchical word domain assignment algorithm improves the accuracy of word domain assignment while reducing the human effort spent on collecting corpora. Experiments on WordNet 2.0 show that 62.53% of the first domain labels match WordNet Domains 3.0 when gloss-based word domain assignment is used, and that performance can be further improved by exploiting the hierarchical relationships among the domain sets.
Hierarchical multi-granularity image classification is a challenging task that aims to tag each given image with labels at multiple granularities simultaneously. Existing methods tend to overlook that different image regions contribute differently to label prediction at different granularities, and they give insufficient consideration to the relationships between hierarchical multi-granularity labels. We introduce a sequence-to-sequence mechanism to overcome these two problems and propose a multi-granularity sequence generation (MGSG) approach for the hierarchical multi-granularity image classification task. Specifically, we use a transformer architecture to encode the image into visual representation sequences. Next, we traverse the taxonomic tree, organize the multi-granularity labels into sequences, vectorize them, and add positional information. The proposed method builds a decoder that takes the visual representation sequences and semantic label embeddings as inputs and outputs the predicted multi-granularity label sequence. The decoder models dependencies and correlations between multi-granularity labels through a masked multi-head self-attention mechanism, and relates visual information to semantic label information through a cross-modality attention mechanism. In this way, the proposed method preserves the relationships between labels at different granularity levels and takes into account the influence of different image regions on labels at different granularities. Evaluations on six public benchmarks demonstrate the advantages of the proposed method both qualitatively and quantitatively. Our project is available at https://github.com/liuxindazz/mgs.
In this paper, a multilevel secure relation hierarchical data model for multilevel secure databases is extended from the relation hierarchical data model of the single-level environment. Based on the model, an upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. A system based on the multilevel secure relation hierarchical data model can store and manipulate, in an integrated way, complicated objects (e.g., multilevel spatial data) and conventional data (e.g., integers, real numbers, and character strings) in a multilevel secure database.
Molecular subtyping of gastric cancer (GC) aims to comprehend its genetic landscape. However, the efficacy of current subtyping methods is hampered by their mixed use of molecular features, a lack of strategy optimization, and the limited availability of public GC datasets. There is a pressing need for a precise and easily adoptable subtyping approach for early DNA-based screening and treatment. Based on TCGA subtypes, we developed a novel DNA-based hierarchical classifier for gastric cancer molecular subtyping (HCG), which employs gene mutations, copy number aberrations, and methylation patterns as predictors. By incorporating the closely related esophageal adenocarcinoma dataset, we expanded the TCGA GC dataset for the training and testing of HCG (n=453). HCG was optimized through three hierarchical strategies using Lasso-Logistic regression, which were evaluated by their overall area under the receiver operating characteristic curve (auROC), accuracy, F1 score, and area under the precision-recall curve (auPRC), and by their capability for clinical stratification using multivariate survival analysis. Subtype-specific DNA alteration biomarkers were identified through difference tests based on HCG-defined subtypes. Our HCG classifier demonstrated superior performance in terms of overall auROC (0.95), accuracy (0.88), F1 score (0.87), and auPRC (0.86), significantly improving the clinical stratification of patients (overall p-value=0.032). Difference tests identified 25 subtype-specific DNA alterations, including a high mutation rate in the SYNE1, ITGB4, and COL22A1 genes for the MSI subtype, and hypermethylation of the ALS2CL, KIAA0406, and RPRD1B genes for the EBV subtype. HCG is an accurate and robust classifier for DNA-based GC molecular subtyping with highly predictive clinical stratification performance. The training and test datasets, along with the analysis programs of HCG, are accessible on GitHub (github.com/LabxSCUT).
In some image classification tasks, the similarities among different categories vary, and samples are often misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its classification performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of the different levels are trained using a transfer learning method that fine-tunes the pre-trained deep CNN model on the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments on the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate the strong expressive power of the deep features produced by two-level hierarchical feature learning. Our proposed method effectively increases classification accuracy compared with flat multi-class classification methods.
Automatic modulation classification (AMC) aims to identify the modulation format of received signals corrupted by noise and plays a major role in radio monitoring. In this paper, we propose a novel cascaded convolutional neural network (CasCNN)-based hierarchical digital modulation classification scheme that classifies M-ary phase shift keying (PSK) and M-ary quadrature amplitude modulation (QAM) formats. In CasCNN, two convolutional neural network blocks are cascaded. The first block classifies the signal into a modulation class, namely PSK or QAM. The second block identifies the index of the modulation within the same PSK or QAM class. The grid constellation diagram extracted from the received signal is used as the input to CasCNN. Extensive simulations demonstrate that CasCNN yields a performance gain and is more robust to frequency offset than other recent methods. Specifically, CasCNN achieves 90% classification accuracy at a signal-to-noise ratio of 4 dB when the symbol length is set to 256.
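The two-block cascade can be sketched as a coarse decision followed by a family-specific fine decision. The code below is only a structural sketch: the CNN blocks are replaced by hypothetical toy rules on noise-free constellation points (constant magnitude means PSK, and the number of distinct points gives the order), and all function names are assumptions.

```python
def toy_family(points, tol=1e-6):
    """Stand-in for block 1: PSK points share one magnitude, QAM points do not."""
    magnitudes = [abs(p) for p in points]
    return "PSK" if max(magnitudes) - min(magnitudes) < tol else "QAM"


def toy_order(points):
    """Stand-in for block 2: count the distinct constellation points."""
    return len({(round(p.real, 3), round(p.imag, 3)) for p in points})


def cascade_classify(points, family_clf, order_clfs):
    """Run the coarse classifier, then the fine classifier for that family."""
    family = family_clf(points)
    order = order_clfs[family](points)
    return f"{order}-{family}"


# One fine classifier per family; here both families reuse the same toy rule.
order_clfs = {"PSK": toy_order, "QAM": toy_order}

qpsk = [1 + 0j, 1j, -1 + 0j, -1j]
qam16 = [complex(a, b) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)]
label_qpsk = cascade_classify(qpsk, toy_family, order_clfs)
label_qam = cascade_classify(qam16, toy_family, order_clfs)
```

The point of the cascade is that the second stage only has to separate formats within one family, which is an easier problem than a single flat decision over all formats.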
Objective: To develop a new technique for assessing the risk of birth defects, which are a major cause of infant mortality and disability in many parts of the world. Methods: The region of interest in this study was Heshun County, the county in China with the highest rate of neural tube defects (NTDs). A hybrid particle swarm optimization/ant colony optimization (PSO/ACO) algorithm was used to quantify the probability of NTDs occurring at villages with no births. The hybrid PSO/ACO algorithm is a form of artificial intelligence adapted for hierarchical classification. It is a powerful technique for modeling complex problems involving impacts of causes. Results: The algorithm was easy to apply, with the accuracy of the results being 69.5% ± 7.02% at the 95% confidence level. Conclusion: The proposed method is simple to apply, has acceptable fault tolerance, and greatly enhances the accuracy of calculations.
This paper proposes a security policy model for mandatory access control in a class B1 database management system with tuple-level labeling. The relation hierarchical data model is extended to a multilevel relation hierarchical data model. Based on this model, the concept of upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. The system is based on the multilevel relation hierarchical data model and can store and manipulate, in an integrated way, multilevel complicated objects (e.g., multilevel spatial data) and multilevel conventional data (e.g., integers, real numbers, and character strings).
Background: Knowledge of the different kinds of tree communities that currently exist can provide a baseline for assessing the ecological attributes of forests and monitoring future changes. Forest inventory data can facilitate the development of this baseline knowledge across broad extents, but they first must be classified into forest community types. Here, we compared three alternative classifications across the United States using data from over 117,000 U.S. Department of Agriculture Forest Service Forest Inventory and Analysis (FIA) plots. Methods: Each plot had three forest community type labels: (1) "FIA" types were assigned by the FIA program using a supervised method; (2) "USNVC" types were assigned via a key based on the U.S. National Vegetation Classification; (3) "empirical" types resulted from unsupervised clustering of tree species information. We assessed the degree to which analog classes occurred among classifications, compared indicator species values, and used random forest models to determine how well the classifications could be predicted using environmental variables. Results: The classifications generated groups of classes that had broadly similar distributions, but often there was no one-to-one analog across the classifications. The longleaf pine forest community type stood out as the exception: it was the only class with strong analogs across all classifications. Analogs were most lacking for forest community types with species that occurred across a range of geographic and environmental conditions, such as loblolly pine types. Indicator species metrics were generally high for the USNVC, suggesting that USNVC classes are floristically well-defined. The empirical classification was best predicted by environmental variables. The most important predictors differed slightly but were broadly similar across all classifications, and included slope, amount of forest in the surrounding landscape, average minimum temperature, and other climate variables. Conclusions: The classifications have similarities and differences that reflect their differing approaches and objectives. They are most consistent for forest community types that occur in a relatively narrow range of environmental conditions, and differ most for types with wide-ranging tree species. Environmental variables at a variety of scales were important for predicting all classifications, though strongest for the empirical and FIA, suggesting that each is useful for studying how forest communities respond to multi-scale environmental processes, including global change drivers.
The paper classifies and rates the natural landscape resources of Lushan Mountain. Based on this evaluation and on the current state of exploitation and utilization of the scenic area's natural landscape resources, it offers recommendations for the further development of the Lushan Mountain natural scenic area, with the aim of providing a theoretical reference for the sustainable development of Lushan Mountain's natural landscape resources.
To study social inequalities, indices can be used to summarize the multiple dimensions of socioeconomic status. As part of the Equit’Area Project, a public health program focused on social and environmental health inequalities, a statistical procedure to create (neighborhood) socioeconomic indices was developed. This procedure uses successive principal component analyses to select variables and create the index. To simplify the application of the procedure for non-specialists, the R package SesIndexCreatoR was created. It allows the creation of the index with all the possible options of the procedure, the classification of the resulting index into categories using several classical methods, the visualization of the results, and the generation of automatic reports.
Several methods are used to partition scintigraphic images, among them Kohonen’s self-organizing map algorithm. The objective of this study was to perform an ascending hierarchical classification (HAC) on the results of the Kohonen self-organizing map. This constitutes the second phase needed to build the classifier: grouping the neurons as well as possible into 3 classes and then reconstituting the scintigraphic image from those 3 classes. The partition proceeds by successive groupings, merging two subsets of neurons at each iteration using Ward’s method as the similarity measure. In this method, the algorithm aggregates the nearest neurons into classes, which yields a tree-like dendrogram that must then be cut. To find an adequate cut-off level, we computed the variation of the Davies-Bouldin index as a function of the number of classes. The minimum value of this index gave the optimal number of classes, which was 3 in this study. The three groups A, B, and C differ in intensity, which can be high, medium, or low. The high, medium, and low intensities corresponded respectively to metastases for class A, to degenerative or inflammatory phenomena for class B, and to normal radiopharmaceutical uptake for class C. To confirm this strong suspicion, we performed reconstructions using a filter. After this reconstruction, we obtained images like the input images. To interpret these images, we used a visual metric. This enabled us to note that for the interval [0 - 50[, the image is not contrasted and no lesion could be detected. Over the interval [50 - 200[, we observed the distribution of the radiopharmaceutical over the entire skeleton. On this reconstruction interval, the visual metric shows hypofixation in the bladder and areas suspected of metastases. Over the interval [200 - 250[, we detected hyperfixations linked to degenerative, inflammatory, or metastatic lesions. Finally, in the last interval, [250 - 252], we found regions that showed strong uptake (bladder, sternum, etc.). This uptake is physiological. Apart from physiological hyperfixation, the other types of hyperfixation were considered metastatic by the two nuclear scientists who interpreted these images. In total, the HAC allowed us to sub-classify the data into 3 groups, which were subsequently reconstructed. This reconstruction technique highlighted the periarticular metastases belonging to the class [250 - 252]. This allowed us to highlight the oligo-metastases and to carry out a radical prostatectomy in most of these patients.
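Choosing the dendrogram cut by minimizing the Davies-Bouldin index can be illustrated with a small sketch. It assumes scalar intensity values and that candidate partitions are already available (as they would be after cutting a Ward dendrogram at different levels). The data and names are hypothetical and the clustering step itself is omitted.

```python
def davies_bouldin_1d(clusters):
    """Davies-Bouldin index for clusters of scalar values (lower is better)."""
    centroids = [sum(c) / len(c) for c in clusters]
    # Scatter: mean absolute distance of each member to its cluster centroid.
    scatter = [sum(abs(x - m) for x in c) / len(c)
               for c, m in zip(clusters, centroids)]
    worst = []
    for i in range(len(clusters)):
        # Similarity of cluster i to every other cluster; keep the worst case.
        ratios = [(scatter[i] + scatter[j]) / abs(centroids[i] - centroids[j])
                  for j in range(len(clusters)) if j != i]
        worst.append(max(ratios))
    return sum(worst) / len(worst)


# Hypothetical partitions of the same six intensity values into 2 and 3 classes.
candidates = {
    2: [[0.0, 1.0, 10.0, 11.0], [20.0, 21.0]],
    3: [[0.0, 1.0], [10.0, 11.0], [20.0, 21.0]],
}
best_k = min(candidates, key=lambda k: davies_bouldin_1d(candidates[k]))
```

In this toy data the three-class partition has the lower index because each class is compact relative to the distance between class centroids.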
Abstract: Big data is becoming increasingly important because of the enormous volume of information generated and stored in recent years, and it poses a challenge to data mining techniques and data management. Based on the geometric explosion of information in the era of big data, this paper studies possible approaches to balancing the maximum value of information against its privacy, and lays out a Nine-Cells information matrix and a hierarchical classification. Furthermore, the paper uses rough set theory, proceeding from the two dimensions of value and privacy, to establish an information classification method and put forward countermeasures for information security. Taking spam messages as an example, massive volumes of spam messages can be classified, and a targeted hierarchical management strategy is then put forward. The paper proposes a personal information index system, an information management platform, and possible solutions for protecting information security and utilizing information value in the age of big data.
Funding: Supported by the China Postdoctoral Science Foundation.
Abstract: We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier can shed the burden of large dictionaries and complex segmentation processing and thus be independent of domain and time. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchical classification of Chinese text documents based on N-grams achieves satisfactory performance and outperforms other traditional Chinese text classifiers.
Funding: Supported by the Beijing Natural Science Foundation (No. JQ20021), the National Natural Science Foundation of China (Nos. 61922013, 61421001 and U1833203), and the Remote Sensing Monitoring Project of Geographical Elements in Shandong Yellow River Delta National Nature Reserve.
Abstract: With the deterioration of the environment, it is imperative to protect coastal wetlands. Using multi-source remote sensing data and object-based hierarchical classification is an effective way to classify them. An object-based hierarchical classification using remote sensing indices (OBH-RSI) is proposed to achieve fine classification of coastal wetlands. First, the original categories are divided into four groups according to their characteristics. Second, the training and test maps of each group are extracted according to the remote sensing indices. Third, the four groups are passed through the classifier in order. Finally, the results of the four groups are combined to produce the final classification map. Experimental results on the Yellow River Delta dataset show that the overall accuracy, average accuracy, and kappa coefficient of the proposed strategy all exceed 94%.
Funding: Financial support was provided by the Major Project of Yunnan Science and Technology, under Project No. 202302AE09002003, entitled "Research on the Integration of Key Technologies in Smart Agriculture."
Abstract: This study proposes a weighted sampling hierarchical classification learning method based on an efficient backbone network to address the high cost, low accuracy, and long turnaround of traditional tea disease recognition methods. The method enhances feature extraction by conducting hierarchical classification learning on the EfficientNet model, effectively alleviating the impact of the high similarity between tea diseases on classification performance. To handle the small number and uneven distribution of tea disease samples, the study introduces a weighted sampling scheme into data processing, which both alleviates the overfitting caused by too few samples and balances the probability of drawing samples from imbalanced classes. The experimental results show that the proposed method was effective in identifying healthy tea leaves and four common leaf diseases of tea (tea algal spot disease, tea white spot disease, tea anthracnose disease, and tea leaf blight disease). After the weighted sampling hierarchical classification learning method was applied to train 7 different efficient backbone networks, most of their accuracies improved. The EfficientNet-B1 model proposed in this study achieved an accuracy of 99.21% with this learning method, higher than EfficientNet-B2 (98.82%) and MobileNet-V3 (98.43%). In addition, to put the identification results to practical use, the study developed a mini-program that runs on WeChat. Through simple operations, users can quickly obtain accurate identification results together with the corresponding disease descriptions and prevention methods. This tool can assist farmers, consumers, and researchers and has practical value.
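A minimal sketch of how weighted sampling can balance imbalanced classes is given here. It assumes inverse-class-frequency weights, which the abstract does not confirm as the authors' exact scheme; the function names and the 90/10 class split are hypothetical.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each sample the weight 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample(items, labels, k, seed=0):
    """Draw k items with replacement so every class is equally likely."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(items, weights=weights, k=k)


# Hypothetical imbalanced set: 90 healthy leaves, 10 leaf-blight leaves.
labels = ["healthy"] * 90 + ["leaf blight"] * 10
batch = weighted_sample(labels, labels, k=10000)
blight_share = batch.count("leaf blight") / len(batch)
```

With these weights each class carries the same total probability mass, so the rare class is drawn about as often as the common one despite having nine times fewer samples.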
Funding: Sponsored by the National Key Research and Development Program of China (No. 2021YFF0704100), the National Natural Science Foundation of China (No. 62136002), the Chongqing Natural Science Foundation (No. cstc2022ycjh-bgzxm0004), and the Science and Technology Commission of Chongqing Municipality (CSTB2023NSCQ-LZX0006).
Abstract: Hierarchical Text Classification (HTC) aims to match text to hierarchical labels. Existing methods overlook two critical issues. First, some texts cannot be fully matched to leaf node labels and need to be classified to the correct parent node rather than forced onto a leaf node as the final classification target. Second, error propagation occurs when a misclassification at a parent node propagates down the hierarchy, ultimately leading to inaccurate predictions at the leaf nodes. To address these limitations, we propose an uncertainty-guided, depth-aware HTC model called DepthMatch. Specifically, we design an early stopping strategy with uncertainty to identify incomplete matches between text and labels and classify such texts into the corresponding parent node labels. This approach allows us to determine the classification depth dynamically by leveraging evidence to quantify and accumulate uncertainty. Experimental results show that DepthMatch outperforms recent strong baselines on four commonly used public datasets: WOS (Web of Science), RCV1-V2 (Reuters Corpus Volume I), AAPD (Arxiv Academic Paper Dataset), and BGC. Notably, on the BGC dataset, it improves Micro-F1 and Macro-F1 scores by at least 1.09% and 1.74%, respectively.
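The early-stopping idea can be sketched as a top-down walk that halts at a parent node once accumulated uncertainty grows too large. This is only an illustration under stated assumptions, not the DepthMatch implementation: the uncertainty formula is the standard subjective-logic form u = K / Σ(e_k + 1), the accumulation is a plain sum, and the label tree, evidence values, and threshold are all hypothetical.

```python
def dirichlet_uncertainty(evidence):
    """Subjective-logic uncertainty u = K / sum(e_k + 1) over K child labels."""
    k = len(evidence)
    return k / (sum(evidence) + k)


def classify_with_depth(tree, evidence_fn, root="root", threshold=0.6):
    """Descend the label tree; stop at the current node once the summed
    per-level uncertainty exceeds `threshold` (assumed accumulation rule)."""
    node, total_u = root, 0.0
    while tree.get(node):
        children = tree[node]
        evidence = evidence_fn(node, children)
        total_u += dirichlet_uncertainty(evidence)
        if total_u > threshold:
            break  # incomplete match: keep the parent label
        best = max(range(len(children)), key=lambda i: evidence[i])
        node = children[best]
    return node


# Hypothetical two-level hierarchy and per-node evidence for one text.
tree = {"root": ["CS", "Physics"], "CS": ["ML", "Databases"]}
vague = {"root": [18.0, 0.0], "CS": [0.5, 0.5]}
clear = {"root": [18.0, 0.0], "CS": [0.0, 18.0]}

stopped_early = classify_with_depth(tree, lambda n, c: vague[n])
reached_leaf = classify_with_depth(tree, lambda n, c: clear[n])
```

In the first call the evidence under "CS" is weak, so the walk stops at the parent label; in the second the evidence is strong at both levels and the walk reaches a leaf.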
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41930102, 41971339 and 41771423) and the Shandong University of Science and Technology Research Fund (No. 2019TDJH103).
Abstract: Landform elements with varying morphologies and spatial arrangements are recognized as feature indicators for landform classification and play a critical role in geomorphological studies. The differential geometry method has been applied extensively in prior landform element research, but it remains inadequate for differentiating similar morphological characteristics. To reduce reliance on geomorphometric variables and increase awareness of landform patterns, a previous study introduced the geomorphons method, which produces a landform reclassification map based on a lookup table. In addition, to address the problem of feature similarity, hierarchical classification was proposed and used effectively for terrain recognition through the analytical strategy of fuzzy gradient features. Combining the advantages of these two approaches, this study proposes a hierarchical framework for landform element pattern recognition that considers both morphology and hierarchy. First, the local triplet patterns derived from geomorphons were enhanced by setting a flatness threshold and then adopted for primary landform element recognition. Then, because geomorphic units with the same morphology have different spatial analytical scales, the landform elements left unidentified were determined under the principle of scale adaptation by calculating spatial correlation and entropy information. To verify the effectiveness of the proposed method, sampling points were randomly selected from NASADEM data and validated against a real 3D terrain model. Quantitative results show that the approach reaches an average accuracy above 77%. In visual assessment it also delineates local details more effectively than geomorphons, yielding a 7% accuracy improvement at the overall scale.
Abstract: This study assesses the projected changes in the climate zoning of Côte d’Ivoire using the hierarchical classification of principal components (HCPC) method applied to the daily precipitation data of an ensemble of 14 CORDEX-AFRICA simulations under the RCP4.5 and RCP8.5 scenarios. The results indicate the existence of three climate zones in Côte d’Ivoire (the coastal, the centre and the north) over the historical period (1981-2005). Moreover, the CORDEX simulations project an expansion of the surface area of drier climatic zones and a reduction of wetter zones, together with the appearance of an intermediate climate zone whose surface area varies from 77,560 km² to 134,960 km² depending on the period and the scenario. These results highlight the potential impacts of climate change on the delimitation of the climate zones of Côte d’Ivoire under the greenhouse gas emission scenarios. As a result, the surface area suitable for producing cash crops such as cocoa and coffee is reduced. This could hinder the country’s economy and development, which are based mainly on these cash crops.
Funding: This work is supported by the Research on Big Data in Application for Education of BUPT (No. 2018Y0403), the Fundamental Research Funds of BUPT (Nos. 2018XKJC07, 2018RC27), and the National Natural Science Foundation of China (No. 61571059).
Abstract: A vast amount of information has been produced in recent years, which poses a huge challenge to information management. Making better use of big data is of theoretical and practical significance for effectively addressing and managing messages. In this paper, we propose a nine-rectangle-grid information model based on information value and privacy, and then present information use policies based on rough set theory. Recurrent neural networks were employed to classify OTT messages. The content of user interest is incorporated into the classification process during the annotation of OTT messages, yielding a reliable trained classification model. Experimental results showed that the proposed method achieved accurate classification performance and can therefore be used for effective distribution and control of OTT messages.
Funding: Supported by the BIT Technology Innovation Program "cloud computing-oriented intelligent processing theory and method of massive language information" under Grant No. 3070012231102 and the BIT Fundamental Research Projects under Grant No. 3070012210917.
Abstract: This paper proposes a hierarchical word domain assignment algorithm to automatically build domain dictionaries from a Machine-Readable Dictionary (MRD). The process of word domain assignment can be divided into three steps: 1) hierarchical structure construction; 2) classifier training; 3) word domain assignment. Compared with traditional methods, the hierarchical word domain assignment algorithm improves the accuracy of word domain assignment while reducing the human effort spent on collecting corpora. Experiments on WordNet 2.0 show that 62.53% of the first domain labels match WordNet Domains 3.0 when gloss-based word domain assignment is used, and that performance can be further improved by exploiting the hierarchical relationships among the domain sets.
Funding: Supported by the National Key R&D Program of China (2019YFC1521102), the National Natural Science Foundation of China (61932003), and the Beijing Science and Technology Plan (Z221100007722004).
Abstract: Hierarchical multi-granularity image classification is a challenging task that aims to tag each given image with multiple granularity labels simultaneously. Existing methods tend to overlook that different image regions contribute differently to label prediction at different granularities, and they insufficiently consider the relationships between the hierarchical multi-granularity labels. We introduce a sequence-to-sequence mechanism to overcome these two problems and propose a multi-granularity sequence generation (MGSG) approach for the hierarchical multi-granularity image classification task. Specifically, we introduce a transformer architecture to encode the image into visual representation sequences. Next, we traverse the taxonomic tree, organize the multi-granularity labels into sequences, vectorize them, and add positional information. The proposed method builds a decoder that takes the visual representation sequences and semantic label embeddings as inputs and outputs the predicted multi-granularity label sequence. The decoder models dependencies and correlations between multi-granularity labels through a masked multi-head self-attention mechanism, and relates visual information to semantic label information through a cross-modality attention mechanism. In this way, the proposed method preserves the relationships between labels at different granularity levels and takes into account the influence of different image regions on labels of different granularities. Evaluations on six public benchmarks demonstrate the advantages of the proposed method both qualitatively and quantitatively. Our project is available at https://github.com/liuxindazz/mgs.
Abstract: In this paper, the relation hierarchical data model of the single-level environment is extended into a multilevel secure relation hierarchical data model for multilevel secure databases. Based on the model, an upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. A system based on the multilevel secure relation hierarchical data model is capable of storing and manipulating, in an integrated way, complicated objects (e.g., multilevel spatial data) and conventional data (e.g., integers, real numbers and character strings) in a multilevel secure database.
Funding: Guangdong Basic and Applied Basic Research Foundation, Grant/Award Number: 2022A1515-011426; National Natural Science Foundation of China, Grant/Award Numbers: 61873027, 32000466; Shenzhen Science and Technology Program, Grant/Award Number: RCBS20200714114909234.
Abstract: Molecular subtyping of gastric cancer (GC) aims to comprehend its genetic landscape. However, the efficacy of current subtyping methods is hampered by their mixed use of molecular features, a lack of strategy optimization, and the limited availability of public GC datasets. There is a pressing need for a precise and easily adoptable subtyping approach for early DNA-based screening and treatment. Based on TCGA subtypes, we developed a novel DNA-based hierarchical classifier for gastric cancer molecular subtyping (HCG), which employs gene mutations, copy number aberrations, and methylation patterns as predictors. By incorporating the closely related esophageal adenocarcinoma dataset, we expanded the TCGA GC dataset for the training and testing of HCG (n=453). HCG was optimized through three hierarchical strategies using Lasso-Logistic regression, evaluated by their overall area under the receiver operating characteristic curve (auROC), accuracy, F1 score, and area under the precision-recall curve (auPRC), and by their capability for clinical stratification using multivariate survival analysis. Subtype-specific DNA alteration biomarkers were identified through difference tests based on HCG-defined subtypes. Our HCG classifier demonstrated superior performance in terms of overall auROC (0.95), accuracy (0.88), F1 score (0.87) and auPRC (0.86), significantly improving the clinical stratification of patients (overall p-value=0.032). Difference tests identified 25 subtype-specific DNA alterations, including a high mutation rate in the SYNE1, ITGB4, and COL22A1 genes for the MSI subtype, and hypermethylation of the ALS2CL, KIAA0406, and RPRD1B genes for the EBV subtype. HCG is an accurate and robust classifier for DNA-based GC molecular subtyping with highly predictive clinical stratification performance. The training and test datasets, along with the analysis programs of HCG, are accessible on GitHub (github.com/LabxSCUT).
Funding: Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035).
Abstract: In some image classification tasks, the degree of similarity varies between categories, and samples are usually misclassified into highly similar categories. To distinguish highly similar categories, more specific features are required so that the classifier can improve its performance. In this paper, we propose a novel two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is simple and effective. First, the deep feature extractors of the different levels are trained using a transfer learning method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature extracted from all the categories and the specific feature extracted from highly similar categories are fused into a single feature vector. The final feature representation is then fed into a linear classifier. Finally, experiments on the Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the deep features resulting from two-level hierarchical feature learning are highly expressive. Our proposed method effectively increases classification accuracy in comparison with flat multi-class classification methods.
Funding: National Key Research and Development Program of China (2019YFB1804404); Beijing Natural Science Foundation (4202046); National Natural Science Foundation of China (61801052); Guangdong Key Field R&D Program (2018B010124001).
Abstract: Automatic modulation classification (AMC) aims to identify the modulation format of received signals corrupted by noise and plays a major role in radio monitoring. In this paper, we propose a novel cascaded convolutional neural network (CasCNN)-based hierarchical digital modulation classification scheme, in which M-ary phase shift keying (PSK) and M-ary quadrature amplitude modulation (QAM) formats are the classes to be identified. In CasCNN, two convolutional neural network blocks are cascaded. The first block classifies the signal into a modulation class, namely PSK or QAM. The second block identifies the modulation index within the same PSK or QAM class. The grid constellation diagram extracted from the received signal is used as the input to the CasCNN. Extensive simulations demonstrate that CasCNN yields a performance gain and shows stronger robustness to frequency offset than other recent methods. Specifically, CasCNN achieves 90% classification accuracy at a 4 dB signal-to-noise ratio when the symbol length is set to 256.
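The two-stage control flow can be sketched as follows. This is a toy illustration, not the CasCNN model: simple rule-based functions stand in for the two CNN blocks, the input is a set of noiseless constellation points rather than a grid constellation image, and all function names are hypothetical.

```python
import cmath
import math


def stage1_family(points):
    """Stage 1 stand-in: PSK points share one amplitude; QAM of order
    16 or higher has several distinct amplitudes."""
    radii = {round(abs(p), 6) for p in points}
    return "PSK" if len(radii) == 1 else "QAM"


def psk_order(points):
    """Stage 2 stand-in for PSK: count distinct phases."""
    return len({round(cmath.phase(p), 6) for p in points})


def qam_order(points):
    """Stage 2 stand-in for QAM: count distinct constellation points."""
    return len({(round(p.real, 6), round(p.imag, 6)) for p in points})


# One second-stage classifier per family, chosen by the first stage.
STAGE2 = {"PSK": psk_order, "QAM": qam_order}


def cascade(points):
    family = stage1_family(points)
    return f"{STAGE2[family](points)}-{family}"


psk8 = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]
qam16 = [complex(x, y) for x in (-3, -1, 1, 3) for y in (-3, -1, 1, 3)]
```

The design point is that the second stage only has to separate formats within one family, which is an easier problem than distinguishing every format at once.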
Funding: Supported by the National Natural Science Foundation of China (No. 41101431), the fourth installment of special funding from the China Postdoctoral Science Foundation (No. 201104003), the China Postdoctoral Science Foundation (No. 20100470004), and the State Key Funds of Social Science Project (Research on Disability Prevention Measurement in China, No. 09&ZD072).
Abstract: Objective: To develop a new technique for assessing the risk of birth defects, which are a major cause of infant mortality and disability in many parts of the world. Methods: The region of interest in this study was Heshun County, the county in China with the highest rate of neural tube defects (NTDs). A hybrid particle swarm optimization/ant colony optimization (PSO/ACO) algorithm was used to quantify the probability of NTDs occurring at villages with no births. The hybrid PSO/ACO algorithm is a form of artificial intelligence adapted for hierarchical classification. It is a powerful technique for modeling complex problems involving the impacts of causes. Results: The algorithm was easy to apply, with an accuracy of 69.5% ± 7.02% at the 95% confidence level. Conclusion: The proposed method is simple to apply, has acceptable fault tolerance, and greatly enhances the accuracy of calculations.
Abstract: This paper proposes a security policy model for mandatory access control in a class B1 database management system that labels data at the tuple level. The relation hierarchical data model is extended to a multilevel relation hierarchical data model. Based on this model, the concept of upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. The system is based on the multilevel relation hierarchical data model and is capable of storing and manipulating, in an integrated way, multilevel complicated objects (e.g., multilevel spatial data) and multilevel conventional data (e.g., integers, real numbers and character strings).
Funding: Funding for this work came from the USDA Forest Service Resources Planning Act Assessment, via an agreement with North Carolina State University.
Abstract: Background: Knowledge of the different kinds of tree communities that currently exist can provide a baseline for assessing the ecological attributes of forests and monitoring future changes. Forest inventory data can facilitate the development of this baseline knowledge across broad extents, but they first must be classified into forest community types. Here, we compared three alternative classifications across the United States using data from over 117,000 U.S. Department of Agriculture Forest Service Forest Inventory and Analysis (FIA) plots. Methods: Each plot had three forest community type labels: (1) "FIA" types were assigned by the FIA program using a supervised method; (2) "USNVC" types were assigned via a key based on the U.S. National Vegetation Classification; (3) "empirical" types resulted from unsupervised clustering of tree species information. We assessed the degree to which analog classes occurred among classifications, compared indicator species values, and used random forest models to determine how well the classifications could be predicted using environmental variables. Results: The classifications generated groups of classes that had broadly similar distributions, but often there was no one-to-one analog across the classifications. The longleaf pine forest community type stood out as the exception: it was the only class with strong analogs across all classifications. Analogs were most lacking for forest community types with species that occurred across a range of geographic and environmental conditions, such as loblolly pine types. Indicator species metrics were generally high for the USNVC, suggesting that USNVC classes are floristically well-defined. The empirical classification was best predicted by environmental variables. The most important predictors differed slightly but were broadly similar across all classifications, and included slope, amount of forest in the surrounding landscape, average minimum temperature, and other climate variables. Conclusions: The classifications have similarities and differences that reflect their differing approaches and objectives. They are most consistent for forest community types that occur in a relatively narrow range of environmental conditions, and differ most for types with wide-ranging tree species. Environmental variables at a variety of scales were important for predicting all classifications, though most strongly for the empirical and FIA classifications, suggesting that each is useful for studying how forest communities respond to multi-scale environmental processes, including global change drivers.
Abstract: This paper classifies and rates the natural landscape resources of Lushan Mountain. Based on the evaluation and on the current exploitation and utilization of the scenic area's natural landscape resources, it offers reasonable suggestions for the further development of the Lushan Mountain natural scenic area, in the hope of providing a theoretical reference for the future sustainable development of these resources.
Abstract: In order to study social inequalities, indices can be used to summarize the multiple dimensions of socioeconomic status. As part of the Equit’Area Project, a public health program focused on social and environmental health inequalities, a statistical procedure for creating (neighborhood) socioeconomic indices was developed. This procedure uses successive principal component analyses to select variables and create the index. To simplify the application of the procedure for non-specialists, the R package SesIndexCreatoR was created. It supports the creation of the index with all the options of the procedure, the classification of the resulting index into categories using several classical methods, the visualization of the results, and the generation of automatic reports.
Abstract: Several methods are used to partition a scintigraphic image, among them Kohonen’s self-organizing map algorithm. The objective of this study was to perform an ascending hierarchical classification (HAC) on the results of the Kohonen self-organizing map. This carries out the second phase needed to build the classifier: grouping the neurons as well as possible into 3 classes and then reconstructing the scintigraphic image from those 3 classes. The partition proceeds by successive grouping, merging two subsets of neurons at each iteration according to Ward’s criterion as the measure of similarity, so that the algorithm aggregates the nearest neurons into classes. The result is a tree-like dendrogram, which must then be cut. To find an adequate cut-off level, we computed the variation of the Davies-Bouldin index as a function of the number of classes. The minimum value of this index gave the optimal number of classes, which was 3 in this study. The three groups A, B and C differ in intensity, which can be high, medium or low. The high, medium and low intensities corresponded respectively to metastases for class A, to degenerative or inflammatory phenomena for class B, and to normal radiopharmaceutical uptake for class C. To confirm this strong suspicion, we performed reconstructions using a filter and obtained images resembling the input images. To interpret these images, we used a visual metric. For the interval [0 - 50[, the image is not contrasted and no lesion could be detected. Over the interval [50 - 200[, we observed the distribution of the radiopharmaceutical over the whole skeleton. On this reconstruction interval, the visual metric shows hypofixation in the bladder and areas suspected of metastases. Over the interval [200 - 250[, we detected hyperfixations linked to degenerative, inflammatory or metastatic lesions. Finally, in the last interval, [250 - 252], we found regions that showed strong uptake (bladder, sternum, etc.). This uptake is physiological. Apart from physiological hyperfixation, the other types of hyperfixation were considered metastatic by the two nuclear medicine specialists who interpreted these images. In total, the HAC allowed us to sub-classify the data into 3 groups, which were subsequently reconstructed. This reconstruction technique highlighted the periarticular metastases belonging to the class [250 - 252]. This allowed us to identify oligo-metastases and to carry out a radical prostatectomy in most of these patients.
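The cut-off selection step can be sketched by computing the Davies-Bouldin index for several candidate partitions and keeping the one with the smallest value. The sketch assumes one-dimensional neuron intensities and hand-written candidate partitions standing in for cuts of a Ward dendrogram; the intensity values and function name are hypothetical.

```python
def davies_bouldin(clusters):
    """Davies-Bouldin index for clusters of 1-D values (lower is better)."""
    centroids = [sum(c) / len(c) for c in clusters]
    # Scatter: mean absolute distance of a cluster's members to its centroid.
    scatter = [sum(abs(x - m) for x in c) / len(c)
               for c, m in zip(clusters, centroids)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max((scatter[i] + scatter[j]) / abs(centroids[i] - centroids[j])
                     for j in range(k) if j != i)
    return total / k


# Hypothetical neuron intensities, partitioned into 2, 3 and 4 classes.
candidates = {
    2: [[10, 12, 14, 100, 104, 108], [240, 244, 248]],
    3: [[10, 12, 14], [100, 104, 108], [240, 244, 248]],
    4: [[10, 12, 14], [100, 104, 108], [240], [244, 248]],
}
scores = {k: davies_bouldin(parts) for k, parts in candidates.items()}
best_k = min(scores, key=scores.get)
```

For this toy data the index is smallest for the 3-class partition, which mirrors how the minimum of the index is used to pick the dendrogram cut.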