Statistics is a powerful tool for data measurement. Statistical techniques, properly planned and executed, give meaning to otherwise meaningless data. The difficulty some practitioners encounter is that, although numerous statistical methods are available for analysis, their understanding of these tools and their ease in using them are limited. This study has a twofold purpose: first, the literature on categorical data commonly used in research was reviewed; next, we report the results of a survey we designed and executed. Categorical data were collected via questionnaire and analyzed to serve as a backbone of the robustness of categorical data. Several conjectures about the independence of the socio-economic variables and e-commerce were tested. Some of the factors influencing patronage of e-commerce were identified. The literature indicates that as one's academic qualification improves, there is an associated improvement in preference for e-commerce, but the results revealed otherwise. Size of family was found to influence e-commerce. Both income and social status positively affected patronage of e-commerce. Gender also appeared to affect patronage of e-commerce. 62.3% of staff had patronized e-commerce. This shows that e-commerce patronage was gradually increasing. It is therefore our considered view that policy documents regulating and monitoring the use of e-commerce be developed to increase e-commerce participation across the globe. It is also recommended that the bottlenecks that obstruct patronage of e-commerce be addressed so that many more staff will develop a positive attitude towards e-commerce.
In quantum computing, computation is achieved by linear operators in or between Hilbert spaces. In this work, we explore a new computation scheme in which the linear operators of quantum computing are replaced by (higher) functors between two (higher) categories. If the passage from Turing computing to quantum computing is the first quantization of computation, then this new scheme can be viewed as the second quantization of computation. The fundamental problem in realizing this idea is how to realize a (higher) functor physically. We provide a theoretical idea for realizing (higher) functors physically, based on the physics of topological orders.
By introducing the partial actions of primitive inverse semigroups on a set and their globalizations, a structure theorem for E^*-unitary categorical inverse semigroups is obtained.
This article presents an innovative approach to automatic rule discovery for data transformation tasks leveraging XGBoost, a machine learning algorithm renowned for its efficiency and performance. The proposed framework fuses diversified feature formats, specifically metadata, textual, and pattern features. The goal is to enhance the system's ability to discern and generalize transformation rules from source to destination formats in varied contexts. First, the article delves into the methodology for extracting these distinct features from raw data and the pre-processing steps undertaken to prepare the data for the model. Subsequent sections expound on the mechanism of feature optimization using Recursive Feature Elimination (RFE) with linear regression, aiming to retain the most contributive features and eliminate redundant or less significant ones. The core of the research revolves around the deployment of the XGBoost model for training, using the prepared and optimized feature sets. The article presents a detailed overview of the mathematical model and algorithmic steps behind this procedure. Finally, the process of rule discovery (prediction phase) by the trained XGBoost model is explained, underscoring its role in real-time, automated data transformations. By employing machine learning, and particularly the XGBoost model, in the context of Business Rule Engine (BRE) data transformation, the article underscores a paradigm shift towards more scalable, efficient, and less human-dependent data transformation systems. This research opens doors for further exploration into automated rule discovery systems and their applications in various sectors.
Identifying the factors that influence the heavy metal contents of soil could reveal the sources of soil heavy metal pollution. In this study, a categorical regression was used to identify the factors that influence soil heavy metals. First, environmental factors were associated with soil heavy metal data, and then the degree of influence of different factors on the soil heavy metal contents in Beijing was analyzed using a categorical regression. The results showed that the soil parent material, soil type, land use type, and industrial activity were the main influencing factors, which suggested that these four factors were important sources of soil heavy metals in Beijing. In addition, population density had a certain influence on the soil Pb and Zn contents. The distribution of soil As, Cd, Pb, and Zn was markedly influenced by interactions, such as those between traffic activity and land use type and between industrial activity and population density. The spatial distribution of soil heavy metal hotspots corresponded well with the influencing factors, such as industrial activity, population density, and soil parent material. In this study, the main factors affecting soil heavy metals were identified, and the degree of their influence was ranked. A categorical regression represents a suitable method for identifying the factors that influence soil heavy metal contents and could be used to study the genetic process of regional soil heavy metal pollution.
Appropriate color mapping for categorical data visualization can significantly facilitate the discovery of underlying data patterns and effectively bring out visual aesthetics. Some systems suggest predefined palettes for this task. However, a predefined color mapping is not always optimal, failing to consider users' needs for customization. Given an input categorical data visualization and a reference image, we present an effective method to automatically generate a coloring that resembles the reference while allowing classes to be easily distinguished. We extract a color palette with high perceptual distance between the colors by sampling dominant and discriminable colors from the image's color space. These colors are assigned to given classes by solving an integer quadratic program to optimize point distinctness of the given chart while preserving the color spatial relations in the source image. We show results on various coloring tasks, with a diverse set of new coloring appearances for the input data. We also compare our approach to state-of-the-art palettes in a controlled user study, which shows that our method achieves comparable performance in class discrimination, while being more similar to the source image. User feedback after using our system verifies its efficiency in automatically generating desirable colorings that meet the user's expectations when choosing a reference.
We clarify the relation between the subcategory $D^b_{hf}(A)$ of homologically finite objects in $D^b(A)$ and the subcategory $K^b(P)$ of perfect complexes in $D^b(A)$, by giving two classes of abelian categories $A$ with enough projective objects such that $D^b_{hf}(A) = K^b(P)$, and finding an example such that $D^b_{hf}(A) \neq K^b(P)$. We realize the bounded derived category $D^b(A)$ as a Verdier quotient of the relative derived category $D^b_C(A)$, where $C$ is an arbitrary resolving contravariantly finite subcategory of $A$. Using these relative derived categories, we get categorical resolutions of a class of bounded derived categories of module categories of infinite global dimension. We prove that if an Artin algebra $A$ of infinite global dimension has a module $T$ with $\operatorname{inj.dim} T < \infty$ such that ${}^{\perp}T$ is finite, then $D^b(\operatorname{mod} A)$ admits a categorical resolution; and that for a CM (Cohen-Macaulay)-finite Gorenstein algebra, such a categorical resolution is weakly crepant.
As digital technologies have advanced more rapidly, the number of paper documents converted into a digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the real-time classification of digitized documents has been identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret those traits, which was not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes may be predicted. The segmentation process divides the input image into separate segments for a more thorough image study. This study proposes a deep learning-enabled framework for automated document classification, which can be implemented in higher education. To further this goal, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine learning techniques. The results demonstrate that the proposed deep learning model performs well in document categorization, overtaking the other machine learning models by reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. The achieved results indicate that the proposed deep model is acceptable to use in practice as an assistant to an office worker.
Abstract: This paper provides a brief introduction to the methods for generating fuzzy categorical maps from remotely sensed images (in graphical and digital forms). This is followed by a description of the slicing process for deriving fuzzy boundaries from fuzzy categorical maps, which can be based on the maximum fuzzy membership values, confusion index, or measure of entropy. Results from an empirical test performed in an Edinburgh suburb show that fuzzy boundaries of land cover can be derived from aerial photographs and satellite images by using the three criteria with small differences, and that slicing based on the maximum fuzzy membership values is the easiest and most straightforward solution. This, in turn, implies the suitability of maintaining both a crisp classification and its underlying certainty map for deriving fuzzy boundaries at different thresholds, which is a flexible and compact way of managing categorical map data and their uncertainty.
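The three slicing criteria named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the confusion index is assumed to be one minus the gap between the two largest memberships, the entropy is assumed to be Shannon entropy normalized by the log of the number of classes, and the function names, the sample pixels, and the 0.6 threshold are all hypothetical.

```python
import math

def max_membership(m):
    """Largest fuzzy membership value of a pixel."""
    return max(m)

def confusion_index(m):
    """Assumed form: 1 - (largest - second largest); near 1 means ambiguous."""
    top, second = sorted(m, reverse=True)[:2]
    return 1.0 - (top - second)

def normalized_entropy(m):
    """Assumed form: Shannon entropy of the memberships, scaled to [0, 1]."""
    h = -sum(p * math.log(p) for p in m if p > 0)
    return h / math.log(len(m))

def slice_boundary(memberships, threshold=0.6):
    """Flag pixels whose maximum membership falls below the threshold."""
    return [max_membership(m) < threshold for m in memberships]

# Hypothetical membership vectors for three pixels over three classes.
pixels = [
    [0.90, 0.05, 0.05],  # confidently one class
    [0.50, 0.45, 0.05],  # between two classes
    [0.34, 0.33, 0.33],  # highly uncertain
]
boundary = slice_boundary(pixels, threshold=0.6)
```

Varying the threshold over a stored maximum-membership (certainty) map yields boundaries of different widths without recomputing the classification, which is the compactness argument the abstract makes.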
Funding: the National Natural Science Foundation (No. 40271088); the Research Fund of International Institute of Geo-information Science and Earth Observation.
Abstract: This paper focuses on the issues of categorical database generalization and emphasizes the roles of the supporting data model, integrated data model, spatial analysis, and semantic analysis in database generalization. The framework contents of categorical database generalization transformation are defined. This paper presents an integrated spatial supporting data structure, a semantic supporting model, and a similarity model for categorical database generalization. The concept of a transformation unit is proposed in generalization.
Funding: Supported by the National Natural Science Foundation (No. 30471236).
Abstract: Simple linear regression analysis has been used to map QTL for quantitative traits. Many traits of biological interest and/or economic importance in various species show binary phenotypic distributions (e.g., presence or absence). It has been shown that such a binary trait can also be analyzed with simple linear regression, with virtually no loss in power compared to generalized linear model analysis. A binary trait is a special case of a multiple categorical trait (e.g., low, medium, or high). We propose a mechanism to decompose a multiple categorical trait into an array of correlated binary variables. The resulting multiple binary traits are analyzed with a multivariate linear regression method. Turning the problem of categorical trait mapping into that of multivariate mapping allows the exploration of pleiotropic effects of QTL for different categories. The efficiency of the method is verified through a series of simulation experiments.
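The decomposition step can be illustrated with plain indicator coding. This is only a sketch under an assumption: the abstract does not spell out the decomposition mechanism, so one 0/1 variable per category is used here as a stand-in, and the function name and sample phenotypes are hypothetical.

```python
def decompose_trait(phenotypes, categories):
    """Return one 0/1 indicator vector per category (assumed coding)."""
    return {c: [1 if p == c else 0 for p in phenotypes] for c in categories}

# Hypothetical phenotypes for five individuals.
phenotypes = ["low", "high", "medium", "low", "high"]
indicators = decompose_trait(phenotypes, ["low", "medium", "high"])

# Each individual belongs to exactly one category, so the indicator
# variables always sum to 1 per individual and are therefore correlated.
row_sums = [sum(indicators[c][i] for c in indicators) for i in range(len(phenotypes))]
```

Because the indicators are constrained to sum to one per individual, they are necessarily correlated, which is consistent with treating them jointly in a multivariate regression rather than one at a time.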
Funding: Supported by Australian Research Council Discovery (DP130102691); the National Science Foundation of China (61302157); China National 863 Project (2012AA12A308); China Pre-research Project of Nuclear Industry (FZ1402-08).
Abstract: In this paper, a novel coupled attribute similarity learning method based on multi-label categorical data (CASonMLCD) is proposed. The CASonMLCD method not only computes the correlations between different attributes and multi-label sets using information gain, which can be regarded as the degree of importance of each attribute in the attribute learning method, but also analyzes the intra-coupled and inter-coupled interactions between an attribute value pair for different attributes and multiple labels. The paper compares the CASonMLCD method with the OF distance and Jaccard similarity, based on the MLKNN algorithm, according to five common evaluation criteria. The experimental results demonstrate that the CASonMLCD method can mine the similarity relationship more accurately and comprehensively, and that it obtains better performance than the compared methods.
Funding: Supported by the National Natural Science Foundation of China (601133010).
Abstract: In this paper, a new approach for visualizing multivariate categorical data is presented. The approach uses a graph to represent multivariate categorical data and draws the graph in such a way that patterns, trends, and relationships within the data can be identified. A mathematical model for the graph layout problem is derived, and a spectral graph drawing algorithm for visualizing multivariate categorical data is proposed. The experiments show that the drawings produced by the algorithm capture the structures of multivariate categorical data well and that the computation is fast.
Abstract: Clustering on categorical variables has received intensive attention. In a dataset with categorical features, some features show superior performance in the clustering procedure. In this paper, we propose a simple method to find such distinctive features by comparing the pooled within-cluster mean relative difference, and then partition the data on these features and give the subspace of the subgroups. Applications to the zoo data and soybean data illustrate the performance of the proposed method.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 51178132); "Thirteenth Five-year" Social Science Research Project of the Education Department in Jilin Province (Grant No. Ji UNESCO co word [2016] No. 382th).
Abstract: On the basis of extension architectonics, this paper studies the process of extension categorical data mining for extension interior design. In accordance with the theory of extension data mining, extension categorical data mining for extension interior design can be divided into data preparation, the mining operation, and knowledge application. The paper expounds the main content and cohesive relations of each link, and discusses in particular extension acquisition, analysis extension, categorical mining extension, knowledge application extension, and several other core nodes related to data. Through the knowledge fusion of extension architectonics and data mining, the paper discusses the process of knowledge requirements with multiple classification under different mining targets. The purpose of this paper is to explore a complete categorical data mining process for interior design, from extension design data to knowledge discovery and extension application.
Abstract: Clustering categorical data, an integral part of data mining, has attracted much attention recently. In this paper, the authors formally define the categorical data clustering problem as an optimization problem from the viewpoint of cluster ensembles, and apply a cluster ensemble approach to clustering categorical data. Experimental results on real datasets show that better clustering accuracy can be obtained compared with existing categorical data clustering algorithms.
Funding: Supported by the Qatar National Research Fund, No. UREP 10-022-3-005.
Abstract: BACKGROUND: Premenstrual syndrome (PMS) is the constellation of physical and psychological symptoms before menstruation. Premenstrual dysphoric disorder (PMDD) is a severe form of PMS with more depressive and anxiety symptoms. The Mini international neuropsychiatric interview, module U (MINI-U), assesses the diagnostic criteria for probable PMDD. The Premenstrual Symptoms screening tool (PSST) measures the severity of these symptoms. AIM: To compare the PSST ordinal scores with the corresponding dichotomous MINI-U answers. METHODS: Arab women (n = 194) residing in Doha, Qatar, received the MINI-U and PSST. Receiver Operating Characteristics (ROC) analyses provided the cut-off scores on the PSST using MINI-U as a gold standard. RESULTS: All PSST ratings were higher in participants with positive responses on MINI-U. In addition, ROC analyses showed that all areas under the curves were significant with the cut-off scores on PSST. CONCLUSION: This study confirms that the severity measures from PSST can recognize patients with moderate/severe PMS and PMDD who would benefit from immediate treatment.
Abstract: Among the many ideas that arise in graph theory, graph labeling has become especially popular. Graph labelings provide useful mathematical models for a wide range of applications in high technology (cryptography, astronomy, data security, various coding theory problems, communication networks, etc.). A labeling, or valuation, of a graph is any mapping that sends a certain set of graph elements (vertices and/or edges) to a set of numbers (usually positive integers), called labels, subject to certain conditions. If the domain is the vertex set or the edge set, the labeling is called a vertex labeling or an edge labeling, respectively. Similarly, if the domain is V(G)∪E(G), the labeling is called a total labeling. A reflexive edge irregular k-labeling of a graph, introduced by Tanna et al., is a total labeling x such that for any two different edges ab and a'b' of the graph, the weights wt_x(ab)=x(a)+x(ab)+x(b) and wt_x(a'b')=x(a')+x(a'b')+x(b') are distinct. The smallest value of k for which such a labeling exists is called the reflexive edge strength of the graph and is denoted by res(G). In this paper, we find the exact value of the reflexive edge irregularity strength of the categorical product of two paths (P_a×P_b) for any a≥3 and b≥3.
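The weight condition above can be checked mechanically. The following sketch builds the edge set of the categorical (tensor) product of two paths and tests whether a given total labeling gives pairwise distinct edge weights. The function names are hypothetical, the sample labeling is deliberately naive and far from optimal, and the admissible ranges for vertex and edge labels in a reflexive labeling are not enforced here.

```python
from itertools import product


def categorical_product_of_paths(a, b):
    """Edges of P_a x P_b: (i, j) ~ (i', j') iff |i - i'| = 1 and |j - j'| = 1."""
    edges = []
    for i, j in product(range(a), range(b)):
        for dj in (-1, 1):
            # Only step forward in the first coordinate so each edge is listed once.
            if i + 1 < a and 0 <= j + dj < b:
                edges.append(((i, j), (i + 1, j + dj)))
    return edges


def edge_weights(edges, label):
    """wt(ab) = x(a) + x(ab) + x(b); `label` maps both vertices and edges to numbers."""
    return {(u, v): label[u] + label[(u, v)] + label[v] for u, v in edges}


def is_edge_irregular(edges, label):
    """True if all edge weights are pairwise distinct."""
    weights = edge_weights(edges, label).values()
    return len(set(weights)) == len(edges)


# Naive example labeling (an assumption, not the paper's construction):
# every vertex gets 0 and the edges get 1, 2, ..., m.
edges = categorical_product_of_paths(3, 3)
labels = {v: 0 for e in edges for v in e}
labels.update({e: n for n, e in enumerate(edges, start=1)})
print(len(edges), is_edge_irregular(edges, labels))
```

A checker like this only verifies a candidate labeling; determining res(G) requires finding the smallest k for which some valid labeling exists.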
Abstract: To classify DNA sequences, k-mer frequency is widely used because it converts variable-length sequences into fixed-length numerical feature vectors. However, when the DNA sequences to be classified have a fixed length, subsequences starting at a specific position of the given sequence can also be used as categorical features. In a performance evaluation on six datasets of fixed-length DNA sequences, our algorithm based on this idea achieved comparable or better performance than other state-of-the-art algorithms.
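The contrast between the two feature encodings can be sketched as follows. The abstract does not specify the subsequence length or which start positions are used, so the choice of every start position with a fixed length `k`, and both function names, are assumptions for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(seq, k, alphabet="ACGT"):
    """Numerical features: count of every possible k-mer, in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(alphabet, repeat=k)]


def positional_features(seq, k):
    """Categorical features: the length-k subsequence starting at each position.

    Feature i is only comparable across sequences when all sequences have
    the same length, which is the fixed-length setting described above.
    """
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


seq = "ACGTAC"
print(kmer_frequency_vector(seq, 2))  # 16 counts, one per dinucleotide
print(positional_features(seq, 2))    # one categorical value per start position
```

The k-mer vector discards where each k-mer occurs, while the positional encoding keeps that information at the cost of requiring aligned, equal-length inputs.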
Abstract: This paper proposes two new algorithms for classifying objects with categorical attributes. These algorithms are derived from the assumption that the attributes of different object classes have different probability distributions. One algorithm classifies objects based on the distribution of the attribute frequencies, and the other classifies objects based on the distribution of the pairwise attribute frequencies, described using a matrix of pairwise frequencies. Both algorithms are based on the method of invariants, which offers the simplest dependencies for estimating the probability that an object belongs to each class from the average frequency of its attributes. The estimated object class corresponds to the maximum probability. This method reflects models of sensory processing in animals and aims to recognize an object class by searching for a prototype in information accumulated in the brain. Because the frequency matrices may be sparse, the solution cannot be determined for some objects. For these objects, an analog of the k-nearest neighbors method is provided in which, for each attribute value, the class to which the majority of the k nearest objects in the training sample belong is determined, and the most likely class value is calculated. The efficiency of the two algorithms was confirmed on five databases.
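The idea of scoring a class by the average frequency of an object's attribute values can be sketched in a few lines. This is a simplified stand-in written under stated assumptions, not the authors' method of invariants: it covers only single-attribute frequencies, omits the pairwise-frequency variant and the k-nearest-neighbors fallback, and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def fit(rows, classes):
    """Per class, count how often each (attribute index, value) pair occurs."""
    freq = defaultdict(Counter)
    size = Counter(classes)
    for row, c in zip(rows, classes):
        for idx, value in enumerate(row):
            freq[c][(idx, value)] += 1
    return freq, size


def predict(row, freq, size):
    """Return the class with the highest average relative attribute frequency."""
    def score(c):
        # Relative frequency of each attribute value among training objects of class c.
        rel = [freq[c][(idx, v)] / size[c] for idx, v in enumerate(row)]
        return sum(rel) / len(rel)
    return max(size, key=score)


# Toy training set (hypothetical).
rows = [("red", "round"), ("red", "oval"), ("green", "round"),
        ("green", "long"), ("yellow", "long"), ("green", "long")]
classes = ["apple", "apple", "apple", "bean", "bean", "bean"]
freq, size = fit(rows, classes)
print(predict(("red", "round"), freq, size))
print(predict(("green", "long"), freq, size))
```

When an attribute value never occurs in a class its frequency is zero, which is the kind of sparsity the abstract says must be handled separately.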
Abstract: Statistics is a powerful tool for data measurement. Statistical techniques, properly planned and executed, give meaning to otherwise meaningless data. The difficulty some practitioners encounter is that, although numerous statistical methods are available for analysis, their understanding of these tools and their ease in using them are limited. This study has a twofold purpose: first, the literature on categorical data commonly used in research was reviewed; next, we report the results of a survey we designed and executed. Categorical data were collected via questionnaire and analyzed to serve as a backbone of the robustness of categorical data. Several conjectures about the independence of socio-economic variables and e-commerce were tested. Some of the factors influencing patronage of e-commerce were identified. The literature indicates that as one's academic qualification improves, there is an associated improvement in one's preference for e-commerce, but the results revealed otherwise. Family size was found to influence e-commerce. Both income and social status positively affected patronage of e-commerce. Gender also appeared to affect patronage of e-commerce. 62.3% of staff had patronized e-commerce. This shows that e-commerce patronage was gradually increasing. It is therefore our considered view that policy documents regulating and monitoring the use of e-commerce should be developed to increase e-commerce participation across the globe. It is also recommended that the bottlenecks that obstruct patronage of e-commerce be addressed so that many more staff develop a positive attitude towards e-commerce.
Funding: We are supported by Guangdong Provincial Key Laboratory (Grant No. 2019B121203002). L.K. is also supported by the National Natural Science Foundation of China under Grant No. 11971219; Guangdong Basic and Applied Basic Research Foundation under Grant No. 2020B1515120100. H.Z. is also supported by the National Natural Science Foundation of China under Grant No. 11871078.
Abstract: In quantum computing, computation is achieved by linear operators in or between Hilbert spaces. In this work, we explore a new computation scheme in which the linear operators of quantum computing are replaced by (higher) functors between two (higher) categories. If the passage from Turing computing to quantum computing is the first quantization of computation, then this new scheme can be viewed as the second quantization of computation. The fundamental problem in realizing this idea is how to realize a (higher) functor physically. We provide a theoretical idea for realizing (higher) functors physically based on the physics of topological orders.
Abstract: By introducing partial actions of primitive inverse semigroups on a set and their globalizations, a structure theorem for E^*-unitary categorical inverse semigroups is obtained.
Abstract: This article presents an approach to automatic rule discovery for data transformation tasks that leverages XGBoost, a machine learning algorithm known for its efficiency and performance. The proposed framework fuses diversified feature formats, specifically metadata, textual, and pattern features. The goal is to enhance the system’s ability to discern and generalize transformation rules from source to destination formats in varied contexts. First, the article describes the methodology for extracting these distinct features from raw data and the pre-processing steps taken to prepare the data for the model. Subsequent sections explain the mechanism of feature optimization using Recursive Feature Elimination (RFE) with linear regression, which aims to retain the most contributive features and eliminate redundant or less significant ones. The core of the research is the deployment of the XGBoost model for training, using the prepared and optimized feature sets. The article presents a detailed overview of the mathematical model and algorithmic steps behind this procedure. Finally, the process of rule discovery (the prediction phase) by the trained XGBoost model is explained, underscoring its role in real-time, automated data transformations. By employing machine learning, and particularly the XGBoost model, in the context of Business Rule Engine (BRE) data transformation, the article underscores a paradigm shift towards more scalable, efficient, and less human-dependent data transformation systems. This research opens the door to further exploration of automated rule discovery systems and their applications in various sectors.
Funding: This research was supported by the National Natural Science Foundation of China (Grant Nos. 41771510 and 41271478) and the Science and Technology Service Network Initiative (STS) of the Chinese Academy of Sciences (No. KFJ-STS-ZDTP-007). In addition, the authors would like to thank Professor Yucheng Chen of Southwest University for helping us to analyze the data.
Abstract: Identifying the factors that influence the heavy metal contents of soil could reveal the sources of soil heavy metal pollution. In this study, categorical regression was used to identify the factors that influence soil heavy metals. First, environmental factors were associated with soil heavy metal data; then, the degree of influence of different factors on the soil heavy metal contents in Beijing was analyzed using categorical regression. The results showed that soil parent material, soil type, land use type, and industrial activity were the main influencing factors, which suggests that these four factors are important sources of soil heavy metals in Beijing. In addition, population density had a certain influence on the soil Pb and Zn contents. The distribution of soil As, Cd, Pb, and Zn was markedly influenced by interactions, such as those between traffic activity and land use type and between industrial activity and population density. The spatial distribution of soil heavy metal hotspots corresponded well with the influencing factors, such as industrial activity, population density, and soil parent material. In this study, the main factors affecting soil heavy metals were identified, and the degree of their influence was ranked. Categorical regression is a suitable method for identifying the factors that influence soil heavy metal contents and could be used to study the genetic process of regional soil heavy metal pollution.
Funding: Supported in part by the National Natural Science Foundation of China (U2001206, 61872250), GD Talent Program (2019JC05X328), GD Natural Science Foundation (2020A0505100064, 2021B1515020085), DEGP Key Project (2018KZDXM058), and Shenzhen Science and Technology Key Program (RCJC20200714114435012, JCYJ20210324120213036).
Abstract: Appropriate color mapping for categorical data visualization can significantly facilitate the discovery of underlying data patterns and effectively bring out visual aesthetics. Some systems suggest predefined palettes for this task. However, a predefined color mapping is not always optimal, as it fails to consider users’ needs for customization. Given an input categorical data visualization and a reference image, we present an effective method to automatically generate a coloring that resembles the reference while allowing classes to be easily distinguished. We extract a color palette with high perceptual distance between the colors by sampling dominant and discriminable colors from the image’s color space. These colors are assigned to the given classes by solving an integer quadratic program that optimizes point distinctness of the given chart while preserving the spatial relations of colors in the source image. We show results on various coloring tasks, with a diverse set of new coloring appearances for the input data. We also compare our approach to state-of-the-art palettes in a controlled user study, which shows that our method achieves comparable performance in class discrimination while being more similar to the source image. User feedback after using our system verifies its efficiency in automatically generating desirable colorings that meet the user’s expectations when choosing a reference.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11271251 and 11431010).
Abstract: We clarify the relation between the subcategory $D^b_{hf}(A)$ of homologically finite objects in $D^b(A)$ and the subcategory $K^b(P)$ of perfect complexes in $D^b(A)$, by giving two classes of abelian categories $A$ with enough projective objects such that $D^b_{hf}(A) = K^b(P)$, and finding an example such that $D^b_{hf}(A) \neq K^b(P)$. We realize the bounded derived category $D^b(A)$ as a Verdier quotient of the relative derived category $D^b_C(A)$, where $C$ is an arbitrary resolving contravariantly finite subcategory of $A$. Using these relative derived categories, we obtain categorical resolutions of a class of bounded derived categories of module categories of infinite global dimension. We prove that if an Artin algebra $A$ of infinite global dimension has a module $T$ with $\mathrm{inj.dim}\,T < \infty$ such that ${}^{\perp}T$ is finite, then $D^b(\mathrm{mod}\,A)$ admits a categorical resolution; and that for a CM (Cohen-Macaulay)-finite Gorenstein algebra, such a categorical resolution is weakly crepant.
Abstract: As digital technologies have advanced, the number of paper documents converted into digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the real-time classification of digitized documents was identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret document traits in ways that were not conceivable ten years ago. Deep learning aids in comprehending input patterns so that object classes can be predicted. The segmentation process divides the input image into separate segments for a more thorough image study. This study proposes a deep learning-enabled framework for automated document classification, which can be implemented in higher education. To this end, a dataset was developed that includes seven categories: Diplomas, Personal documents, Journal of Accounting of higher education diplomas, Service letters, Orders, Production orders, and Student orders. Subsequently, a deep learning model based on Conv2D layers is proposed for the document classification process. In the final part of this research, the proposed model is evaluated and compared with other machine learning techniques. The results demonstrate that the proposed deep learning model performs well in document categorization, outperforming the other machine learning models by reaching 94.84%, 94.79%, 94.62%, 94.43%, and 94.07% in accuracy, precision, recall, F-score, and AUC-ROC, respectively. These results indicate that the proposed deep model is acceptable for use in practice as an assistant to an office worker.