9 articles found
Coupled Attribute Similarity Learning on Categorical Data for Multi-Label Classification
1
Authors: Zhenwu Wang, Longbing Cao. Journal of Beijing Institute of Technology (EI, CAS), 2017, Issue 3, pp. 404-410, 7 pages.
This paper proposes a novel coupled attribute similarity learning method for multi-label categorical data (CASonMLCD). The CASonMLCD method not only computes the correlations between different attributes and multi-label sets using information gain, which can be regarded as the degree of importance of each attribute in the attribute learning method, but also analyzes the intra-coupled and inter-coupled interactions between an attribute value pair for different attributes and multiple labels. The paper compares the CASonMLCD method with the OF distance and Jaccard similarity, based on the MLKNN algorithm, according to five common evaluation criteria. The experimental results demonstrate that the CASonMLCD method can mine the similarity relationship more accurately and comprehensively, and that it obtains better performance than the compared methods.
Keywords: coupled; similarity; multi-label; categorical data; correlations
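The abstract above names two standard building blocks: information gain as an attribute-importance weight, and Jaccard similarity as a baseline. The following is a minimal sketch of those two quantities only, not the CASonMLCD method itself. It assumes each multi-label set is treated as a single class (a frozenset) and each record is a set of (attribute, value) pairs; the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of hashable labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(attribute_values, labels):
    """Entropy of the labels minus their weighted entropy after splitting on the attribute."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def jaccard(a, b):
    """Jaccard similarity of two collections, compared as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical toy data: one categorical attribute and one label set per record.
colour = ["red", "red", "blue", "blue"]
label_sets = [frozenset({"x"}), frozenset({"x"}), frozenset({"y"}), frozenset({"y"})]
gain = information_gain(colour, label_sets)

record_a = {("colour", "red"), ("shape", "round")}
record_b = {("colour", "red"), ("shape", "square")}
similarity = jaccard(record_a, record_b)
```

Treating every distinct label set as its own class is only one possible reading of "correlations between attributes and multi-label sets"; the paper may define the quantity differently.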
A Graph Drawing Algorithm for Visualizing Multivariate Categorical Data
2
Authors: HUANG Jingwei, HUANG Jie. Wuhan University Journal of Natural Sciences (CAS), 2007, Issue 2, pp. 239-242, 4 pages.
In this paper, a new approach for visualizing multivariate categorical data is presented. The approach uses a graph to represent multivariate categorical data and draws the graph in such a way that patterns, trends and relationships within the data can be identified. A mathematical model for the graph layout problem is derived, and a spectral graph drawing algorithm for visualizing multivariate categorical data is proposed. The experiments show that the drawings produced by the algorithm capture the structures of multivariate categorical data well and that computation is fast.
Keywords: multivariate categorical data; graph; graph drawing; algorithms
Analysis of Extension Categorical Data Mining Process for the Extension Interior Designing
3
Authors: Hui Ma, Guangtian Zou. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2016, Issue 6, pp. 26-31, 6 pages.
On the basis of extension architectonics, this paper studies the process of extension categorical data mining for extension interior design. In accordance with the theory of extension data mining, extension categorical data mining for extension interior design can be divided into data preparation, the mining operation and knowledge application. The paper expounds the main content and cohesive relations of each link, and emphatically discusses extension acquisition, analysis extension, categorical mining extension, knowledge application extension and several other core nodes that are related to data. Through the knowledge fusion of extension architectonics and data mining, the paper discusses the process of knowledge requirements with multiple classification under different mining targets. The purpose of this paper is to explore a whole categorical data mining process of interior design, from extension design data to the design of knowledge discovery and extension application.
Keywords: extension categorical data mining; extension sets; extension interior design
Clustering Categorical Data: A Cluster Ensemble Approach
4
Authors: 何增友 (He Zengyou), Xu Xiaofei, Deng Shengchun. High Technology Letters (EI, CAS), 2003, Issue 4, pp. 8-12, 5 pages.
Clustering categorical data, an integral part of data mining, has attracted much attention recently. In this paper, the authors formally define the categorical data clustering problem as an optimization problem from the viewpoint of cluster ensemble, and apply a cluster ensemble approach to clustering categorical data. Experimental results on real datasets show that this approach obtains better clustering accuracy than existing categorical data clustering algorithms.
Keywords: clustering; categorical data; cluster ensemble; data mining
Validating Intrinsic Factors Informing E-Commerce: Categorical Data Analysis Demo
5
Authors: Anthony Joe Turkson, John Awuah Addor, Douglas Yenwon Kharib. Open Journal of Statistics, 2021, Issue 5, pp. 737-758, 22 pages.
Statistics is a powerful tool for data measurement. Statistical techniques, properly planned and executed, give meaning to meaningless data. The difficulty some practitioners encounter hinges on the fact that, although numerous statistical methods are available for analysis, their understanding of these tools and their ease in using them are limited. This study has a twofold purpose: first, literature on categorical data commonly used in research was reviewed; next, we reported the results of a survey we designed and executed. Categorical data was collected via questionnaire and analyzed to serve as a backbone of the robustness of categorical data. Several conjectures about the independence of the socio-economic variables and e-commerce were tested. Some of the factors influencing patronage of e-commerce were identified. It is clear from the literature that as one's academic qualification improves, there is an associated improvement in their preference for e-commerce, but the results revealed otherwise. Size of family was found to influence e-commerce. Both income and social status positively affected patronage in e-commerce. Gender also appeared to affect patronage in e-commerce. 62.3% of staff had patronized e-commerce. This shows that e-commerce patronage was gradually increasing. It is therefore our considered view that policy documents regulating and monitoring the use of e-commerce be developed to increase e-commerce participation across the globe. It is also recommended that the bottlenecks which obstruct patronage in e-commerce be addressed so that a lot more staff will develop a positive attitude towards e-commerce.
Keywords: categorical data; chi-square; e-commerce; ordinal data; nominal data
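The independence conjectures mentioned in this abstract are the kind usually checked with Pearson's chi-square test on a contingency table. The sketch below computes the test statistic and degrees of freedom in plain Python. The counts are invented purely for illustration and are not the survey's data, and the function name is hypothetical; it also assumes every row and column total is positive.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts.

    Assumes every row and column total is positive, so no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2 x 2 table: rows = two respondent groups,
# columns = has / has not used e-commerce.
example_table = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(example_table)

# 3.841 is the 5% critical value of the chi-square distribution with 1 degree of freedom.
reject_independence = dof == 1 and statistic > 3.841
```

For real analyses a library routine that also returns a p-value would normally replace the hard-coded critical value used here.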
Image-guided color mapping for categorical data visualization (cited by: 1)
6
Authors: Qian Zheng, Min Lu, Sicong Wu, Ruizhen Hu, Joel Lanir, Hui Huang. Computational Visual Media (SCIE, EI, CSCD), 2022, Issue 4, pp. 613-629, 17 pages.
Appropriate color mapping for categorical data visualization can significantly facilitate the discovery of underlying data patterns and effectively bring out visual aesthetics. Some systems suggest pre-defined palettes for this task. However, a predefined color mapping is not always optimal, failing to consider users' needs for customization. Given an input categorical data visualization and a reference image, we present an effective method to automatically generate a coloring that resembles the reference while allowing classes to be easily distinguished. We extract a color palette with high perceptual distance between the colors by sampling dominant and discriminable colors from the image's color space. These colors are assigned to given classes by solving an integer quadratic program to optimize point distinctness of the given chart while preserving the color spatial relations in the source image. We show results on various coloring tasks, with a diverse set of new coloring appearances for the input data. We also compare our approach to state-of-the-art palettes in a controlled user study, which shows that our method achieves comparable performance in class discrimination, while being more similar to the source image. User feedback after using our system verifies its efficiency in automatically generating desirable colorings that meet the user's expectations when choosing a reference.
Keywords: color palette; discriminability; image-guided; categorical data visualization
A Clustering Algorithm for Mixed Numeric and Categorical Data
7
Authors: Ohn Mar San, Van-Nam Huynh, Yoshiteru Nakamori. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2003, Issue 4, pp. 562-571, 10 pages.
Most of the earlier work on clustering mainly focused on numeric data whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve many datasets that also consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm which is based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is to provide a method to update the 'cluster centers' of clustering objects described by mixed numeric and categorical attributes in the clustering process to minimize the clustering cost function. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely the credit approval and abalone databases.
Keywords: cluster analysis; numeric data; categorical data; k-means algorithm
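To illustrate what a dissimilarity "derived from both numeric and categorical attributes" can look like, here is a minimal sketch that adds squared Euclidean distance on the numeric part to a weighted count of mismatches on the categorical part. This is the simple-matching form familiar from k-prototypes-style methods and is an assumption, not the measure defined in the paper; the `gamma` weight and the function name are hypothetical.

```python
def mixed_distance(numeric_a, categorical_a, numeric_b, categorical_b, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus gamma times the
    number of mismatching categorical attributes (simple matching)."""
    if len(numeric_a) != len(numeric_b) or len(categorical_a) != len(categorical_b):
        raise ValueError("both objects must have the same attributes")
    numeric_part = sum((x - y) ** 2 for x, y in zip(numeric_a, numeric_b))
    categorical_part = sum(1 for x, y in zip(categorical_a, categorical_b) if x != y)
    return numeric_part + gamma * categorical_part


# Hypothetical objects: two numeric and two categorical attributes each.
d_mixed = mixed_distance((1.0, 2.0), ("a", "b"), (1.0, 4.0), ("a", "c"), gamma=0.5)

# With no categorical attributes the measure reduces to squared Euclidean distance,
# matching the abstract's remark that the method coincides with k-means on numeric data.
d_numeric_only = mixed_distance((1.0, 2.0), (), (1.0, 4.0), (), gamma=0.5)
```

The choice of `gamma` controls how much one categorical mismatch counts relative to numeric spread, so in practice the numeric attributes would usually be scaled first.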
A new clustering algorithm for large datasets (cited by: 1)
8
Authors: 李清峰 (Li Qingfeng), 彭文峰 (Peng Wenfeng). Journal of Central South University (SCIE, EI, CAS), 2011, Issue 3, pp. 823-829, 7 pages.
The Circle algorithm was proposed for large datasets. The idea of the algorithm is to find a set of vertices that are close to each other and far from other vertices. This algorithm makes use of the connection between clustering aggregation and the problem of correlation clustering. The best deterministic approximation algorithm was provided for a variation of the correlation clustering problem, and sampling was shown to scale the algorithms to large datasets. An extensive empirical evaluation was given for the usefulness of the problem and the solutions. The results show that this method achieves more than 50% reduction in the running time without sacrificing the quality of the clustering.
Keywords: data mining; Circle algorithm; clustering categorical data; clustering aggregation
Risk Factors Categorizations of Ischemic Heart Disease in South-Western Bangladesh
9
Authors: M. Raihan, Sami Azam, Laboni Akter, Md. Mehedi Hassan, Ryana Quadir, Asif Karim, Saikat Mondal, Arun More. Data Intelligence (EI), 2024, Issue 3, pp. 834-868, 35 pages.
Ischemic heart disease (IHD) is one of the leading causes of death worldwide. However, different geographic regions show different variations of the risk factors of this disease based on the different lifestyles of people. This study examines the current IHD condition in southern Bangladesh, a Southeast Asian middle-income country. The main approach to this research is an AI-based proposal of a reduced set of the greatest-impact clinical traits that may cause IHD. This approach attempts to reduce IHD morbidity and mortality by early detection of risk factors using the reduced set of clinical data. Demographic, diagnostic, and symptomatic features were considered for analysing this clinical data. Data pre-processing utilizes several machine learning techniques to select significant features and make meaningful interpretations. A proposed voting mechanism ranked the selected 138 features by their impact factor. In this regard, diverse patterns in correlations with variables, including age, sex, career, family history, obesity, etc., were calculated and explained in terms of voting scores. Among the 138 risk factors, three labels were categorized: high-risk, medium-risk, and low-risk features; 19 features were regarded as high, 25 were medium, and 94 were considered low-impact features. This research's technological methodology and practical goals provide an innovative and resilient framework for addressing IHD, especially in less developed cities and townships of Bangladesh, where the general population's socioeconomic conditions are often unexpected. The data collection, pre-processing, and use of this study's complete and comprehensive IHD patient dataset is another innovative addition. We believe that other relevant research initiatives will benefit from this work.
Keywords: ischemic heart disease; machine learning; CVD; data categorization; medical data