266 articles found
A novel method for clustering cellular data to improve classification
1
Authors: Diek W. Wheeler, Giorgio A. Ascoli. Neural Regeneration Research (SCIE, CAS), 2025, No. 9, pp. 2697-2705 (9 pages)
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
Keywords: cellular data clustering; dendrogram; data classification; Levene's one-tailed statistical test; unsupervised hierarchical clustering
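The stopping rule in this abstract rests on a statistical test of whether cells differ more between than within clusters, and the keywords name Levene's one-tailed test. A minimal sketch of the textbook Levene W statistic (mean-centred variant) in plain Python follows. It is an illustration under assumptions, not the authors' protocol: the function name `levene_w` and the sample values are hypothetical, and both the comparison of W against an F critical value (with k-1 and N-k degrees of freedom) and the way the protocol applies the test along a dendrogram are left out.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic (mean-centred variant) for k groups of numbers.

    Raises ZeroDivisionError if every group has identical absolute deviations.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each value from the mean of its own group.
    z = []
    for g in groups:
        g_mean = mean(g)
        z.append([abs(x - g_mean) for x in g])
    z_means = [mean(zi) for zi in z]
    z_grand = mean([v for zi in z for v in zi])
    # One-way ANOVA decomposition computed on the absolute deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical example: pairwise distances within a cluster vs. between clusters.
within_distances = [1.0, 2.0, 3.0]
between_distances = [2.0, 4.0, 6.0]
print(levene_w(within_distances, between_distances))
```

A larger W means the spread of the two groups differs more strongly; a one-tailed decision would additionally check the direction of the difference.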
Two-level Hierarchical Clustering Analysis and Application
2
Authors: HU Hui-rong, WANG Zhou-jing (Department of Automation, Xiamen University, Xiamen 361005, China). Journal of Xiamen University (Natural Science) (CAS, CSCD, PKU Core), 2002, No. S1, pp. 283-284 (2 pages)
Hierarchical clustering analysis based on statistics is one of the most important data mining algorithms, but the traditional hierarchical clustering method is based on global comparison: in practice it performs only Q clustering and ignores R clustering, so it is limited, especially when the number of samples and indexes is very large. Furthermore, because it ignores the association between different indexes, the clustering result is neither good nor faithful. In this paper, we present the model and algorithm of two-level hierarchical clustering, which integrates Q clustering with R clustering. Because two-level hierarchical clustering is based on the clustering result of each class, the classification of the indexes directly affects the accuracy of the final clustering result; how to classify the indexes appropriately is the chief and most difficult problem to handle in advance. Although some of the literature has addressed the classification of indexes, those articles classify the indexes only according to their superficial meaning, which is unscientific, for the following reasons. First, the superficial meaning of some indexes can be read in different ways and is easily misunderstood by different people; moreover, this classification method seldom makes use of historical data, so the classification result is not objective. Second, the superficial meaning of some indexes conveys nothing, so they cannot be assigned to classes from it alone. Third, this classification method requires users to have a high level of domain knowledge, which is not always available; otherwise it is difficult for them to understand the meaning of some indexes. In this paper we therefore first use the R clustering method to cluster the indexes, dividing the p-dimensional indexes into q classes, and then adopt the two-level clustering method to get the final result. The classification result is more objective and accurate. Moreover, after the first step we obtain the relations among the different indexes and their interactions, and we also learn which samples cluster together under a given class of indexes. (These intermediate results are sometimes very useful.) The experiments also indicate the effectiveness and accuracy of the algorithm, and the result of R clustering can easily be reused in later practice.
Keywords: data mining; clustering; hierarchical clustering; R clustering; Q clustering
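The first step described in this abstract, R clustering of the p indexes into q classes, can be sketched as grouping variables by how strongly they correlate. The abstract does not say which similarity measure or linkage the authors use, so the sketch below assumes the absolute Pearson correlation and single linkage at a fixed threshold; the names `pearson`, `r_cluster`, and the sample columns are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def r_cluster(columns, threshold=0.8):
    """Group indexes (variables) whose |r| reaches the threshold.

    columns maps an index name to its list of sample values.
    Groups are connected components, i.e. single linkage cut at the threshold.
    """
    names = list(columns)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: three indexes measured on four samples.
data = {
    "height": [1, 2, 3, 4],
    "weight": [2, 4, 6, 8.5],
    "age": [5, 1, 4, 2],
}
print(r_cluster(data))
```

The second level (Q clustering) would then cluster the samples separately within each returned group of indexes.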
ADC-DL:Communication-Efficient Distributed Learning with Hierarchical Clustering and Adaptive Dataset Condensation
3
Authors: Zhipeng Gao, Yan Yang, +1 author, Chen Zhao, Zijia Mo. China Communications (SCIE, CSCD), 2022, No. 12, pp. 73-85 (13 pages)
The rapid growth of modern mobile devices produces a large amount of distributed data, which is extremely valuable for learning models. Unfortunately, training a model by collecting all the original data on a centralized cloud server is not feasible because of data privacy and communication cost concerns, which hinders artificial intelligence from empowering mobile devices. Moreover, because of their different contexts, these data are not independent and identically distributed (Non-IID), which degrades model performance. To address these issues, we propose a novel distributed learning algorithm based on hierarchical clustering and adaptive dataset condensation, named ADC-DL, which learns a shared model by collecting the synthetic samples generated on each device. To tackle the heterogeneity of the data distribution, we propose an entropy-TOPSIS comprehensive tiering model for hierarchical clustering, which distinguishes clients in terms of their data characteristics. Subsequently, synthetic dummy samples are generated based on the hierarchical structure using adaptive dataset condensation. The dataset condensation procedure can be adjusted adaptively according to the tier of the client. Extensive experiments demonstrate that ADC-DL outperforms existing algorithms in prediction accuracy and communication cost.
Keywords: distributed learning; Non-IID data partition; hierarchical clustering; adaptive dataset condensation
Privacy Preserving Two-Party Hierarchical Clustering Over Vertically Partitioned Dataset
4
Authors: Animesh Tripathy, Ipsa De. Journal of Software Engineering and Applications, 2013, No. 5, pp. 26-31 (6 pages)
Data mining has been a popular research area for more than a decade. Among the several problems associated with data mining, clustering is one of the most interesting. The problem becomes more challenging when the dataset is distributed between different parties that do not want to share their data. In this paper we propose a privacy-preserving two-party hierarchical clustering algorithm for vertically partitioned datasets. Each site learns only the final cluster centers and nothing about individual data.
Keywords: data mining; privacy; hierarchical clustering
AVLINK: Robust Clustering Algorithm based on Average Link Applied to Protein Sequence Analysis (Cited by: 1)
5
Author: Mohamed A. Mahfouz. Journal of Mathematics and System Science, 2016, No. 5, pp. 205-214 (10 pages)
Robust clustering methods aim to avoid the unsatisfactory results caused by outlying observations in the input data of many practical applications, such as biological sequence analysis or gene expression analysis. This paper presents a fuzzy clustering algorithm, termed AVLINK, based on average link and the possibilistic clustering paradigm. It minimizes the average dissimilarity between pairs of patterns within the same cluster while maximizing the size of the cluster, by computing the zeros of the derivative of the proposed objective function. AVLINK, together with the proposed initialization procedure, shows a high outlier-rejection capability because it makes outliers' memberships very low; furthermore, it does not require the number of clusters to be known in advance, and it can discover clusters of non-convex shape. The effectiveness and robustness of the proposed algorithm have been demonstrated on different types of protein datasets.
Keywords: data mining; fuzzy clustering; relational clustering; hierarchical clustering; bioinformatics
Performances of Clustering Methods Considering Data Transformation and Sample Size: An Evaluation with Fisheries Survey Data
6
Authors: WO Jia, ZHANG Chongliang, +2 authors, XU Binduo, XUE Ying, REN Yiping. Journal of Ocean University of China (SCIE, CAS, CSCD), 2020, No. 3, pp. 659-668 (10 pages)
Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. When these techniques are applied to fish abundance data, many technical details need to be considered to ensure reasonable interpretation. However, the reliability and stability of clustering methods have rarely been studied in the context of fisheries. This study presents an intensive evaluation of three common clustering methods, hierarchical clustering (HC), K-means (KM), and expectation-maximization (EM), based on fish community surveys in the coastal waters of Shandong, China. We evaluated the performance of the three methods under different numbers of clusters, data sizes, and data transformation approaches, focusing on consistency validation using the average proportion of non-overlap (APN) index. The results indicate that the three methods tend to disagree on the optimal number of clusters. EM performed relatively better at avoiding unbalanced classification, whereas HC and KM provided more stable clustering results. Data transformations, including scaling, square-root, and log transformation, had substantial influence on the clustering results, especially for KM. Transformation also influenced clustering stability: scaling tended to provide a stable solution at the same number of clusters. The APN values indicated improved stability with increasing data size; the effect leveled off above 70 samples in general and most quickly for EM. We conclude that the best clustering method can be chosen depending on the aim of the study and the number of clusters. In general, KM was relatively robust in our tests. We also provide recommendations for future applications of clustering analysis. This study helps ensure the credibility of the application and interpretation of clustering methods.
Keywords: hierarchical cluster; K-means cluster; expectation-maximization cluster; optimal number of clusters; stability; data transformation
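The abstract compares clustering results under scaling, square-root, and log transformation. As a rough illustration, the three transformations fit in a few lines of Python. The exact definitions used in the study are not given in this listing, so the z-score form of scaling and the log(x + 1) form of the log transform (a common choice for count data that contain zeros) are assumptions, and the function names and sample values are hypothetical.

```python
from math import log1p, sqrt
from statistics import mean, stdev


def scale(values):
    """Z-score scaling: subtract the mean, divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def sqrt_transform(values):
    """Square-root transform; dampens large counts moderately."""
    return [sqrt(v) for v in values]


def log_transform(values):
    """log(x + 1) transform; dampens large counts strongly and keeps zeros at zero."""
    return [log1p(v) for v in values]


# Hypothetical abundance counts for one species across four survey stations.
abundance = [0, 4, 9, 100]
print(scale(abundance))
print(sqrt_transform(abundance))
print(log_transform(abundance))
```

Each transform would be applied to the abundance matrix before the distance computation, so the same clustering method can yield different partitions depending on which one is chosen.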
A Topological Clustering of Variables
7
Author: Rafik Abdesselam. Journal of Mathematics and System Science, 2021, No. 2, pp. 1-17 (17 pages)
The clustering of objects (individuals or variables) is one of the most used approaches to exploring multivariate data. The two most common unsupervised clustering strategies are hierarchical ascending clustering (HAC) and k-means partitioning, both used to identify groups of similar objects in a dataset and divide it into homogeneous groups. The proposed topological clustering of variables, called TCV, studies a homogeneous set of variables defined on the same set of individuals, based on the notion of neighborhood graphs; some of these variables are more or less correlated or linked, depending on whether the variables are quantitative or qualitative. This topological data analysis approach can be useful for dimension reduction and variable selection. It is a topological hierarchical clustering analysis of a set of variables that can be quantitative, qualitative, or a mixture of both. It arranges variables into homogeneous groups according to their correlations or associations, studied in a topological context of principal component analysis (PCA) or multiple correspondence analysis (MCA). The proposed TCV is adapted to the type of data considered; its principle is presented and illustrated using simple real datasets with quantitative, qualitative, and mixed variables. The results of these illustrative examples are compared with those of other variable clustering approaches.
Keywords: hierarchical clustering; proximity measure; neighborhood graph; adjacency matrix; multivariate quantitative, qualitative and mixed data analysis; dimension reduction
Hierarchical Clustering of Complex Symbolic Data and Application for Emitter Identification (Cited by: 1)
8
Authors: Xin Xu, Jiaheng Lu, Wei Wang. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2018, No. 4, pp. 807-822 (16 pages)
It is well known that the values of symbolic variables may take various forms, such as an interval, a set of stochastic measurements of some underlying pattern, or qualitative multi-values. However, most existing work in symbolic data analysis still focuses on interval values. Although some pioneering work has explored stochastic-pattern-based symbolic data and mixtures of symbolic variables, it still lacks the flexibility and computational efficiency to make full use of the distinctive individual symbolic variables. We therefore propose a novel hierarchical clustering method, with a weighted general Jaccard distance and an effective global pruning strategy, for complex symbolic data, and apply it to emitter identification. Extensive experiments indicate that our method outperforms its peers in both computational efficiency and emitter identification accuracy.
Keywords: symbolic data analysis; stochastic pattern; fuzzy interval; hierarchical clustering; emitter identification
A Novel Clustering Technique for Efficient Clustering of Big Data in Hadoop Ecosystem (Cited by: 5)
9
Authors: Sunil Kumar, Maninder Singh. Big Data Mining and Analytics, 2019, No. 4, pp. 240-247 (8 pages)
Big data analytics and data mining are techniques used to analyze data and extract hidden information. Traditional approaches to analysis and extraction do not work well for big data because the data are complex and of very high volume. Data clustering, a major data mining technique, groups the data into clusters and makes it easy to extract information from them. However, existing clustering algorithms, such as k-means and hierarchical clustering, are not efficient, as the quality of the clusters they produce is compromised. There is therefore a need for an efficient and highly scalable clustering algorithm. In this paper, we put forward a new clustering algorithm, called hybrid clustering, to overcome the disadvantages of existing clustering algorithms. We compare the new hybrid algorithm with existing algorithms on the basis of precision, recall, F-measure, execution time, and accuracy of results. The experimental results show that the proposed hybrid clustering algorithm is more accurate and has better precision, recall, and F-measure values.
Keywords: clustering; Hadoop; big data; k-means; hierarchical
A Multilevel Secure Relation-Hierarchical Data Model for a Secure DBMS
10
Authors: ZHU Hong, FENG Yucai. Journal of Modern Transportation, 2001, No. 1, pp. 8-16 (9 pages)
In this paper, a multilevel secure relation-hierarchical data model for multilevel secure databases is extended from the relation-hierarchical data model of the single-level environment. Based on the model, an upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. A system based on the multilevel secure relation-hierarchical data model can store and manipulate, in an integrated way, both complicated objects (e.g., multilevel spatial data) and conventional data (e.g., integers, real numbers, and character strings) in a multilevel secure database.
Keywords: databases; data structure; data models; secure DBMS; covert channels; mandatory access control; polyinstantiation; hierarchical classification; non-hierarchical category; security level; integrity; cluster; index
A Direct Data-Cluster Analysis Method Based on Neutrosophic Set Implication (Cited by: 1)
11
Authors: Sudan Jha, Gyanendra Prasad Joshi, +2 authors, Lewis Nkenyereya, Dae Wan Kim, Florentin Smarandache. Computers, Materials & Continua (SCIE, EI), 2020, No. 11, pp. 1203-1220 (18 pages)
Clustering techniques classify raw data in a reasonable manner to create disjoint clusters. Many clustering algorithms based on specific parameters have been proposed to access a high volume of datasets. This paper focuses on cluster analysis based on neutrosophic set implication, i.e., a k-means algorithm with a threshold-based clustering technique. This algorithm addresses the shortcomings of the k-means clustering algorithm by overcoming the limitations of the threshold-based clustering algorithm. To evaluate the validity of the proposed method, several validity measures and validity indices are applied to the Iris dataset (from the University of California, Irvine, Machine Learning Repository) along with k-means and threshold-based clustering algorithms. The proposed method results in more segregated datasets with compacted clusters, thus achieving higher validity indices. The method also eliminates the limitations of the threshold-based clustering algorithm and validates measures and respective indices along with k-means and threshold-based clustering algorithms.
Keywords: data clustering; data mining; neutrosophic set; k-means; validity measures; cluster-based classification; hierarchical clustering
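An Empirical Study of Cultural Consumption among Urban Residents in Jiangsu Based on a Panel-Data Model (Cited by: 5)
12
Authors: LIU Jie, CHEN Haibo, XIAO Mingzhen. Jiangsu Commercial Forum (《江苏商论》), 2012, No. 4, pp. 36-39 (4 pages)
This paper uses panel data and cluster analysis to empirically analyze cultural consumption among urban residents in 13 regions of Jiangsu Province. It finds that prior-period cultural consumption has a larger positive effect on urban residents' cultural consumption than current disposable income does, and that the propensity for cultural consumption and the strength of the prior-period effect differ across regions. To raise the level of cultural consumption in each region of Jiangsu, the government should therefore adopt differentiated policies tailored to local conditions.
Keywords: cultural consumption; panel data; cluster analysis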
COVID19 Outbreak:A Hierarchical Framework for User Sentiment Analysis
13
Authors: Ahmed F. Ibrahim, M. Hassaballah, +2 authors, Abdelmgeid A. Ali, Yunyoung Nam, Ibrahim A. Ibrahim. Computers, Materials & Continua (SCIE, EI), 2022, No. 2, pp. 2507-2524 (18 pages)
Social networking sites in the most modernized world are flooded with large data volumes. Extracting the sentiment polarity of important aspects is necessary, as it helps to determine people's opinions through what they write. The coronavirus pandemic has invaded the world and has been mentioned on social media on a large scale. In a very short period of time, tweets indicated an unpredicted increase of coronavirus. They reflect people's opinions and thoughts with regard to coronavirus and its impact on society. The research community has been interested in discovering the hidden relationships in short texts, such as those on Twitter and Weibo, owing to their shortness and sparsity. In this paper, a hierarchical Twitter sentiment model (HTSM) is proposed to show people's opinions in short texts. The proposed HTSM has two main features: constructing a hierarchical tree of important aspects from short texts without a predefined hierarchy depth and width, and analyzing the extracted opinions to discover the sentiment polarity on those important aspects by applying a valence aware dictionary for sentiment reasoner (VADER) sentiment analysis. The tweets for each extracted important aspect can be categorized as strongly positive, positive, neutral, strongly negative, or negative. The quality of the proposed model is validated by applying it to a popular product and a widespread topic. The results show that the proposed model outperforms state-of-the-art methods in analyzing people's opinions in short text.
Keywords: COVID19; COVID data; sentiment analysis; hierarchical clustering; sentiment tree
Impact of COVID-19 on G20 countries:analysis of economic recession using data mining approaches
14
Authors: Osman Taylan, Abdulaziz S. Alkabaa, Mustafa Tahsin Yılmaz. Financial Innovation, 2022, No. 1, pp. 2208-2237 (30 pages)
The G20 countries are the locomotives of economic growth, representing 64% of the global population and including 4.7 billion inhabitants. As a monetary and market value index, real gross domestic product (GDP) is affected by several factors and reflects the economic development of countries. This study aimed to reveal the hidden economic patterns of G20 countries, study the complexity of related economic factors, and analyze the economic reactions taken by policymakers during the coronavirus disease of 2019 (COVID-19) pandemic recession (2019-2020). In this respect, this study employed the data-mining techniques of nonparametric classification trees and hierarchical clustering to consider factors such as GDP per capita, industrial production, government spending, COVID-19 cases per population, patient recovery, COVID-19 death cases, number of hospital beds per 1000 people, and percentage of the vaccinated population to identify clusters of G20 countries. The clustering approach can help policymakers measure economic indices in terms of the factors considered to identify the specific focus of influences on economic development. The results exhibited significant findings for the economic effects of the COVID-19 pandemic on G20 countries, splitting them into three clusters that share different measurements and patterns (harmonies and variances across G20 countries). A comprehensive statistical analysis was performed to analyze endogenous and exogenous factors. Similarly, the classification and regression tree method was applied to predict the associations between the response and independent factors, to split the G20 countries into different groups, and to analyze the economic recession. Variables such as GDP per capita and patient recovery of COVID-19 cases, with values of $12,012 and 82.8%, respectively, were the most significant factors for clustering the G20 countries, with a correlation coefficient (R²) of 91.8%. The results and findings offer some crucial recommendations for handling pandemics in terms of the suggested economic systems by identifying the challenges that the G20 countries have experienced.
Keywords: hierarchical clustering; CART; economic recession; data mining; COVID-19; G20 countries
Hybrid Data Mining Models for Predicting Customer Churn (Cited by: 1)
15
Authors: Amjad Hudaib, Reham Dannoun, +2 authors, Osama Harfoushi, Ruba Obiedat, Hossam Faris. International Journal of Communications, Network and System Sciences, 2015, No. 5, pp. 91-96 (6 pages)
The term "customer churn" is used in the information and communication technology (ICT) industry to indicate customers who are about to leave for a competitor or end their subscription. Predicting this behavior is very important in real-life markets and competition, and managing it is essential. In this paper, three hybrid models are investigated to develop an accurate and efficient churn prediction model. The three models are based on two phases: a clustering phase and a prediction phase. In the first phase, customer data is filtered. The second phase predicts customer behavior. The first model uses the k-means algorithm for data filtering and a Multilayer Perceptron Artificial Neural Network (MLP-ANN) for prediction. The second model uses hierarchical clustering with MLP-ANN. The third uses self-organizing maps (SOM) with MLP-ANN. The three models are developed on real data, and the accuracy and churn rate values are then calculated and compared. The comparison shows that the three hybrid models outperform single common models.
Keywords: data mining; k-means; hierarchical cluster; self-organizing maps; multilayer perceptron; artificial neural networks; churn prediction
Study on Mandatory Access Control in a Secure Database Management System
16
Authors: ZHU Hong, FENG Yu-cai (School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China). Journal of Shanghai University (English Edition) (CAS), 2001, No. 4, pp. 299-307 (9 pages)
This paper proposes a security policy model for mandatory access control in a class B1 database management system whose labeling level is the tuple. The relation-hierarchical data model is extended to a multilevel relation-hierarchical data model. Based on the multilevel relation-hierarchical data model, the concept of upper-lower layer relational integrity is presented after we analyze and eliminate the covert channels caused by database integrity. Two SQL statements are extended to process polyinstantiation in the multilevel secure environment. The system is based on the multilevel relation-hierarchical data model and can store and manipulate, in an integrated way, multilevel complicated objects (e.g., multilevel spatial data) and multilevel conventional data (e.g., integers, real numbers, and character strings).
Keywords: multilevel relation-hierarchical data model; covert channels; mandatory access control; polyinstantiation; hierarchical classification; non-hierarchical category; security level; multilevel relation-hierarchical instance; integrity; cluster
REMUDA: A Practical Topology Control and Data Forwarding Mechanism for Wireless Sensor Networks
17
Authors: SUN Li-Min, YAN Ting-Xin, BI Yan-Zhong. Acta Automatica Sinica (《自动化学报》) (EI, CSCD, PKU Core), 2006, No. 6, pp. 867-874 (8 pages)
In wireless sensor networks, topology control plays an important role in data forwarding efficiency for data gathering applications. In this paper, we present a novel topology control and data forwarding mechanism called REMUDA, which is designed for a practical indoor parking lot management system. REMUDA forms a tree-based hierarchical network topology that makes as many nodes as possible leaf nodes and constructs a virtual cluster structure. It also takes reliability, stability, and path length into account during tree construction. Through an experiment on a network of 30 real sensor nodes, we evaluate the performance of REMUDA and compare it with LEPS, which is also a practical routing protocol in TinyOS. The experimental results show that REMUDA achieves better performance than LEPS.
Keywords: data forwarding mechanism; tree-based hierarchical topology; virtual cluster
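A Federated Learning Optimization Method Based on Hierarchical Clustering of Users
18
Authors: TAN Yuling, OU Guocheng, +1 author, CAO Canming, CHAI Zhengyi. Journal of Nanjing University of Science and Technology (《南京理工大学学报》) (CAS, CSCD, PKU Core), 2024, No. 4, pp. 469-478, 488 (11 pages)
Federated learning trains a global model through distributed machine learning; the model generalizes across all local user data while protecting user data privacy. Differences in user behavior and environment cause data heterogeneity, so users' local models often perform far better than the global model. To address this problem, the paper proposes a federated learning method based on hierarchical clustering of users. A convergence evaluation method is designed to judge how far the global model has converged. Users are clustered once the global model has converged, which identifies highly similar users more accurately. A hierarchical clustering method based on cosine similarity groups similar users, thereby reducing the effect of data heterogeneity. The paper also uses the deeper WideResNet model to improve local training accuracy. Using the EMNIST and CIFAR10 datasets and adjusting the angle between users' data, clustered federated learning experiments were run with two and three classes of users. The results show that, compared with the classical federated learning algorithm FedAvg, the clustering strategy improves training accuracy by about 10%.
Keywords: federated learning; data heterogeneity; hierarchical clustering; cosine similarity; WideResNet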
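The clustering step named in this abstract, hierarchical clustering of users by cosine similarity, can be sketched in plain Python as bottom-up merging on cosine distance. This is an illustration under assumptions rather than the paper's method: representing each user by a flattened model-update vector, using average linkage, and stopping at a fixed number of clusters are all choices made here, and the names `cosine_distance`, `cluster_clients`, and the sample vectors are hypothetical.

```python
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cluster_clients(updates, n_clusters):
    """Agglomerative clustering of client vectors with average linkage.

    updates is a list of vectors, one per client.
    Returns a list of clusters, each a sorted list of client indices.
    """
    clusters = [[i] for i in range(len(updates))]

    def linkage(c1, c2):
        # Average pairwise cosine distance between the two clusters.
        total = sum(cosine_distance(updates[i], updates[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        # Merge the closest pair; b > a, so deleting b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical update vectors from four clients with two underlying data distributions.
client_updates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(cluster_clients(client_updates, n_clusters=2))
```

Each resulting cluster could then train its own aggregated model, so that clients with dissimilar data no longer pull a single global model in conflicting directions.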
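A Data Mining Study of Prescription Patterns and Core Chinese Herbs for Maibi (Vessel Impediment) Based on a Patent Database (Cited by: 1)
19
Authors: YANG Zhenyu, LI Jixin, +3 authors, LI Bingquan, GU Chenhao, LI Chen, GUO Dandan. China Medical Herald (《中国医药导报》) (CAS), 2024, No. 6, pp. 21-25, 33 (6 pages)
Objective: To explore the prescription and medication patterns of traditional Chinese medicine (TCM) interventions for Maibi, based on national patented compound formulas. Methods: Patented compound formulas for TCM intervention in Maibi were retrieved by computer from the national patent database, from its inception to March 2023. Excel was used for data cleaning and for statistics on frequency, properties and flavors, and meridian tropism; SPSS Modeler 18.0 and R 4.3.1 were used for association analysis and hierarchical clustering of the herbs; Cytoscape 3.9.1 was used to visualize the co-occurrence network. Results: The study included 124 patented formulas involving 324 herbs. High-frequency herbs included Danshen (丹参) and Huangqi (黄芪), among others. Properties were mainly cold and warm; flavors were mostly bitter, sweet, and pungent; the herbs mainly entered the liver meridian. Common herb pairs included "Huangqi-Danshen"; common three-herb combinations included "Shanzha-Danshen-Chuanxiong" (山楂-丹参-川芎); hierarchical clustering yielded four cluster groups. Conclusion: Patented formulas treat Maibi mainly by activating blood and removing stasis, resolving turbidity and lowering lipids, and tonifying qi and blood. This study sorts out the core herbs for treating Maibi, with the aim of informing clinical prescribing and future drug development.
Keywords: Maibi; atherosclerosis; compound formula; patent; data mining; hierarchical clustering; betweenness centrality; Phi correlation coefficient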
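Medication Patterns of Traditional Chinese Medicine in the Treatment of Lung Cancer (Cited by: 1)
20
Authors: XIN Jing, JIANG Shiqing, +3 authors, ZHANG Yunhui, ZHOU Yueling, SUN Xuhang, WANG Liufang. World Chinese Medicine (《世界中医药》) (CAS, PKU Core), 2024, No. 9, pp. 1316-1323 (8 pages)
Objective: To explore the medication patterns of traditional Chinese medicine (TCM) in the treatment of lung cancer. Methods: TCM prescriptions for lung cancer were retrieved from 《肿瘤良方大全》 (Complete Collection of Effective Tumor Prescriptions), 《肿瘤方剂大辞典》 (Dictionary of Tumor Formulas), 《卫生部药品标准中药成方制剂》 (Ministry of Health Drug Standards: Chinese Patent Medicine Preparations), and 《国家药品监督管理局总局国家药品标准(修订)颁布件》 (National Drug Standards (Revised) issued by the national drug regulator). The data were entered into Excel for processing, and Lantern 5.0, SPSS Modeler 18.0, and SPSS Statistics 25.0 were used for frequency statistics, hierarchical cluster analysis, and association rule analysis of the lung cancer formulas. Results: A total of 650 prescriptions were included, covering 572 herbs used 6,120 times. Properties were mainly cold, warm, and neutral; flavors were mainly bitter, sweet, and pungent; the most frequent meridians were stomach, spleen, kidney, liver, and lung. The most frequently used herbs were Huangqi (黄芪), Baihuasheshecao (白花蛇舌草), Gancao (甘草), Fuling (茯苓), Shashen (沙参), and Maidong (麦冬), among others. High-frequency herbs were mainly heat-clearing and detoxifying herbs, qi-tonifying and yin-nourishing herbs, and blood-activating and stasis-resolving herbs. Association rule analysis yielded 16 herb-pair rules and 24 three-herb rules; factor analysis yielded 13 common factors; hierarchical cluster analysis yielded 7 closely associated herb groups. Latent structure analysis yielded 14 latent variables, each with 2 latent classes, for 28 latent classes in total. Conclusion: Qi-tonifying and yin-nourishing herbs, heat-clearing anticancer detoxifying herbs, blood-activating and stasis-resolving herbs, spleen-strengthening and dampness-draining herbs, and the heat-clearing and phlegm-resolving principle provide a reference for TCM treatment of lung cancer; further clinical and experimental validation is needed.
Keywords: traditional Chinese medicine; lung cancer; formula; herb group; medication patterns; data mining; hierarchical cluster analysis; association rule analysis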