期刊文献+
共找到104,822篇文章
< 1 2 250 >
每页显示 20 50 100
CSFW-SC: Cuckoo Search Fuzzy-Weighting Algorithm for Subspace Clustering Applying to High-Dimensional Clustering 被引量:1
1
作者 WANG Jindong HE Jiajing +1 位作者 ZHANG Hengwei YU Zhiyong 《China Communications》 SCIE CSCD 2015年第S2期55-63,共9页
Aimed at the issue that traditional clustering methods are not appropriate to high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the exited soft subsp... Aimed at the issue that traditional clustering methods are not appropriate to high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the exited soft subspace clustering algorithm. In the proposed algorithm, a novel objective function is firstly designed by considering the fuzzy weighting within-cluster compactness and the between-cluster separation, and loosening the constraints of dimension weight matrix. Then gradual membership and improved Cuckoo search, a global search strategy, are introduced to optimize the objective function and search subspace clusters, giving novel learning rules for clustering. At last, the performance of the proposed algorithm on the clustering analysis of various low and high dimensional datasets is experimentally compared with that of several competitive subspace clustering algorithms. Experimental studies demonstrate that the proposed algorithm can obtain better performance than most of the existing soft subspace clustering algorithms. 展开更多
关键词 high-dimensional data clustering soft SUBSPACE CUCKOO SEARCH FUZZY clustering
下载PDF
Subspace Clustering in High-Dimensional Data Streams:A Systematic Literature Review
2
作者 Nur Laila Ab Ghani Izzatdin Abdul Aziz Said Jadid AbdulKadir 《Computers, Materials & Continua》 SCIE EI 2023年第5期4649-4668,共20页
Clustering high dimensional data is challenging as data dimensionality increases the distance between data points,resulting in sparse regions that degrade clustering performance.Subspace clustering is a common approac... Clustering high dimensional data is challenging as data dimensionality increases the distance between data points,resulting in sparse regions that degrade clustering performance.Subspace clustering is a common approach for processing high-dimensional data by finding relevant features for each cluster in the data space.Subspace clustering methods extend traditional clustering to account for the constraints imposed by data streams.Data streams are not only high-dimensional,but also unbounded and evolving.This necessitates the development of subspace clustering algorithms that can handle high dimensionality and adapt to the unique characteristics of data streams.Although many articles have contributed to the literature review on data stream clustering,there is currently no specific review on subspace clustering algorithms in high-dimensional data streams.Therefore,this article aims to systematically review the existing literature on subspace clustering of data streams in high-dimensional streaming environments.The review follows a systematic methodological approach and includes 18 articles for the final analysis.The analysis focused on two research questions related to the general clustering process and dealing with the unbounded and evolving characteristics of data streams.The main findings relate to six elements:clustering process,cluster search,subspace search,synopsis structure,cluster maintenance,and evaluation measures.Most algorithms use a two-phase clustering approach consisting of an initialization stage,a refinement stage,a cluster maintenance stage,and a final clustering stage.The density-based top-down subspace clustering approach is more widely used than the others because it is able to distinguish true clusters and outliers using projected microclusters.Most algorithms implicitly adapt to the evolving nature of the data stream by using a time fading function that is sensitive to outliers.Future work can focus on the clustering framework,parameter optimization,subspace search techniques,memory-efficient synopsis structures,explicit cluster change detection,and intrinsic performance metrics.This article can serve as a guide for researchers interested in high-dimensional subspace clustering methods for data streams. 展开更多
关键词 clustering subspace clustering projected clustering data stream stream clustering high dimensionality evolving data stream concept drift
下载PDF
K-Hyperparameter Tuning in High-Dimensional Space Clustering:Solving Smooth Elbow Challenges Using an Ensemble Based Technique of a Self-Adapting Autoencoder and Internal Validation Indexes
3
作者 Rufus Gikera Jonathan Mwaura +1 位作者 Elizaphan Muuro Shadrack Mambo 《Journal on Artificial Intelligence》 2023年第1期75-112,共38页
k-means is a popular clustering algorithm because of its simplicity and scalability to handle large datasets.However,one of its setbacks is the challenge of identifying the correct k-hyperparameter value.Tuning this v... k-means is a popular clustering algorithm because of its simplicity and scalability to handle large datasets.However,one of its setbacks is the challenge of identifying the correct k-hyperparameter value.Tuning this value correctly is critical for building effective k-means models.The use of the traditional elbow method to help identify this value has a long-standing literature.However,when using this method with certain datasets,smooth curves may appear,making it challenging to identify the k-value due to its unclear nature.On the other hand,various internal validation indexes,which are proposed as a solution to this issue,may be inconsistent.Although various techniques for solving smooth elbow challenges exist,k-hyperparameter tuning in high-dimensional spaces still remains intractable and an open research issue.In this paper,we have first reviewed the existing techniques for solving smooth elbow challenges.The identified research gaps are then utilized in the development of the new technique.The new technique,referred to as the ensemble-based technique of a self-adapting autoencoder and internal validation indexes,is then validated in high-dimensional space clustering.The optimal k-value,tuned by this technique using a voting scheme,is a trade-off between the number of clusters visualized in the autoencoder’s latent space,k-value from the ensemble internal validation index score and one that generates a value of 0 or close to 0 on the derivative f″′(k)(1+f′(k)^(2))−3 f″(k)^(2)f″((k)2f′(k),at the elbow.Experimental results based on the Cochran’s Q test,ANOVA,and McNemar’s score indicate a relatively good performance of the newly developed technique in k-hyperparameter tuning. 展开更多
关键词 k-hyperparameter tuning high-dimensional smooth elbow
下载PDF
基于Blending-Clustering集成学习的大坝变形预测模型
4
作者 冯子强 李登华 丁勇 《水利水电技术(中英文)》 北大核心 2024年第4期59-70,共12页
【目的】变形是反映大坝结构性态最直观的效应量,构建科学合理的变形预测模型是保障大坝安全健康运行的重要手段。针对传统大坝变形预测模型预测精度低、误报率高等问题导致的错误报警现象,【方法】选取不同预测模型和聚类算法集成,构... 【目的】变形是反映大坝结构性态最直观的效应量,构建科学合理的变形预测模型是保障大坝安全健康运行的重要手段。针对传统大坝变形预测模型预测精度低、误报率高等问题导致的错误报警现象,【方法】选取不同预测模型和聚类算法集成,构建了一种Blending-Clustering集成学习的大坝变形预测模型,该模型以Blending对单一预测模型集成提升预测精度为核心,并通过Clustering聚类优选预测值改善模型稳定性。以新疆某面板堆石坝变形监测数据为实例分析,通过多模型预测性能比较,对所提出模型的预测精度和稳定性进行全面评估。【结果】结果显示:Blending-Clustering模型将预测模型和聚类算法集成,均方根误差(RMSE)和归一化平均百分比误差(nMAPE)明显降低,模型的预测精度得到显著提高;回归相关系数(R~2)得到提升,模型具备更强的拟合能力;在面板堆石坝上22个测点变形数据集上的预测评价指标波动范围更小,模型的泛化性和稳定性得到有效增强。【结论】结果表明:Blending-Clustering集成预测模型对于预测精度、泛化性和稳定性均有明显提升,在实际工程具有一定的应用价值。 展开更多
关键词 大坝 变形 预测模型 Blending集成 clustering集成 模型融合
下载PDF
Deep Learning and Tensor-Based Multiple Clustering Approaches for Cyber-Physical-Social Applications 被引量:1
5
作者 Hongjun Zhang Hao Zhang +3 位作者 Yu Lei Hao Ye Peng Li Desheng Shi 《Computers, Materials & Continua》 SCIE EI 2024年第3期4109-4128,共20页
The study delves into the expanding role of network platforms in our daily lives, encompassing various mediums like blogs, forums, online chats, and prominent social media platforms such as Facebook, Twitter, and Inst... The study delves into the expanding role of network platforms in our daily lives, encompassing various mediums like blogs, forums, online chats, and prominent social media platforms such as Facebook, Twitter, and Instagram. While these platforms offer avenues for self-expression and community support, they concurrently harbor negative impacts, fostering antisocial behaviors like phishing, impersonation, hate speech, cyberbullying, cyberstalking, cyberterrorism, fake news propagation, spamming, and fraud. Notably, individuals also leverage these platforms to connect with authorities and seek aid during disasters. The overarching objective of this research is to address the dual nature of network platforms by proposing innovative methodologies aimed at enhancing their positive aspects and mitigating their negative repercussions. To achieve this, the study introduces a weight learning method grounded in multi-linear attribute ranking. This approach serves to evaluate the significance of attribute combinations across all feature spaces. Additionally, a novel clustering method based on tensors is proposed to elevate the quality of clustering while effectively distinguishing selected features. The methodology incorporates a weighted average similarity matrix and optionally integrates weighted Euclidean distance, contributing to a more nuanced understanding of attribute importance. The analysis of the proposed methods yields significant findings. The weight learning method proves instrumental in discerning the importance of attribute combinations, shedding light on key aspects within feature spaces. Simultaneously, the clustering method based on tensors exhibits improved efficacy in enhancing clustering quality and feature distinction. This not only advances our understanding of attribute importance but also paves the way for more nuanced data analysis methodologies. In conclusion, this research underscores the pivotal role of network platforms in contemporary society, emphasizing their potential for both positive contributions and adverse consequences. The proposed methodologies offer novel approaches to address these dualities, providing a foundation for future research and practical applications. Ultimately, this study contributes to the ongoing discourse on optimizing the utility of network platforms while minimizing their negative impacts. 展开更多
关键词 Network platform tensor-based clustering weight learning multi-linear euclidean
下载PDF
Target Controllability of Multi-Layer Networks With High-Dimensional Nodes
6
作者 Lifu Wang Zhaofei Li +1 位作者 Ge Guo Zhi Kong 《IEEE/CAA Journal of Automatica Sinica》 SCIE EI CSCD 2024年第9期1999-2010,共12页
This paper studies the target controllability of multilayer complex networked systems,in which the nodes are highdimensional linear time invariant(LTI)dynamical systems,and the network topology is directed and weighte... This paper studies the target controllability of multilayer complex networked systems,in which the nodes are highdimensional linear time invariant(LTI)dynamical systems,and the network topology is directed and weighted.The influence of inter-layer couplings on the target controllability of multi-layer networks is discussed.It is found that even if there exists a layer which is not target controllable,the entire multi-layer network can still be target controllable due to the inter-layer couplings.For the multi-layer networks with general structure,a necessary and sufficient condition for target controllability is given by establishing the relationship between uncontrollable subspace and output matrix.By the derived condition,it can be found that the system may be target controllable even if it is not state controllable.On this basis,two corollaries are derived,which clarify the relationship between target controllability,state controllability and output controllability.For the multi-layer networks where the inter-layer couplings are directed chains and directed stars,sufficient conditions for target controllability of networked systems are given,respectively.These conditions are easier to verify than the classic criterion. 展开更多
关键词 high-dimensional nodes inter-layer couplings multi-layer networks target controllability
下载PDF
Multi-Objective Equilibrium Optimizer for Feature Selection in High-Dimensional English Speech Emotion Recognition
7
作者 Liya Yue Pei Hu +1 位作者 Shu-Chuan Chu Jeng-Shyang Pan 《Computers, Materials & Continua》 SCIE EI 2024年第2期1957-1975,共19页
Speech emotion recognition(SER)uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions.The number of features acquired with acoustic analysis is ext... Speech emotion recognition(SER)uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions.The number of features acquired with acoustic analysis is extremely high,so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system.The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy.First,we use the information gain and Fisher Score to sort the features extracted from signals.Then,we employ a multi-objective ranking method to evaluate these features and assign different importance to them.Features with high rankings have a large probability of being selected.Finally,we propose a repair strategy to address the problem of duplicate solutions in multi-objective feature selection,which can improve the diversity of solutions and avoid falling into local traps.Using random forest and K-nearest neighbor classifiers,four English speech emotion datasets are employed to test the proposed algorithm(MBEO)as well as other multi-objective emotion identification techniques.The results illustrate that it performs well in inverted generational distance,hypervolume,Pareto solutions,and execution time,and MBEO is appropriate for high-dimensional English SER. 展开更多
关键词 Speech emotion recognition filter-wrapper high-dimensional feature selection equilibrium optimizer MULTI-OBJECTIVE
下载PDF
A Shared Natural Neighbors Based-Hierarchical Clustering Algorithm for Discovering Arbitrary-Shaped Clusters
8
作者 Zhongshang Chen Ji Feng +1 位作者 Fapeng Cai Degang Yang 《Computers, Materials & Continua》 SCIE EI 2024年第8期2031-2048,共18页
In clustering algorithms,the selection of neighbors significantly affects the quality of the final clustering results.While various neighbor relationships exist,such as K-nearest neighbors,natural neighbors,and shared... In clustering algorithms,the selection of neighbors significantly affects the quality of the final clustering results.While various neighbor relationships exist,such as K-nearest neighbors,natural neighbors,and shared neighbors,most neighbor relationships can only handle single structural relationships,and the identification accuracy is low for datasets with multiple structures.In life,people’s first instinct for complex things is to divide them into multiple parts to complete.Partitioning the dataset into more sub-graphs is a good idea approach to identifying complex structures.Taking inspiration from this,we propose a novel neighbor method:Shared Natural Neighbors(SNaN).To demonstrate the superiority of this neighbor method,we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters(HC-SNaN).Our algorithm excels in identifying both spherical clusters and manifold clusters.Tested on synthetic datasets and real-world datasets,HC-SNaN demonstrates significant advantages over existing clustering algorithms,particularly when dealing with datasets containing arbitrary shapes. 展开更多
关键词 cluster analysis shared natural neighbor hierarchical clustering
下载PDF
Multiscale and Auto-Tuned Semi-Supervised Deep Subspace Clustering and Its Application in Brain Tumor Clustering
9
作者 Zhenyu Qian Yizhang Jiang +4 位作者 Zhou Hong Lijun Huang Fengda Li Khin Wee Lai Kaijian Xia 《Computers, Materials & Continua》 SCIE EI 2024年第6期4741-4762,共22页
In this paper,we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering(MAS-DSC)algorithm,aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world da... In this paper,we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering(MAS-DSC)algorithm,aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data,particularly in the field of medical imaging.Traditional deep subspace clustering algorithms,which are mostly unsupervised,are limited in their ability to effectively utilize the inherent prior knowledge in medical images.Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process,thereby enhancing the discriminative power of the feature representations.Additionally,the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data,resulting in more accurate clustering performance.To address the difficulty of hyperparameter selection in deep subspace clustering,this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering,prior knowledge constraints,and model loss weights.Extensive experiments on standard clustering datasets,including ORL,Coil20,and Coil100,validate the effectiveness of the MAS-DSC algorithm.The results show that with its multi-scale network structure and Bayesian hyperparameter optimization,MAS-DSC achieves excellent clustering results on these datasets.Furthermore,tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework. 展开更多
关键词 Deep subspace clustering multiscale network structure automatic hyperparameter tuning SEMI-SUPERVISED medical image clustering
下载PDF
Knowledge-Driven Possibilistic Clustering with Automatic Cluster Elimination
10
作者 Xianghui Hu Yiming Tang +2 位作者 Witold Pedrycz Jiuchuan Jiang Yichuan Jiang 《Computers, Materials & Continua》 SCIE EI 2024年第9期4917-4945,共29页
Traditional Fuzzy C-Means(FCM)and Possibilistic C-Means(PCM)clustering algorithms are data-driven,and their objective function minimization process is based on the available numeric data.Recently,knowledge hints have ... Traditional Fuzzy C-Means(FCM)and Possibilistic C-Means(PCM)clustering algorithms are data-driven,and their objective function minimization process is based on the available numeric data.Recently,knowledge hints have been introduced to formknowledge-driven clustering algorithms,which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints.However,these algorithms cannot produce the optimal number of clusters by the clustering algorithm itself;they require the assistance of evaluation indices.Moreover,knowledge hints are usually used as part of the data structure(directly replacing some clustering centers),which severely limits the flexibility of the algorithm and can lead to knowledgemisguidance.To solve this problem,this study designs a newknowledge-driven clustering algorithmcalled the PCM clusteringwith High-density Points(HP-PCM),in which domain knowledge is represented in the form of so-called high-density points.First,a newdatadensitycalculation function is proposed.The Density Knowledge Points Extraction(DKPE)method is established to filter out high-density points from the dataset to form knowledge hints.Then,these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure.Finally,the initial number of clusters is set to be greater than the true one based on the number of knowledge hints.Then,the HP-PCM algorithm automatically determines the final number of clusters during the clustering process by considering the cluster elimination mechanism.Through experimental studies,including some comparative analyses,the results highlight the effectiveness of the proposed algorithm,such as the increased success rate in clustering,the ability to determine the optimal cluster number,and the faster convergence speed. 展开更多
关键词 Fuzzy C-Means(FCM) possibilistic clustering optimal number of clusters knowledge-driven machine learning fuzzy logic
下载PDF
A novel method for clustering cellular data to improve classification
11
作者 Diek W.Wheeler Giorgio A.Ascoli 《Neural Regeneration Research》 SCIE CAS 2025年第9期2697-2705,共9页
Many fields,such as neuroscience,are experiencing the vast prolife ration of cellular data,underscoring the need fo r organizing and interpreting large datasets.A popular approach partitions data into manageable subse... Many fields,such as neuroscience,are experiencing the vast prolife ration of cellular data,underscoring the need fo r organizing and interpreting large datasets.A popular approach partitions data into manageable subsets via hierarchical clustering,but objective methods to determine the appropriate classification granularity are missing.We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters.Here we present the corresponding protocol to classify cellular datasets by combining datadriven unsupervised hierarchical clustering with statistical testing.These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values,including molecula r,physiological,and anatomical datasets.We demonstrate the protocol using cellular data from the Janelia MouseLight project to chara cterize morphological aspects of neurons. 展开更多
关键词 cellular data clustering dendrogram data classification Levene's one-tailed statistical test unsupervised hierarchical clustering
下载PDF
Path-Based Clustering Algorithm with High Scalability Using the Combined Behavior of Evolutionary Algorithms
12
作者 Leila Safari-Monjeghtapeh Mansour Esmaeilpour 《Computer Systems Science & Engineering》 2024年第3期705-721,共17页
Path-based clustering algorithms typically generate clusters by optimizing a benchmark function.Most optimiza-tion methods in clustering algorithms often offer solutions close to the general optimal value.This study a... Path-based clustering algorithms typically generate clusters by optimizing a benchmark function.Most optimiza-tion methods in clustering algorithms often offer solutions close to the general optimal value.This study achieves the global optimum value for the criterion function in a shorter time using the minimax distance,Maximum Spanning Tree“MST”,and meta-heuristic algorithms,including Genetic Algorithm“GA”and Particle Swarm Optimization“PSO”.The Fast Path-based Clustering“FPC”algorithm proposed in this paper can find cluster centers correctly in most datasets and quickly perform clustering operations.The FPC does this operation using MST,the minimax distance,and a new hybrid meta-heuristic algorithm in a few rounds of algorithm iterations.This algorithm can achieve the global optimal value,and the main clustering process of the algorithm has a computational complexity of O�k2×n�.However,due to the complexity of the minimum distance algorithm,the total computational complexity is O�n2�.Experimental results of FPC on synthetic datasets with arbitrary shapes demonstrate that the algorithm is resistant to noise and outliers and can correctly identify clusters of varying sizes and numbers.In addition,the FPC requires the number of clusters as the only parameter to perform the clustering process.A comparative analysis of FPC and other clustering algorithms in this domain indicates that FPC exhibits superior speed,stability,and performance. 展开更多
关键词 clustering global optimization the minimax matrix MST path-based clustering FPC
下载PDF
An Efficient Reliability-Based Optimization Method Utilizing High-Dimensional Model Representation and Weight-Point Estimation Method
13
作者 Xiaoyi Wang Xinyue Chang +2 位作者 Wenxuan Wang Zijie Qiao Feng Zhang 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第5期1775-1796,共22页
The objective of reliability-based design optimization(RBDO)is to minimize the optimization objective while satisfying the corresponding reliability requirements.However,the nested loop characteristic reduces the effi... The objective of reliability-based design optimization(RBDO)is to minimize the optimization objective while satisfying the corresponding reliability requirements.However,the nested loop characteristic reduces the efficiency of RBDO algorithm,which hinders their application to high-dimensional engineering problems.To address these issues,this paper proposes an efficient decoupled RBDO method combining high dimensional model representation(HDMR)and the weight-point estimation method(WPEM).First,we decouple the RBDO model using HDMR and WPEM.Second,Lagrange interpolation is used to approximate a univariate function.Finally,based on the results of the first two steps,the original nested loop reliability optimization model is completely transformed into a deterministic design optimization model that can be solved by a series of mature constrained optimization methods without any additional calculations.Two numerical examples of a planar 10-bar structure and an aviation hydraulic piping system with 28 design variables are analyzed to illustrate the performance and practicability of the proposed method. 展开更多
关键词 Reliability-based design optimization high-dimensional model decomposition point estimation method Lagrange interpolation aviation hydraulic piping system
下载PDF
Censored Composite Conditional Quantile Screening for High-Dimensional Survival Data
14
作者 LIU Wei LI Yingqiu 《应用概率统计》 CSCD 北大核心 2024年第5期783-799,共17页
In this paper,we introduce the censored composite conditional quantile coefficient(cC-CQC)to rank the relative importance of each predictor in high-dimensional censored regression.The cCCQC takes advantage of all usef... In this paper,we introduce the censored composite conditional quantile coefficient(cC-CQC)to rank the relative importance of each predictor in high-dimensional censored regression.The cCCQC takes advantage of all useful information across quantiles and can detect nonlinear effects including interactions and heterogeneity,effectively.Furthermore,the proposed screening method based on cCCQC is robust to the existence of outliers and enjoys the sure screening property.Simulation results demonstrate that the proposed method performs competitively on survival datasets of high-dimensional predictors,particularly when the variables are highly correlated. 展开更多
关键词 high-dimensional survival data censored composite conditional quantile coefficient sure screening property rank consistency property
下载PDF
Efficient Clustering Network Based on Matrix Factorization
15
作者 Jieren Cheng Jimei Li +2 位作者 Faqiang Zeng Zhicong Tao and Yue Yang 《Computers, Materials & Continua》 SCIE EI 2024年第7期281-298,共18页
Contrastive learning is a significant research direction in the field of deep learning.However,existing data augmentation methods often lead to issues such as semantic drift in generated views while the complexity of ... Contrastive learning is a significant research direction in the field of deep learning.However,existing data augmentation methods often lead to issues such as semantic drift in generated views while the complexity of model pre-training limits further improvement in the performance of existing methods.To address these challenges,we propose the Efficient Clustering Network based on Matrix Factorization(ECN-MF).Specifically,we design a batched low-rank Singular Value Decomposition(SVD)algorithm for data augmentation to eliminate redundant information and uncover major patterns of variation and key information in the data.Additionally,we design a Mutual Information-Enhanced Clustering Module(MI-ECM)to accelerate the training process by leveraging a simple architecture to bring samples from the same cluster closer while pushing samples from other clusters apart.Extensive experiments on six datasets demonstrate that ECN-MF exhibits more effective performance compared to state-of-the-art algorithms. 展开更多
关键词 Contrastive learning clustering matrix factorization
下载PDF
Improved Data Stream Clustering Method: Incorporating KD-Tree for Typicality and Eccentricity-Based Approach
16
作者 Dayu Xu Jiaming Lu +1 位作者 Xuyao Zhang Hongtao Zhang 《Computers, Materials & Continua》 SCIE EI 2024年第2期2557-2573,共17页
Data stream clustering is integral to contemporary big data applications.However,addressing the ongoing influx of data streams efficiently and accurately remains a primary challenge in current research.This paper aims... Data stream clustering is integral to contemporary big data applications.However,addressing the ongoing influx of data streams efficiently and accurately remains a primary challenge in current research.This paper aims to elevate the efficiency and precision of data stream clustering,leveraging the TEDA(Typicality and Eccentricity Data Analysis)algorithm as a foundation,we introduce improvements by integrating a nearest neighbor search algorithm to enhance both the efficiency and accuracy of the algorithm.The original TEDA algorithm,grounded in the concept of“Typicality and Eccentricity Data Analytics”,represents an evolving and recursive method that requires no prior knowledge.While the algorithm autonomously creates and merges clusters as new data arrives,its efficiency is significantly hindered by the need to traverse all existing clusters upon the arrival of further data.This work presents the NS-TEDA(Neighbor Search Based Typicality and Eccentricity Data Analysis)algorithm by incorporating a KD-Tree(K-Dimensional Tree)algorithm integrated with the Scapegoat Tree.Upon arrival,this ensures that new data points interact solely with clusters in very close proximity.This significantly enhances algorithm efficiency while preventing a single data point from joining too many clusters and mitigating the merging of clusters with high overlap to some extent.We apply the NS-TEDA algorithm to several well-known datasets,comparing its performance with other data stream clustering algorithms and the original TEDA algorithm.The results demonstrate that the proposed algorithm achieves higher accuracy,and its runtime exhibits almost linear dependence on the volume of data,making it more suitable for large-scale data stream analysis research. 展开更多
关键词 Data stream clustering TEDA KD-TREE scapegoat tree
下载PDF
Hyperspectral Image Based Interpretable Feature Clustering Algorithm
17
作者 Yaming Kang PeishunYe +1 位作者 Yuxiu Bai Shi Qiu 《Computers, Materials & Continua》 SCIE EI 2024年第5期2151-2168,共18页
Hyperspectral imagery encompasses spectral and spatial dimensions,reflecting the material properties of objects.Its application proves crucial in search and rescue,concealed target identification,and crop growth analy... Hyperspectral imagery encompasses spectral and spatial dimensions,reflecting the material properties of objects.Its application proves crucial in search and rescue,concealed target identification,and crop growth analysis.Clustering is an important method of hyperspectral analysis.The vast data volume of hyperspectral imagery,coupled with redundant information,poses significant challenges in swiftly and accurately extracting features for subsequent analysis.The current hyperspectral feature clustering methods,which are mostly studied from space or spectrum,do not have strong interpretability,resulting in poor comprehensibility of the algorithm.So,this research introduces a feature clustering algorithm for hyperspectral imagery from an interpretability perspective.It commences with a simulated perception process,proposing an interpretable band selection algorithm to reduce data dimensions.Following this,amulti-dimensional clustering algorithm,rooted in fuzzy and kernel clustering,is developed to highlight intra-class similarities and inter-class differences.An optimized P systemis then introduced to enhance computational efficiency.This system coordinates all cells within a mapping space to compute optimal cluster centers,facilitating parallel computation.This approach diminishes sensitivity to initial cluster centers and augments global search capabilities,thus preventing entrapment in local minima and enhancing clustering performance.Experiments conducted on 300 datasets,comprising both real and simulated data.The results show that the average accuracy(ACC)of the proposed algorithm is 0.86 and the combination measure(CM)is 0.81. 展开更多
关键词 HYPERSPECTRAL fuzzy clustering tissue P system band selection interpretable
下载PDF
Research on Tensor Multi-Clustering Distributed Incremental Updating Method for Big Data
18
作者 Hongjun Zhang Zeyu Zhang +3 位作者 Yilong Ruan Hao Ye Peng Li Desheng Shi 《Computers, Materials & Continua》 SCIE EI 2024年第10期1409-1432,共24页
The scale and complexity of big data are growing continuously,posing severe challenges to traditional data processing methods,especially in the field of clustering analysis.To address this issue,this paper introduces ... The scale and complexity of big data are growing continuously,posing severe challenges to traditional data processing methods,especially in the field of clustering analysis.To address this issue,this paper introduces a new method named Big Data Tensor Multi-Cluster Distributed Incremental Update(BDTMCDIncreUpdate),which combines distributed computing,storage technology,and incremental update techniques to provide an efficient and effective means for clustering analysis.Firstly,the original dataset is divided into multiple subblocks,and distributed computing resources are utilized to process the sub-blocks in parallel,enhancing efficiency.Then,initial clustering is performed on each sub-block using tensor-based multi-clustering techniques to obtain preliminary results.When new data arrives,incremental update technology is employed to update the core tensor and factor matrix,ensuring that the clustering model can adapt to changes in data.Finally,by combining the updated core tensor and factor matrix with historical computational results,refined clustering results are obtained,achieving real-time adaptation to dynamic data.Through experimental simulation on the Aminer dataset,the BDTMCDIncreUpdate method has demonstrated outstanding performance in terms of accuracy(ACC)and normalized mutual information(NMI)metrics,achieving an accuracy rate of 90%and an NMI score of 0.85,which outperforms existing methods such as TClusInitUpdate and TKLClusUpdate in most scenarios.Therefore,the BDTMCDIncreUpdate method offers an innovative solution to the field of big data analysis,integrating distributed computing,incremental updates,and tensor-based multi-clustering techniques.It not only improves the efficiency and scalability in processing large-scale high-dimensional datasets but also has been validated for its effectiveness and accuracy through experiments.This method shows great potential in real-world applications where dynamic data growth is common,and it is of significant importance for advancing the development of data analysis technology. 展开更多
关键词 TENSOR incremental update DISTRIBUTED clustering processing big data
下载PDF
Examining the Use of Scott’s Formula and Link Expiration Time Metric for Vehicular Clustering
19
作者 Fady Samann Shavan Askar 《Computer Modeling in Engineering & Sciences》 SCIE EI 2024年第3期2421-2444,共24页
Implementing machine learning algorithms in the non-conducive environment of the vehicular network requires some adaptations due to the high computational complexity of these algorithms.K-clustering algorithms are sim... Implementing machine learning algorithms in the non-conducive environment of the vehicular network requires some adaptations due to the high computational complexity of these algorithms.K-clustering algorithms are simplistic,with fast performance and relative accuracy.However,their implementation depends on the initial selection of clusters number(K),the initial clusters’centers,and the clustering metric.This paper investigated using Scott’s histogram formula to estimate the K number and the Link Expiration Time(LET)as a clustering metric.Realistic traffic flows were considered for three maps,namely Highway,Traffic Light junction,and Roundabout junction,to study the effect of road layout on estimating the K number.A fast version of the PAM algorithm was used for clustering with a modification to reduce time complexity.The Affinity propagation algorithm sets the baseline for the estimated K number,and the Medoid Silhouette method is used to quantify the clustering.OMNET++,Veins,and SUMO were used to simulate the traffic,while the related algorithms were implemented in Python.The Scott’s formula estimation of the K number only matched the baseline when the road layout was simple.Moreover,the clustering algorithm required one iteration on average to converge when used with LET. 展开更多
关键词 clustering vehicular network Scott’s formula FastPAM
下载PDF
Design and construction of charged-particle telescope array for study of exotic nuclear clustering structure
20
作者 Zheng‑Li Liao Xi‑Guang Cao +2 位作者 Yu‑Xuan Yang Chang‑Bo Fu Xian‑Gai Deng 《Nuclear Science and Techniques》 SCIE EI CAS CSCD 2024年第8期114-123,共10页
The exploration of exotic shapes and properties of atomic nuclei,e.g.,αcluster and toroidal shape,is a fascinating field in nuclear physics.To study the decay of these nuclei,a novel detector aimed at detecting multi... The exploration of exotic shapes and properties of atomic nuclei,e.g.,αcluster and toroidal shape,is a fascinating field in nuclear physics.To study the decay of these nuclei,a novel detector aimed at detecting multipleα-particle events was designed and constructed.The detector comprises two layers of double-sided silicon strip detectors(DSSD)and a cesium iodide scintillator array coupled with silicon photomultipliers array as light sensors,which has the advantages of their small size,fast response,and large dynamic range.DSSDs coupled with cesium iodide crystal arrays are used to distinguish multipleαhits.The detector array has a compact and integrated design that can be adapted to different experimental conditions.The detector array was simulated using Geant4,and the excitation energy spectra of someα-clustering nuclei were reconstructed to demonstrate the performance.The simulation results show that the detector array has excellent angular and energy resolutions,enabling effective reconstruction of the nuclear excited state by multipleαparticle events.This detector offers a new and powerful tool for nuclear physics experiments and has the potential to discover interesting physical phenomena related to exotic nuclear structures and their decay mechanisms. 展开更多
关键词 cluster decay Toroidal structure Telescope array SIPM Energy resolution
下载PDF
上一页 1 2 250 下一页 到第
使用帮助 返回顶部