Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
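The stopping principle in this abstract (cells must differ more between than within clusters) can be illustrated as a test applied to one candidate split. The sketch below is an illustration under assumptions, not the authors' published procedure: it swaps in a simple permutation test on Euclidean distances, and the function name `split_is_supported`, the parameters `n_perm`, `alpha`, and `seed`, and the toy data are all hypothetical.

```python
import math
import random
from itertools import combinations


def split_is_supported(points, labels, n_perm=999, alpha=0.05, seed=0):
    """Permutation test: is the mean between-cluster distance larger than
    the mean within-cluster distance by more than chance would explain?

    Assumption: this is a stand-in test, not the one used in the protocol.
    Returns (supported, p_value).
    """
    pairs = list(combinations(range(len(points)), 2))
    dists = [math.dist(points[i], points[j]) for i, j in pairs]

    def statistic(lab):
        between, within = [], []
        for (i, j), d in zip(pairs, dists):
            (within if lab[i] == lab[j] else between).append(d)
        if not between or not within:
            return 0.0
        return sum(between) / len(between) - sum(within) / len(within)

    rng = random.Random(seed)
    observed = statistic(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between points and labels
        if statistic(shuffled) >= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return p_value < alpha, p_value


# Toy data: two well-separated groups of five points each.
group_a = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.0)]
group_b = [(10.0, 10.0), (10.1, 10.2), (10.2, 10.1), (10.3, 10.3), (10.1, 10.0)]
supported, p_value = split_is_supported(group_a + group_b, [0] * 5 + [1] * 5)
```

In a hierarchical setting, a check of this kind would be run at each candidate subdivision, and a branch would stop being split once the test no longer supports the division.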
Customer segmentation according to load-shape profiles using smart meter data is an increasingly important application, vital to the planning and operation of energy systems and to enabling citizens’ participation in the energy transition. This study proposes an innovative multi-step clustering procedure to segment customers based on load-shape patterns at the daily and intra-daily time horizons. Smart meter data is split into daily and hourly normalized time series to assess monthly, weekly, daily, and hourly seasonality patterns separately. The dimensionality reduction implicit in the splitting allows a direct approach to clustering raw daily energy time series data. The intraday clustering procedure sequentially identifies representative hourly day-unit profiles for each customer and for the entire population. For the first time, a step function approach is applied to reduce time series dimensionality. Customer attributes embedded in surveys are employed to build external clustering validation metrics using Cramér’s V correlation factors and to identify statistically significant determinants of load shape in energy usage. In addition, a time series feature engineering approach is used to extract 16 relevant demand flexibility indicators that characterize customers and corresponding clusters along four different axes: available Energy (E), Temporal patterns (T), Consistency (C), and Variability (V). The methodology is implemented on a real-world electricity consumption dataset of 325 Small and Medium-sized Enterprise (SME) customers, identifying 4 daily and 6 hourly easy-to-interpret, well-defined clusters. The application of the methodology includes selecting key parameters via grid search and a thorough comparison of clustering distances and methods to ensure the robustness of the results. Further research can test the scalability of the methodology to larger datasets from various customer segments (households and large commercial) and locations with different weather and socioeconomic conditions.
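The external validation step in this abstract relies on Cramér’s V between cluster labels and categorical survey attributes. A minimal sketch of that statistic follows, assuming the plain (not bias-corrected) definition V = sqrt(χ² / (n · (min(r, c) − 1))); the function name `cramers_v` and the toy labels are hypothetical, and the study may use a corrected variant.

```python
import math
from collections import Counter


def cramers_v(labels_a, labels_b):
    """Uncorrected Cramér's V between two categorical label sequences.

    Returns a value in [0, 1]: 0 means no association, 1 means one
    variable fully determines the other.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label sequences must be non-empty and equal length")
    joint = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n  # count expected under independence
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy example: cluster labels against a hypothetical survey attribute.
clusters = [0, 0, 1, 1]
aligned = cramers_v(clusters, ["office", "office", "retail", "retail"])
unrelated = cramers_v(clusters, ["office", "retail", "office", "retail"])
```

A high value for a given attribute would suggest that the attribute helps explain cluster membership; significance would still need a separate chi-squared test.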
Single-pass clustering is commonly used in topic detection and tracking (TDT) due to its simplicity, high efficiency, and low cost. When dealing with large-scale data, however, its time cost increases sharply and clustering performance degrades considerably. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed; inspired by hierarchical and concurrent ideas, it divides the clustering process into three stages. News reports are first classified into different categories. Single-pass clustering is then run twice within each category, followed by one agglomerative clustering across categories. In addition, to capture semantic similarity in news reports, the topic model is improved based on named entities. Experimental results show that the proposed method effectively accelerates the process and improves performance.
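For readers unfamiliar with the baseline, the basic single-pass step that this method builds on can be sketched as follows. This is a generic textbook version under assumptions (cosine similarity, running-mean centroids, a fixed `threshold`), not the three-stage algorithm proposed in the paper; the names `cosine` and `single_pass` and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(vectors, threshold=0.8):
    """Assign each vector, in arrival order, to the most similar cluster
    centroid, or open a new cluster if no centroid reaches `threshold`."""
    centroids, sizes, labels = [], [], []
    for v in vectors:
        best, best_sim = -1, threshold
        for c, centroid in enumerate(centroids):
            sim = cosine(v, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best < 0:
            centroids.append(list(v))  # first member defines the centroid
            sizes.append(1)
            best = len(centroids) - 1
        else:
            sizes[best] += 1
            n = sizes[best]
            # Update the centroid as a running mean of its members.
            centroids[best] = [(m * (n - 1) + x) / n
                               for m, x in zip(centroids[best], v)]
        labels.append(best)
    return labels


# Toy document vectors: two reports on one topic, two on another.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = single_pass(docs, threshold=0.8)
```

Each incoming item is compared with every existing centroid, so the cost grows with the number of clusters; this is the scaling problem that partitioning reports into categories first is meant to reduce.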
In recent years, many unknown protocols have emerged, bringing severe challenges to network security and network management. Existing unknown protocol recognition methods suffer from weak feature extraction ability and cannot thoroughly mine the discriminating features of the protocol data. To address this issue, we propose an unknown application layer protocol recognition method based on deep clustering. Deep clustering, which combines a deep neural network with a clustering algorithm, can automatically extract features from the input and cluster the data based on the extracted features. Compared with traditional clustering methods, deep clustering achieves higher clustering accuracy. The proposed method uses network-in-network (NIN), channel attention, spatial attention, and Bidirectional Long Short-Term Memory (BLSTM) to construct an autoencoder that extracts the spatial-temporal features of the protocol data, and uses an unsupervised clustering algorithm to recognize the unknown protocols based on those features. First, the method extracts the application layer protocol data from the network traffic and transforms the data into a one-dimensional matrix. Second, the autoencoder is pretrained, the protocol data is compressed into a low-dimensional latent space by the autoencoder, and the initial clustering is performed with K-Means. Finally, the clustering loss is calculated and the classification model is optimized according to the clustering loss. The classification results are obtained once the classification model is optimal. Compared with existing unknown protocol recognition methods, the proposed method uses deep clustering to cluster the unknown protocols, mining the key features of the protocol data and recognizing the unknown protocols accurately. Experimental results show that the proposed method can effectively recognize unknown protocols and that it performs better than other methods.
To study the formation and transformation mechanism of long-period stacked ordered (LPSO) structures, a systematic atomic-scale analysis was conducted of their structural evolution in the Mg-Gd-Y-Zn-Zr alloy annealed at 300 ℃ to 500 ℃. Various types of metastable LPSO building block clusters were found to exist in alloy structures at different temperatures, which precipitate during the solidification and homogenization process. The stability of Zn/Y clusters is explained by first-principles density functional theory calculations. The LPSO structure is distinguished by the arrangement of its different Zn/Y-enriched LPSO structural units, which comprises local fcc stacking sequences upon a tightly packed plane. The presence of solute atoms causes local lattice distortion, thereby enabling the rearrangement of Mg atoms in the different configurations in the local lattice, and local HCP-FCC transitions occur between Mg and Zn atoms occupying the nearest neighbor positions. This finding indicates that LPSO structures can generate the necessary Shockley partial dislocations on specific slip surfaces, providing direct evidence of the transition from 18R to 14H. Growth of the LPSO, devoid of any defects and non-coherent interfaces, was observed separately from other precipitated phases. As a result, the precipitation sequence of LPSO in the solidification stage was as follows: Zn/Y cluster + Mg layers → various metastable LPSO building block clusters → 18R/24R LPSO; whereas the precipitation sequence of LPSO during homogenization treatment was observed to be as follows: 18R LPSO → various metastable LPSO building block clusters → 14H LPSO. Of these, 14H LPSO was found to be the most thermodynamically stable structure.
The study delves into the expanding role of network platforms in our daily lives, encompassing various mediums like blogs, forums, online chats, and prominent social media platforms such as Facebook, Twitter, and Instagram. While these platforms offer avenues for self-expression and community support, they concurrently harbor negative impacts, fostering antisocial behaviors like phishing, impersonation, hate speech, cyberbullying, cyberstalking, cyberterrorism, fake news propagation, spamming, and fraud. Notably, individuals also leverage these platforms to connect with authorities and seek aid during disasters. The overarching objective of this research is to address the dual nature of network platforms by proposing innovative methodologies aimed at enhancing their positive aspects and mitigating their negative repercussions. To achieve this, the study introduces a weight learning method grounded in multi-linear attribute ranking. This approach serves to evaluate the significance of attribute combinations across all feature spaces. Additionally, a novel clustering method based on tensors is proposed to elevate the quality of clustering while effectively distinguishing selected features. The methodology incorporates a weighted average similarity matrix and optionally integrates weighted Euclidean distance, contributing to a more nuanced understanding of attribute importance. The analysis of the proposed methods yields significant findings. The weight learning method proves instrumental in discerning the importance of attribute combinations, shedding light on key aspects within feature spaces. Simultaneously, the clustering method based on tensors exhibits improved efficacy in enhancing clustering quality and feature distinction. This not only advances our understanding of attribute importance but also paves the way for more nuanced data analysis methodologies. In conclusion, this research underscores the pivotal role of network platforms in contemporary society, emphasizing their potential for both positive contributions and adverse consequences. The proposed methodologies offer novel approaches to address these dualities, providing a foundation for future research and practical applications. Ultimately, this study contributes to the ongoing discourse on optimizing the utility of network platforms while minimizing their negative impacts.
Rapid development in Information Technology (IT) has enabled several novel application areas, such as large outdoor vehicular networks for Vehicle-to-Vehicle (V2V) transmission. Vehicular networks provide a safer and more effective driving experience by presenting time-sensitive and location-aware data. Communication occurs directly between vehicles (V2V) and with Base Station (BS) units such as the Road Side Unit (RSU), the latter termed Vehicle-to-Infrastructure (V2I). However, the frequent topology changes in VANETs cause several problems for data transmission, as vehicle velocity varies with time. Therefore, designing an effective routing protocol for reliable and stable communication is important. Current research demonstrates that clustering is an intelligent method for effective routing in a mobile environment. This article therefore presents a Falcon Optimization Algorithm-based Energy Efficient Communication Protocol for Cluster-based Routing (FOA-EECPCR) technique in VANETs. The FOA-EECPCR technique aims to group the vehicles and determine the shortest route in the VANET. To accomplish this, the FOA-EECPCR technique first clusters the vehicles using FOA with fitness functions comprising energy, distance, and trust level. For the routing process, the Sparrow Search Algorithm (SSA) is derived with a fitness function that encompasses two variables, namely energy and distance. A series of experiments was conducted to evaluate the FOA-EECPCR method, and the experimental outcomes demonstrate its enhanced performance over other current methods.
The valence states and coordination structures of doped heterometal atoms in two-dimensional (2D) nanomaterials lack predictable regulation strategies. Hence, a robust method is proposed to form unsaturated heteroatom clusters via the metal-vacancy restraint mechanism, which can precisely regulate the bonding and valence state of heterometal atoms doped in 2D molybdenum disulfide. The unsaturated-valence heterometal Pt and Ru cluster atoms form a spatial coordination structure with Pt–S and Ru–O–S as catalytically active sites. Among them, the strong binding energy of negatively charged suspended S and O sites for H+, as well as the weak adsorption of positively charged unsaturated heterometal atoms for H*, reduces the energy barrier of the hydrogen evolution reaction, as shown by theoretical calculation. Consequently, the electrocatalytic hydrogen evolution performance is markedly improved by the ensemble effect of unsaturated heterometal atoms, with an overpotential of 84 mV and a Tafel slope of 68.5 mV dec^(−1). In brief, this metal vacancy-induced valence state regulation of heterometals can manipulate the coordination structure and catalytic activity of heterometal atoms doped in the 2D atomic lattice, and it is not limited to 2D nanomaterials.
Tri-axial fracturing experiments were carried out to understand the impact of lateral mechanical parameters on fracture propagation from multiple in-plane perforations in horizontal wells. The effects of geology, treatment, and perforation characteristics on non-planar propagation behavior are also discussed. According to the experimental findings, two parallel transverse fractures can be successfully initiated from in-plane perforation clusters in the horizontal well; even with the guidance of in-plane perforation, however, non-uniform fishbone-structure fracture propagation can still occur. Transverse and axial fractures combined into complex fractures under conditions of low horizontal principal stress difference and high pump rate. The injection pressure was also investigated, and the largest breakdown pressure was found for samples under these conditions. Increasing the perforation number or decreasing the cluster spacing provides more opportunities to increase the complexity of the target stimulated zone, thus affecting the pressure fluctuation. In contrast, increasing the fracturing fluid viscosity can reduce the complexity of multiple fractures. Fracture propagation is significantly affected by changes in the rock mechanical properties. The fracture geometry in the highly brittle zone appears complicated and tends to induce fracture reorientation from the weakly brittle zone. The stress shadow effect can explain fracture attraction, branching, connection, and repulsion among the multiple perforation clusters in the horizontal well. Increasing rock heterogeneity can enhance the stress shadow effect, resulting in more complex fracture geometry. In addition, variable-density perforation and temporary plugging fracturing were also conducted, demonstrating a higher likelihood of non-uniform multiple fracture propagation. Thus, to increase perforation efficiency along the horizontal well, it is necessary to consider the lateral fracability of the target formation along the horizontal well.
The photocatalytic conversion of CO_(2) into solar-powered fuels is viewed as a forward-looking strategy to address energy scarcity and global warming. This work demonstrated the selective photoreduction of CO_(2) to CO using ultrathin Bi_(12)O_(17)Cl_(2) nanosheets decorated with hydrothermally synthesized bismuth clusters and oxygen vacancies (OVs). The characterizations revealed that the coexistence of OVs and Bi clusters generated in situ contributed to the high efficiency of CO_(2)-to-CO conversion (64.3 μmol g^(−1) h^(−1)) and perfect selectivity. The OVs on the (001) facet of the ultrathin Bi_(12)O_(17)Cl_(2) nanosheets serve as CO_(2) adsorption and activation sites, capturing photoexcited electrons and extending light absorption through defect states. In addition, the Bi clusters generated in situ provide hole trapping and a surface plasmonic resonance effect. This study offers great potential for the construction of semiconductor hybrids as multiphotocatalysts that can be used for the elimination and conversion of CO_(2) for energy and environmental applications.
The evolution of dislocation loops in austenitic steels irradiated with Fe^(+) is investigated using cluster dynamics (CD) simulations with a newly developed CD model. The CD predictions are compared with experimental results in the literature. The number density and average diameter of the dislocation loops obtained from the CD simulations are in good agreement with the experimental data obtained from transmission electron microscopy (TEM) observations of Fe^(+)-irradiated Solution Annealed 304, Cold Worked 316, and HR3 austenitic steels in the literature. The CD simulation results demonstrate that the diffusion of in-cascade interstitial clusters plays a major role in the dislocation loop density and dislocation loop growth; in particular, for the HR3 austenitic steel, the CD model has verified the effect of temperature on the density and size of the dislocation loops.
We study the structural and dynamical properties of A209 based on Chandra and XMM-Newton observations. We obtain detailed temperature, pressure, and entropy maps with the contour binning method, and find a hot region in the NW direction. The X-ray brightness residual map and corresponding temperature profiles reveal a possible shock front in the NW direction and a cold front feature in the SE direction. Combining these with the galaxy luminosity density map, we propose a weak merger scenario. A young sub-cluster passing from the SE to the NW direction could explain the optical subpeak, the intracluster medium temperature map, the X-ray surface brightness excess, and the X-ray peak offset together.
In clustering algorithms, the selection of neighbors significantly affects the quality of the final clustering results. While various neighbor relationships exist, such as K-nearest neighbors, natural neighbors, and shared neighbors, most neighbor relationships can only handle single structural relationships, and the identification accuracy is low for datasets with multiple structures. In life, people’s first instinct for complex things is to divide them into multiple parts to complete. Partitioning the dataset into more sub-graphs is a good approach to identifying complex structures. Taking inspiration from this, we propose a novel neighbor method: Shared Natural Neighbors (SNaN). To demonstrate the superiority of this neighbor method, we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters (HC-SNaN). Our algorithm excels in identifying both spherical clusters and manifold clusters. Tested on synthetic datasets and real-world datasets, HC-SNaN demonstrates significant advantages over existing clustering algorithms, particularly when dealing with datasets containing arbitrary shapes.
In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to effectively utilize the inherent prior knowledge in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering performance. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
Developing highly active alloy catalysts that surpass the performance of platinum group metals in the oxygen reduction reaction (ORR) is critical in electrocatalysis. Gold-based single-atom alloy (AuSAA) clusters are gaining recognition as promising alternatives due to their potential for high activity. However, enhancing the activity of AuSAA clusters remains challenging due to limited insights into their actual active site in alkaline environments. Herein, we studied a variety of Au_(54)M_(1) SAA cluster catalysts and revealed that the operando-formed MO_(x)(OH)_(y) complex acts as the crucial active site for catalyzing the ORR under basic solution conditions. The observed volcano plot indicates that Au_(54)Co_(1), Au_(54)M_(1), and Au_(54)Ru_(1) clusters can be the optimal Au_(54)M_(1) SAA cluster catalysts for the ORR. Our findings offer new insights into the actual active sites of AuSAA cluster catalysts, which will inform rational catalyst design in experimental settings.
Open clusters (OCs) serve as invaluable tracers for investigating the properties and evolution of stars and galaxies. Despite recent advancements in machine learning clustering algorithms, accurately discerning such clusters remains challenging. We revisited the 3013 samples generated with a hybrid clustering algorithm of FoF and pyUPMASK. A multi-view clustering (MvC) ensemble method was applied, which analyzes each member star of the OC from three perspectives (proper motion, spatial position, and composite views) before integrating the clustering outcomes to deduce more reliable cluster memberships. Based on the MvC results, we further excluded cluster candidates with fewer than ten member stars and obtained 1256 OC candidates. After isochrone fitting and visual inspection, we identified 506 candidate OCs in the Milky Way. In addition to the 493 previously reported candidates, we finally discovered 13 high-confidence new candidate clusters.
Traditional Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) clustering algorithms are data-driven, and their objective function minimization process is based on the available numeric data. Recently, knowledge hints have been introduced to form knowledge-driven clustering algorithms, which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints. However, these algorithms cannot determine the optimal number of clusters by themselves; they require the assistance of evaluation indices. Moreover, knowledge hints are usually used as part of the data structure (directly replacing some clustering centers), which severely limits the flexibility of the algorithm and can lead to knowledge misguidance. To solve this problem, this study designs a new knowledge-driven clustering algorithm called PCM clustering with High-density Points (HP-PCM), in which domain knowledge is represented in the form of so-called high-density points. First, a new data density calculation function is proposed. The Density Knowledge Points Extraction (DKPE) method is established to filter out high-density points from the dataset to form knowledge hints. Then, these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure. Finally, the initial number of clusters is set to be greater than the true one based on the number of knowledge hints, and the HP-PCM algorithm automatically determines the final number of clusters during the clustering process through a cluster elimination mechanism. Experimental studies, including some comparative analyses, highlight the effectiveness of the proposed algorithm, such as the increased success rate in clustering, the ability to determine the optimal cluster number, and the faster convergence speed.
Funding (cellular dataset classification protocol): Supported in part by NIH grants R01NS39600, U01MH114829, and RF1MH128693 (to GAA).
Funding (smart meter load-shape customer segmentation): Supported by the Spanish Ministry of Science and Innovation under Projects PID2022-137680OB-C32 and PID2022-139187OB-I00.
Funding (single-pass hierarchical clustering for TDT): Supported by the National Natural Science Foundation of China (No. 61502312), the Fundamental Research Funds for the Central Universities (No. 2017BQ024), the Natural Science Foundation of Guangdong Province (No. 2017A030310428), and the Science and Technology Program of Guangzhou (No. 201806020075, 20180210025).
Funding: This work is supported by the National Key R&D Program of China (2017YFB0802900).
Abstract: In recent years, unknown protocols have been emerging constantly, posing severe challenges to network security and network management. Existing unknown protocol recognition methods suffer from weak feature extraction and cannot thoroughly mine the discriminating features of protocol data. To address this issue, we propose an unknown application layer protocol recognition method based on deep clustering. Deep clustering, which combines a deep neural network with a clustering algorithm, automatically extracts features from the input and clusters the data based on the extracted features. Compared with traditional clustering methods, deep clustering achieves higher clustering accuracy. The proposed method uses network-in-network (NIN), channel attention, spatial attention, and Bidirectional Long Short-Term Memory (BLSTM) to construct an autoencoder that extracts the spatial-temporal features of the protocol data, and uses an unsupervised clustering algorithm to recognize unknown protocols based on those features. The method first extracts the application layer protocol data from the network traffic and transforms the data into a one-dimensional matrix. Second, the autoencoder is pretrained, the protocol data is compressed into a low-dimensional latent space by the autoencoder, and the initial clustering is performed with K-Means. Finally, the clustering loss is calculated and the classification model is optimized according to the clustering loss. The classification results are obtained once the classification model is optimal. Compared with existing unknown protocol recognition methods, the proposed method uses deep clustering to cluster unknown protocols, mining the key features of the protocol data and recognizing unknown protocols accurately. Experimental results show that the proposed method effectively recognizes unknown protocols and outperforms other methods.
Funding: Financially supported by the Natural Science Basic Research Program of Shaanxi (grant number 2022JM-239) and the Key Research and Development Project of Shaanxi Province (grant number 2021LLRH-05–08).
Abstract: To study the formation and transformation mechanism of long-period stacking ordered (LPSO) structures, a systematic atomic-scale analysis was conducted of the structural evolution of LPSO structures in the Mg-Gd-Y-Zn-Zr alloy annealed at 300–500 ℃. Various types of metastable LPSO building block clusters were found in the alloy at different temperatures; these precipitate during solidification and homogenization. The stability of Zn/Y clusters is explained by first-principles density functional theory calculations. The LPSO structure is distinguished by the arrangement of its different Zn/Y-enriched LPSO structural units, which comprise local FCC stacking sequences on a close-packed plane. The presence of solute atoms causes local lattice distortion, enabling the rearrangement of Mg atoms into different configurations in the local lattice, and local HCP-FCC transitions occur between Mg and Zn atoms occupying nearest-neighbor positions. This finding indicates that LPSO structures can generate the necessary Shockley partial dislocations on specific slip planes, providing direct evidence of the transition from 18R to 14H. Growth of the LPSO, free of defects and non-coherent interfaces, was observed separately from other precipitated phases. As a result, the precipitation sequence of LPSO in the solidification stage was: Zn/Y cluster + Mg layers → various metastable LPSO building block clusters → 18R/24R LPSO; whereas the precipitation sequence of LPSO during homogenization treatment was: 18R LPSO → various metastable LPSO building block clusters → 14H LPSO. Of these, 14H LPSO was found to be the most thermodynamically stable structure.
Funding: Sponsored by the National Natural Science Foundation of P.R. China (Nos. 62102194 and 62102196), the Six Talent Peaks Project of Jiangsu Province (No. RJFW-111), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (Nos. KYCX23_1087 and KYCX22_1027).
Abstract: The study delves into the expanding role of network platforms in our daily lives, encompassing various mediums like blogs, forums, online chats, and prominent social media platforms such as Facebook, Twitter, and Instagram. While these platforms offer avenues for self-expression and community support, they concurrently harbor negative impacts, fostering antisocial behaviors like phishing, impersonation, hate speech, cyberbullying, cyberstalking, cyberterrorism, fake news propagation, spamming, and fraud. Notably, individuals also leverage these platforms to connect with authorities and seek aid during disasters. The overarching objective of this research is to address the dual nature of network platforms by proposing innovative methodologies aimed at enhancing their positive aspects and mitigating their negative repercussions. To achieve this, the study introduces a weight learning method grounded in multi-linear attribute ranking. This approach serves to evaluate the significance of attribute combinations across all feature spaces. Additionally, a novel clustering method based on tensors is proposed to elevate the quality of clustering while effectively distinguishing selected features. The methodology incorporates a weighted average similarity matrix and optionally integrates weighted Euclidean distance, contributing to a more nuanced understanding of attribute importance. The analysis of the proposed methods yields significant findings. The weight learning method proves instrumental in discerning the importance of attribute combinations, shedding light on key aspects within feature spaces. Simultaneously, the clustering method based on tensors exhibits improved efficacy in enhancing clustering quality and feature distinction. This not only advances our understanding of attribute importance but also paves the way for more nuanced data analysis methodologies.
In conclusion, this research underscores the pivotal role of network platforms in contemporary society, emphasizing their potential for both positive contributions and adverse consequences. The proposed methodologies offer novel approaches to address these dualities, providing a foundation for future research and practical applications. Ultimately, this study contributes to the ongoing discourse on optimizing the utility of network platforms while minimizing their negative impacts.
Abstract: Rapid development in Information Technology (IT) has enabled several novel application areas, such as large outdoor vehicular networks for Vehicle-to-Vehicle (V2V) transmission. Vehicular networks provide a safer and more effective driving experience by delivering time-sensitive and location-aware data. Communication occurs directly between vehicles (V2V) and between vehicles and Base Station (BS) units such as the Road Side Unit (RSU), the latter termed Vehicle-to-Infrastructure (V2I). However, frequent topology changes in VANETs cause several problems for data transmission, because vehicle velocity varies over time. Designing an effective routing protocol for reliable and stable communication is therefore important. Current research demonstrates that clustering is an intelligent method for effective routing in a mobile environment. This article therefore presents a Falcon Optimization Algorithm-based Energy Efficient Communication Protocol for Cluster-based Routing (FOA-EECPCR) technique in VANETs. The FOA-EECPCR technique aims to group the vehicles and determine the shortest route in the VANET. To accomplish this, it first clusters the vehicles using FOA with fitness functions comprising energy, distance, and trust level. For the routing process, the Sparrow Search Algorithm (SSA) is derived with a fitness function that encompasses two variables, namely energy and distance. A series of experiments was conducted to evaluate the FOA-EECPCR method, and the experimental outcomes demonstrate its enhanced performance over other current methods.
Funding: Supported by the National Natural Science Foundation of China (22205209, 52202373, and U21A200972), the China Postdoctoral Science Foundation (2022M722867), and the Key Research Project of Higher Education Institutions in Henan Province (23A530001).
Abstract: The valence states and coordination structures of doped heterometal atoms in two-dimensional (2D) nanomaterials lack predictable regulation strategies. Hence, a robust method is proposed to form unsaturated heteroatom clusters via the metal-vacancy restraint mechanism, which can precisely regulate the bonding and valence state of heterometal atoms doped in 2D molybdenum disulfide. The unsaturated-valence heterometal Pt and Ru cluster atoms form a spatial coordination structure with Pt–S and Ru–O–S as catalytically active sites. Among them, the strong binding energy of negatively charged suspended S and O sites for H+, together with the weak adsorption of positively charged unsaturated heterometal atoms for H*, reduces the energy barrier of the hydrogen evolution reaction, as shown by theoretical calculation. As a result, the electrocatalytic hydrogen evolution performance is markedly improved by the ensemble effect of unsaturated heterometal atoms, with an overpotential of 84 mV and a Tafel slope of 68.5 mV dec^(−1). In brief, this metal vacancy-induced valence state regulation can manipulate the coordination structure and catalytic activity of heterometal atoms doped in the 2D atomic lattice, and it is not limited to 2D nanomaterials.
Funding: Financially supported by the National Natural Science Foundation of China (51704324, 52374027), the Natural Science Foundation of Shandong Province (ZR2023ME158, ZR2022ME025), and the Open Fund of Key Laboratory of Tectonics and Petroleum Resources (TPR-2020-14).
Abstract: Tri-axial fracturing studies were carried out to understand the impact of lateral mechanical parameters on fracture propagation from multiple in-plane perforations in horizontal wells. The effects of geology, treatment, and perforation characteristics on non-planar propagation behavior are also discussed. According to the experimental findings, two parallel transverse fractures can be successfully initiated from in-plane perforation clusters in the horizontal well; despite the guidance of the in-plane perforations, nonuniform fishbone-structure fracture propagation can still occur. Transverse and axial fractures combine into complex fractures under low horizontal principal stress difference and large pump rate conditions. The injection pressure was also investigated, and the largest breakdown pressure was found for samples under these conditions. Increasing the perforation number or decreasing the cluster spacing provides more opportunities to increase the complexity of the target stimulated zone, thus affecting the pressure fluctuation. In contrast, increasing the fracturing fluid viscosity reduces the complexity of multiple fractures. Fracture propagation is significantly affected by changes in the rock mechanical properties. The fracture geometry in the highly brittle zone appears complicated and tends to induce fracture reorientation from the weak-brittle zone. The stress shadow effect can explain fracture attraction, branching, connection, and repulsion among the multiple perforation clusters in the horizontal well. Increasing rock heterogeneity enhances the stress shadow effect, resulting in more complex fracture geometry. In addition, variable-density perforation and temporary plugging fracturing were also conducted, demonstrating a higher likelihood of non-uniform multiple fracture propagation. Thus, to increase the perforation efficiency along the horizontal well, it is necessary to consider the lateral fracability of the target formation along the well.
Funding: Natural Science Foundation of Shandong Province, Grant/Award Number: ZR2022MB106; National Training Program of Innovation and Entrepreneurship for Undergraduates, Grant/Award Number: 202210424099; National Natural Science Foundation of China, Grant/Award Numbers: 21601067, 21701057, 21905147.
Abstract: The photocatalytic conversion of CO_(2) into solar-powered fuels is viewed as a forward-looking strategy to address energy scarcity and global warming. This work demonstrates the selective photoreduction of CO_(2) to CO using ultrathin Bi_(12)O_(17)Cl_(2) nanosheets decorated with hydrothermally synthesized bismuth clusters and oxygen vacancies (OVs). The characterizations revealed that the coexistence of OVs and Bi clusters generated in situ contributed to the high efficiency of CO_(2)-to-CO conversion (64.3 μmol g^(−1) h^(−1)) and perfect selectivity. The OVs on the (001) facet of the ultrathin Bi_(12)O_(17)Cl_(2) nanosheets serve as CO_(2) adsorption and activation sites, capturing photoexcited electrons and extending light absorption through defect states. In addition, the Bi clusters generated in situ trap holes and provide a surface plasmon resonance effect. This study offers great potential for the construction of semiconductor hybrids as multiphotocatalysts for the elimination and conversion of CO_(2) in energy and environmental applications.
Funding: Supported by the National Natural Science Foundation of China (No. U1967212), the Fundamental Research Funds for the Central Universities (No. 2021MS032), and the Nuclear Materials Innovation Foundation (No. WDZC-2023-AW-0305).
Abstract: The evolution of dislocation loops in austenitic steels irradiated with Fe^(+) is investigated by developing a cluster dynamics (CD) model and running CD simulations. The CD predictions are compared with experimental results from the literature. The number density and average diameter of the dislocation loops obtained from the CD simulations are in good agreement with published transmission electron microscopy (TEM) observations of Fe^(+)-irradiated Solution Annealed 304, Cold Worked 316, and HR3 austenitic steels. The CD simulation results demonstrate that the diffusion of in-cascade interstitial clusters plays a major role in dislocation loop density and growth; in particular, for the HR3 austenitic steel, the CD model confirms the effect of temperature on the density and size of the dislocation loops.
Funding: Supported by the National Natural Science Foundation of China (grant Nos. U2038104 and 11703014) and the Bureau of International Cooperation, Chinese Academy of Sciences (GJHZ1864).
Abstract: We study the structural and dynamical properties of A209 based on Chandra and XMM-Newton observations. We obtain detailed temperature, pressure, and entropy maps with the contour binning method, and find a hot region in the NW direction. The X-ray brightness residual map and corresponding temperature profiles reveal a possible shock front in the NW direction and a cold front feature in the SE direction. Combining these with the galaxy luminosity density map, we propose a weak merger scenario. A young sub-cluster passing from the SE to the NW could together explain the optical subpeak, the intracluster medium temperature map, the X-ray surface brightness excess, and the X-ray peak offset.
Funding: This work was supported by the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202300502, KJQN201800539).
Abstract: In clustering algorithms, the selection of neighbors significantly affects the quality of the final clustering results. Various neighbor relationships exist, such as K-nearest neighbors, natural neighbors, and shared neighbors, but most can handle only a single structural relationship, and their identification accuracy is low for datasets with multiple structures. In everyday life, people's first instinct when facing something complex is to divide it into parts. Partitioning the dataset into more sub-graphs is likewise a good approach to identifying complex structures. Taking inspiration from this, we propose a novel neighbor method: Shared Natural Neighbors (SNaN). To demonstrate the superiority of this neighbor method, we propose a shared natural neighbors-based hierarchical clustering algorithm for discovering arbitrary-shaped clusters (HC-SNaN). Our algorithm excels at identifying both spherical clusters and manifold clusters. Tested on synthetic and real-world datasets, HC-SNaN demonstrates significant advantages over existing clustering algorithms, particularly on datasets containing arbitrary shapes.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62171203, in part by the Jiangsu Province "333 Project" High-Level Talent Cultivation Subsidized Project, in part by the Suzhou Key Supporting Subjects for Health Informatics under Grant SZFCXK202147, in part by the Changshu Science and Technology Program under Grants CS202015 and CS202246, and in part by the Changshu Key Laboratory of Medical Artificial Intelligence and Big Data under Grants CYZ202301 and CS202314.
Abstract: In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to exploit the prior knowledge inherent in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that, with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
Abstract: Developing highly active alloy catalysts that surpass the performance of platinum group metals in the oxygen reduction reaction (ORR) is critical in electrocatalysis. Gold-based single-atom alloy (AuSAA) clusters are gaining recognition as promising alternatives due to their potential for high activity. However, enhancing the activity of AuSAA clusters remains challenging because of limited insight into their actual active sites in alkaline environments. Herein, we studied a variety of Au_(54)M_(1) SAA cluster catalysts and revealed that the operando-formed MO_(x)(OH)_(y) complex acts as the crucial active site for catalyzing the ORR under basic solution conditions. The observed volcano plot indicates that Au_(54)Co_(1), Au_(54)M_(1), and Au_(54)Ru_(1) clusters can be the optimal Au_(54)M_(1) SAA cluster catalysts for the ORR. Our findings offer new insights into the actual active sites of AuSAA cluster catalysts, which will inform rational catalyst design in experimental settings.
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFF0711500), the National Natural Science Foundation of China (NSFC, Grant No. 12373097), the Basic and Applied Basic Research Foundation Project of Guangdong Province (No. 2024A1515011503), and the Guangzhou Science and Technology Funds (2023A03J0016).
Abstract: Open clusters (OCs) serve as invaluable tracers for investigating the properties and evolution of stars and galaxies. Despite recent advancements in machine learning clustering algorithms, accurately discerning such clusters remains challenging. We revisited the 3013 samples generated with a hybrid clustering algorithm of FoF and pyUPMASK. A multi-view clustering (MvC) ensemble method was applied, which analyzes each member star of the OC from three perspectives (proper motion, spatial position, and composite views) before integrating the clustering outcomes to deduce more reliable cluster memberships. Based on the MvC results, we further excluded cluster candidates with fewer than ten member stars and obtained 1256 OC candidates. After isochrone fitting and visual inspection, we identified 506 candidate OCs in the Milky Way. In addition to the 493 previously reported candidates, we discovered 13 high-confidence new candidate clusters.
Funding: Supported by the National Key Research and Development Program of China (No. 2022YFB3304400), the National Natural Science Foundation of China (Nos. 6230311, 62303111, 62076060, 61932007, and 62176083), and the Key Research and Development Program of Jiangsu Province of China (No. BE2022157).
Abstract: Traditional Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) clustering algorithms are data-driven, and their objective function minimization process is based on the available numeric data. Recently, knowledge hints have been introduced to form knowledge-driven clustering algorithms, which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints. However, these algorithms cannot determine the optimal number of clusters by themselves; they require the assistance of evaluation indices. Moreover, knowledge hints are usually used as part of the data structure (directly replacing some clustering centers), which severely limits the flexibility of the algorithm and can lead to knowledge misguidance. To solve this problem, this study designs a new knowledge-driven clustering algorithm called PCM clustering with High-density Points (HP-PCM), in which domain knowledge is represented in the form of so-called high-density points. First, a new data density calculation function is proposed. The Density Knowledge Points Extraction (DKPE) method is established to filter out high-density points from the dataset to form knowledge hints. Then, these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure. Finally, the initial number of clusters is set to be greater than the true one, based on the number of knowledge hints, and the HP-PCM algorithm automatically determines the final number of clusters during the clustering process through a cluster elimination mechanism. Experimental studies, including comparative analyses, highlight the effectiveness of the proposed algorithm, such as the increased success rate in clustering, the ability to determine the optimal cluster number, and the faster convergence speed.
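For reference, the standard PCM formulation that HP-PCM extends can be written as below. This is the classical objective, not the modified HP-PCM objective, which the abstract does not spell out; the symbols ($u_{ij}$ for the typicality of point $x_j$ in cluster $i$, $v_i$ for the cluster center, $\eta_i$ for the per-cluster scale, $m > 1$ for the fuzzifier, $c$ clusters, $n$ points) are notation assumed for this sketch.

```latex
% Classical PCM objective: a compactness term plus a penalty that keeps
% typicalities from collapsing to zero.
J_{\mathrm{PCM}}(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \lVert x_j - v_i \rVert^{2}
  + \sum_{i=1}^{c} \eta_i \sum_{j=1}^{n} \left(1 - u_{ij}\right)^{m}

% Alternating updates obtained by setting the partial derivatives to zero.
u_{ij} = \frac{1}{1 + \left( \dfrac{\lVert x_j - v_i \rVert^{2}}{\eta_i} \right)^{\frac{1}{m-1}}},
\qquad
v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} \, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}
```

Because each $u_{ij}$ depends only on the distance to its own center, several clusters can converge to the same location, which is one reason a cluster elimination step, as described in the abstract, can be used to settle the final number of clusters.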