Contrastive learning is a significant research direction in the field of deep learning. However, existing data augmentation methods often lead to issues such as semantic drift in generated views, while the complexity of model pre-training limits further improvement in the performance of existing methods. To address these challenges, we propose the Efficient Clustering Network based on Matrix Factorization (ECN-MF). Specifically, we design a batched low-rank Singular Value Decomposition (SVD) algorithm for data augmentation to eliminate redundant information and uncover major patterns of variation and key information in the data. Additionally, we design a Mutual Information-Enhanced Clustering Module (MI-ECM) that accelerates training by using a simple architecture to bring samples from the same cluster closer while pushing samples from other clusters apart. Extensive experiments on six datasets demonstrate that ECN-MF outperforms state-of-the-art algorithms.
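The batched low-rank SVD idea can be sketched as follows. This is only an illustration under stated assumptions, not the ECN-MF implementation: the function name `low_rank_views`, the `(B, n, d)` batch layout, and the choice of a plain rank truncation are all hypothetical.

```python
import numpy as np


def low_rank_views(batch: np.ndarray, rank: int) -> np.ndarray:
    """Return a rank-truncated reconstruction of every matrix in a batch.

    batch has shape (B, n, d); np.linalg.svd operates on the last two axes,
    so all B matrices are decomposed in one call.
    """
    U, s, Vt = np.linalg.svd(batch, full_matrices=False)
    # Keep only the leading `rank` singular triplets of each matrix.
    U_r = U[..., :, :rank]
    s_r = s[..., :rank]
    Vt_r = Vt[..., :rank, :]
    # Scale the columns of U_r by the singular values, then recombine.
    return (U_r * s_r[..., None, :]) @ Vt_r
```

Each output matrix keeps the dominant directions of variation and discards the smallest singular components, which is one way to read "eliminate redundant information" in the abstract.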
The proliferation of internet communication channels has increased telecom fraud, causing billions of euros in losses for customers and the industry each year. Fraudsters constantly find new ways to engage in illegal activity on the network. To reduce these losses, a new fraud detection approach is required. Telecom fraud detection involves identifying a small number of fraudulent calls from a vast amount of call traffic. Developing an effective strategy to combat fraud has become challenging. Although much effort has been made to detect fraud, most existing methods are designed for batch processing, not real-time detection. To solve this problem, we propose an online fraud detection model using a Neural Factorization Autoencoder (NFA), which analyzes customer calling patterns to detect fraudulent calls. The model employs Neural Factorization Machines (NFM) and an Autoencoder (AE) to model calling patterns and a memory module to adapt to changing customer behaviour. We evaluate our approach on a large dataset of real-world call detail records and compare it with several state-of-the-art methods. Our results show that our approach outperforms the baselines, with an AUC of 91.06%, a TPR of 91.89%, an FPR of 14.76%, and an F1-score of 95.45%. These results demonstrate the effectiveness of our approach in detecting fraud in real time and suggest that it can be a valuable tool for preventing fraud in telecommunications networks.
This study aimed to investigate the pollution characteristics, source apportionment, and health risks associated with trace metal(loid)s (TMs) in the major agricultural producing areas in Chongqing, China. We analyzed the source apportionment and assessed the health risk of TMs in agricultural soils by using a positive matrix factorization (PMF) model and a health risk assessment (HRA) model based on Monte Carlo simulation. We also combined the PMF and HRA models to explore the health risks of TMs in agricultural soils attributable to different pollution sources and thereby determine the priority control factors. Results showed that the average contents of cadmium (Cd), arsenic (As), lead (Pb), chromium (Cr), copper (Cu), nickel (Ni), and zinc (Zn) in the soil were 0.26, 5.93, 27.14, 61.32, 23.81, 32.45, and 78.65 mg/kg, respectively. Spatial analysis and source apportionment analysis revealed that urban and industrial sources, agricultural sources, and natural sources accounted for 33.0%, 27.7%, and 39.3% of TM accumulation in the soil, respectively. In the HRA model based on Monte Carlo simulation, non-carcinogenic risks were deemed negligible (hazard index < 1) and carcinogenic risks were at an acceptable level (10^(-6) < total carcinogenic risk ≤ 10^(-4)), with higher risks observed for children than for adults. The relationship between TMs, their sources, and health risks indicated that urban and industrial sources were primarily associated with As, contributing 75.1% of carcinogenic risks and 55.7% of non-carcinogenic risks, making them the primary control factors. Meanwhile, agricultural sources were primarily linked to Cd and Pb, contributing 13.1% of carcinogenic risks and 21.8% of non-carcinogenic risks, designating them as secondary control factors.
A multi-qubit pure quantum state is called separable when it can be factored as the tensor product of 1-qubit pure quantum states. Factorizing a general multi-qubit pure quantum state into the tensor product of its factors (pure states containing a smaller number of qubits) can be a challenging task, especially for highly entangled states. This paper develops a new criterion for the existence of a given factorization, based on the proportionality of the rows of certain associated matrices, together with a factorization algorithm that follows from this criterion and systematically extracts all the factors. 3-qubit pure states play a crucial role in quantum computing and quantum information processing. For various applications, the well-known 3-qubit GHZ state, which contains two nonzero terms, and the 3-qubit W state, which contains three nonzero terms, have been studied extensively. Using the new factorization algorithm developed here, we perform a complete analysis of the entanglement of 3-qubit states that contain exactly two nonzero terms and exactly three nonzero terms.
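A row-proportionality test of this kind can be illustrated with the standard Schmidt-rank check: a pure state factors across a bipartition exactly when the amplitude matrix for that bipartition has rank 1, i.e. all of its rows are proportional. The sketch below is not claimed to be the paper's algorithm; the function name `factors_across` and the convention that qubit 0 is the most significant index are assumptions.

```python
import numpy as np


def factors_across(psi, subset, tol=1e-10) -> bool:
    """Check whether psi = (state on `subset`) tensor (state on the other qubits).

    psi is a length-2**n amplitude vector; qubit 0 is the most significant
    index (an assumed convention). `subset` lists the qubit positions that
    should split off as one factor.
    """
    psi = np.asarray(psi, dtype=complex)
    n = int(round(np.log2(psi.size)))
    if 2 ** n != psi.size:
        raise ValueError("state vector length must be a power of two")
    subset = list(subset)
    rest = [q for q in range(n) if q not in subset]
    # Rows are indexed by the qubits in `subset`, columns by the remaining qubits.
    tensor = psi.reshape((2,) * n).transpose(subset + rest)
    matrix = tensor.reshape(2 ** len(subset), -1)
    # Rank 1 means every row is a multiple of a single row vector.
    return bool(np.linalg.matrix_rank(matrix, tol=tol) == 1)
```

For example, the GHZ and W states fail the test for every single-qubit subset, while a product of one qubit with a Bell pair passes only for the subset that isolates the unentangled qubit.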
Finding crucial vertices is a key problem for improving the reliability and ensuring the effective operation of networks. Existing approaches based on multiple-attribute decision making suffer from ignoring either the correlation among attributes or the heterogeneity between attributes and structure. To overcome these problems, a novel vertex centrality approach, called VCJG, is proposed based on joint nonnegative matrix factorization and graph embedding. Linearly independent latent attributes and the structure information are captured automatically by applying nonnegative matrix factorization to the weighted adjacency matrix and to the structure matrix, which is generated by graph embedding. A smoothness strategy is applied to eliminate the heterogeneity between attributes and structure through joint nonnegative matrix factorization. VCJG then integrates the above steps into an overall objective function and, by optimizing it, obtains the final latent attributes fused with the structure information of the network. Finally, the attributes are combined with neighborhood rules to evaluate each vertex's importance. Through comparative experiments on nine real-world networks, we demonstrate that the proposed approach outperforms nine state-of-the-art algorithms for identifying vital vertices with respect to correlation, monotonicity, and accuracy of the top-10 vertex ranking.
Background: Establishing an appropriate prognostic model for PCa is essential for its effective treatment. Glycolysis is a vital energy-harvesting mechanism for tumors. Developing a prognostic model for PCa based on glycolysis-related genes is novel and has great potential. Methods: First, gene expression and clinical data of PCa patients were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), and glycolysis-related genes were obtained from the Molecular Signatures Database (MSigDB). Gene enrichment analysis was performed to verify that glycolysis functions were enriched in the genes we obtained, which were used in nonnegative matrix factorization (NMF) to identify clusters. The correlation between clusters and clinical features was discussed, and the differentially expressed genes (DEGs) between the two clusters were investigated. Based on the DEGs, we investigated the biological differences between clusters, including immune cell infiltration, mutation, tumor immune dysfunction and exclusion, immune function, and checkpoint genes. To establish the prognostic model, the genes were filtered based on univariable Cox regression, LASSO, and multivariable Cox regression. Kaplan–Meier analysis and receiver operating characteristic analysis validated the prognostic value of the model. A nomogram of the risk score calculated by the prognostic model and clinical characteristics was constructed to quantitatively estimate the survival probability for PCa patients in the clinical setting. Result: The genes obtained from MSigDB were enriched in glycolysis functions. Two clusters were identified by NMF analysis based on 272 glycolysis-related genes, and a prognostic model based on DEGs between the two clusters was finally established. The prognostic model consisted of LAMPS, SPRN, ATOH1, TANC1, ETV1, TDRD1, KLK14, MESP2, POSTN, CRIP2, NAT1, AKR7A3, PODXL, CARTPT, and PCDHGB2. All sample, training, and test cohorts from The Cancer Genome Atlas (TCGA) and the external validation cohort from GEO showed significant differences between the high-risk and low-risk groups. The area under the ROC curve showed great performance of this prognostic model. Conclusion: A prognostic model based on glycolysis-related genes was established, with great performance and potential significance for clinical application.
Software-Defined Network (SDN) decouples the control plane of network devices from the data plane. While alleviating the problems present in traditional network architectures, it also brings potential security risks, particularly network Denial-of-Service (DoS) attacks. While many research efforts have been devoted to identifying new features for DoS attack detection, detection methods are less accurate in detecting DoS attacks against client hosts due to the high stealth of such attacks. To solve this problem, a new method of DoS attack detection in SDN based on the Deep Factorization Machine (DeepFM) is proposed. First, we select the Growth Rate of Max Matched Packets (GRMMP) in SDN as the detection feature. Then, the DeepFM algorithm is used to extract features from flow rules and classify them into dense and discrete features to detect DoS attacks. After training, the model can be used to infer whether the SDN is under DoS attack, and a DeepFM-based detection method for DoS attacks against client hosts is implemented. Simulation results show that our method can effectively detect DoS attacks in SDN. Compared with the K-Nearest Neighbor (K-NN), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Random Forest models, our proposed method achieves higher accuracy, precision, and F1 values.
Deep matrix factorization (DMF) has been demonstrated to be a powerful tool to take in the complex hierarchical information of multi-view data (MDR). However, existing multi-view DMF methods mainly explore the consistency of multi-view data, while neglecting the diversity among different views as well as the high-order relationships of data, resulting in the loss of valuable complementary information. In this paper, we design a hypergraph regularized diverse deep matrix factorization (HDDMF) model for multi-view data representation, to jointly utilize multi-view diversity and a high-order manifold in a multilayer factorization framework. A novel diversity enhancement term is designed to exploit the structural complementarity between different views of data. Hypergraph regularization is utilized to preserve the high-order geometry structure of data in each view. An efficient iterative optimization algorithm is developed to solve the proposed model with theoretical convergence analysis. Experimental results on five real-world data sets demonstrate that the proposed method significantly outperforms state-of-the-art multi-view learning approaches.
Data volumes are enormous today because of the extensive use of the World Wide Web, social media, and intelligent systems. This data can be very important and useful if it is harnessed carefully and correctly. Useful information can be extracted from this massive data using the data mining process. The information extracted can be used to make vital decisions in various industries. Clustering is a very popular data mining method which divides the data points into different groups such that all similar data points form a part of the same group. Clustering methods are of various types. Many parameters and indexes exist for the evaluation and comparison of these methods. In this paper, we compare the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM), and Clustering Large Application (CLARA) on secure perturbed data. We compare the methods and identify which performs better for analyzing data perturbed using Extended NMF, on the basis of the values of various indexes such as the Dunn Index, Silhouette Index, Xie-Beni Index, and Davies-Bouldin Index.
Let $G$ be a graph and $k_1, \ldots, k_m$ be positive integers. If the edges of $G$ can be decomposed into edge-disjoint factors $F_1, \ldots, F_m$, where $F_i$ is a $[0, k_i]$-factor, then we say that $\bar{F} = \{F_1, \ldots, F_m\}$ is a $[0, k_i]_1^m$-factorization of $G$. If $H$ is a subgraph of $G$ with $m$ edges and $|E(H) \cap E(F_i)| = 1$ for all $1 \le i \le m$, then we say that $\bar{F}$ is orthogonal to $H$. It is proved that if $G$ is a $[0, k_1 + \cdots + k_m - m + 1]$-graph and $H$ is a subgraph of $G$ with $m$ edges, then $G$ has a $[0, k_i]_1^m$-factorization orthogonal to $H$.
We present a numerical method for solving the indefinite least squares problem. We first normalize the coefficient matrix. Then we compute the hyperbolic QR factorization of the normalized matrix. Finally we compute the solution by solving several triangular systems. We give a first-order error analysis to show that the method is backward stable. The method is more efficient than the backward stable method proposed by Chandrasekaran, Gu and Sayed.
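For orientation, the indefinite least squares problem minimizes $(b - Ax)^T J (b - Ax)$ with $J = \mathrm{diag}(I_p, -I_q)$, and it has a unique minimizer when $A^T J A$ is positive definite. The sketch below solves the normal equations $A^T J A\,x = A^T J b$ directly. It is a simple baseline for checking results, not the hyperbolic QR method of the abstract and not backward stable in general; the function name and argument layout are assumptions.

```python
import numpy as np


def solve_ils_normal_equations(A, b, p):
    """Solve min_x (b - A x)^T J (b - A x), J = diag(I_p, -I_q), via normal equations.

    The first p rows of A carry weight +1 and the remaining rows weight -1.
    Baseline only: forming A^T J A squares the conditioning of the problem.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    signs = np.ones(A.shape[0])
    signs[p:] = -1.0
    JA = signs[:, None] * A          # J @ A without forming J explicitly
    M = A.T @ JA                     # A^T J A
    np.linalg.cholesky(M)            # raises LinAlgError if M is not positive definite
    return np.linalg.solve(M, JA.T @ b)
```

At the solution the gradient $A^T J (b - Ax)$ vanishes, which gives a quick check for any other solver applied to the same problem.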
In engineering practice, processing big data sometimes requires building a large number of physical models, and processing each physical model requires a corresponding mathematical model. This produces a large number of multivariate polynomials, whose effective reduction remains a difficult problem. A novel algorithm is proposed for the approximate factorization of multivariate polynomials with complex coefficients, based on the characteristics of such polynomials. First, the multivariate polynomial is reduced to a bivariate polynomial; then approximate factorization of the bivariate polynomial yields irreducible bivariate factors; finally, the irreducible bivariate factors are restored to irreducible multivariate factors. Because roots of unity are cyclic, selecting a root of unity as the reduction factor ensures that the coefficients do not grow during reduction. The Chinese remainder theorem is adopted in the corresponding reduction process, which lowers the computational complexity. The algorithm is based on approximate factorization of bivariate polynomials and computation of the approximate greatest common divisor (GCD). The algorithm can solve the reduction of multivariate polynomials in massive mathematical models and effectively obtain the zeros of multivariate polynomials, providing a new approach for further analysis and explanation of physical models. Experimental results show that the irreducible factors obtained by this method are close to the true factors and are computed with high efficiency.
Let G be a graph and f an integer-valued function defined on V(G). It is proved that every (0,mf - m+1)-graph G has a (0,f)-factorization orthogonal to any given subgraph with m edges.
In this paper, it is shown that a sufficient condition for the existence of a $K_{1,p^k}$-factorization of $K_{m,n}$, whenever $p$ is a prime number and $k$ is a positive integer, is (1) $m \le p^k n$, (2) $n \le p^k m$, (3) $p^k n - m \equiv p^k m - n \equiv 0 \pmod{p^{2k} - 1}$ and (4) $(p^k n - m)(p^k m - n) \equiv 0 \pmod{(p^k - 1)\,p^k\,(p^{2k} - 1)(m + n)}$.
Let $G$ be a graph and $g, f$ be two nonnegative integer-valued functions defined on the vertex set $V(G)$ of $G$ with $g \le f$. A $(g, f)$-factor of a graph $G$ is a spanning subgraph $F$ of $G$ such that $g(x) \le d_F(x) \le f(x)$ for all $x \in V(G)$. If $G$ itself is a $(g, f)$-factor, then $G$ is said to be a $(g, f)$-graph. If the edges of $G$ can be decomposed into edge-disjoint $(g, f)$-factors, then $G$ is said to be $(g, f)$-factorable. In this paper, one sufficient condition for a graph to be $(g, f)$-factorable is given.
A K1,k-factorization of λKm,n is a set of edge-disjoint K1,k-factors of λKm,n, which partition the set of edges of λKm,n. In this paper, it is proved that a sufficient condition for the existence of a K1,k-factorization of λKm,n, whenever k is any positive integer, is that (1) m ≤ kn, (2) n ≤ km, (3) km - n ≡ kn - m ≡ 0 (mod (k^2 - 1)) and (4) λ(km - n)(kn - m) ≡ 0 (mod k(k - 1)(k^2 - 1)(m + n)).
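The four conditions are purely arithmetic, so they can be checked mechanically. The sketch below is an illustration only: the function name is hypothetical, condition (3) is read as requiring both km - n and kn - m to be divisible by k^2 - 1, and k ≥ 2 is assumed so that no modulus is zero.

```python
def satisfies_star_factorization_condition(m: int, n: int, k: int, lam: int = 1) -> bool:
    """Check the sufficient conditions (1)-(4) for a K1,k-factorization of lam*Km,n.

    Assumes k >= 2 (for k = 1 the moduli k - 1 and k^2 - 1 vanish).
    A True result means the stated sufficient condition holds.
    """
    if k < 2 or m < 1 or n < 1 or lam < 1:
        raise ValueError("expected k >= 2 and positive m, n, lam")
    a = k * m - n  # non-negative exactly when n <= k*m, condition (2)
    b = k * n - m  # non-negative exactly when m <= k*n, condition (1)
    if a < 0 or b < 0:
        return False
    base = k * k - 1
    # Condition (3): both differences are multiples of k^2 - 1.
    if a % base != 0 or b % base != 0:
        return False
    # Condition (4): lam*(km - n)(kn - m) is a multiple of k(k - 1)(k^2 - 1)(m + n).
    return (lam * a * b) % (k * (k - 1) * base * (m + n)) == 0
```

For instance, with k = 2 the pair m = n = 12 satisfies all four conditions for λ = 1, while m = n = 3 fails condition (4) for λ = 1 but satisfies it for λ = 4.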
Let G be an (mg, mf)-graph, where g and f are integer-valued functions defined on V(G) such that 0 ≤ g(x) ≤ f(x) for each x ∈ V(G). It is proved that: (1) If Z ≠ ∅, where both g and f need not be even, then G has a (g, f)-factorization, where Z = {x ∈ V(G): mf(x) - dG(x) ≤ t(x) or dG(x) - mg(x) ≤ t(x), t(x) = f(x) - g(x) > 0}. (2) Let G be an m-regular graph with 2n vertices, m ≥ n. If (P1, P2, ..., Pr) is a partition of m with P1 ≡ m (mod 2) and Pi ≡ 0 (mod 2) for i = 2, ..., r, then the edge set E(G) of G can be partitioned into r parts E1, E2, ..., Er such that G[Ei] is a Pi-factor of G.
Due to the non-stationary characteristics of vibration signals acquired from rolling element bearing faults, time-frequency analysis is often applied to describe the local information of these unstable signals. However, it is difficult to classify the high-dimensional feature matrix directly because its dimensions are too large for many classifiers. This paper combines the concepts of time-frequency distribution (TFD) and non-negative matrix factorization (NMF), and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, a supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and adaptive clustering of NMF, and the localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, experiments on nine kinds of bearing faults are performed on a test bench. The proposed method can effectively identify the fault severity and different fault types. Moreover, in comparison with the artificial neural network (ANN), NMF yields 99.3% mean accuracy, which is much superior to ANN. This research presents a simple and practical solution to the fault diagnosis problem of rolling element bearings in high-dimensional feature space.
This paper proposes a graph regularized L_p smooth non-negative matrix factorization (GSNMF) method by incorporating graph regularization and an L_p smoothing constraint, which considers the intrinsic geometric information of a data set and produces smooth and stable solutions. The main contributions are as follows: first, graph regularization is added into NMF to discover the hidden semantics and simultaneously respect the intrinsic geometric structure information of a data set. Second, the L_p smoothing constraint is incorporated into NMF to combine the merits of isotropic (L_2-norm) and anisotropic (L_1-norm) diffusion smoothing, and produces a smooth and more accurate solution to the optimization problem. Finally, the update rules and proof of convergence of GSNMF are given. Experiments on several data sets show that the proposed method outperforms related state-of-the-art methods.
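The graph-regularization ingredient can be sketched with the well-known multiplicative updates for graph regularized NMF, which minimize ||X - U V^T||_F^2 + λ Tr(V^T L V) with L = D - A. This is only an illustration and not the GSNMF update rules: the L_p smoothing term is omitted, and the function name, argument names, and defaults are assumptions.

```python
import numpy as np


def graph_regularized_nmf(X, k, A, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative X (m x n, columns are samples) as U @ V.T.

    A is a symmetric nonnegative (n x n) affinity matrix over the samples.
    Returns U (m x k), V (n x k) and the objective value after each iteration.
    The L_p smoothing term of GSNMF is NOT included in this sketch.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    V = rng.random((n, k)) + eps
    D = np.diag(A.sum(axis=1))   # degree matrix
    L = D - A                    # graph Laplacian
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (A @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        obj = np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
        history.append(obj)
    return U, V, history
```

The affinity matrix would typically come from a nearest-neighbor graph over the samples; the trace term then pulls the rows of V for neighboring samples toward each other.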
Funding (ECN-MF): Supported by the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024), the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012), the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021), the Youth Foundation Project of Hainan Natural Science Foundation (621QN211), and the Innovative Research Project for Graduate Students in Hainan Province (Grant Nos. Qhys2023-96, Qhys2023-95).
Funding (NFA telecom fraud detection): This research was conducted in cooperation with members of the DETSI project, supported by BPI France, Pays de Loire, and Auvergne Rhone Alpes.
Funding (Chongqing soil TM study): Supported by the Project of Chongqing Science and Technology Bureau (cstc2022jxjl0005).
Funding (VCJG): Project supported by the National Natural Science Foundation of China (Grant Nos. 62162040 and 11861045).
Funding (PCa glycolysis prognostic model): Supported by the Public Health Research Project in Futian District, Shenzhen (Grant Nos. FTWS2020026, FTWS2021073).
Funding: This work was funded by the Researchers Supporting Project No. (RSP-2021/102), King Saud University, Riyadh, Saudi Arabia. This work was supported by the Research Project on Teaching Reform of General Colleges and Universities in Hunan Province (Grant No. HNJG-2020-0261), China.
Abstract: Software-Defined Network (SDN) decouples the control plane of network devices from the data plane. While alleviating the problems of traditional network architectures, it also brings potential security risks, particularly network Denial-of-Service (DoS) attacks. Although many research efforts have been devoted to identifying new features for DoS attack detection, existing detection methods are less accurate in detecting DoS attacks against client hosts because such attacks are highly stealthy. To solve this problem, a new method of DoS attack detection in SDN based on the Deep Factorization Machine (DeepFM) is proposed. First, we select the Growth Rate of Max Matched Packets (GRMMP) in SDN as the detection feature. Then, the DeepFM algorithm is used to extract features from flow rules and classify them into dense and discrete features to detect DoS attacks. After training, the model can be used to infer whether the SDN is under a DoS attack, and a DeepFM-based detection method for DoS attacks against client hosts is implemented. Simulation results show that our method can effectively detect DoS attacks in SDN. Compared with the K-Nearest Neighbor (K-NN), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Random Forest models, the proposed method achieves higher accuracy, precision, and F1 values.
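As a point of reference for the factorization-machine part of DeepFM, the sketch below computes the standard second-order FM interaction term in two ways: the direct sum over feature pairs and the usual linear-time rearrangement. It is only an illustration of that general term, not the detection method from the abstract; the function names, the feature vector `x`, and the latent-factor matrix `V` are hypothetical.

```python
def fm_pairwise_naive(x, V):
    """Sum over i < j of <V[i], V[j]> * x[i] * x[j] (quadratic in len(x))."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    k = len(V[0])  # number of latent factors per feature
    total = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - s_sq
    return 0.5 * total


# Hypothetical toy input: three features, two latent factors each.
x = [1.0, 0.0, 2.0]
V = [[1.0, 2.0], [3.0, 4.0], [0.5, -1.0]]
print(fm_pairwise_naive(x, V), fm_pairwise_fast(x, V))
```

The second form is why factorization machines scale to sparse, high-dimensional inputs: the cost is proportional to the number of features times the number of latent factors.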
Funding: This work was supported by the National Natural Science Foundation of China (62073087, 62071132, 61973090).
Abstract: Deep matrix factorization (DMF) has been demonstrated to be a powerful tool for capturing the complex hierarchical information of multi-view data (MDR). However, existing multi-view DMF methods mainly explore the consistency of multi-view data, while neglecting the diversity among different views as well as the high-order relationships of data, resulting in the loss of valuable complementary information. In this paper, we design a hypergraph regularized diverse deep matrix factorization (HDDMF) model for multi-view data representation, to jointly utilize multi-view diversity and a high-order manifold in a multilayer factorization framework. A novel diversity enhancement term is designed to exploit the structural complementarity between different views of data. Hypergraph regularization is utilized to preserve the high-order geometric structure of data in each view. An efficient iterative optimization algorithm is developed to solve the proposed model, with theoretical convergence analysis. Experimental results on five real-world data sets demonstrate that the proposed method significantly outperforms state-of-the-art multi-view learning approaches.
Abstract: Data volumes are enormous today because of the extensive use of the World Wide Web, social media, and intelligent systems. This data can be very important and useful if it is harnessed carefully and correctly. Useful information can be extracted from this massive data using the data mining process, and the extracted information can be used to make vital decisions in various industries. Clustering is a very popular data mining method that divides data points into groups such that similar data points belong to the same group. Clustering methods are of various types, and many parameters and indexes exist for evaluating and comparing them. In this paper, we compare the partitioning-based methods K-Means, Fuzzy C-Means (FCM), Partitioning Around Medoids (PAM), and Clustering Large Applications (CLARA) on securely perturbed data. The comparison identifies which method performs better for analyzing data perturbed using Extended NMF, on the basis of the values of indexes such as the Dunn Index, Silhouette Index, Xie-Beni Index, and Davies-Bouldin Index.
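To make one of the validity indexes named above concrete, here is a minimal sketch of the Dunn index in its common form: the smallest distance between points of different clusters divided by the largest cluster diameter. Several variants of the index exist, and the abstract does not say which one the paper uses, so the variant, the function name, and the toy clusters are assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dunn_index(clusters):
    """Dunn index for a list of clusters, each a list of coordinate tuples.

    Uses minimum inter-cluster point distance over maximum cluster diameter.
    Higher values indicate compact, well-separated clusters.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    min_separation = min(
        dist(p, q)
        for a in range(len(clusters))
        for b in range(a + 1, len(clusters))
        for p in clusters[a]
        for q in clusters[b]
    )
    max_diameter = max(dist(p, q) for c in clusters for p in c for q in c)
    if max_diameter == 0:
        raise ValueError("all clusters are single points; index is undefined")
    return min_separation / max_diameter


# Hypothetical toy clustering: two tight pairs that are far apart.
toy = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
print(dunn_index(toy))
```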
Abstract: Let $G$ be a graph and let $k_1, \ldots, k_m$ be positive integers. If the edges of $G$ can be decomposed into edge-disjoint factors $F_1, \ldots, F_m$, where $F_i$ is a $[0, k_i]$-factor, then $\bar{F} = \{F_1, \ldots, F_m\}$ is called a $[0, k_i]_1^m$-factorization of $G$. If $H$ is a subgraph of $G$ with $m$ edges and $|E(H) \cap E(F_i)| = 1$ for all $1 \le i \le m$, then $\bar{F}$ is said to be orthogonal to $H$. It is proved that if $G$ is a $[0, k_1 + \cdots + k_m - m + 1]$-graph and $H$ is a subgraph of $G$ with $m$ edges, then $G$ has a $[0, k_i]_1^m$-factorization orthogonal to $H$.
Abstract: We present a numerical method for solving the indefinite least squares problem. We first normalize the coefficient matrix. Then we compute the hyperbolic QR factorization of the normalized matrix. Finally, we compute the solution by solving several triangular systems. We give a first-order error analysis to show that the method is backward stable. The method is more efficient than the backward stable method proposed by Chandrasekaran, Gu and Sayed.
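For orientation, the indefinite least squares problem is usually stated as below. The symbols $A$, $b$, $J$, $p$, $q$, $Q$, and $R$ are not defined in the abstract and are introduced here as assumptions. The last two lines sketch one standard way a hyperbolic QR factorization leads to triangular solves, assuming such a factorization is available; the normalization step and the exact procedure of the paper are not reproduced.

```latex
% Indefinite least squares (ILS): A is m-by-n, b has length m, p + q = m.
\min_{x \in \mathbb{R}^n} \; (b - Ax)^{T} J \,(b - Ax),
\qquad J = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}

% Normal equations; the minimizer is unique when A^T J A is positive definite.
A^{T} J A \, x = A^{T} J \, b

% Hyperbolic QR (assumed form, p >= n): Q is J-orthogonal, R is upper triangular.
A = Q \begin{pmatrix} R \\ 0 \end{pmatrix}, \quad Q^{T} J Q = J
\;\Longrightarrow\; A^{T} J A = R^{T} R

% Two triangular solves then give x.
R^{T} y = A^{T} J \, b, \qquad R \, x = y
```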
Abstract: In engineering practice, processing big data sometimes requires building a large number of physical models, and each physical model requires a corresponding mathematical model. This produces a large number of multivariate polynomials, whose effective reduction remains a difficult problem. A novel algorithm is proposed for the approximate factorization of multivariate polynomials with complex coefficients, based on the characteristics of multivariate polynomials. First, the multivariate polynomial is reduced to a bivariate polynomial; then approximate factorization of the bivariate polynomial produces irreducible bivariate factors; finally, the irreducible bivariate factors are lifted back to irreducible multivariate factors. Because roots of unity are cyclic, selecting a root of unity as the reduction factor ensures that the coefficients do not grow during the reduction. The Chinese remainder theorem is used in the corresponding reduction process, which lowers the computational complexity. The algorithm is based on approximate factorization of bivariate polynomials and computation of the approximate greatest common divisor (GCD). It can handle the reduction of multivariate polynomials in large collections of mathematical models and can effectively obtain the zeros of multivariate polynomials, providing a new approach for further analysis and explanation of physical models. Experimental results show that the irreducible factors produced by this method are close to the true factors and are obtained with high efficiency.
Abstract: Let G be a graph and f an integer-valued function defined on V(G). It is proved that every (0, mf - m + 1)-graph G has a (0, f)-factorization orthogonal to any given subgraph with m edges.
Abstract: In this paper, it is shown that a sufficient condition for the existence of a $K_{1,p^k}$-factorization of $K_{m,n}$, whenever $p$ is a prime number and $k$ is a positive integer, is (1) $m \le p^k n$, (2) $n \le p^k m$, (3) $p^k n - m \equiv p^k m - n \equiv 0 \pmod{p^{2k} - 1}$ and (4) $(p^k n - m)(p^k m - n) \equiv 0 \pmod{(p^k - 1)\, p^k (p^{2k} - 1)(m + n)}$.
Abstract: Let $G$ be a graph and let $g$, $f$ be two nonnegative integer-valued functions defined on the vertex set $V(G)$ of $G$ with $g \le f$. A $(g, f)$-factor of $G$ is a spanning subgraph $F$ of $G$ such that $g(x) \le d_F(x) \le f(x)$ for all $x \in V(G)$. If $G$ itself is a $(g, f)$-factor, then $G$ is called a $(g, f)$-graph. If the edges of $G$ can be decomposed into edge-disjoint $(g, f)$-factors, then $G$ is said to be $(g, f)$-factorable. In this paper, a sufficient condition for a graph to be $(g, f)$-factorable is given.
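The definitions above translate directly into a small checker. The sketch below assumes a simple undirected graph given as a vertex list and a list of edge pairs; the function names and the dictionary encoding of g and f are hypothetical. It only verifies a proposed factorization and does not implement the sufficient condition of the paper, which the abstract does not state.

```python
def degrees(vertices, edges):
    """Degree of every vertex in the spanning subgraph with the given edges."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_gf_factor(vertices, factor_edges, g, f):
    """True if g(x) <= d_F(x) <= f(x) for every vertex x."""
    deg = degrees(vertices, factor_edges)
    return all(g[v] <= deg[v] <= f[v] for v in vertices)


def is_gf_factorization(vertices, edges, factors, g, f):
    """True if the factors are edge-disjoint (g, f)-factors covering all edges."""
    used = [frozenset(e) for factor in factors for e in factor]
    if len(used) != len(set(used)):  # some edge appears twice
        return False
    if set(used) != {frozenset(e) for e in edges}:  # not a decomposition of E(G)
        return False
    return all(is_gf_factor(vertices, factor, g, f) for factor in factors)


# Hypothetical example: the 4-cycle splits into two perfect matchings.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
factors = [[(0, 1), (2, 3)], [(1, 2), (3, 0)]]
ones = {v: 1 for v in vertices}
print(is_gf_factorization(vertices, edges, factors, ones, ones))
```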
Funding: Supported by the National Natural Science Foundation of China (10571133).
Abstract: A $K_{1,k}$-factorization of $\lambda K_{m,n}$ is a set of edge-disjoint $K_{1,k}$-factors of $\lambda K_{m,n}$ which partition the set of edges of $\lambda K_{m,n}$. In this paper, it is proved that a sufficient condition for the existence of a $K_{1,k}$-factorization of $\lambda K_{m,n}$, whenever $k$ is any positive integer, is that (1) $m \le kn$, (2) $n \le km$, (3) $km - n \equiv kn - m \equiv 0 \pmod{k^2 - 1}$ and (4) $\lambda(km - n)(kn - m) \equiv 0 \pmod{k(k - 1)(k^2 - 1)(m + n)}$.
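The four conditions are plain integer arithmetic, so they can be checked mechanically. The sketch below does that under two assumptions: condition (3) is read as two congruences modulo k^2 - 1, and k is at least 2 so that the moduli are nonzero. The function name is hypothetical, and the check covers only the sufficient condition as stated; a False result does not prove that no factorization exists.

```python
def star_factorization_conditions(m, n, k, lam=1):
    """Check conditions (1)-(4) for a K_{1,k}-factorization of lam * K_{m,n}."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2 so the moduli are nonzero")
    a = k * m - n
    b = k * n - m
    cond1 = m <= k * n
    cond2 = n <= k * m
    cond3 = a % (k * k - 1) == 0 and b % (k * k - 1) == 0
    cond4 = (lam * a * b) % (k * (k - 1) * (k * k - 1) * (m + n)) == 0
    return cond1 and cond2 and cond3 and cond4


# K_{3,3} with k = 2: conditions (1)-(3) hold, (4) needs lam to be a multiple of 4.
print(star_factorization_conditions(3, 3, 2, lam=1))
print(star_factorization_conditions(3, 3, 2, lam=4))
```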
Funding: Hunan Provincial Educational Department (03C496).
Abstract: Let G be an (mg, mf)-graph, where g and f are integer-valued functions defined on V(G) such that 0 ≤ g(x) ≤ f(x) for each x ∈ V(G). It is proved that: (1) If Z ≠ ∅, then G has a (g, f)-factorization, where g and f need not be even and Z = {x ∈ V(G) : mf(x) - d_G(x) ≤ t(x) or d_G(x) - mg(x) ≤ t(x), t(x) = f(x) - g(x) > 0}. (2) Let G be an m-regular graph with 2n vertices, m ≥ n. If (P_1, P_2, ..., P_r) is a partition of m with P_1 ≡ m (mod 2) and P_i ≡ 0 (mod 2) for i = 2, ..., r, then the edge set E(G) can be partitioned into r parts E_1, E_2, ..., E_r such that G[E_i] is a P_i-factor of G.
Funding: Supported by the Shaanxi Provincial Overall Innovation Project of Science and Technology, China (Grant No. 2013KTCQ01-06).
Abstract: Because vibration signals acquired from rolling element bearing faults are non-stationary, time-frequency analysis is often applied to describe the local information of these signals. However, the resulting high-dimensional feature matrix is difficult to classify directly, because its dimensionality is too large for many classifiers. This paper combines time-frequency distribution (TFD) with non-negative matrix factorization (NMF) and proposes a novel TFD matrix factorization method to enhance the representation and identification of bearing faults. In this method, the TFD of a vibration signal is first computed with the short-time Fourier transform (STFT) to describe the localized faults. Then, a supervised NMF mapping is adopted to extract the fault features from the TFD. Meanwhile, the fault samples can be clustered and recognized automatically by using the clustering property of NMF. The proposed method takes advantage of the parts-based representation and adaptive clustering of NMF, and the localized fault features of interest can be extracted as well. To evaluate the performance of the proposed method, experiments on nine kinds of bearing faults are performed on a test bench. The proposed method can effectively identify both the fault severity and the different fault types. Moreover, NMF yields a mean accuracy of 99.3%, which is much higher than that of the artificial neural network (ANN) used for comparison. This research presents a simple and practical solution to the fault diagnosis problem of rolling element bearings in a high-dimensional feature space.
Funding: Supported by the National Natural Science Foundation of China (61702251, 61363049, 11571011); the State Scholarship Fund of China Scholarship Council (CSC) (201708360040); the Natural Science Foundation of Jiangxi Province (20161BAB212033); the Natural Science Basic Research Plan in Shaanxi Province of China (2018JM6030); the Doctor Scientific Research Starting Foundation of Northwest University (338050050); and the Youth Academic Talent Support Program of Northwest University.
Abstract: This paper proposes a graph regularized L_p smooth non-negative matrix factorization (GSNMF) method that incorporates graph regularization and an L_p smoothing constraint. The method considers the intrinsic geometric information of a data set and produces smooth and stable solutions. The main contributions are as follows. First, graph regularization is added to NMF to discover the hidden semantics while respecting the intrinsic geometric structure of a data set. Second, the L_p smoothing constraint is incorporated into NMF to combine the merits of isotropic (L_2-norm) and anisotropic (L_1-norm) diffusion smoothing, producing a smooth and more accurate solution to the optimization problem. Finally, the update rules and a proof of convergence of GSNMF are given. Experiments on several data sets show that the proposed method outperforms related state-of-the-art methods.
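GSNMF extends standard NMF, so a minimal sketch of the unregularized baseline may help fix ideas. The code below uses the well-known multiplicative updates for the squared-error objective; it contains neither the graph regularization term nor the L_p smoothing constraint, and it is not the update rule derived in the paper. All function names, the toy matrix, and the default parameters are hypothetical.

```python
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def multiplicative_step(M, numer, denom, eps):
    """Elementwise M * numer / (denom + eps); keeps entries nonnegative."""
    return [[m * a / (d + eps) for m, a, d in zip(rm, ra, rd)]
            for rm, ra, rd in zip(M, numer, denom)]


def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Plain NMF, V ~ W H with W, H >= 0, by alternating multiplicative updates."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in V]
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(rank)]
    errors = [squared_error(V, W, H)]
    for _ in range(iters):
        Wt = transpose(W)
        H = multiplicative_step(H, matmul(Wt, V), matmul(matmul(Wt, W), H), eps)
        Ht = transpose(H)
        W = multiplicative_step(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), eps)
        errors.append(squared_error(V, W, H))
    return W, H, errors


# Hypothetical toy data: columns 0-1 and columns 2-3 use disjoint rows.
V = [[5.0, 5.0, 0.0, 0.0],
     [4.0, 4.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 3.0],
     [0.0, 0.0, 6.0, 6.0]]
W, H, errors = nmf(V, rank=2)
print(errors[0], errors[-1])
```

A common way to use the result for clustering is to assign each column of V to the row of H with the largest coefficient in that column.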