A recommender system(RS)relying on latent factor analysis usually adopts stochastic gradient descent(SGD)as its learning algorithm.However,owing to its serial mechanism,an SGD algorithm suffers from low efficiency and...A recommender system(RS)relying on latent factor analysis usually adopts stochastic gradient descent(SGD)as its learning algorithm.However,owing to its serial mechanism,an SGD algorithm suffers from low efficiency and scalability when handling large-scale industrial problems.Aiming at addressing this issue,this study proposes a momentum-incorporated parallel stochastic gradient descent(MPSGD)algorithm,whose main idea is two-fold:a)implementing parallelization via a novel datasplitting strategy,and b)accelerating convergence rate by integrating momentum effects into its training process.With it,an MPSGD-based latent factor(MLF)model is achieved,which is capable of performing efficient and high-quality recommendations.Experimental results on four high-dimensional and sparse matrices generated by industrial RS indicate that owing to an MPSGD algorithm,an MLF model outperforms the existing state-of-the-art ones in both computational efficiency and scalability.展开更多
Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discoveri...Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.展开更多
The distributed nonconvex optimization problem of minimizing a global cost function formed by a sum of n local cost functions by using local information exchange is considered.This problem is an important component of...The distributed nonconvex optimization problem of minimizing a global cost function formed by a sum of n local cost functions by using local information exchange is considered.This problem is an important component of many machine learning techniques with data parallelism,such as deep learning and federated learning.We propose a distributed primal-dual stochastic gradient descent(SGD)algorithm,suitable for arbitrarily connected communication networks and any smooth(possibly nonconvex)cost functions.We show that the proposed algorithm achieves the linear speedup convergence rate O(1/(√nT))for general nonconvex cost functions and the linear speedup convergence rate O(1/(nT)) when the global cost function satisfies the Polyak-Lojasiewicz(P-L)condition,where T is the total number of iterations.We also show that the output of the proposed algorithm with constant parameters linearly converges to a neighborhood of a global optimum.We demonstrate through numerical experiments the efficiency of our algorithm in comparison with the baseline centralized SGD and recently proposed distributed SGD algorithms.展开更多
针对多径环境下异步长短码直扩码分多址信号(long and short code direct sequence code division multiple access,LSC-DS-CDMA)伪码估计难的问题,提出一种基于张量分解和联合估计的伪码估计方法,采用重叠窗对接收信号进行分段并构建...针对多径环境下异步长短码直扩码分多址信号(long and short code direct sequence code division multiple access,LSC-DS-CDMA)伪码估计难的问题,提出一种基于张量分解和联合估计的伪码估计方法,采用重叠窗对接收信号进行分段并构建张量模型。为改善传统线性步长搜索算法结合梯度下降的方法分解因子矩阵收敛较慢的问题,提出改进的线性步长搜索算法,结合使用动量梯度下降法对各子张量进行Tucker分解得到各因子矩阵,所需的迭代次数大大减少;利用接收增益矩阵和移位相乘解决复合码的排序模糊和幅度模糊问题;利用最大似然准则联合估计复合码和多径信道后,使用梅西算法和相关运算估计每个用户的长码和短码。仿真结果表明,该方法能够有效估计多径异步LSC-DS-CDMA信号的伪码。展开更多
基金supported in part by the National Natural Science Foundation of China(61772493)the Deanship of Scientific Research(DSR)at King Abdulaziz University(RG-48-135-40)+1 种基金Guangdong Province Universities and College Pearl River Scholar Funded Scheme(2019)the Natural Science Foundation of Chongqing(cstc2019jcyjjqX0013)。
文摘A recommender system(RS)relying on latent factor analysis usually adopts stochastic gradient descent(SGD)as its learning algorithm.However,owing to its serial mechanism,an SGD algorithm suffers from low efficiency and scalability when handling large-scale industrial problems.Aiming at addressing this issue,this study proposes a momentum-incorporated parallel stochastic gradient descent(MPSGD)algorithm,whose main idea is two-fold:a)implementing parallelization via a novel datasplitting strategy,and b)accelerating convergence rate by integrating momentum effects into its training process.With it,an MPSGD-based latent factor(MLF)model is achieved,which is capable of performing efficient and high-quality recommendations.Experimental results on four high-dimensional and sparse matrices generated by industrial RS indicate that owing to an MPSGD algorithm,an MLF model outperforms the existing state-of-the-art ones in both computational efficiency and scalability.
基金Deputyship for Research&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the Project Number RI-44-0444.
文摘Datamining plays a crucial role in extractingmeaningful knowledge fromlarge-scale data repositories,such as data warehouses and databases.Association rule mining,a fundamental process in data mining,involves discovering correlations,patterns,and causal structures within datasets.In the healthcare domain,association rules offer valuable opportunities for building knowledge bases,enabling intelligent diagnoses,and extracting invaluable information rapidly.This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System(MLARMC-HDMS).The MLARMC-HDMS technique integrates classification and association rule mining(ARM)processes.Initially,the chimp optimization algorithm-based feature selection(COAFS)technique is employed within MLARMC-HDMS to select relevant attributes.Inspired by the foraging behavior of chimpanzees,the COA algorithm mimics their search strategy for food.Subsequently,the classification process utilizes stochastic gradient descent with a multilayer perceptron(SGD-MLP)model,while the Apriori algorithm determines attribute relationships.We propose a COA-based feature selection approach for medical data classification using machine learning techniques.This approach involves selecting pertinent features from medical datasets through COA and training machine learning models using the reduced feature set.We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers.Experimental results demonstrate that our proposed approach surpasses alternative feature selection methods,achieving higher accuracy and precision rates in medical data classification tasks.The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features,thereby enhancing the diagnosis and treatment of various diseases.To provide further validation,we conduct detailed experiments on a benchmark medical dataset,revealing the superiority of the MLARMCHDMS model over other methods,with a maximum accuracy of 99.75%.Therefore,this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis.The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
基金supported by the Knut and Alice Wallenberg Foundationthe Swedish Foundation for Strategic Research+1 种基金the Swedish Research Councilthe National Natural Science Foundation of China(62133003,61991403,61991404,61991400)。
文摘The distributed nonconvex optimization problem of minimizing a global cost function formed by a sum of n local cost functions by using local information exchange is considered.This problem is an important component of many machine learning techniques with data parallelism,such as deep learning and federated learning.We propose a distributed primal-dual stochastic gradient descent(SGD)algorithm,suitable for arbitrarily connected communication networks and any smooth(possibly nonconvex)cost functions.We show that the proposed algorithm achieves the linear speedup convergence rate O(1/(√nT))for general nonconvex cost functions and the linear speedup convergence rate O(1/(nT)) when the global cost function satisfies the Polyak-Lojasiewicz(P-L)condition,where T is the total number of iterations.We also show that the output of the proposed algorithm with constant parameters linearly converges to a neighborhood of a global optimum.We demonstrate through numerical experiments the efficiency of our algorithm in comparison with the baseline centralized SGD and recently proposed distributed SGD algorithms.
文摘针对多径环境下异步长短码直扩码分多址信号(long and short code direct sequence code division multiple access,LSC-DS-CDMA)伪码估计难的问题,提出一种基于张量分解和联合估计的伪码估计方法,采用重叠窗对接收信号进行分段并构建张量模型。为改善传统线性步长搜索算法结合梯度下降的方法分解因子矩阵收敛较慢的问题,提出改进的线性步长搜索算法,结合使用动量梯度下降法对各子张量进行Tucker分解得到各因子矩阵,所需的迭代次数大大减少;利用接收增益矩阵和移位相乘解决复合码的排序模糊和幅度模糊问题;利用最大似然准则联合估计复合码和多径信道后,使用梅西算法和相关运算估计每个用户的长码和短码。仿真结果表明,该方法能够有效估计多径异步LSC-DS-CDMA信号的伪码。