Funding: Supported in part by the National Natural Science Foundation of China (61772493, 91646114), in part by the Chongqing Research Program of Technology Innovation and Application (cstc2017rgzn-zdyfX0020), and in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: Latent factor (LF) models are highly effective in extracting useful knowledge from high-dimensional and sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may need many iterations to reach a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis so that the resulting model represents an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF achieves significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desirable for industrial applications demanding highly efficient models.
Funding: Supported by grants from the National Natural Science Foundation of China (72171088, 71803049, 72003205), the Ministry of Education of the People's Republic of China Humanities and Social Sciences Youth Foundation (20YJC790142), and the General Project of Social Science Planning in Guangdong Province, China (GD22CYJ12).
Abstract: We forecast realized volatilities by developing a time-varying heterogeneous autoregressive (HAR) latent factor model with dynamic model averaging (DMA) and dynamic model selection (DMS) approaches. The number of latent factors is determined using Chan and Grant's (2016) deviance information criterion. The predictors in our model include lagged daily, weekly, and monthly volatility variables, the corresponding volatility factors, and a speculation variable. In addition, the time-varying properties of the best-performing DMA(DMS)-HAR-2FX models, including size, inclusion probabilities, and coefficients, are examined. We find that the proposed DMA(DMS)-HAR-2FX model outperforms the competing models for both in-sample and out-of-sample forecasts. Furthermore, the speculation variable displays strong predictability for forecasting the realized volatility of financial futures in China.
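The daily, weekly, and monthly lagged volatility predictors named in the abstract can be illustrated with a plain, static HAR regression. This is only a sketch under assumptions: the 5-day and 22-day windows, the synthetic series, and the helper name `har_features` are illustrative and not taken from the paper, and the sketch omits the latent factors, the speculation variable, and the DMA/DMS machinery entirely.

```python
import numpy as np

def har_features(rv, weekly=5, monthly=22):
    """Build HAR regressors (intercept, daily, weekly, monthly) and next-day targets.

    The window lengths are assumed conventions, not values from the paper.
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(monthly - 1, len(rv) - 1):
        daily = rv[t]                                   # lagged daily volatility
        week = rv[t - weekly + 1 : t + 1].mean()        # average over the last `weekly` days
        month = rv[t - monthly + 1 : t + 1].mean()      # average over the last `monthly` days
        rows.append([1.0, daily, week, month])
        targets.append(rv[t + 1])                       # one-step-ahead realized volatility
    return np.array(rows), np.array(targets)

# Synthetic realized-volatility series, for illustration only.
rng = np.random.default_rng(0)
rv = np.abs(rng.normal(1.0, 0.2, size=200))

X, y = har_features(rv)
# Ordinary least squares with constant coefficients (no time variation).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("HAR coefficients (const, daily, weekly, monthly):", beta)
```

A time-varying version would let `beta` change over time and average or select across candidate predictor sets, which is the part the abstract attributes to DMA and DMS.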
Abstract: Latent factor models have become a workhorse for a large number of recommender systems. While these systems are built using ratings data, which is typically assumed static, the ability to incorporate different kinds of subsequent user feedback is an important asset. For instance, the user might want to provide additional information to the system in order to improve their personal recommendations. To this end, we examine a novel scheme for efficiently learning (or refining) user parameters from such feedback. We propose a scheme where users are presented with a sequence of pairwise preference questions: "Do you prefer item A over B?" User parameters are updated based on their response, and subsequent questions are chosen adaptively after incorporating the feedback. We operate in a Bayesian framework, and the choice of questions is based on an information gain criterion. We validate the scheme on the Netflix movie ratings data set and a proprietary television viewership data set. A user study and automated experiments validate our findings.
Funding: Supported in part by the National Natural Science Foundation of China (61702475, 61772493, 61902370, 62002337), in part by the Natural Science Foundation of Chongqing, China (cstc2019jcyj-msxmX0578, cstc2019jcyjjqX0013), in part by the Chinese Academy of Sciences "Light of West China" Program, in part by the Pioneer Hundred Talents Program of Chinese Academy of Sciences, and by the Technology Innovation and Application Development Project of Chongqing, China (cstc2019jscx-fxydX0027).
Abstract: High-dimensional and sparse (HiDS) matrices commonly arise in various industrial applications, e.g., recommender systems (RSs), social networks, and wireless sensor networks. Since they contain rich information, representing them accurately is of great significance. A latent factor (LF) model is one of the most popular and successful ways to address this issue. Current LF models mostly adopt an L2-norm-oriented loss to represent an HiDS matrix, i.e., they sum the errors between observed data and predicted ones with the L2-norm. Yet the L2-norm is sensitive to outlier data, and outlier data usually exist in such matrices. For example, an HiDS matrix from RSs commonly contains many outlier ratings due to heedless or malicious users. To address this issue, this work proposes a smooth L1-norm-oriented latent factor (SL-LF) model. Its main idea is to adopt the smooth L1-norm rather than the L2-norm to form its loss, giving it both strong robustness and high accuracy in predicting the missing data of an HiDS matrix. Experimental results on eight HiDS matrices generated by industrial applications verify that the proposed SL-LF model is not only robust to outlier data but also has significantly higher prediction accuracy than state-of-the-art models when used to predict the missing data of HiDS matrices.
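The sensitivity of an L2-norm-oriented loss to outliers, compared with a smooth L1 loss, can be seen on a handful of prediction errors. The sketch below assumes the common Huber-style form of smooth L1 (quadratic within a threshold `delta`, linear beyond it); the exact form used in the SL-LF paper is not given in the abstract, and the function names and `delta=1.0` are illustrative choices.

```python
import numpy as np

def l2_loss(err):
    """Squared-error (L2-norm-oriented) loss for each error value."""
    return err ** 2

def smooth_l1_loss(err, delta=1.0):
    """Assumed smooth L1 form: quadratic for |err| <= delta, linear beyond it."""
    abs_err = np.abs(err)
    return np.where(abs_err <= delta,
                    0.5 * err ** 2 / delta,
                    abs_err - 0.5 * delta)

# The last error plays the role of an outlier rating.
errors = np.array([0.1, 0.5, 1.0, 4.0])
print("L2       :", l2_loss(errors))
print("smooth L1:", smooth_l1_loss(errors))
```

Under these assumptions the outlier's penalty grows linearly rather than quadratically, so it contributes far less to the total loss and pulls the latent factors less strongly during training.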
Funding: Natural Sciences and Engineering Research Council of Canada (NSERC) (ID: 236482), for supporting this research.
Abstract: This paper discusses the use of latent variable modeling in occupational health and safety research in the mining industry. Latent variable modeling, a statistical approach that relates observable and latent variables, can help researchers understand the underlying constructs or hypothetical factors that constitute a complex system, along with the magnitude of their effects. This enhanced understanding, in turn, can help highlight the factors that matter most for improving mine safety. The most commonly used techniques include exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and the structural equation model with latent variables (SEM). A critical comparison of the three techniques with regard to mine safety is provided. Possible applications of latent variable modeling in mining engineering are explored, and relevant research papers are reviewed within this scope. They suggest that the application of such methods could prove useful in mine accident and safety research. The application of latent variable analysis in cognitive work analysis is proposed to improve the understanding of human-work relationships in mining operations.
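Abstract: Because the Internet of Things (IoT) involves a massive number of services and device states that change dynamically, the keyword-based "passive" semantic service search techniques of the traditional Internet are no longer suitable for IoT environments. The key to resource discovery algorithms in the IoT is how to use and analyze the large volume of interaction information between users and devices to recommend the most relevant device resources to each user. To this end, this paper first presents a hypergraph-theory-based representation model of user-device interactions in the IoT, together with a corresponding representation matrix. Based on this model, it formulates the resource recommendation problem in IoT service scenarios and converts it into a relevance prediction problem based on matrix factorization. Finally, it introduces alternating least squares (ALS) from optimization theory to solve the optimal matrix factorization problem, and thereby proposes a resource recommendation algorithm based on a latent factor model. The algorithm is compared with the item-based collaborative filtering algorithm (ItemCF) of traditional recommender systems in terms of root mean square error (RMSE) and mean absolute error (MAE), and the experimental results demonstrate the effectiveness of the proposed recommendation algorithm.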
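Matrix factorization by alternating least squares, evaluated with RMSE and MAE as the abstract describes, can be sketched generically as follows. This is a minimal illustration, not the paper's algorithm: the ridge regularization, the rank, the toy user-device relevance matrix, and the names `als`, `rmse`, and `mae` are all assumptions, and the hypergraph-based construction of the matrix is not covered.

```python
import numpy as np

def als(R, mask, rank=2, reg=0.1, iters=20, seed=0):
    """Factorize R ~ U @ V.T over observed entries (mask == True) with ridge-regularized ALS."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, rank))
    V = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Fix V and solve a small least-squares problem for each user row.
        for u in range(n_users):
            obs = mask[u]
            if obs.any():
                Vo = V[obs]
                U[u] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[u, obs])
        # Fix U and solve for each item (device) row.
        for i in range(n_items):
            obs = mask[:, i]
            if obs.any():
                Uo = U[obs]
                V[i] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[obs, i])
    return U, V

def rmse(R, mask, U, V):
    """Root mean square error over the entries selected by mask."""
    diff = (R - U @ V.T)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(R, mask, U, V):
    """Mean absolute error over the entries selected by mask."""
    diff = (R - U @ V.T)[mask]
    return float(np.mean(np.abs(diff)))

# Toy relevance matrix (illustrative data); 0 marks an unobserved entry.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0

U0, V0 = als(R, mask, iters=0)   # untrained random factors
U, V = als(R, mask, iters=20)    # trained factors
print("RMSE before/after:", rmse(R, mask, U0, V0), rmse(R, mask, U, V))
print("MAE after:", mae(R, mask, U, V))
```

Each half-step is a closed-form least-squares solve with the other factor matrix held fixed, which is why ALS needs no learning rate. In practice the errors would be reported on held-out entries rather than on the training entries used here.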
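Abstract: Collaborative filtering is one of the most widely used algorithms in recommender systems. As personalized recommendation technology has developed, traditional collaborative filtering algorithms yield low recommendation accuracy when data are sparse and do not account for factors such as the dynamic change of user interests over time, so they can no longer meet the needs of personalized recommendation. To address these problems, this paper proposes a fusion algorithm that combines the K-means algorithm with a latent factor model: K-T-LFM (K-means algorithm clustering users and Time Based Latent Factor Model), a recommendation algorithm based on user clustering and a time-aware latent factor model. Based on users' attribute features, the algorithm clusters users with a K-means algorithm whose initial centroids are determined by the max-min criterion, which solves the cold-start problem for newly registered users and reduces both the sparsity and the size of the matrix. It then proposes a time function based on the Ebbinghaus forgetting curve and integrates it with the traditional latent factor model to fill the sparse user rating matrix within each cluster, which effectively alleviates data sparsity while taking into account the influence of time on users' interest preferences, improving the accuracy of the recommendation algorithm. Comparative experiments on the MovieLens dataset show that the algorithm improves accuracy over other collaborative filtering algorithms.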
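The clustering step can be illustrated with K-means seeded by a max-min rule. The sketch assumes the usual farthest-first reading of the max-min criterion (each new centroid is the point whose distance to its nearest already-chosen centroid is largest); the paper's exact variant, the choice of the first seed, and the toy user attribute vectors are assumptions, and the time function and latent factor model are not shown.

```python
import numpy as np

def max_min_init(X, k, first=0):
    """Pick k initial centroids: start at X[first], then repeatedly take the point
    farthest from its nearest already-chosen centroid (assumed max-min rule)."""
    chosen = [first]
    min_dist = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].astype(float)

def kmeans(X, k, iters=50):
    """Standard Lloyd iterations started from the max-min centroids."""
    C = max_min_init(X, k)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_C = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else C[j]
                          for j in range(k)])
        if np.allclose(new_C, C):
            break
        C = new_C
    return labels, C

# Toy user attribute vectors (illustrative data, not from MovieLens).
X = np.array([[0.0, 0.0], [0.5, 0.0],
              [10.0, 10.0], [10.0, 10.5],
              [0.0, 10.0], [0.5, 10.0]])
labels, centroids = kmeans(X, k=3)
print(labels)
print(centroids)
```

Seeding with mutually distant points makes it less likely that two initial centroids fall into the same group of users, which is one plausible reason for preferring this rule over random initialization.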