Matrix factorization (MF) methods deliver superior recommendation performance and can flexibly incorporate side information, but the latent factors they derive are hard for humans to interpret. Recently, item-item co-occurrence information has been exploited to learn item embeddings and enhance recommendation performance. However, item-item co-occurrence information, constructed from the sparse and long-tail distributed user-item interaction matrix, is over-estimated for rare items, which could lead to bias in the learned item embeddings. In this paper, we seek to evaluate and improve the interpretability of item embeddings by leveraging a dense item-tag relevance matrix. Specifically, we design two metrics to quantitatively evaluate the interpretability of item embeddings from different viewpoints: interpretability of individual dimensions of item embeddings and semantic coherence of local neighborhoods in the latent space. We also propose a tag-informed item embedding (TIE) model that jointly factorizes the user-item interaction matrix, the item-item co-occurrence matrix, and the item-tag relevance matrix with shared item embeddings, so that the different forms of information can cooperate to learn better item embeddings. Experiments on the MovieLens 20M dataset demonstrate that, compared with other state-of-the-art MF methods, TIE achieves better top-N recommendations, and the relative improvement is larger when the user-item interaction matrix becomes sparser. By leveraging the item-tag relevance information, individual dimensions of item embeddings become more interpretable and local neighborhoods in the latent space become more semantically coherent; the bias in the learned item embeddings is also mitigated to some extent.
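The joint factorization with shared item embeddings described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' TIE objective, which the abstract does not spell out: the squared-error loss, the weights `lam_c` and `lam_t`, the L2 regularizer, plain gradient descent, and the use of normalized raw co-occurrence counts are all assumptions made for illustration, as are the function and variable names.

```python
import numpy as np

def joint_factorize(R, C, T, k=8, lam_c=1.0, lam_t=1.0, reg=0.1,
                    lr=0.01, epochs=200, seed=0):
    """Hypothetical joint factorization sketch (not the paper's exact model).

    R: user-item interactions (n_users x n_items)
    C: item-item co-occurrence (n_items x n_items)
    T: item-tag relevance (n_items x n_tags)
    The item embedding matrix V is shared by all three reconstructions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    n_tags = T.shape[1]
    U = 0.1 * rng.standard_normal((n_users, k))  # user embeddings
    V = 0.1 * rng.standard_normal((n_items, k))  # shared item embeddings
    G = 0.1 * rng.standard_normal((n_items, k))  # item context embeddings
    W = 0.1 * rng.standard_normal((n_tags, k))   # tag embeddings
    losses = []
    for _ in range(epochs):
        # Reconstruction errors of the three matrices.
        E_r = U @ V.T - R
        E_c = V @ G.T - C
        E_t = V @ W.T - T
        loss = 0.5 * (np.sum(E_r ** 2)
                      + lam_c * np.sum(E_c ** 2)
                      + lam_t * np.sum(E_t ** 2))
        loss += 0.5 * reg * sum(np.sum(M ** 2) for M in (U, V, G, W))
        losses.append(loss)
        # Gradients of the assumed loss; V receives signal from all three terms.
        grad_U = E_r @ V + reg * U
        grad_V = E_r.T @ U + lam_c * (E_c @ G) + lam_t * (E_t @ W) + reg * V
        grad_G = lam_c * (E_c.T @ V) + reg * G
        grad_W = lam_t * (E_t.T @ V) + reg * W
        U -= lr * grad_U
        V -= lr * grad_V
        G -= lr * grad_G
        W -= lr * grad_W
    return U, V, G, W, losses

# Toy data (synthetic, for illustration only).
rng = np.random.default_rng(42)
R = (rng.random((20, 15)) < 0.3).astype(float)  # binary interactions
C = R.T @ R                                     # raw co-occurrence counts
np.fill_diagonal(C, 0.0)
C = C / max(C.max(), 1.0)                       # scale into [0, 1]
T = rng.random((15, 6))                         # dense item-tag relevance

U, V, G, W, losses = joint_factorize(R, C, T)
print(f"loss: {losses[0]:.2f} -> {losses[-1]:.2f}")
```

Because `V` appears in all three reconstruction terms, the dense item-tag matrix keeps supplying gradient signal for items that have few interactions, which is one plausible reading of why tag information would help for rare items and sparser interaction data.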
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61672322, 61672324), the Natural Science Foundation of Shandong Province (2016ZRE27468), and the Fundamental Research Funds of Shandong University.