Abstract: Statute recommendation is a subproblem of automated decision systems, which help legal staff process cases in an intelligent and automated way. This paper proposes an improved common-word similarity algorithm for normalization. The word mover's distance (WMD) algorithm, originally used for classification, is applied to similarity measurement and statute recommendation, extending its range of application. Finally, a variety of recommendation strategies that differ from traditional collaborative filtering methods are proposed. Experimental results show that the approach achieves a best F-measure of 0.799, and comparative experiments show that WMD achieves better results than the TF-IDF and LDA algorithms.
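For readers unfamiliar with WMD, the block below restates the standard optimal-transport formulation from the WMD literature. The symbols are notation chosen here for illustration, not taken from the abstract: $d$ and $d'$ are the normalized bag-of-words weights of two documents over a vocabulary of $n$ words, $x_i$ is the embedding of word $i$, and $T_{ij}$ is the amount of mass moved from word $i$ to word $j$. It does not reproduce the paper's improved normalization step, which the abstract does not specify.

```latex
\begin{aligned}
\mathrm{WMD}(d, d') \;=\; \min_{T \ge 0} \;& \sum_{i=1}^{n} \sum_{j=1}^{n} T_{ij}\, c(i, j),
\qquad c(i, j) = \lVert x_i - x_j \rVert_2 \\
\text{subject to} \;& \sum_{j=1}^{n} T_{ij} = d_i \quad \text{for all } i, \\
& \sum_{i=1}^{n} T_{ij} = d'_j \quad \text{for all } j .
\end{aligned}
```

A smaller value means the two texts are semantically closer, so candidate statutes can be ranked by ascending distance to the case description.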
Funding: Project supported by the National Natural Science Foundation, China (No. 62172123); the Postdoctoral Science Foundation of Heilongjiang Province, China (No. LBH-Z19067); the special projects for the central government to guide the development of local science and technology, China (No. ZY20B11); and the Natural Science Foundation of Heilongjiang Province, China (No. QC2018081).
Abstract: The number of mobile application services is growing explosively, which makes it difficult for users to determine which ones are of interest. In particular, new mobile application services emerge continuously, and most of them have not been rated by the time they need to be recommended to users. This is the typical cold-start problem in collaborative filtering recommendation. It makes it difficult for users to locate and acquire the services they actually want, and the accuracy and novelty of the recommendations fail to satisfy them. To solve this problem, this paper proposes a hybrid recommendation method for mobile application services based on content feature extraction. First, the method extracts service content features using natural language processing techniques such as word segmentation, part-of-speech tagging, and dependency parsing, which improves the accuracy of the service attribute descriptions and makes the service similarity calculation more sound. Then, the language representation model Bidirectional Encoder Representations from Transformers (BERT) is used to vectorize the content feature text, and an improved weighted word mover's distance algorithm based on Term Frequency-Inverse Document Frequency (TFIDF-WMD) is used to calculate the similarity of mobile application services. Finally, the recommendation is completed by combining these similarities with the item-based collaborative filtering algorithm. Experimental results show that the proposed hybrid method alleviates the cold-start problem to a certain extent and significantly improves the accuracy of the recommendations.
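The final step, combining content similarity with item-based collaborative filtering, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `1 / (1 + distance)` conversion from distance to similarity, the top-k neighbourhood, and all numbers are hypothetical. The point it shows is that a new, unrated service can still receive a predicted rating because its similarity to other services comes from content rather than from rating history.

```python
from typing import Dict, Optional


def distance_to_similarity(distance: float) -> float:
    """Map a non-negative distance to a similarity in (0, 1].

    Assumption for illustration: the abstract does not say how
    TFIDF-WMD distances are turned into similarities.
    """
    return 1.0 / (1.0 + distance)


def predict_rating(
    user_ratings: Dict[str, float],
    similarity_to_target: Dict[str, float],
    top_k: int = 2,
) -> Optional[float]:
    """Item-based CF prediction for one (possibly unrated) target item.

    user_ratings: ratings this user gave to other items.
    similarity_to_target: content similarity of each other item to the target.
    Returns None when the user has rated no item similar to the target.
    """
    # Keep only rated items with positive similarity to the target.
    neighbours = [
        (similarity_to_target[item], rating)
        for item, rating in user_ratings.items()
        if similarity_to_target.get(item, 0.0) > 0.0
    ]
    # Use the k most similar rated items as the neighbourhood.
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = neighbours[:top_k]
    if not neighbours:
        return None
    # Similarity-weighted average of the neighbours' ratings.
    weight_sum = sum(sim for sim, _ in neighbours)
    return sum(sim * rating for sim, rating in neighbours) / weight_sum


# Hypothetical content distances from a new app to three rated apps.
distances = {"app_a": 0.25, "app_b": 1.0, "app_c": 3.0}
similarities = {name: distance_to_similarity(d) for name, d in distances.items()}
ratings = {"app_a": 5.0, "app_b": 3.0, "app_c": 1.0}
prediction = predict_rating(ratings, similarities, top_k=2)
```

With `top_k=2`, only the two most similar rated apps contribute, so the distant `app_c` does not pull the prediction down.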
Funding: Beijing University of Posts and Telecommunications, China, and China Telecom, for cooperation and support for this paper.
Abstract: Behavior targeting (BT) based on individual web-browsing history has become increasingly valuable for precision marketing, because it captures users' interests and preferences. In practice, behavior data collected from different online shopping applications are often inconsistent: they are labelled under different item taxonomies, so the same behavior can have different representations, which confuses analysis. To address this issue, we propose a semantic-similarity-based strategy that transforms the heterogeneous behavior extracted from the deep packet inspection (DPI) data of a telecommunication operator into a single standard representation. The Word Mover's Distance algorithm is used to evaluate the semantic similarity between the distributed representations of two web-browsing histories. Moreover, the behavior targeting platform is implemented on Hadoop and is capable of processing petabyte-scale (PB) data every day.
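The idea of mapping heterogeneous labels onto one standard taxonomy by semantic distance can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: it uses the relaxed WMD lower bound (each word moves all its mass to its nearest counterpart) instead of the exact optimal-transport solve, and the two-dimensional embeddings, the label names, and the helper functions are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Toy 2-d word embeddings (hypothetical values, for illustration only).
EMBEDDINGS: Dict[str, Sequence[float]] = {
    "phone": (1.0, 0.0),
    "mobile": (0.9, 0.1),
    "handset": (0.8, 0.2),
    "shoes": (0.0, 1.0),
    "sneakers": (0.1, 0.9),
    "footwear": (0.2, 0.8),
}

# Hypothetical standard taxonomy: label -> descriptive tokens.
STANDARD: Dict[str, List[str]] = {
    "Phones": ["mobile", "phone"],
    "Shoes": ["shoes", "footwear"],
}


def one_sided_cost(source: List[str], target: List[str]) -> float:
    """Average distance from each source token to its nearest target token."""
    total = 0.0
    for s in source:
        total += min(math.dist(EMBEDDINGS[s], EMBEDDINGS[t]) for t in target)
    return total / len(source)


def relaxed_wmd(a: List[str], b: List[str]) -> float:
    """Relaxed WMD: the larger of the two one-sided costs.

    This is a lower bound on the exact WMD, not the exact value.
    """
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


def map_to_standard(tokens: List[str]) -> str:
    """Return the standard label whose tokens are closest to the input."""
    return min(STANDARD, key=lambda label: relaxed_wmd(tokens, STANDARD[label]))


phone_label = map_to_standard(["handset"])
shoe_label = map_to_standard(["sneakers", "shoes"])
```

In a real system the toy table would be replaced by trained word vectors, and an exact WMD solver could be swapped in for `relaxed_wmd` where precision matters more than speed.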