Abstract: Corporations are relying on web-based education to train their employees more than ever before. Unlike traditional learning environments, web-based education applications store large amounts of data. This growing availability of data has stimulated the emergence of a new field called educational data mining. In this study, classification is applied to data obtained from a company that uses web-based education to train its employees. The authors' aim is to identify the most critical factors that influence users' success. To classify the data, two decision tree algorithms are applied: Classification and Regression Tree (CART) and Quick, Unbiased and Efficient Statistical Tree (QUEST). According to the results, the assurance of a certificate at the end of the training is the most critical factor influencing users' success. The user's position, number of years of work, and education level are also found to be important factors.
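As a rough illustration of how a CART-style tree can surface the most influential factor, the sketch below ranks categorical features by the best Gini impurity reduction achievable with a single binary split. It is not the authors' implementation: the feature names, the records, and the helper functions (`gini`, `gini_gain`, `rank_features`) are all invented for illustration, and a real CART model would also grow the tree recursively and prune it.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(rows, labels, feature):
    """Largest impurity reduction from a binary split `feature == value` vs. the rest."""
    parent = gini(labels)
    best = 0.0
    for value in {row[feature] for row in rows}:
        left = [y for row, y in zip(rows, labels) if row[feature] == value]
        right = [y for row, y in zip(rows, labels) if row[feature] != value]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels):
    """Features ordered from most to least useful for the first split."""
    return sorted(rows[0].keys(), key=lambda f: gini_gain(rows, labels, f), reverse=True)


# Hypothetical toy records; NOT data from the study.
rows = [
    {"certificate": "yes", "position": "engineer"},
    {"certificate": "yes", "position": "manager"},
    {"certificate": "no", "position": "engineer"},
    {"certificate": "no", "position": "manager"},
    {"certificate": "yes", "position": "engineer"},
    {"certificate": "no", "position": "engineer"},
]
labels = ["pass", "pass", "fail", "fail", "pass", "pass"]

ranking = rank_features(rows, labels)
print(ranking)
```

In this invented data set the certificate feature separates the classes better than position does, so it is ranked first; with real training records the ranking would depend entirely on the data.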
Funding: Supported in part by the Fundamental Research Funds for the Central Universities under Grant No. 2013RC0114 and the 111 Project of China under Grant No. B08004.
Abstract: With increasingly complex website structures and continuously advancing web technologies, accurately recognizing user clicks in massive HTTP data, which is critical for web usage mining, is becoming more difficult. In this paper, we propose a dependency graph model to describe the relationships between web requests. Based on this model, we design and implement a heuristic parallel algorithm that distinguishes user clicks with the assistance of cloud computing technology. We evaluate the proposed algorithm on real, large-scale data: the dataset, collected from a mobile core network, is 228.7 GB in size and covers more than three million users. The experimental results demonstrate that the proposed algorithm achieves higher accuracy than previous methods.
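To make the idea of a request dependency graph concrete, here is a minimal sequential sketch. The abstract does not specify the paper's model or heuristics, so everything here is an assumption: requests are linked through their Referer field, and an HTML request counts as a user click if it has no known referrer or arrives at least `min_gap` seconds after its referrer was first requested. The record fields, the `min_gap` threshold, and the function names are hypothetical, and the parallel, cloud-based execution described in the abstract is not modeled.

```python
from collections import defaultdict


def build_dependency_graph(requests):
    """Map each referrer URL to the URLs of the requests that depend on it."""
    children = defaultdict(list)
    for req in requests:
        if req.get("referer"):
            children[req["referer"]].append(req["url"])
    return dict(children)


def identify_clicks(requests, min_gap=1.0):
    """Assumed heuristic: keep HTML requests that are graph roots or late arrivals."""
    first_seen = {}
    clicks = []
    for req in sorted(requests, key=lambda r: r["time"]):
        first_seen.setdefault(req["url"], req["time"])
        if not req["content_type"].startswith("text/html"):
            continue  # embedded resources (CSS, images, scripts) are never clicks
        parent = req.get("referer")
        if parent is None or parent not in first_seen:
            clicks.append(req["url"])  # root of a dependency tree
        elif req["time"] - first_seen[parent] >= min_gap:
            clicks.append(req["url"])  # too late to be auto-loaded content
    return clicks


# Hypothetical trace for a single user; times are in seconds.
requests = [
    {"time": 0.0, "url": "http://a.example/", "referer": None,
     "content_type": "text/html"},
    {"time": 0.2, "url": "http://a.example/style.css", "referer": "http://a.example/",
     "content_type": "text/css"},
    {"time": 0.3, "url": "http://ads.example/frame", "referer": "http://a.example/",
     "content_type": "text/html"},
    {"time": 5.0, "url": "http://a.example/news", "referer": "http://a.example/",
     "content_type": "text/html"},
]

graph = build_dependency_graph(requests)
clicks = identify_clicks(requests)
print(clicks)
```

Under these assumptions the embedded frame loaded 0.3 seconds after the page is treated as automatic content, while the page itself and the later news request are treated as clicks. Because each user's trace can be processed independently, partitioning the data by user is one natural way such a heuristic could be parallelized.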