Funding: Supported by the National Natural Science Foundation of China (Grant No. 71203163) and the Foundation for Humanities and Social Sciences of the Chinese Ministry of Education (Grant No. 12YJC870011).
Abstract: Purpose: The goal of our research is to suggest specific Web metrics that are useful for evaluating and improving user navigation experience on informational websites. Design/methodology/approach: We revised metrics in a Web forensic framework proposed in the literature and defined the metrics of footprint, track and movement. Data were obtained from user clickstreams provided by a real estate site's administrators. There were two phases of data analysis: the first phase examined navigation behavior based on user footprints and tracks, and the second phase examined navigational transition patterns based on user movements. Findings: Preliminary results suggest that the apartment pages were heavily trafficked, while the agent pages and related information pages were largely underused. Navigation within the same category of pages was prevalent, especially when users navigated among the regional apartment listings. However, navigation of these pages was found to be inefficient. Research limitations: The suggestions for navigation design optimization provided in the paper are specific to this website, and their applicability to other online environments needs to be verified. Preference predictions or personal recommendations are not made at the current stage of research. Practical implications: Our clickstream data analysis results offer a base for future research. Meanwhile, website administrators and managers can make better use of the readily available clickstream data to evaluate the effectiveness and efficiency of their site navigation design. Originality/value: Our empirical study is valuable to those seeking analysis metrics for evaluating and improving user navigation experience on informational websites based on clickstream data. Our attempts to analyze the log file in terms of footprint, track and movement will enrich the utilization of such trace data and engender a deeper understanding of users' within-site navigation behavior.
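One plausible reading of the footprint/track/movement metrics can be sketched as follows: a track is a user's ordered page-view sequence, a footprint is the set of distinct pages touched, and movements are the page-to-page transitions aggregated across sessions. The session data and page names below are illustrative assumptions, not the paper's actual dataset or definitions.

```python
from collections import Counter

# Hypothetical clickstream: each session is an ordered list of visited page IDs
# (the "track"). Page names and sessions here are invented for illustration.
sessions = {
    "u1": ["home", "apt_list", "apt_detail", "apt_list", "agent"],
    "u2": ["home", "apt_list", "apt_detail", "apt_detail"],
}

def footprint(track):
    """Footprint: the set of distinct pages a user touched in a session."""
    return set(track)

def movements(track):
    """Movements: the ordered page-to-page transitions within a session."""
    return list(zip(track, track[1:]))

# Aggregating transition counts across all users supports the second phase of
# analysis: finding prevalent navigational transition patterns.
transition_counts = Counter(m for t in sessions.values() for m in movements(t))

print(footprint(sessions["u1"]))
print(transition_counts.most_common(2))
```

Counting how often transitions stay within one page category (e.g. listing to listing) versus crossing categories would then quantify the within-category navigation the study reports.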
Abstract: In this paper, we first briefly introduce the concepts of clickstream data and data warehouses, and analyze two existing clickstream star schemas in the webhouse: the click star schema and the session star schema. We then derive a new model, the transaction star model, based on them, and describe the method used to construct it. Compared with the two schemas mentioned above, its most apparent distinction is that each fact records a series of meaningful page views as a sequence rather than a single click. Thus, on the one hand it improves query performance; on the other hand it facilitates deeper data mining analysis and simplifies data preprocessing. Finally, the paper verifies the model's feasibility and validity using association rules based on the model.
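The core idea, a fact table whose grain is a meaningful page-view sequence rather than a single click, might be sketched as below. The field names, the transaction fact structure and the gap-based splitting rule are assumptions for illustration, not the paper's actual schema or sessionization method.

```python
from dataclasses import dataclass

@dataclass
class TransactionFact:
    """One fact row in a hypothetical transaction star: the measure of
    interest is an ordered page-view sequence, not an individual click."""
    session_id: str
    user_key: int
    page_sequence: list  # ordered page keys forming one transaction
    duration_sec: int

def clicks_to_transactions(session_id, user_key, clicks, gap_sec=300):
    """Split one session's (page, timestamp) clicks into transaction facts,
    cutting whenever the gap between consecutive clicks exceeds gap_sec."""
    facts, current, start, prev_ts = [], [], None, None
    for page, ts in clicks:
        if prev_ts is not None and ts - prev_ts > gap_sec:
            facts.append(TransactionFact(session_id, user_key, current, prev_ts - start))
            current, start = [], None
        if start is None:
            start = ts
        current.append(page)
        prev_ts = ts
    if current:
        facts.append(TransactionFact(session_id, user_key, current, prev_ts - start))
    return facts

facts = clicks_to_transactions("s1", 42,
    [("home", 0), ("list", 30), ("detail", 60), ("home", 1000), ("contact", 1020)])
print([f.page_sequence for f in facts])  # [['home', 'list', 'detail'], ['home', 'contact']]
```

Storing the sequence at this grain is what lets association-rule mining run directly over page-view sequences without a separate reconstruction step during data preprocessing.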
Abstract: Web log mining is the analysis of web log files containing web page sequences. Discovering user access patterns from web access logs is necessary for building adaptive web servers, improving e-commerce, carrying out cross-marketing, personalizing the Web, predicting web access sequences, etc. In this paper, a new agglomerative clustering technique is proposed to identify users with similar interests and to determine the motivation for visiting a website. Using this approach, web usage mining is done through different stages, namely data cleaning, preprocessing, pattern discovery and pattern analysis. Results are given to explain how this approach produces tighter usage clusters than existing web usage mining techniques. Rather than traditional distance-based clustering, a similarity measure is applied during the clustering process in order to reduce computational complexity. This paper also deals with the problem of assessing the quality of user session clusters; cluster validity is measured using a statistical test that measures the distances between cluster distributions to infer their dissimilarity and level of distinction. Using such statistical measures, it is shown that cluster accuracy is improved to 0.83, compared with existing k-means clustering (validity measure 0.26), FCM (Fuzzy C-Means) clustering (validity measure 0.56) and rough-set-based clustering (validity measure 0.54). Generation of dense clusters is essential for finding interesting patterns needed for further mining and analysis.
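A minimal sketch of similarity-based agglomerative clustering of user sessions is given below, assuming Jaccard similarity over visited-page sets; the similarity measure, threshold and session data are illustrative assumptions, not the paper's actual measure or dataset.

```python
def jaccard(a, b):
    """Similarity of two page sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

def agglomerate(sessions, threshold=0.5):
    """Agglomerative clustering: start with each session as its own cluster
    and repeatedly merge the most similar pair until no pair's similarity
    reaches the threshold."""
    clusters = [set(s) for s in sessions]
    while len(clusters) > 1:
        best, pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = jaccard(clusters[i], clusters[j])
                if sim > best:
                    best, pair = sim, (i, j)
        if pair is None or best < threshold:
            break
        i, j = pair
        clusters[i] |= clusters.pop(j)  # merge the most similar pair
    return clusters

# Sessions with overlapping page sets end up in the same cluster.
sessions = [["home", "apt", "agent"], ["home", "apt"], ["news", "sports"]]
print(agglomerate(sessions))
```

Using set similarity rather than a metric distance avoids constructing a full vector space over pages, which is one way a similarity measure can reduce the computational burden the abstract mentions.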