5 articles found
1. Obtaining Profiles Based on Localized Non-negative Matrix Factorization (cited by: 2)
Authors: JIANG Ji-xiang, XU Bao-wen, LU Jian-jiang, ZHOU Xiao-yu. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 580-584 (5 pages).
Non-negative matrix factorization (NMF) is a method for extracting parts-based features of information and forming typical profiles. However, the basis vectors that NMF produces are not orthogonal, so the parts-based features are usually redundant. In this paper, we propose two approaches based on localized non-negative matrix factorization (LNMF) to obtain typical user session profiles and typical semantic profiles of junk mail. LNMF makes the basis vectors as orthogonal as possible, so it can obtain accurate profiles. The experiments show that the LNMF-based approach obtains better profiles than the NMF-based approach. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), the National Grand Fundamental Research 973 Program of China (2002CB312000), and the National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: JIANG Ji-xiang (1980-), male, Master candidate; research direction: data mining, knowledge representation on the Web.
Keywords: localized non-negative matrix factorization; profile; log mining; mail filtering
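The abstract's contrast between NMF and LNMF rests on how orthogonal the basis vectors are. As a rough illustration only, the sketch below runs plain NMF with the standard multiplicative updates for the squared-error objective on a made-up session-by-page matrix, then reports a simple overlap score for the basis vectors. It does not implement LNMF, whose update rules the abstract does not give. The names `nmf` and `basis_overlap` and the toy data are assumptions, not the authors' code.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates (not LNMF)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def basis_overlap(w):
    """Sum of off-diagonal entries of W^T W; 0 means orthogonal basis vectors."""
    g = matmul(transpose(w), w)
    return sum(g[a][b] for a in range(len(g)) for b in range(len(g)) if a != b)


# Hypothetical data: rows are user sessions, columns are page-visit counts.
sessions = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
w, h = nmf(sessions, rank=2)
print("reconstruction error:", frobenius_error(sessions, w, h))
print("basis overlap:", basis_overlap(w))
```

In this sketch each row of `h` plays the role of a profile over pages. A lower `basis_overlap` value corresponds to the less redundant bases that the abstract attributes to LNMF.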
2. Testing and Evaluation for Web Usability Based on Extended Markov Chain Model (cited by: 2)
Authors: MAO Cheng-ying, LU Yan-sheng. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 687-693 (7 pages).
As Web applications grow in popularity and complexity and take on new characteristics, testing and maintaining large, complex Web applications is becoming more difficult. Web applications generally contain many pages and are used by a very large number of users, and statistical testing is an effective way of ensuring their quality. Web usage can be accurately described by a Markov chain, which has been proved to be an ideal model for statistical software testing. One improvement of the extended Markov chain model (EMM) is that the results of unit testing can be used in later stages, which is an important strategy for bottom-up integration testing; the other is an error type vector that is treated as part of each page node. This paper also proposes an algorithm for generating test cases from usage paths. Finally, optional usage reliability evaluation methods and an incremental usability regression testing model for testing and evaluation are presented. CLC number: TP311.5. Foundation item: Supported by the National Defence Research Project (No. 41315.9.2) and the National Science and Technology Plan (2001BA102A04-02-03). Biography: MAO Cheng-ying (1978-), male, Ph.D. candidate; research direction: software testing. Research direction: advanced database system, software testing, component technology and data mining.
Keywords: statistical testing; evaluation for Web usability; extended Markov chain model (EMM); Web log mining; reliability evaluation
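The idea of describing Web usage with a Markov chain and drawing test cases from usage paths can be sketched as follows. This is a minimal sketch of an ordinary first-order usage chain, not the paper's EMM: it has no error type vector and does not reuse unit-test results. The `START` and `EXIT` states, the function names, and the sample sessions are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def estimate_transitions(sessions):
    """Estimate page-to-page transition probabilities from logged sessions."""
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        path = ["START", *pages, "EXIT"]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }


def generate_test_path(transitions, rng, max_steps=50):
    """Random walk from START; stops at EXIT or after max_steps transitions."""
    path, state = [], "START"
    for _ in range(max_steps):
        targets = transitions[state]
        state = rng.choices(list(targets), weights=list(targets.values()))[0]
        if state == "EXIT":
            break
        path.append(state)
    return path


# Hypothetical sessions mined from a Web log.
logged_sessions = [
    ["index", "search", "item"],
    ["index", "item"],
    ["index", "search", "search", "item", "cart"],
]
model = estimate_transitions(logged_sessions)
print(generate_test_path(model, random.Random(42)))
```

Because paths are drawn in proportion to observed usage, frequently used navigation sequences are tested more often, which is the point of statistical testing.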
3. Fuzzy Clustering Method for Web User Based on Pages Classification (cited by: 2)
Authors: ZHAN Li-qiang, LIU Da-xin. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 553-556 (4 pages).
This article proposes a new method for fuzzy clustering of Web users based on an analysis of user interest characteristics. The method first defines fuzzy page categories according to the links on the index page of the site, then computes the fuzzy degree of cross pages by aggregating Web log data. After that, using the fuzzy comprehensive evaluation method, it constructs user interest vectors from page viewing times and hit frequencies, and derives the fuzzy similarity matrix of the Web users from the interest vectors. Finally, it obtains the clustering result through the fuzzy clustering method. The experimental results show the effectiveness of the method. CLC number: TP18; TP311; TP391. Foundation item: Supported by the Natural Science Foundation of Heilongjiang Province of China (F0304). Biography: ZHAN Li-qiang (1966-), male, Lecturer, Ph.D.; research direction: the theory and methods of data mining and the theory of databases.
Keywords: Web log mining; fuzzy similarity matrix; fuzzy comprehensive evaluation; fuzzy clustering
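The last two steps of the pipeline, going from interest vectors to a fuzzy similarity matrix and then to clusters, can be sketched as follows. The abstract does not say which similarity measure or clustering procedure the authors use. This sketch assumes a min/max similarity and the classical transitive-closure method with a lambda-cut, and the interest vectors are invented.

```python
def similarity(u, v):
    """Min/max fuzzy similarity between two interest vectors (assumed measure)."""
    den = sum(max(a, b) for a, b in zip(u, v))
    return sum(min(a, b) for a, b in zip(u, v)) / den if den else 1.0


def similarity_matrix(vectors):
    return [[similarity(u, v) for v in vectors] for u in vectors]


def transitive_closure(r):
    """Repeat max-min composition until the matrix stops changing."""
    n = len(r)
    while True:
        squared = [[max(min(r[i][k], r[k][j]) for k in range(n))
                    for j in range(n)] for i in range(n)]
        if squared == r:
            return r
        r = squared


def lambda_cut_clusters(closure, lam):
    """Group users whose closure similarity is at least lam."""
    clusters, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical interest vectors: one row per user, one column per page category.
interests = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.8, 0.2],
]
closure = transitive_closure(similarity_matrix(interests))
print(lambda_cut_clusters(closure, lam=0.5))
```

Raising `lam` splits users into finer clusters and lowering it merges them, so the threshold controls the granularity of the result.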
4. Learning Query Ambiguity Models by Using Search Logs (cited by: 1)
Authors: 宋睿华, 窦志成, 洪小文, 俞勇. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2010, No. 4, pp. 728-738 (11 pages).
Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed that user clicks alone are misleading when judging whether a query is ambiguous. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow-up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that using click-based features alone or session-based features alone performs worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves on the click-based method by 5.6% and on the session-based method by 4.6%.
Keywords: ambiguous query; log mining; query classification
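To illustrate the feature-extraction step, the sketch below computes one plausible feature: the entropy of the category distribution of the documents clicked for a query. The abstract does not list the actual features, so this feature, the function name, and the sample categories are assumptions. In the paper's pipeline, features of this kind would be passed to an SVM, which is not shown here.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (in bits) of the categories assigned to clicked documents.

    Zero means every click fell into one category; higher values mean the
    clicks are spread over several categories.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical classifier output for the documents clicked under two queries.
clicks_for_clear_query = ["sports", "sports", "sports", "sports"]
clicks_for_ambiguous_query = ["animal", "car", "animal", "software"]

print(category_entropy(clicks_for_clear_query))
print(category_entropy(clicks_for_ambiguous_query))
```

The same function could be applied to the categories of follow-up queries in a session, giving a session-based counterpart to the click-based feature.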
5. Proactive planning of bandwidth resource using simulation-based what-if predictions for Web services in the cloud
Authors: Jianpeng HU, Linpeng HUANG, Tianqi SUN, Ying FAN, Wenqiang HU, Hao ZHONG. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 1, pp. 25-52 (28 pages).
Resource planning is becoming an increasingly important and timely problem for cloud users. As more Web services are moved to the cloud, minimizing network usage is often a key driver of cost control. Most existing approaches focus on resources such as CPU, memory, and disk I/O. In particular, CPU receives the most attention from researchers, while bandwidth is somewhat neglected. It is challenging to predict the network throughput of modern Web services because of their diverse and complex responses, the evolution of the services, and the complexity of network transportation. In this paper, we propose a methodology of what-if analysis, named Log2Sim, to plan the bandwidth resource of Web services. Log2Sim uses a lightweight workload model to describe user behavior, an automated mining approach to obtain the characteristics of workloads and responses from massive Web logs, and traffic-aware simulations to predict the impact on bandwidth consumption and response time in changing contexts. We use a real-life Web system and a classic benchmark to evaluate Log2Sim in multiple scenarios. The evaluation results show that Log2Sim performs well in predicting bandwidth consumption: the average relative error is 2% for the benchmark and 8% for the real-life system. As for response time, Log2Sim cannot produce accurate predictions for every single service request, but the simulation results always show trends in average response time similar to the measured ones as workloads increase in different changing contexts. It can provide sufficient information for the system administrator in proactive bandwidth planning.
Keywords: what-if analysis; bandwidth management; network simulation; Web service; log mining; resource planning; evolution; OPNET
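The abstract reports prediction quality as an average relative error but does not define it. A common reading, assumed here, is the mean of |predicted - measured| / |measured| over all scenarios. The sketch below uses invented bandwidth figures and does not reproduce the paper's data.

```python
def average_relative_error(predicted, measured):
    """Mean of |predicted - measured| / |measured| (assumed definition)."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) / abs(m) for p, m in zip(predicted, measured)) / len(measured)


# Hypothetical bandwidth consumption in Mbit/s for three what-if scenarios.
simulated = [102.0, 196.0, 309.0]
observed = [100.0, 200.0, 300.0]
print(f"{average_relative_error(simulated, observed):.1%}")
```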