Backdoors and information leaks in Web servers can be detected by applying Web mining techniques to abnormal Web log and Web application log data, which enhances server security and helps prevent damage from illegal access. First, a system is proposed for discovering patterns of information leakage in CGI scripts from Web log data. Second, these patterns are provided to system administrators so that they can modify their code and strengthen the security of their Web sites. Two aspects are described. One is to combine the Web application log with the Web log to extract more information, so that Web data mining can discover information that the firewall and the Information Detection System cannot find. The other is to propose an operation module for the Web site to enhance its security. In the cluster server session, a density-based clustering technique is used to reduce resource cost and obtain better efficiency.
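The abstract above names density-based clustering but gives no algorithmic detail. As a rough illustration only, the sketch below runs a minimal DBSCAN over per-session feature vectors and treats points that fall in no dense region as candidates for closer inspection. The feature choice (requests per session, distinct CGI scripts touched), the sample values, and the `eps` and `min_pts` settings are all assumptions made for this example and are not taken from the paper.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
        cluster += 1
    return labels

# Hypothetical session features: (requests in session, distinct CGI scripts hit).
sessions = [(10, 2), (11, 2), (9, 3), (10, 3), (50, 1), (51, 1), (52, 2), (200, 40)]
labels = dbscan(sessions, eps=3.0, min_pts=3)
suspicious = [s for s, label in zip(sessions, labels) if label == NOISE]
```

Here the last session sits far from both dense groups and is labelled as noise. In practice the features would need scaling and the thresholds would have to be tuned against real log data.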
The main thrust of this paper is the application of a novel data mining approach to the log of users' feedback in order to improve Web multimedia information retrieval performance. A user space model is constructed based on data mining and then integrated into the original information space model to improve the accuracy of the new information space model. It can remove clutter and irrelevant text information and helps eliminate the mismatch between the page author's expression and the user's understanding and expectation. The user space model is also used to discover the relationship between high-level and low-level features for assigning weights. The authors propose an improved Bayesian algorithm for data mining. Experiments showed that the proposed algorithm is efficient.
With the increasing popularity and complexity of Web applications and the emergence of their new characteristics, the testing and maintenance of large, complex Web applications are becoming more difficult. Web applications generally contain many pages and are used by an enormous number of users. Statistical testing is an effective way of ensuring their quality. Web usage can be accurately described by a Markov chain, which has been shown to be an ideal model for statistical software testing. The results of unit testing can be reused in later stages, which is an important strategy for bottom-up integration testing; the other improvement of the extended Markov chain model (EMM) is the introduction of an error type vector, which is treated as part of a page node. The paper also proposes an algorithm for generating test cases from usage paths. Finally, optional usage reliability evaluation methods and an incremental usability regression testing model for testing and evaluation are presented. Key words: statistical testing; evaluation for Web usability; extended Markov chain model (EMM); Web log mining; reliability evaluation. CLC number: TP311.5. Foundation item: Supported by the National Defence Research Project (No. 41315.9.2) and the National Science and Technology Plan (2001BA102A04-02-03). Biography: MAO Cheng-ying (1978-), male, Ph.D. candidate, research direction: software testing. Research direction: advanced database system, software testing, component technology and data mining.
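To make the Markov chain usage model in the abstract above more concrete, the following sketch estimates page-to-page transition probabilities from logged sessions and samples usage paths that could serve as statistical test cases. It is a plain first-order chain, not the paper's EMM: the error type vector is omitted, and the `<start>`/`<exit>` pseudo-pages, function names, and sample sessions are assumptions introduced for illustration.

```python
import random
from collections import defaultdict

START, EXIT = "<start>", "<exit>"

def build_usage_chain(sessions):
    """Estimate transition probabilities from a list of page sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for pages in sessions:
        path = [START, *pages, EXIT]
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    # Normalise each row of counts into a probability distribution.
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }

def generate_test_path(chain, rng, max_len=20):
    """Sample one usage path by walking the chain from START until EXIT."""
    page, path = START, []
    for _ in range(max_len):
        targets = chain[page]
        page = rng.choices(list(targets), weights=list(targets.values()))[0]
        if page == EXIT:
            break
        path.append(page)
    return path

# Hypothetical sessions, as they might be reconstructed from a Web log.
logged_sessions = [
    ["index", "list", "item"],
    ["index", "item"],
    ["index", "list"],
]
chain = build_usage_chain(logged_sessions)
test_path = generate_test_path(chain, random.Random(0))
```

Because paths are drawn in proportion to observed usage, frequently used page sequences are exercised more often, which is the basic idea behind usage-based statistical testing.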
A new method for fuzzy clustering of Web users based on analysis of user interest characteristics is proposed in this article. The method first defines fuzzy page categories according to the links on the index page of the site, then computes the fuzzy degree of cross pages by aggregating Web log data. Next, using the fuzzy comprehensive evaluation method, it constructs user interest vectors from page viewing times and hit frequencies, and derives the fuzzy similarity matrix of the Web users from the interest vectors. Finally, it obtains the clustering result through the fuzzy clustering method. The experimental results show the effectiveness of the method. Key words: Web log mining; fuzzy similarity matrix; fuzzy comprehensive evaluation; fuzzy clustering. CLC number: TP18; TP311; TP391. Foundation item: Supported by the Natural Science Foundation of Heilongjiang Province of China (F0304). Biography: ZHAN Li-qiang (1966-), male, Lecturer, Ph.D., research direction: theory and methods of data mining and database theory.
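The last two steps in the abstract above (similarity matrix, then fuzzy clustering) can be sketched as follows. The abstract does not say which similarity measure or clustering procedure is used, so this example assumes cosine similarity and the classical approach of taking the max-min transitive closure of the similarity matrix and cutting it at a threshold λ. The interest vectors and function names are hypothetical.

```python
import math

def cosine_similarity_matrix(vectors):
    """Fuzzy similarity matrix: reflexive and symmetric, entries in [0, 1]
    for non-negative interest vectors."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0
    n = len(vectors)
    return [[1.0 if i == j else cos(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

def transitive_closure(r):
    """Repeat max-min composition R := R o R until the matrix stops changing."""
    n = len(r)
    while True:
        squared = [[max(min(r[i][k], r[k][j]) for k in range(n)) for j in range(n)]
                   for i in range(n)]
        if squared == r:
            return r
        r = squared

def lambda_cut_clusters(r, lam):
    """Group users whose closure similarity is at least lam."""
    clusters, seen = [], set()
    for i in range(len(r)):
        if i in seen:
            continue
        group = [j for j in range(len(r)) if r[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters

# Hypothetical interest vectors: one row per user, one column per page category.
interest = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.8],
]
closure = transitive_closure(cosine_similarity_matrix(interest))
clusters = lambda_cut_clusters(closure, lam=0.8)
```

Lowering λ merges clusters and raising it splits them, so the threshold controls how coarse the user grouping is.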