Journal literature: 6,667 articles found
Automatic Clustering of User Behaviour Profiles for Web Recommendation System
1
Authors: S. Sadesh, Osamah Ibrahim Khalaf, Mohammad Shorfuzzaman, Abdulmajeed Alsufyani, K. Sangeetha, Mueen Uddin. Intelligent Automation & Soft Computing (SCIE), 2023, No. 3, pp. 3365-3384 (20 pages)
Web usage mining, content mining, and structure mining comprise the web mining process. Web-Page Recommendation (WPR) development by incorporating Data Mining Techniques (DMT) did not include end-users with improved performance in the obtained filtering results. The cluster user profile-based clustering process is delayed when it has a low precision rate. Markov Chain Monte Carlo-Dynamic Clustering (MC2-DC) is based on the User Behavior Profile (UBP) model group's similar user behavior on a dynamic update of UBP. The Reversible-Jump Concept (RJC) reviews the history with updated UBP and moves to appropriate clusters. Hamilton's Filtering Framework (HFF) is designed to filter user data based on personalised information on automatically updated UBP through the Search Engine (SE). The Hamilton Filtered Regime Switching User Query Probability (HFRSUQP) works forward the updated UBP for easy and accurate filtering of users' interests and improves WPR. A Probabilistic User Result Feature Ranking based on Gaussian Distribution (PURFR-GD) has been developed to rank user results in a web mining process. PURFR-GD decreases the delay time in the end-to-end workflow for SE personalization in various methods by using the Gaussian Distribution Function (GDF). The theoretical analysis and experiment results of the proposed MC2-DC method automatically increase the updated UBP accuracy by 18.78%. HFRSUQP enabled extensive Maximize Log-Likelihood (ML-L) increases to 15.28% of User Personalized Information Search Retrieval Rate (UPISRT). For feature ranking, the PURFR-GD model defines higher Classification Accuracy (CA) and Precision Ratio (PR) while utilising minimum Execution Time (ET). Furthermore, UPISRT's ranking performance has improved by 20%.
Keywords: data mining; web mining process; search engine; web-page recommendation; accuracy
Incremental Web Usage Mining Based on Active Ant Colony Clustering
2
Authors: SHEN Jie, LIN Ying, CHEN Zhimin. Wuhan University Journal of Natural Sciences (CAS), 2006, No. 5, pp. 1081-1085 (5 pages)
To alleviate the scalability problem caused by increasing Web usage and changing user interests, this paper presents a novel Web usage mining algorithm: an incremental Web usage mining algorithm based on Active Ant Colony Clustering. Firstly, an active movement strategy for direction selection and speed, different from the positive strategy employed by other ant colony clustering algorithms, is proposed to construct an Active Ant Colony Clustering algorithm, which avoids idle moves and the "flying over the plane" moving phenomenon and effectively improves the quality and speed of clustering on large datasets. Then a mechanism for decomposing clusters based on the above methods is introduced to form new clusters when users' interests change. Empirical studies on a real Web dataset show that the active ant colony clustering algorithm performs better than previous algorithms, and that the incremental approach based on the proposed mechanism can efficiently implement incremental Web usage mining.
Keywords: data mining; web; ant colony algorithm; database
Fully Automated Density-Based Clustering Method
3
Authors: Bilal Bataineh, Ahmad A. Alzahrani. Computers, Materials & Continua (SCIE, EI), 2023, No. 8, pp. 1833-1851 (19 pages)
Cluster analysis is a crucial technique in unsupervised machine learning, pattern recognition, and data analysis. However, current clustering algorithms suffer from the need for manual determination of parameter values, low accuracy, and inconsistent performance concerning data size and structure. To address these challenges, a novel clustering algorithm called the fully automated density-based clustering method (FADBC) is proposed. The FADBC method consists of two stages: parameter selection and cluster extraction. In the first stage, a proposed method extracts optimal parameters for the dataset, including the epsilon size and a minimum number of points thresholds. These parameters are then used in a density-based technique to scan each point in the dataset and evaluate neighborhood densities to find clusters. The proposed method was evaluated on different benchmark datasets and metrics, and the experimental results demonstrate its competitive performance without requiring manual inputs. The results show that the FADBC method outperforms well-known clustering methods such as the agglomerative hierarchical method, k-means, spectral clustering, DBSCAN, FCDCSD, Gaussian mixtures, and density-based spatial clustering methods. It can handle any kind of data set well and perform excellently.
Keywords: automated clustering; data mining; density-based clustering; unsupervised machine learning
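The two-stage idea in the abstract above (pick the density parameters from the data, then run a density-based scan) can be illustrated with a small sketch. This is not the FADBC algorithm, whose parameter-selection rule the abstract does not give. It assumes a common k-distance heuristic for epsilon (the median distance to the k-th nearest neighbour, times a hypothetical `slack` factor) and uses plain DBSCAN for the extraction stage. The names `k_distance_eps`, `dbscan`, and `slack` are illustrative.

```python
import math
import statistics


def k_distance_eps(points, k, slack=1.1):
    """Assumed heuristic: eps = slack * median distance to the k-th nearest neighbour."""
    kth = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        kth.append(dists[k])  # dists[0] is the point itself (distance 0)
    return slack * statistics.median(kth)


def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(reachable)
    return labels


# Two small blobs and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
k = 2
eps = k_distance_eps(points, k)
print(dbscan(points, eps, min_pts=k + 1))
```

Using the median rather than the mean keeps a single far-away point from inflating epsilon; other selection rules would slot into `k_distance_eps` without changing the scan.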
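Research on an Intelligent, Personalized Distance Education Model Based on Web Mining (cited by: 30)
4
Authors: Wang Qijun (汪启军), Shen Ruimin (申瑞民). Computer Engineering (《计算机工程》) (CAS, CSCD, PKU Core), 2000, No. 12, pp. 157-159 (3 pages)
This paper proposes a new distance education model based on Web mining that makes full use of the information accumulated on a site to better support distance teaching.
Keywords: distance education; intelligent; personalized; web; Internet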
PHISHING WEB IMAGE SEGMENTATION BASED ON IMPROVING SPECTRAL CLUSTERING (cited by: 1)
5
Authors: Li Yuancheng, Zhao Liujun, Jiao Runhai. Journal of Electronics (China), 2011, No. 1, pp. 101-107 (7 pages)
This paper proposes a novel phishing web image segmentation algorithm based on improved spectral clustering. Firstly, we construct a set of points composed of the spatial locations and gray levels of pixels from a given image. Secondly, the data is clustered in the spectral space of the similarity matrix of the point set. To avoid the drawbacks of the K-means algorithm in the conventional spectral clustering method, which is sensitive to the initial clustering centroids and converges to a locally optimal solution, we introduce the clone operator and Cauchy mutation to enlarge the scale of clustering centers, and a quantum-inspired evolutionary algorithm to find the globally optimal clustering centroids. Compared with phishing web image segmentation based on K-means, experimental results show that the segmentation performance of our method is much improved. Moreover, our method can converge to the globally optimal solution and is more accurate in phishing web segmentation.
Keywords: spectral clustering algorithm; clonal mutation; Quantum-inspired Evolutionary Algorithm (QEA); phishing web image segmentation
A UNIFIED EXTENDING METHOD FOR CONTENT-IGNORANT WEB PAGE CLUSTERING
6
Authors: Shi Lin, Chen Chen. Journal of Electronics (China), 2010, No. 1, pp. 105-112 (8 pages)
The content-ignorant clustering method has advantages in time complexity and space complexity over content-based methods. In this paper, the authors introduce a unified expanding method for content-ignorant web page clustering by mining the "click-through" log, which tries to solve the problem that the "click-through" log is sparse. The relationship between two nodes that have been expanded is also defined and optimized. Analysis and experiment show that the new method performs better than the standard content-ignorant method. The new method can also work without iterative clustering.
Keywords: web data mining; clustering; content-ignorant clustering
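Research on Personalized Distance Teaching Based on Web Mining (cited by: 1)
7
Author: Yu Xianhu (余先虎). Journal of Ningbo Radio & TV University (《宁波广播电视大学学报》), 2006, No. 2, pp. 70-72 (3 pages)
This paper first analyzes the characteristics of modern distance teaching and proposes a personalized distance teaching model based on Web mining. It then analyzes the development of Web mining technology and looks ahead to its application prospects in distance teaching.
Keywords: Web mining; distance teaching; personalized learning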
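Recommender Systems Based on Web Mining (cited by: 2)
8
Authors: Tang Zhe (唐哲), Ding Eryu (丁二玉), Luo Bin (骆斌), Chen Shifu (陈世福). Computer Science (《计算机科学》) (CSCD, PKU Core), 2005, No. 12, pp. 193-196 (4 pages)
Recommender systems are used by e-commerce sites to give customers information that helps them choose products. The basic idea is to use statistical results or records of a customer's past behavior to infer the customer's likely future behavior and make corresponding recommendations. This paper briefly surveys recommender systems based on traditional techniques and on Web mining techniques, describes the workflow of a recommender system based on Web mining, and focuses on analyzing the specific Web mining techniques applied to recommender systems and comparing their algorithms.
Keywords: recommender system; Web mining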
Campus Economic Analysis Based on K-Means Clustering and Hotspot Mining
9
Authors: Xiuzhang Yang, Shuai Wu, Huan Xia, Yuanbo Li, Xin Li. Review of Educational Theory, 2020, No. 2, pp. 42-50 (9 pages)
With the advent of the era of big data and the development and construction of smart campuses, the campus is gradually moving towards digitalization, networking and informationization. The campus card is an important part of the construction of a smart campus, and the massive data it generates can indirectly reflect the living conditions of students at school. In the face of the campus card, how to quickly and accurately obtain the information required by users from the massive data sets has become an urgent problem that needs to be solved. This paper proposes a data mining algorithm based on K-Means clustering and time series. It analyzes the consumption data of a college student's card to deeply mine and analyze the daily life consumer behavior habits of students, and to make an accurate judgment on the specific life consumer behavior. The algorithm proposed in this paper provides a practical reference for the construction of smart campuses in universities, and has important theoretical and application values.
Keywords: machine learning; K-Means clustering; data mining; consumer behavior; campus economy; economic regionalization
A Chinese Web Page Clustering Algorithm Based on the Suffix Tree (cited by: 4)
10
Author: YANG Jian-wu. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 817-822 (6 pages)
In this paper, an improved algorithm, named STC-I, is proposed for Chinese Web page clustering based on Chinese language characteristics, which adopts a new unit choice principle and a novel suffix tree construction policy. The experimental results show that the new algorithm keeps the advantages of STC and is better than STC in precision and speed when they are used to cluster Chinese Web pages.
Keywords: clustering algorithm; web; data mining; suffix tree; STC-I; web search; Chinese home pages
The design and implementation of web mining in web sites security (cited by: 2)
11
Authors: LI Jian, ZHANG Guo-yin, GU Guo-chang, LI Jian-li (College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China). Journal of Marine Science and Application, 2003, No. 1, pp. 81-86 (6 pages)
The backdoor or information leak of Web servers can be detected by using Web Mining techniques on some abnormal Web log and Web application log data. The security of Web servers can be enhanced and the damage of illegal access can be avoided. Firstly, a system for discovering the patterns of information leakage in CGI scripts from Web log data was proposed. Secondly, those patterns were provided to system administrators so that they could modify their code and enhance their Web site security. The following aspects were described: one is to combine the web application log with the web log to extract more information, so that web data mining can be used to mine the web log for information that the firewall and Information Detection System cannot find. Another approach is to propose an operation module of the web site to enhance Web site security. In cluster server sessions, a Density-Based Clustering technique is used to reduce resource cost and obtain better efficiency.
Keywords: web; network security; data mining; computer network; logical reasoning
A State-of-the-Art Survey on Semantic Web Mining (cited by: 1)
12
Authors: Qudamah K. Quboa, Mohamad Saraee. Intelligent Information Management, 2013, No. 1, pp. 10-17 (8 pages)
The integration of the two fast-developing scientific research areas Semantic Web and Web Mining is known as Semantic Web Mining. The huge increase in the amount of Semantic Web data has made it a perfect target for many researchers to apply Data Mining techniques to. This paper gives a detailed state-of-the-art survey of on-going research in this new area. It shows the positive effects of Semantic Web Mining and the obstacles faced by researchers, and proposes a number of approaches to deal with the very complex and heterogeneous information and knowledge produced by the technologies of the Semantic Web.
Keywords: Web mining; Semantic Web; data mining; Semantic Web mining
Fuzzy Clustering Method for Web User Based on Pages Classification (cited by: 2)
13
Authors: ZHAN Li-qiang, LIU Da-xin. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 553-556 (4 pages)
A new method for fuzzy clustering of Web users based on analysis of user interest characteristics is proposed in this article. The method first defines fuzzy page categories according to the links on the index page of the site, then computes the fuzzy degree of cross pages by aggregating Web log data. After that, using the fuzzy comprehensive evaluation method, it constructs user interest vectors according to page viewing times and frequency of hits, and derives the fuzzy similarity matrix for the Web users from the interest vectors. Finally, it gets the clustering result through the fuzzy clustering method. The experimental results show the effectiveness of the method.
Keywords: fuzzy clustering method; web; page classification; data mining; fuzzy similarity matrix
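The last two steps in the abstract above (interest vectors to a fuzzy similarity matrix, then a fuzzy clustering step) can be sketched as follows. The abstract does not name the similarity measure or the clustering procedure, so this sketch assumes two textbook choices: the min/max similarity coefficient, and clustering by a lambda-cut of the max-min transitive closure. All function names and the sample vectors are hypothetical.

```python
def fuzzy_similarity(vectors):
    """Assumed measure: r_ij = sum(min(a, b)) / sum(max(a, b)) over vector components."""
    n = len(vectors)
    r = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            num = sum(min(a, b) for a, b in zip(vectors[i], vectors[j]))
            den = sum(max(a, b) for a, b in zip(vectors[i], vectors[j]))
            r[i][j] = r[j][i] = num / den if den else 1.0
    return r


def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square the relation repeatedly until it stops changing."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared


def lambda_cut_clusters(closure, lam):
    """Group users whose closure similarity is at least lam."""
    seen, clusters = set(), []
    for i in range(len(closure)):
        if i in seen:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters


# Hypothetical interest vectors: one row per user, one column per page category.
users = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.8],
]
closure = transitive_closure(fuzzy_similarity(users))
print(lambda_cut_clusters(closure, lam=0.5))
```

Raising `lam` splits the users into finer groups and lowering it merges them, so the threshold plays the role that the number of clusters plays in k-means.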
Applied Approaches of Rough Set Theory to Web Mining (cited by: 1)
14
Authors: Sun Tieli (孙铁利), Jiao Weiwei (教巍巍). Journal of Donghua University (English Edition) (EI, CAS), 2006, No. 6, pp. 117-120 (4 pages)
Rough set theory is a new soft computing tool that has received much attention from researchers around the world. It can deal with incomplete and uncertain information, and it has been applied successfully in many areas. This paper introduces the basic concepts of rough sets and discusses their applications in Web mining. In particular, some applications of rough set theory to intelligent information processing are emphasized.
Keywords: web data; data mining; data processing; computer technology
Parallel Web Mining System Based on Cloud Platform (cited by: 1)
15
Authors: Shengmei Luo, Qing He, Lixia Liu, Xiang Ao, Ning Li, Fuzhen Zhuang. ZTE Communications, 2012, No. 4, pp. 45-53 (9 pages)
Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms to be able to handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients. Recently, it has aroused much interest in industry and academia. Most previous works on cloud platforms only focus on the parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables variable real-world web-mining applications for mass data.
Keywords: Web mining; parallel algorithm; computing platform; mining system; machine-learning algorithm; application; real world; structured data
Visualization of Special Features in “The Tale of Genji” by Text Mining and Correspondence Analysis with Clustering
16
Authors: Hisako Hosoi, Takayuki Yamagata, Yuya Ikarashi, Nobuyuki Fujisawa. Journal of Flow Control, Measurement & Visualization, 2014, No. 1, pp. 1-6 (6 pages)
In this paper, visualization of special features in "The Tale of Genji", a representative work of Japanese classical literature, is studied by text mining the auxiliary verbs and examining the similarity in sentence style by correspondence analysis with clustering. The result shows that the text mining error in the number of auxiliary verbs can be as small as 15%. The features extracted in this study support the multiple authorship of "The Tale of Genji", which agrees well with the result by Murakami and Imanishi [1]. It is also found that the extracted features are robust to the text mining error, which suggests that the classification error is little affected by the text mining error and that this technique could be used for further statistical study of classical literature.
Keywords: visualization; scientific art; The Tale of Genji; text mining; correspondence analysis; clustering
Mining Profitability of Telecommunication Customers Using K-Means Clustering
17
Authors: Hasitha Indika Arumawadu, R. M. Kapila Tharanga Rathnayaka, S. K. Illangarathne. Journal of Data Analysis and Information Processing, 2015, No. 3, pp. 63-71 (9 pages)
Data mining is a powerful technique that can be widely used for discovering customers' behaviors and preferences. As a result, it is widely used today in top-level companies for evaluating their Customer Relationship Management (CRM) systems. In this study, a new K-means clustering method is proposed to evaluate the profitability of customer clusters in the telecommunication industry in Sri Lanka. The RFM model is used as the main input variable for K-means clustering, and the distortion curve is used to identify the optimal number of initial clusters. Based on the results, the profitability of telecommunication customers in Sri Lanka is mainly categorized into three levels.
Keywords: K-means clustering; data mining; RFM model; Customer Relationship Management
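The pipeline named in the abstract above (RFM features as input to K-means, with a distortion curve to choose the number of clusters) can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's method: the customer records are invented, features are min-max scaled, and centroids are seeded with a deterministic farthest-point rule so the run is reproducible. The names `min_max_scale` and `kmeans` are hypothetical.

```python
import math


def min_max_scale(rows):
    """Scale every column to [0, 1] so no single RFM feature dominates the distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(row, lo, hi)) for row in rows]


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with farthest-point seeding (an assumption made for determinism)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))

    def assign(cents):
        return [min(range(k), key=lambda c: math.dist(p, cents[c])) for p in points]

    for _ in range(iters):
        labels = assign(centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to lose all its members.
            updated.append(tuple(sum(x) / len(members) for x in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(centroids)
    distortion = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, distortion


# Invented customers: (recency in days, frequency of purchases, monetary value).
rfm = [(5, 40, 900), (7, 38, 950), (3, 42, 1000),
       (30, 15, 300), (35, 12, 280), (28, 14, 320),
       (90, 2, 20), (100, 1, 10), (95, 3, 30)]
scaled = min_max_scale(rfm)

# Distortion curve: look for the k after which the drop flattens out (the "elbow").
for k in range(1, 5):
    _, _, distortion = kmeans(scaled, k)
    print(k, round(distortion, 4))
```

On data like this the curve falls steeply up to k = 3 and then flattens, which is the pattern the elbow reading relies on; real customer data rarely separates this cleanly.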
Web Fuzzy Clustering and a Case Study
18
Authors: LIU Mao-fu, HE Jing, HE Yan-xiang, HU Hui-jun. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 4, pp. 411-414 (4 pages)
We combine web usage mining and fuzzy clustering to define the concept of web fuzzy clustering, and then put forward a web fuzzy clustering processing model, which is discussed in detail. Web fuzzy clustering can be used for clustering web users and web pages. Finally, a case study is given, and the result demonstrates the feasibility of using web fuzzy clustering for web page clustering.
Keywords: Web mining; Web usage mining; Web fuzzy clustering; data object; web page clustering
Research of Web Documents Clustering Based on Dynamic Concept
19
Authors: WANG Yun-hua, CHEN Shi-hong. Wuhan University Journal of Natural Sciences (EI, CAS), 2004, No. 5, pp. 547-552 (6 pages)
Conceptual clustering is mainly used to address the deficiency and incompleteness of domain knowledge. Based on conceptual clustering technology and aiming at the institutional framework and characteristics of Web theme information, this paper proposes and implements a dynamic conceptual clustering algorithm and a merging algorithm for Web documents, and also analyses the superior performance of the clustering algorithm in efficiency and clustering accuracy.
Keywords: web; document title; clustering algorithm; Internet; information retrieval
Two-level Hierarchical Clustering Analysis and Application
20
Authors: HU Hui-rong, WANG Zhou-jing (Department of Automation, Xiamen University, Xiamen 361005, China). Journal of Xiamen University (Natural Science) (《厦门大学学报(自然科学版)》) (CAS, CSCD, PKU Core), 2002, No. S1, pp. 283-284 (2 pages)
Hierarchical clustering analysis based on statistics is one of the most important mining algorithms, but the traditional hierarchical clustering method is based on global comparison and in practice takes in only Q clustering while ignoring R clustering, so it has limitations, especially when the number of samples and indexes is very large. Furthermore, because it ignores the association between different indexes, the clustering result is not good or true. In this paper, we present the model and the algorithm of two-level hierarchical clustering, which integrates Q clustering with R clustering. Moreover, because two-level hierarchical clustering is based on the respective clustering result of each class, the classification of the indexes directly affects the accuracy of the final clustering result, so how to classify the indexes appropriately is the chief and most difficult problem we must handle in advance. Although some of the literature has also referred to the issue of classifying the indexes, those articles classify the indexes only according to their superficial signification, which is unscientific. The reasons are as follows. First, the superficial signification of some indexes usually takes on different meanings and is easily misunderstood by different people; furthermore, this classification method seldom makes use of historical data, so the classification result is not very objective. Second, the superficial signification of some indexes does not show any meaning, so from the superficial signification alone we cannot assign them to certain classes. Third, this classification method requires users to have a high level of knowledge of the field; otherwise it is difficult for them to understand the signification of some indexes, and such knowledge is sometimes not available. So in this paper, to address this question, we first use the R clustering method to cluster the indexes, dividing p-dimensional indexes into q classes, and then adopt the two-level clustering method to get the final result. Obviously, the classification result is more objective and accurate. Moreover, after the first step, we can get the relations among the different indexes and their interactions. We can also know, under a certain class of indexes, which samples can be clustered into a class. (These semi-finished results are sometimes very useful.) The experiments also indicate the effectiveness and accuracy of the algorithms, and the result of R clustering can easily be used in later practice.
Keywords: data mining; clustering; hierarchical clustering; R clustering; Q clustering