1,607 articles found
Detecting network intrusions by data mining and variable-length sequence pattern matching (Cited by: 2)
1
Authors: Tian Xinguang, Duan Miyi, Sun Chunlai, Liu Xin. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD, 2009, No. 2, pp. 405-411 (7 pages)
Anomaly detection has been an active research topic in the field of network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses a data mining technique to model the normal behavior of a privileged program and uses a variable-length pattern matching algorithm to compare the current behavior with historic normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish between normalities and intrusions. The method attends to both computational efficiency and detection accuracy and is especially applicable to on-line detection. The performance of the method is evaluated using a typical testing data set, and the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The novel method has been applied to practical host-based intrusion detection systems and has achieved high detection performance.
Keywords: intrusion detection; anomaly detection; system call; data mining; variable-length pattern
Pattern recognition and data mining software based on artificial neural networks applied to proton transfer in aqueous environments (Cited by: 2)
2
Authors: Amani Tahat, Jordi Marti, Ali Khwaldeh, Kaher Tahat. 《Chinese Physics B》 SCIE EI CAS CSCD, 2014, No. 4, pp. 410-421 (12 pages)
In computational physics, proton transfer phenomena can be viewed as pattern classification problems based on a set of input features allowing classification of the proton motion into two categories: transfer 'occurred' and transfer 'not occurred'. The goal of this paper is to evaluate the use of artificial neural networks in the classification of proton transfer events, based on the feed-forward back propagation neural network, used as a classifier to distinguish between the two transfer cases. In this paper, we use a newly developed data mining and pattern recognition tool for automating, controlling, and drawing charts of the output data of an existing Empirical Valence Bond code. The study analyzes the need for pattern recognition in aqueous proton transfer processes and how the learning approach in error back propagation (multilayer perceptron algorithms) can be satisfactorily employed in the present case. We present a tool for pattern recognition and validate the code with a real physical case study. The results of applying the artificial neural network methodology to crowd patterns based upon selected physical properties (e.g., temperature, density) show the ability of the network to learn proton transfer patterns corresponding to properties of the aqueous environments, which is in turn shown to be fully compatible with previous proton transfer studies.
Keywords: pattern recognition; proton transfer; chart pattern; data mining; artificial neural network; empirical valence bond
Feature Selection with Optimal Stacked Sparse Autoencoder for Data Mining (Cited by: 4)
3
Authors: Manar Ahmed Hamza, Siwar Ben Haj Hassine, Ibrahim Abunadi, Fahd N. Al-Wesabi, Hadeel Alsolai, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 8, pp. 2581-2596 (16 pages)
Data mining in the educational field can be used to optimize teaching and learning performance among students. Recently developed machine learning (ML) and deep learning (DL) approaches can be utilized to mine the data effectively. This study proposes an Improved Sailfish Optimizer-based Feature Selection with Optimal Stacked Sparse Autoencoder (ISOFS-OSSAE) for data mining and pattern recognition in the educational sector. The proposed ISOFS-OSSAE model aims to mine educational data and derive decisions based on the feature selection and classification process. The ISOFS-OSSAE model involves the design of the ISOFS technique to choose an optimal subset of features. In addition, swallow swarm optimization (SSO) with the SSAE model is derived to perform the classification process. To showcase the enhanced outcomes of the ISOFS-OSSAE model, a wide range of experiments were conducted on a benchmark dataset from the University of California Irvine (UCI) Machine Learning Repository. The simulation results pointed to the improved classification performance of the ISOFS-OSSAE model over recent state-of-the-art approaches in terms of different performance measures.
Keywords: data mining; pattern recognition; feature selection; data classification; SSAE model
Mining Software Repository for Cleaning Bugs Using Data Mining Technique (Cited by: 1)
4
Authors: Nasir Mahmood, Yaser Hafeez, Khalid Iqbal, Shariq Hussain, Muhammad Aqib, Muhammad Jamal, Oh-Young Song. 《Computers, Materials & Continua》 SCIE EI, 2021, No. 10, pp. 873-893 (21 pages)
Despite advances in technological complexity and efforts, software repository maintenance requires reusing the data to reduce the effort and complexity. However, increasing ambiguity, irrelevance, and bugs while extracting similar data during software development generate a large amount of data from those data that reside in repositories. Thus, there is a need for a repository mining technique for relevant and bug-free data prediction. This paper proposes a fault prediction approach using a data-mining technique to find good predictors for high-quality software. To predict errors in mining data, the Apriori algorithm was used to discover association rules by fixing confidence at more than 40% and support at no less than 30%. The pruning strategy was adopted based on evaluation measures. Next, the rules were extracted from three projects of different domains; the extracted rules were then combined to obtain the most popular rules based on the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study to compare the proposed rules with existing ones using four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can utilize these rules for defect prediction during early software development.
Keywords: fault prediction; association rule; data mining; frequent pattern mining
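The thresholds quoted in the abstract above (support of at least 30%, confidence above 40%) can be illustrated with a minimal level-wise, Apriori-style sketch. This is not the authors' implementation: the function names, the transaction format, and the attribute labels in the usage note are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search. Returns {frozenset: support}."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    current = [frozenset([item]) for item in {i for t in tx for i in t}]
    result = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the support threshold.
        level = {}
        for cand in current:
            support = sum(1 for t in tx if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-itemsets into (k+1)-candidates; prune any candidate
        # that has an infrequent k-subset (downward closure).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return result


def association_rules(freq, min_confidence):
    """Return (lhs, rhs, support, confidence) with confidence > min_confidence."""
    rules = []
    for itemset, support in freq.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = support / freq[lhs]
                if confidence > min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules
```

A call such as `association_rules(frequent_itemsets(transactions, 0.30), 0.40)` mirrors the two thresholds, where `transactions` is a list of attribute sets (for example, hypothetical labels like `"high_churn"` or `"bug"` attached to a module).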
Application of Data Mining Method to Improve the Accuracy of Springback Prediction in Sheet Metal Forming
5
Authors: 许京荆, 张志伟, 吴益敏. 《Journal of Shanghai University(English Edition)》 CAS, 2004, No. 3, pp. 348-353 (6 pages)
A new method was worked out to improve the precision of springback prediction in sheet metal forming by combining the finite element method (FEM) with the data mining (DM) technique. First, the genetic algorithm (GA) was adopted for recognizing the material parameters. Then, according to the even design idea, a suitable calculation scheme was determined, and FEM was used for calculating the springback. The computation results were compared with experimental data, the difference between them was taken as source data, and a new pattern recognition method of DM called the hierarchical optimal map recognition method (HOMR) was applied to summarize the calculation regularity in FEM. Finally, the mathematical model of the springback simulation was established. Based on the model, the calculation errors of springback can be controlled within 10% compared with the experimental results.
Keywords: springback prediction; pattern recognition; genetic algorithm; FEM; even design idea; HOMR; data mining
COVID-19 Related Research by Data Mining in Single Cell Transcriptome Profiles
6
Authors: Zi-Wei Wang, Chi-Chang Chang, Quan Zou. 《Journal of Electronic Science and Technology》 CAS CSCD, 2021, No. 1, pp. 1-5 (5 pages)
The outbreak of coronavirus disease 2019 (COVID-19) has drawn public attention all over the world. As a newly emerging area, single cell sequencing also exerts its power in the battle against the epidemic. In this review, the up-to-date knowledge of COVID-19 and its receptor is summarized, followed by a collection of work mining single cell transcriptome profiling data for information on the vulnerable cell types in humans and the potential mechanisms of the disease.
Keywords: coronavirus disease 2019 (COVID-19); bioinformatics; data mining; single cell sequencing
Study on Yan-Xin Wang’s medication experience and regularity in treating Insomnia from the liver based on data mining
7
Authors: Yan-Yan Chen, Yan-Xin Wang, Man Wen, Zheng Yu, Jing-Jing Wu. 《TMR Integrative Medicine》, 2023, No. 17, pp. 1-9 (9 pages)
Background: Professor Yan-Xin Wang has been committed to the use of traditional Chinese medicine formulas to treat insomnia from the liver for many years, and has achieved excellent clinical results. In order to better inherit Yan-Xin Wang’s academic thought, this study uses clinical data to explore the clinical experience of Prof. Yan-Xin Wang in the application of Chinese medicine to treat insomnia patients from the liver, explore the compatibility and medication rules of traditional Chinese medicine, and give more clinical treatment ideas for insomnia. Methods: The general data and prescription information of insomnia patients treated with Chinese herbal medicine by Prof. Yan-Xin Wang from January 1, 2021, to December 31, 2021, were summarized according to the inclusion and exclusion criteria, and the data were subjected to frequency statistics, drug association rule analysis, complex network diagram analysis, and cluster analysis. Results: A total of 159 patients and prescriptions were included in the study, of which 81.1% were women and 18.9% were men, containing 128 herbs; the highest frequency of use was 91.8% for Bupleuri Radix. Six Chinese herbs were used more than 70% of the time, namely Bupleuri Radix, Scutellariae Radix, oyster shell, Glycyrrhizae Radix, Os Draconis, and Ziziphi Spinosae Semen. The top 20 herbs in terms of frequency of use were analyzed in terms of the four Qi, five flavours, and their attributions. The four Qi were mainly calm and warm, the five flavours were mainly bitter and acrid, followed by sweet, and the attributions were mainly to the liver, spleen, and heart meridians. The Chinese medicine association rules set the confidence level > 80% and the support level > 10%, resulting in 10 two-herb and three-herb associations with the highest confidence level, such as Os Draconis is associated with oyster shell, Platycodonis Radix is associated with Achyranthis Bidentatae Radix, Scutellariae Radix is associated with Bupleuri Radix, Os Draconis, Bupleuri Radix is associated with oyster shell, Os Draconis, Scutellariae Radix is associated with oyster shell, etc. Cluster analysis yielded 3 classes of drug formulas. The complex network diagram shows that the core prescription drugs are composed of Bupleuri Radix, Chuanxiong Rhizoma, Pseudostellariae Radix, Jujubae Fructus, Os Draconis, Coptidis Rhizoma, Scutellariae Radix, Ziziphi Spinosae Semen, Smilacis Glabrae Rhizoma, Cinnamomi Cortex, White Moutan Cortex, Atractylodis Rhizoma, Glycyrrhizae Radix, oyster shell, Rehmanniae Radix Praeparata, Tritici Levis Fructus, Pinelliae Rhizoma Praeparatum, and Cinnamomi Ramulus. Conclusion: Prof. Yan-Xin Wang believes that the main treatment for patients with insomnia is based on the liver, by tonifying the deficiency and supporting the righteousness, mutually regulating the liver and spleen, and calming the mind and nourishing the heart, while adding and subtracting appropriate herbs according to the patient’s co-morbidities, which can significantly improve the patient’s insomnia symptoms.
Keywords: data mining; insomnia; liver-based treatment; medication patterns
Data Mining Based Cyber-Attack Detection
8
Authors: TIANFIELD Huaglory. 《系统仿真技术》, 2017, No. 2, pp. 90-104 (15 pages)
Detecting cyber-attacks has undoubtedly become a big data problem. This paper presents a tutorial on data mining based cyber-attack detection. First, a data-driven defence framework is presented in terms of cyber security situational awareness. Then, the process of data mining based cyber-attack detection is discussed. Next, a multi-loop learning architecture is presented for data mining based cyber-attack detection. Finally, common data mining techniques for cyber-attack detection are discussed.
Keywords: big data analytics; cyber-attack detection; cyber security; cyber situational awareness; data mining; pattern mining; machine learning
Data Mining Method for Exploring the Composition Law and Therapeutic Mechanism of Chinese medicine of macroscopic
9
Authors: Ting-Ting Chen, Ya-Bo Shi. 《Medical Data Mining》, 2021, No. 3, pp. 47-53 (7 pages)
Background: Based on the theory of "cancer toxin" in traditional Chinese medicine (TCM), combined with a data mining method, this paper discusses the prescription and medication law of macroscopic "cancer toxin". Methods: The relevant prescriptions for macroscopic "cancer toxin" were sorted out and subjected to correlation analysis. Results: The main therapeutic drugs for macroscopic "cancer toxin" were those for invigorating the spleen and Qi, regulating the Qi mechanism, eliminating toxin and pathogenic factors, eliminating dampness and swelling, promoting blood circulation and removing blood stasis, attacking toxin, softening hardness, and dredging collaterals to relieve pain. Conclusion: High-frequency medicines embody the core of the treatment of tangible macroscopic "cancer toxin", and the new prescription composition embodies the fundamental treatment. At the same time, the mechanism of macroscopic "cancer toxin" was analyzed to pave the way for further clinical practice.
Keywords: macroscopic "cancer toxin"; external prescriptions; data mining; drug use patterns; action mechanism
SWFP-Miner: an efficient algorithm for mining weighted frequent pattern over data streams
10
Authors: Wang Jie, Zeng Yu. 《High Technology Letters》 EI CAS, 2012, No. 3, pp. 289-294 (6 pages)
Previous weighted frequent pattern (WFP) mining algorithms are not suitable for data streams because they need multiple database scans. In this paper, we present an efficient algorithm, SWFP-Miner, to mine weighted frequent patterns over data streams. SWFP-Miner is based on a sliding window and can discover important frequent patterns from the recent data. A new refined weight definition is proposed to keep the downward closure property, and two pruning strategies are presented to prune the weighted infrequent patterns. Experimental studies are performed to evaluate the effectiveness and efficiency of SWFP-Miner.
Keywords: weighted frequent pattern (WFP) mining; data streams; data mining; sliding window; SWFP-Miner
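The sliding-window idea in the abstract above can be sketched as follows. The abstract does not give SWFP-Miner's refined weight definition, so this sketch substitutes a simple stand-in that is purely an assumption: a pattern's weight is the minimum weight of its items, which keeps weighted support non-increasing as a pattern grows. The class name and the item weights are hypothetical.

```python
from collections import deque


class SlidingWindowCounter:
    """Keeps the most recent `window_size` transactions of a stream."""

    def __init__(self, window_size, weights):
        # A deque with maxlen silently discards the oldest transaction.
        self.window = deque(maxlen=window_size)
        self.weights = weights  # item -> non-negative weight (assumed given)

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def weighted_support(self, pattern):
        """Stand-in definition: min item weight times support in the window."""
        pattern = frozenset(pattern)
        if not self.window or not pattern:
            return 0.0
        count = sum(1 for t in self.window if pattern <= t)
        weight = min(self.weights[item] for item in pattern)
        return weight * count / len(self.window)
```

Because both the minimum weight and the support can only shrink when items are added to a pattern, a pattern that falls below a threshold under this stand-in definition can be pruned together with all of its supersets.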
An Efficient Outlier Detection Approach on Weighted Data Stream Based on Minimal Rare Pattern Mining (Cited by: 1)
11
Authors: Saihua Cai, Ruizhi Sun, Shangbo Hao, Sicong Li, Gang Yuan. 《China Communications》 SCIE CSCD, 2019, No. 10, pp. 83-99 (17 pages)
The distance-based outlier detection method detects implied outliers by calculating the distances between points in the dataset, but its computational complexity is particularly high when processing multidimensional datasets. In addition, the traditional outlier detection method does not consider the frequency of subset occurrence; thus, the detected outliers do not fit the definition of outliers (i.e., rarely appearing). Pattern mining-based outlier detection approaches have solved this problem, but the importance of each pattern is not taken into account in the outlier detection process, so the detected outliers cannot truly reflect the actual situation. To address these problems, a two-phase minimal weighted rare pattern mining-based outlier detection approach, called MWRPM-Outlier, is proposed to effectively detect outliers on the weighted data stream. In particular, a method called MWRPM is proposed in the pattern mining phase to quickly mine the minimal weighted rare patterns, and then two deviation factors are defined in the outlier detection phase to measure the abnormal degree of each transaction on the weighted data stream. Experimental results show that the proposed MWRPM-Outlier approach has excellent performance in outlier detection and that the MWRPM approach excels in weighted rare pattern mining.
Keywords: outlier detection; weighted data stream; minimal weighted rare pattern mining; deviation factors
A Quarterly High RFM Mining Algorithm for Big Data Management
12
Authors: Cuiwei Peng, Jiahui Chen, Shicheng Wan, Guotao Xu. 《Computers, Materials & Continua》 SCIE EI, 2024, No. 9, pp. 4341-4360 (20 pages)
In today’s highly competitive retail industry, offline stores face increasing pressure on profitability. They hope to improve their ability in shelf management with the help of big data technology. For this, on-shelf availability is an essential indicator of shelf data management and closely relates to customer purchase behavior. RFM (recency, frequency, and monetary) pattern mining is a powerful tool to evaluate the value of customer behavior. However, the existing RFM pattern mining algorithms do not consider the quarterly nature of goods, resulting in unreasonable shelf availability and difficulty in profit-making. To solve this problem, we propose a quarterly RFM mining algorithm for on-shelf products named OS-RFM. Our algorithm mines the high recency, high frequency, and high monetary patterns and considers the period of the on-shelf goods in quarterly units. We conducted experiments using two real datasets for numerical and graphical analysis to prove the algorithm’s effectiveness. Compared with the state-of-the-art RFM mining algorithm, our algorithm can identify more patterns and performs well in terms of precision, recall, and F1-score, with the recall rate nearing 100%. Also, the novel algorithm operates with significantly shorter running times and more stable memory usage than existing mining algorithms. Additionally, we analyze the sales trends of products in different quarters and seasonal variations. The analysis assists businesses in maintaining reasonable on-shelf availability and achieving greater profitability.
Keywords: data mining; recency pattern; high-utility itemset; RFM pattern mining; on-shelf management
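As a rough illustration of the recency, frequency, and monetary measures named in the abstract above, grouped into quarterly units, the sketch below aggregates per-item statistics from sales records. It is not the OS-RFM algorithm, which mines itemset patterns; the record layout `(date, item, amount)` and the function names are assumptions.

```python
from datetime import date


def quarter_of(d):
    """Map a date to a (year, quarter) pair, with quarters numbered 1 to 4."""
    return d.year, (d.month - 1) // 3 + 1


def quarterly_rfm(records):
    """Aggregate (date, item, amount) records per quarter and item.

    Returns {((year, quarter), item): (last_sale_date, frequency, monetary)}.
    """
    stats = {}
    for day, item, amount in records:
        key = (quarter_of(day), item)
        last, freq, money = stats.get(key, (day, 0, 0.0))
        # Recency is tracked as the latest sale date seen in the quarter.
        stats[key] = (max(last, day), freq + 1, money + amount)
    return stats
```

Thresholds on the three values could then be applied per quarter to decide which items count as high recency, high frequency, and high monetary in that period.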
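Mining Time Pattern Association Rules in Temporal Database
13
Authors: Nguyen Dinh Thuan. 《通讯和计算机(中英文版)》, 2010, No. 3, pp. 50-56 (7 pages)
Keywords: mining association rules; time pattern; temporal database; large database; time interval; optimization technique; verification algorithm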
A Fast Interactive Sequential Pattern Mining Algorithm (Cited by: 1)
14
Authors: LU Jie-Ping, LIU Yue-bo, NI wei-wei, LIU Tong-ming, SUN Zhi-hui. 《Wuhan University Journal of Natural Sciences》 EI CAS, 2006, No. 1, pp. 31-36 (6 pages)
To reduce the computational and space complexity of rerunning sequential pattern queries, this paper proposes a fast interactive sequential pattern mining algorithm (FISP) based on sequential patterns and projection databases. In FISP, the number of frequent items in the projection databases, which are constructed during mining on the basis of previously mined sequences, is reduced. Furthermore, the number of iterations of the algorithm is greatly reduced by using a global threshold. Experimental results show that FISP outperforms PrefixSpan in interactive mining.
Keywords: data mining; sequential patterns; interactive mining; projection database
A New Algorithm for Mining Frequent Pattern (Cited by: 2)
15
Authors: 李力, 靳蕃. 《Journal of Southwest Jiaotong University(English Edition)》, 2002, No. 1, pp. 10-20 (11 pages)
Mining frequent patterns in transaction databases, time series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that can generate frequent patterns without candidate sets. Based on an analysis of the FP-growth algorithm, this paper proposes the concept of an equivalent FP-tree and an improved algorithm, denoted FP-growth*, which is much faster and easy to implement. FP-growth* adopts a modified structure of the FP-tree and header table, and only generates a header table in each recursive operation and projects the tree onto the original FP-tree. The two algorithms obtain the same frequent pattern set from the same transaction database, but the performance study shows that the improved algorithm, FP-growth*, is at least twice as fast as FP-growth.
Keywords: data mining; algorithm; frequent pattern set; FP-growth
Improved Pattern Tree for Incremental Frequent-Pattern Mining (Cited by: 1)
16
Authors: 周明, 王太勇. 《Transactions of Tianjin University》 EI CAS, 2010, No. 2, pp. 129-134 (6 pages)
By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It first stored transactions in a lexicographic order tree and then restructured the tree by sorting each path in frequency-descending order. While updating the improved pattern tree, there was no need to rescan the entire new database or reconstruct a new tree for incremental updating. A test was performed on the synthetic dataset T10I4D100K with 100 000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the greater the speed advantage of the improved pattern tree over CanTree for all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime decreased from 452.71 s to 186.26 s. Meanwhile, the runtime required by CanTree decreased from 1 367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern trees and the reconstruction of the initial tree. The experimental results showed that the runtime was reduced by about 15% compared with that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
Keywords: data mining; association rules; improved pattern tree; incremental mining
Mining Maximal Frequent Patterns in a Unidirectional FP-tree (Cited by: 1)
17
Authors: 宋晶晶, 刘瑞新, 王艳, 姜保庆. 《Journal of Donghua University(English Edition)》 EI CAS, 2006, No. 6, pp. 105-109 (5 pages)
Because mining the complete set of frequent patterns from a dense database can be impractical, an interesting alternative has been proposed recently. Instead of mining the complete set of frequent patterns, the new model finds only the maximal frequent patterns, from which all frequent patterns can be generated. The FP-growth algorithm is one of the most efficient frequent-pattern mining methods published so far. However, because the FP-tree and conditional FP-trees must be two-way traversable, a great deal of memory is needed in the mining process. This paper proposes an efficient algorithm, Unid_FP-Max, for mining maximal frequent patterns based on a unidirectional FP-tree. Because of the way the unidirectional FP-tree and conditional unidirectional FP-trees are generated, the algorithm reduces space consumption to the fullest extent. With the development of two techniques, single path pruning and header table pruning, which can cut down many of the conditional unidirectional FP-trees generated recursively in the mining process, Unid_FP-Max further lowers the cost in time and space.
Keywords: data mining; frequent pattern; the maximal frequent pattern; Unid_FP-tree; conditional Unid_FP-tree
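The statement in the abstract above that maximal frequent patterns can generate all frequent patterns can be checked with a small sketch. It works on an already mined list of frequent itemsets and has nothing to do with the Unid_FP-Max tree structures; both function names are hypothetical.

```python
from itertools import combinations


def maximal_patterns(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    sets = [frozenset(s) for s in frequent]
    return {s for s in sets if not any(s < other for other in sets)}


def all_subpatterns(maximal):
    """Regenerate every non-empty subset of each maximal pattern."""
    out = set()
    for pattern in maximal:
        for size in range(1, len(pattern) + 1):
            out.update(frozenset(c) for c in combinations(pattern, size))
    return out
```

By downward closure every non-empty subset of a frequent pattern is itself frequent, so `all_subpatterns(maximal_patterns(frequent))` recovers the full set of frequent patterns, although not their individual support counts.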
Quantum Algorithm for Mining Frequent Patterns for Association Rule Mining
18
Authors: Abdirahman Alasow, Marek Perkowski. 《Journal of Quantum Information Science》 CAS, 2023, No. 1, pp. 1-23 (23 pages)
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly reduces the number of qubits required by the design, which directly benefits performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Keywords: data mining; association rule mining; frequent pattern; Apriori algorithm; quantum counter; quantum comparator; Grover’s search algorithm
Temporal pattern mining from user-generated content (Cited by: 1)
19
Authors: Adnan Ali, Jinlong Li, Huanhuan Chen, Ali Kashif Bashir. 《Digital Communications and Networks》 SCIE CSCD, 2022, No. 6, pp. 1027-1039 (13 pages)
Faster internet, IoT, and social media have reformed the conventional web into a collaborative web, resulting in enormous user-generated content. Several studies are focused on such content; however, they mainly focus on textual data, thus undermining the importance of metadata. Considering this gap, we provide a temporal pattern mining framework to model and utilize user-generated content's metadata. First, we scrape 2.1 million tweets from Twitter between Nov-2020 and Sep-2021 about 100 hashtag keywords and represent these tweets as 100 User-Tweet-Hashtag (UTH) dynamic graphs. Second, we extract and identify four time-series in three timespans (Day, Hour, and Minute) from the UTH dynamic graphs. Lastly, we model these four time-series with three machine learning algorithms to mine temporal patterns with accuracies of 95.89%, 93.17%, 90.97%, and 93.73%, respectively. We demonstrate that user-generated content's metadata contains valuable information, which helps to understand the users' collective behavior and can be beneficial for business and research. Dataset and codes are publicly available; the link is given in the dataset section.
Keywords: social media analysis; collaborative computing; social data; Twitter data; temporal patterns mining; dynamic graphs
An Overview of Data Mining and Knowledge Discovery (Cited by: 8)
20
Authors: 范建华, 李德毅. 《Journal of Computer Science & Technology》 SCIE EI CSCD, 1998, No. 4, pp. 348-368 (21 pages)
With massive amounts of data stored in databases, mining information and knowledge in databases has become an important issue in recent research. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information providing services, such as data warehousing and on-line services over the Internet, also call for various data mining and knowledge discovery techniques to understand user behavior better, to improve the service provided, and to increase business opportunities. In response to such a demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems as well. In conclusion, this article also lists some problems and challenges for further research.
Keywords: knowledge discovery in databases; data mining; machine learning; association rule; classification; data clustering; data generalization; pattern searching