Finding clusters based on density represents a significant class of clustering algorithms.These methods can discover clusters of various shapes and sizes.The most studied algorithm in this class is theDensity-Based Sp...Finding clusters based on density represents a significant class of clustering algorithms.These methods can discover clusters of various shapes and sizes.The most studied algorithm in this class is theDensity-Based Spatial Clustering of Applications with Noise(DBSCAN).It identifies clusters by grouping the densely connected objects into one group and discarding the noise objects.It requires two input parameters:epsilon(fixed neighborhood radius)and MinPts(the lowest number of objects in epsilon).However,it can’t handle clusters of various densities since it uses a global value for epsilon.This article proposes an adaptation of the DBSCAN method so it can discover clusters of varied densities besides reducing the required number of input parameters to only one.Only user input in the proposed method is the MinPts.Epsilon on the other hand,is computed automatically based on statistical information of the dataset.The proposed method finds the core distance for each object in the dataset,takes the average of these distances as the first value of epsilon,and finds the clusters satisfying this density level.The remaining unclustered objects will be clustered using a new value of epsilon that equals the average core distances of unclustered objects.This process continues until all objects have been clustered or the remaining unclustered objects are less than 0.006 of the dataset’s size.The proposed method requires MinPts only as an input parameter because epsilon is computed from data.Benchmark datasets were used to evaluate the effectiveness of the proposed method that produced promising results.Practical experiments demonstrate that the outstanding ability of the proposed method to detect clusters of different densities even if there is no separation between them.The accuracy of the method ranges from 92%to 100%for the experimented datasets.展开更多
Cluster analysis is a crucial technique in unsupervised machine learning,pattern recognition,and data analysis.However,current clustering algorithms suffer from the need for manual determination of parameter values,lo...Cluster analysis is a crucial technique in unsupervised machine learning,pattern recognition,and data analysis.However,current clustering algorithms suffer from the need for manual determination of parameter values,low accuracy,and inconsistent performance concerning data size and structure.To address these challenges,a novel clustering algorithm called the fully automated density-based clustering method(FADBC)is proposed.The FADBC method consists of two stages:parameter selection and cluster extraction.In the first stage,a proposed method extracts optimal parameters for the dataset,including the epsilon size and a minimum number of points thresholds.These parameters are then used in a density-based technique to scan each point in the dataset and evaluate neighborhood densities to find clusters.The proposed method was evaluated on different benchmark datasets andmetrics,and the experimental results demonstrate its competitive performance without requiring manual inputs.The results show that the FADBC method outperforms well-known clustering methods such as the agglomerative hierarchical method,k-means,spectral clustering,DBSCAN,FCDCSD,Gaussian mixtures,and density-based spatial clustering methods.It can handle any kind of data set well and perform excellently.展开更多
With the development of global position system(GPS),wireless technology and location aware services,it is possible to collect a large quantity of trajectory data.In the field of data mining for moving objects,the pr...With the development of global position system(GPS),wireless technology and location aware services,it is possible to collect a large quantity of trajectory data.In the field of data mining for moving objects,the problem of anomaly detection is a hot topic.Based on the development of anomalous trajectory detection of moving objects,this paper introduces the classical trajectory outlier detection(TRAOD) algorithm,and then proposes a density-based trajectory outlier detection(DBTOD) algorithm,which compensates the disadvantages of the TRAOD algorithm that it is unable to detect anomalous defects when the trajectory is local and dense.The results of employing the proposed algorithm to Elk1993 and Deer1995 datasets are also presented,which show the effectiveness of the algorithm.展开更多
Overlapping community detection in a network is a challenging issue which attracts lots of attention in recent years.A notion of hesitant node(HN) is proposed. An HN contacts with multiple communities while the comm...Overlapping community detection in a network is a challenging issue which attracts lots of attention in recent years.A notion of hesitant node(HN) is proposed. An HN contacts with multiple communities while the communications are not strong or even accidental, thus the HN holds an implicit community structure.However, HNs are not rare in the real world network. It is important to identify them because they can be efficient hubs which form the overlapping portions of communities or simple attached nodes to some communities. Current approaches have difficulties in identifying and clustering HNs. A density-based rough set model(DBRSM) is proposed by combining the virtue of densitybased algorithms and rough set models. It incorporates the macro perspective of the community structure of the whole network and the micro perspective of the local information held by HNs, which would facilitate the further "growth" of HNs in community. We offer a theoretical support for this model from the point of strength of the trust path. The experiments on the real-world and synthetic datasets show the practical significance of analyzing and clustering the HNs based on DBRSM. Besides, the clustering based on DBRSM promotes the modularity optimization.展开更多
Clustering evolving data streams is important to be performed in a limited time with a reasonable quality. The existing micro clustering based methods do not consider the distribution of data points inside the micro c...Clustering evolving data streams is important to be performed in a limited time with a reasonable quality. The existing micro clustering based methods do not consider the distribution of data points inside the micro cluster. We propose LeaDen-Stream (Leader Density-based clustering algorithm over evolving data Stream), a density-based clustering algorithm using leader clustering. The algorithm is based on a two-phase clustering. The online phase selects the proper mini-micro or micro-cluster leaders based on the distribution of data points in the micro clusters. Then, the leader centers are sent to the offline phase to form final clusters. In LeaDen-Stream, by carefully choosing between two kinds of micro leaders, we decrease time complexity of the clustering while maintaining the cluster quality. A pruning strategy is also used to filter out real data from noise by introducing dense and sparse mini-micro and micro-cluster leaders. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.展开更多
Complex industry processes often need multiple operation modes to meet the change of production conditions. In the same mode,there are discrete samples belonging to this mode. Therefore,it is important to consider the...Complex industry processes often need multiple operation modes to meet the change of production conditions. In the same mode,there are discrete samples belonging to this mode. Therefore,it is important to consider the samples which are sparse in the mode.To solve this issue,a new approach called density-based support vector data description( DBSVDD) is proposed. In this article,an algorithm using Gaussian mixture model( GMM) with the DBSVDD technique is proposed for process monitoring. The GMM method is used to obtain the center of each mode and determine the number of the modes. Considering the complexity of the data distribution and discrete samples in monitoring process,the DBSVDD is utilized for process monitoring. Finally,the validity and effectiveness of the DBSVDD method are illustrated through the Tennessee Eastman( TE) process.展开更多
Human-human interaction recognition is crucial in computer vision fields like surveillance,human-computer interaction,and social robotics.It enhances systems’ability to interpret and respond to human behavior precise...Human-human interaction recognition is crucial in computer vision fields like surveillance,human-computer interaction,and social robotics.It enhances systems’ability to interpret and respond to human behavior precisely.This research focuses on recognizing human interaction behaviors using a static image,which is challenging due to the complexity of diverse actions.The overall purpose of this study is to develop a robust and accurate system for human interaction recognition.This research presents a novel image-based human interaction recognition method using a Hidden Markov Model(HMM).The technique employs hue,saturation,and intensity(HSI)color transformation to enhance colors in video frames,making them more vibrant and visually appealing,especially in low-contrast or washed-out scenes.Gaussian filters reduce noise and smooth imperfections followed by silhouette extraction using a statistical method.Feature extraction uses the features from Accelerated Segment Test(FAST),Oriented FAST,and Rotated BRIEF(ORB)techniques.The application of Quadratic Discriminant Analysis(QDA)for feature fusion and discrimination enables high-dimensional data to be effectively analyzed,thus further enhancing the classification process.It ensures that the final features loaded into the HMM classifier accurately represent the relevant human activities.The impressive accuracy rates of 93%and 94.6%achieved in the BIT-Interaction and UT-Interaction datasets respectively,highlight the success and reliability of the proposed technique.The proposed approach addresses challenges in various domains by focusing on frame improvement,silhouette and feature extraction,feature fusion,and HMM classification.This enhances data quality,accuracy,adaptability,reliability,and reduction of errors.展开更多
基金The author extends his appreciation to theDeputyship forResearch&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the project number(IFPSAU-2021/01/17758).
文摘Finding clusters based on density represents a significant class of clustering algorithms.These methods can discover clusters of various shapes and sizes.The most studied algorithm in this class is theDensity-Based Spatial Clustering of Applications with Noise(DBSCAN).It identifies clusters by grouping the densely connected objects into one group and discarding the noise objects.It requires two input parameters:epsilon(fixed neighborhood radius)and MinPts(the lowest number of objects in epsilon).However,it can’t handle clusters of various densities since it uses a global value for epsilon.This article proposes an adaptation of the DBSCAN method so it can discover clusters of varied densities besides reducing the required number of input parameters to only one.Only user input in the proposed method is the MinPts.Epsilon on the other hand,is computed automatically based on statistical information of the dataset.The proposed method finds the core distance for each object in the dataset,takes the average of these distances as the first value of epsilon,and finds the clusters satisfying this density level.The remaining unclustered objects will be clustered using a new value of epsilon that equals the average core distances of unclustered objects.This process continues until all objects have been clustered or the remaining unclustered objects are less than 0.006 of the dataset’s size.The proposed method requires MinPts only as an input parameter because epsilon is computed from data.Benchmark datasets were used to evaluate the effectiveness of the proposed method that produced promising results.Practical experiments demonstrate that the outstanding ability of the proposed method to detect clusters of different densities even if there is no separation between them.The accuracy of the method ranges from 92%to 100%for the experimented datasets.
基金the Deanship of Scientific Research at Umm Al-Qura University,Grant Code:(23UQU4361009DSR001).
文摘Cluster analysis is a crucial technique in unsupervised machine learning,pattern recognition,and data analysis.However,current clustering algorithms suffer from the need for manual determination of parameter values,low accuracy,and inconsistent performance concerning data size and structure.To address these challenges,a novel clustering algorithm called the fully automated density-based clustering method(FADBC)is proposed.The FADBC method consists of two stages:parameter selection and cluster extraction.In the first stage,a proposed method extracts optimal parameters for the dataset,including the epsilon size and a minimum number of points thresholds.These parameters are then used in a density-based technique to scan each point in the dataset and evaluate neighborhood densities to find clusters.The proposed method was evaluated on different benchmark datasets andmetrics,and the experimental results demonstrate its competitive performance without requiring manual inputs.The results show that the FADBC method outperforms well-known clustering methods such as the agglomerative hierarchical method,k-means,spectral clustering,DBSCAN,FCDCSD,Gaussian mixtures,and density-based spatial clustering methods.It can handle any kind of data set well and perform excellently.
基金supported by the Aeronautical Science Foundation of China(20111052010)the Jiangsu Graduates Innovation Project (CXZZ120163)+1 种基金the "333" Project of Jiangsu Provincethe Qing Lan Project of Jiangsu Province
文摘With the development of global position system(GPS),wireless technology and location aware services,it is possible to collect a large quantity of trajectory data.In the field of data mining for moving objects,the problem of anomaly detection is a hot topic.Based on the development of anomalous trajectory detection of moving objects,this paper introduces the classical trajectory outlier detection(TRAOD) algorithm,and then proposes a density-based trajectory outlier detection(DBTOD) algorithm,which compensates the disadvantages of the TRAOD algorithm that it is unable to detect anomalous defects when the trajectory is local and dense.The results of employing the proposed algorithm to Elk1993 and Deer1995 datasets are also presented,which show the effectiveness of the algorithm.
基金supported by the National Natural Science Foundation of China(71271018)
文摘Overlapping community detection in a network is a challenging issue which attracts lots of attention in recent years.A notion of hesitant node(HN) is proposed. An HN contacts with multiple communities while the communications are not strong or even accidental, thus the HN holds an implicit community structure.However, HNs are not rare in the real world network. It is important to identify them because they can be efficient hubs which form the overlapping portions of communities or simple attached nodes to some communities. Current approaches have difficulties in identifying and clustering HNs. A density-based rough set model(DBRSM) is proposed by combining the virtue of densitybased algorithms and rough set models. It incorporates the macro perspective of the community structure of the whole network and the micro perspective of the local information held by HNs, which would facilitate the further "growth" of HNs in community. We offer a theoretical support for this model from the point of strength of the trust path. The experiments on the real-world and synthetic datasets show the practical significance of analyzing and clustering the HNs based on DBRSM. Besides, the clustering based on DBRSM promotes the modularity optimization.
文摘Clustering evolving data streams is important to be performed in a limited time with a reasonable quality. The existing micro clustering based methods do not consider the distribution of data points inside the micro cluster. We propose LeaDen-Stream (Leader Density-based clustering algorithm over evolving data Stream), a density-based clustering algorithm using leader clustering. The algorithm is based on a two-phase clustering. The online phase selects the proper mini-micro or micro-cluster leaders based on the distribution of data points in the micro clusters. Then, the leader centers are sent to the offline phase to form final clusters. In LeaDen-Stream, by carefully choosing between two kinds of micro leaders, we decrease time complexity of the clustering while maintaining the cluster quality. A pruning strategy is also used to filter out real data from noise by introducing dense and sparse mini-micro and micro-cluster leaders. Our performance study over a number of real and synthetic data sets demonstrates the effectiveness and efficiency of our method.
基金National Natural Science Foundation of China(No.61374140)the Youth Foundation of National Natural Science Foundation of China(No.61403072)
文摘Complex industry processes often need multiple operation modes to meet the change of production conditions. In the same mode,there are discrete samples belonging to this mode. Therefore,it is important to consider the samples which are sparse in the mode.To solve this issue,a new approach called density-based support vector data description( DBSVDD) is proposed. In this article,an algorithm using Gaussian mixture model( GMM) with the DBSVDD technique is proposed for process monitoring. The GMM method is used to obtain the center of each mode and determine the number of the modes. Considering the complexity of the data distribution and discrete samples in monitoring process,the DBSVDD is utilized for process monitoring. Finally,the validity and effectiveness of the DBSVDD method are illustrated through the Tennessee Eastman( TE) process.
基金funding this work under the Research Group Funding Program Grant Code(NU/RG/SERC/12/6)supported via funding from Prince Satam bin Abdulaziz University Project Number(PSAU/2023/R/1444)+1 种基金Princess Nourah bint Abdulrahman University Researchers Supporting Project Number(PNURSP2023R348)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia,and this work was also supported by the Ministry of Science and ICT(MSIT),South Korea,through the ICT Creative Consilience Program supervised by the Institute for Information and Communications Technology Planning and Evaluation(IITP)under Grant IITP-2023-2020-0-01821.
文摘Human-human interaction recognition is crucial in computer vision fields like surveillance,human-computer interaction,and social robotics.It enhances systems’ability to interpret and respond to human behavior precisely.This research focuses on recognizing human interaction behaviors using a static image,which is challenging due to the complexity of diverse actions.The overall purpose of this study is to develop a robust and accurate system for human interaction recognition.This research presents a novel image-based human interaction recognition method using a Hidden Markov Model(HMM).The technique employs hue,saturation,and intensity(HSI)color transformation to enhance colors in video frames,making them more vibrant and visually appealing,especially in low-contrast or washed-out scenes.Gaussian filters reduce noise and smooth imperfections followed by silhouette extraction using a statistical method.Feature extraction uses the features from Accelerated Segment Test(FAST),Oriented FAST,and Rotated BRIEF(ORB)techniques.The application of Quadratic Discriminant Analysis(QDA)for feature fusion and discrimination enables high-dimensional data to be effectively analyzed,thus further enhancing the classification process.It ensures that the final features loaded into the HMM classifier accurately represent the relevant human activities.The impressive accuracy rates of 93%and 94.6%achieved in the BIT-Interaction and UT-Interaction datasets respectively,highlight the success and reliability of the proposed technique.The proposed approach addresses challenges in various domains by focusing on frame improvement,silhouette and feature extraction,feature fusion,and HMM classification.This enhances data quality,accuracy,adaptability,reliability,and reduction of errors.