To discover personalized document structure that reflects user preferences, the preferences are captured through a limited number of instance-level constraints, given as interested and uninterested key terms. A semi-supervised document clustering approach based on the latent Dirichlet allocation (LDA) model, named pLDA, is developed and guided by the user-provided key terms. A generalized Pólya urn (GPU) model is proposed to integrate the user preferences into the document clustering process, and a Gibbs sampler is investigated to infer the structure of the document collection. Experiments on real datasets were conducted to evaluate the performance of pLDA, and the results demonstrate that the approach is effective.
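The generalized Pólya urn idea mentioned above can be illustrated with a minimal sketch. This shows only the generic urn scheme (drawing one color also adds balls of related colors), not the pLDA sampler itself, which the abstract does not specify. The function name `gpu_draw`, the `promotion` table, and the example terms are hypothetical.

```python
import random


def gpu_draw(urn, promotion, rng):
    """Draw one color from the urn, then reinforce it and its related colors.

    urn:       dict mapping color -> current (possibly fractional) ball count
    promotion: dict mapping a drawn color -> {color: amount added to the urn}
               (hypothetical stand-in for user-preference knowledge)
    rng:       random.Random instance, passed in for reproducibility
    """
    colors = list(urn)
    weights = [urn[c] for c in colors]
    drawn = rng.choices(colors, weights=weights, k=1)[0]
    # A simple Polya urn adds one extra ball of the drawn color only;
    # the generalized urn may also add balls of other, related colors.
    for color, extra in promotion.get(drawn, {drawn: 1}).items():
        urn[color] = urn.get(color, 0) + extra
    return drawn


# Assumed example: drawing "battery" also promotes the related term "charger".
example_urn = {"battery": 1}
example_promotion = {"battery": {"battery": 1, "charger": 0.5}}
example_drawn = gpu_draw(example_urn, example_promotion, random.Random(0))
```

In a topic model, the urn counts would play the role of topic-word counts, so promoting related terms makes them more likely to share a topic. How pLDA derives its promotion amounts from the interested and uninterested key terms is not stated here.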
This paper presents an unsupervised approach that clusters product reviews collected from Amazon and then generates labels for each cluster. Instead of using complete reviews, the approach splits each review into sentences and uses all of the sentences as input for clustering. Hierarchical Agglomerative Clustering (HAC) is used to cluster the sentences. Cluster labeling is also unsupervised: three different methods are used to find a limited number of essential words for each cluster, the extracted words are used to construct phrases, and the phrases serve as the cluster labels. The labeling results are compared with a baseline labeling method, and in the evaluation all the labeling methods outperform the baseline. The aim of this research is cluster labeling that produces a set of labels describing the content of a cluster and distinguishing it from the labels of other clusters.
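The sentence-level pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: it uses bag-of-words cosine similarity with average linkage for HAC, and plain word frequency as a stand-in for the three essential-word methods, which the abstract does not name. All function names, the stopword list, and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = {"is", "the", "too", "a", "and"}  # assumed, tiny list for illustration


def tokenize(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hac(sentences, num_clusters):
    """Average-linkage agglomerative clustering; returns lists of sentence indices."""
    vectors = [Counter(tokenize(s)) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > num_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = avg_sim(clusters[x], clusters[y])
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the most similar pair
        del clusters[y]
    return clusters


def essential_words(sentences, cluster, k=2):
    """Frequency-based stand-in for essential-word extraction."""
    counts = Counter(
        t for i in cluster for t in tokenize(sentences[i]) if t not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(k)]


sample = [
    "Battery life is great",
    "Great battery life overall",
    "Screen is too dim",
    "Dim screen hurts",
]
sample_clusters = hac(sample, num_clusters=2)
sample_labels = [essential_words(sample, c) for c in sample_clusters]
```

The pairwise search is cubic in the number of sentences, which is acceptable for a sketch but would need a library implementation for a real review corpus.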
This paper proposes a novel multi-frame track-before-detect algorithm based on root label clustering, which reduces the high computational complexity caused by observation area expansion and increased clutter/noise density. A track extrapolation criterion is used to construct the state transition set; root labels are marked using the state transition set to obtain the distribution of multiple targets in the measurement space; the multi-frame measurement plots are then divided into several clusters; and finally the multi-frame track-before-detect algorithm is run within each cluster. Employing the proposed algorithm reduces the computational complexity. Simulation results show that the proposed algorithm can accurately detect multiple targets in close proximity and reduce the number of false tracks.
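One way to picture the clustering step is a union-find pass over plots in consecutive frames. The sketch below is an assumption-heavy illustration, not the paper's algorithm: it assumes the extrapolation criterion is a simple maximum-speed gate, and it takes the earliest plot in each connected group as the "root label". The function name `cluster_plots` and its parameters are hypothetical.

```python
import math


def cluster_plots(frames, max_speed, dt=1.0):
    """Group measurement plots that are reachable from one another across frames.

    frames:    list of frames, each a list of (x, y) measurement plots
    max_speed: assumed upper bound on target speed (defines the gate)
    dt:        assumed time between consecutive frames
    Returns a dict mapping (frame_index, plot_index) -> root (frame_index, plot_index).
    """
    parent = {
        (f, i): (f, i) for f, plots in enumerate(frames) for i in range(len(plots))
    }

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the earliest plot as the root so labels point back in time.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    gate = max_speed * dt
    for f in range(len(frames) - 1):
        for i, p in enumerate(frames[f]):
            for j, q in enumerate(frames[f + 1]):
                if math.dist(p, q) <= gate:  # feasible state transition
                    union((f, i), (f + 1, j))
    return {node: find(node) for node in parent}


# Assumed example: two well-separated targets observed over three frames.
example_frames = [
    [(0.0, 0.0), (100.0, 100.0)],
    [(1.0, 0.0), (101.0, 100.0)],
    [(2.0, 0.0), (102.0, 100.0)],
]
example_labels = cluster_plots(example_frames, max_speed=2.0)
```

Each group of plots sharing a root could then be handed to a separate track-before-detect run, which is where the reduction in search space would come from.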
Funding (pLDA document clustering paper): National Natural Science Foundations of China (Nos. 61262006, 61462011, 61202089); the Major Applied Basic Research Program of Guizhou Province Project, China (No. JZ20142001); the Science and Technology Foundation of Guizhou Province Project, China (No. LH20147636); the National Research Foundation for the Doctoral Program of Higher Education of China (No. 20125201120006); the Graduate Innovated Foundations of Guizhou University Project, China (No. 2015012).
Funding (multi-frame track-before-detect paper): supported by the Innovation Project of Science and Technology Commission of the Central Military Commission, China (No. 19-HXXX-01-ZD-006-XXX-XX).