Funding: National Natural Science Foundations of China (Nos. 61262006, 61462011, 61202089); the Major Applied Basic Research Program of Guizhou Province Project, China (No. JZ20142001); the Science and Technology Foundation of Guizhou Province Project, China (No. LH20147636); the National Research Foundation for the Doctoral Program of Higher Education of China (No. 20125201120006); the Graduate Innovated Foundations of Guizhou University Project, China (No. 2015012)
Abstract: To discover a personalized document structure that reflects user preferences, the preferences were captured as a limited number of instance-level constraints, expressed as key terms of interest and key terms of no interest. A semi-supervised document clustering approach based on the latent Dirichlet allocation (LDA) model, namely pLDA, was developed and guided by the user-provided key terms. A generalized Polya urn (GPU) model was proposed to integrate the user preferences into the document clustering process. A Gibbs sampler was investigated to infer the structure of the document collection. Experiments on real datasets were conducted to explore the performance of pLDA. The results demonstrate that the pLDA approach is effective.
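Illustrative sketch (not taken from the paper): the abstract names three ingredients, namely LDA, a generalized Polya urn (GPU) model, and a Gibbs sampler, without giving their exact form. The Python below shows one common way these fit together, as an assumption-laden toy rather than the authors' pLDA. It is a collapsed Gibbs sampler for plain LDA in which assigning a word to a topic also adds fractional counts for related words. The function name `gpu_lda_gibbs`, the `promote` table, and all hyperparameter values are hypothetical. How pLDA actually encodes interested and uninterested key terms is not stated in the abstract and is not reproduced here.

```python
import random


def gpu_lda_gibbs(docs, n_topics, vocab_size, promote=None,
                  alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA with optional GPU-style promotion.

    docs:    list of documents, each a list of integer word ids.
    promote: hypothetical table {word: {related_word: weight}}; when `word`
             is assigned to a topic, each related word gains `weight` there.
             With promote=None this reduces to standard LDA.
    Returns (topic_word_mass, doc_topic_counts).
    """
    rng = random.Random(seed)
    promote = promote or {}
    n_kw = [[0.0] * vocab_size for _ in range(n_topics)]  # topic-word mass
    n_k = [0.0] * n_topics                                 # total mass per topic
    n_dk = [[0] * n_topics for _ in docs]                  # doc-topic counts
    z = []                                                 # topic of each token

    def update(k, w, sign):
        # Add (sign=+1) or remove (sign=-1) a token and its promoted words.
        n_kw[k][w] += sign
        n_k[k] += sign
        for w2, weight in promote.get(w, {}).items():
            n_kw[k][w2] += sign * weight
            n_k[k] += sign * weight

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            update(k, w, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                update(k, w, -1)
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                update(k, w, +1)
    return n_kw, n_dk


# Toy usage: word ids 0 and 1 are treated as related key terms (made up).
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1], [3, 2, 2]]
toy_promote = {0: {1: 0.5}, 1: {0: 0.5}}
topic_word, doc_topic = gpu_lda_gibbs(toy_docs, 2, 4, promote=toy_promote)
print(doc_topic)  # per-document topic counts; a document's cluster is its largest entry
```

Removing a token subtracts exactly the mass it added, which treats each token as if it were the last one drawn. This is a standard approximation for GPU models, since the urn scheme is not exchangeable.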
Funding: The National Marine Ecoenvironment Assessment Program of State Oceanic Administration
Abstract: Hangzhou Bay (HZB) and Xiangshan Bay (XSB), located in northern Zhejiang Province and connected to the East China Sea (ECS), have been considerably affected by water quality degradation. In this study, we analyzed the physical and biogeochemical properties of water quality using multivariate statistical techniques. Hierarchical cluster analysis (HCA) grouped HZB and XSB into two subareas with different pollution sources on the basis of similar physical and biogeochemical properties. Principal component analysis (PCA) identified three latent pollution sources in HZB and XSB, respectively, and emphasized the importance of terrestrial inputs, coastal industries, and natural processes in determining the water quality of the two bays. Proper measures to protect the aquatic ecoenvironment in HZB and XSB are therefore urgently needed.
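Illustrative sketch (not taken from the study): the abstract applies HCA to group sampling areas and PCA to find latent sources, but does not state the variables, the distance measure, or the linkage used. The Python below shows the general workflow on made-up numbers. The three columns (salinity, nitrate, suspended solids), the station values, Euclidean distance, and average linkage are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def standardize(rows):
    """Z-score each column so variables with different units are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def hca(rows, n_clusters):
    """Agglomerative clustering: average linkage on Euclidean distance."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(math.dist(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        p, q = min(((p, q) for p in range(len(clusters))
                    for q in range(p + 1, len(clusters))),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        merged = clusters.pop(q)  # q > p, so index p stays valid
        clusters[p].extend(merged)
    return [sorted(c) for c in clusters]


def first_component(rows, n_iter=200):
    """Leading principal component by power iteration on the covariance.

    Returns (loadings, share of total variance explained). The start vector
    of ones would fail only if it were orthogonal to the leading component.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    cov = [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
            for b in range(m)] for a in range(m)]
    v = [1.0] * m
    for _ in range(n_iter):
        v = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(m))
                     for a in range(m))
    return v, eigenvalue / sum(cov[a][a] for a in range(m))


# Made-up station data: [salinity, nitrate, suspended solids].
stations = [
    [12.0, 1.8, 310.0], [13.5, 1.6, 280.0], [11.0, 1.9, 330.0],
    [24.0, 0.6, 60.0], [25.5, 0.5, 45.0], [23.0, 0.7, 70.0],
]
scaled = standardize(stations)
print(hca(scaled, 2))            # station indices grouped into two subareas
print(first_component(scaled))   # loadings and explained-variance share
```

Standardizing first matters because the variables have different units. Without it, the column with the largest numeric range would dominate both the distances and the components.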