Funding: Supported by the Sichuan Science and Technology Program (Nos. 2019YFG0507, 2020YFG0328, and 2021YFG0018); the National Natural Science Foundation of China (NSFC) under Grant No. U19A2059; the Young Scientists Fund of the National Natural Science Foundation of China under Grant No. 61802050; and the Fundamental Research Funds for the Central Universities (No. ZYGX2021J019).
Abstract: Digital twinning enables manufacturers to create digital representations of physical entities and thus to run virtual simulations for product development. Previous digital twinning efforts neglect decisive consumer feedback during product development and so fail to close the gap between physical and digital spaces. This work mines real-world consumer feedback from social media topics, which is significant for product development. We specifically analyze the prevalent time of a product topic, which gives insight into both consumer attention and the period during which a product is widely discussed. Most current studies treat prevalent time prediction as an accompanying task or assume a preset distribution. These solutions are therefore either biased in their objectives and underlying patterns or generalize weakly to diverse topics. To this end, this work combines deep learning and survival analysis to predict the prevalent time of topics. We propose a specialized deep survival model consisting of two modules. The first module enriches the input covariates by incorporating latent features of the time-varying text, and the second module captures the temporal pattern of a rumor with a recurrent network structure. Moreover, a specific loss function, different from those of regular survival models, is proposed to achieve more reasonable predictions. Extensive experiments on real-world datasets demonstrate that our model significantly outperforms state-of-the-art methods.
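To make the survival-analysis framing concrete, the following is a minimal sketch of a standard discrete-time survival formulation. It is not the specialized loss the abstract mentions, which is not described here. The function names, the fixed time steps, and the hazard values are all assumptions for illustration; in a deep survival model the per-step hazards would be produced by the network.

```python
import math


def survival_curve(hazards):
    """Return S(t) = prod_{k<=t} (1 - h_k) for per-step hazards h_k."""
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve


def neg_log_likelihood(hazards, step, observed):
    """Negative log-likelihood of one topic under discrete-time hazards.

    step     -- 0-based index of the step at which the topic stopped being
                prevalent (observed=True), or of the last step at which it
                was still prevalent (observed=False, i.e. right-censored).
    """
    # The topic survived every step before `step`.
    nll = -sum(math.log(1.0 - hazards[k]) for k in range(step))
    if observed:
        nll -= math.log(hazards[step])        # the event happened at `step`
    else:
        nll -= math.log(1.0 - hazards[step])  # still prevalent at `step`
    return nll


# Hypothetical hazards for three time steps of one topic.
example_hazards = [0.2, 0.3, 0.5]
print(survival_curve(example_hazards))
print(neg_log_likelihood(example_hazards, step=1, observed=True))
```

The censored branch is what distinguishes survival analysis from plain regression on duration: a topic that is still being discussed when observation ends contributes only the probability of having survived so far.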
Funding: A phased outcome of the National Social Science Fund of China (NSSFC) project "A Case Study on Urbanization in Xizang with Local Features from the Perspective of Stabilizing Border Areas and Boosting Local Economies" (22CMZ013).
Abstract: In the speech delivered at the centenary celebration of the Communist Party of China (CPC), Xi Jinping, general secretary of the CPC Central Committee, made important remarks on the “Chinese path to modernization.” This term represents the latest achievement China has scored in adapting Marxism to the Chinese context and the needs of the times. It has set the course as China embarks on a new journey of building a strong socialist country with Chinese characteristics and achieving national rejuvenation. Drawing on CiteSpace, we conducted a visualized bibliometric analysis of literature on the Chinese path to modernization by searching the CNKI database using subject terms such as “Chinese modernization,” “Chinese path to modernization,” and “Chinese-style modernization.” The findings reveal that: (a) research on the Chinese path to modernization has gone through three stages: initial establishment, pioneering exploration, and comprehensive in-depth development; (b) existing literature has covered the four key topics associated with the Chinese path to modernization, namely its essence, goal, methodology, and pioneering achievements; (c) future research may focus on building up China’s strength in agriculture, developing the digital economy, modernizing China’s system and capacity for governance, and establishing a unique socialist discourse system for Chinese modernization.
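Abstract: Taking tropical medicine as an example, this study explores how the Citation Topics feature of the InCites database can be used in topic planning. Papers in tropical medicine indexed in SCIE over the past five years were selected from the Web of Science database, and Citation Topics was used to analyze, as micro-topic examples, selected research directions, regions, researchers, and institutions with high publication output or high citation counts. Under the malaria micro-topic, the most active region was the USA, the most active institution was the University of London, the most active researcher was Drakeley, Chris, and the most active publication was Malaria Journal; the United States Department of Health & Human Services provided the most funding for the malaria micro-topic. Schistosomiasis, malaria, dengue, hydatid cyst, and coronavirus are the research priorities of tropical medicine in China, and papers on the coronavirus, schistosomiasis, Cryptosporidium, dengue, and malaria micro-topics have relatively high impact. The main research directions of the researcher Zhou, Xiao-Nong are schistosomiasis, malaria, and hydatid cyst, with papers on the malaria, dengue, rotavirus, Toxocara canis, and Lyme disease topics showing high quality and attention. The research priorities of the University of London in tropical medicine are malaria, schistosomiasis, and dengue. The Citation Topics feature in InCites supports finer-grained analysis of research topics, researchers, institutions, and countries/regions, which helps editors of scientific journals plan topics more efficiently.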
Funding: An outcome of the project "Study on the Recognition Method of Innovative Evolving Trajectory based on Topic Correlation Analysis of Science and Technology" (No. 71704170), supported by the National Natural Science Foundation of China; the project "Study on Regularity and Dynamics of Knowledge Diffusion among Scientific Disciplines" (No. 71704063), supported by the National Natural Science Foundation of China; and the Youth Innovation Promotion Association, CAS (Grant No. 2016159).
Abstract: Purpose: Formal concept analysis (FCA) and concept lattice theory (CLT) are introduced for constructing a network of interdisciplinary research (IDR) topics and for evaluating their effectiveness for knowledge structure exploration. Design/methodology/approach: We introduced the theory and applications of FCA and CLT, and then proposed a method for interdisciplinary knowledge discovery based on CLT. For the empirical analysis, we used IDR topics in Information & Library Science (LIS) and Medical Informatics, and in LIS and Geography-Physical. Subsequently, we carried out a comparative analysis with two other IDR topic recognition methods. Findings: The CLT approach is suitable for IDR topic identification and prediction. Research limitations: IDR topic recognition based on CLT is not sensitive to the interdisciplinarity of topic terms, since the data can only reflect whether there is a relationship between the discipline and the topic terms. Moreover, CLT cannot clearly represent a large number of concepts. Practical implications: A deeper understanding of the IDR topics was obtained as the structural and hierarchical relationships between them were identified, which can help to identify and predict IDR topics more precisely. Originality/value: IDR topic identification based on CLT has performed well, and this theory has several advantages for identifying and predicting IDR topics. First, in a concept lattice, there is a partial order relation between interconnected nodes, and consequently, a complete concept lattice can present hierarchical properties. Second, clustering analysis of IDR topics based on concept lattices can yield clusters that highlight the essential knowledge features and help display the semantic relationship between different IDR topics. Furthermore, the Hasse diagram automatically displays all the IDR topics associated with the different disciplines, thus forming clusters of specific concepts and visually retaining and presenting the associations of IDR topics through multiple inheritance relationships between the concepts.
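As a rough illustration of how a concept lattice arises from a binary topic-discipline relation, the sketch below enumerates the formal concepts of a tiny context by brute force. The topic terms and discipline labels are invented for the example, and the enumeration over attribute subsets is only practical for very small contexts; it is not the method used in the paper.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    context -- dict mapping each object (topic term) to its set of
               attributes (disciplines it is related to).
    """
    objects = set(context)
    attributes = set().union(*context.values()) if context else set()

    def extent(attrs):
        # All objects that have every attribute in `attrs`.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # All attributes shared by every object in `objs`.
        if not objs:
            return frozenset(attributes)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    concepts = set()
    for size in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), size):
            ext = extent(set(attrs))
            concepts.add((ext, intent(ext)))  # closure of the attribute set
    return concepts


# Hypothetical context: topic terms and the disciplines they appear in.
toy_context = {
    "text mining": {"LIS", "Medical Informatics"},
    "GIS": {"LIS", "Geography-Physical"},
    "bibliometrics": {"LIS"},
}
for ext, itt in sorted(formal_concepts(toy_context), key=lambda c: -len(c[0])):
    print(sorted(ext), sorted(itt))
```

Ordering the concepts by inclusion of their extents gives the partial order that the Hasse diagram draws. The binary relation also shows the limitation noted in the abstract: it records only whether a term is related to a discipline, not how strongly.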
Funding: Supported by the National Social Science Foundation of China (Grant No. 14BTQ030).
Abstract: Purpose: In this paper, we combined co-word analysis with alluvial diagrams to detect hot topics and illustrate their dynamics. Design/methodology/approach: Articles in the field of scientometrics were chosen as research cases in this study. A time-sliced co-word network was generated and then clustered. Afterwards, we generated an alluvial diagram to show dynamic changes of hot topics, including their merges and splits over time. Findings: After analyzing the dynamic changes in the field of scientometrics from 2011 to 2015, we found that two clusters being merged did not mean that the old topics had disappeared and a totally new one had emerged. The topics were possibly still active the following year, but the newer topics had drawn more attention. The changes of hot topics reflected the shift in researchers' interests. Research topics in scientometrics were constantly subdivided and re-merged. For example, a cluster involving "industry" was divided into several topics as research progressed. Research limitations: When examining longer time periods, we encounter the problem of dealing with bigger data sets. Analyzing data year by year would be tedious, but if we combine, e.g., two years into one time slice, important details would be missed. Practical implications: This method can be applied to any research field to illustrate the dynamics of hot topics. It can indicate promising directions for researchers and provide guidance to decision makers. Originality/value: The use of alluvial diagrams is a distinctive and meaningful approach to detecting hot topics and especially to illustrating their dynamics.
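The first step of the approach, building a time-sliced co-word network, can be sketched as below. The record format and the sample keywords are assumptions for illustration; the clustering of each slice and the alluvial diagram itself are not covered.

```python
from collections import Counter, defaultdict
from itertools import combinations


def coword_by_year(records):
    """Count keyword co-occurrences separately for each publication year.

    records -- iterable of (year, keywords) pairs, one per article.
    Returns {year: Counter mapping (kw_a, kw_b) to co-occurrence count},
    with each pair stored in alphabetical order.
    """
    slices = defaultdict(Counter)
    for year, keywords in records:
        # Each unordered pair of distinct keywords is one edge observation.
        for pair in combinations(sorted(set(keywords)), 2):
            slices[year][pair] += 1
    return dict(slices)


# Hypothetical article records.
sample = [
    (2011, ["h-index", "citation"]),
    (2011, ["citation", "h-index", "industry"]),
    (2012, ["industry", "patent"]),
]
networks = coword_by_year(sample)
print(networks[2011].most_common())
```

Each yearly counter is a weighted edge list. Merging two years into one slice would simply add their counters, which is the trade-off the abstract describes: fewer slices to analyze, but changes within the merged slice become invisible.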
Funding: National Natural Science Foundation of China, Grant 71673131.
Abstract: Purpose: To reveal the research hotspots of, and the relationships among, three hot research topics in biomedicine, namely CRISPR, iPS (induced Pluripotent Stem) cells, and synthetic biology. Design/methodology/approach: We built keyword co-occurrence networks for the three topics, using three indicators and information visualization for metric analysis. Findings: The results reveal that the main research hotspots of the three topics are different, but the overlapping keywords indicate that the topics are mutually integrated and interact with each other. Research limitations: All analyses are based on keywords only. Practical implications: We try to find the information distribution and structure of these three hot topics in order to reveal their research status and interactions and to promote biomedical development. Originality/value: We chose the core keywords of three hot research topics in biomedicine by using the h-index.
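The abstract does not say exactly how the h-index was applied to keywords. One common reading, assumed here, is that the core set consists of the top h keywords such that each of them occurs at least h times. The sketch below implements that reading; the function name and the keyword frequencies are invented.

```python
def h_core(keyword_freq):
    """Return the h-core of a keyword frequency table.

    keyword_freq -- dict mapping keyword to its occurrence count.
    The h-core is the largest set of h keywords that each occur
    at least h times (ties broken alphabetically).
    """
    ranked = sorted(keyword_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    h = 0
    for rank, (_, freq) in enumerate(ranked, start=1):
        if freq >= rank:
            h = rank
        else:
            break
    return [kw for kw, _ in ranked[:h]]


# Hypothetical keyword counts for one topic.
freq = {"CRISPR": 10, "Cas9": 7, "genome editing": 3, "off-target": 2, "gRNA": 1}
print(h_core(freq))
```

With these counts, the third-ranked keyword still occurs at least three times but the fourth occurs fewer than four times, so the core has three keywords.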
Abstract: Promoting physical activity and health through active video games, Volume 6, Issue 1, 2017, Guest editor: Zan Gao. Hamstring muscle strain injury, Volume 6, Issues 2-3, 2017, Guest editors: Bing Yu, Li Li. Physical activity, fitness, and obesity in Chinese school-aged children and adolescents: An update, Volume 6, Issue 4, 2017, Guest editors: Fuzhong Li.
Abstract: Fields of particular interest to JSHS include (but are not limited to): · Sport and exercise medicine · Injury prevention and clinical rehabilitation · Sport and exercise physiology · Public health promotion · Physical activity epidemiology · Biomechanics and motor
Abstract: At the first working meeting of the 'Key Technical Standard Promotion Project', an important project of the national scientific and technical support plan during the '11th Five Year' period, held recently in Beijing, the first group of 11 research topics for the project was formally introduced. According to Ms. Yu Xinli, who is in charge of the project as Deputy Director of the China National Institute of Standardization, the project comprises four aspects: the international standards key breakthrough project, the technical standards promotion project of adapting technical trade measures, the technical standards innovation project of basic public welfare, and the technical standard enhancement project of public security.
Funding: The National Natural Science Foundation of China (No. 61602159); the Natural Science Foundation of Heilongjiang Province (No. F201430); the Innovation Talents Project of the Science and Technology Bureau of Harbin (No. 2017RAQXJ094); and the fundamental research funds of universities in Heilongjiang Province, special fund of Heilongjiang University (No. HDJCCX-201608).
Abstract: Social applications such as Weibo provide a platform for quick information propagation, which leads to explosive propagation of hot topics. User sentiments about the propagated information play an important role in propagation speed and thus receive increasing attention from the data mining field. In this paper, a sentiment-based hot topic prediction model called PHT-US is proposed. PHT-US first classifies a large amount of text data in Weibo into different topics, then converts user sentiments and time factors into embedding vectors that are input into recurrent neural networks (both LSTM and GRU), and finally predicts whether the target topic could become a hot spot. Experiments on Sina Weibo show that PHT-US can effectively predict future hot topics.
Funding: The project is supported by the Research Project "95-001-04-04-02".
Abstract: In this article, the present state of cotton production in Xinjiang is introduced. The strong points and problems in cotton production are discussed in detail. In addition, cotton research advances are reviewed in a comprehensive manner. In consideration of all this, the authors expound some monographic topics and disciplinary projects regarding Xinjiang cotton research which are to be implemented in the years to come.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA011005).
Abstract: Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics that experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness by explicitly modeling each topic's burst states with a first-order Markov chain and using the chain to generate the topic proportions of documents in a Logistic Normal fashion. A Gibbs sampling algorithm is developed for the posterior inference of the proposed model. Experimental results on a news data set show that our model can efficiently discover bursty topics, outperforming the state-of-the-art method.
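The burst-state component can be pictured as a two-state first-order Markov chain over time slices. The sketch below only scores a given burst-state sequence under such a chain. The state encoding (0 = normal, 1 = bursty) and the probabilities are assumptions, and the Logistic Normal link to topic proportions and the Gibbs sampler are not reproduced.

```python
import math


def sequence_log_prob(states, initial, transition):
    """Log-probability of a state sequence under a first-order Markov chain.

    states     -- list of state indices, one per time slice
    initial    -- initial[s] is P(first state = s)
    transition -- transition[a][b] is P(next state = b | current state = a)
    """
    logp = math.log(initial[states[0]])
    for prev, cur in zip(states, states[1:]):
        logp += math.log(transition[prev][cur])
    return logp


# Hypothetical parameters: bursts are rare to enter but tend to persist.
initial = [0.9, 0.1]
transition = [[0.8, 0.2],
              [0.4, 0.6]]
burst_path = [0, 1, 1, 0]  # normal, bursty, bursty, normal
print(math.exp(sequence_log_prob(burst_path, initial, transition)))
```

The first-order assumption means the burst state of a slice depends only on the previous slice, which is what lets a sampler update one slice at a time using its two neighbors.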
Funding: This work is supported by grants from the National 973 Project (No. 2013CB29606), the Natural Science Foundation of China (No. 61202244), and the research fund of ShangQiu Normal College (No. 2013GGJS013). The NIPS corpus is supported by SourceForge. We thank the anonymous reviewers for their helpful comments.
Abstract: The problem of "rich topics get richer" (RTGR) is common in topic models and leads to a wrong topic distribution if the distributing process is not intervened in. In the standard LDA (Latent Dirichlet Allocation) model, every word in all the documents has the same statistical weight. In fact, words have different impacts on different topics. Guided by this idea, we extend ILDA (Infinite LDA) by considering the bias role of words in dividing the topics. We propose a self-adaptive topic model specifically to overcome the RTGR problem. The proposed model addresses three issues: (1) the number of topics changes with the collection of documents, which suits dynamic data; (2) words have discriminating attributes with respect to the topic distribution; (3) a self-adaptive method is used to realize automatic re-sampling. To verify our model, we designed a topic evolution analysis system that provides the following functions: topic classification in each cycle, topic correlation between adjacent cycles, and strength calculation of the sub-topics in order. Experiments on both the NIPS corpus and our self-built news collections showed that the system met the given demands and that the results were feasible.