Funding: This work was supported by the National High Technology Research and Development Program of China (Nos. 2010AA012505, 2011AA010702, 2012AA01A401, and 2012AA01A402), Chinese National Science Foundation (Nos. 60933005, 91124002, and 61303265), National Technology Support Foundation (No. 2012BAH38B04), and National 242 Foundation (No. 2011A010).
Abstract: Microblogs have become an important platform for people to publish and transform information and to acquire knowledge. This paper focuses on the problem of discovering user interest in microblogs. We propose a topic mining model based on Latent Dirichlet Allocation (LDA), named the user-topic model. For each user, interests are divided into two parts according to how the microblogs are generated: original interest and retweet interest. We present a Gibbs sampling implementation for inferring the parameters of our model, which discovers not only a user's original interest but also the retweet interest. We then combine original interest and retweet interest to compute interest words for users. Experiments on a dataset of Sina microblogs demonstrate that our model discovers user interest effectively and outperforms existing topic models on this task. We also find that original interest and retweet interest are similar and that the topics of interest contain user labels. The interest words discovered by our model reflect user labels but cover a much broader range.
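The abstract does not give the sampling equations of the user-topic model, so the following is only a minimal sketch of collapsed Gibbs sampling for plain LDA, the base model it extends. The function name `lda_gibbs`, the hyperparameter defaults, and the toy corpus are assumptions made for illustration; the split into original and retweet interest is not modelled here.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative, not the user-topic model).

    docs is a list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): per-document topic distributions and per-topic word distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # count of topic k in document d
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # count of word w under topic k
    topic_total = [0] * num_topics                               # total words assigned to topic k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi

# Toy corpus (hypothetical): four short documents over a four-word vocabulary.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 3, 3]]
theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=4)
```

Each row of `theta` is one document's topic mixture and each row of `phi` is one topic's word distribution. A model along the lines described in the abstract would presumably keep separate counts for original posts and retweets, but that design is not specified in the text.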
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71774130) and China Huaneng Group (Grant No. HNKJ20-H87).
Abstract: With the increasing importance of computer intelligence in the new round of the industrial revolution, administrative, regulatory, or design (ARD) green technology contributes to improving national technological competitiveness and promoting the transformation of green technology, and it is becoming an important field under the sustainable development goals. The U.S. and China ranked top two in terms of paper influence and patent applications in the field of ARD green technology. However, few comparative studies of the two countries have been conducted. This study presents the evolution and landscapes of ARD green technology in China and the U.S., focusing on comparing development priorities and technical layouts in each five-year plan period. According to the "International Patent Classification (IPC) Green Inventory" launched by the World Intellectual Property Organization (WIPO), we retrieved 69,412 patents published between 2001 and 2020 from the PatSnap database. Descriptive, content, and thematic network analyses were conducted using latent Dirichlet allocation (LDA) and community detection algorithms. The results show that both China and the U.S. strategically focus on ARD green technology development. The technical topics in this field can be divided into three themes: data processing systems, traffic control systems, and building designs. The emphasis of technology research and development (R&D) differs between China and the U.S. There is also evidence that the U.S. has advantages in terms of technological innovation and capabilities. However, China has an advantage in terms of data volume, and the gap between China and the U.S. is gradually narrowing. We also highlight the contributions and limitations of this study.
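Abstract: To provide decision support for doctors and patients in using medication safely, rationally, and efficiently, a medication analysis method based on LDA (Latent Dirichlet Allocation), named MaLDA (Medication Analysis based on LDA), is proposed. The method combines medication records and visit records, treating drugs as documents, drug functions as topics, and diseases as words. The LDA topic model is used to discover latent drug functions, and these functions link related drugs to one another, related diseases to one another, and drugs to diseases. Drugs are clustered according to their distributions over drug functions, each cluster of drugs is described by its related diseases, and clinical medication use is then analyzed. MaLDA discovers not only the drugs that work well for a given class of diseases in clinical practice but also latent drug combinations. The experimental data come from the medication and visit records of 137,510 patients at a hospital in Shanghai. The experimental results confirm the effectiveness of MaLDA, relative to other methods, for medication analysis on electronic medical records.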
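The mapping of drugs to documents and diseases to words can be illustrated with a short sketch of the corpus-building step. The record layout (tuples keyed by a visit id), the function name `build_drug_documents`, and the sample data are assumptions made for illustration; the abstract does not say how the two record types are joined.

```python
from collections import Counter, defaultdict

def build_drug_documents(medication_records, visit_records):
    """Build one bag of diseases per drug (illustrative framing only).

    medication_records: iterable of (visit_id, drug) pairs (assumed layout).
    visit_records: iterable of (visit_id, disease) pairs (assumed layout).
    Returns a dict mapping each drug to a Counter of co-occurring diseases.
    """
    # Index the diagnoses recorded for each visit.
    diseases_by_visit = defaultdict(list)
    for visit_id, disease in visit_records:
        diseases_by_visit[visit_id].append(disease)

    # Each drug becomes a "document" whose "words" are the diseases
    # diagnosed at the visits where the drug was prescribed.
    drug_docs = defaultdict(Counter)
    for visit_id, drug in medication_records:
        drug_docs[drug].update(diseases_by_visit.get(visit_id, []))
    return dict(drug_docs)

# Hypothetical sample records.
medication_records = [("v1", "drug_a"), ("v1", "drug_b"), ("v2", "drug_a"), ("v3", "drug_c")]
visit_records = [("v1", "hypertension"), ("v2", "hypertension"), ("v2", "diabetes"), ("v3", "asthma")]
drug_docs = build_drug_documents(medication_records, visit_records)
```

The resulting drug documents could then be passed to any LDA implementation, with the inferred topics read as latent drug functions.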