Analyzing COVID-19 Discourse on Twitter: Text Clustering and Classification Models for Public Health Surveillance
1
Authors: Pakorn Santakij, Samai Srisuay, Pongporn Punpeng. Source: Computer Systems Science & Engineering, 2024, Issue 3, pp. 665-689, 25 pages.
Social media has revolutionized the dissemination of real-life information, serving as a robust platform for sharing life events. Twitter, characterized by its brevity and continuous flow of posts, has emerged as a crucial source for public health surveillance, offering valuable insights into public reactions during the COVID-19 pandemic. This study aims to leverage a range of machine learning techniques to extract pivotal themes and facilitate text classification on a dataset of COVID-19 outbreak-related tweets. Diverse topic modeling approaches have been employed to extract pertinent themes and subsequently form a dataset for training text classification models. An assessment of coherence metrics revealed that the Gibbs Sampling Dirichlet Mixture Model (GSDMM), which utilizes trigram and bag-of-words (BOW) feature extraction, outperformed Non-negative Matrix Factorization (NMF), Latent Dirichlet Allocation (LDA), and a hybrid strategy involving Bidirectional Encoder Representations from Transformers (BERT) combined with LDA and K-means to pinpoint significant themes within the dataset. Among the models assessed for text clustering, the utilization of LDA, either as a clustering model or for feature extraction combined with BERT for K-means, resulted in higher coherence scores, consistent with human ratings, signifying their efficacy. In particular, LDA, notably in conjunction with trigram representation and BOW, demonstrated superior performance. This underscores the suitability of LDA for conducting topic modeling, given its proficiency in capturing intricate textual relationships. In the context of text classification, models such as Linear Support Vector Classification (LSVC), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network with BiLSTM (CNN-BiLSTM), and BERT have shown outstanding performance, achieving accuracy and weighted F1 scores exceeding 80%. These results significantly surpassed other models, such as Multinomial Naive Bayes (MNB), Linear Support Vector Machine (LSVM), and Logistic Regression (LR), which achieved scores in the range of 60 to 70 percent.
Keywords: topic modeling; text classification; Twitter; feature extraction; social media
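The abstract above ranks topic models by coherence but does not say which coherence measure was used. As an illustration only, the sketch below computes UMass coherence, one common measure based on document co-occurrence. The function name `umass_coherence` and the toy documents are assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass coherence of one topic's top words over tokenized documents.

    Scores closer to zero mean the words co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    # combinations() yields (earlier, later) pairs in top_words order.
    for earlier, later in combinations(top_words, 2):
        base = doc_freq(earlier)
        if base == 0:
            continue  # skip words that never occur (hypothetical guard)
        score += math.log((doc_freq(earlier, later) + 1) / base)
    return score


# Toy corpus (hypothetical): three virus-related posts, one about football.
toy_docs = [
    ["virus", "vaccine", "mask"],
    ["virus", "vaccine"],
    ["football", "goal"],
    ["virus", "goal"],
]
related = umass_coherence(["virus", "vaccine"], toy_docs)
unrelated = umass_coherence(["virus", "football"], toy_docs)
```

On the toy corpus, the pair that co-occurs in two documents scores higher than the pair that never co-occurs, which is the behaviour a coherence ranking relies on.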
TG-SMR: A Text Summarization Algorithm Based on Topic and Graph Models (cited by: 1)
2
Authors: Mohamed Ali Rakrouki, Nawaf Alharbe, Mashael Khayyat, Abeer Aljohani. Source: Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 4, pp. 395-408, 14 pages.
Recently, automation is considered vital in most fields since computing methods have a significant role in facilitating work such as automatic text summarization. However, most of the computing methods that are used in real systems are based on graph models, which are characterized by their simplicity and stability. Thus, this paper proposes an improved extractive text summarization algorithm based on both topic and graph models. The methodology of this work consists of two stages. First, the well-known TextRank algorithm is analyzed and its shortcomings are investigated. Then, an improved method is proposed with a new computational model of sentence weights. The experiments were carried out on the standard DUC2004 and DUC2006 datasets and compared to four text summarization methods. Finally, through experiments on the DUC2004 and DUC2006 datasets, our proposed improved graph model algorithm TG-SMR (Topic Graph-Summarizer) is compared to other text summarization systems. The experimental results prove that the proposed TG-SMR algorithm achieves higher ROUGE scores. It is foreseen that the TG-SMR algorithm will open a new horizon that concerns the performance of ROUGE evaluation indicators.
Keywords: natural language processing; text summarization; graph model; topic model
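The abstract above takes TextRank as its starting point. The following minimal sketch shows plain TextRank sentence ranking as a baseline, not the TG-SMR sentence-weight model, which the abstract does not specify. The helper names (`tokenize`, `similarity`, `textrank`, `summarize`) and the toy sentences are assumptions for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lower-case word set of a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def similarity(a, b):
    """Word-overlap similarity normalized by log sentence lengths."""
    words_a, words_b = tokenize(a), tokenize(b)
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=1):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


toy_sentences = [
    "topic models find themes in text documents",
    "graph models rank sentences in text documents",
    "the weather was sunny yesterday afternoon",
]
toy_scores = textrank(toy_sentences)
toy_summary = summarize(toy_sentences, k=2)
```

In this toy input the third sentence shares no words with the others, so it keeps only the damping baseline score and is left out of the two-sentence summary.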
Research on high-performance English translation based on topic model
3
Authors: Yumin Shen, Hongyu Guo. Source: Digital Communications and Networks (SCIE, CSCD), 2023, Issue 2, pp. 505-511, 7 pages.
Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources are very helpful to improve the performance of machine translation. However, traditional methods based on the bilingual parallel corpus often ignore the document background in the process of retelling acquisition and application. In order to solve this problem, we introduce topic model information into the translation mode and propose a topic-based statistical machine translation method to improve the translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents by the hybrid matrix decomposition. Then we design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve the accuracy of translation.
Keywords: machine translation; topic model; statistical machine translation; bilingual word vector; retelling
Topic Modelling and Sentimental Analysis of Students’ Reviews
4
Authors: Omer S. Alkhnbashi, Rasheed Mohammad Nassr. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 3, pp. 6835-6848, 14 pages.
Globally, educational institutions have reported a dramatic shift to online learning in an effort to contain the COVID-19 pandemic. The fundamental concern has been the continuance of education. As a result, several novel solutions have been developed to address technical and pedagogical issues. However, these were not the only difficulties that students faced. The implemented solutions involved the operation of the educational process with less regard for students’ changing circumstances, which obliged them to study from home. Students should be asked to provide a full list of their concerns. As a result, student reflections, including those from Saudi Arabia, have been analysed to identify obstacles encountered during the COVID-19 pandemic. However, most of the analyses relied on closed-ended questions, which limited student involvement. To delve into students’ responses, this study used open-ended questions, a qualitative method (content analysis), a quantitative method (topic modelling), and a sentimental analysis. This study also looked at students’ emotional states during and after the COVID-19 pandemic. In terms of determining trends in students’ input, the results showed that quantitative and qualitative methods produced similar outcomes. Students had unfavourable sentiments about studying during COVID-19 and positive sentiments about the face-to-face study. Furthermore, topic modelling has revealed that the majority of difficulties are more related to the environment (home) and social life. Students were less accepting of online learning. As a result, it is possible to conclude that face-to-face study still attracts students and provides benefits that online study cannot, such as social interaction and effective eye-to-eye communication.
Keywords: topic modelling; sentimental analysis; COVID-19; students’ input
News Modeling and Retrieving Information: Data-Driven Approach
5
Authors: Elias Hossain, Abdullah Alshahrani, Wahidur Rahman. Source: Intelligent Automation & Soft Computing, 2023, Issue 11, pp. 109-123, 15 pages.
This paper aims to develop Machine Learning algorithms to classify electronic articles related to this phenomenon by retrieving information and topic modelling. The methodology of this study is categorized into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and finally, the Information Retrieval Approach (IRA). The TCA reflects the text preprocessing pipeline called a clean corpus. The Global Vectors for Word Representation (GloVe) pre-trained model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW) for extracting the features have been interpreted in this research. The PAI manifests the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) to classify the COVID-19 news. Again, the IRA explains the mathematical interpretation of Latent Dirichlet Allocation (LDA), obtained for modelling the topic of Information Retrieval (IR). In this study, 99% accuracy was obtained by performing K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis between Deep Learning and Machine Learning based on feature extraction and computational complexity exploration has been performed in this research. Furthermore, some text analyses and the most influential aspects of each document have been explored in this study. We have utilized Bidirectional Encoder Representations from Transformers (BERT) as a Deep Learning mechanism in our model training, but the results were not satisfactory. However, the proposed system can be adapted to the real-time news classification of COVID-19.
Keywords: COVID-19; news retrieving; data-driven; machine learning; BERT; topic modelling
Enhancing emerging technology discovery in nanomedicine by integrating innovative sentences using BERT and NLDA
6
Authors: Yifan Wang, Xiaoping Liu, Xiang-Li Zhu. Source: Journal of Data and Information Science (CSCD), 2024, Issue 4, pp. 155-195, 41 pages.
Purpose: Nanomedicine has significant potential to revolutionize biomedicine and healthcare through innovations in diagnostics, therapeutics, and regenerative medicine. This study aims to develop a novel framework that integrates advanced natural language processing, noise-free topic modeling, and multidimensional bibliometrics to systematically identify emerging nanomedicine technology topics from scientific literature. Design/methodology/approach: The framework involves collecting full-text articles from PubMed Central and nanomedicine-related metrics from the Web of Science for the period 2013-2023. A fine-tuned BERT model is employed to extract key informative sentences. Noiseless Latent Dirichlet Allocation (NLDA) is applied to model interpretable topics from the cleaned corpus. Additionally, we develop and apply metrics for novelty, innovation, growth, impact, and intensity to quantify the emergence of novel technological topics. Findings: By applying this methodology to nanomedical publications, we identify an increasing emphasis on research aligned with global health priorities, particularly inflammation and biomaterial interactions in disease research. This methodology provides deeper insights through full-text analysis, leading to a more robust discovery of emerging technologies. Research limitations: One limitation of this study is its reliance on the existing scientific literature, which may introduce publication biases and language constraints. Additionally, manual annotation of the dataset, while thorough, is subject to subjectivity and can be time-consuming. Future research could address these limitations by incorporating more diverse data sources and automating the annotation process. Practical implications: The methodology presented can be adapted to explore emerging technologies in other scientific domains. It allows for tailored assessment criteria based on specific contexts and objectives, enabling more precise analysis and decision-making in various fields. Originality/value: This study offers a comprehensive framework for identifying emerging technologies in nanomedicine, combining theoretical insights and practical applications. Its potential for adaptation across scientific disciplines enhances its value for future research and decision-making in technology discovery.
Keywords: bibliometrics; nanomedicine; emerging technologies; BERT; topic modeling
Research evolution of metal organic frameworks: A scientometric approach with human-in-the-loop
7
Authors: Xintong Zhao, Kyle Langlois, Jacob Furst, Yuan An, Xiaohua Hu, Diego Gomez Gualdron, Fernando Uribe-Romo, Jane Greenberg. Source: Journal of Data and Information Science (CSCD), 2024, Issue 3, pp. 44-64, 21 pages.
Purpose: This paper reports on a scientometric analysis bolstered by human-in-the-loop, domain experts, to examine the field of metal-organic frameworks (MOFs) research. Scientometric analyses reveal the intellectual landscape of a field. The study engaged MOF scientists in the design and review of our research workflow. MOF materials are an essential component in next-generation renewable energy storage and biomedical technologies. The research approach demonstrates how engaging experts, via human-in-the-loop processes, can help develop a comprehensive view of a field’s research trends, influential works, and specialized topics. Design/methodology/approach: A scientometric analysis was conducted, integrating natural language processing (NLP), topic modeling, and network analysis methods. The analytical approach was enhanced through a human-in-the-loop iterative process involving MOF research scientists at selected intervals. MOF researcher feedback was incorporated into our method. The data sample included 65,209 MOF research articles. Python3 and the software tool VOSviewer were used to perform the analysis. Findings: The findings demonstrate the value of including domain experts in research workflows, refinement, and interpretation of results. At each stage of the analysis, the MOF researchers contributed to interpreting the results and method refinements targeting our focus on MOF research. This study identified influential works and their themes. Our findings also underscore four main MOF research directions and applications. Research limitations: This study is limited by the sample (articles identified and referenced by the Cambridge Structural Database) that informed our analysis. Practical implications: Our findings contribute to addressing the current gap in fully mapping out the comprehensive landscape of MOF research. Additionally, the results will help domain scientists target future research directions. Originality/value: To the best of our knowledge, the number of publications collected for analysis exceeds those of previous studies. This enabled us to explore a more extensive body of MOF research compared to previous studies. Another contribution of our work is the iterative engagement of domain scientists, who brought in-depth, expert interpretation to the data analysis, helping hone the study.
Keywords: scientometric; metal-organic frameworks (MOFs); network analysis; topic modeling; human-in-the-loop
Dynamic evaluation of digital and green development policies based on text mining of the PMC framework
8
Authors: Ye Chunmei, Wu Lihua. Source: Journal of Southeast University (English Edition) (EI, CAS), 2024, Issue 3, pp. 319-326, 8 pages.
Aiming to identify policy topics and their evolutionary logic that enhance the digital and green development (dual development) of traditional manufacturing enterprises, address weaknesses in current policies, and provide resources for refining dual development policies, a total of 15,954 dual development-related policies issued by national and various departmental authorities in China from January 2000 to August 2023 were analyzed. Based on topic modeling techniques and the policy modeling consistency (PMC) framework, the evolution of policy topics was visualized, and a dynamic assessment of the policies was conducted. The results show that the digital and green development policy framework is progressively refined, and the governance philosophy shifts from a “regulatory government” paradigm to a “service-oriented government”. The support pattern evolves from “dispersed matching” to “integrated symbiosis”. However, there are still significant deficiencies in departmental cooperation, balanced measures, coordinated links, and multi-stakeholder participation. Future policy improvements should, therefore, focus on guiding multi-stakeholder participation, enhancing public demand orientation, and addressing the entire value chain. These steps aim to create an open and shared digital industry ecosystem to promote the coordinated dual development of traditional manufacturing enterprises.
Keywords: digital and green development; text mining; topic modeling; policy modeling consistency (PMC) framework; machine learning
Anomaly detection in traffic surveillance with sparse topic model (cited by: 4)
9
Authors: XIA Li-min, HU Xiang-jie, WANG Jun. Source: Journal of Central South University (SCIE, EI, CAS, CSCD), 2018, Issue 9, pp. 2245-2257, 13 pages.
Most research on anomaly detection has focused on events that differ from their spatial-temporal neighboring events. It is still a significant challenge to detect anomalies that involve multiple normal events interacting in an unusual pattern. In this work, a novel unsupervised method based on a sparse topic model was proposed to capture motion patterns and detect anomalies in traffic surveillance. Scale-invariant feature transform (SIFT) flow was used to improve the dense trajectory in order to extract interest points and the corresponding descriptors with less interference. For the purpose of strengthening the relationship of interest points on the same trajectory, the Fisher kernel method was applied to obtain the representation of the trajectory, which was quantized into visual words. Then the sparse topic model was proposed to explore the latent motion patterns and achieve a sparse representation for the video scene. Finally, two anomaly detection algorithms were compared, based on video clip detection and visual word analysis respectively. Experiments were conducted on the QMUL Junction dataset and the AVSS dataset. The results demonstrated the superior efficiency of the proposed method.
Keywords: motion pattern; sparse topic model; SIFT flow; dense trajectory; Fisher kernel
BURST-LDA: A NEW TOPIC MODEL FOR DETECTING BURSTY TOPICS FROM STREAM TEXT (cited by: 3)
10
Authors: Qi Xiang, Huang Yu, Chen Ziyan, Liu Xiaoyan, Tian Jing, Huang Tinglei, Wang Hongqi. Source: Journal of Electronics (China), 2014, Issue 6, pp. 565-575, 11 pages.
Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics that experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness through explicitly modeling each topic's burst states with a first order Markov chain and using the chain to generate the topic proportion of documents in a Logistic Normal fashion. A Gibbs sampling algorithm is developed for the posterior inference of the proposed model. Experimental results on a news data set show our model can efficiently discover bursty topics, outperforming the state-of-the-art method.
Keywords: text mining; burst detection; topic model; graphical model; Bayesian inference
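The abstract above builds on LDA and uses Gibbs sampling for posterior inference. As background only, here is a minimal collapsed Gibbs sampler for standard LDA; it does not model burst states or the Markov chain of Burst-LDA. The function name `lda_gibbs`, the hyperparameter defaults, and the toy documents are assumptions for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=100, seed=0):
    """Collapsed Gibbs sampling for standard LDA on tokenized documents.

    Returns per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [{} for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    def update(d, w, z, delta):
        # Add or remove one token of word w in document d from topic z.
        doc_topic[d][z] += delta
        topic_word[z][w] = topic_word[z].get(w, 0) + delta
        topic_total[z] += delta

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        assignments.append([rng.randrange(num_topics) for _ in doc])
        for w, z in zip(doc, assignments[d]):
            update(d, w, z, +1)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                update(d, w, assignments[d][n], -1)
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k].get(w, 0) + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                update(d, w, z, +1)
    return doc_topic, topic_word


# Toy corpus (hypothetical): two news-like themes.
toy_docs = [
    ["election", "vote", "party", "vote"],
    ["match", "goal", "team", "goal"],
    ["party", "election", "vote"],
    ["team", "match", "goal"],
]
doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2)
```

Each sweep removes a token's current assignment, resamples its topic from the conditional distribution, and restores the counts, so the count tables always account for every token exactly once.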
Enhancing Collaborative Filtering via Topic Model Integrated Uniform Euclidean Distance (cited by: 1)
11
Authors: Tieliang Gao, Bo Cheng, Junliang Chen, Ming Chen. Source: China Communications (SCIE, CSCD), 2017, Issue 11, pp. 48-58, 11 pages.
Recommendation systems can greatly alleviate the "information overload" in the big data era. Existing recommendation methods, however, typically focus on predicting missing rating values via analyzing the user-item dualistic relationship, which neglects an important fact: the latent interests of users can influence their rating behaviors. Moreover, traditional recommendation methods easily suffer from the high dimensional problem and the cold-start problem. To address these challenges, in this paper, we propose a PBUED (PLSA-Based Uniform Euclidean Distance) scheme, which utilizes a topic model and uniform Euclidean distance to recommend suitable items for users. The solution first employs probabilistic latent semantic analysis (PLSA) to extract users' interests; users with different interests are divided into different subgroups. Then, the uniform Euclidean distance is adopted to compute the users' similarity in the same interest subset; finally, the missing rating values of data are predicted via aggregating similar neighbors' ratings. We evaluate PBUED on two datasets, and experimental results show PBUED can lead to better predicting performance and ranking performance than other approaches.
Keywords: recommendation system; topic model; user interest; uniform Euclidean distance
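The last two steps described in the abstract above (distance-based user similarity, then aggregating neighbors' ratings) can be sketched as follows. The abstract does not define the uniform Euclidean distance, so this sketch substitutes an assumption: a root-mean-square distance over co-rated items. The PLSA subgrouping step is omitted, and the names `distance` and `predict` and the toy ratings are hypothetical.

```python
import math


def distance(ratings_u, ratings_v):
    """Root-mean-square rating difference over co-rated items.

    Stand-in (assumption) for the paper's uniform Euclidean distance.
    """
    shared = set(ratings_u) & set(ratings_v)
    if not shared:
        return math.inf
    squared = sum((ratings_u[i] - ratings_v[i]) ** 2 for i in shared)
    return math.sqrt(squared / len(shared))


def predict(ratings, user, item, k=2):
    """Predict a missing rating from the k nearest users who rated the item."""
    candidates = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        d = distance(ratings[user], other_ratings)
        if math.isfinite(d):
            candidates.append((d, other_ratings[item]))
    nearest = sorted(candidates, key=lambda c: c[0])[:k]
    if not nearest:
        return None
    # Closer neighbors get larger weights; 1 / (1 + d) is an arbitrary choice.
    weights = [1.0 / (1.0 + d) for d, _ in nearest]
    return sum(w * r for w, (_, r) in zip(weights, nearest)) / sum(weights)


# Toy ratings (hypothetical), imagined as one interest subgroup.
toy_ratings = {
    "ann": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "cat": {"a": 1, "b": 5, "c": 2},
}
nearest_only = predict(toy_ratings, "ann", "c", k=1)
two_neighbors = predict(toy_ratings, "ann", "c", k=2)
```

With `k=1` the prediction copies the single closest neighbor; with `k=2` the more distant neighbor pulls the estimate toward its own rating, but only slightly because of its smaller weight.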
Assessing citizen science opportunities in forest monitoring using probabilistic topic modelling (cited by: 1)
12
Authors: Stefan Daume, Matthias Albert, Klaus von Gadow. Source: Forestry Studies in China (CAS), 2014, Issue 2, pp. 93-104, 12 pages.
Background: With mounting global environmental, social and economic pressures the resilience and stability of forests and thus the provisioning of vital ecosystem services is increasingly threatened. Intensified monitoring can help to detect ecological threats and changes earlier, but monitoring resources are limited. Participatory forest monitoring with the help of "citizen scientists" can provide additional resources for forest monitoring and at the same time help to communicate with stakeholders and the general public. Examples for citizen science projects in the forestry domain can be found but a solid, applicable larger framework to utilise public participation in the area of forest monitoring seems to be lacking. We propose that a better understanding of shared and related topics in citizen science and forest monitoring might be a first step towards such a framework. Methods: We conduct a systematic meta-analysis of 1015 publication abstracts addressing "forest monitoring" and "citizen science" in order to explore the combined topical landscape of these subjects. We employ topic modelling, an unsupervised probabilistic machine learning method, to identify latent shared topics in the analysed publications. Results: We find that large shared topics exist, but that these are primarily topics that would be expected in scientific publications in general. Common domain-specific topics are under-represented and indicate a topical separation of the two document sets on "forest monitoring" and "citizen science" and thus the represented domains. While topic modelling as a method proves to be a scalable and useful analytical tool, we propose that our approach could deliver even more useful data if a larger document set and full-text publications were available for analysis. Conclusions: We propose that these results, together with the observation of non-shared but related topics, point at under-utilised opportunities for public participation in forest monitoring. Citizen science could be applied as a versatile tool in forest ecosystems monitoring, complementing traditional forest monitoring programmes, assisting early threat recognition and helping to connect forest management with the general public. We conclude that our presented approach should be pursued further as it may aid the understanding and setup of citizen science efforts in the forest monitoring domain.
Keywords: forest monitoring; citizen science; participatory forest monitoring; probabilistic topic modelling; text analysis
Self-Adaptive Topic Model: A Solution to the Problem of "Rich Topics Get Richer" (cited by: 1)
13
Authors: FANG Ying. Source: China Communications (SCIE, CSCD), 2014, Issue 12, pp. 35-43, 9 pages.
The problem of "rich topics get richer" (RTGR) is common in topic models; it produces a wrong topic distribution if the distributing process is not intervened in. In the standard LDA (Latent Dirichlet Allocation) model, each word in all the documents has the same statistical ability. In fact, words have different impacts on different topics. Following this idea, we extend ILDA (Infinite LDA) by considering the bias role of words in dividing the topics. We propose a self-adaptive topic model to overcome the RTGR problem specifically. The model proposed in this paper addresses three points: (1) the topic number changes with the collection of documents, which suits dynamic data; (2) words have discriminating attributes with respect to topic distribution; (3) a self-adaptive method is used to realize automatic re-sampling. To verify our model, we design a topic evolution analysis system which realizes the following functions: topic classification in each cycle, topic correlation between adjacent cycles, and strength calculation of the sub-topics in order. Experiments on both the NIPS corpus and our self-built news collections showed that the system could meet the given demand and that the results were feasible.
Keywords: topic model; infinite Latent Dirichlet Allocation; Dirichlet process; topic evolution
Microencapsulated tumor assay: Evaluation of the nude mouse model of pancreatic cancer (cited by: 1)
14
Authors: Ming-Zhe Ma, Dong-Feng Cheng, Jin-Hua Ye, Yong Zhou, Jia-Xiang Wang, Min-Min Shi, Bao-San Han, Cheng-Hong Peng. Source: World Journal of Gastroenterology (SCIE, CAS, CSCD), 2012, Issue 3, pp. 257-267, 11 pages.
AIM: To establish a more stable and accurate nude mouse model of pancreatic cancer using cancer cell microencapsulation. METHODS: The assay is based on microencapsulation technology, wherein human tumor cells are encapsulated in small microcapsules (approximately 420 μm in diameter) constructed of semipermeable membranes. We implemented two kinds of subcutaneous implantation models in nude mice using the injection of single tumor cells and encapsulated pancreatic tumor cells. The size of subcutaneously implanted tumors was observed on a weekly basis using two methods, and growth curves were generated from these data. The growth and metastasis of orthotopically injected single tumor cells and encapsulated pancreatic tumor cells were evaluated at four and eight weeks postimplantation by positron emission tomography-computed tomography scan and necropsy. The pancreatic tumor samples obtained from each method were then sent for pathological examination. We evaluated differences in the rates of tumor incidence and the presence of metastasis and variations in tumor volume and tumor weight in the cancer microcapsules vs single-cell suspensions. RESULTS: Sequential in vitro observations of the microcapsules showed that the cancer cells in microcapsules proliferated well and formed spheroids at days 4 to 6. Further in vitro culture resulted in bursting of the membrane of the microcapsules, and cells deviated outward and continued to grow in flasks. The optimum injection time was found to be 5 d after tumor encapsulation. In the subcutaneous implantation model, there were no significant differences in tumor volume or rate of tumor incidence between the encapsulated pancreatic tumor cells and cells alone. There was a significant difference in the rate of successful implantation between the cancer cell microencapsulation group and the single tumor-cell suspension group (100% vs 71.43%, respectively, P = 0.0489) in the orthotopic implantation model. The former method displayed an obvious advantage in tumor mass (4th wk: 0.0461 ± 0.0399 vs 0.0313 ± 0.021, t = -0.81, P = 0.4379; 8th wk: 0.1284 ± 0.0284 vs 0.0943 ± 0.0571, t = -2.28, respectively, P = 0.0457) compared with the latter in the orthotopic implantation model. CONCLUSION: Encapsulation of pancreatic tumor cells is a reliable method for establishing a pancreatic tumor animal model.
Keywords: nude mice model of pancreatic neoplasms; encapsulation; subcutaneous implantation model; orthotopic implantation model
A Phrase Topic Model Based on Distributed Representation
15
Authors: Jialin Ma, Jieyi Cheng, Lin Zhang, Lei Zhou, Bolun Chen. Source: Computers, Materials & Continua (SCIE, EI), 2020, Issue 7, pp. 455-469, 15 pages.
Traditional topic models have been widely used for analyzing semantic topics from electronic documents. However, the topic words they produce are poor in readability and consistency, and only domain experts are able to guess their meaning. In fact, phrases are the main unit for people to express semantics. This paper presents Distributed Representation-Phrase Latent Dirichlet Allocation (DR-Phrase LDA), which is a phrase topic model. Specifically, we enhance the semantic information of phrases via distributed representation in this model. The experimental results show that the topics acquired by our model are more readable and consistent than those of other similar topic models.
Keywords: phrase topic model; LDA; distributed representation; Gibbs sampling
A Structural Topic Model for Exploring User Satisfaction with Mobile Payments
16
Authors: Jang Hyun Kim, Jisung Jang, Yonghwan Kim, Dongyan Nan. Source: Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 3815-3826, 12 pages.
This study explored user satisfaction with mobile payments by applying a novel structural topic model. Specifically, we collected 17,927 online reviews of a specific mobile payment (i.e., PayPal). Then, we employed a structural topic model to investigate the relationship between the attributes extracted from online reviews and user satisfaction with mobile payment. Consequently, we discovered that “lack of reliability” and “poor customer service” tend to appear in negative reviews, whereas the terms “convenience,” “user-friendly interface,” “simple process,” and “secure system” tend to appear in positive reviews. On the basis of information system success theory, we categorized the topics “convenience,” “user-friendly interface,” and “simple process” as system quality. In addition, “poor customer service” was categorized as service quality. Furthermore, based on the previous studies of trust and security, “lack of reliability” and “secure system” were categorized as trust and security, respectively. These outcomes indicate that users are satisfied when they perceive that the system quality and security of specific mobile payments are great. On the contrary, users are dissatisfied when they feel that the service quality and reliability of specific mobile payments are lacking. Overall, our research implies that a novel structural topic model is an effective method to explore mobile payment user experience.
Keywords: Mobile payment user satisfaction online review structural topic model
NON-PARAMETRIC TOPIC MODEL FOR DISCOVERING GEOGRAPHICAL TOPIC VARIATIONS
17
Authors: Qi Xiang, Huang Yu, Song Jun, Huang Tinglei, Wang Hongqi, Fu Kun. Journal: Journal of Electronics (China), 2014, Issue 6, pp. 576-586 (11 pages)
This paper presents a non-parametric topic model that captures not only the latent topics in text collections, but also how the topics change over space. Unlike other recent work that relies on either Gaussian assumptions or discretization of locations, here topics are associated with a distance dependent Chinese Restaurant Process (ddCRP), and for each document, the observed words are influenced by the document's GPS tag. Our model allows both an unbounded number and a flexible distribution of the geographical variations of the topics' content. We develop a Gibbs sampler for the proposed model and compare it with existing models on a real data set.
Keywords: Text mining Topic model Geographical topics Bayesian non-parameter
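As an illustration of the ddCRP prior named in the abstract above, the following is a minimal sketch and not the authors' model or their Gibbs sampler. It assumes the generic ddCRP construction in which each item links to itself with weight `alpha` or to another item with a weight that decays with distance; the exponential decay, the parameter names `alpha` and `scale`, and both function names are assumptions made for this example.

```python
import math
import random


def sample_ddcrp_links(coords, alpha=1.0, scale=1.0, seed=0):
    """Draw one link per item from a ddCRP prior (illustrative sketch).

    Item i links to itself with weight alpha, or to item j != i with
    weight exp(-distance(i, j) / scale). The decay function is an assumption.
    """
    rng = random.Random(seed)
    n = len(coords)
    links = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(alpha)  # self-link weight
            else:
                d = math.dist(coords[i], coords[j])
                weights.append(math.exp(-d / scale))  # nearby items are favoured
        links.append(rng.choices(range(n), weights=weights, k=1)[0])
    return links


def links_to_clusters(links):
    """Map links to cluster labels: connected components of the link graph."""
    parent = list(range(len(links)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in enumerate(links):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j

    labels = {}
    # Labels are numbered in order of first appearance.
    return [labels.setdefault(find(i), len(labels)) for i in range(len(links))]


if __name__ == "__main__":
    # Two well-separated groups of hypothetical GPS-like coordinates.
    points = [(0.0, 0.0), (0.0, 0.1), (100.0, 100.0), (100.0, 100.1)]
    sampled = sample_ddcrp_links(points, alpha=1e-9, scale=1.0, seed=0)
    print(links_to_clusters(sampled))
```

Because clusters arise from connected components of the links rather than from a fixed table count, the number of clusters is not set in advance, which is the non-parametric property the abstract refers to.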
Identification of Topics from Scientific Papers through Topic Modeling
18
Authors: Denis Luiz Marcello Owa. Journal: Open Journal of Applied Sciences, 2021, Issue 4, pp. 541-548 (8 pages)
Topic modeling is a probabilistic model that identifies topics covered in text(s). In this paper, topics were obtained from two implementations of topic modeling, namely Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA). The analysis was performed on a corpus of 1000 academic papers written in English, obtained from the PLOS ONE website, in the areas of Biology, Medicine, Physics and Social Sciences. The objective was to verify whether the four academic fields were represented in the four topics obtained by topic modeling. The four topics obtained from LSI and LDA did not represent the four academic fields.
Keywords: Topic modeling Corpus Linguistics Gensim LSI LDA
Ensemble Deep Learning Framework for Situational Aspects-Based Annotation and Classification of International Student’s Tweets during COVID-19
19
Authors: Shabir Hussain, Muhammad Ayoub, Yang Yu, Junaid Abdul Wahid, Akmal Khan, Dietmar P.F. Moller, Hou Weiyan. Journal: Computers, Materials & Continua (SCIE, EI), 2023, Issue 6, pp. 5355-5377 (23 pages)
As the COVID-19 pandemic swept the globe, social media platforms became an essential source of information and communication for many. International students, in particular, turned to Twitter to express their struggles and hardships during this difficult time. To better understand the sentiments and experiences of these international students, we developed the Situational Aspect-Based Annotation and Classification (SABAC) text mining framework. This framework uses a three-layer approach, combining baseline Deep Learning (DL) models with Machine Learning (ML) models as meta-classifiers to accurately predict the sentiments and aspects expressed in tweets from our collected Student-COVID-19 dataset. Using the proposed aspect2class annotation algorithm, we labeled bulk unlabeled tweets according to the aspect terms they contain. We also recognized the challenge of reducing the data's high dimensionality and sparsity in order to improve performance and annotation on unlabeled datasets. To address this issue, we proposed the Volatile Stopwords Filtering (VSF) technique to reduce sparsity and enhance classifier performance. On the resulting Student-COVID Twitter dataset, the framework achieved an accuracy of 93.21% when using a random forest as the meta-classifier. In tests on three benchmark datasets, the SABAC ensemble framework also performed well. Our findings showed that international students faced various issues during the pandemic, including stress, uncertainty, health concerns, financial stress, and difficulties with online classes and returning to school. By analyzing and summarizing these annotated tweets, decision-makers can better understand and address the real-time problems international students face during the ongoing pandemic.
Keywords: COVID-19 pandemic situational awareness ensemble learning aspect-based text classification deep learning models international students topic modeling
ESG Discourse Analysis Through BERTopic: Comparing News Articles and Academic Papers
20
Authors: Haein Lee, Seon Hong Lee, Kyeo Re Lee, Jang Hyun Kim. Journal: Computers, Materials & Continua (SCIE, EI), 2023, Issue 6, pp. 6023-6037 (15 pages)
Environmental, social, and governance (ESG) factors are critical in achieving sustainability in business management and are used as values aiming to enhance corporate value. Recently, non-financial indicators have come to be considered important for the actual valuation of corporations, so analyzing natural language data related to ESG is essential. Several previous studies limited their focus to specific countries or did not use big data. Past methodologies are insufficient for obtaining potential insights into the best practices for leveraging ESG. To address this problem, the authors used data from two platforms: LexisNexis, a platform that provides media monitoring, and Web of Science, a platform that provides scientific papers. These big data were analyzed by topic modeling, which can derive hidden semantic structures within text. Through this process, it is possible to collect information on public and academic sentiment. The authors explored the data from a text-mining perspective using bidirectional encoder representations from transformers topic (BERTopic), a state-of-the-art topic-modeling technique. In addition, changes in subject patterns over time were considered using dynamic topic modeling. The results show that concepts proposed by international organizations such as the United Nations (UN) have been discussed in academia, and that the media have formed a variety of agendas.
Keywords: ESG BERTopic natural language processing topic modeling