112 articles found
Assessing trends in wildland-urban interface fire research through text mining: a comprehensive analysis of published literature
1
Authors: Hafsae Lamsaf, Asmae Lamsaf, Mounir A. Kerroum, Miguel Almeida. Journal of Forestry Research (SCIE, EI, CAS, CSCD), 2024, No. 4, pp. 102-114 (13 pages)
Research on fires at the wildland-urban interface (WUI) has generated significant insights and advancements across various fields of study. Environmental, agricultural, and social sciences have played prominent roles in understanding the impacts of fires on the environment, in protecting communities, and in addressing management challenges. This study aimed to create a database using a text mining technique for global researchers interested in WUI projects and to highlight the interest of countries in this field. Author's-Keywords analysis emphasized the dominance of fire science-related terms, especially those related to WUI, and identified keyword clusters related to the WUI fire-risk-assessment system (“exposure”, “danger”, and “vulnerability”) within wildfire research. Trends over the past decade showcase shifting research interests with a growing focus on WUI fires, while regional variations highlighted that the “exposure” keyword cluster received greater attention in southern Europe and South America. However, vulnerability keywords have a relatively lower representation across all regions. The analysis underscores the interdisciplinary nature of WUI research and emphasizes the need for targeted approaches to address the unique challenges of the wildland-urban interface. Overall, this study provides valuable insights for researchers and serves as a foundation for further collaboration in this field through the understanding of the trends over recent years and in different regions.
Keywords: WUI; Text mining; Wildfires; Fire science; State of the art; Scientific publications
Introducing MagBERT:A language model for magnesium textual data mining and analysis
2
Authors: Surjeet Kumar, Russlan Jaafreh, Nirpendra Singh, Kotiba Hamad, Dae Ho Yoon. Journal of Magnesium and Alloys (SCIE, EI, CAS, CSCD), 2024, No. 8, pp. 3216-3228 (13 pages)
Magnesium (Mg)-based materials hold immense potential for various applications due to their light weight and high strength-to-weight ratio. However, to fully harness the potential of Mg alloys, structured analytics are essential to gain valuable insights from centuries of accumulated knowledge. Efficient information extraction from the vast corpus of scientific literature is crucial for this purpose. In this work, we introduce MagBERT, a BERT-based language model specifically trained for Mg-based materials. Utilizing a dataset of approximately 370,000 abstracts focused on Mg and its alloys, MagBERT is designed to understand the intricate details and specialized terminology of this domain. Through rigorous evaluation, we demonstrate the effectiveness of MagBERT for information extraction using a fine-tuned named entity recognition (NER) model, named MagNER. This NER model can extract mechanical, microstructural, and processing properties related to Mg alloys. For instance, we have created an Mg alloy dataset that includes properties such as ductility, yield strength, and ultimate tensile strength (UTS), along with standard alloy names. The introduction of MagBERT is a novel advancement in the development of Mg-specific language models, marking a significant milestone in the discovery of Mg alloys and textual information extraction. By making the pre-trained weights of MagBERT publicly accessible, we aim to accelerate research and innovation in the field of Mg-based materials through efficient information extraction and knowledge discovery.
Keywords: Mg alloys; MagBERT; BERT; NLP; Text mining; Information extraction
Dynamic evaluation of digital and green development policies based on text mining of the PMC framework
3
Authors: Ye Chunmei, Wu Lihua. Journal of Southeast University (English Edition) (EI, CAS), 2024, No. 3, pp. 319-326 (8 pages)
To identify policy topics and their evolutionary logic that enhance the digital and green development (dual development) of traditional manufacturing enterprises, address weaknesses in current policies, and provide resources for refining dual development policies, this study analyzed a total of 15,954 dual development-related policies issued by national and various departmental authorities in China from January 2000 to August 2023. Based on topic modeling techniques and the policy modeling consistency (PMC) framework, the evolution of policy topics was visualized, and a dynamic assessment of the policies was conducted. The results show that the digital and green development policy framework is progressively refined, and the governance philosophy shifts from a “regulatory government” paradigm to a “service-oriented government”. The support pattern evolves from “dispersed matching” to “integrated symbiosis”. However, there are still significant deficiencies in departmental cooperation, balanced measures, coordinated links, and multi-stakeholder participation. Future policy improvements should, therefore, focus on guiding multi-stakeholder participation, enhancing public demand orientation, and addressing the entire value chain. These steps aim to create an open and shared digital industry ecosystem to promote the coordinated dual development of traditional manufacturing enterprises.
Keywords: digital and green development; text mining; topic modeling; policy modeling consistency (PMC) framework; machine learning
Smart Approaches to Efficient Text Mining for Categorizing Sexual Reproductive Health Short Messages into Key Themes
4
Authors: Tobias Makai, Mayumbo Nyirenda. Open Journal of Applied Sciences, 2024, No. 2, pp. 511-532 (22 pages)
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization process of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the developed proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorise this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool developed demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text message-based platforms.
Keywords: Knowledge Discovery in Text (KDT); Sexual Reproductive Health (SRH); Text Categorization; Text Classification; Text Extraction; Text mining; Feature Extraction; Automated Classification; Process Performance; Stemming and Lemmatization; Natural Language Processing (NLP)
Research and Enlightenment of Text Mining Applications in ADR from Social Media
5
Authors: Lin Xueyi, Pang Li, Huang Zhe, Lian Guiyu. Asian Journal of Social Pharmacy, 2024, No. 1, pp. 9-19 (11 pages)
Objective: To discuss how to use social media data for post-marketing drug safety monitoring in China as soon as possible by systematically reviewing text mining applications, and to provide new ideas and methods for pharmacovigilance. Methods: Relevant domestic and foreign literature was used to explore text classification based on machine learning, text mining based on deep learning (neural networks), and adverse drug reaction (ADR) terminology. Results and Conclusion: Text classification based on traditional machine learning mainly includes the support vector machine (SVM) algorithm, naive Bayesian (NB) classifier, decision tree, hidden Markov model (HMM), and bidirectional encoder representations from transformers (BERT). The main deep learning-based neural networks for text mining are the convolutional neural network (CNN), recurrent neural network (RNN), and long short-term memory (LSTM). ADR terminology standardization tools mainly include the “Medical Dictionary for Regulatory Activities” (MedDRA), “WHODrug”, and “Systematized Nomenclature of Medicine-Clinical Terms” (SNOMED CT).
Keywords: social media data; text mining; adverse drug reaction
The Early Emotional Responses and Central Issues of People in the Epicenter of the COVID-19 Pandemic: An Analysis from Twitter Text Mining (cited by: 1)
6
Authors: Eun-Joo Choi, Yun-Jung Choi. International Journal of Mental Health Promotion, 2023, No. 1, pp. 21-29 (9 pages)
This study aimed to explore citizens’ emotional responses and issues of interest in the context of the coronavirus disease 2019 (COVID-19) pandemic. The dataset comprised 65,313 tweets with the location marked as New York State. The data collection period was four days of tweets when New York City imposed a lockdown order due to an increase in confirmed cases. Data analysis was performed using R Studio. The emotional responses in tweets were analyzed using the Bing and NRC (National Research Council Canada) dictionaries. The tweets’ central issue was identified by Text Network Analysis. When tweets were classified as either positive or negative, the negative sentiment was higher. Using the NRC dictionary, eight emotional classifications were devised: “trust,” “fear,” “anticipation,” “sadness,” “anger,” “joy,” “surprise,” and “disgust.” These results indicated that citizens showed negative and trusting emotional reactions in the early days of the pandemic. Moreover, citizens showed a strong interest in overcoming and coping together with other people, such as through social solidarity. Citizens were concerned about the confirmation of COVID-19 infection status and death. Efforts should be made to ensure citizens’ psychological stability by promptly informing them of the status of infectious disease management and the route of infection.
Keywords: COVID-19; community mental health; emotional responses; text mining; Twitter
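The dictionary-based positive/negative classification mentioned in the abstract above can be illustrated with a minimal sketch. The study itself used R Studio with the Bing and NRC dictionaries; the Python code below, the tiny `LEXICON`, and the `classify` function are hypothetical stand-ins chosen only to show the counting idea, not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; a real analysis would load a published
# dictionary such as Bing or NRC instead of these six words.
LEXICON = {
    "hope": "positive",
    "safe": "positive",
    "support": "positive",
    "fear": "negative",
    "death": "negative",
    "sick": "negative",
}

def classify(text):
    """Label a text by comparing counts of positive and negative lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    score = counts["positive"] - counts["negative"]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for tweet in ["Stay safe and support each other",
              "So much fear and death",
              "Lockdown starts today"]:
    print(tweet, "->", classify(tweet))
```

Counting matches per label keeps the sketch easy to extend to the eight NRC emotion categories by adding more label values to the lexicon.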
Environmental complaint insights through text mining based on the driver,pressure,state,impact,and response(DPSIR)framework:Evidence from an Italian environmental agency
7
Authors: Fabiana MANSERVISI, Michele BANZI, Tomaso TONELLI, Paolo VERONESI, Susanna RICCI, Damiano DISTANTE, Stefano FARALLI, Giuseppe BORTONE. Regional Sustainability, 2023, No. 3, pp. 261-281 (21 pages)
Individuals, local communities, environmental associations, private organizations, and public representatives and bodies may all be aggrieved by environmental problems concerning poor air quality, illegal waste disposal, water contamination, and general pollution. Environmental complaints represent the expressions of dissatisfaction with these issues. Because managing a large number of complaints is time-consuming, text mining may be useful for automatically extracting information on stakeholder priorities and concerns. The paper used text mining and semantic network analysis to crawl relevant keywords about environmental complaints from two online complaint submission systems: the online claim submission system of the Regional Agency for Prevention, Environment and Energy (Arpae) (“Contact Arpae”); and Arpae's internal platform for environmental pollution (“Environmental incident reporting portal”) in the Emilia-Romagna Region, Italy. We evaluated a total of 2477 records and classified this information based on the claim topic (air pollution, water pollution, noise pollution, waste, odor, soil, weather-climate, sea-coast, and electromagnetic radiation) and geographical distribution. Then, this paper used natural language processing to extract keywords from the dataset, and classified the keywords ranking higher in Term Frequency-Inverse Document Frequency (TF-IDF) based on the driver, pressure, state, impact, and response (DPSIR) framework. This study provided a systemic approach to understanding the interaction between people and environment in different geographical contexts and to building sustainable and healthy communities. The results showed that most complaints are from the public and associated with air pollution and odor. Factories (particularly foundries and ceramic industries) and farms are identified as the drivers of environmental issues. Citizens believed that environmental issues mainly affect human well-being. Moreover, the keywords “odor”, “report”, “request”, “presence”, “municipality”, and “hours” were the most influential and meaningful concepts, as demonstrated by their high degree and betweenness centrality values. Keywords connecting odor (classified as impacts) and air pollution (classified as state) were the most important (such as “odor-burnt plastic” and “odor-acrid”). Complainants perceived odor annoyance as a primary environmental concern, possibly related to two main drivers: “odor-factory” and “odors-farms”. The proposed approach has several theoretical and practical implications: text mining may quickly and efficiently address citizen needs, providing the basis toward automating (even partially) the complaint process; and the DPSIR framework might support the planning and organization of information and the identification of stakeholder concerns and priorities, as well as metrics and indicators for their assessment. Therefore, integration of the DPSIR framework with the text mining of environmental complaints might generate a comprehensive environmental knowledge base as a prerequisite for a wider exploitation of analysis to support decision-making processes and environmental management activities.
Keywords: Environmental complaints; Text mining approach; Term Frequency-Inverse Document Frequency (TF-IDF); Driver, pressure, state, impact, and response (DPSIR) framework; Semantic network analysis; Regional Agency for Prevention, Environment and Energy (Arpae)
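The TF-IDF keyword ranking mentioned in the abstract above can be sketched as follows. The abstract does not give the exact weighting used, so this sketch assumes the plain textbook form: term frequency as count divided by document length, and inverse document frequency as ln(N / df) without smoothing. The function name `tfidf_top_terms` and the three sample complaints are hypothetical.

```python
import math
from collections import Counter

def tfidf_top_terms(docs, k=3):
    """Return the k highest-scoring TF-IDF terms for each tokenized document.

    Assumes every document is a non-empty list of tokens.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    ranked = []
    for doc in docs:
        counts = Counter(doc)
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically so ties are stable.
        ranked.append(sorted(scores, key=lambda t: (-scores[t], t))[:k])
    return ranked

# Hypothetical complaint texts, already lowercased and whitespace-tokenized.
complaints = [
    "strong odor of burnt plastic near the factory".split(),
    "acrid odor from the farm at night".split(),
    "noise from the factory at night".split(),
]
print(tfidf_top_terms(complaints, k=2))
```

A term that occurs in every document (here "the") gets a weight of zero under this unsmoothed form, which is why such words drop out of the ranking without a separate stop-word list.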
Literature classification and its applications in condensed matter physics and materials science by natural language processing
8
Authors: 吴思远, 朱天念, 涂思佳, 肖睿娟, 袁洁, 吴泉生, 李泓, 翁红明. Chinese Physics B (SCIE, EI, CAS, CSCD), 2024, No. 5, pp. 117-123 (7 pages)
The exponential growth of literature is constraining researchers’ access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate a labelled dataset iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP ‘battery’ model, applied to a larger dataset different from the training and testing datasets, can achieve an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that even with insufficient data, the not-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
Keywords: natural language processing; text mining; materials science
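The F1 score reported in the abstract above is the harmonic mean of precision and recall for the positive class. The sketch below shows that calculation on a small made-up set of labels; the function name `f1_score` and the sample labels are illustrative assumptions and do not reproduce the paper's data or its 0.738 result.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score of the positive class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = paper belongs to the target topic, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
print(f1_score(y_true, y_pred))
```

Because F1 ignores true negatives, it is a common choice when the target class (for example, papers on one research direction) is a small share of the corpus.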
A comprehensive review of existing corpora and methods for creating annotated corpora for event extraction tasks
9
Authors: Mohd Hafizul Afifi Abdullah, Norshakirah Aziz, Said Jadid Abdulkadir, Kashif Hussain, Hitham Alhussian, Noureen Talpur. Journal of Data and Information Science (CSCD), 2024, No. 4, pp. 196-238 (43 pages)
Purpose: The purpose of this study is to serve as a comprehensive review of the existing annotated corpora. This review study aims to provide information on the existing annotated corpora for event extraction, which are limited but essential for training and improving the existing event extraction algorithms. In addition to the primary goal of this study, it provides guidelines for preparing an annotated corpus and suggests suitable tools for the annotation task. Design/methodology/approach: This study employs an analytical approach to examine available corpora that are suitable for event extraction tasks. It offers an in-depth analysis of existing event extraction corpora and provides systematic guidelines for researchers to develop accurate, high-quality corpora. This ensures the reliability of the created corpus and its suitability for training machine learning algorithms. Findings: Our exploration reveals a scarcity of annotated corpora for event extraction tasks. In particular, the English corpora are mainly focused on the biomedical and general domains. Despite the issue of annotated corpora scarcity, there are several high-quality corpora available and widely used as benchmark datasets. However, access to some of these corpora might be limited owing to closed-access policies or discontinued maintenance after being initially released, rendering them inaccessible owing to broken links. Therefore, this study documents the available corpora for event extraction tasks. Research limitations: Our study focuses only on well-known corpora available in English and Chinese. Nevertheless, this study places a strong emphasis on the English corpora due to the status of English as a global lingua franca, making it widely understood compared to other languages. Practical implications: We genuinely believe that this study provides valuable knowledge that can serve as a guiding framework for preparing and accurately annotating events from text corpora. It provides comprehensive guidelines for researchers to improve the quality of corpus annotations, especially for event extraction tasks across various domains. Originality/value: This study comprehensively compiled information on the existing annotated corpora for event extraction tasks and provided preparation guidelines.
Keywords: Information extraction; Event extraction; Text mining; Large language model; Natural language processing
A Study on the Correlation Between LIWC Word Categories and Chinese Composition Writing Performance of Fourth,Fifth,and Sixth Grade Students
10
Authors: Yufeng Wu. Journal of Contemporary Educational Research, 2024, No. 1, pp. 199-206 (8 pages)
This study focuses on the analysis of the Chinese composition writing performance of fourth, fifth, and sixth grade students in 16 selected schools in Longhua District, Shenzhen during the spring semester of 2023. Using LIWC (Linguistic Inquiry and Word Count) as a text analysis tool, the study explores the impact of LIWC categories on writing performance, which is measured by score. The results show that the simple LIWC word categories have a significant positive influence on the composition scores of lower-grade students, while complex LIWC word categories have a significant negative influence on the composition scores of lower-grade students but a significant positive influence on the composition scores of higher-grade students. Process word categories have a positive influence on the composition scores of all three grades, but the impact of complex process word categories increases as the grade level rises.
Keywords: Chinese composition; LIWC word categories; Writing performance; Grades; Text mining
Topic prevalence and trends of metaverse in healthcare:a bibliometric analysis
11
Authors: Pei Wu, Donghua Chen, Runtong Zhang. Data Science and Management, 2024, No. 2, pp. 129-143 (15 pages)
Metaverse technology is an advanced form of virtual reality and augmented technologies. It merges the digital world with the real world, thus benefitting healthcare services. Medical informatics is promising in the metaverse. Despite the increasing adoption of the metaverse in commercial applications, a considerable research gap remains in the academic domain, which hinders the comprehensive delineation of research prospects for the metaverse in healthcare. This study employs text-mining methods to investigate the prevalence and trends of the metaverse in healthcare; in particular, more than 34,000 academic articles and news reports are analyzed. Subsequently, the topic prevalence, similarity, and correlation are measured using topic-modeling methods. Based on bibliometric analysis, this study proposes a theoretical framework from the perspectives of knowledge, socialization, digitization, and intelligence. This study provides insights into its application in healthcare via an extensive literature review. The key to promoting the metaverse in healthcare is to perform technological upgrades in computer science, telecommunications, healthcare services, and computational biology. Digitization, virtualization, and hyperconnectivity technologies are crucial in advancing healthcare systems. Realizing their full potential necessitates collective support and concerted effort toward the transformation of relevant service providers, the establishment of a digital economy value system, and the reshaping of social governance and health concepts. The results elucidate the current state of research and offer guidance for the advancement of the metaverse in healthcare.
Keywords: Metaverse; Health informatics; Text mining; Bibliometric analysis
Evaluation of Driver-Induced Human Errors in Smart Construction Tower Crane Operations Based on DEMATEL-ISM-MICMAC
12
Authors: Jiahao Wang, Wen Si. Journal of Applied Mathematics and Physics, 2024, No. 4, pp. 1541-1556 (16 pages)
With the advent of Industry 4.0, smart construction sites have seen significant development in China. However, accidents involving digitized tower cranes continue to be a persistent issue. Among the contributing factors, unsafe human behavior stands out as a primary cause of these incidents. This study aims to assess the human reliability of tower crane operations on smart construction sites. To proactively enhance safety measures, the research employs text mining techniques (TF-IDF-Truncated SVD-Complement NB) to identify patterns of human errors among tower crane operators. Building upon the SHEL model, the study categorizes behavioral factors affecting human reliability in the man-machine interface, leading to the establishment of the Performance Shaping Factors (PSFs) system. Furthermore, the research constructs an error impact indicator system for the intelligent construction site tower crane operator interface. Using the DEMATEL method, it analyzes the significance of various factors influencing human errors in tower crane operations. Additionally, the ISM-MICMAC method is applied to unveil the hierarchical relationships and driving-dependent connections among these influencing factors. The findings indicate that personal state, operating procedures, and physical environment directly impact human errors, while personal capability, technological environment, and one fundamental organizational management factor contribute indirectly.
Keywords: Text mining; DEMATEL-ISM-MICMAC; Performance Shaping Factors; Smart Construction; Tower Crane Operator
Comprehensive review of text-mining applications in finance (cited by: 4)
13
Authors: Aaryan Gupta, Vinya Dengre, Hamza Abubakar Kheruwala, Manan Shah. Financial Innovation, 2020, No. 1, pp. 732-756 (25 pages)
Text-mining technologies have substantially affected financial industries. As the data in every sector of finance have grown immensely, text mining has emerged as an important field of research in the domain of finance. Therefore, reviewing the recent literature on text-mining applications in finance can be useful for identifying areas for further research. This paper focuses on the text-mining literature related to financial forecasting, banking, and corporate finance. It also analyses the existing literature on text mining in financial applications and provides a summary of some recent studies. Finally, the paper briefly discusses various text-mining methods being applied in the financial domain, the challenges faced in these applications, and the future scope of text mining in finance.
Keywords: Text mining; Machine learning; Financial forecasting; Sentiment analysis; Text classification; Corporate finance
Research on Text Mining of Syndrome Element Syndrome Differentiation by Natural Language Processing (cited by: 5)
14
Authors: DENG Wen-Xiang, ZHU Jian-Ping, LI Jing, YUAN Zhi-Ying, WU Hua-Ying, YAO Zhong-Hua, ZHANG Yi-Ge, ZHANG Wen-An, HUANG Hui-Yong. Digital Chinese Medicine, 2019, No. 2, pp. 61-71 (11 pages)
Objective: Natural language processing (NLP) was used to excavate and visualize the core content of syndrome element syndrome differentiation (SESD). Methods: The first step was to build a text mining and analysis environment based on the Python language and to build a corpus based on the core chapters of SESD. The second step was to digitalize the corpus. The main steps included word segmentation, information cleaning and merging, document-entry matrix, dictionary compilation, and information conversion. The third step was to mine and display the internal information of the SESD corpus by means of word cloud, keyword extraction, and visualization. Results: NLP played a positive role in computer recognition and comprehension of SESD. Different chapters had different keywords and weights. Deficiency syndrome elements were an important component of SESD, such as "Qi deficiency", "Yang deficiency", and "Yin deficiency". The important syndrome elements of substantiality included "Blood stasis", "Qi stagnation", etc. Core syndrome elements were closely related. Conclusions: Syndrome differentiation and treatment was the core of SESD. Using NLP to excavate syndrome differentiation could help reveal the internal relationships within syndrome differentiation and provide a basis for artificial intelligence to learn syndrome differentiation.
Keywords: Syndrome element syndrome differentiation (SESD); Natural language processing (NLP); Diagnostics of TCM; Artificial intelligence; Text mining
Mining Related Articles for Automatic Journal Cataloging
15
Authors: Yuqing Mao, Zhiyong Lu. Journal of Data and Information Science, 2016, No. 2, pp. 45-59 (15 pages)
Purpose: This paper is an investigation of the effectiveness of the method of clustering biomedical journals through mining the content similarity of journal articles. Design/methodology/approach: 3,265 journals in PubMed are analyzed based on article content similarity and Web usage, respectively. Comparisons of the two analysis approaches and a citation-based approach are given. Findings: Our results suggest that article content similarity is useful for clustering biomedical journals, and the content-similarity-based journal clustering method is more robust and less subject to human factors compared with the usage-based approach and the citation-based approach. Research limitations: Our paper currently focuses on clustering journals in the biomedical domain because there is a large volume of freely available resources such as PubMed and MeSH in this field. Further investigation is needed to improve this approach to fit journals in other domains. Practical implications: Our results show that it is feasible to catalog biomedical journals by mining the article content similarity. This work is also significant in serving practical needs in research portfolio analysis. Originality/value: To the best of our knowledge, we are among the first to report on clustering journals in the biomedical field through mining the article content similarity. This method can be integrated with existing approaches to create a new paradigm for future studies of journal clustering.
Keywords: PubMed; Journals; Cluster; Catalog; Text mining; Research evaluation
Contextual Text Mining Framework for Unstructured Textual Judicial Corpora through Ontologies
16
作者 Zubair Nabi Ramzan Talib +1 位作者 Muhammad Kashif Hanif Muhammad Awais 《Computer Systems Science & Engineering》 SCIE EI 2022年第12期1357-1374,共18页
Digitalization has changed the way of information processing, and newtechniques of legal data processing are evolving. Text mining helps to analyze andsearch different court cases available in the form of digital text... Digitalization has changed the way of information processing, and newtechniques of legal data processing are evolving. Text mining helps to analyze andsearch different court cases available in the form of digital text documents toextract case reasoning and related data. This sort of case processing helps professionals and researchers to refer the previous case with more accuracy in reducedtime. The rapid development of judicial ontologies seems to deliver interestingproblem solving to legal knowledge formalization. Mining context informationthrough ontologies from corpora is a challenging and interesting field. Thisresearch paper presents a three tier contextual text mining framework throughontologies for judicial corpora. This framework comprises on the judicial corpus,text mining processing resources and ontologies for mining contextual text fromcorpora to make text and data mining more reliable and fast. A top-down ontologyconstruction approach has been adopted in this paper. The judicial corpus hasbeen selected with a sufficient dataset to process and evaluate the results.The experimental results and evaluations show significant improvements incomparison with the available techniques. 展开更多
Keywords: natural language processing; judicial corpora; contextual text mining; ontologies; information extraction; information retrieval
Automatic Surveillance of Pandemics Using Big Data and Text Mining
17
Authors: Abdullah Alharbi, Wael Alosaimi, M. Irfan Uddin. Computers, Materials & Continua (SCIE, EI), 2021, No. 7, pp. 303-317 (15 pages)
COVID-19 disease is spreading exponentially due to the rapid transmission of the virus between humans. Different countries have tried different solutions to control the spread of the disease, including lockdowns of countries or cities, quarantines, isolation, sanitization, and masks. Patients with symptoms of COVID-19 are tested using medical testing kits; these tests must be conducted by healthcare professionals. However, the testing process is expensive and time-consuming. There is no surveillance framework that can identify regions of infected individuals and determine the rate of spread so that precautions can be taken. This paper introduces a novel technique based on deep learning (DL) that can be used as a surveillance system to identify infected individuals by analyzing tweets related to COVID-19. The system is used only for surveillance purposes, to identify regions where the spread of COVID-19 is high; clinical tests should then be used to test and identify infected individuals. The proposed system uses recurrent neural networks (RNN) and word-embedding techniques to analyze tweets and determine whether a tweet provides information about COVID-19 or refers to individuals who have been infected with the virus. The results demonstrate that RNN can conduct this analysis more accurately than other machine learning (ML) algorithms.
Keywords: disease surveillance; social media analysis; recurrent neural networks; text mining
A Parallel Platform for Web Text Mining
18
Authors: Ping Lu, Zhenjiang Dong, Shengmei Luo, Lixia Liu, Shanshan Guan, Shengyu Liu, Qingcai Chen. ZTE Communications, 2013, No. 3, pp. 56-61 (6 pages)
With user-generated content, anyone can be a content creator. This phenomenon has infinitely increased the amount of information circulated online, and it is becoming harder to efficiently obtain required information. In this paper, we describe how natural language processing and text mining can be parallelized using Hadoop and Message Passing Interface. We propose a parallel web text mining platform that processes massive amounts of data quickly and efficiently. Our web knowledge service platform is designed to collect information about the IT and telecommunications industries from the web and process this information using natural language processing and data-mining techniques.
Keywords: natural language processing; text mining; massive data; parallel; web knowledge service
The Chinese Image on Twitter: An Empirical Study Based on Text Mining
19
Authors: Ming Xiao, Hongfa Yi. Journalism and Mass Communication, 2016, No. 8, pp. 469-479 (11 pages)
The study uses a crawler to collect 842,917 hot tweets written in English that contain the keyword "Chinese" or "China". Topic modeling and sentiment analysis are used to explore the tweets. Thirty topics are extracted. Overall, 33% of the tweets relate to politics, 20% to the economy, 21% to culture, and 26% to society. Regarding polarity, 55% of the tweets are positive, 31% are negative, and the remaining 14% are neutral. Only 25.3% of the tweets express obvious sentiment, and most of these express joy.
Keywords: Chinese image; topic modeling; sentiment analysis; text mining; Twitter
A Novel Framework for Biomedical Text Mining
20
Authors: Janyl Jumadinova, Oliver Bonham-Carter, Hanzhong Zheng, Michael Camara, Dejie Shi. Journal on Big Data, 2020, No. 4, pp. 145-155 (11 pages)
Text mining has emerged as an effective method of handling and extracting useful information from the exponentially growing biomedical literature and biomedical databases. We developed a novel biomedical text mining model implemented by a multi-agent system and a distributed computing mechanism. Our distributed system, TextMed, comprises several software agents, where each agent uses a reinforcement learning method to update the sentiment of relevant text from a particular set of research articles related to specific keywords. TextMed can also operate on different physical machines to expedite its knowledge extraction by utilizing a clustering technique. We collected biomedical textual data from PubMed and assigned it to a multi-agent biomedical text mining system, in which the agents communicate directly with one another to collaboratively determine the relevant information inside the textual data. Our experimental results indicate that TextMed parallelizes and distributes the learning process across individual agents, appropriately learns the sentiment score of specific keywords, and efficiently finds connections in biomedical information through the text mining paradigm.
Keywords: biomedical text mining; reinforcement learning; multi-agent; distributed text mining; cluster