2,611 articles found
Smart Approaches to Efficient Text Mining for Categorizing Sexual Reproductive Health Short Messages into Key Themes
1
Authors: Tobias Makai, Mayumbo Nyirenda. 《Open Journal of Applied Sciences》, 2024, No. 2, pp. 511-532 (22 pages)
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization process of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the developed proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorize this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool developed demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text-message-based platforms.
Keywords: Knowledge Discovery in Text (KDT); Sexual Reproductive Health (SRH); Text Categorization; Text Classification; Text Extraction; Text Mining; Feature Extraction; Automated Classification Process Performance; Stemming and Lemmatization; Natural Language Processing (NLP)
Research and Enlightenment of Text Mining Applications in ADR from Social Media
2
Authors: Lin Xueyi, Pang Li, Huang Zhe, Lian Guiyu. 《Asian Journal of Social Pharmacy》, 2024, No. 1, pp. 9-19 (11 pages)
Objective: To discuss how to use social media data for post-marketing drug safety monitoring in China as soon as possible by systematically reviewing text mining applications, and to provide new ideas and methods for pharmacovigilance. Methods: Relevant domestic and foreign literature was used to explore text classification based on machine learning, text mining based on deep learning (neural networks), and adverse drug reaction (ADR) terminology. Results and Conclusion: Text classification based on traditional machine learning mainly includes the support vector machine (SVM) algorithm, the naive Bayesian (NB) classifier, decision trees, the hidden Markov model (HMM), and bidirectional encoder representations from transformers (BERT). The main neural networks for text mining based on deep learning are the convolutional neural network (CNN), the recurrent neural network (RNN), and long short-term memory (LSTM). ADR terminology standardization tools mainly include the "Medical Dictionary for Regulatory Activities" (MedDRA), "WHODrug", and "Systematized Nomenclature of Medicine-Clinical Terms" (SNOMED CT).
Keywords: social media data; text mining; adverse drug reaction
Probabilistic Language Modelling for Context-Sensitive Opinion Mining
3
《信息工程期刊(中英文版)》, 2015, No. 5, pp. 7-11 (5 pages)
Keywords: context-sensitive; mining methods; modelling methods; language; probability; machine learning; indicators; analytics
The Early Emotional Responses and Central Issues of People in the Epicenter of the COVID-19 Pandemic: An Analysis from Twitter Text Mining
4
Authors: Eun-Joo Choi, Yun-Jung Choi. 《International Journal of Mental Health Promotion》, 2023, No. 1, pp. 21-29 (9 pages)
This study aimed to explore citizens' emotional responses and issues of interest in the context of the coronavirus disease 2019 (COVID-19) pandemic. The dataset comprised 65,313 tweets with the location marked as New York State. The data collection period was four days of tweets when New York City imposed a lockdown order due to an increase in confirmed cases. Data analysis was performed using R Studio. The emotional responses in tweets were analyzed using the Bing and NRC (National Research Council Canada) dictionaries. The tweets' central issue was identified by Text Network Analysis. When tweets were classified as either positive or negative, the negative sentiment was higher. Using the NRC dictionary, eight emotional classifications were devised: “trust,” “fear,” “anticipation,” “sadness,” “anger,” “joy,” “surprise,” and “disgust.” These results indicated that citizens showed negative and trusting emotional reactions in the early days of the pandemic. Moreover, citizens showed a strong interest in overcoming and coping together with other people, such as through social solidarity. Citizens were concerned about the confirmation of COVID-19 infection status and death. Efforts should be made to ensure citizens' psychological stability by promptly informing them of the status of infectious disease management and the route of infection.
Keywords: COVID-19; community mental health; emotional responses; text mining; Twitter
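The dictionary-based sentiment labelling described in the abstract above can be illustrated with a minimal sketch. The study itself used R Studio with the Bing and NRC lexicons, so the Python code below is not the authors' method; `TOY_LEXICON`, `label_counts`, and the sample tweets are hypothetical stand-ins chosen only to show the idea of counting lexicon hits per label.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# the real Bing and NRC lexicons contain thousands of entries.
TOY_LEXICON = {
    "hope": "positive",
    "safe": "positive",
    "support": "positive",
    "fear": "negative",
    "death": "negative",
    "sick": "negative",
}


def label_counts(tweets, lexicon=TOY_LEXICON):
    """Count how many lexicon words of each label occur in the tweets."""
    counts = Counter()
    for tweet in tweets:
        for word in tweet.lower().split():
            # Strip simple trailing/leading punctuation before the lookup.
            label = lexicon.get(word.strip(".,!?"))
            if label:
                counts[label] += 1
    return counts


# Made-up sample data, not taken from the study's dataset.
tweets = ["Fear and death everywhere.", "Stay safe, we support you!"]
counts = label_counts(tweets)
```

Comparing `counts["negative"]` with `counts["positive"]` gives the kind of positive-versus-negative tally the abstract reports; a finer-grained lexicon would simply map words to more labels.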
Environmental complaint insights through text mining based on the driver, pressure, state, impact, and response (DPSIR) framework: Evidence from an Italian environmental agency
5
Authors: Fabiana MANSERVISI, Michele BANZI, Tomaso TONELLI, Paolo VERONESI, Susanna RICCI, Damiano DISTANTE, Stefano FARALLI, Giuseppe BORTONE. 《Regional Sustainability》, 2023, No. 3, pp. 261-281 (21 pages)
Individuals, local communities, environmental associations, private organizations, and public representatives and bodies may all be aggrieved by environmental problems concerning poor air quality, illegal waste disposal, water contamination, and general pollution. Environmental complaints represent the expressions of dissatisfaction with these issues. Because managing a large number of complaints is time-consuming, text mining may be useful for automatically extracting information on stakeholder priorities and concerns. The paper used text mining and semantic network analysis to crawl relevant keywords about environmental complaints from two online complaint submission systems: the online claim submission system of the Regional Agency for Prevention, Environment and Energy (Arpae) (“Contact Arpae”), and Arpae's internal platform for environmental pollution (“Environmental incident reporting portal”) in the Emilia-Romagna Region, Italy. We evaluated a total of 2477 records and classified this information based on the claim topic (air pollution, water pollution, noise pollution, waste, odor, soil, weather-climate, sea-coast, and electromagnetic radiation) and geographical distribution. Then, this paper used natural language processing to extract keywords from the dataset, and classified the keywords ranking higher in Term Frequency-Inverse Document Frequency (TF-IDF) based on the driver, pressure, state, impact, and response (DPSIR) framework. This study provided a systemic approach to understanding the interaction between people and the environment in different geographical contexts and to building sustainable and healthy communities. The results showed that most complaints are from the public and are associated with air pollution and odor. Factories (particularly foundries and ceramic industries) and farms are identified as the drivers of environmental issues. Citizens believed that environmental issues mainly affect human well-being. Moreover, the keywords “odor”, “report”, “request”, “presence”, “municipality”, and “hours” were the most influential and meaningful concepts, as demonstrated by their high degree and betweenness centrality values. Keywords connecting odor (classified as impacts) and air pollution (classified as state) were the most important (such as “odor-burnt plastic” and “odor-acrid”). Complainants perceived odor annoyance as a primary environmental concern, possibly related to two main drivers: “odor-factory” and “odors-farms”. The proposed approach has several theoretical and practical implications: text mining may quickly and efficiently address citizen needs, providing the basis for automating (even partially) the complaint process; and the DPSIR framework might support the planning and organization of information and the identification of stakeholder concerns and priorities, as well as metrics and indicators for their assessment. Therefore, integration of the DPSIR framework with the text mining of environmental complaints might generate a comprehensive environmental knowledge base as a prerequisite for a wider exploitation of analysis to support decision-making processes and environmental management activities.
Keywords: Environmental complaints; Text mining approach; Term Frequency-Inverse Document Frequency (TF-IDF); Driver, pressure, state, impact, and response (DPSIR) framework; Semantic network analysis; Regional Agency for Prevention, Environment and Energy (Arpae)
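The TF-IDF keyword ranking mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: it uses the plain formulation tf × log(N / df) with whitespace tokenization, and the function names `tfidf_scores` and `top_keywords` as well as the three sample complaints are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return one {term: tf-idf} dict per document (docs are token lists)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(docs, k=3):
    """Rank terms by their highest tf-idf score in any document."""
    best = {}
    for doc_scores in tfidf_scores(docs):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Made-up complaint texts, not taken from the Arpae dataset.
complaints = [
    "strong odor of burnt plastic near the factory".split(),
    "acrid odor from the farm at night".split(),
    "noise from the factory at night".split(),
]
scores = tfidf_scores(complaints)
```

A term that occurs in every document, such as "the" here, scores zero, while terms concentrated in one complaint rise to the top; that is the property that makes TF-IDF useful for picking out distinctive complaint keywords before they are mapped to DPSIR categories.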
Text Classification of Coal Mine Hidden Hazards Based on a TextCNN-Attention-BiLSTM Fusion Model
6
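Authors: Luo Haiping (罗海平), Zeng Xiangyang (曾向阳), Chen Yong (陈勇). 《武汉理工大学学报(信息与管理工程版)》 (CAS), 2024, No. 2, pp. 299-305 (7 pages)
The study aims to classify large volumes of coal mine hidden-hazard texts quickly and accurately, so that the safety situation can be understood and managed in time. First, several coal mine hazard databases from the Safety Library website (安全文库网) were selected as the experimental data source, and the hazard texts were preprocessed, including noise-word removal, word segmentation, and word-vector representation. Second, TextCNN was used to apply convolutions to the text and extract feature representations at different scales; a BiLSTM model then performed sequence modelling on the resulting feature vectors, combined with an attention mechanism, in order to focus better on key information in the text and capture its global semantics. Finally, a multi-label classifier in the fully connected layer predicted the hazard category of each text. The experimental results show that the TextCNN-Attention-BiLSTM fusion model exceeds 92% on accuracy, precision, recall, and F1 score, providing a more accurate and effective solution for coal mine hazard text classification, which is significant for optimizing coal mine safety management.
Keywords: coal mine safety; TextCNN; attention mechanism; BiLSTM; text classification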
Development of WebTextMiner, a Chinese Web Text Mining System (cited by: 1)
7
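Authors: Wei Song (魏松), Zhong Yixin (钟义信), Wang Xiangying (王翔英). 《计算机应用研究》 (CSCD, Peking University Core Journal), 2006, No. 6, pp. 211-213 (3 pages)
The development of Web text mining systems greatly advances research on Web text mining. Therefore, building on a study of the performance of an SVM-based Chinese web page classifier, and in line with research and practical needs, the authors implemented a Chinese Web text mining system with good performance.
Keywords: Web text mining; support vector machine; K-nearest neighbor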
Comprehensive review of text-mining applications in finance (cited by: 4)
8
Authors: Aaryan Gupta, Vinya Dengre, Hamza Abubakar Kheruwala, Manan Shah. 《Financial Innovation》, 2020, No. 1, pp. 732-756 (25 pages)
Text-mining technologies have substantially affected financial industries. As the data in every sector of finance have grown immensely, text mining has emerged as an important field of research in the domain of finance. Therefore, reviewing the recent literature on text-mining applications in finance can be useful for identifying areas for further research. This paper focuses on the text-mining literature related to financial forecasting, banking, and corporate finance. It also analyses the existing literature on text mining in financial applications and provides a summary of some recent studies. Finally, the paper briefly discusses various text-mining methods being applied in the financial domain, the challenges faced in these applications, and the future scope of text mining in finance.
Keywords: text mining; machine learning; financial forecasting; sentiment analysis; text classification; corporate finance
Knowledge acquisition, semantic text mining, and security risks in health and biomedical informatics (cited by: 2)
9
Authors: J Harold Pardue, William T Gerthoffer. 《World Journal of Biological Chemistry》 (CAS), 2012, No. 2, pp. 27-33 (7 pages)
Computational techniques have been adopted in medical and biological systems for a long time. There is no doubt that the development and application of computational methods will render great help in better understanding biomedical and biological functions. Large numbers of datasets have been produced by biomedical and biological experiments and simulations. In order for researchers to gain knowledge from original data, nontrivial transformation is necessary, which is regarded as a critical link in the chain of knowledge acquisition, sharing, and reuse. Challenges that have been encountered include: how to efficiently and effectively represent human knowledge in formal computing models, how to take advantage of semantic text mining techniques rather than traditional syntactic text mining, and how to handle security issues during knowledge sharing and reuse. This paper summarizes the state of the art in these research directions. We aim to provide readers with an introduction to the major computing themes to be applied to medical and biological research.
Keywords: biomedical informatics; bioinformatics; knowledge sharing; ontology matching; heterogeneous semantics; semantic integration; semantic data mining; semantic text mining; security risk
Advantages of Using a Spell Checker in Text Mining Pre-Processes (cited by: 1)
10
Authors: Jhonathan Quillo-Espino, Rosa María Romero-González, Alberto Lara-Guevara. 《Journal of Computer and Communications》, 2018, No. 11, pp. 43-54 (12 pages)
The aim of this work was to analyze the behavior when a spell checker was integrated as an extra pre-process during the first stage of text mining. Different models were analyzed, and the most complete one was chosen, considering the pre-processes as the initial part of the text mining process. Algorithms for the Spanish language were developed and adapted, and the methodology was tested through the analysis of 2363 words. A notation capable of removing special and unwanted characters was created. Execution times of each algorithm were analyzed to test the efficiency of the text mining pre-process with and without orthographic revision. The total time was shorter with the spell checker than without it. The key difference between this work and existing related studies is that this is the first time a spell checker has been used in the text mining pre-processes.
Keywords: spell checker; text mining; stemming; tokenization; Porter algorithm; Snowball algorithm
Research on the Development of Text Mining Technology based on Bibliometrics and Knowledge Map Visualization (cited by: 4)
11
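Authors: Qiuxue Xu, Na Niu, Yongmin Quan, Zhezhi Jin. 《信息工程期刊(中英文版)》, 2017, No. 1, pp. 15-26 (12 pages)
Keywords: mining technology; map visualization; knowledge; text data; statistical analysis; social science; visual analysis; measurement analysis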
Visualization of Special Features in “The Tale of Genji” by Text Mining and Correspondence Analysis with Clustering
12
Authors: Hisako Hosoi, Takayuki Yamagata, Yuya Ikarashi, Nobuyuki Fujisawa. 《Journal of Flow Control, Measurement & Visualization》, 2014, No. 1, pp. 1-6 (6 pages)
In this paper, visualization of special features in “The Tale of Genji”, a typical work of Japanese classical literature, is studied by text mining the auxiliary verbs and examining the similarity in sentence style by correspondence analysis with clustering. The result shows that the text mining error in the number of auxiliary verbs can be as small as 15%. The features extracted in this study support the hypothesis of multiple authors of “The Tale of Genji”, which agrees well with the result of Murakami and Imanishi [1]. It is also found that the extracted features are robust to the text mining error, which suggests that the classification error is little affected by the text mining error and that this technique could be used for further statistical study of classical literature.
Keywords: visualization; scientific art; The Tale of Genji; text mining; correspondence analysis; clustering
Construction of Complex Intervention in Traditional Chinese Medicine (TCM): A Preliminary Methodological Study Based on Text Mining and Interviewing Method
13
Authors: Lian Gong, Wenzhi Hao, Feifei Xue. 《Pharmacology & Pharmacy》, 2019, No. 3, pp. 130-136 (7 pages)
Aim: To explore and analyze the feasibility of establishing a program of complex intervention in Traditional Chinese Medicine (TCM) based on the Text Mining and Interviewing method. Methods: According to MRC, constructing the program of complex intervention in TCM by the Text Mining and Interviewing method should include 4 steps: 1) establishment of an interview framework via normalization of the extraction of ancient documents and effectiveness of the collection of modern periodical literature; 2) materialization of the interview outline based on Focus Group Interview; 3) rudimentary construction of the complex intervention program based on Semi-structured Interview; 4) evaluation of the curative effect of the complex intervention. Conclusions: It is feasible and meaningful to establish a program of complex intervention in TCM based on the Text Mining and Interviewing method.
Keywords: Traditional Chinese Medicine; text mining; interviewing method; complex intervention
A Parallel Platform for Web Text Mining
14
Authors: Ping Lu, Zhenjiang Dong, Shengmei Luo, Lixia Liu, Shanshan Guan, Shengyu Liu, Qingcai Chen. 《ZTE Communications》, 2013, No. 3, pp. 56-61 (6 pages)
With user-generated content, anyone can be a content creator. This phenomenon has infinitely increased the amount of information circulated online, and it is becoming harder to efficiently obtain required information. In this paper, we describe how natural language processing and text mining can be parallelized using Hadoop and Message Passing Interface. We propose a parallel web text mining platform that processes massive amounts of data quickly and efficiently. Our web knowledge service platform is designed to collect information about the IT and telecommunications industries from the web and process this information using natural language processing and data-mining techniques.
Keywords: natural language processing; text mining; massive data; parallel; web knowledge service
Automatic Surveillance of Pandemics Using Big Data and Text Mining
15
Authors: Abdullah Alharbi, Wael Alosaimi, M. Irfan Uddin. 《Computers, Materials & Continua》 (SCIE, EI), 2021, No. 7, pp. 303-317 (15 pages)
COVID-19 disease is spreading exponentially due to the rapid transmission of the virus between humans. Different countries have tried different solutions to control the spread of the disease, including lockdowns of countries or cities, quarantines, isolation, sanitization, and masks. Patients with symptoms of COVID-19 are tested using medical testing kits; these tests must be conducted by healthcare professionals. However, the testing process is expensive and time-consuming. There is no surveillance system that can be used as a surveillance framework to identify regions of infected individuals and determine the rate of spread so that precautions can be taken. This paper introduces a novel technique based on deep learning (DL) that can be used as a surveillance system to identify infected individuals by analyzing tweets related to COVID-19. The system is used only for surveillance purposes to identify regions where the spread of COVID-19 is high; clinical tests should then be used to test and identify infected individuals. The system proposed here uses recurrent neural networks (RNN) and word-embedding techniques to analyze tweets and determine whether a tweet provides information about COVID-19 or refers to individuals who have been infected with the virus. The results demonstrate that RNN can conduct this analysis more accurately than other machine learning (ML) algorithms.
Keywords: disease surveillance; social media analysis; recurrent neural networks; text mining
Contextual Text Mining Framework for Unstructured Textual Judicial Corpora through Ontologies
16
Authors: Zubair Nabi, Ramzan Talib, Muhammad Kashif Hanif, Muhammad Awais. 《Computer Systems Science & Engineering》 (SCIE, EI), 2022, No. 12, pp. 1357-1374 (18 pages)
Digitalization has changed the way information is processed, and new techniques of legal data processing are evolving. Text mining helps to analyze and search different court cases available in the form of digital text documents to extract case reasoning and related data. This sort of case processing helps professionals and researchers to refer to previous cases with more accuracy in reduced time. The rapid development of judicial ontologies seems to deliver interesting problem solving for legal knowledge formalization. Mining context information through ontologies from corpora is a challenging and interesting field. This research paper presents a three-tier contextual text mining framework through ontologies for judicial corpora. This framework comprises the judicial corpus, text mining processing resources, and ontologies for mining contextual text from corpora to make text and data mining more reliable and fast. A top-down ontology construction approach has been adopted in this paper. The judicial corpus has been selected with a sufficient dataset to process and evaluate the results. The experimental results and evaluations show significant improvements in comparison with the available techniques.
Keywords: natural language processing; judicial corpora; contextual text mining; ontologies; information extraction; information retrieval
Treatment Principles of Obesity with Chinese Herbal Medicine: Literature Analysis by Text Mining
17
Authors: Yunyu Huang, Lianjie Wang, Shidong Wang, Feng Cai, Guang Zheng, Aiping Lu, Xiuchen Yu, Miao Jiang. 《Engineering(科研)》, 2013, No. 10, pp. 7-11 (5 pages)
Obesity represents a social health problem worldwide, associated with serious health risks and increased mortality. The prevalence of obesity is reported to be increasing in both developed and developing countries. Obesity is associated with a significant range of comorbidities and is linked with increases in mortality; thus the treatment of obesity is very important. Chinese herbal medicine (CHM) has been used for weight management both in China and in western countries for many years, and the effectiveness and safety of CHMs in obesity have been proved. Yet the principles of treating obesity with CHMs are hard to manage due to the complexity of TCM theory. In this study, a novel text mining method was developed based on a comprehensive collection of literature in order to explore the treatment principles more intuitively. Networks of the TCM patterns and CHMs most frequently used in obesity treatment were built and analyzed, and two major principles in treating obesity were identified: one is resolving phlegm and dampness, the other is clearing heat and reinforcing deficiency. These findings might guide clinicians in the treatment of obesity.
Keywords: obesity; Chinese herbal medicine; pattern; Traditional Chinese Medicine; text mining
A Novel Framework for Biomedical Text Mining
18
Authors: Janyl Jumadinova, Oliver Bonham-Carter, Hanzhong Zheng, Michael Camara, Dejie Shi. 《Journal on Big Data》, 2020, No. 4, pp. 145-155 (11 pages)
Text mining has emerged as an effective method of handling and extracting useful information from the exponentially growing biomedical literature and biomedical databases. We developed a novel biomedical text mining model implemented by a multi-agent system and a distributed computing mechanism. Our distributed system, TextMed, comprises several software agents, where each agent uses a reinforcement learning method to update the sentiment of relevant text from a particular set of research articles related to specific keywords. TextMed can also operate on different physical machines to expedite its knowledge extraction by utilizing a clustering technique. We collected biomedical textual data from PubMed and then assigned it to a multi-agent biomedical text mining system, where the agents communicate directly with each other and collaborate to determine the relevant information inside the textual data. Our experimental results indicate that TextMed parallelizes and distributes the learning process across individual agents, appropriately learns the sentiment score of specific keywords, and efficiently finds connections in biomedical information through the text mining paradigm.
Keywords: biomedical text mining; reinforcement learning; multi-agent; distributed text mining; cluster
Text Mining Analysis of Efficiency of the Continuously Implemented Gathering Type Action Plan for Male Elderly People Obtained
19
Authors: Motoya Yamada, Ruriko Kidachi, Tetsuko Takaoka, Yosuke Kamata, Chiyoko Kimura, Mayumi Shimizu, Kazutaka Kikuchi. 《Open Journal of Nursing》, 2022, No. 1, pp. 25-41 (17 pages)
Aim: To clarify the transformation of the participants' consciousness of rebuilding the community, and the factors behind it, from the contents of discussions held during actions for male elderly people in Town A in Fukushima prefecture. Design: This study was action research. Method: The author transcribed the discussion contents of the actions conducted in 2018-2019 and analyzed them for each year by the text mining method. Results: The word appearance frequency was highest for “Person”, followed by “Town A”, in both years. One large word network was formed in 2018, and its topic was what the participants feel in their life in Town A. Two large word networks were formed in 2019, and their topic was community participation, including the difficulty of motivating others, such as how people who do not participate can be made to feel like joining.
Keywords: action research; male elderly people; community reconstitution; text mining method; nuclear power plant accident
Text Mining Based on the Korean Word Segmentation System in the Context of Big Data
20
Authors: Yongmin Quan, Na Niu, Hongyi Li, Zhezhi Jin. 《信息工程期刊(中英文版)》, 2018, No. 1, pp. 1-7 (7 pages)
Text mining is a form of text data analysis that finds concepts and the relationships between underlying concepts in unstructured text, extracting previously unrecognized patterns or associations from large text databases; some information retrieval and text processing systems can also find relationships between words and paragraphs. This article first describes the data sources and briefly introduces the related platforms and functional components. Secondly, it explains the Chinese word segmentation system and the Korean word segmentation system. Finally, it takes news, documents, and materials about the Korean Peninsula, together with various public opinion data from the web, as the basic data for the research. Examples of word frequency graphs and word cloud graphs are presented to show the results of text mining with the Chinese and Korean word segmentation systems.
Keywords: big data platform; Chinese word segmentation system; Korean word segmentation system; text mining