33,754 articles found
The Classic Shaping of Typical Female Images by Mulan Motif in Historical Times-Historical Review and Textual Sorting of the Motif
1
Authors: XIAO Yuwei, ZHANG Pei | Journal: Cultural and Religious Studies | 2024, Issue 5, pp. 320-323 (4 pages)
The image of Mulan is well known to the public as an important symbol in the dissemination of China's excellent traditional culture. This paper aims to summarise the motif from traditional canonical texts, to explore the content and value of sustainable IP development, and to study the many derivative works built on the Mulan motif, drawing on the wide circulation of the motif's prototype, "The Poem of Mulan" (木兰辞), and the positive values its content conveys. Through the reworking and imagination of scholars and writers over successive generations, the image of Mulan gradually became a relatively stable theme of cultural communication within China, and many internationally influential adaptations based on the Mulan motif have emerged abroad. By combing through and summarising textual works from various periods at home and abroad, the paper traces the textual transmission of the Mulan motif, a representative motif of China's excellent traditional culture, and the development of the Chinese spiritual core. The paper adopts research methods such as documentary evidence and discourse analysis to show the textual flow of the Mulan motif in a more diversified form.
Keywords: Mulan motifs; feminine power; textual transmission; literature review; identity
CVTD: A Robust Car-Mounted Video Text Detector
2
Authors: Di Zhou, Jianxun Zhang, Chao Li, Yifan Guo, Bowen Li | Journal: Computers, Materials & Continua (SCIE, EI) | 2024, Issue 2, pp. 1821-1842 (22 pages)
Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shape. The model efficiently extracts global text positions through Coordinate Attention Threshold Activation (CATA) and stacks two Feature Pyramid Enhancement Fusion Modules (FPEFM) to strengthen feature representation by integrating local text features with global position information. The enhanced feature maps, when acted upon by Text Activation Maps (TAM), effectively distinguish text foreground from non-text regions. Additionally, we collected and annotated a dataset of 2200 Car-mounted Video Text (CVT) images under various road conditions for training and evaluating the model. We further tested the model on four other challenging public natural scene text detection benchmark datasets, demonstrating its strong generalization ability and real-time detection speed. The model holds potential for practical applications in real-world scenarios.
Keywords: deep learning; text detection; car-mounted video text detector; intelligent driving assistance; arbitrary shape text detector
C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 pathway as a therapeutic target and regulatory mechanism for spinal cord injury
3
Authors: Xiangzi Wang, Xiaofei Niu, Yingkai Wang, Yang Liu, Cheng Yang, Xuyi Chen, Zhongquan Qi | Journal: Neural Regeneration Research (SCIE, CAS) | 2025, Issue 8, pp. 2231-2244 (14 pages)
Spinal cord injury involves non-reversible damage to the central nervous system that is characterized by limited regenerative capacity and secondary inflammatory damage. The expression of the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis exhibits significant differences before and after injury. Recent studies have revealed that the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis is closely associated with secondary inflammatory responses and the recruitment of immune cells following spinal cord injury, suggesting that this axis is a novel target and regulatory control point for treatment. This review comprehensively examines the therapeutic strategies targeting the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis, along with the regenerative and repair mechanisms linking the axis to spinal cord injury. Additionally, we summarize the upstream and downstream inflammatory signaling pathways associated with spinal cord injury and the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis. This review primarily elaborates on therapeutic strategies that target the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis and the latest progress of research on antagonistic drugs, along with the approaches used to exploit new therapeutic targets within the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis and the development of targeted drugs. Nevertheless, there are presently no clinical studies relating to spinal cord injury that focus on the C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 axis. This review aims to provide new ideas and therapeutic strategies for the future treatment of spinal cord injury.
Keywords: apoptosis; C-C motif chemokine ligand 2/C-C motif chemokine receptor 2 pathway; C-C motif chemokine receptor 2 antagonists; chemokine ligand 2; chemokine receptor 2; inflammation; macrophage; microglia; spinal cord injury; therapeutic method
Method to Remove Handwritten Texts Using Smart Phone
4
Author: Haiquan Fang | Journal: Journal of Harbin Institute of Technology (New Series) (CAS) | 2024, Issue 2, pp. 12-21 (10 pages)
To remove handwritten texts from an image of a document taken by smart phone, an intelligent removal method was proposed that combines dewarping with a Fully Convolutional Network with Atrous Convolution and Atrous Spatial Pyramid Pooling (FCN-AC-ASPP). For a picture taken by a smart phone, firstly, the image is transformed into a regular image by the dewarping algorithm. Secondly, the FCN-AC-ASPP is used to classify printed texts and handwritten texts. Lastly, handwritten texts can be removed by a simple algorithm. Experiments show that the classification accuracy of the FCN-AC-ASPP is better than that of FCN, DeeplabV3+, and FCN-AC. In terms of handwritten text removal, the method combining dewarping and FCN-AC-ASPP is superior to FCN-AC-ASPP alone.
Keywords: handwritten texts; printed texts; classification; FCN-AC-ASPP; smart phone
Leveraging Uncertainty for Depth-Aware Hierarchical Text Classification
5
Authors: Zixuan Wu, Ye Wang, Lifeng Shen, Feng Hu, Hong Yu | Journal: Computers, Materials & Continua (SCIE, EI) | 2024, Issue 9, pp. 4111-4127 (17 pages)
Hierarchical Text Classification (HTC) aims to match text to hierarchical labels. Existing methods overlook two critical issues. First, some texts cannot be fully matched to leaf node labels and need to be classified to the correct parent node instead of treating leaf nodes as the final classification target. Second, error propagation occurs when a misclassification at a parent node propagates down the hierarchy, ultimately leading to inaccurate predictions at the leaf nodes. To address these limitations, we propose an uncertainty-guided, depth-aware HTC model called DepthMatch. Specifically, we design an early stopping strategy with uncertainty to identify incomplete matching between text and labels, classifying such texts into the corresponding parent node labels. This approach allows us to dynamically determine the classification depth by leveraging evidence to quantify and accumulate uncertainty. Experimental results show that the proposed DepthMatch outperforms recent strong baselines on four commonly used public datasets: WOS (Web of Science), RCV1-V2 (Reuters Corpus Volume I), AAPD (Arxiv Academic Paper Dataset), and BGC. Notably, on the BGC dataset, it improves Micro-F1 and Macro-F1 scores by at least 1.09% and 1.74%, respectively.
Keywords: hierarchical text classification; incomplete text-label matching; uncertainty; depth-aware; early stopping strategy
Motif-Based Graph Sampling Algorithm
6
Authors: 石俊豪, 王欣, 邹杰军, 方宇, 蒋星 | Journal: 《南京大学学报(自然科学版)》 (Journal of Nanjing University, Natural Science Edition) (CAS, CSCD, PKU Core) | 2024, Issue 4, pp. 552-565 (14 pages)
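Graph sampling reduces graph data to obtain a graph structure smaller than the original graph, which then serves downstream tasks such as graph analysis and graph visualization. Existing graph sampling algorithms focus on preserving salient structural features of the graph while ignoring node attributes, so the sampled graph struggles to achieve the expected results in many downstream tasks, such as frequent pattern mining. To address this, a Motif-Based Node Biased Sampling (MNBS) algorithm is proposed. It uses frequent motif structures to redefine the importance of nodes in the graph and then performs biased node sampling, yielding a sample that integrates node attributes with structural features. To identify frequent motif patterns quickly, a Fast Motif-Pattern Discovery (FMPD) algorithm with an "early termination" property is designed, which discovers motif patterns efficiently and accurately to support graph sampling. Experiments show that MNBS outperforms other baseline algorithms on several metrics: its log-normalized cumulative group correlation metric decreases by 0.54 on average, and the FMPD algorithm with the "early termination" strategy reduces time and memory consumption by 56.1% and 29.8%, respectively, compared with the baseline algorithm.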
Keywords: graph sampling; graph data mining; network analysis; motif structure
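The idea of biasing node sampling by motif participation, as summarized in the abstract above, can be illustrated with a minimal sketch. This is not the MNBS or FMPD algorithm from the paper: it assumes the only motif of interest is the triangle, ignores node attributes, and uses hypothetical helper names (`triangle_counts`, `motif_biased_sample`) and a hypothetical weight of 1 plus the triangle count.

```python
import random
from itertools import combinations


def triangle_counts(adj):
    """Count, for each node, the triangles (3-node motifs) it takes part in."""
    counts = {v: 0 for v in adj}
    for v, nbrs in adj.items():
        # Two neighbours of v that are themselves adjacent close a triangle.
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                counts[v] += 1
    return counts


def motif_biased_sample(adj, k, seed=0):
    """Draw k distinct nodes, weighting each by 1 + its triangle count.

    The "+ 1" (an assumption of this sketch) keeps motif-free nodes
    selectable, only with lower probability.
    """
    rng = random.Random(seed)
    counts = triangle_counts(adj)
    pool = list(adj)
    chosen = []
    for _ in range(min(k, len(pool))):
        weights = [1 + counts[v] for v in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen


# Toy undirected graph as an adjacency map: a-b-c form a triangle, d hangs off c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
sample = motif_biased_sample(adj, k=2)
```

Sampling without replacement through repeated weighted draws keeps the sketch short; a real implementation on large graphs would likely need a more efficient motif counter and sampler.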
Characterization of Scraped Surface Topography Based on the 3D-Motif Method
7
Authors: 杨春鹏, 王立华, 陈谢瑞, 蒋维 | Journal: 《工程设计学报》 (Chinese Journal of Engineering Design) (CSCD, PKU Core) | 2024, Issue 3, pp. 368-376 (9 pages)
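To address the lack of a quantitative characterization method for surface topography in studies of the micro-characteristics and functional mechanisms of scraped surfaces, the 3D-motif method was used to characterize scraped surface topography. An LI-3 contact-type three-dimensional surface topography measuring instrument was used to measure the scraped surfaces, and two-dimensional grayscale images of the surfaces were generated from the three-dimensional point cloud data. Then, following the definition of catchment basins in the 3D-motif method, the watershed algorithm was applied to segment and merge motifs in the grayscale images. Taking the motif segmentation results of the overall texture regions of scraped surfaces of different precision grades as the object, feature significance was defined, and six motif parameters (depth, area, direction angle, anisotropy ratio, flatness coefficient, and feature significance) were extracted and calculated at two area scales (25 mm² and 0.25 mm²). Combining the distributions of some motif parameters with the trend in the number of motifs, the scraped surfaces were characterized and analyzed in terms of both the size and the shape of their topographic features, achieving a complete characterization of the three-dimensional topography of scraped surfaces with relatively few parameters. The results can provide a theoretical basis for subsequent in-depth analysis of the micro-characteristics of scraped surfaces.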
Keywords: scraped surface; 3D-motif; surface topography; watershed algorithm
Multi-layer network embedding on scc-based network with motif
8
Authors: Lu Sun, Xiaona Li, Mingyue Zhang, Liangtian Wan, Yun Lin, Xianpeng Wang, Gang Xu | Journal: Digital Communications and Networks (SCIE, CSCD) | 2024, Issue 3, pp. 546-556 (11 pages)
The interconnection of all things challenges traditional communication methods, and Semantic Communication and Computing (SCC) will become a new solution. Accurately detecting, extracting, and representing semantic information is a challenging task in research on SCC-based networks. In previous research, researchers usually used convolution to extract the feature information of a graph and perform the corresponding node classification task. However, the content of semantic information is quite complex. Although graph convolutional neural networks provide an effective solution for node classification tasks, they are limited in representing multiple relational patterns and do not recognize or analyze higher-order local structures, so the extracted feature information suffers varying degrees of loss. Therefore, this paper extends from a single-layer topology network to a multi-layer heterogeneous topology network. Word vectors trained with Bidirectional Encoder Representations from Transformers (BERT) are introduced to extract the semantic features in the network, and the existing graph neural network is improved by incorporating a higher-order local feature module into the network representation model. A multi-layer network embedding algorithm on SCC-based networks with motifs is proposed to complete the task of end-to-end node classification. We verify the effectiveness of the algorithm on a real multi-layer heterogeneous network.
Keywords: semantic communication and computing; multi-layer network; graph neural network; motif
YOLOv5ST:A Lightweight and Fast Scene Text Detector
9
Authors: Yiwei Liu, Yingnan Zhao, Yi Chen, Zheng Hu, Min Xia | Journal: Computers, Materials & Continua (SCIE, EI) | 2024, Issue 4, pp. 909-926 (18 pages)
Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices like drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation, including replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating a Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at the intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work represents a significant advancement in scene text detection, striking a balance between speed and accuracy that makes it well suited for performance-constrained environments.
Keywords: scene text detection; YOLOv5; lightweight; object detection
Smart Approaches to Efficient Text Mining for Categorizing Sexual Reproductive Health Short Messages into Key Themes
10
Authors: Tobias Makai, Mayumbo Nyirenda | Journal: Open Journal of Applied Sciences | 2024, Issue 2, pp. 511-532 (22 pages)
To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization of these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorise this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text message-based platforms.
Keywords: Knowledge Discovery in Text (KDT); Sexual Reproductive Health (SRH); Text Categorization; Text Classification; Text Extraction; Text Mining; Feature Extraction; Automated Classification; Process Performance; Stemming and Lemmatization; Natural Language Processing (NLP)
Study on the Textual Coherence Function of Conjunctions in Political Texts and Their Translation Reconstruction
11
Authors: Goya Guli Kader, Jingwen Qiao, Aixia Yang | Journal: Journal of Contemporary Educational Research | 2024, Issue 1, pp. 25-30 (6 pages)
The assessment of translation quality in political texts is primarily based on achieving effective communication. Throughout the translation process, it is essential not only to convey the original content accurately but also to transform the structural mechanisms of the source language effectively. In the translation reconstruction of political texts, various textual cohesion methods are often employed, with conjunctions serving as a primary means of semantic coherence within text units.
Keywords: political texts; conjunctions; textual cohesion; Chinese to Russian translation
Adapter Based on Pre-Trained Language Models for Classification of Medical Text
12
Author: Quan Li | Journal: Journal of Electronic Research and Application | 2024, Issue 3, pp. 129-134 (6 pages)
We present an approach to automatically classify medical text at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: classification of medical text; adapter; pre-trained language model
Identifying multidisciplinary problems from scientific publications based on a text generation method
13
Authors: Ziyan Xu, Hongqi Han, Linna Li, Junsheng Zhang, Zexu Zhou | Journal: Journal of Data and Information Science (CSCD) | 2024, Issue 3, pp. 213-237 (25 pages)
Purpose: A text generation based multidisciplinary problem identification method is proposed, which does not rely on a large amount of data annotation. Design/methodology/approach: The proposed method first identifies the research objective types and disciplinary labels of papers using a text classification technique; second, it generates abstractive titles for each paper based on the abstract and research objective types using a generative pre-trained language model; third, it extracts problem phrases from the generated titles according to regular expression rules; fourth, it creates problem relation networks and identifies identical problems by exploiting a weighted community detection algorithm; finally, it identifies multidisciplinary problems based on the disciplinary labels of papers. Findings: Experiments in the "Carbon Peaking and Carbon Neutrality" field show that the proposed method can effectively identify multidisciplinary research problems. The disciplinary distribution of the identified problems is consistent with our understanding of multidisciplinary collaboration in the field. Research limitations: The proposed method needs to be applied in other multidisciplinary fields to validate its effectiveness. Practical implications: Multidisciplinary problem identification helps governments gather multidisciplinary forces to solve complex real-world problems, helps research management authorities fund valuable multidisciplinary problems, and helps researchers borrow ideas from other disciplines. Originality/value: This study proposes a novel multidisciplinary problem identification method based on text generation, which identifies multidisciplinary problems from generated abstractive titles of papers without the data annotation required by standard sequence labeling techniques.
Keywords: problem identification; multidisciplinary; text generation; text classification
From text to image:challenges in integrating vision into ChatGPT for medical image interpretation
14
Authors: Shunsuke Koga, Wei Du | Journal: Neural Regeneration Research (SCIE, CAS) | 2025, Issue 2, pp. 487-488 (2 pages)
Large language models (LLMs), such as ChatGPT developed by OpenAI, represent a significant advancement in artificial intelligence (AI), designed to understand, generate, and interpret human language by analyzing extensive text data. Their potential integration into clinical settings offers a promising avenue that could transform clinical diagnosis and decision-making processes in the future (Thirunavukarasu et al., 2023). This article aims to provide an in-depth analysis of LLMs' current and potential impact on clinical practices. Their ability to generate differential diagnosis lists underscores their potential as invaluable tools in medical practice and education (Hirosawa et al., 2023; Koga et al., 2023).
Keywords: image; diagnosis; text
Relational Turkish Text Classification Using Distant Supervised Entities and Relations
15
Authors: Halil Ibrahim Okur, Kadir Tohma, Ahmet Sertbas | Journal: Computers, Materials & Continua (SCIE, EI) | 2024, Issue 5, pp. 2209-2228 (20 pages)
Text classification, by automatically categorizing texts, is one of the foundational elements of natural language processing applications. This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from Wikidata (the Wikipedia database) and BERT-based pre-trained Named Entity Recognition (NER) models. Focusing on a significant challenge in the field of natural language processing (NLP), the research evaluates the potential of using entity and relational information to extract deeper meaning from texts. The adopted methodology encompasses a comprehensive approach that includes text preprocessing, entity detection, and the integration of relational information. Experiments conducted on text datasets in both Turkish and English assess the performance of various classification algorithms, such as Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that the integration of entity-relation information can significantly enhance algorithm performance in text classification tasks and offer new perspectives for information extraction and semantic analysis in NLP applications. Contributions of this work include the utilization of distant supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements through the integration of distant supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. Additionally, it contributes to the development of multilingual text classification systems by adding deeper meaning to text content, thereby providing a valuable addition to current NLP studies and setting an important reference point for future research.
Keywords: text classification; relation extraction; NER; distant supervision; deep learning; machine learning
Assessing trends in wildland-urban interface fire research through text mining: a comprehensive analysis of published literature
16
Authors: Hafsae Lamsaf, Asmae Lamsaf, Mounir A. Kerroum, Miguel Almeida | Journal: Journal of Forestry Research (SCIE, EI, CAS, CSCD) | 2024, Issue 4, pp. 102-114 (13 pages)
Research on fires at the wildland-urban interface (WUI) has generated significant insights and advancements across various fields of study. Environmental, agricultural, and social sciences have played prominent roles in understanding the impacts of fires on the environment, in protecting communities, and in addressing management challenges. This study aimed to create a database using a text mining technique for global researchers interested in WUI projects and to highlight the interest of countries in this field. Author keyword analysis emphasized the dominance of fire science-related terms, especially those related to the WUI, and identified keyword clusters related to the WUI fire risk assessment system ("exposure", "danger", and "vulnerability") within wildfire research. Trends over the past decade showcase shifting research interests with a growing focus on WUI fires, while regional variations highlighted that the "exposure" keyword cluster received greater attention in southern Europe and South America. However, vulnerability keywords have relatively lower representation across all regions. The analysis underscores the interdisciplinary nature of WUI research and emphasizes the need for targeted approaches to address the unique challenges of the wildland-urban interface. Overall, this study provides valuable insights for researchers and serves as a foundation for further collaboration in this field through an understanding of the trends over recent years and in different regions.
Keywords: WUI; text mining; wildfires; fire science; state of the art; scientific publications
Generating Factual Text via Entailment Recognition Task
17
Authors: Jinqiao Dai, Pengsen Cheng, Jiayong Liu | Journal: Computers, Materials & Continua (SCIE, EI) | 2024, Issue 7, pp. 547-565 (19 pages)
Generating diverse and factual text is challenging and is receiving increasing attention. By sampling from the latent space, variational autoencoder-based models have recently enhanced the diversity of generated text. However, existing research predominantly depends on summarization models to offer paragraph-level semantic information for enhancing factual correctness. The challenge lies in effectively generating factual text using sentence-level variational autoencoder-based models. In this paper, a novel model called the fact-aware conditional variational autoencoder is proposed to balance the factual correctness and diversity of generated text. Specifically, our model encodes the input sentences and uses them as facts to build a conditional variational autoencoder network. By training a conditional variational autoencoder network, the model is enabled to generate text based on input facts. Building upon this foundation, the input text is passed to the discriminator along with the generated text. By employing adversarial training, the model is encouraged to generate text that is indistinguishable to the discriminator, thereby enhancing the quality of the generated text. To further improve factual correctness, inspired by natural language inference systems, an entailment recognition task is introduced and trained together with the discriminator via multi-task learning. Moreover, based on the entailment recognition results, a penalty term is further proposed to reconstruct the loss of our model, forcing the generator to generate text consistent with the facts. Experimental results demonstrate that, compared with competitive models, our model achieves substantial improvements in both the quality and factual correctness of the text while sacrificing only a small amount of diversity. Furthermore, when considering a comprehensive evaluation of diversity and quality metrics, our model also demonstrates the best performance.
Keywords: text generation; entailment recognition task; natural language processing; artificial intelligence
Analyzing COVID-19 Discourse on Twitter: Text Clustering and Classification Models for Public Health Surveillance
18
Authors: Pakorn Santakij, Samai Srisuay, Pongporn Punpeng | Journal: Computer Systems Science & Engineering | 2024, Issue 3, pp. 665-689 (25 pages)
Social media has revolutionized the dissemination of real-life information, serving as a robust platform for sharing life events. Twitter, characterized by its brevity and continuous flow of posts, has emerged as a crucial source for public health surveillance, offering valuable insights into public reactions during the COVID-19 pandemic. This study aims to leverage a range of machine learning techniques to extract pivotal themes and facilitate text classification on a dataset of COVID-19 outbreak-related tweets. Diverse topic modeling approaches have been employed to extract pertinent themes and subsequently form a dataset for training text classification models. An assessment of coherence metrics revealed that the Gibbs Sampling Dirichlet Mixture Model (GSDMM), which utilizes trigram and bag-of-words (BOW) feature extraction, outperformed Non-negative Matrix Factorization (NMF), Latent Dirichlet Allocation (LDA), and a hybrid strategy involving Bidirectional Encoder Representations from Transformers (BERT) combined with LDA and K-means to pinpoint significant themes within the dataset. Among the models assessed for text clustering, the utilization of LDA, either as a clustering model or for feature extraction combined with BERT for K-means, resulted in higher coherence scores, consistent with human ratings, signifying their efficacy. In particular, LDA, notably in conjunction with trigram representation and BOW, demonstrated superior performance. This underscores the suitability of LDA for conducting topic modeling, given its proficiency in capturing intricate textual relationships. In the context of text classification, models such as Linear Support Vector Classification (LSVC), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network with BiLSTM (CNN-BiLSTM), and BERT have shown outstanding performance, achieving accuracy and weighted F1 scores exceeding 80%. These results significantly surpassed those of other models, such as Multinomial Naive Bayes (MNB), Linear Support Vector Machine (LSVM), and Logistic Regression (LR), which achieved scores in the range of 60 to 70 percent.
Keywords: topic modeling; text classification; Twitter; feature extraction; social media
Community Search Method Based on Motif Connectivity (Cited by: 1)
19
Authors: 杜明, 顾万里, 周军锋, 王志军 | Journal: 《计算机应用》 (Journal of Computer Applications) (CSCD, PKU Core) | 2023, Issue 7, pp. 2190-2199 (10 pages)
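Community search aims to obtain, from a data graph, a dense subgraph containing the query vertex, and is widely applied in fields such as sociology and biology. The basic connected structures of existing community models based on subgraph connectivity are all complete graphs, which cannot meet users' need for diverse community structures in practical applications. To address this problem, a community search method based on motif connectivity is proposed, comprising a Motif-Connectivity-based Community (MCC) model and two corresponding community search algorithms: the MPCS (Motif-Processed Community Search) algorithm and a community search algorithm based on MP-index. The MCC model lets users freely specify the basic connected structure of a community, and the MPCS algorithm solves the MCC search problem. In addition, two pruning optimizations are proposed, one for the motif instance search process and one for the process of determining community membership. Finally, MP-index is designed to avoid redundant traversal during community search. Experiments on multiple real datasets show that the pruning optimizations reduce the running time of the MPCS algorithm by 60% to 85%, and that the MP-index-based community search algorithm is generally 2 to 3 orders of magnitude more efficient than the MPCS algorithm with pruning. The proposed method therefore has practical value for problems such as product recommendation and social networks.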
Keywords: community search; motif connectivity; subgraph-connectivity community; pruning optimization; community structure diversity
Quantitative Comparative Study of the Performance of Lossless Compression Methods Based on a Text Data Model
20
Authors: Namogo Silué, Sié Ouattara, Mouhamadou Dosso, Alain Clément | Journal: Open Journal of Applied Sciences | 2024, Issue 7, pp. 1944-1962 (19 pages)
Data compression plays a key role in optimizing the use of memory storage space and in reducing latency in data transmission. In this paper, we are interested in lossless compression techniques because their performance is exploited alongside lossy compression techniques for images and videos, generally using a mixed approach. To achieve our objective, which is to study the performance of lossless compression methods, we first carried out a literature review, a summary of which enabled us to select the most relevant methods, namely arithmetic coding, LZW, Tunstall's algorithm, RLE, BWT, Huffman coding, and Shannon-Fano. Secondly, we designed a purposive text dataset with a repeating pattern in order to test the behavior and effectiveness of the selected compression techniques. Thirdly, we designed the compression algorithms and developed the programs (scripts) in Matlab in order to test their performance. Finally, following the tests conducted on relevant data that we constructed according to a deliberate model, the results show that the following methods, listed in order of performance, are very satisfactory: LZW, arithmetic coding, Tunstall's algorithm, and BWT + RLE. Likewise, it appears that the performance of certain techniques relative to others is strongly linked, on the one hand, to the sequencing and/or recurrence of the symbols that make up the message and, on the other hand, to the cumulative time of encoding and decoding.
Keywords: Arithmetic Coding; BWT; Compression Ratio; Comparative Study; Compression Techniques; Shannon-Fano; Huffman; Lossless Compression; LZW; Performance; Redundancy; RLE; Text Data; Tunstall
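Of the methods named in the abstract above, run-length encoding (RLE) is the simplest to show on text with a repeating pattern. The following is a minimal Python sketch, not the authors' Matlab scripts; the function names and the sample string are illustrative assumptions, and the (character, count) pair representation is one of several common RLE layouts.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


# Hypothetical sample with long runs, in the spirit of a repeating-pattern dataset.
sample = "aaaabbbcca" * 3
encoded = rle_encode(sample)

# Lossless: decoding must reproduce the input exactly.
assert rle_decode(encoded) == sample
```

RLE only pays off when runs are long; on text with few repeated adjacent symbols the pair list can be larger than the input, which is consistent with the abstract's remark that relative performance depends on the sequencing and recurrence of symbols.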