Journal articles: 1,037 results found
The effects of text structure, structure awareness and proficiency level on EFL learners' reading test performance
1
Authors: WANG Min, CAO Yang-bo. 《Sino-US English Teaching》, 2009, No. 2, pp. 14-18, 28 (6 pages)
The present study probed into the effects of text structure, structure awareness and proficiency level on EFL learners' reading test performance. A total of 112 college-level students at distinct levels of English proficiency participated in the experiment. The subjects' performance on the recall of two passages written in different types of structure was examined. Statistical results indicate that text structure, structure awareness and proficiency level all have main effects on the subjects' reading performance. More specifically, two major findings emerged from the investigation. On the one hand, text structure significantly affected the quantity but not the quality of the information recalled, while proficiency level and structure awareness had a significant impact on both the quantity and quality of information recalled. On the other hand, structure awareness was irrelevant to either text structure or proficiency level. Implications of the findings for teaching L2/FL reading are suggested.
Keywords: text structure; structure awareness; proficiency level; reading performance
The Comparative Study of Two Translated Texts on Huangdi's Internal Classics: Simple Analysis on the Different Translating Styles on Culture-Specific Lexicon, Figure of Speech and Four-Chinese-Character Structures
2
Author: 杨峥. 《海外英语》, 2017, No. 17, pp. 5-6 (2 pages)
Huangdi's Internal Classics (Neijing) is one of the most important ancient medical classics and has had a far-reaching influence in the medical field. More and more domestic and overseas scholars have published their translations of Neijing. Owing to the diversity of editions and differences in understanding, the translating styles and contents differ widely. This study focuses on the different translating styles for culture-specific lexicon, figures of speech and four-Chinese-character structures in Neijing.
Keywords: Huangdi's Internal Classics; the comparative study of translated texts; culture-specific lexicon; figure of speech; four-Chinese-character structures
A Brief Introduction of the Distinctive Styles in News Text
3
Author: 许华蓉. 《海外英语》, 2016, No. 11, pp. 225-226 (2 pages)
The newspaper is, to some extent, a mirror of our society, reflecting its latest changes and developments. News text is a linguistic representation of the world. This paper briefly introduces the structure, writing and linguistic styles of news texts in order to increase readers' awareness of their distinctive features.
Keywords: news; text structure; writing; linguistics
Cultural Influence on Text Structures: A Comparative Study of Chinese and German Routine Narratives
4
Author: Liu Qisheng (Correspondence: German Department, Guangdong University of Foreign Studies, Guangzhou, P. R. China 510420, qliu@gdufs.edu.cn). 《现代外语》 (CSSCI, PKU Core), 1999, No. 4, pp. 346-348 (3 pages)
Cultural exchanges enable people to recognize differences between text structures in different languages. As early as the 60's, Kaplan already pointed out the influence that culture had upon text structures, but cultural factors have not yet received due attention in many linguistic writings...
Keywords: text structure; narrative point of view; cultural element; cognitive pattern
The Image of an Addressee in Translational Discourse: Exemplified by the Texts Translated From Slovenian Language
5
Author: Irina Shchukina. 《Journal of Literature and Art Studies》, 2013, No. 12, pp. 787-798 (12 pages)
Translational discourse requires at least three participants; it is therefore suggested to consider the universal model of the picture of the world, according to which it is much easier for a translator to combine the pictures of the world of an addressee and an author. An addressee is a mental image existing in the mind of an addresser during the creative process. Having defined its parameters, a translator has an opportunity to deliver the thought of an addresser to an addressee as accurately as possible and to select the means of expression that are clear to an addressee. The type of an addressee correlates with "the relation to the new".
Keywords: cognitive linguistics; target text; the language picture of the world; discourse; addresser; addressee; the levels of the structure of the language world picture
Application of the probability-based covering algorithm model in text classification
6
Author: ZHOU Ying. 《Chinese Journal of Library and Information Science》, 2009, No. 4, pp. 1-17 (17 pages)
The probability-based covering algorithm (PBCA) is a new algorithm based on probability distribution. It decides, by voting, the class of the tested samples on the border of the coverage area, based on the probability of training samples. When using the original covering algorithm (CA), many tested samples that are located on the border of the coverage cannot be classified by the spherical neighborhood gained. The network structure of PBCA is a mixed structure composed of both a feed-forward network and a feedback network. By adding some heterogeneous samples and enlarging the coverage radius, it is possible to decrease the number of rejected samples and improve the rate of recognition accuracy. Relevant computer experiments indicate that the algorithm improves the study precision and achieves reasonably good results in text classification.
Keywords: probability-based covering algorithm; structural training algorithm; probability; text classification
Automatic Persian Text Summarization Using Linguistic Features from Text Structure Analysis (cited by: 1)
7
Authors: Ebrahim Heidary, Hamid Parvin, Samad Nejatian, Karamollah Bagherifard, Vahideh Rezaie. 《Computers, Materials & Continua》 (SCIE, EI), 2021, No. 12, pp. 2845-2861 (17 pages)
With the remarkable growth of textual data sources in recent years, easy, fast, and accurate text processing has become a challenge with significant payoffs. Automatic text summarization is the process of compressing text documents into shorter summaries for easier review of their core contents, which must be done without losing important features and information. This paper introduces a new hybrid method for extractive text summarization with feature selection based on text structure. The major advantage of the proposed summarization method over previous systems is the modeling of text structure and the relationships between entities in the input text, which improves the sentence feature selection process and leads to the generation of unambiguous, concise, consistent, and coherent summaries. The paper also presents the results of the evaluation of the proposed method based on precision and recall criteria. It is shown that the method produces summaries consisting of chains of sentences with the aforementioned characteristics from the original text.
Keywords: natural language processing; extractive summarization; linguistic feature; text structure analysis
An Auto-Grading Oriented Approach for Off-Line Handwritten Organic Cyclic Compound Structure Formulas Recognition
8
Authors: Ting Zhang, Yifei Wang, Xinxin Jin, Zhiwen Gu, Xiaoliang Zhang, Bin He. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2023, No. 6, pp. 2267-2285 (19 pages)
Auto-grading, as an instruction tool, could reduce teachers' workload, provide students with instant feedback and support highly personalized learning. Therefore, this topic has recently attracted considerable attention from researchers. To realize the automatic grading of handwritten chemistry assignments, the problem of chemical notation recognition should be solved first. Recent handwritten chemical notation recognition solutions in the end-to-end trainable category suffer from a lack of accurate alignment information between the input and output. They serve the aim of reading notations into electrical devices to better prepare relevant e-documents rather than auto-grading handwritten assignments. To tackle this limitation and enable the auto-grading of handwritten chemistry assignments at a fine-grained level, in this work we propose a component-detection-based approach for recognizing off-line handwritten Organic Cyclic Compound Structure Formulas (OCCSFs). Specifically, we define different components of OCCSFs as objects (including graphical objects and text objects), and adopt a deep learning detector to detect them. Then, regarding the detected text objects, we introduce an improved attention-based encoder-decoder model for text recognition. Finally, with these detection results and the geometric relationships of detected objects, this article designs a holistic algorithm for interpreting the spatial structure of handwritten OCCSFs. The proposed method is evaluated on a self-collected data set consisting of 3000 samples and achieves promising results.
Keywords: handwritten chemical structure formulas; structure interpretation; components detection; text recognition
Text Sentiment Analysis Based on Multi-Layer Bi-Directional LSTM with a Trapezoidal Structure
9
Authors: Zhengfang He, Cristina E. Dumdumaya, Ivy Kim D. Machica. 《Intelligent Automation & Soft Computing》 (SCIE), 2023, No. 7, pp. 639-654 (16 pages)
Sentiment analysis, commonly called opinion mining or emotion artificial intelligence (AI), employs biometrics, computational linguistics, natural language processing, and text analysis to systematically identify, extract, measure, and investigate affective states and subjective data. Sentiment analysis algorithms include emotion lexicon, traditional machine learning, and deep learning. In text sentiment analysis algorithms based on neural networks, multi-layer Bi-directional long short-term memory (LSTM) is widely used, but the number of parameters of this model is too large. Hence, this paper proposes a Bi-directional LSTM with a trapezoidal structure model. The design of the trapezoidal structure is derived from classic neural networks, such as LeNet-5 and AlexNet. These classic models have trapezoidal-like structures, and these structures have achieved success in the field of deep learning. There are two benefits to using the Bi-directional LSTM with a trapezoidal structure. One is that, compared with the single-layer configuration, using the multi-layer structure can better extract the high-dimensional features of the text. Another is that using the trapezoidal structure can reduce the model's parameters. This paper introduces the Bi-directional LSTM with a trapezoidal structure model in detail and uses Stanford sentiment treebank 2 (STS-2) for experiments. The experimental results show that the trapezoidal structure model and the normal structure model have similar performance. However, the trapezoidal structure model has 35.75% fewer parameters than the normal structure model.
Keywords: text sentiment; Bi-directional LSTM; trapezoidal structure
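Illustrative note on entry 9: the abstract attributes the parameter saving to narrowing the stacked layers. The Python sketch below counts the weights of a stacked bidirectional LSTM for a uniform stack and for a narrowing ("trapezoidal") stack. The embedding size and layer widths are hypothetical values chosen only for illustration, and the count assumes one bias vector per gate (some frameworks store two), so the output is not expected to reproduce the 35.75% figure reported in the paper.

```python
def lstm_params(input_size: int, hidden_size: int) -> int:
    """Weights of one unidirectional LSTM layer.

    Four gates, each with input weights, recurrent weights and one bias vector
    (assumption: a single bias per gate).
    """
    return 4 * (input_size * hidden_size + hidden_size * hidden_size + hidden_size)


def stacked_bilstm_params(input_size: int, hidden_sizes: list[int]) -> int:
    """Total weights of a stack of bidirectional LSTM layers."""
    total = 0
    layer_input = input_size
    for hidden in hidden_sizes:
        total += 2 * lstm_params(layer_input, hidden)  # forward + backward direction
        layer_input = 2 * hidden  # next layer sees both directions concatenated
    return total


# Hypothetical sizes, not taken from the paper.
EMBEDDING_SIZE = 300
uniform = stacked_bilstm_params(EMBEDDING_SIZE, [256, 256, 256])
trapezoid = stacked_bilstm_params(EMBEDDING_SIZE, [256, 128, 64])

print(f"uniform stack:     {uniform:,} parameters")
print(f"trapezoidal stack: {trapezoid:,} parameters")
print(f"reduction:         {1 - trapezoid / uniform:.2%}")
```

Because each layer's cost grows with the square of its hidden size, shrinking the upper layers removes a large share of the weights while the first layer is unchanged.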
Application of Oracle Text Technology in Complex-Structure Databases (cited by: 5)
10
Authors: 蒙辉, 陈燕. 《计算机技术与发展》, 2007, No. 4, pp. 38-40, 44 (4 pages)
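Full-text retrieval is one of the key technologies of intelligent information management. Oracle Text, a component of Oracle9i, provides powerful full-text retrieval. However, Oracle Text full-text retrieval targets databases whose table structures are relatively fixed; its ability to support full-text retrieval over databases whose table structures and number of tables keep changing is limited. This paper introduces the methods and steps of Oracle Text full-text retrieval, describes its concrete application in a complex-structure database, and finally implements the designed full-text retrieval technique in a program.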
Keywords: Oracle Text; complex-structure database; full-text retrieval
Research on an EBRCG-Based Method for Enhancing API Structural Pattern Information
11
Authors: 钟林辉, 祝艳霞, 黄琪轩, 屈乔乔, 夏子豪, 郑燚. 《计算机科学》 (CSCD, PKU Core), 2024, No. S02, pp. 793-802 (10 pages)
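To address problems such as the lack of structural information in API call patterns and highly redundant results, this paper proposes a method for enhancing API structural pattern information based on the Extended Branch-Reserving Call Graph (EBRCG). Taking the source code of open-source Java projects as the research object, EBRCG is used to represent the structural information of the methods of Java classes. EBRCG takes into account API call statements, branch statements (if statements and all loop statements are treated as branch statements), switch-case multi-branch statements, and exception statements, and an EBRCG pruning algorithm is proposed to obtain the code structure of a specific API call pattern. In addition, clustering and ranking are used to filter the multiple code structures of an API call pattern, and a representative code structure is finally selected. To validate the method, three groups of experiments compare it with the TextRank method. The results show that the method effectively obtains the code structure of API call patterns and describes API usage more accurately than TextRank; it has some research value and offers a reference for software developers.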
Keywords: API call pattern; Extended Branch-Reserving Call Graph; code structure; K-means clustering
Automatic extraction and structuration of soil–environment relationship information from soil survey reports (cited by: 8)
12
Authors: WANG De-sheng, LIU Jun-zhi, ZHU A-xing, WANG Shu, ZENG Can-ying, MA Tianwu. 《Journal of Integrative Agriculture》 (SCIE, CAS, CSCD), 2019, No. 2, pp. 328-339 (12 pages)
In addition to soil samples, conventional soil maps, and experienced soil surveyors, text about soils (e.g., soil survey reports) is an important potential data source for extracting soil–environment relationships. Considering that the words describing soil–environment relationships are often mixed with unrelated words, the first step is to extract the needed words and organize them in a structured way. This paper applies natural language processing (NLP) techniques to automatically extract and structure information from soil survey reports regarding soil–environment relationships. The method includes two steps: (1) construction of a knowledge frame and (2) information extraction using either a rule-based method or a statistic-based method for different types of information. For uniformly written text information, the rule-based approach was used to extract information. These types of variables include slope, elevation, accumulated temperature, annual mean temperature, annual precipitation, and frost-free period. For information contained in text written in diverse styles, the statistic-based method was adopted. These types of variables include landform and parent material. The soil species of China soil survey reports were selected as the experimental dataset. Precision (P), recall (R), and F1-measure (F1) were used to evaluate the performances of the method. For the rule-based method, the P values were 1, the R values were above 92%, and the F1 values were above 96% for all the involved variables. For the method based on the conditional random fields (CRFs), the P, R and F1 values for the parent material were, respectively, 84.15, 83.13, and 83.64%; the values for landform were 88.33, 76.81, and 82.17%, respectively.
To explore the impact of text types on the performance of the CRFs-based method, CRFs models were trained and validated separately by the descriptive texts of soil types and typical profiles. For parent material, the maximum F1 value for the descriptive text of soil types was 90.7%, while the maximum F1 value for the descriptive text of soil profiles was only 75%. For landform, the maximum F1 value for the descriptive text of soil types was 85.33%, which was similar to that of the descriptive text of soil profiles (i.e., 85.71%). These results suggest that NLP techniques are effective for the extraction and structuration of soil–environment relationship information from a text data source.
Keywords: soil–environment relationship; text; natural language processing; extraction; structuration
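Illustrative note on entry 12: the abstract reports precision, recall and F1 for the CRF-based method. Assuming F1 is the usual harmonic mean of precision and recall, F1 = 2PR / (P + R), the short Python sketch below recomputes it from the reported P and R values so a reader can check the figures. The function name `f1_score` is introduced here for illustration and is not from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (works for fractions or percentages)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) in percent, as quoted in the abstract for the CRF-based method.
reported = {
    "parent material": (84.15, 83.13, 83.64),
    "landform": (88.33, 76.81, 82.17),
}

for variable, (p, r, f1) in reported.items():
    print(f"{variable}: computed F1 = {f1_score(p, r):.2f}%, reported F1 = {f1}%")
```

Both recomputed values agree with the reported ones to two decimal places.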
Acoustic Characteristics of Advertisement Calls in Babina adenopleura
13
Authors: Xiaobin FANG, Xia QIU, Yilin ZHOU, Luyi YANG, Yi ZHAO, Weihong ZHENG, Jinsong LIU. 《Asian Herpetological Research》 (SCIE, CSCD), 2015, No. 3, pp. 220-228 (9 pages)
Acoustic communication is the most important form of communication in anuran amphibians. To understand the acoustic characteristics of male Babina adenopleura, we recorded advertisement calls and analyzed their acoustic parameters during the breeding season. Male B. adenopleura produced calls with a variable number of notes (1–5), and each note contained harmonics. Although 6% of call notes did not exhibit frequency modulation (FM), two call note FM patterns were observed: (1) upward FM; (2) upward–downward FM. With the exception of 1- and 5-note calls, the duration of successive notes decreased monotonically. With the exception of 1-note calls, the fundamental frequency of the first note was lowest, then increased; the greatest change in the fundamental frequency was always between notes 1 and 2. The dominant frequency varied between calls. For example, for the first call note the dominant frequency occurred in some cases in the first harmonic (the 605.320 ± 64.533 Hz band), the second harmonic (the 918 ± 9 Hz band), the fourth harmonic (the 1712 ± 333 Hz band), the sixth harmonic (the 2165 ± 152 Hz band), the seventh harmonic (the 2269 ± 140 Hz band), the eighth harmonic (the 2466 ± 15 Hz band) or the ninth harmonic (the 2636 ± 21 Hz band). Although male B. adenopleura advertisement calls have a distinctive structure, they have similar characteristics to the calls of the music frog, B. daunchina.
Keywords: anuran amphibians; advertisement calls; Babina adenopleura; call structure
An Intelligent Tree Extractive Text Summarization Deep Learning
14
Authors: Abeer Abdulaziz AlArfaj, Hanan Ahmed Hosni Mahmoud. 《Computers, Materials & Continua》 (SCIE, EI), 2022, No. 11, pp. 4231-4244 (14 pages)
In recent research, deep learning algorithms have presented effective representation learning models for natural languages. Deep learning-based models create better data representations than classical models. They are capable of automated extraction of distributed representations of texts. In this research, we introduce a new tree extractive text summarization model that is characterized by fitting the text structure representation in a knowledge base training module, and that also addresses memory issues that were not addressed before. The proposed model employs a tree-structured mechanism to generate the phrase and text embeddings. The proposed architecture mimics the tree configuration of the text-texts and provides better feature representation. It also incorporates an attention mechanism that offers an additional information source to conduct better summary extraction. The novel model addresses text summarization as a classification process, where the model calculates the probabilities of phrase and text-summary association. The model classification is divided into recognition of multiple features such as information entropy, significance, redundancy and position. The model was assessed on two datasets, the Multi-Doc Composition Query (MCQ) dataset and the Dual Attention Composition (DAC) dataset. The experimental results show that the proposed model has better summarization precision than other models by a considerable margin.
Keywords: neural network architecture; text structure; abstractive summarization
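An Analysis of the Temporal Structure of the Text-to-Video Model Sora: Phenomenological Reflections on Generative Artificial Intelligence (cited by: 1)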
15
Author: 邓志文. 《编辑之友》 (CSSCI, PKU Core), 2024, No. 6, pp. 46-52 (7 pages)
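Recently, OpenAI released Sora, a model that represents the current state of the art in text-to-video generation and a milestone in the history of generative artificial intelligence. However, Sora still has some technical flaws and shortcomings. From the perspective of the phenomenology of time, Sora's external temporal structure is incomplete: it has only objective time, with no subjective time or inner time-consciousness, so it cannot describe human psychological time, explain the causal relations between events, or construct complex, meaningful events and plots. In addition, the absence of retention and protention leaves it unable to connect actions with their results; without the involvement of a dynamically generative structure of inner temporality, Sora also struggles to show events that unfold over time. Therefore, at the technical level, increasing the intentional practice of the data model, raising the computing capacity and improving the algorithms of intentional design, and completing both the inner and outer temporal structures are the keys to improving Sora's real-world performance.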
Keywords: text-to-video; Sora; temporal structure; generative artificial intelligence; phenomenology; retention and protention
A Text-to-SQL Model Based on Semantically Enhanced Schema Linking
16
Authors: 吴相岚, 肖洋, 刘梦莹, 刘明铭. 《计算机应用》 (CSCD, PKU Core), 2024, No. 9, pp. 2689-2695 (7 pages)
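To improve Text-to-SQL generation based on heterogeneous graph encoders, the SELSQL model is proposed. First, the model adopts an end-to-end learning framework and replaces the Euclidean distance metric with the Poincaré distance metric in hyperbolic space, thereby optimizing the semantically enhanced schema linking graph constructed from a pre-trained language model by probing. Second, K-head weighted cosine similarity and graph regularization are used to learn a similarity metric graph, so that the initial schema linking graph is iteratively optimized during training. Finally, an improved Relational Graph Attention Network (RGAT) graph encoder and a multi-head attention mechanism encode the joint semantic schema linking graph of the two modules, and a grammar-based neural semantic decoder with a predefined structured language decodes the Structured Query Language (SQL) statement. Experiments on the Spider dataset show that, with the ELECTRA-large pre-trained model, SELSQL improves accuracy by 2.5 percentage points over the best baseline model, with a large improvement on complex SQL statements.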
Keywords: schema linking; graph structure learning; pre-trained language model; Text-to-SQL; heterogeneous graph
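Illustrative note on entry 16: the abstract says the model swaps the Euclidean distance for the Poincaré distance in hyperbolic space. For readers unfamiliar with that metric, the Python sketch below implements the standard Poincaré-ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))), next to the Euclidean distance. It is a generic illustration of the metric only; how SELSQL applies it to the schema linking graph is not described in the abstract, and the function names are chosen here for illustration.

```python
import math


def euclidean_distance(u, v):
    """Ordinary straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Distance between two points of the open unit ball under the Poincaré metric."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


origin = (0.0, 0.0)
near_centre = (0.5, 0.0)
near_edge_a = (0.9, 0.0)
near_edge_b = (0.0, 0.9)

print(poincare_distance(origin, near_centre))       # equals ln(3), about 1.0986
print(euclidean_distance(near_edge_a, near_edge_b))
print(poincare_distance(near_edge_a, near_edge_b))  # much larger than the Euclidean value
```

Distances grow rapidly near the boundary of the ball, which is why hyperbolic embeddings are often used for tree-like or hierarchical relations.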
Design of a Genetic-Algorithm-Based Text Mining System for APP User Privacy Protection (cited by: 2)
17
Authors: 童沐雨, 刘建平, 林熠来. 《自动化技术与应用》, 2024, No. 3, pp. 116-119 (4 pages)
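To address the low efficiency of text mining for APP user privacy protection caused by high sentence similarity, a text mining system for APP user privacy protection based on a genetic algorithm is designed. The Heritrix crawler structure is used to collect text; a multi-threaded ToePool manages the crawling threads; an ARM processor preprocesses the text; and a vertical search engine module writes index fields into the index. By analyzing the frequency with which word features appear in sentences, the similarity between sentences of two documents is computed, the query relevance of the guided summary is determined, and the degree of relevance of the summary to the query is measured. Based on the selected data sources, a text extraction workflow for APP user privacy protection is designed. Experimental results show that the system differs from the data source in frequency in 5 instances and is consistent in all others, indicating good mining performance.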
Keywords: genetic algorithm; APP users; privacy protection; text mining; Heritrix crawler structure
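Illustrative note on entry 17: the abstract computes sentence similarity from how often word features occur in each sentence, but does not name the measure. One common choice, assumed here purely for illustration, is cosine similarity over term-frequency vectors. The Python sketch below uses whitespace tokenization, which suits English examples only; Chinese text would need a word segmenter first.

```python
import math
from collections import Counter


def term_frequencies(sentence: str) -> Counter:
    """Count how often each lower-cased, whitespace-separated token occurs."""
    return Counter(sentence.lower().split())


def cosine_similarity(sentence_a: str, sentence_b: str) -> float:
    """Cosine of the angle between the two sentences' term-frequency vectors."""
    tf_a = term_frequencies(sentence_a)
    tf_b = term_frequencies(sentence_b)
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty sentence shares nothing with any other sentence
    return dot / (norm_a * norm_b)


print(cosine_similarity("the app collects location data",
                        "the app shares location data"))
print(cosine_similarity("the app collects location data",
                        "privacy policy updated yesterday"))
```

Sentences that share most of their words score close to 1, and sentences with no words in common score 0.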
Research on a Path for Discovering Association Opportunities of Disruptive Technologies Based on the SAO Structure
18
Authors: 金可怡, 周立军, 杨静. 《情报杂志》 (CSSCI, PKU Core), 2024, No. 9, pp. 84-91, 111 (9 pages)
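[Research purpose] Identifying and predicting the development and associations of disruptive technologies helps with timely technology deployment and with building a strategic advantage in industrial competition. [Research method] Taking technology patent abstracts as the data, this paper proposes a path for discovering technology association opportunities that combines SAO structure extraction, LDA topic analysis, and link prediction, and validates the method using brain-computer interfaces as an example. [Research conclusion] Brain-computer interfaces have broad association opportunities in their own technological iteration, in medical and health care, in extended application areas, and in multimodal integrated control systems. A comparative evaluation against multi-source technical reports such as papers, patents, research projects, and news reports demonstrates the feasibility and effectiveness of the path.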
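Keywords: disruptive technology; technology identification; technology patents; patent text; SAO structure; LDA topic model; link prediction; brain-computer interface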
Research on a Bibliographic Information Classification Method Based on a Composite Weighted LDA Model (cited by: 14)
19
Authors: 李湘东, 丁丛, 高凡. 《情报学报》 (CSSCI, CSCD, PKU Core), 2017, No. 4, pp. 352-360 (9 pages)
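Research on automatic classification with bibliographic information as the object of classification is important for organizing information resources. This paper uses the probabilistic topic model LDA as the text representation model for bibliographic information, to overcome the feature sparsity caused by short texts. Two different feature weighting strategies are implemented, one based on the stylistic structure of the bibliographic information and one on the class-discriminating power of its category; on this basis a composite weighting strategy is constructed so that the resulting feature word set is not biased toward high-frequency words and better represents the category to which the bibliographic information belongs. The composite weighting strategy is integrated into LDA, yielding a bibliographic information classification method based on composite weighted LDA. Comparative experiments on public and self-built bibliographic corpora verify and analyze the effectiveness of the composite weighting strategy; the experiments show that the proposed composite weighted LDA method outperforms LDA classification methods that consider only one of the feature weighting strategies.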
Keywords: text classification; LDA model; feature weighting; bibliographic information; text stylistic structure
Language-Model-Assisted Construction of a Move Corpus of English Scientific Paper Abstracts
20
Authors: 李洪政, 王若锦, 刘芳, 冯冲. 《外语学刊》 (PKU Core), 2025, No. 1, pp. 29-38 (10 pages)
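Move structures are textual discourse units in academic papers and are of considerable value in areas such as English for Academic Purposes. Although research on moves in academic papers is abundant, move-annotated data resources remain relatively scarce. With the help of language models from natural language processing, this study builds a move-annotated corpus of English scientific paper abstracts covering multiple disciplines and containing nearly 34,000 move structures. The first stage of corpus construction relies on expert annotation to produce high-quality data; the second and main stage uses an automatic annotation model based on the BERT architecture, which speeds up annotation and scales it up while maintaining annotation quality. The study then runs experiments on automatic annotation and recognition of abstract moves, comparing the automatic annotation model with the large language models ChatGPT and Claude3 in recognizing move structures across disciplines, which verifies the value of the model and the corpus. The study provides necessary data resources for natural language processing tasks such as intelligent correction of scientific writing and for foreign language teaching and research such as English for Academic Purposes; it also verifies the feasibility of using large language models to help build language resources, reflects the importance of smart foreign language education driven by language intelligence, and can effectively promote the digital transformation of foreign language education.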
Keywords: move structure; corpus; abstract text; large language model