11,535 articles found
A Study of the Sentence and Its Structure: Taking Works of the Classical Period as Examples
1
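Authors: 符方泽, 杨伟杰 《北方音乐》 2024, No. 1, pp. 131-140, 10 pages
This article takes the sentence as its object of study. On the one hand, it compares, distinguishes, and theoretically reviews the concept across Chinese and foreign textbooks on musical form and related literature, clarifying its basic concept and internal structure. On the other hand, using works of the Classical period as analytical examples, it discusses the "structural paradigm" of this specific theme type (syntax) and further classifies its "structural deformations".
Keywords: sentence; Classical style; paradigm; deformation; presentation phrase; continuation phrase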
Research on Patent Technology Topic Clustering Based on Sentence-BERT: The Case of the Artificial Intelligence Field Cited by: 1
2
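Authors: 阮光册, 周萌葳 《情报杂志》 PKU Core 2024, No. 2, pp. 110-117, 8 pages
[Research purpose] To apply the Sentence-BERT model to patent technology topic clustering, addressing the problem that patent documents, in order to highlight novelty, often use unique technical terms, which makes the semantic features of word vectors sparse. [Research method] The experimental data are 22,370 patents in the artificial intelligence field from 2015 to 2019. First, the Sentence-BERT algorithm is used to produce vector representations of the patent abstract texts; second, the dimensionality of the vectorized matrix is reduced and HDBSCAN is used to find high-density clusters in the original data; finally, topic features are identified in each cluster's text collection and the topics are presented. [Research conclusion] Compared with the LDA topic model, K-means, doc2vec, and other methods, the experimental results improve the granularity and precision of topic division and achieve good topic coherence. How to use a fine-tuning strategy to further improve the model is a direction for future exploration of this method.
Keywords: Sentence-BERT; patent text; topic identification; text clustering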
Classification of Conversational Sentences Using an Ensemble Pre-Trained Language Model with the Fine-Tuned Parameter
3
Authors: R. Sujatha, K. Nimala 《Computers, Materials & Continua》 SCIE EI 2024, No. 2, pp. 1669-1686, 18 pages
Sentence classification is the process of categorizing a sentence based on the context of the sentence. Sentence categorization requires more semantic highlights than other tasks, such as dependence parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress and comparing impacts. An ensemble pre-trained language model was taken up here to classify the conversation sentences from the conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These classification label sequences are for analyzing the conversation progress and predicting the pecking order of the conversation. Ensemble of Bidirectional Encoder for Representation of Transformer (BERT), Robustly Optimized BERT pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models are trained on conversation corpus with hyperparameters. Hyperparameter tuning approach is carried out for better performance on sentence classification. This Ensemble of Pre-trained Language Models with a Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.
Keywords: Bidirectional encoder for representation of transformer; conversation; ensemble model; fine-tuning; generalized autoregressive pretraining for language understanding; generative pre-trained transformer; hyperparameter tuning; natural language processing; robustly optimized BERT pretraining approach; sentence classification; transformer models
Topic Identification in Review Texts by Fusing Sentence-BERT and LDA Cited by: 4
4
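Authors: 阮光册, 黄韵莹 《现代情报》 2023, No. 5, pp. 46-53, 8 pages
[Purpose/significance] To address insufficient semantic description and weak semantic coherence of the learned topics in review-text topic identification, this paper combines the Sentence-BERT sentence embedding model with the LDA model to improve the semantic quality of review-text topics. [Method/process] The Sentence-BERT model is used to obtain sentence-level vector features of the review texts, while the LDA model is used to obtain their probabilistic topic vectors. An autoencoder then joins the two sets of vectors, the K-means algorithm clusters the latent-space vectors, and contextual topic information is obtained from the clusters. [Result/conclusion] Experiments on a review-text dataset show that the method obtains topic words carrying semantic information reasonably well. Combining the Sentence-BERT model with LDA increases the complexity of the model. In comparison, the topic coherence metric (Coherence) obtained by this method is better than that of currently common review-text topic identification methods.
Keywords: Sentence-BERT; LDA model; review text; topic identification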
Next Words Prediction and Sentence Completion in Bangla Language Using GRU-Based RNN on N-Gram Language Model
5
Authors: Afranul Hoque, Busrat Jahan, Shaikat Chandra Paul, Zinat Ara Zabu, Rakhi Mondal, Papeya Akter 《Journal of Data Analysis and Information Processing》 2023, No. 4, pp. 388-399, 12 pages
We use a lot of devices in our daily life to communicate with others. In this modern world, people use email, Facebook, Twitter, and many other social network sites for exchanging information. People lose valuable time misspelling and retyping, and some people are not happy to type long sentences because they run into unnecessary words or grammatical issues. For this reason, word prediction systems help people exchange textual information more quickly, easily, and comfortably. These systems predict the next most probable words and let users choose the needed word from the suggestions. Word prediction can help the writer by predicting the next word and helping complete the sentence correctly. This research aims to forecast the most suitable next word to complete a sentence for any given context. In this research, we have worked on the Bangla language. We present a process that can predict the next most probable and proper words and suggest a complete sentence using the predicted words. A GRU-based RNN has been used on the N-gram dataset to develop the proposed model. We collected a large dataset from multiple sources in the Bangla language and compared the model with other approaches that have been used, such as LSTM and Naive Bayes. The suggested approach provides higher accuracy than the others. Here, the unigram model achieves 88.22%, the bigram model 99.24%, the trigram model 97.69%, and the 4-gram and 5-gram models 99.43% and 99.78% average accuracy, respectively. We think that our proposed method can have a profound impact on Bangla search engines.
Keywords: Bangla Language; Words Prediction; Sentence Completion; GRU; RNN; Corpus; N-Gram
A Sentence Retrieval Generation Network Guided Video Captioning
6
Authors: Ou Ye, Mimi Wang, Zhenhua Yu, Yan Fu, Shun Yi, Jun Deng 《Computers, Materials & Continua》 SCIE EI 2023, No. 6, pp. 5675-5696, 22 pages
Currently, the video captioning models based on an encoder-decoder mainly rely on a single video input source. The contents of video captioning are limited since few studies employed external corpus information to guide the generation of video captioning, which is not conducive to the accurate description and understanding of video content. To address this issue, a novel video captioning method guided by a sentence retrieval generation network (ED-SRG) is proposed in this paper. First, a ResNeXt network model, an efficient convolutional network for online video understanding (ECO) model, and a long short-term memory (LSTM) network model are integrated to construct an encoder-decoder, which is utilized to extract the 2D features, 3D features, and object features of video data respectively. These features are decoded to generate textual sentences that conform to video content for sentence retrieval. Then, a sentence-transformer network model is employed to retrieve different sentences in an external corpus that are semantically similar to the above textual sentences. The candidate sentences are screened out through similarity measurement. Finally, a novel GPT-2 network model is constructed based on GPT-2 network structure. The model introduces a designed random selector to randomly select predicted words with a high probability in the corpus, which is used to guide and generate textual sentences that are more in line with human natural language expressions. The proposed method is compared with several existing works by experiments. The results show that the indicators BLEU-4, CIDEr, ROUGE_L, and METEOR are improved by 3.1%, 1.3%, 0.3%, and 1.5% on the public dataset MSVD and by 1.3%, 0.5%, 0.2%, and 1.9% on the public dataset MSR-VTT, respectively. It can be seen that the proposed method can generate video captioning with richer semantics than several state-of-the-art approaches.
Keywords: Video captioning; encoder-decoder; sentence retrieval; external corpus; RS GPT-2 network model
A Study of Nominal Predicate Sentences Under the Framework of Cognitive Grammar
7
Authors: ZOU Wen-jie, GAO Wen-cheng 《Journal of Literature and Art Studies》 2023, No. 9, pp. 704-707, 4 pages
Cognitive grammar, as a linguistic theory that attaches importance to the relationship between language and thinking, provides us with a more comprehensive way to understand the structure, semantics and cognitive processing of noun predicate sentences. Therefore, under the framework of cognitive grammar, this paper tries to analyze the semantic connection and cognitive process in noun predicate sentences from the semantic perspective and the method of example theory, and discusses the motivation of the formation of this construction, so as to provide references for in-depth analysis of the cognitive laws behind noun predicate sentences.
Keywords: nominal predicate sentences; cognitive grammar; semantics
The Chinese "Juzi" (句子) and the English Sentence Cited by: 20
8
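Author: 姜望琪 《解放军外国语学院学报》 PKU Core 2005, No. 1, pp. 10-15, 6 pages
The Chinese "juzi" (句子) is not equivalent to the English sentence; it is more like an utterance. In Western linguistic research, represented by the study of English, the sentence is an abstract unit. Chinese linguistic research, by contrast, has always emphasized units of actual use and neglected abstract units. In particular, research at the level corresponding to the sentence began late, so that the Chinese "juzi" remains to this day a concrete unit, or "dynamic unit". The Chinese unit corresponding to the sentence is the "cizu" (词组, word group), not the "juzi". The "cizu", also called "duanyu" (短语, phrase), is the largest structural unit, grammatical unit, or "static unit" of Chinese.
Keywords: juzi (句子); utterance; dynamic unit; static unit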
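Automatic Construction of Consultation Question Prompt Lists Based on Sentence-BERT Semantic Representation: The Case of Diabetes Consultation Cited by: 13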
9
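Authors: 唐晓波, 刘亚岚 《现代情报》 CSSCI 2021, No. 8, pp. 3-15, 13 pages
[Purpose/significance] A consultation question prompt list can guide users when they consult intelligent question-answering and intelligent consultation systems, and it provides a basis for dynamic consultation guidance. At present, most research on constructing question prompt lists relies on expert consultation and interviews, which cannot meet the requirements of intelligent consultation services. Taking the diabetes Q&A on the website 有问必答网 as an example, this paper proposes a model for automatically constructing consultation question prompt lists based on Sentence-BERT semantic representation. [Method/process] First, a diabetes category system is determined on the basis of a survey and analysis of diabetes-related literature, and the categories of consultation questions are annotated manually. Second, the LDA model is used to cluster each category's question set by topic. Then, under each topic, the questions are represented semantically with the pre-trained Sentence-BERT model, and the TextRank algorithm computes and ranks question importance. Finally, after redundancy removal, the consultation question prompt list is constructed. [Result/conclusion] The experimental results show that the proposed model can effectively construct content-rich prompt lists of high information quality and helps guide consultation.
Keywords: question prompt list; intelligent question answering; intelligent consultation; Q&A community; diabetes consultation; LDA; Sentence-BERT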
On Sentence Complexity in THE TIMES: A Comparative Study of Sentence Length and Sentence Complexity in the News Section and the Sports Section
10
Author: 赤列德吉 《海外英语》 2014, No. 15, pp. 3-4, 2 pages
My investigation will serve two purposes. First, I shall investigate the function of the subclauses in the corpus in relation to their complexity, and I shall establish whether there is a correlation between sentence length and sentence complexity. Second, I shall analyse the complexity of the subclauses collected from the two sections and compare the results from these sections, focusing on finite subclauses and non-finite subclauses. I hope to be able to point out some differences in style between the news and sports sections concerning the use of subordinate clauses in various syntactic functions in order to examine how the choice of linguistic structures differs in different sections of The Times.
Keywords: sentence length; sentence complexity; style marker; t
Developing Sentence Sense
11
Authors: 霍鑫红, 商洋 《海外英语》 2018, No. 6, pp. 212-213, 2 pages
Many people have great difficulty understanding long, complex English sentences. It seems obvious that people also have difficulty using a complete sentence. This paper therefore aims to address these problems by developing sentence sense, with the help of a chart.
Keywords: sentence sense; finite verbs; non-finite verbs; chart
Image Sentence Annotation Based on Sentence-Rank
12
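Authors: 徐守坤, 徐坚, 李宁, 周佳, 刘楚秋 《计算机工程与应用》 CSCD PKU Core 2019, No. 2, pp. 121-127, 7 pages
Traditional semantic sentence annotation of images uses sentence templates to describe image content, but the resulting sentences rarely conform to the logic of language. To address this problem, a statistical approach is proposed that selects the single best sentence from a corpus to describe the image content, and a Sentence-Rank algorithm, built mainly on the N-gram algorithm, is designed to generate the annotation sentence. First, machine vision feature learning is performed, and the HSV-LBP-HOG fused feature with the best annotation performance is selected for image classification, yielding the image annotation keywords. Then a string matching algorithm lists the sentences in the corpus that contain all of the annotation keywords, the Sentence-Rank algorithm ranks these sentences by value, and the highest-scoring sentence is chosen to describe the image. Experimental results show that the annotation sentences obtained by this method have low perplexity, which addresses the problem of linguistic logic reasonably well.
Keywords: machine learning; natural language processing; feature fusion; Sentence-Rank; N-gram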
Masked Sentence Model Based on BERT for Move Recognition in Medical Scientific Abstracts Cited by: 19
13
Authors: Gaihong Yu, Zhixiong Zhang, Huan Liu, Liangping Ding 《Journal of Data and Information Science》 CSCD 2019, No. 4, pp. 42-55, 14 pages
Purpose: Move recognition in scientific abstracts is an NLP task of classifying sentences of the abstracts into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method. Design/methodology/approach: Prevalent models based on BERT for sentence classification often classify sentences without considering the context of the sentences. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences in move recognition. Experiments are conducted on the benchmark dataset PubMed 20K RCT in three steps. Then, we compare our model with HSLN-RNN, BERT-based and SciBERT using the same dataset. Findings: Compared with the BERT-based and SciBERT models, the F1 score of our model outperforms them by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model, and the result of our model comes closest to the state-of-the-art results of HSLN-RNN at present. Research limitations: The sequential features of move labels are not considered, which might be one of the reasons why HSLN-RNN has better performance. Our model is restricted to dealing with biomedical English literature because we use a dataset from PubMed, which is a typical biomedical database, to fine-tune our model. Practical implications: The proposed model is better and simpler in identifying move structures in scientific abstracts and is worthy of text classification experiments for capturing contextual features of sentences. Originality/value: The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of neural networks.
Keywords: Move recognition; BERT; Masked sentence model; Scientific abstracts
CiteOpinion: Evidence-based Evaluation Tool for Academic Contributions of Research Papers Based on Citing Sentences Cited by: 6
14
Authors: Xiaoqiu Le, Jingdan Chu, Siyi Deng, Qihang Jiao, Jingjing Pei, Liya Zhu, Junliang Yao 《Journal of Data and Information Science》 CSCD 2019, No. 4, pp. 26-41, 16 pages
Purpose: To uncover the evaluation information on the academic contribution of research papers cited by peers based on the content cited by citing papers, and to provide an evidence-based tool for evaluating the academic value of cited papers. Design/methodology/approach: CiteOpinion uses a deep learning model to automatically extract citing sentences from representative citing papers; it starts with an analysis of the citing sentences, then identifies the major academic contribution points of the cited paper, positive/negative evaluations from citing authors, and the changes in the subjects of subsequent citing authors by means of recognizing categories of moves (problems, methods, conclusions, etc.), sentiment analysis, and topic clustering. Findings: Citing sentences in a citing paper contain substantial evidence useful for academic evaluation. They can also be used to objectively and authentically reveal the nature and degree of contribution of the cited paper reflected by citation, beyond simple citation statistics. Practical implications: The evidence-based evaluation tool CiteOpinion can provide an objective and in-depth academic value evaluation basis for the representative papers of scientific researchers, research teams, and institutions. Originality/value: No other similar practical tool is found in the papers retrieved. Research limitations: There are difficulties in acquiring the full text of citing papers. There is a need to refine the calculation based on the sentiment scores of citing sentences. Currently, the tool is only used for academic contribution evaluation, while its value in policy studies, technical application, and promotion of science is not yet tested.
Keywords: Cited paper; Citing paper; Citing sentence; Citation motive; Citation sentiment; Academic contribution; Evaluation
A Sentence Similarity Estimation Method Based on Improved Siamese Network Cited by: 4
15
Authors: Ziming Chi, Bingyan Zhang 《Journal of Intelligent Learning Systems and Applications》 2018, No. 4, pp. 121-134, 14 pages
In this paper we employ an improved Siamese neural network to assess the semantic similarity between sentences. Our model takes two sentences as input and outputs a similarity score. We design our model based on the Siamese network using a deep Long Short-Term Memory (LSTM) network, and we add a special attention mechanism that lets the model give different words different attention while modeling sentences. A fully-connected layer is proposed to measure the complex sentence representations. Our results show that the accuracy is better than the baseline in 2016. Furthermore, it is shown that the model has the ability to model the sequence order, distribute reasonable attention and extract meanings of a sentence in different dimensions.
Keywords: Sentence similarity; Sentence modeling; Similarity measurement; Attention mechanism; Fully-connected layer; Disorder sentence dataset
Markedness and UG in Chinese Children's Acquisition of One-word and Negative Sentences Cited by: 1
16
Authors: Yu Shanzhi (Department of Foreign Languages, Henan University, Kaifeng 475001, P. R. China; sZyu@mail.henu.edu.cn), Zhang Xinhong (Faculty of English Language and Culture, Guangdong University of Foreign Studies, Guangzhou 510420, P. R. China; bbjohnson@163.net) 《现代外语》 CSSCI PKU Core 1999, No. 4, pp. 379-381, 3 pages
The present study is an investigation and analysis of the relationship between markedness and first language acquisition sequence, as shown in the cases of one-word and negative sentences. Here our objectives are to argue for the priority of unmarkedness over markedness in the acquisition sequ...
Keywords: markedness; UG; acquisition; one-word sentence; negative sentence
The Contrasts of English Sentences and Chinese Sentences and Translation Skills
17
Authors: 郭丁菊, 张晶 《英语广场(学术研究)》 2013, No. 3, pp. 21-22, 2 pages
Since the opening-up policy was carried out in 1979, every aspect of the country's modernization has developed at high speed; meanwhile, the need for English has increased year by year. The occasions and scope of its use have expanded with communication among countries. English has therefore become the general international language in our country; in particular, translation plays an important and irreplaceable part in conveying information in English. This paper aims to introduce the contrasts between English sentences and Chinese sentences and to discuss some skills for translating between them. Though it is neither complete nor authoritative, it may help some people to understand the differences between the two languages and to grasp some practical skills.
Keywords: translation; contrasts; Chinese sentences; English sentences; translation skills
A New Method for Calculating Similarity between Sentences and Application on Automatic Abstracting Cited by: 1
18
Authors: Wenqian JI, Zhoujun LI, Wenhan CHAO, Xiaoming CHEN 《Intelligent Information Management》 2009, No. 1, pp. 36-42, 7 pages
Sentence similarity computing plays an important role in machine question-answering systems, machine-translation systems, information retrieval and automatic abstracting systems. This article first sums up several methods for calculating similarity between sentences, and then proposes a new method which takes all factors into consideration, including critical words, semantic information, sentential form and sentence length. On this basis, an automatic abstracting system based on the LexRank algorithm is implemented. We made several improvements in both sentence weight computing and redundancy resolution. The system described in this article can deal with single or multi-document summarization in both English and Chinese. In evaluations on two corpora, our system could produce better summaries to a certain degree. We also show that our system is quite insensitive to the noise in the data that may result from an imperfect topical clustering of documents. Finally, existing problems and the developing trend of automatic summarization technology are discussed.
Keywords: sentence similarity; automatic abstracting; LexRank; sentence-weight computing; redundancy resolution
Short Text Classification Based on the Sentence-LDA Topic Model Cited by: 4
19
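Authors: 张浩, 钟敏 《计算机与现代化》 2019, No. 3, pp. 102-106, 5 pages
The sparse features and strong context dependence of short texts mean that traditional long-text classification techniques cannot be applied to them directly and effectively. To address feature sparsity in short texts, a short text classification method based on feature expansion with the Sentence-LDA topic model is proposed. This topic model is an extension of the Latent Dirichlet Allocation (LDA) model that assumes a sentence generates only one topic distribution. The trained Sentence-LDA topic model predicts the topic distribution of the original short text, and the resulting topic words are added to the original short text's features, completing the feature expansion. The expanded short texts are then classified with a Support Vector Machine (SVM). Experiments show that, compared with the traditional method of representing short texts directly with the Vector Space Model (VSM), the proposed method effectively improves the accuracy of short text classification.
Keywords: short text classification; Sentence-LDA; topic model; feature expansion; SVM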
Information mining and similarity computation for semi-/un-structured sentences from the social data Cited by: 1
20
Authors: Peiying Zhang, Xingzhe Huang, Lei Zhang 《Digital Communications and Networks》 SCIE CSCD 2021, No. 4, pp. 518-525, 8 pages
In recent years, with the development of the social Internet of Things (IoT), all kinds of data have accumulated on the network. These data contain a lot of social information and opinions. However, these data are rarely fully analyzed, which is a major obstacle to the intelligent development of the social IoT. In this paper, we propose a sentence similarity analysis model to analyze the similarity in people's opinions on hot topics in social media and news pages. Most of these data are unstructured or semi-structured sentences, so the accuracy of sentence similarity analysis largely determines the model's performance. For the purpose of improving accuracy, we propose a novel method of sentence similarity computation to extract the syntactic and semantic information of the semi-structured and unstructured sentences. We mainly consider the subjects, predicates and objects of sentence pairs and use Stanford Parser to classify the dependency relation triples to calculate the syntactic and semantic similarity between two sentences. Finally, we verify the performance of the model with the Microsoft Research Paraphrase Corpus (MRPC), which consists of 4076 pairs of training sentences and 1725 pairs of test sentences, and most of the data came from the news of social data. Extensive simulations demonstrate that our method outperforms other state-of-the-art methods regarding the correlation coefficient and the mean deviation.
Keywords: sentence similarity computation; Information mining and computation; Social data; Internet of things; Type of sentence pairs