9 articles found
A Coarse Chinese Word Segmentation Model Based on the N-Shortest-Paths Method  Cited by: 99
1
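Authors: Zhang Huaping (张华平), Liu Qun (刘群)  Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2002, No. 5, pp. 1-7 (7 pages)
Coarse word segmentation in the preprocessing stage is the foundation of Chinese lexical analysis and strongly affects final recall, precision, and running efficiency. Coarse segmentation must hand later stages a small set of intermediate results with high recall. This paper proposes a coarse segmentation model based on the N-shortest-paths method that aims to balance high recall with high efficiency. On this basis, word-frequency statistics are introduced to improve the original model, yielding a more practical statistical model. The authors ran coarse segmentation experiments on one month of the People's Daily corpus (185,192 sentences in total). Measured per sentence, the non-statistical 2-shortest-paths model reaches a recall of 99.73%; the statistical 10-shortest-paths model reaches a recall of 99.94% with an average of 6.12 coarse segmentation results per sentence, 15% higher than maximum matching and at least 6.4% higher than the best previous segmentation method. The average number of coarse results is 64 times smaller than with full segmentation. The experiments show that the N-shortest-paths method is a practical and effective means of coarse word segmentation in preprocessing.
Keywords: N-shortest-paths method; coarse Chinese word segmentation model; Chinese lexical analysis; preprocessing; statistical model; Chinese information processing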
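The idea summarized in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it simply keeps the `n` lowest-cost paths at each position of a word lattice, which may differ from how the paper handles ties. The function name `n_shortest_segmentations`, the `lexicon` and `freq` parameters, and the add-one smoothing are all assumptions made for illustration.

```python
import heapq
import math


def n_shortest_segmentations(sentence, lexicon, n=2, freq=None):
    """Return up to n lowest-cost segmentations as (cost, words) pairs.

    Without `freq`, every word costs 1, so the cheapest paths are the
    segmentations with the fewest words (the non-statistical variant).
    With `freq` (hypothetical word counts), a word costs -log of its
    add-one-smoothed relative frequency (an assumed smoothing scheme).
    """
    length = len(sentence)
    total = sum(freq.values()) if freq else 0
    # best[i] holds up to n (cost, words) pairs covering sentence[:i]
    best = [[] for _ in range(length + 1)]
    best[0] = [(0.0, [])]
    for end in range(1, length + 1):
        candidates = []
        for start in range(end):
            word = sentence[start:end]
            # Single characters are always allowed so every position is reachable.
            if end - start > 1 and word not in lexicon:
                continue
            if freq:
                cost = -math.log((freq.get(word, 0) + 1) / (total + 1))
            else:
                cost = 1.0
            for prev_cost, prev_words in best[start]:
                candidates.append((prev_cost + cost, prev_words + [word]))
        best[end] = heapq.nsmallest(n, candidates, key=lambda c: c[0])
    return best[length]
```

For example, calling `n_shortest_segmentations("研究生命起源", {"研究", "研究生", "生命", "起源"}, n=2)` keeps both three-word readings of the ambiguous string, which is the kind of small, high-recall candidate set the abstract asks a coarse segmenter to pass downstream.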
Automatic Chinese Word Classification Based on an Improved Ant Colony Algorithm  Cited by: 2
2
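Author: Lai Juan (赖娟)  Bulletin of Science and Technology (《科技通报》), PKU Core, 2012, No. 2, pp. 152-154 (3 pages)
This paper studies automatic classification of Chinese words. To address problems such as the low accuracy of the traditional ant colony algorithm on Chinese word classification, it proposes a method that applies the ant colony algorithm to automatic Chinese word classification. The method first gathers statistics over a large corpus to obtain unigram and bigram information for each word, and then uses the ant colony algorithm to classify words based on that information. Experimental results show that the proposed algorithm effectively improves word classification accuracy.
Keywords: Chinese word classification; ant colony algorithm; corpus text; mutual information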
A Multi-level Component Fusion Chinese Word Embedding Model and Its Application to Core Drug Identification
3
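Authors: Kong Deyang (孔德阳), Zhang Yun (张云), Wang Weiguang (王维广), Chen Zijie (陈子杰), Liu Yongguo (刘勇国)  Journal of Computer Applications (《计算机应用》), CSCD, PKU Core, 2024, No. S2, pp. 50-54 (5 pages)
Word embedding models map words into a low-dimensional vector space so that word semantics can be analyzed, giving computers an effective means of understanding and processing text. Traditional Chinese word embedding models learn semantics from the internal composition of Chinese words; however, different models either underuse or overuse information from Chinese characters and their components at different levels. To make better use of this multi-level component information and generate high-quality word embeddings, a multi-level component fusion Chinese word embedding (MJWE) model is proposed. It jointly considers the features of words, characters, and multi-level components, fuses character embeddings that carry position information, and builds multi-level component embeddings from radicals (偏旁 and 部首) and finer-grained components, thereby capturing the internal semantics of Chinese words more fully. A vocabulary of non-compositional words is also built to prevent overuse of word-internal information. Experimental results show that on the word similarity task WS-295, MJWE improves accuracy by 2.11% over the JWE (Joint learning Word Embeddings) model; on the word analogy task state, by 2.52% over the Skip-Gram (SG) model; and on the word analogy task family, by 6.58% over the Continuous Bag-of-Words (CBOW) model. On binary sentiment classification, MJWE improves accuracy by 0.71% over JWE; on seven-class sentiment classification, by 8.60% over SG. MJWE is also applied to the analysis of traditional Chinese medicine literature: in the task of identifying core drugs in prescriptions, it can identify the core drugs used to treat different syndromes of chronic glomerulonephritis. MJWE can thus generate good-quality Chinese word embeddings and, combined with a community detection algorithm, identify the core drugs for different syndromes of chronic glomerulonephritis, which helps support the clinical decisions of traditional Chinese medicine physicians.
Keywords: word embedding; Chinese words; multi-level components; traditional Chinese medicine prescriptions; core drugs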
New Word Identification Based on an Iterative Algorithm  Cited by: 7
4
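Authors: Zhao Xiaobao (赵小宝), Zhang Huaping (张华平)  Computer Engineering (《计算机工程》), CAS, CSCD, 2014, No. 7, pp. 154-158, 164 (6 pages)
New word identification is an important foundation of Chinese information processing, but the strong word-forming capacity of Chinese characters makes new word detection difficult. Inspired by the duality principle, this paper proposes a new word identification algorithm based on iteration. The target corpus is segmented and part-of-speech tagged, and strings are counted in two scans to extract repeated patterns. Combined with features of word structure, the algorithm iteratively applies features such as the mutual information of repeated patterns, left (right) entropy, and the average right (left) entropy of left (right) neighbors to identify new words and obtain a candidate list. A Chinese word collocation library is then used for a final filtering pass over the candidates to produce the final new word list. Experimental results show that with this method P@10 reaches 100% and P@100 rises to 90%, and that the average right (left) entropy of left (right) neighbors can improve identification accuracy to some extent.
Keywords: duality principle; new word identification; iterative algorithm; information entropy; repeated patterns; Chinese word collocation library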
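Two of the features named in this abstract, mutual information and left (right) entropy, have standard textbook definitions that can be sketched briefly. The code below is only an illustration over a raw character string; it is not the paper's iterative pipeline, it omits the neighbor-averaged entropy feature, and the helper names (`occurrences`, `neighbor_entropy`, `pointwise_mutual_information`) and the probability estimates are assumptions.

```python
import math
from collections import Counter


def occurrences(corpus, pattern):
    """Return the start indices of all (possibly overlapping) matches."""
    starts = []
    pos = corpus.find(pattern)
    while pos != -1:
        starts.append(pos)
        pos = corpus.find(pattern, pos + 1)
    return starts


def _entropy(counter):
    """Shannon entropy (in bits) of a frequency table."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def neighbor_entropy(corpus, candidate):
    """Return (left_entropy, right_entropy) of the characters adjacent to candidate."""
    left, right = Counter(), Counter()
    for start in occurrences(corpus, candidate):
        end = start + len(candidate)
        if start > 0:
            left[corpus[start - 1]] += 1
        if end < len(corpus):
            right[corpus[end]] += 1
    return _entropy(left), _entropy(right)


def pointwise_mutual_information(corpus, first, second):
    """log2 of p(first+second) / (p(first) * p(second)), estimated from counts."""
    def prob(pattern):
        positions = len(corpus) - len(pattern) + 1
        return len(occurrences(corpus, pattern)) / positions if positions > 0 else 0.0

    joint, p1, p2 = prob(first + second), prob(first), prob(second)
    if joint == 0.0 or p1 == 0.0 or p2 == 0.0:
        return float("-inf")
    return math.log2(joint / (p1 * p2))
```

A candidate string whose parts have high mutual information and whose left and right neighbors are varied (high entropy on both sides) behaves like a free-standing word, which is the intuition behind using these features to rank new word candidates.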
Feature study for improving Chinese overlapping ambiguity resolution based on SVM  Cited by: 1
5
Authors: Xiong Ying (熊英), Zhu Jie (朱杰)  Journal of Southeast University (English Edition), EI, CAS, 2007, No. 2, pp. 179-184 (6 pages)
In order to improve Chinese overlapping ambiguity resolution based on a support vector machine, statistical features are studied for representing the feature vectors. First, four statistical parameters (mutual information, accessor variety, two-character word frequency, and single-character word frequency) are each used to describe the feature vectors. Then other parameters are added as complementary features to those that obtain the best results, to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency, and accessor variety obtain an optimum result of 94.39%. Compared with a commonly used word probability model, the accuracy is improved by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
Keywords: support vector machine; Chinese overlapping ambiguity; Chinese word segmentation; word probability model
Unified Framework of Performing Chinese Word Segmentation and Part-of-Speech Tagging  Cited by: 4
6
Authors: Zhang Kaixu, Sun Maosong  China Communications, SCIE, CSCD, 2012, No. 3, pp. 1-9 (9 pages)
The paper proposes a unified framework that combines the advantages of the fast one-at-a-time approach and the high-performance all-at-once approach to perform Chinese Word Segmentation (CWS) and Part-of-Speech (PoS) tagging. In this framework, the input of the PoS tagger is a candidate set of several CWS results provided by the CWS model. The widely used one-at-a-time approach and all-at-once approach are two extreme cases of the proposed candidate-based approach. Experiments on Penn Chinese Treebank 5 and Tsinghua Chinese Treebank show that the generalized candidate-based approach outperforms the one-at-a-time approach and even the all-at-once approach. The candidate-based approach is also faster than the time-consuming all-at-once approach. The authors compare three different methods for generating the candidate set, based on sentences, words, and character intervals. The word-based method turns out to have the best performance.
Keywords: natural language processing; Chinese word segmentation; PoS tagging; candidate; word lattice
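The candidate-based idea in this abstract can be sketched in a few lines. This is a loose illustration, not the paper's model: the function `joint_decode`, the `tag_fn` callback, and the choice to add the segmentation score and the tagging score are all assumptions; the paper may combine the two stages differently.

```python
def joint_decode(candidates, tag_fn):
    """Pick the best (total_score, words, tags) over a candidate set.

    candidates: list of (seg_score, words) pairs from a segmenter.
    tag_fn: hypothetical tagger, words -> (tag_score, tags).
    Scores are assumed to be log-scores, so they are summed.
    """
    best = None
    for seg_score, words in candidates:
        tag_score, tags = tag_fn(words)
        total = seg_score + tag_score
        if best is None or total > best[0]:
            best = (total, words, tags)
    return best
```

Passing a single candidate reproduces the one-at-a-time pipeline, where the tagger must accept the segmenter's top choice. Passing every possible segmentation approaches the all-at-once setting. A small candidate set sits between these two extremes, which is the trade-off the abstract describes.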
Cultural Introduction in English Vocabulary Teaching in China
7
Author: Chao Lei  Sociology Study, 2018, No. 4, pp. 180-187 (8 pages)
Every language possesses three cardinal elements: phonetic element, lexical element, and grammatical structure, of which lexis is the fundamental pillar that supports the huge system of a language. The close relationship between language and culture is most readily seen in words. In fact, being the most active and elastic element of a language, vocabulary has the greatest culture-loading capacity. Vocabulary teaching is an integral part of foreign language teaching. Its efficiency has a direct relation with the development of the learners' communicative competence. Vocabulary is culture-bound, so it is self-evident that culture introduction is indispensable in teaching. The author attempts to make a comparison between English and Chinese cultures, to make clear how cultural disparities exist in English and Chinese vocabulary, and to put forward some constructive suggestions on how to integrate culture into vocabulary teaching in Chinese schools, so as to promote the efficiency of vocabulary teaching and improve learners' competence in intercultural communication.
Keywords: cultural differences; English vocabulary teaching; principles; approaches
A Lexical Comparative Analysis of the English and Chinese Color Terms in Connotation
8
Author: WANG Bei  Journalism and Mass Communication, 2014, No. 3, pp. 215-223 (9 pages)
Vocabulary is the most active element of a language, and color words carry rich meanings. Nature is multicolored, so each nationality has formed its own view of color over a long history, reflecting a rich national culture. Culture constrains how the meanings of color words develop, and the cultural meanings of color words in turn reflect rich cultural connotations. Because of different cultural issues, cultural traditions, and cultural psychology, the cultural connotations of English and Chinese color words differ greatly; these connotations were shaped in different environments by different nationalities. There are many similarities and differences in meaning between English and Chinese color words. This paper analyzes Chinese and English color terms from the perspective of lexicology, guided by the book An Introduction to English Lexicology. After reviewing research by several linguists, it starts from the definition and origin of color terms, studies their change and word formation, and ends with idioms involving color terms, giving a systematic comparison of how Chinese and English color terms have developed.
Keywords: color terms; origin; changing; word-formation; idioms
On How Primary School Chinese Teachers Can Teach Vocabulary Well
9
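Author: Sun Wenbao (孙文保)  《好家长》 (Good Parents), 2018, No. 25, p. 203 (1 page)
In Chinese as a school subject, accumulating and mastering vocabulary is very important, and vocabulary teaching has long been a key part of primary school Chinese instruction. However, many teachers currently teach vocabulary in a single, monotonous way, which tends to bore students, keeps them from understanding words in depth, and harms their learning and development. Primary school Chinese teaching should therefore adopt sound teaching methods that help pupils master and learn written words and improve their writing ability. The article offers several points that can serve as a reference for how to conduct primary school Chinese teaching.
Keywords: primary school teaching; Chinese words; teaching methods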