8 articles found
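1. A Rough Chinese Word Segmentation Model Based on the N-Shortest-Paths Method (基于N-最短路径方法的中文词语粗分模型). Cited by: 99
Authors: Zhang Huaping (张华平), Liu Qun (刘群). Journal of Chinese Information Processing (《中文信息学报》), CSCD, PKU Core, 2002, No. 5, pp. 1-7 (7 pages)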
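Rough word segmentation during preprocessing is the foundation of Chinese lexical analysis and strongly affects the final recall, precision, and running efficiency. Rough segmentation must supply later stages with a small number of high-recall intermediate results. This paper proposes a rough segmentation model based on the N-shortest-paths method that aims to balance high recall with high efficiency. On this basis, word-frequency statistics are introduced to improve the original model, yielding a more practical statistical model. The authors ran rough segmentation experiments on a one-month People's Daily corpus (185,192 sentences in total). Measured per sentence, the 2-shortest-paths non-statistical model reaches a recall of 99.73%. The 10-shortest-paths statistical model reaches a recall of 99.94% with an average of 6.12 rough segmentation results, which is 15% higher than maximum matching and at least 6.4% higher than the best previous segmentation method. The average number of rough segmentation results is reduced by a factor of 64 compared with full segmentation. The results show that the N-shortest-paths method is a practical and effective means of rough word segmentation during preprocessing.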
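Keywords: N-shortest-paths method; rough Chinese word segmentation model; Chinese lexical analysis; preprocessing; statistical model; Chinese information processing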
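The idea of keeping only the few shortest paths through a word graph can be illustrated with a small sketch. The abstract does not spell out the paper's algorithm, so what follows is only an assumption-laden illustration: every dictionary word (and every single character) is an edge of cost 1, and a best-first search enumerates the `n` segmentations with the fewest words. The function name `n_shortest_segmentations` and the toy lexicon are hypothetical, and the word-frequency weighting of the statistical model is not included.

```python
import heapq

def n_shortest_segmentations(sentence, lexicon, n=2):
    """Return up to n segmentations that use the fewest words (unit edge cost)."""
    length = len(sentence)
    max_len = max(map(len, lexicon), default=1)
    # Heap entries: (words used so far, end position, tuple of words chosen)
    heap = [(0, 0, ())]
    results = []
    while heap and len(results) < n:
        cost, pos, words = heapq.heappop(heap)
        if pos == length:
            # Complete paths are popped in order of non-decreasing cost
            results.append(list(words))
            continue
        for end in range(pos + 1, min(length, pos + max_len) + 1):
            piece = sentence[pos:end]
            # Single characters are always allowed so every sentence has a path
            if end == pos + 1 or piece in lexicon:
                heapq.heappush(heap, (cost + 1, end, words + (piece,)))
    return results

# Hypothetical toy lexicon; a real system would load a full dictionary
toy_lexicon = {"研究", "研究生", "生命", "起源"}
for seg in n_shortest_segmentations("研究生命起源", toy_lexicon, n=2):
    print("/".join(seg))
```

Both three-word segmentations of the ambiguous string survive, so a later stage can choose between them. This sketch enumerates whole paths and can be slow on long sentences; it is meant to show the idea, not to be efficient.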
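2. Research on Automatic Chinese Word Classification Based on an Improved Ant Colony Algorithm (基于改进的蚁群算法中文词语自动分类技术研究). Cited by: 2
Author: Lai Juan (赖娟). Bulletin of Science and Technology (《科技通报》), PKU Core, 2012, No. 2, pp. 152-154 (3 pages)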
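This paper studies the automatic classification of Chinese words. To address problems such as the low accuracy of Chinese word classification with the traditional ant colony algorithm, it proposes applying an ant colony algorithm to automatic Chinese word classification. The method first computes statistics over a large-scale text corpus to obtain unigram and bigram information for words, and then uses the ant colony algorithm to classify words based on that information. Experimental results show that the proposed algorithm effectively improves the accuracy of word classification.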
Keywords: Chinese word classification; ant colony algorithm; corpus text; mutual information
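3. New Word Identification Based on an Iterative Algorithm (基于迭代算法的新词识别). Cited by: 7
Authors: Zhao Xiaobao (赵小宝), Zhang Huaping (张华平). Computer Engineering (《计算机工程》), CAS, CSCD, 2014, No. 7, pp. 154-158, 164 (6 pages)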
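New word identification is an important foundation of Chinese information processing, but the very strong word-forming capacity of Chinese characters makes new word detection difficult. Inspired by the duality principle, this paper proposes a new word identification algorithm based on iteration. The target corpus is segmented and part-of-speech tagged, and two scanning passes collect string statistics and extract repeated patterns. Combined with features of word structure, features such as the mutual information of repeated patterns, left (right) entropy, and left (right) neighbor right (left) average entropy are applied iteratively to identify new words and obtain a candidate new word list. A Chinese word collocation library is then used for a final filtering pass over the candidate list to produce the final new word list. Experimental results show that with this method P@10 reaches 100% and P@100 rises to 90%, and that left (right) neighbor right (left) average entropy can improve the accuracy of new word identification to some extent.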
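Keywords: duality principle; new word identification; iterative algorithm; information entropy; repeated pattern; Chinese word collocation library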
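Two of the features named in the abstract, left (right) entropy and mutual information, have standard textbook definitions that can be sketched briefly. The code below is a minimal illustration under stated assumptions, not the paper's implementation: probabilities are estimated from raw character-level counts, the helper names `candidate_stats` and `pmi` are hypothetical, and the iterative procedure and the "left (right) neighbor right (left) average entropy" feature are not reproduced because the abstract does not define them.

```python
import math
from collections import Counter

def entropy(counter):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counter.values())

def candidate_stats(text, candidate):
    """Left and right neighbor entropy over every occurrence of candidate."""
    left, right = Counter(), Counter()
    start = text.find(candidate)
    while start != -1:
        end = start + len(candidate)
        if start > 0:
            left[text[start - 1]] += 1   # character just before the match
        if end < len(text):
            right[text[end]] += 1        # character just after the match
        start = text.find(candidate, start + 1)
    return {"left_entropy": entropy(left), "right_entropy": entropy(right)}

def pmi(text, a, b):
    """Pointwise mutual information (bits) of a followed by b.

    Assumption: probabilities are raw counts divided by len(text).
    """
    n = len(text)
    joint, count_a, count_b = text.count(a + b), text.count(a), text.count(b)
    if joint == 0:
        return float("-inf")
    return math.log2((joint / n) / ((count_a / n) * (count_b / n)))

stats = candidate_stats("abxabyabz", "ab")
print(stats["left_entropy"], stats["right_entropy"], pmi("abxabyabz", "a", "b"))
```

A string whose neighbors vary widely (high entropy on both sides) and whose parts co-occur more often than chance (high mutual information) is a more plausible word candidate.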
4. Feature study for improving Chinese overlapping ambiguity resolution based on SVM. Cited by: 1
Authors: Xiong Ying (熊英), Zhu Jie (朱杰). Journal of Southeast University (English Edition), EI, CAS, 2007, No. 2, pp. 179-184 (6 pages)
In order to improve Chinese overlapping ambiguity resolution based on a support vector machine, statistical features are studied for representing the feature vectors. First, four statistical parameters (mutual information, accessor variety, two-character word frequency and single-character word frequency) are each used to describe the feature vectors. Then other parameters are added as complementary features to the parameters that obtain the best results, in order to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency and accessor variety obtain an optimum result of 94.39%. Compared with a commonly used word probability model, the accuracy is improved by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
Keywords: support vector machine; Chinese overlapping ambiguity; Chinese word segmentation; word probability model
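Accessor variety, one of the features listed in the abstract, can be sketched in a few lines. The version below follows the commonly used definition (the smaller of the number of distinct left neighbors and distinct right neighbors of a string, with each sentence boundary counted as a distinct neighbor). Whether the paper uses exactly this variant is an assumption, and the function name `accessor_variety` and the sample sentences are hypothetical.

```python
def accessor_variety(sentences, candidate):
    """min(distinct left neighbors, distinct right neighbors) of candidate."""
    left, right = set(), set()
    for idx, sent in enumerate(sentences):
        start = sent.find(candidate)
        while start != -1:
            end = start + len(candidate)
            # Assumption: each sentence boundary counts as its own distinct neighbor
            left.add(sent[start - 1] if start > 0 else ("<s>", idx, start))
            right.add(sent[end] if end < len(sent) else ("</s>", idx, start))
            start = sent.find(candidate, start + 1)
    return min(len(left), len(right))

# Hypothetical sample sentences
sample = ["门把手弄坏了", "小明修好了门把手", "这个门把手很漂亮", "门把手坏了", "小明弄坏了门把手"]
print(accessor_variety(sample, "门把手"))
```

A value like this could serve as one component of the feature vector passed to a classifier; the SVM training itself is outside the scope of this sketch.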
5. Unified Framework of Performing Chinese Word Segmentation and Part-of-Speech Tagging. Cited by: 3
Authors: Zhang Kaixu, Sun Maosong. China Communications, SCIE, CSCD, 2012, No. 3, pp. 1-9 (9 pages)
The paper proposes a unified framework that combines the advantages of the fast one-at-a-time approach and the high-performance all-at-once approach to Chinese Word Segmentation (CWS) and Part-of-Speech (PoS) tagging. In this framework, the input of the PoS tagger is a candidate set of several CWS results provided by the CWS model. The widely used one-at-a-time and all-at-once approaches are two extreme cases of the proposed candidate-based approach. Experiments on Penn Chinese Treebank 5 and Tsinghua Chinese Treebank show that the generalized candidate-based approach outperforms the one-at-a-time approach and even the all-at-once approach. The candidate-based approach is also faster than the time-consuming all-at-once approach. The authors compare three methods of generating the candidate set, based on sentences, words and character intervals. The word-based method turns out to perform best.
Keywords: natural language processing; Chinese word segmentation; PoS tagging; candidate; word lattice
6. Cultural Introduction in English Vocabulary Teaching in China
Author: Chao Lei. Sociology Study, 2018, No. 4, pp. 180-187 (8 pages)
Every language possesses three cardinal elements: phonetic element, lexical element, and grammatical structure, of which lexis is the fundamental pillar that supports the huge system of a language. The close relationship between language and culture is most readily seen in words. In fact, being the most active and elastic element of a language, vocabulary has the greatest culture-loading capacity. Vocabulary teaching is an integral part of foreign language teaching. Its efficiency has a direct relation with the development of the learners' communicative competence. Vocabulary is culture-bound, so it is self-evident that culture introduction is indispensable in teaching. The author attempts to make a comparison between English and Chinese cultures, to make clear how cultural disparities exist in English and Chinese vocabulary, and to put forward some constructive suggestions on how to integrate culture into vocabulary teaching in Chinese schools, so as to promote the efficiency of vocabulary teaching and improve learners' competence in intercultural communication.
Keywords: cultural differences; English vocabulary teaching; principles; approaches
7. A Lexical Comparative Analysis of the English and Chinese Color Terms in Connotation
Author: WANG Bei. Journalism and Mass Communication, 2014, No. 3, pp. 215-223 (9 pages)
Vocabulary is the most active factor in a language, and color words carry abundant meaning. Nature is multicolored, so different nationalities have each formed a unique view of color over their long histories, reflecting rich national cultures. Culture has constrained how the meaning of color words develops, and the cultural meaning of color words in turn reflects rich cultural connotations. Because of different cultural issues, cultural traditions, and cultural psychology, the cultural connotations of English and Chinese color words differ greatly; these particular connotations were formed in different environments by different nationalities. There are many similarities and differences in meaning between English and Chinese color words. Guided by the book An Introduction to English Lexicology, this paper analyzes Chinese and English color terms from the angle of lexicology. After reviewing the research done by some linguists, it starts from the definition and origin of color terms, studies their change and word formation, and ends with some research on color-term idioms, giving a systematic comparison of how Chinese and English color terms developed.
Keywords: color terms; origin; changing; word-formation; idioms
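8. On How Primary School Chinese Teachers Can Teach Words Well (浅谈小学语文教师如何做好词语教学)
Author: Sun Wenbao (孙文保). Good Parents (《好家长》), 2018, No. 25, p. 203 (1 page)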
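For the Chinese language subject, accumulating and mastering vocabulary is very important, and word teaching has always been an important part of primary school Chinese teaching. However, because many teachers currently teach words in a monotonous way, students tend to become bored and fail to gain a deeper understanding of words, which harms their learning and development. Primary school Chinese teaching should therefore adopt sound teaching methods to promote pupils' command of written language and improve their writing ability. The following points offer guidance on how to carry out such teaching.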
Keywords: primary school teaching; Chinese words; teaching methods