
汉语词汇动态属性与变异

Dynamic Attribute and Change of Chinese Lexicon
Abstract This study distinguishes between words and the lexicon: a word is an individual lexical item, and the lexicon is the aggregate of words. Past discussions of differences among lexicons could only list the words that were or were not shared, which gave no overall view of lexical characteristics. Comparing the entries collected in period dictionaries likewise makes it hard to see how the lexicon differs from one period to another. This study examines the Old Chinese digital resources, the Pre-Modern Chinese digital texts, and the Balanced Corpus of Modern Chinese collected by Academia Sinica, the 300 Tang Poems, the 300 Song Lyrics, the 1998 People's Daily news text as word-segmented and tagged at Peking University, and the 1991-2002 news text released by the News Agency of Taiwan. Because the words in these texts number in the thousands and tens of thousands, a small number of meaningful lexical characteristics must be extracted to distinguish one lexicon from another. The crucial distinction lies in how words are used, not in whether particular words exist. Word usage shows up in the speech stream or in text, so the lexical attribute proposed here is called a dynamic attribute. Words in a text can be ranked by number of occurrences, and accumulating from the highest frequency downward gives the percentage of all word occurrences in the text that the top-ranked words account for. The cumulative percentage of the 15 most frequent words is taken as the concentration level of high-frequency words, and this concentration level serves as the dynamic lexical characteristic. Computed from the texts, it clearly separates the lexicons of the Old Chinese, Pre-Modern Chinese, Modern Chinese, poetry, and news texts examined. It is hoped that this quantitative lexical attribute will be of some use in future lexical research.
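The concentration measure described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes the text has already been segmented into word tokens (the abstract does not say how segmentation was done for each corpus) and that percentages are taken over all word occurrences in the text. The function names `cumulative_percentages` and `high_frequency_concentration` are hypothetical.

```python
from collections import Counter


def cumulative_percentages(tokens, top_n=15):
    """Rank words by frequency and accumulate their share of all tokens.

    Returns a list of (rank, word, frequency, cumulative_percent) rows
    for the top_n most frequent words. Assumes `tokens` is a non-empty
    list of already-segmented words (an assumption of this sketch).
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    counts = Counter(tokens)
    total = sum(counts.values())  # total word occurrences in the text
    rows = []
    running = 0
    for rank, (word, freq) in enumerate(counts.most_common(top_n), start=1):
        running += freq
        rows.append((rank, word, freq, 100.0 * running / total))
    return rows


def high_frequency_concentration(tokens, top_n=15):
    """Cumulative percentage covered by the top_n most frequent words."""
    return cumulative_percentages(tokens, top_n)[-1][3]


if __name__ == "__main__":
    # Toy token list, invented purely for illustration.
    sample = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
    for row in cumulative_percentages(sample, top_n=2):
        print(row)
    print(high_frequency_concentration(sample, top_n=2))
```

With the default `top_n=15`, the value returned by `high_frequency_concentration` corresponds to the cumulative percentage of the 15 most frequent words. Comparing that single number across token lists from different corpora is the kind of comparison the abstract describes.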
Author 郑锦全
Source 《语言学论丛》 (Essays on Linguistics), CSSCI, 2017, No. 2, pp. 1-19 (19 pages)
Keywords word and lexicon; lexical dynamic attribute; cumulative frequency percentage; concentration level of high frequency words
