Abstract
Beginning with an analysis of existing methods for measuring lexical similarity among dialects, this paper shows that both the Wang-Shen measurement and the weighted average method have shortcomings. On this basis, it proposes a new algorithm, morpheme weighting. Morpheme weighting fixes the weight of every word entry at 1 and weights each morpheme within a word according to its importance; building on the morpheme weights, it also takes word formation into account when computing similarity. Using this method, the paper calculates the lexical similarity among Mandarin, the Guangzhou dialect, and seven Siyi dialect localities, and performs a cluster analysis of the Siyi dialects on that basis. The results show that the relations among the Siyi dialects mirror their relative geographical positions, and that lexical similarity reflects a synchronic rather than a diachronic relation.
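The abstract gives only the outline of morpheme weighting, so the following is a minimal sketch of the general idea, not the paper's actual procedure. It assumes that the morpheme weights inside each word sum to 1 (so every word entry carries total weight 1), that two morphemes match when their forms are identical, and that the word-formation adjustment is omitted entirely. The names `Morpheme`, `word_similarity`, and `lexical_similarity`, as well as the placeholder forms and weights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str      # the morpheme itself (placeholder strings in this sketch)
    weight: float  # assumed importance weight; weights within one word sum to 1


def shared_weight(a, b):
    """Total weight of the morphemes of word `a` whose form also occurs in word `b`."""
    forms_b = {m.form for m in b}
    return sum(m.weight for m in a if m.form in forms_b)


def word_similarity(a, b):
    """Symmetric word-level similarity in [0, 1]: the mean of both directions.

    Assumption: word formation is ignored here, although the paper takes it into account.
    """
    return (shared_weight(a, b) + shared_weight(b, a)) / 2


def lexical_similarity(dialect_a, dialect_b):
    """Mean word similarity over the word entries present in both dialects.

    Each entry counts once, which mirrors the fixed per-entry weight of 1.
    """
    entries = dialect_a.keys() & dialect_b.keys()
    if not entries:
        raise ValueError("the two dialects share no word entries")
    total = sum(word_similarity(dialect_a[e], dialect_b[e]) for e in entries)
    return total / len(entries)


# Toy data with invented placeholder morphemes and weights (not from the paper).
dialect_x = {
    "entry1": [Morpheme("m1", 0.7), Morpheme("m2", 0.3)],
    "entry2": [Morpheme("m3", 1.0)],
}
dialect_y = {
    "entry1": [Morpheme("m1", 0.7), Morpheme("m4", 0.3)],
    "entry2": [Morpheme("m3", 1.0)],
}

if __name__ == "__main__":
    print(round(lexical_similarity(dialect_x, dialect_y), 2))
```

In this toy case the first entry shares only its heavier morpheme and the second entry matches completely, so a difference in a low-weight morpheme lowers the score less than a difference in a high-weight one. A pairwise matrix of such scores is the kind of input a cluster analysis would take.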
Source
《中国语文》
CSSCI
PKU Core (Peking University Core Journals)
2017, No. 6, pp. 693-703 (11 pages)
Studies of the Chinese Language
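Funding
Supported by a Guangdong Province "Innovation and Strong Universities" Characteristic Innovation Project, "Research on the Dialects and Local Culture of the Wuyi Hometowns of Overseas Chinese" (Project No. 2015WTSCX105)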
Keywords
vocabulary
lexical similarity
morpheme weighting
Siyi dialects