
An Optimized Index Model for Large Tibetan Dictionaries (in English)

THE OPTIMIZED INDEX MODEL OF TIBETAN DICTIONARY
Abstract A traditionally ordered Tibetan dictionary built on the current Tibetan coded character sets (ISO/IEC 10646-1:1993 and GB 16959-1997) has a hashing structure and cannot be indexed effectively, because the internal character codes used by computers do not follow the traditional order. This paper establishes a mapping between Tibetan letters and numerical codes, supported by an analysis of the structural rules of Tibetan words. Based on a statistical analysis of the syllable distribution in a large Tibetan dictionary, we design an optimized multi-level index scheme for dictionary data retrieval. Its core is layer-by-layer processing of the basic consonant and vowel letters, together with a matching method based on the code prefixes of words. Finally, we propose the concept of a "bucket" to handle the homographs encountered in data retrieval.
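The retrieval idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the letter table, the class name `DictionaryIndex`, the `prefix_len` parameter, and the representation of a word as a list of romanized letter names are all assumptions made for illustration. The table covers only the first eight basic consonants and the four vowel signs, and the sketch skips the structural analysis of Tibetan words that the paper relies on.

```python
from collections import defaultdict

# Hypothetical letter-to-code table: the first eight basic consonants in
# traditional order, followed by the four vowel signs. A real table would
# cover every letter and account for the structure of the syllable.
LETTER_CODES = {name: i + 1 for i, name in enumerate(
    ["ka", "kha", "ga", "nga", "ca", "cha", "ja", "nya", "i", "u", "e", "o"])}


def encode(letters):
    """Map a word, given as a sequence of letter names, to a tuple of codes."""
    return tuple(LETTER_CODES[name] for name in letters)


class DictionaryIndex:
    """Two-level index: code prefix -> full code -> bucket of entries."""

    def __init__(self, prefix_len=2):
        self.prefix_len = prefix_len
        self.blocks = defaultdict(dict)

    def add(self, letters, entry):
        code = encode(letters)
        block = self.blocks[code[:self.prefix_len]]
        # Homographs share one code, so they land in the same bucket.
        block.setdefault(code, []).append(entry)

    def lookup(self, letters):
        code = encode(letters)
        # Match on the code prefix first, then on the full code.
        block = self.blocks.get(code[:self.prefix_len], {})
        return block.get(code, [])

    def ordered_codes(self):
        """All stored codes in ascending numerical (traditional) order."""
        return sorted(code for block in self.blocks.values() for code in block)


index = DictionaryIndex()
index.add(["ka", "i"], "sense A")
index.add(["ka", "i"], "sense B")  # homograph of the previous entry
index.add(["kha"], "another word")
print(index.lookup(["ka", "i"]))
```

Because the codes follow the traditional letter order, sorting the code tuples yields dictionary order directly, which the raw character codes of the cited character sets do not provide according to the abstract.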
Authors 康才畯 (Kang Caijun), 江荻 (Jiang Di)
Source Studies in Language and Linguistics (《语言研究》), CSSCI, PKU Core, 2004, No. 1, pp. 120-125 (6 pages)
Keywords modern Tibetan; multi-level index; code prefixes; character numerical codes; Tibetan dictionary; optimized index model
References: 2

Secondary references: 6

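  • 1 [1] National Standard of China. Information Technology, Tibetan Coded Character Set for Information Interchange, Basic Set (GB6959 [sic]). Beijing: Standards Press of China, 1997
  • 2 [2] Zhang Yisun (张怡荪). 藏汉大词典 (Great Tibetan-Chinese Dictionary). Beijing: Nationalities Publishing House, 1985
  • 3 [3] Zhou Jiwen (周季文). 藏文拼音教材 (Tibetan Spelling Textbook). Beijing: Nationalities Publishing House, 1983
  • 4 National Standard of PRC. Information Technology, Tibetan Coded Character Sets for Information Interchange, Basic Set (GB 16959-1997). Beijing: Standards Press of China, 1998 (in Chinese)
  • 5 ISO/IEC 10646-1:1993. Information Technology - Universal Multiple-Octet Coded Character Set (UCS)
  • 6 Jiang Di (江荻), Zhou Jiwen (周季文). 论藏文的序性及排序方法 (On the ordering of Tibetan script and sorting methods) [J]. Journal of Chinese Information Processing (中文信息学报), 2000, 14(1): 56-64. Cited by: 34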
