
Generating the Finite State Machines of Uyghur Verb Aspect Categories (维吾尔语动词体范畴的有限状态自动机的构建)

Cited by: 4
Abstract: The verb aspect category is among the most complex grammatical categories of the Uyghur verb and one of the hard problems in Uyghur language processing. A computer can process verb aspect only after grammatical categories such as person, tense, and negation have been processed. The main difficulty is the overlapping of aspect suffixes. Because the aspect suffixes attach to the verb stem according to definite rules, the overlapping forms of Uyghur verb aspect can be described formally with finite state automata. The paper therefore first constructs a right-to-left nondeterministic automaton from the overlapping rules, then converts it into a left-to-right nondeterministic automaton, and finally converts that nondeterministic automaton into a deterministic one, which yields the formal description of Uyghur verb aspect.
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journals, 2012, No. 4, pp. 61-65, 84 (6 pages)
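Funding: Ministry of Education Humanities and Social Sciences Youth Fund, 2011 (11YJC740001); National Social Science Fund of China (10AYY006); Key Research Base Fund for Humanities and Social Sciences of Regular Higher Education Institutions of the Xinjiang Uyghur Autonomous Region (010812B04)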
Keywords: Uyghur language; verb; aspect category; finite state machine; formalization
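
The pipeline named in the abstract (a right-to-left nondeterministic automaton, reversed into a left-to-right one, then determinized) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the suffix labels `A` and `B`, the state names, and the toy overlapping rules are hypothetical placeholders, not real Uyghur aspect suffixes, and the automaton is assumed to have no epsilon transitions. Reversal flips every edge and swaps start and final states; determinization uses the standard subset construction.

```python
from collections import defaultdict


def reverse_nfa(transitions, starts, finals):
    """Reverse an NFA: flip every edge and swap start and final states."""
    reversed_transitions = defaultdict(set)
    for (src, symbol), targets in transitions.items():
        for dst in targets:
            reversed_transitions[(dst, symbol)].add(src)
    return dict(reversed_transitions), set(finals), set(starts)


def determinize(transitions, starts, finals):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    alphabet = {symbol for (_, symbol) in transitions}
    start = frozenset(starts)
    dfa, dfa_finals = {}, set()
    pending, seen = [start], {start}
    while pending:
        current = pending.pop()
        if current & finals:
            dfa_finals.add(current)
        for symbol in alphabet:
            nxt = frozenset(
                t for s in current for t in transitions.get((s, symbol), ())
            )
            if not nxt:
                continue  # no move on this symbol: leave it undefined
            dfa[(current, symbol)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return dfa, start, dfa_finals


def accepts(dfa, start, finals, symbols):
    """Run the DFA over a sequence of morpheme labels."""
    state = start
    for symbol in symbols:
        state = dfa.get((state, symbol))
        if state is None:
            return False
    return state in finals


# Hypothetical right-to-left automaton: it reads a word from its last suffix
# back to the stem. Toy rules: STEM+A, STEM+A+B and STEM+B are well formed.
rtl_transitions = {
    ("end", "A"): {"after_A"},
    ("end", "B"): {"after_B"},
    ("after_B", "A"): {"after_A"},
    ("after_A", "STEM"): {"stem"},
    ("after_B", "STEM"): {"stem"},
}

# Step 2: reverse it into a left-to-right NFA (it becomes nondeterministic
# on STEM). Step 3: determinize the result.
ltr_transitions, ltr_starts, ltr_finals = reverse_nfa(
    rtl_transitions, {"end"}, {"stem"}
)
dfa, dfa_start, dfa_finals = determinize(ltr_transitions, ltr_starts, ltr_finals)
```

With these toy rules, `accepts(dfa, dfa_start, dfa_finals, ["STEM", "A", "B"])` holds while the order `["STEM", "B", "A"]` is rejected, which is the kind of ordering constraint an overlapping rule would encode.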


