Abstract
Segmenting and labeling continuous speech corpora is a new area of work, and it matters for making full use of such corpora. How to do this work well remains open to discussion. This paper reports on the segmentation and labeling we carried out on a Standard Chinese corpus, from which we derived some preliminary labeling rules and segmentation units. We present them here for discussion with colleagues, and we hope to receive suggestions so that the work can be improved.
Source
《语言文字应用》
CSSCI
Peking University Core Journals (北大核心)
2000, No. 2, pp. 78-82 (5 pages)
Applied Linguistics