Abstract
We report experiments that apply standard natural language processing techniques to song lyrics. Lyrics carry much of a song's semantic content, so lyric analysis can complement acoustic processing of the audio. We examine the statistical properties of a lyrics corpus against Zipf's law, using both characters and words as units, and find that the distributions largely conform to Zipf's law. Using a vector space model representation, we find sets of lyrics that are similar to one another. We also discuss how the time annotations in lyrics can support further analysis, such as detecting repeated segments within a song, segmenting rhythm, and retrieving songs. Preliminary experiments show that the proposed methods are effective to some degree.
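As a rough illustration of the two measurements the abstract names, the sketch below estimates a Zipf exponent from a token list and compares two lyrics by cosine similarity in a vector space model. The function names `zipf_slope` and `cosine`, the least-squares fit on log rank versus log frequency, and the raw term-count weighting are all assumptions made for this example; the abstract does not say how the authors fitted the distribution or weighted terms.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is what Zipf's law predicts. Hypothetical helper,
    not the authors' procedure.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct tokens")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def cosine(a, b):
    """Cosine similarity of two token lists, using raw term counts.

    Raw counts are an assumption; other weightings (e.g. tf-idf) are
    equally plausible for a vector space model.
    """
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0
```

For character-level statistics, a lyric string can be passed as `list(text)`; word-level statistics need a segmenter first, which is outside this sketch. Ranking all lyrics by `cosine` against a query and keeping the top k gives a simple k-nearest-neighbour lookup.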
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journals (北大核心)
2007, No. 5, pp. 61-67 (7 pages)
Funding
National Natural Science Foundation of China (grants 60573187, 60621062, 60520130299)
Keywords
computer application
Chinese information processing
song lyrics
Zipf's law
k-NN
rhythm