Dance2MIDI: Dance-driven multi-instrument music generation
Authors: Bo Han, Yuheng Li, Yixuan Shen, Yi Ren, Feilin Han. Computational Visual Media (SCIE, EI, CSCD), 2024, Issue 4, pp. 791-802 (12 pages).
Dance-driven music generation aims to generate musical pieces conditioned on dance videos. Previous works focus on monophonic or raw audio generation, while the multi-instrument scenario is under-explored. The challenges associated with dance-driven multi-instrument music (MIDI) generation are twofold: (i) the lack of a publicly available multi-instrument MIDI and video paired dataset and (ii) the weak correlation between music and video. To tackle these challenges, we have built the first multi-instrument MIDI and dance paired dataset (D2MIDI). Based on this dataset, we introduce a multi-instrument MIDI generation framework (Dance2MIDI) conditioned on dance video. Firstly, to capture the relationship between dance and music, we employ a graph convolutional network to encode the dance motion, which allows us to extract features related to dance movement and dance style. Secondly, to generate a harmonious rhythm, we utilize a transformer model to decode the drum track sequence, leveraging a cross-attention mechanism. Thirdly, we model the task of generating the remaining tracks based on the drum track as a sequence understanding and completion task: a BERT-like model is employed to comprehend the context of the entire music piece through self-supervised learning. We evaluate the music generated by our framework trained on the D2MIDI dataset and demonstrate that our method achieves state-of-the-art performance.
Keywords: video understanding; music generation; symbolic music; cross-modal learning; self-supervision
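
For orientation only, the sketch below shows how the two conditioning stages described in the abstract (a graph-convolutional motion encoder and a drum-track transformer decoder with cross-attention) might be wired up in PyTorch. It is not the authors' implementation: the module names, hidden sizes, skeleton adjacency, and drum-token vocabulary are placeholder assumptions, and the BERT-like track-completion stage is omitted.

```python
# Illustrative sketch only; all names, shapes, and the skeleton graph are assumed.
import torch
import torch.nn as nn


class MotionGCNEncoder(nn.Module):
    """Encode per-frame skeleton joints with one graph-convolution step."""

    def __init__(self, num_joints: int, in_dim: int, hid_dim: int,
                 adjacency: torch.Tensor):
        super().__init__()
        # Row-normalized joint adjacency (fixed skeleton topology, assumed).
        self.register_buffer("adj", adjacency / adjacency.sum(-1, keepdim=True))
        self.proj = nn.Linear(in_dim, hid_dim)

    def forward(self, joints: torch.Tensor) -> torch.Tensor:
        # joints: (batch, frames, num_joints, in_dim)
        x = self.proj(joints)                            # per-joint features
        x = torch.einsum("jk,btkd->btjd", self.adj, x)   # aggregate neighboring joints
        return x.mean(dim=2)                             # (batch, frames, hid_dim)


class DrumTrackDecoder(nn.Module):
    """Autoregressive transformer decoder that cross-attends to motion features."""

    def __init__(self, vocab_size: int, hid_dim: int, num_layers: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hid_dim)
        layer = nn.TransformerDecoderLayer(d_model=hid_dim, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)
        self.head = nn.Linear(hid_dim, vocab_size)

    def forward(self, drum_tokens: torch.Tensor,
                motion_feats: torch.Tensor) -> torch.Tensor:
        # drum_tokens: (batch, seq) token ids; motion_feats: (batch, frames, hid_dim)
        tgt = self.embed(drum_tokens)
        seq_len = tgt.size(1)
        # Causal mask so each drum token only attends to earlier tokens.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                     device=tgt.device), diagonal=1)
        out = self.decoder(tgt, motion_feats, tgt_mask=mask)  # cross-attention to motion
        return self.head(out)                                 # next-token logits


# Toy usage with random tensors (shapes only, no trained weights).
if __name__ == "__main__":
    J, D, H, V = 17, 3, 256, 512                    # joints, coord dim, hidden, drum vocab
    adj = torch.eye(J) + torch.rand(J, J).round()   # placeholder skeleton graph
    encoder = MotionGCNEncoder(J, D, H, adj)
    decoder = DrumTrackDecoder(V, H)
    motion = encoder(torch.randn(2, 120, J, D))     # 2 clips, 120 frames
    logits = decoder(torch.randint(0, V, (2, 64)), motion)
    print(logits.shape)                             # torch.Size([2, 64, 512])
```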