Separated examples by vocal ensemble separation methods using our jaCappella corpus

Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji (The University of Tokyo), Satoru Fukayama (AIST), Hiroshi Saruwatari (The University of Tokyo)

This is the demo page for our ICASSP 2023 paper [1]. It presents audio examples separated by vocal ensemble separation methods using our jaCappella corpus. The project page of the corpus is here.


Mixture | Voice part | Ground Truth | X-UMX [2] | DPTNet [3] | MRDLA [4]
Dongurikorokoro
(どんぐりころころ in Japanese)
Genre: bossa nova
Lead vocal
Soprano
Alto
Tenor
Bass
Vocal percussion
Hiraitahiraita
(ひらいたひらいた in Japanese)
Genre: enka
Lead vocal
Soprano
Alto
Tenor
Bass
Vocal percussion
Otamajakushi
(お玉じゃくし in Japanese)
Genre: jazz
Lead vocal
Soprano
Alto
Tenor
Bass
Vocal percussion

References

[1] Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji, Satoru Fukayama, and Hiroshi Saruwatari, “jaCappella corpus: A Japanese a cappella vocal ensemble corpus,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Jun. 2023.
arXiv, demo, code, dataset
[2] https://github.com/asteroid-team/asteroid/tree/master/egs/musdb18/X-UMX
[3] S. Sarkar, E. Benetos, and M. Sandler, “Vocal harmony separation using time-domain neural networks,” in Proc. INTERSPEECH, 2021, pp. 3515–3519.
[4] Tomohiko Nakamura, Shihori Kozuka, and Hiroshi Saruwatari, “Time-domain audio source separation with neural networks based on multiresolution analysis,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1687–1701, Apr. 2021.
slides, poster, demo, code, [The 17th Itakura Prize Innovative Young Researcher Award, Acoustical Society of Japan]