Demonstration of Automatic Musical Key Transposition Using Harmonic-Temporal Factor Decomposition

Tomohiko Nakamura (The University of Tokyo) and Hirokazu Kameoka (The University of Tokyo/Nippon Telegraph and Telephone Corporation)

We demonstrate the effectiveness of the methods proposed in [1] on an automatic musical key transposition task.
  • Proposed methods: Harmonic-Temporal Factor Decomposition (HTFD) and its source-filter extension (SF-HTFD)
  • Conventional method: Harmonic Non-Negative Matrix Factorization (HNMF)
The automatic musical key transposition procedure is as follows:
  1. Given a music audio signal and its key, we separated the magnitude spectrogram of the signal into the magnitude spectrograms of the individual semitones using one of the separation methods (HTFD, SF-HTFD, or HNMF).
  2. We then transposed the separated components of the semitones that differ between the original and target keys by shifting the corresponding spectrograms along the log-frequency axis.
  3. We summed all the magnitude spectrograms, shifted and unshifted, to obtain the pitch-transposed magnitude spectrogram.
  4. Finally, we converted the pitch-transposed magnitude spectrogram into a time-domain signal using the spectrogram inversion algorithm for the fast approximate continuous wavelet transform presented in [2].
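Steps 2 and 3 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the separated components are stored as an array of shape (pitches, log-frequency bins, frames), that each component is labelled with a MIDI note number, and that the log-frequency axis has a whole number of bins per semitone. It also assumes that a parallel major/natural-minor change moves only the third, sixth, and seventh scale degrees by one semitone and leaves every other pitch untouched. All function and parameter names are hypothetical. Separation (step 1) and spectrogram inversion (step 4) are not covered.

```python
import numpy as np

# Pitch-class names indexed by semitones above C (flat spellings as on this page).
PITCH_CLASSES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

# Semitone offsets above the tonic of scale degrees 3, 6, and 7.
MAJOR_DEGREES = (4, 9, 11)   # lowered by one semitone for major -> natural minor
MINOR_DEGREES = (3, 8, 10)   # raised by one semitone for natural minor -> major


def semitone_shifts(tonic, to_minor):
    """Map pitch class (0-11) to its shift in semitones for a parallel mode change."""
    root = PITCH_CLASSES.index(tonic)
    if to_minor:
        return {(root + d) % 12: -1 for d in MAJOR_DEGREES}
    return {(root + d) % 12: 1 for d in MINOR_DEGREES}


def shift_log_frequency(spec, bins):
    """Shift a (freq_bins, frames) spectrogram by `bins` along the frequency axis.

    Positive values shift upwards; bins shifted in from outside are zero.
    """
    out = np.zeros_like(spec)
    if bins > 0:
        out[bins:] = spec[:-bins]
    elif bins < 0:
        out[:bins] = spec[-bins:]
    else:
        out[:] = spec
    return out


def transpose_key(components, midi_pitches, shifts, bins_per_semitone):
    """Shift the affected semitone components and sum all components (steps 2 and 3).

    components: array of shape (n_pitches, freq_bins, frames)
    midi_pitches: MIDI note number of each component (assumed labelling)
    shifts: output of semitone_shifts
    """
    total = np.zeros_like(components[0])
    for comp, pitch in zip(components, midi_pitches):
        step = shifts.get(pitch % 12, 0)  # pitches not listed stay unchanged
        total += shift_log_frequency(comp, step * bins_per_semitone)
    return total


# Toy example: C4, E4, and G4 components on a 12-bin axis with 1 bin per semitone.
toy = np.zeros((3, 12, 2))
toy[0, 0] = toy[1, 4] = toy[2, 7] = 1.0
transposed = transpose_key(toy, [60, 64, 67], semitone_shifts("C", True), 1)
```

In the toy example, which transposes C major to C natural minor, the E component moves one bin down (to E-flat) while the C and G components stay in place. Zero-filling at the edges is a simplification chosen for this sketch.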
Several samples obtained with each method are shown below. The keys of the original musical pieces are taken from the corresponding datasets.

HNMF | HTFD (Proposed) | SF-HTFD (Proposed)
01-AchGottundHerr from Bach10 dataset [3]
Key: C Major → C Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches
02-AchLiebenChristen from Bach10 dataset [3]
Key: A Minor → A Major
Transposed
Pitches to be transposed
Unchanged pitches
03-ChristederdubistTagundLicht from Bach10 dataset [3]
Key: Bb Major → Bb Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches
04-ChristeDuBeistand from Bach10 dataset [3]
Key: C Major → C Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches
09-Jesus from Bach10 dataset [3]
Key: D Minor → D Major
Transposed
Pitches to be transposed
Unchanged pitches
10-NunBitten from Bach10 dataset [3]
Key: A Major → A Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches
AuMix_05_Entertainer_tpt_tpt from URMP dataset [4]
Key: Eb Major → Eb Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches
AuMix_34_Fugue_tpt_tpt_hn_tbn from URMP dataset [4]
Key: F Major → F Natural Minor
Transposed
Pitches to be transposed
Unchanged pitches

References

[1] Tomohiko Nakamura and Hirokazu Kameoka, “Harmonic-temporal factor decomposition for unsupervised monaural separation of harmonic sounds,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 68–82, Nov. 2020.
[2] Tomohiko Nakamura and Hirokazu Kameoka, “Fast signal reconstruction from magnitude spectrogram of continuous wavelet transform based on spectrogram consistency,” in Proceedings of International Conference on Digital Audio Effects, Sep. 2014, pp. 129–135.
[Travel Grant by the Hara Research Foundation]
[3] Zhiyao Duan and Bryan Pardo, “Soundprism: An online system for score-informed source separation of music audio,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 6, pp. 1205–1215, 2011.
[4] Bochen Li, Xinzhao Liu, Karthik Dinesh, Zhiyao Duan, and Gaurav Sharma, “Creating a multitrack classical music performance dataset for multimodal music analysis: Challenges, insights, and applications,” IEEE Transactions on Multimedia, vol. 21, no. 2, pp. 522–535, 2019.