International Conferences & Workshops / 国際会議
- Go Nishikawa, Wataru Nakata, Yuki Saito, Kanami Imamura, Hiroshi Saruwatari, and Tomohiko Nakamura, “Multi-sampling-frequency naturalness MOS prediction using self-supervised learning model with sampling-frequency-independent layer,” in Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop, Dec. 2025. (First and second authors contributed equally.)
bib - Ryan Niu, Shoichi Koyama, and Tomohiko Nakamura, “Head-related transfer function individualization using anthropometric features and spatially independent latent representations,” in Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 2025.
bib - Hitoshi Suda, Junya Koguchi, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, and Jun Ogata, “IdolSongsJp corpus: A multi-singer song corpus in the style of Japanese idol groups,” in Proceedings of International Society for Music Information Retrieval Conference, Sep. 2025.
bib arXiv - Kanami Imamura, Tomohiko Nakamura, Norihiro Takamune, Kohei Yatabe, and Hiroshi Saruwatari, “Local equivariance error-based metrics for evaluating sampling-frequency-independent property of neural network,” in Proceedings of European Signal Processing Conference, Sep. 2025.
bib arXiv - Aogu Wada, Tomohiko Nakamura, and Hiroshi Saruwatari, “Hyperbolic embeddings for order-aware classification of audio effect chains,” in Proceedings of International Conference on Digital Audio Effects, Sep. 2025.
bib arXiv - Yuto Ishikawa, Tomohiko Nakamura, Norihiro Takamune, and Hiroshi Saruwatari, “Hearing-aids system using distributed assistive device and blind speech extraction method under diffuse noise,” in Proceedings of International Congress on Acoustics, May 2025.
bib - Tomohiko Nakamura, Kwanghee Choi, Keigo Hojo, Yoshiaki Bando, Satoru Fukayama, and Shinji Watanabe, “Discrete speech unit extraction via independent component analysis,” in Proceedings of SALMA: Speech and Audio Language Models - Architectures, Data Sources, and Training Paradigms, IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, Apr. 2025.
bib arXiv poster code - Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, “Real-time noise estimation for Lombard-effect speech synthesis in human–avatar dialogue systems,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2024.
bib paper - Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura, Junya Koguchi, and Hiroshi Saruwatari, “DNN-based ensemble singing voice synthesis with interactions between singers,” in Proceedings of IEEE Spoken Language Technology Workshop, Dec. 2024, pp. 660–667.
bib arXiv code - Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, and Jun Ogata, “FruitsMusic: A real-world corpus of Japanese idol-group songs,” in Proceedings of International Society for Music Information Retrieval Conference, Nov. 2024.
bib arXiv dataset - Yuta Amezawa, Tomohiko Nakamura, Satoru Fukayama, Takahiro Shiina, and Takahiko Uchide, “Automatic extraction and peak arrival estimation of later phase in S coda,” in International Joint Workshop on Slow-to-Fast Earthquakes 2024, Sep. 2024.
bib - Yuto Ishikawa, Tomohiko Nakamura, Norihiro Takamune, and Hiroshi Saruwatari, “Real-time framework for speech extraction based on independent low-rank matrix analysis with spatial regularization and rank-constrained spatial covariance matrix estimation,” in Workshop on Spoken Dialogue Systems for Cybernetic Avatars (SDS4CA), Sep. 2024. (Presentation only)
bib - Kwanghee Choi, Ankita Pasad, Tomohiko Nakamura, Satoru Fukayama, Karen Livescu, and Shinji Watanabe, “Self-supervised speech representations are more phonetic than semantic,” in Proceedings of INTERSPEECH, Sep. 2024, pp. 4578–4582.
bib arXiv code - Yoshiaki Bando, Tomohiko Nakamura, and Shinji Watanabe, “Neural blind source separation and diarization for distant speech recognition,” in Proceedings of INTERSPEECH, Sep. 2024, pp. 722–726.
bib arXiv demo code - Yuto Ishikawa, Kohei Konaka, Tomohiko Nakamura, Norihiro Takamune, and Hiroshi Saruwatari, “Real-time speech extraction using spatially regularized independent low-rank matrix analysis and rank-constrained spatial covariance matrix estimation,” in Proceedings of Hands-Free Speech Communication and Microphone Arrays, IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, Apr. 2024, pp. 730–734.
bib arXiv - Kanami Imamura, Tomohiko Nakamura, Norihiro Takamune, Kohei Yatabe, and Hiroshi Saruwatari, “Algorithms of sampling-frequency-independent layers for non-integer strides,” in Proceedings of European Signal Processing Conference, Sep. 2023, pp. 326–330.
bib arXiv - Joonyong Park, Shinnosuke Takamichi, Tomohiko Nakamura, Kentaro Seki, Detai Xin, and Hiroshi Saruwatari, “How generative spoken language model encodes noisy speech: Investigation from phonetics to syntactics,” in Proceedings of INTERSPEECH, Aug. 2023, pp. 1085–1089.
bib arXiv - Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji, Satoru Fukayama, and Hiroshi Saruwatari, “jaCappella corpus: A Japanese a cappella vocal ensemble corpus,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Jun. 2023.
bib arXiv demo code dataset - Kota Arai, Yutaro Hirao, Takuji Narumi, Tomohiko Nakamura, Shinnosuke Takamichi, and Shigeo Yoshida, “TimToShape: Supporting practice of musical instruments by visualizing timbre with 2D shapes based on crossmodal correspondences,” in Proceedings of ACM Conference on Intelligent User Interfaces, Mar. 2023, pp. 850–865.
bib blog - Futa Nakashima, Tomohiko Nakamura, Norihiro Takamune, Satoru Fukayama, and Hiroshi Saruwatari, “Hyperbolic timbre embedding for musical instrument sound synthesis based on variational autoencoders,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Nov. 2022, pp. 736–743.
bib arXiv - Yuki Ito, Tomohiko Nakamura, Shoichi Koyama, and Hiroshi Saruwatari, “Head-related transfer function interpolation from spatially sparse measurements using autoencoder with source position conditioning,” in Proceedings of International Workshop on Acoustic Signal Enhancement, Sep. 2022.
[Finalist of Best Student Paper Award of IWAENC 2022 (Yuki Ito)]
bib arXiv slides demo code - Kazuhide Shigemi, Shoichi Koyama, Tomohiko Nakamura, and Hiroshi Saruwatari, “Physics-informed convolutional neural network with bicubic spline interpolation for sound field estimation,” in Proceedings of International Workshop on Acoustic Signal Enhancement, Sep. 2022.
bib arXiv - Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, and Hiroshi Saruwatari, “SelfRemaster: Self-supervised speech restoration with analysis-by-synthesis approach using channel modeling,” in Proceedings of INTERSPEECH, Sep. 2022, pp. 4406–4410.
bib arXiv demo code - Masaya Kawamura, Tomohiko Nakamura, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, and Kazunobu Kondo, “Differentiable digital signal processing mixture model for synthesis parameter extraction from mixture of harmonic sounds,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2022, pp. 941–945.
[16th IEEE Signal Processing Society Japan Student Conference Paper Award (Awardee: Masaya Kawamura)]
bib arXiv demo - Takuya Hasumi, Tomohiko Nakamura, Norihiro Takamune, Hiroshi Saruwatari, Daichi Kitamura, Yu Takahashi, and Kazunobu Kondo, “Multichannel audio source separation with independent deeply learned matrix analysis using product of source models,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2021, pp. 1226–1233.
bib paper arXiv - Sota Misawa, Norihiro Takamune, Tomohiko Nakamura, Daichi Kitamura, Hiroshi Saruwatari, Masakazu Une, and Shoji Makino, “Speech enhancement by noise self-supervised rank-constrained spatial covariance matrix estimation via independent deeply learned matrix analysis,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2021, pp. 578–584.
bib paper arXiv - Yusaku Mizobuchi, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Yu Takahashi, and Kazunobu Kondo, “Prior distribution design for music bleeding-sound reduction based on nonnegative matrix factorization,” in Proceedings of Asia Pacific Signal and Information Processing Association Annual Summit and Conference, Dec. 2021, pp. 651–658.
bib paper arXiv - Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Yuma Koizumi, and Hiroshi Saruwatari, “Sampling-frequency-independent audio source separation using convolution layer based on impulse invariant method,” in Proceedings of European Signal Processing Conference, Aug. 2021, pp. 321–325.
bib arXiv - Naoki Narisawa, Rintaro Ikeshita, Norihiro Takamune, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, and Tomohiro Nakatani, “Independent deeply learned tensor analysis for determined audio source separation,” in Proceedings of European Signal Processing Conference, Aug. 2021, pp. 326–330.
bib arXiv - Takuya Hasumi, Tomohiko Nakamura, Norihiro Takamune, Hiroshi Saruwatari, Daichi Kitamura, Yu Takahashi, and Kazunobu Kondo, “Empirical Bayesian independent deeply learned matrix analysis for multichannel audio source separation,” in Proceedings of European Signal Processing Conference, Aug. 2021, pp. 331–335.
bib arXiv - Shihori Kozuka, Tomohiko Nakamura, and Hiroshi Saruwatari, “Investigation on wavelet basis function of DNN-based time domain audio source separation inspired by multiresolution analysis,” in Proceedings of International Congress and Exposition on Noise Control Engineering, Aug. 2020, pp. 4013–4022.
bib paper - Tomohiko Nakamura and Hiroshi Saruwatari, “Time-domain audio source separation based on Wave-U-Net combined with discrete wavelet transform,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2020, pp. 386–390.
bib arXiv - Tomohiko Nakamura and Hirokazu Kameoka, “Shifted and convolutive source-filter non-negative matrix factorization for monaural audio source separation,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Mar. 2016, pp. 489–493.
bib - Tomohiko Nakamura and Hirokazu Kameoka, “Lp-norm non-negative matrix factorization and its application to singing voice enhancement,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr. 2015, pp. 2115–2119.
bib - Tomohiko Nakamura, Kotaro Shikata, Norihiro Takamune, and Hirokazu Kameoka, “Harmonic-temporal factor decomposition incorporating music prior information for informed monaural source separation,” in Proceedings of International Society for Music Information Retrieval Conference, Oct. 2014, pp. 623–628.
[Travel Grant by the Tateishi Science and Technology Foundation]
bib paper demo - Tomohiko Nakamura and Hirokazu Kameoka, “Fast signal reconstruction from magnitude spectrogram of continuous wavelet transform based on spectrogram consistency,” in Proceedings of International Conference on Digital Audio Effects, Sep. 2014, pp. 129–135.
[Travel Grant by the Hara Research Foundation]
bib paper demo - Takuya Higuchi, Hirofumi Takeda, Tomohiko Nakamura, and Hirokazu Kameoka, “A unified approach for underdetermined blind signal separation and source activity detection by multichannel factorial hidden Markov models,” in Proceedings of INTERSPEECH, Sep. 2014, pp. 850–854.
bib paper - Tomohiko Nakamura, Hirokazu Kameoka, Kazuyoshi Yoshii, and Masataka Goto, “Timbre replacement of harmonic and drum components for music audio signals,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2014, pp. 7520–7524.
bib demo - Takuya Higuchi, Norihiro Takamune, Tomohiko Nakamura, and Hirokazu Kameoka, “Underdetermined blind separation and tracking of moving sources based on DOA-HMM,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, May 2014, pp. 3215–3219.
bib - Shigeki Sagayama, Tomohiko Nakamura, Eita Nakamura, Yasuyuki Saito, Hirokazu Kameoka, and Nobutaka Ono, “Automatic music accompaniment allowing errors and arbitrary repeats and jumps,” in Proceedings of Meetings on Acoustics, Acoustical Society of America, May 2014, vol. 21, 35003.
bib - Tomohiko Nakamura, Eita Nakamura, and Shigeki Sagayama, “Acoustic score following to musical performance with errors and arbitrary repeats and skips for automatic accompaniment,” in Proceedings of Sound and Music Computing Conference, Aug. 2013, pp. 299–304.
[Travel Grant by the Telecommunications Advancement Foundation]
bib paper demo - Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Tomohiko Nakamura, Nobutaka Ono, and Shigeki Sagayama, “Bayesian nonparametric spectrogram modeling based on infinite factorial infinite hidden Markov model,” in Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct. 2011, pp. 325–328.
bib - Tomohiko Nakamura, Shinji Hara, and Yutaka Hori, “Local stability analysis for a class of quorum-sensing networks with cyclic gene regulatory networks,” in Proceedings of SICE Annual Conference, Sep. 2011, pp. 2111–2116.
[SICE Annual Conference 2011 International Award and Finalist of Young Author's Award]
bib paper