
Abstract

Audio source separation, which extracts individual source signals from a mixture, can serve as a preprocessing step for a wide range of audio media processing systems. To be usable as a general-purpose preprocessor, a separation model must operate robustly under the various acoustic conditions required by downstream tasks, such as the sampling frequency. Conventional source separation models based on deep neural networks work well at the sampling frequency used for training, but they have difficulty handling audio at untrained sampling frequencies. In this study, by interpreting deep neural networks from a signal processing viewpoint, I develop layers that are independent of the sampling frequency, aiming to establish a more versatile deep learning framework for audio media processing.
