Abstract: Recent advances in computer vision have enabled the analysis of audio spectrograms. In this paper, we present a novel approach that leverages spectrogram representations ...
Abstract: Discrete audio representation, also known as audio tokenization, has seen renewed interest driven by its potential to facilitate the application of text language modeling approaches in the audio domain.