Improving spoken language identification with Map-Mix

Abstract

We address dialect classification in low-resource settings using a pre-trained multilingual XLSR model. We introduce Map-Mix, a data augmentation technique that uses model training dynamics to improve sampling for latent mixup operations. Compared with random mixup, the method improves weighted F1 by approximately 2% and produces better-calibrated models.
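
To make the idea of dynamics-guided latent mixup concrete, the following is a minimal sketch rather than the published method. It assumes that training dynamics are summarized as the per-sample mean and standard deviation of the true-class probability across epochs, and that mixup partners are drawn with probability proportional to that standard deviation. The weighting rule, the function names (training_dynamics, sample_partners, latent_mixup), the Beta(0.4, 0.4) mixing coefficient, and the toy data are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def training_dynamics(true_class_probs):
    """Summarize per-sample training dynamics.

    true_class_probs has shape (n_epochs, n_samples): the probability the
    model assigned to each sample's true label at the end of each epoch.
    """
    confidence = true_class_probs.mean(axis=0)
    variability = true_class_probs.std(axis=0)
    return confidence, variability


def sample_partners(variability, rng, eps=1e-6):
    """Draw one mixup partner per sample (hypothetical rule).

    Assumption: samples whose predictions fluctuate more across epochs are
    chosen as partners more often. This is one possible weighting only.
    """
    weights = variability + eps
    probs = weights / weights.sum()
    n = variability.shape[0]
    return rng.choice(n, size=n, p=probs)


def latent_mixup(hidden, labels_onehot, partner_idx, lam):
    """Interpolate hidden representations and labels with their partners."""
    mixed_hidden = lam * hidden + (1.0 - lam) * hidden[partner_idx]
    mixed_labels = lam * labels_onehot + (1.0 - lam) * labels_onehot[partner_idx]
    return mixed_hidden, mixed_labels


# Toy data standing in for encoder outputs and logged probabilities.
rng = np.random.default_rng(0)
n_epochs, n_samples, dim, n_classes = 5, 8, 4, 3
true_class_probs = rng.uniform(size=(n_epochs, n_samples))
hidden = rng.normal(size=(n_samples, dim))
labels = np.eye(n_classes)[rng.integers(0, n_classes, size=n_samples)]

confidence, variability = training_dynamics(true_class_probs)
partner_idx = sample_partners(variability, rng)
lam = rng.beta(0.4, 0.4)  # mixing coefficient; the Beta parameters are assumed
mixed_hidden, mixed_labels = latent_mixup(hidden, labels, partner_idx, lam)
```

Random mixup corresponds to replacing sample_partners with a uniform permutation of the batch indices; the sketch changes only how partners are chosen, leaving the interpolation itself unchanged.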

Publication
ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing