#speech

Improving End-to-End SLU Performance with Prosodic Attention and Distillation

2023-08-20 · Shangeth Rajaa · Interspeech 2023, pp. 1114–1118

#Voice AI #Spoken Language Understanding #Prosody #Speech

Two techniques for incorporating prosody into end-to-end spoken language understanding (SLU): prosody-attention and prosody-distillation. Intent classification accuracy on SLURP improves by up to 8%.


Improving Spoken Language Identification with Map-Mix

2023-06-04 · Shangeth Rajaa, Kriti Anandan, Swaraj Dalmia, Tarun Gupta, Eng Siong Chng · ICASSP 2023 (IEEE International Conference on Acoustics, Speech and Signal Processing), pp. 1–5

#Voice AI #Speech #Language Identification #Data Augmentation

Map-Mix: a data augmentation approach using model training dynamics to guide latent mixup sampling, giving ~2% weighted F1 improvement on low-resource dialect classification.


Skit-S2I: An Indian Accented Speech to Intent Dataset

2022-12-26 · Shangeth Rajaa, Swaraj Dalmia, Kumarmanas Nethil · arXiv preprint arXiv:2212.13015

#Voice AI #Spoken Language Understanding #Dataset #Speech

The first public Indian-accented SLU dataset in the banking domain. Self-supervised (SSL) speech representations outperform ASR-based approaches for intent classification.


Code Mixing in NLP and Speech

2021-08-24

#Speech #NLP #Deep Learning

Notes from a seminar covering six papers on code-mixing across NLP, speech synthesis, and speech recognition, including multilingual synthesis and code-mixed ASR.
