Shangeth Rajaa

Shangeth Rajaa, Senior ML Scientist at Anyreach AI (ex-Skit.ai). Voice AI research on full-duplex spoken dialogue, turn-taking, and speech LLMs.
https://shangeth.com/

[Publication] DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining
https://shangeth.com/publications/dualturn/
Dual-channel generative pretraining for learning natural turn-taking in spoken dialogue without labeled data. A 0.5B model that outperforms models 6x its size on turn prediction.
Mon, 09 Mar 2026 00:00:00 GMT

Speech LLMs for Conversations
https://shangeth.com/posts/speech-llms-conversations/
A multimodal speech LLM that processes audio directly to enhance conversational AI while reducing overhead compared to traditional ASR-LLM-TTS pipelines.
Thu, 09 May 2024 00:00:00 GMT

[Publication] Improving End-to-End SLU Performance with Prosodic Attention and Distillation
https://shangeth.com/publications/slu-prosodic/
Two techniques for incorporating prosody into end-to-end SLU: prosody-attention and prosody-distillation. Up to 8% intent classification accuracy improvement on SLURP.
Sun, 20 Aug 2023 00:00:00 GMT

[Publication] Improving Spoken Language Identification with Map-Mix
https://shangeth.com/publications/icassp2023/
Map-Mix: a data augmentation approach using model training dynamics to guide latent mixup sampling, giving ~2% weighted F1 improvement on low-resource dialect classification.
Sun, 04 Jun 2023 00:00:00 GMT

[Publication] Skit-S2I: An Indian Accented Speech to Intent Dataset
https://shangeth.com/publications/skit-s2i/
The first public Indian-accented SLU dataset in the banking domain. SSL speech representations beat ASR-based approaches for intent classification.
Mon, 26 Dec 2022 00:00:00 GMT

Feature Disentanglement - I
https://shangeth.com/posts/feature-disentanglement/
How deep learning models can isolate independent factors of variation in data through VAEs and Beta-TCVAE, enabling controlled synthesis and better downstream representations.
Tue, 22 Feb 2022 00:00:00 GMT

[Publication] Learning Speaker Representation with Semi-supervised Learning Approach for Speaker Profiling
https://shangeth.com/publications/speaker-representation/
A semi-supervised framework for speaker profiling that leverages external unlabelled corpora via supervised, unsupervised, and consistency training, achieving RMSE of 6.8 years on age estimation.
Sun, 24 Oct 2021 00:00:00 GMT

Code Mixing in NLP and Speech
https://shangeth.com/posts/code-mixing-seminar/
Notes from a seminar covering six papers on code-mixing across NLP, speech synthesis, and speech recognition, including multilingual synthesis and code-mixed ASR.
Tue, 24 Aug 2021 00:00:00 GMT

KL Divergence: Entropy, Cross Entropy, and Mutual Information in PyTorch
https://shangeth.com/posts/kl-divergence/
A walkthrough of information entropy, KL divergence, mutual information, and cross entropy, with PyTorch implementations.
Tue, 01 Sep 2020 00:00:00 GMT

Off-Policy Monte Carlo Prediction with Importance Sampling
https://shangeth.com/posts/off-policy-monte-carlo/
How importance sampling lets us estimate value functions under a target policy using episodes collected by a different behavior policy.
Sat, 01 Aug 2020 00:00:00 GMT

[Publication] Towards Automated Deep Learning: Analysis of the AutoDL Challenge Series 2019
https://shangeth.com/publications/pmlr/
Design and results of the AutoDL challenge series 2019 (AutoCV, AutoCV2, AutoNLP, AutoSpeech, AutoDL), showing winning solutions generalize to unseen datasets.
Mon, 01 Jun 2020 00:00:00 GMT

[Publication] Overview and Unifying Conceptualization of Automated Machine Learning
https://shangeth.com/publications/ads/
A novel generic mathematical formulation of AutoML unifying HPO and meta-learning, showing meta-learning addresses AutoML more fundamentally than hyperparameter optimization.
Mon, 16 Sep 2019 00:00:00 GMT

GAN 5: Conditional GAN and Pix2Pix
https://shangeth.com/posts/gan-5/
How Conditional GANs extend vanilla GANs to generate class-specific samples, and how Pix2Pix does image-to-image translation.
Sat, 25 May 2019 00:00:00 GMT

GAN 4: Deep Convolutional GAN (DCGAN) on SVHN
https://shangeth.com/posts/gan-4/
Implementing DCGAN with CNNs, batch normalization, and transposed convolutions to generate Street View House Numbers.
Fri, 24 May 2019 00:00:00 GMT

GAN 2: The Game Theory Behind Generator and Discriminator
https://shangeth.com/posts/gan-2/
How the adversarial game between Generator and Discriminator works, and why equilibrium matters.
Tue, 21 May 2019 00:00:00 GMT

GAN 3: Implementing a Linear GAN on MNIST in PyTorch
https://shangeth.com/posts/gan-3/
Step-by-step PyTorch implementation of a vanilla GAN trained on MNIST: data loading, Discriminator, Generator, training loop.
Tue, 21 May 2019 00:00:00 GMT

GAN 1: Introduction to Generative Adversarial Networks
https://shangeth.com/posts/gan-1/
What GANs are, how they work, and some remarkable recent research: StackGAN, iGAN, Pix2Pix.
Mon, 20 May 2019 00:00:00 GMT

Unsupervised Learning 101
https://shangeth.com/posts/unsupervised-learning/
An intro to unsupervised learning (clustering, feature learning, and dimensionality reduction) and why it matters.
Mon, 20 May 2019 00:00:00 GMT

[Publication] Convolutional Feature Extraction and Neural Arithmetic Logic Units for Stock Prediction
https://shangeth.com/publications/icacds/
A data-driven deep learning approach combining CNN feature extraction with Neural Arithmetic Logic Units (NALU) for stock price prediction using historical price data.
Fri, 12 Apr 2019 00:00:00 GMT