<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Shangeth Rajaa</title><description>Shangeth Rajaa, Senior ML Scientist at Anyreach AI (ex-Skit.ai). Voice AI research on full-duplex spoken dialogue, turn-taking, and speech LLMs.</description><link>https://shangeth.com/</link><item><title>[Publication] DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining</title><link>https://shangeth.com/publications/dualturn/</link><guid isPermaLink="true">https://shangeth.com/publications/dualturn/</guid><description>Dual-channel generative pretraining for learning natural turn-taking in spoken dialogue without labeled data. A 0.5B model that outperforms models 6x its size on turn prediction.</description><pubDate>Mon, 09 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Speech LLMs for Conversations</title><link>https://shangeth.com/posts/speech-llms-conversations/</link><guid isPermaLink="true">https://shangeth.com/posts/speech-llms-conversations/</guid><description>A multimodal speech LLM that processes audio directly to enhance conversational AI while reducing overhead compared to traditional ASR-LLM-TTS pipelines.</description><pubDate>Thu, 09 May 2024 00:00:00 GMT</pubDate></item><item><title>[Publication] Improving End-to-End SLU Performance with Prosodic Attention and Distillation</title><link>https://shangeth.com/publications/slu-prosodic/</link><guid isPermaLink="true">https://shangeth.com/publications/slu-prosodic/</guid><description>Two techniques for incorporating prosody into end-to-end SLU: prosody-attention and prosody-distillation. Up to 8% intent classification accuracy improvement on SLURP.</description><pubDate>Sun, 20 Aug 2023 00:00:00 GMT</pubDate></item><item><title>[Publication] Improving Spoken Language Identification with Map-Mix</title><link>https://shangeth.com/publications/icassp2023/</link><guid isPermaLink="true">https://shangeth.com/publications/icassp2023/</guid><description>Map-Mix: a data augmentation approach using model training dynamics to guide latent mixup sampling, giving ~2% weighted F1 improvement on low-resource dialect classification.</description><pubDate>Sun, 04 Jun 2023 00:00:00 GMT</pubDate></item><item><title>[Publication] Skit-S2I: An Indian Accented Speech to Intent Dataset</title><link>https://shangeth.com/publications/skit-s2i/</link><guid isPermaLink="true">https://shangeth.com/publications/skit-s2i/</guid><description>The first public Indian-accented SLU dataset in the banking domain. SSL speech representations beat ASR-based approaches for intent classification.</description><pubDate>Mon, 26 Dec 2022 00:00:00 GMT</pubDate></item><item><title>Feature Disentanglement - I</title><link>https://shangeth.com/posts/feature-disentanglement/</link><guid isPermaLink="true">https://shangeth.com/posts/feature-disentanglement/</guid><description>How deep learning models can isolate independent factors of variation in data through VAEs and Beta-TCVAE, enabling controlled synthesis and better downstream representations.</description><pubDate>Tue, 22 Feb 2022 00:00:00 GMT</pubDate></item><item><title>[Publication] Learning Speaker Representation with Semi-supervised Learning Approach for Speaker Profiling</title><link>https://shangeth.com/publications/speaker-representation/</link><guid isPermaLink="true">https://shangeth.com/publications/speaker-representation/</guid><description>A semi-supervised framework for speaker profiling that leverages external unlabelled corpora via supervised, unsupervised, and consistency training, achieving RMSE of 6.8 years on age estimation.</description><pubDate>Sun, 24 Oct 2021 00:00:00 GMT</pubDate></item><item><title>Code Mixing in NLP and Speech</title><link>https://shangeth.com/posts/code-mixing-seminar/</link><guid isPermaLink="true">https://shangeth.com/posts/code-mixing-seminar/</guid><description>Notes from a seminar covering six papers on code-mixing across NLP, speech synthesis, and speech recognition — including multilingual synthesis and code-mixed ASR.</description><pubDate>Tue, 24 Aug 2021 00:00:00 GMT</pubDate></item><item><title>KL Divergence: Entropy, Cross Entropy, and Mutual Information in PyTorch</title><link>https://shangeth.com/posts/kl-divergence/</link><guid isPermaLink="true">https://shangeth.com/posts/kl-divergence/</guid><description>A walkthrough of information entropy, KL divergence, mutual information, and cross entropy — with PyTorch implementations.</description><pubDate>Tue, 01 Sep 2020 00:00:00 GMT</pubDate></item><item><title>Off-Policy Monte Carlo Prediction with Importance Sampling</title><link>https://shangeth.com/posts/off-policy-monte-carlo/</link><guid isPermaLink="true">https://shangeth.com/posts/off-policy-monte-carlo/</guid><description>How importance sampling lets us estimate value functions under a target policy using episodes collected by a different behavior policy.</description><pubDate>Sat, 01 Aug 2020 00:00:00 GMT</pubDate></item><item><title>[Publication] Towards Automated Deep Learning: Analysis of the AutoDL Challenge Series 2019</title><link>https://shangeth.com/publications/pmlr/</link><guid isPermaLink="true">https://shangeth.com/publications/pmlr/</guid><description>Design and results of the AutoDL challenge series 2019 (AutoCV, AutoCV2, AutoNLP, AutoSpeech, AutoDL), showing winning solutions generalize to unseen datasets.</description><pubDate>Mon, 01 Jun 2020 00:00:00 GMT</pubDate></item><item><title>[Publication] Overview and Unifying Conceptualization of Automated Machine Learning</title><link>https://shangeth.com/publications/ads/</link><guid isPermaLink="true">https://shangeth.com/publications/ads/</guid><description>A novel generic mathematical formulation of AutoML unifying HPO and meta-learning, showing meta-learning addresses AutoML more fundamentally than hyperparameter optimization.</description><pubDate>Mon, 16 Sep 2019 00:00:00 GMT</pubDate></item><item><title>GAN 5: Conditional GAN and Pix2Pix</title><link>https://shangeth.com/posts/gan-5/</link><guid isPermaLink="true">https://shangeth.com/posts/gan-5/</guid><description>How Conditional GANs extend vanilla GANs to generate class-specific samples, and how Pix2Pix does image-to-image translation.</description><pubDate>Sat, 25 May 2019 00:00:00 GMT</pubDate></item><item><title>GAN 4: Deep Convolutional GAN (DCGAN) on SVHN</title><link>https://shangeth.com/posts/gan-4/</link><guid isPermaLink="true">https://shangeth.com/posts/gan-4/</guid><description>Implementing DCGAN with CNNs, batch normalization, and transposed convolutions to generate Street View House Numbers.</description><pubDate>Fri, 24 May 2019 00:00:00 GMT</pubDate></item><item><title>GAN 2: The Game Theory Behind Generator and Discriminator</title><link>https://shangeth.com/posts/gan-2/</link><guid isPermaLink="true">https://shangeth.com/posts/gan-2/</guid><description>How the adversarial game between Generator and Discriminator works, and why equilibrium matters.</description><pubDate>Tue, 21 May 2019 00:00:00 GMT</pubDate></item><item><title>GAN 3: Implementing a Linear GAN on MNIST in PyTorch</title><link>https://shangeth.com/posts/gan-3/</link><guid isPermaLink="true">https://shangeth.com/posts/gan-3/</guid><description>Step-by-step PyTorch implementation of a vanilla GAN trained on MNIST — data loading, Discriminator, Generator, training loop.</description><pubDate>Tue, 21 May 2019 00:00:00 GMT</pubDate></item><item><title>GAN 1: Introduction to Generative Adversarial Networks</title><link>https://shangeth.com/posts/gan-1/</link><guid isPermaLink="true">https://shangeth.com/posts/gan-1/</guid><description>What GANs are, how they work, and some remarkable recent research — StackGAN, iGAN, Pix2Pix.</description><pubDate>Mon, 20 May 2019 00:00:00 GMT</pubDate></item><item><title>Unsupervised Learning 101</title><link>https://shangeth.com/posts/unsupervised-learning/</link><guid isPermaLink="true">https://shangeth.com/posts/unsupervised-learning/</guid><description>An intro to unsupervised learning — clustering, feature learning, and dimensionality reduction, and why it matters.</description><pubDate>Mon, 20 May 2019 00:00:00 GMT</pubDate></item><item><title>[Publication] Convolutional Feature Extraction and Neural Arithmetic Logic Units for Stock Prediction</title><link>https://shangeth.com/publications/icacds/</link><guid isPermaLink="true">https://shangeth.com/publications/icacds/</guid><description>A data-driven deep learning approach combining CNN feature extraction with Neural Arithmetic Logic Units (NALU) for stock price prediction using historical price data.</description><pubDate>Fri, 12 Apr 2019 00:00:00 GMT</pubDate></item></channel></rss>