Shangeth
Speaker Profiling
SpeechLLM: Multi-Modal LLM for Speech Understanding
A multimodal LLM that combines speech encoders with TinyLlama to jointly perform ASR and predict gender, age, accent, and emotion from audio.
Learning speaker representation with semi-supervised learning approach for speaker profiling
A semi-supervised framework for speaker profiling (age and height estimation) that leverages external unlabelled speech data through consistency training.