DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining
DualTurn uses dual-channel generative pretraining to learn natural turn-taking in spoken dialogue without labeled data. The 0.5B-parameter model outperforms models 6x its size on turn prediction.