Cloudflare Developers•2mo ago

Support for Speaker Diarization on Cloudflare Workers AI

Hi everyone, I’m currently using Cloudflare Workers AI for speech-to-text transcription with Whisper-large-v3-turbo, and it works great. However, I also need speaker diarization to differentiate between multiple speakers in an audio file. Right now, the best open-source option I know of is Pyannote, but it requires a GPU and seems too heavy to run on Cloudflare Workers due to resource limits.

Is there any way to run speaker diarization on Cloudflare Workers AI (e.g., an optimized model or a workaround)? Alternatively, has anyone successfully implemented lightweight diarization within Cloudflare’s ecosystem (Workers, KV, R2, etc.)? I’d love to keep everything within Cloudflare rather than using third-party services.

Any suggestions or insights would be greatly appreciated! Thanks in advance! 🚀

1 Reply

mr.niko.la•2w ago

Is there sample code for Whisper v3 turbo? I can’t seem to find it in the Workers AI docs.
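
For anyone landing here with the same question, a minimal sketch of the transcription call might look like the following. It is not taken from the thread or from official sample code. It assumes the model ID `@cf/openai/whisper-large-v3-turbo`, assumes the model accepts base64-encoded audio in an `audio` field and returns a `text` field, and uses a hypothetical `AiBinding` interface and hypothetical helper names (`toBase64`, `transcribe`) to stay self-contained. Check the Workers AI model page before relying on any of these details.

```typescript
// Hypothetical minimal shape of the Workers AI binding, declared here only so the
// sketch is self-contained. In a real Worker the binding type comes from the runtime.
interface AiBinding {
  run(model: string, inputs: { audio: string }): Promise<{ text?: string }>;
}

// Assumed model ID; verify against the Workers AI model catalog.
const WHISPER_MODEL = "@cf/openai/whisper-large-v3-turbo";

// Base64-encode raw audio bytes. A plain index loop avoids the argument-count
// limits that spreading a large array into String.fromCharCode can hit.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Send the audio to the model and return the transcript text.
// Returns an empty string if the response carries no text field.
async function transcribe(ai: AiBinding, audio: Uint8Array): Promise<string> {
  const result = await ai.run(WHISPER_MODEL, { audio: toBase64(audio) });
  return result.text ?? "";
}
```

Inside a Worker's `fetch` handler this could be called as `transcribe(env.AI, new Uint8Array(await request.arrayBuffer()))`, assuming the AI binding is named `AI` in the Worker's configuration. Note that this covers transcription only; it does not perform speaker diarization.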
