Getting voice data from users connected to the same voice channel as the bot
Hi! I am trying to make a speech-to-text bot, but I am having difficulty getting speaking data. I have almost everything I want, but I cannot get something like voiceConnection.on("speaking", ...) or anything similar to work. If a bot is in a voice channel, how can I add the functionality of getting each speaking user's voice data/packets? Is it even possible? I have been trying to figure it out but cannot find a way.
"dependencies": {
"@discordjs/voice": "^0.16.0",
"discord.js": "^14.11.0",
"ibm-watson": "^8.0.0"
}
1.
[email protected] /home/hugo/Dev/VanProject/DiscordBot/SpeechToText/npm
└── [email protected]
v20.4.0
2. No error, only debugging logs.
3. Attached.
4. As described above.
5. -
6. -
Also, I am on Ubuntu 22.04.2 LTS.
I figured out I have to do something like this: https://github.com/discordjs/voice-examples/blob/main/recorder/src/createListeningStream.ts
but I don't know what else I need or exactly how to do it...
<VoiceConnection>.receiver.subscribe(<userId>) returns an Opus packet stream for the given user
this can be decoded to PCM with the opus.Decoder class from prism-media (which is a dependency of @discordjs/voice)
you can detect when a user starts and stops speaking with the start and end events on <VoiceConnection>.receiver.speaking
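The flow described above (wait for a user to start speaking, subscribe to their Opus stream, decode it to PCM) could be sketched roughly like this. It is only a sketch under assumptions: `listenToSpeakers`, `ReceiverLike`, `makeDecoder` and `onPcm` are hypothetical names, and the receiver is typed structurally so the snippet runs without a Discord connection. In a real bot you would presumably pass `connection.receiver` and a factory that returns a prism-media `opus.Decoder`, as noted in the comments.

```typescript
import { EventEmitter } from 'node:events';
import { Readable, Transform } from 'node:stream';

// Hypothetical structural view of the two VoiceReceiver members used here.
// In a real bot, pass `connection.receiver` from @discordjs/voice instead.
interface ReceiverLike {
  speaking: EventEmitter;              // emits 'start' with the speaker's user ID
  subscribe(userId: string): Readable; // Opus packet stream for that user
}

// Subscribes to each user the first time they start speaking and forwards
// the decoded audio chunks to `onPcm`.
function listenToSpeakers(
  receiver: ReceiverLike,
  // Assumption: in a real bot this would be something like
  //   () => new prism.opus.Decoder({ rate: 48000, channels: 2, frameSize: 960 })
  // using the prism-media package.
  makeDecoder: () => Transform,
  onPcm: (userId: string, chunk: Buffer) => void,
): void {
  const active = new Set<string>();

  receiver.speaking.on('start', (userId: string) => {
    // 'start' fires every time the user begins talking; pipe only once.
    if (active.has(userId)) return;
    active.add(userId);

    const opusStream = receiver.subscribe(userId);
    const pcmStream = opusStream.pipe(makeDecoder());

    pcmStream.on('data', (chunk: Buffer) => onPcm(userId, chunk));
    // Allow a fresh subscription once this stream has been closed.
    opusStream.once('close', () => active.delete(userId));
  });
}
```

Injecting the decoder keeps the sketch independent of any audio library; the trade-off is that the caller has to supply the real decoder and receiver.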
https://discord.js.org/docs/packages/voice/0.16.0/VoiceReceiver:Class#subscribe
https://amishshah.github.io/prism-media/opus.Decoder.html
https://discord.js.org/docs/packages/voice/0.16.0/SpeakingMap:Class#on
Thank you! I'll keep this open until I try the things you said.
With that information I managed to get buffers when someone talks. Is there a "free" speech-to-text service (one that converts Hungarian speech) that I could use? If yes, how should I use it? Should I concat the buffers into one and send it when speaking stops, or should I send the buffer stream as it comes?
> Is there a "free" speech to text service (that converts hungarian speech) that i could use? If yes how should I?
that'd be a question for google
> Should i concat the buffers into one and send it on speaking stop or should i send the buffer stream as it comes?
that's entirely up to you
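For the "concat the buffers into one" option, a minimal sketch might look like the following. `collectStream` is a hypothetical helper, not part of any library mentioned in this thread. It assumes the stream actually ends, for example because the subscription was created with an end behavior such as `EndBehaviorType.AfterSilence` from @discordjs/voice; with the default manual behavior the stream stays open until it is destroyed.

```typescript
import { Readable } from 'node:stream';

// Hypothetical helper: gathers every chunk of a readable stream and resolves
// with one Buffer once the stream has ended.
async function collectStream(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk as Buffer);
  }
  return Buffer.concat(chunks);
}
```

The resulting Buffer could then be sent to a speech-to-text service in a single request. Sending chunks as they arrive avoids holding a whole utterance in memory, but it only works if the chosen service accepts streaming input.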