Supporting interruptions in OpenAI Realtime demo

I'm trying to add interruptions support in https://github.com/membraneframework/membrane_demo/blob/master/livebooks/openai_realtime_with_membrane_webrtc/openai_realtime_with_membrane_webrtc.livemd
It seems like the Realtime API already sends an event when the audio needs to stop playing (status = cancelled):
17:04:02.824 [info] <0.331.0>/:open_ai AI response: %{"id" => "resp_AhIivPmVtLc0kekMYf9eg", "metadata" => nil, "object" => "realtime.response", "output" => [%{"content" => [%{"transcript" => "Sure! ...", "type" => "audio"}], "id" => "item_AhIivTi9IdPKaAo03MpkV", "object" => "realtime.item", "role" => "assistant", "status" => "completed", "type" => "message"}], "status" => "cancelled", "status_details" => %{"reason" => "turn_detected", "type" => "cancelled"}, "usage" => %{"input_token_details" => %{"audio_tokens" => 840, "cached_tokens" => 1216, "cached_tokens_details" => %{"audio_tokens" => 832, "text_tokens" => 384}, "text_tokens" => 444}, "input_tokens" => 1284, "output_token_details" => %{"audio_tokens" => 186, "text_tokens" => 43}, "output_tokens" => 229, "total_tokens" => 1513}}
But I'm not sure how to avoid playing back audio buffers that we've already sent here: https://github.com/membraneframework/membrane_demo/blob/1226a0dc04cf9a9549544e3e78660957c6dbf391/livebooks/openai_realtime_with_membrane_webrtc/openai_realtime_with_membrane_webrtc.livemd?plain=1#L91
Is there a way to drop buffers that we've already sent?
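For context on how the cancellation shows up: the map in the log above is the `response` object of a `response.done` server event, with `"status" => "cancelled"` and `"reason" => "turn_detected"`. A minimal sketch of catching it, not the demo's actual code (the `handle_server_event/2` helper and the `:response_cancelled` notification name are made up, and it assumes incoming Realtime server events are already decoded into maps like the one logged):

```elixir
# Hypothetical clause inside the OpenAI endpoint element: when the decoded
# "response.done" event reports a cancelled response, notify the parent
# pipeline so it can tell a queueing element to flush.
defp handle_server_event(
       %{"type" => "response.done", "response" => %{"status" => "cancelled"}},
       state
     ) do
  {[notify_parent: :response_cancelled], state}
end

defp handle_server_event(_event, state), do: {[], state}
```

Buffers that have already left the endpoint can only be dropped by whichever element is still holding them downstream, which is what the manual-flow-control attempt in the reply below is aiming at.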
samrat (OP) · 5d ago
I tried accumulating the audio responses and using manual flow control, so that we can clear the accumulated audio if the response is cancelled, but it doesn't seem to work: https://gist.github.com/samrat/2d2f5d9b17580d701512d02f48efbe79
I'm not sure if I'm using redemand the correct way (see the sketch below).
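For comparison, here is a minimal sketch of the manual-demand pattern this is aiming at, not the gist's exact code: the module name and the `:response_cancelled` notification are assumptions, and the parent pipeline is assumed to forward that notification from the endpoint (e.g. by returning `{:notify_child, ...}` from `handle_child_notification/4`). Buffers are held in a local queue and served from `handle_demand/5`; `redemand` only re-triggers `handle_demand` after the queue changes.

```elixir
defmodule InterruptibleQueue do
  use Membrane.Filter

  def_input_pad :input,
    accepted_format: _any,
    flow_control: :manual,
    demand_unit: :buffers

  def_output_pad :output,
    accepted_format: _any,
    flow_control: :manual

  @impl true
  def handle_init(_ctx, _opts), do: {[], %{queue: :queue.new()}}

  @impl true
  def handle_buffer(:input, buffer, _ctx, state) do
    # Hold the buffer locally; :redemand re-runs handle_demand so any
    # outstanding downstream demand can drain the queue right away.
    {[redemand: :output], %{state | queue: :queue.in(buffer, state.queue)}}
  end

  @impl true
  def handle_demand(:output, size, :buffers, _ctx, state) do
    case :queue.out(state.queue) do
      {{:value, buffer}, queue} ->
        # Emit one queued buffer, then redemand to serve the rest of the demand.
        {[buffer: {:output, buffer}, redemand: :output], %{state | queue: queue}}

      {:empty, _queue} ->
        # Nothing queued: pass the demand upstream.
        {[demand: {:input, size}], state}
    end
  end

  @impl true
  def handle_parent_notification(:response_cancelled, _ctx, state) do
    # The response was cancelled: drop everything not yet sent downstream.
    {[], %{state | queue: :queue.new()}}
  end
end
```

The key point about `redemand` is that it only means "run `handle_demand` again if there is still unsatisfied demand on that pad": returning it after enqueueing, and again after emitting each buffer, is what keeps the queue draining. Clearing the queue on the cancel notification then stops any audio that hasn't yet been pushed downstream.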
