Software Mansion•11mo ago

WebRTC Endpoint + Mixing Multiple Tracks into a single mp4

I have a working app that allows a user to "talk" to an LLM. I'm using Membrane to help coordinate the audio. For QA purposes, we record the tracks (one for each endpoint). I'm trying to set up a bin that mixes the two tracks using Membrane.LiveAudioMixer so I end up with a single file. No errors are thrown, but the resulting file is only 40 bytes, so I suspect I have something misconfigured. Each time a pad is added, I pipe it into the LiveAudioMixer, then take the mixer's output, encode it, and write it to the file.

  def handle_setup(_context, state) do
    log_path = Application.fetch_env!(:smartvox, :log_path)
    File.mkdir(log_path)

    spec = [
      child(:mixer, %Membrane.LiveAudioMixer{
        stream_format: %Membrane.RawAudio{
          channels: 1,
          sample_rate: 16_000,
          sample_format: :s16le
        }
      })
      |> child(:encoder, %Membrane.Opus.Encoder{
        application: :voip,
        input_stream_format: %Membrane.RawAudio{
          channels: 1,
          sample_rate: 16_000,
          sample_format: :s16le
        }
      })
      |> child(:parser, Membrane.Opus.Parser)
      |> child({:muxer, state.room_id}, Membrane.MP4.Muxer.ISOM)
      |> child({:sink, state.room_id}, %Membrane.File.Sink{
        location: "#{log_path}/#{state.room_id}.mp4"
      })
    ]

    {[spec: spec], state}
  end

  def handle_pad_added(Pad.ref(:input, track_id) = pad, _ctx, state) do
    track = state.tracks[track_id]

    spec = [
      bin_input(pad)
      |> child({:track_receiver, track_id}, Smartvox.Endpoints.Conversation.TrackRecevier)
      |> child({:depayloader, track_id}, Track.get_depayloader(track))
      |> child({:decoder, track_id}, %Membrane.Opus.Decoder{
        sample_rate: 16_000,
      })
      |> via_in(:input)
      |> get_child(:mixer)
    ]

    {[spec: spec], state}
  end

5 Replies

TonyLikeSocksOP•11mo ago

For what it's worth, I'm not using the latest versions of the membrane stack. I made sure to read the relevant docs for those versions, and I thought the above would work. From mix:

Direct dependencies:
{:membrane_rtc_engine, "~> 0.18.0"},
{:membrane_mp4_plugin, "~> 0.30.1"},
{:membrane_audio_mix_plugin, "~> 0.15.0"},
{:membrane_opus_plugin, "~> 0.18.1"},
{:membrane_raw_audio_format, "~> 0.11.0"},

Child dependencies:
membrane_core ~> 0.12.3

mat_hek•11mo ago

Hi @TonyLikeSocks, would you mind checking on the latest versions of the plugins? My suspicion is that we're not passing the timestamps correctly, and the mixer relies on them. We've improved timestamps handling a lot recently, so it may already be fixed.

TonyLikeSocksOP•11mo ago

Ahh, I was afraid that'd be the path forward. I tried a few months ago and hit some snags. From memory, it had to do with how we're using Ratio -- I'll try again and spin up a new thread with where I get stuck. Thanks @mat_hek

mat_hek•11mo ago

FWIW we're locked on Ratio 3.0; however, 4.0 should work without problems, so you can override it. We're going to allow 4.0 starting with core 1.1.
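
For reference, an override of the kind described above might look roughly like this in mix.exs. This is only a sketch: the `~> 4.0` requirement is an assumption based on the version mentioned in this reply, and the rest of the deps list is left out.

```elixir
# mix.exs (fragment, not a complete project file)
defp deps do
  [
    # The existing Membrane dependencies stay here unchanged.

    # Assumption: "~> 4.0" matches the Ratio release mentioned above.
    # `override: true` tells Mix to use this requirement even when another
    # dependency declares a conflicting one (for example, Ratio 3.x).
    {:ratio, "~> 4.0", override: true}
  ]
end
```

After changing the list, `mix deps.get` re-resolves the lock file, and `mix deps.tree` shows which packages pull Ratio in as a child dependency.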

TonyLikeSocksOP•11mo ago

Looks like I've got Ratio 2.0 in my deps. I'll have to read up on the changes. It's a child dependency that gets pulled in; I don't list it as an explicit dependency.
