Membrane.Source example for RTC Engine

Hi! I'm trying to implement a solution in which I can stream audio from an API to the client via the Membrane RTC Engine. I'm basing my Membrane.Source implementation on Membrane.Hackney.Source, but now I'm a bit stuck on getting the audio from there to the WebRTC endpoint. I've created an endpoint that forwards the output of my pad to Membrane.RTC.Engine.Endpoint.WebRTC.TrackSender, but I'm wondering:
1. Is this the right approach?
2. If so, where do I get the correct track with which to configure the TrackSender?
If there are any examples out there implementing any sort of Source-based Membrane RTC Endpoint, I'd love to study them. Thanks!
Michał Śledź
Hi @braintrain3000 , so to sum up, you want to download some audio via Hackney.Source and then publish it to the Engine, correct?
Michał Śledź
You might want to take a look at our FileEndpoint, which we use in tests: https://github.com/jellyfish-dev/membrane_rtc_engine/blob/v0.11.0/test/support/file_endpoint.ex Besides this, take a look at https://hexdocs.pm/membrane_rtc_engine/track_lifecycle.html and https://hexdocs.pm/membrane_rtc_engine/custom_endpoints.html The custom endpoints guide covers endpoints that consume tracks, but it may still be useful for the broader context.
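For reference, here is a rough sketch of how a publishing endpoint creates and announces its track, modeled loosely on that FileEndpoint. The Track.new/6 call, the notification shapes, and the state fields (endpoint_id, track) are assumptions that may differ between engine versions, so check them against the v0.11 source; in the real FileEndpoint the two notifications also happen at separate lifecycle steps.

alias Membrane.RTC.Engine.Track

@impl true
def handle_playing(_ctx, state) do
  # Create the track this endpoint publishes: Opus audio at the RTP
  # clock rate of 48_000, originating from this endpoint.
  track = Track.new(:audio, Track.stream_id(), state.endpoint_id, :OPUS, 48_000, nil)

  # Announce the track to the engine, then mark it ready once media
  # can actually flow to subscribers.
  actions = [
    notify_parent: {:publish, {:new_tracks, [track]}},
    notify_parent: {:track_ready, track.id, :high, track.encoding}
  ]

  {actions, %{state | track: track}}
end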
BrainTrain3000
@mickel8 Yep, that's mostly correct. I've made my own Source, similar to Hackney, but it actually forwards a stream of audio chunks from successive API calls. I've added that endpoint to the engine, and its track is being listened to by two endpoints:
1. The WebRTC endpoint from the user
2. A recording endpoint that writes the audio to disk

The recording endpoint records the audio just fine; it sounds exactly as it should. However, the WebRTC endpoint (or at least the audio element in the browser) receives the audio all messed up: large sections are missing and it's generally sped up. Do you know why that might happen? I don't think the issue is with my Source/source endpoint, given that the recorder picks it up fine; the issue must lie somewhere inside the WebRTC endpoint, or in the browser, I'd imagine. Any help would be much appreciated!
Michał Śledź
Check chrome://webrtc-internals. You will find stats for your audio track there; in particular, there will be information about packet loss, dropped frames, etc. There might be a number of reasons. One that comes to mind is that something is wrong with the timestamps and the browser drops packets because they are out of order. Another might be packet loss from the server to the browser, but I assume you are running on localhost.
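If you want to check the timestamp hypothesis on the Membrane side, something like this minimal pass-through filter could log buffer timestamps before the payloader. This is a sketch assuming membrane_core ~> 0.11, where the callback is handle_process/4 (renamed to handle_buffer/4 in 1.0); adjust the pad options and callback name to your core version.

defmodule TimestampLogger do
  use Membrane.Filter

  def_input_pad :input, demand_mode: :auto, accepted_format: _any
  def_output_pad :output, demand_mode: :auto, accepted_format: _any

  @impl true
  def handle_process(:input, buffer, _ctx, state) do
    # pts/dts should be monotonically increasing; gaps or regressions here
    # would explain the browser discarding packets
    IO.inspect({buffer.pts, buffer.dts}, label: "buffer timestamps")
    {[buffer: {:output, buffer}], state}
  end
end

You could splice it into the spec with |> child(:ts_logger, TimestampLogger) right before the payloader.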
Michał Śledź
Also check whether membrane_videoroom works fine for you when you connect with two tabs: https://videoroom.membrane.work/
BrainTrain3000
Interesting, thanks! We're seeing no packets lost, but the majority of packets are discarded. That might speak to a timestamp issue... do you know how that might come about, and what we could do about it? Membrane Videoroom sounded just fine with two tabs!

My best guess is that the following is happening:
1. The API we're calling for the audio responds with a stream of chunks.
2. Those chunks are received very quickly (faster than the duration of the audio they contain), and we process them as they arrive.
3. We package these chunks into an RTP stream via Membrane.RTP.PayloaderBin.
4. Membrane forwards these packets to the client as soon as they arrive.
5. Because the client receives them faster than it can play them back, it starts dropping packets (see section 3.2 of https://www.rfc-editor.org/rfc/rfc7002).

So... do we need to control the rate of playback on the Membrane side in order to make this work? I'd assumed it would measure things out on its own. I don't know if this helps, but here's our handle_pad_added:
@impl true
def handle_pad_added(Pad.ref(:output, {_track_id, _rid}) = pad, _ctx, state) do
  structure = [
    # Custom source that emits the MP3 chunks fetched from the API
    child(:synthesizer_source, %SynthesizerElement{
      room_id: state.room_id
    })
    # Decode MP3 to raw audio
    |> child(:decoder, Membrane.MP3.MAD.Decoder)
    # Resample 44.1 kHz s24le mono to the 48 kHz s16le mono Opus expects
    |> child(:converter, %Membrane.FFmpeg.SWResample.Converter{
      input_stream_format: %Membrane.RawAudio{
        channels: 1,
        sample_format: :s24le,
        sample_rate: 44_100
      },
      output_stream_format: %Membrane.RawAudio{
        channels: 1,
        sample_format: :s16le,
        sample_rate: 48_000
      }
    })
    |> child(:encoder, %Membrane.Opus.Encoder{
      input_stream_format: %Membrane.RawAudio{
        channels: 1,
        sample_rate: 48_000,
        sample_format: :s16le
      }
    })
    |> child(:parser, %Membrane.Opus.Parser{})
    # Packetize the Opus stream into RTP
    |> child(:payloader, %Membrane.RTP.PayloaderBin{
      payloader: Membrane.RTP.PayloadFormat.get(state.track.encoding).payloader,
      ssrc: state.ssrc,
      payload_type: state.payload_type,
      clock_rate: state.track.clock_rate
    })
    |> via_in(Pad.ref(:input, {state.track.id, :high}))
    |> child(:track_sender, %TrackSender{track: state.track, variant_bitrates: %{high: 48_000}},
      get_if_exists: true
    )
    |> via_out(pad)
    |> bin_output(pad)
  ]

  {[spec: structure], state}
end
Michał Śledź
A couple of things here:
* clock_rate in the payloader should be 48_000, since you convert from 44_100 to 48_000. I am not sure what is under state.track.clock_rate.
* variant_bitrates is the max bitrate of your track, which is not the same as sample_rate. I wouldn't expect it to be greater than 50 kbps, so 48_000 should probably be okay.
* If you don't limit the rate at which you are sending, you might overwhelm the sender socket or congest the network. Try plugging in e.g. membrane_realtimer_plugin and see whether it helps; a sketch of where it could go is below.

I would recommend trying these options one by one and observing whether anything changes. You can also post a dump from webrtc-internals here; I believe you can create it in the upper left corner of chrome://webrtc-internals. It would also be helpful if you pasted the content of state.track, i.e. inspect it before publishing.
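For what it's worth, here's a sketch of how the first and third suggestions could look, spliced into the handle_pad_added above. Membrane.Realtimer comes from membrane_realtimer_plugin, and this assumes buffers carry correct timestamps by the time they leave the parser; everything else is unchanged from the original spec.

|> child(:parser, %Membrane.Opus.Parser{})
# Realtimer releases each buffer according to its timestamp, so downstream
# receives audio at playback rate instead of as fast as the API delivers it
|> child(:realtimer, Membrane.Realtimer)
|> child(:payloader, %Membrane.RTP.PayloaderBin{
  payloader: Membrane.RTP.PayloadFormat.get(state.track.encoding).payloader,
  ssrc: state.ssrc,
  payload_type: state.payload_type,
  # Opus over RTP always uses a 48_000 clock rate (RFC 7587)
  clock_rate: 48_000
})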