C#•16mo ago

Memory leak problem with System.Speech.Recognition && Microsoft.Speech.Recognition

Description: When running speech recognition with Microsoft.Speech.Recognition, there appears to be a memory leak. The recognition engine's memory usage grows constantly during execution, even though the engine is disposed of correctly. The problem is specifically observed in the following code block:

while (!stoppingToken.IsCancellationRequested && !disposed)
{
    Thread.Sleep(100);
    try
    {
        int read = originStream.Read(buffer, 0, 48000);
        bufferedByteStream.Write(buffer, 0, read);
    }
    catch (IOException ex) when (ex.InnerException is SocketException { SocketErrorCode: SocketError.ConnectionReset })
    {
        Console.WriteLine("Connection was forcibly closed by the remote host.");
        bufferedByteStream.Close();
        disposed = true;
    }
}

Context:
- The recognition engine grows in memory while it runs in a separate thread.
- Disposing of the engine at the end of the execution does not resolve the issue.
- The memory growth is evident even with correct disposal practices.

Additional Information:
- The issue happens with Microsoft.Speech.Recognition and is also observed in System.Speech.Recognition.
- The code is part of a background service and runs continuously.

Code Snippet for Reference:

// ... (previous code)

using (SpeechRecognitionEngine engine = SREBuilder.Create(new[] { "1FriendlyDoge", "notanoob600m", "RoyalCrests" }))
{
    engine.SpeechRecognized += (sender, e) => HandleSpeechRecognized(sender, e, userId, guildId);
    engine.SetInputToAudioStream(bufferedByteStream, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 48000, 16, 2, 192000, 4, null));
    engine.RecognizeAsync(RecognizeMode.Multiple);

    while (!stoppingToken.IsCancellationRequested && !disposed)
    {
        Thread.Sleep(100);
        try
        {
            int read = originStream.Read(buffer, 0, 48000);
            bufferedByteStream.Write(buffer, 0, read);
        }
        catch (IOException ex) when (ex.InnerException is SocketException { SocketErrorCode: SocketError.ConnectionReset })
        {
            Console.WriteLine("Connection was forcibly closed by the remote host.");
            bufferedByteStream.Close();
            disposed = true;
        }
    }
}

12 Replies

Noah;OP•16mo ago

(I have isolated this and profiled it, and have determined that the issue does not appear to be my code.) For some additional information, the buffered stream is just a list of bytes with a predefined size (48,000 bytes in this case, which is 0.25s of audio data), and its size never changes. When profiling the issue with dotMemory, the problem seems to be a ton of Byte[] objects with a stack trace of literally just [AllThreadsRoot]. The exception is thrown when a socket connection closes (each socket connection has its own instance of SpeechRecognitionEngine) and only happens once per instance. After that, everything is disposed.
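
As a quick check of the 0.25s figure, here is a minimal sketch that derives the duration from the format values passed to SpeechAudioFormatInfo in the snippet above (48000 samples per second, 16 bits, 2 channels). The class and method names (PcmMath, SecondsOfAudio) are hypothetical and only for illustration; they are not part of System.Speech or Microsoft.Speech.

```csharp
using System;

public static class PcmMath
{
    // Hypothetical helper: how many seconds of raw PCM audio fit in byteCount bytes.
    public static double SecondsOfAudio(int byteCount, int samplesPerSecond, int bitsPerSample, int channels)
    {
        // Bytes consumed per second = sample rate * bytes per sample * channel count.
        int bytesPerSecond = samplesPerSecond * (bitsPerSample / 8) * channels;
        return (double)byteCount / bytesPerSecond;
    }
}
```

With the values from the thread, 48000 * 2 * 2 gives 192000 bytes per second (the same value passed as the average bytes per second), so a 48,000-byte buffer holds 48000 / 192000 = 0.25 seconds of audio.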

Ꜳåąɐȁặⱥᴀᴬ•16mo ago

Probably it's not related to the problem, but why don't you use async? Also, I would check if read == 0 for exiting the while loop.

1FriendlyDoge•16mo ago

I removed all async stuff for debugging, but it didn't help. The client might not constantly send something, but it still needs to stay connected.

Ꜳåąɐȁặⱥᴀᴬ•16mo ago

I don't know. I personally use GCS speech recognition and Azure speech recognition and don't have these issues, though I know it's not exactly the same thing. 0 means exit: if you receive a 0, you have to restart the connection.

1FriendlyDoge•16mo ago

Why lol? Since when?

Ꜳåąɐȁặⱥᴀᴬ•16mo ago

That's the standard behavior of a stream interface; you can look it up on MSDN.

1FriendlyDoge•16mo ago

Just because a stream is empty doesn't mean I have to close it.

Ꜳåąɐȁặⱥᴀᴬ•16mo ago

An empty stream would not return 0 bytes read. Again, this probably is not the problem; I'm just saying that if originStream inherits from Stream, then it would be better if you checked for a return value of 0 (end of stream). If not, it will throw an error the next time you try to read anyway, but that wouldn't be a graceful way to exit.
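
For reference, the check being suggested here could look roughly like the sketch below. It relies on the documented Stream.Read contract: a return value of 0 means the end of the stream was reached. For a blocking NetworkStream, that happens once the remote side has closed the connection, while an idle but still connected client makes Read wait rather than return 0. The names StreamPump and PumpUntilEnd are hypothetical, and the cancellation token and exception handling from the original loop are left out to keep it short.

```csharp
using System;
using System.IO;

public static class StreamPump
{
    // Hypothetical helper: copies source into destination until Read reports
    // end of stream (returns 0), then returns the total number of bytes copied.
    public static long PumpUntilEnd(Stream source, Stream destination, int bufferSize = 48000)
    {
        var buffer = new byte[bufferSize];
        long total = 0;

        while (true)
        {
            int read = source.Read(buffer, 0, buffer.Length);
            if (read == 0)
            {
                // End of stream: leave the loop cleanly instead of waiting for an exception.
                break;
            }

            destination.Write(buffer, 0, read);
            total += read;
        }

        return total;
    }
}
```

In the loop from the question, the equivalent change would be to set disposed = true (or break) when read is 0, next to the existing catch for the connection reset.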

1FriendlyDoge•16mo ago

How would that error? I am pretty sure that I used delays between identification and actual data transfer on my socket client for testing and never got any exception because of that.

Noah;OP•16mo ago

Found some people doing the same thing as me online.
https://stackoverflow.com/questions/68106636/using-a-memorystream-with-nets-system-speech-speechrecognitionengine-class
https://stackoverflow.com/questions/1682902/streaming-input-to-system-speech-recognition-speechrecognitionengine?rq=4
They don't seem to be having any issues... but I can't find anything that they're doing that I am not.

Stack Overflow

Using a MemoryStream with .NET's System.Speech SpeechRecognitionEng...

I am trying to use .NET's System.Speech SpeechRecognitionEngine object to recognize words spoken by a discord user in a voice channel. The raw pcm audio received by the bot is written to a MemorySt...

Stack Overflow

Streaming input to System.Speech.Recognition.SpeechRecognitionEngine

I am trying to do "streaming" speech recognition in C# from a TCP socket. The problem I am having is that SpeechRecognitionEngine.SetInputToAudioStream() seems to require a Stream of a defined length

Noah;OP•16mo ago

For what it's worth, the full code:

message.cs

Noah;OP•16mo ago

Still very stumped on this. All help is appreciated :) Just hoping this isn't impossible to fix.

solved
