Modular · 2mo ago
fyfe93

Is MAX real-time safe? (Or can it be configured to be?)

Hey all! I work on real-time audio applications and wanted to ask outright: is MAX real-time safe? (Or can it be configured to be?) I currently use other frameworks like onnxruntime and libtorch in C++ to deploy models, but because those runtimes aren't real-time safe, I'm forced into a non-ideal setup: inference runs on a background thread, with lock-free queues between that thread and the audio thread. It would be a game changer if the MAX engine were real-time safe. I would probably use the C API, as there's no C++ API yet, right?
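For anyone unfamiliar with the workaround described above, here is a rough sketch of the kind of lock-free queue that usually sits between an inference thread and the audio thread. It is a generic single-producer/single-consumer ring buffer, not code from onnxruntime, libtorch, or MAX; the name `SpscQueue` and its methods are made up for illustration, and it assumes C++17 and a trivially copyable element type.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// Hypothetical single-producer/single-consumer queue (illustrative names).
// Storage is fixed at compile time, so push and pop never allocate or lock.
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a non-zero power of two");

public:
    // Call only from the producer thread (e.g. the background inference thread).
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) {
            return false;  // queue is full; the caller decides what to drop
        }
        slots_[head & (Capacity - 1)] = value;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Call only from the consumer thread (e.g. the audio callback).
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) {
            return std::nullopt;  // queue is empty
        }
        T value = slots_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return value;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // written only by the producer
    std::atomic<std::size_t> tail_{0};  // written only by the consumer
};
```

The audio callback would call `try_pop()` and fall back to the previous result (or silence) when it returns empty, which is where the extra latency of this approach comes from.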
3 Replies
Darkmatter · 2mo ago
This requires someone from Modular to answer it, but I think they'll need some extra information to do that. The bit that I can answer is that there is no public C++ API right now.
1. How are you defining real time? Real time is a spectrum ranging from "a human will notice if this is slow" to "failure to finish before time T is the result of a hardware failure".
2. CPU or GPU inference? I'm not actually sure if most GPUs can give real-time guarantees at all, and some operations like spawning a process (for a CPU threadpool) are difficult to do under some definitions of real time.
For CPU inference, if you don't need multiple threads, you may be able to prod the compiler into making it a normal function call, in which case it should run in a fixed number of CPU cycles, ignoring cache effects. In the worst case, you may be able to use MAX once, ask the compiler for the equivalent assembly for the architectures you care about, and then inline that into C++.
fyfe93 · 2mo ago
@Owen Hilyard Thanks for your input! Just to clarify, real-time audio sits at the hard end of that spectrum: “failure to finish before time T is the result of a hardware failure”. Missing the deadline results in discontinuities and undefined behaviour in the audio stream. To give further information, what I want to know is whether MAX does any dynamic memory allocations or system calls during inference, as either would violate real-time safety. I’m also referring to performing inference on CPU on a single high-priority thread, usually the one dedicated as the audio thread. It would be great if someone from the MAX team could comment on this.
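Until someone from the team confirms, one rough way to check the allocation half of this yourself is to count heap allocations around a call. The sketch below is a generic C++ technique, not a MAX API: `count_allocations` and `g_alloc_count` are hypothetical names, and the inference call you would wrap is left to you. Big caveat: replacing `operator new` only catches C++ allocations made through the replaced operator in your own process. It does not cover the aligned `operator new` overloads, and it will not see `malloc` calls or system calls made inside a prebuilt runtime library, which would need something like malloc interposition or a syscall tracer instead.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical global counter, bumped on every call to the replaced operator new.
static std::atomic<std::size_t> g_alloc_count{0};

// Replace the global (unaligned) operator new so each allocation is counted.
void* operator new(std::size_t size) {
    g_alloc_count.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size != 0 ? size : 1)) {
        return p;
    }
    throw std::bad_alloc{};
}

// Matching deallocation functions for memory obtained from malloc above.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Runs fn once and returns how many counted allocations happened during it.
template <typename Fn>
std::size_t count_allocations(Fn&& fn) {
    const std::size_t before = g_alloc_count.load(std::memory_order_relaxed);
    fn();
    return g_alloc_count.load(std::memory_order_relaxed) - before;
}
```

You would warm the model up first, then wrap a single inference call in `count_allocations` and expect zero. A non-zero count means the call is definitely not safe for the audio thread; a zero count is only weak evidence, given the caveat above.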