RunPod · 10mo ago
maywell

Any plans to add other inference engines?

Hi, I'm using the vLLM worker now, but when it comes to quantized models vLLM works poorly: too much VRAM usage, slow inference, poor output quality, etc. So, are there any plans to add other engines like TGI or ExL2?
1 Reply
Alpay Ariyak · 10mo ago
Potentially in the future; it's currently not a priority in the short term.