Mike
Help deploying LLaVA Flask API
I'm trying to create a LLaVA endpoint I can use in my project so I can assess 5 million photos with a Node script, similar to how I'm currently doing it locally with Ollama. I'm looking to deploy the 7b model on an RTX 4000 on GPU Cloud (not Serverless) to keep costs down. Speed matters to me as much as cost, so I'd ideally like to process multiple images at once; any advice welcome.
After speaking to the author of the LLaVA RunPod template, he recommended I use the Flask method below, but I'm not sure how I'd go about getting this deployed as I'm new to backend work. Is anybody able to help with some initial steps?
https://github.com/ashleykleynhans/LLaVA/tree/main?tab=readme-ov-file#flask-api-inference
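For context, here's roughly how I'm picturing the Node side once the API is up. This is just a minimal sketch assuming the Flask endpoint takes a JSON body with a base64 image and a prompt and returns a JSON field with the text, I haven't confirmed the actual URL, port, or payload shape from the repo:

```ts
// Rough sketch of my Node caller (Node 18+, so global fetch is available).
// ENDPOINT, the request fields, and the response field are placeholders --
// I still need to check what the Flask API in the template actually expects.
import { readFile } from "node:fs/promises";

const ENDPOINT = "http://<pod-ip>:5000/describe"; // hypothetical URL/port
const CONCURRENCY = 4; // how many requests to keep in flight at once

async function describeImage(path: string): Promise<string> {
  const image = await readFile(path);
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt: "Describe this photo in one sentence.",
      image: image.toString("base64"), // assumed field name
    }),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${path}`);
  const data = (await res.json()) as { response?: string }; // assumed shape
  return data.response ?? "";
}

// Simple worker pool: several images are processed concurrently without
// flooding the single GPU with thousands of simultaneous requests.
async function processAll(paths: string[]): Promise<void> {
  let next = 0;
  const workers = Array.from({ length: CONCURRENCY }, async () => {
    while (next < paths.length) {
      const path = paths[next++];
      try {
        console.log(path, await describeImage(path));
      } catch (err) {
        console.error(path, err);
      }
    }
  });
  await Promise.all(workers);
}
```

The worker-pool part is just so I can keep a handful of requests in flight against one GPU; happy to change that if batching on the server side is the better approach.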
#LLaVA: Large Language and Vision Assistant #⛅|gpu-cloud
2 replies