Best Mixtral/LLaMA2 LLM for code-writing, inference, 24 to 48 GB?

Good evening all you experts! I'm past the pain-and-suffering stage and into the finesse-and-finishing stage. What is the best class of models for basic inference, in particular formulating simple commands from a set of simple rules, that will fit into a 24 GB (or 48 GB, if much better) RunPod pod?
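For reference, a rough way to gauge what fits: VRAM needed is roughly parameter count × bytes per parameter, plus headroom for the KV cache and activations. A minimal sketch (the 1.2× overhead factor and the quantization figures are ballpark assumptions, not measurements):

```python
def estimate_vram_gb(params_billion: float, bytes_per_param: float, overhead: float = 1.2) -> float:
    """Very rough serving footprint: weights plus headroom for KV cache/activations."""
    return params_billion * bytes_per_param * overhead  # 1B params at 1 byte ~= 1 GB

# fp16 = 2 bytes/param; a 4-bit quant ~= 0.5 bytes/param
for name, params in [("7B (Mistral/LLaMA-2)", 7), ("Mixtral 8x7B", 47), ("LLaMA-2 70B", 70)]:
    print(f"{name}: fp16 ~{estimate_vram_gb(params, 2.0):.0f} GB, "
          f"4-bit ~{estimate_vram_gb(params, 0.5):.0f} GB")
```

By this estimate a 7B model fits a 24 GB card even at fp16, while Mixtral 8x7B only fits 48 GB in a 4-bit quant.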
4 Replies
Unknown User 7mo ago
(Message not public)
Stone Johnson 7mo ago
Thanks a ton! Will try it out
Alpay Ariyak 7mo ago
OpenChat-3.5-1210 :)
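For anyone landing here later, a minimal serving sketch, assuming the Hugging Face repo `openchat/openchat-3.5-1210` and the `transformers` library (the prompt is illustrative only):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openchat/openchat-3.5-1210"  # 7B Mistral fine-tune

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~14 GB of weights, fits a 24 GB card
    device_map="auto",
)

# OpenChat ships a chat template in its tokenizer config; apply it, then generate
messages = [{"role": "user", "content": "Rewrite 'lights off in room 3' as a device command."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```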
Stone Johnson 7mo ago
Gonna try it! Do I understand correctly that it's a 7B model, so it probably runs in 16 GB?
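Back-of-envelope, assuming fp16 weights: 7e9 params × 2 bytes ≈ 14 GB, so 16 GB covers the weights but leaves only ~2 GB for the KV cache and activations; a 4-bit quant (~3.5 GB of weights) fits far more comfortably.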