How to deploy Llama3 on Aphrodite Engine (RunPod)
I have set up the following settings for a pod with 48 GB of RAM.
1) I'm not sure how to enable the Q4 cache; otherwise the 5.0bpw quant won't fit. Any advice, please? (See attached.)
2) I get an error that config.json can't be found. It seems like the REVISION variable is not being taken into account. The docs say:
REVISION: The HuggingFace branch name; it defaults to the main branch.
I think that's a bug.
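For context, here is roughly what the relevant part of my pod environment looks like. This is only an illustration: REVISION is the variable from the template docs, while the variable name MODEL_NAME and the repo/branch values are hypothetical placeholders, not necessarily the template's real names.

```
# Hypothetical RunPod template environment (illustrative values only;
# REVISION is documented, the other names are placeholders)
MODEL_NAME=turboderp/Llama-3-70B-Instruct-exl2
REVISION=5.0bpw   # HuggingFace branch name; the docs say it defaults to main
```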
Or if it's a bug, I suggest just opening an issue on the template's GitHub.
Ah yes, there is already an issue about this, and a patch has been proposed:
https://github.com/PygmalionAI/aphrodite-engine/issues/318#issuecomment-2088895545
Do you see how simple the fix is? 😄 They just forgot to add this parameter:
revision=self.model_config.revision
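In other words, the bug boils down to a revision argument that is accepted but never forwarded to the download call. Here is a minimal Python sketch of that pattern; the class and function names are hypothetical stand-ins, not Aphrodite's actual code:

```python
# Sketch of the bug and the one-line fix (hypothetical names throughout).
# fetch_config stands in for something like hf_hub_download(repo,
# "config.json", revision=...): if revision isn't forwarded, the file is
# always fetched from the default "main" branch.

def fetch_config(repo: str, revision: str = "main") -> str:
    # Stand-in for the actual HuggingFace download call.
    return f"{repo}@{revision}/config.json"

class BuggyLoader:
    def __init__(self, repo: str, revision: str):
        self.repo = repo
        self.revision = revision

    def load(self) -> str:
        # Bug: self.revision is never passed, so "main" is always used.
        return fetch_config(self.repo)

class FixedLoader(BuggyLoader):
    def load(self) -> str:
        # Fix from the linked issue: forward revision=self.revision.
        return fetch_config(self.repo, revision=self.revision)

buggy = BuggyLoader("meta-llama/Meta-Llama-3-8B", "5.0bpw")
fixed = FixedLoader("meta-llama/Meta-Llama-3-8B", "5.0bpw")
print(buggy.load())  # meta-llama/Meta-Llama-3-8B@main/config.json (wrong branch)
print(fixed.load())  # meta-llama/Meta-Llama-3-8B@5.0bpw/config.json
```

With the buggy version, any quantized branch (like a 5.0bpw exl2 branch) never gets its config.json, which matches the "config.json can't be found" error above.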
The patch can be applied to the latest release, v0.5.2. Do you think you could do that? That would be amazing.
Do a quick PR if you're sure about it; sorry, I don't want to do that myself.
Create a fork of the GitHub repo
Edit your own fork (using the web editor)
Then create a PR
Solution
Sure, I just made a PR. Please have a look:
https://github.com/PygmalionAI/aphrodite-engine/pull/455
Do you think you could cherry pick this fix for RunPod?
@nerdylive
Thanks
Ah, hold on, it seems someone replied asking to rebase it onto dev.
Try doing that and they will merge your PR.
Ah is that you? LOL
No no
That's not me
sure, what timing
Yeah I just happened to open that just now
cool. I'll do it now. Thanks
Yep np
Look at the cool green button on my screen; it doesn't look like I just replied, hahah
yes 🙂 I was joking
same
all good now on the engine?
Ok, he just approved and merged it into the dev branch. Now we need to wait until he makes a release. 😦
It's ok. Hopefully he can do it in the next 2 weeks, before we go live.
nice