RunPod•12mo ago

How to deploy Llama3 on Aphrodite Engine (RunPod)

I have set up the following settings for a pod with 48 GB RAM. 1) I'm not sure how to enable the Q4 cache; without it the 5.0bpw quant won't fit. Any advice, please? (See attached) 2) I get an error that config.json can't be found. It seems the REVISION variable has not been taken into account. The docs say: "REVISION: The HuggingFace branch name, it defaults to the main branch." I think that's a bug.
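To put rough numbers on the memory concern in point 1, here is a back-of-the-envelope sketch in Python. It assumes a round 70e9 parameters, treats the 48 GB as 48e9 bytes, and uses the published Llama 3 70B attention shape (80 layers, 8 KV heads, head dimension 128). The helper names are made up for illustration, and the estimate ignores activations, CUDA context and quantization scale overhead, so real headroom will be smaller.

```python
# Rough VRAM budget for a 5.0bpw 70B model on a 48 GB card.
# All figures are estimates; helper names are illustrative only.

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int,
                             head_dim: int, bits_per_elem: int) -> float:
    """Bytes of KV cache one token occupies (factor 2 = keys + values)."""
    return 2 * n_layers * n_kv_heads * head_dim * bits_per_elem / 8


GB = 1e9
weights = weight_bytes(70e9, 5.0)   # assumed parameter count, 5.0 bits each
left_over = 48 * GB - weights       # what remains for the KV cache, at best

print(f"weights: {weights / GB:.2f} GB, left over: {left_over / GB:.2f} GB")

# Compare how many cached tokens fit at different cache precisions.
# The 4-bit row ignores the extra scale data a real Q4 cache would store.
for name, bits in [("fp16", 16), ("fp8", 8), ("4-bit", 4)]:
    per_token = kv_cache_bytes_per_token(80, 8, 128, bits)
    print(f"{name}: {per_token / 1024:.0f} KiB per token, "
          f"about {int(left_over // per_token)} tokens")
```

The point of the comparison is only the ratio: each halving of the cache precision roughly doubles the context that fits in whatever is left after the weights.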

2024-05-06T19:18:00.996748225Z Starting Aphrodite Engine API server...
2024-05-06T19:18:00.996849854Z + exec python3 -m aphrodite.endpoints.openai.api_server --host 0.0.0.0 --port 7860 --download-dir /tmp/hub --model turboderp/Llama-3-70B-Instruct-exl2 --revision 5.0bpw --kv-cache-dtype fp8_e5m2 --gpu-memory-utilization 1.0 --enforce-eager --max-log-len 0
2024-05-06T19:18:03.870671479Z Traceback (most recent call last):
2024-05-06T19:18:03.870689139Z   File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
2024-05-06T19:18:03.870694559Z     response.raise_for_status()
2024-05-06T19:18:03.870698449Z   File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
2024-05-06T19:18:03.870701649Z     raise HTTPError(http_error_msg, response=self)
2024-05-06T19:18:03.870705949Z requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/turboderp/Llama-3-70B-Instruct-exl2/resolve/main/config.json
2024-05-06T19:18:03.870710549Z
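The 404 in the traceback is for a URL under /resolve/main/, even though the command line passes --revision 5.0bpw. A small Python sketch of how such a URL is put together makes the mismatch easier to see. The function name resolve_url is hypothetical; only the URL pattern is taken from the log above.

```python
from urllib.parse import quote


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a Hugging Face 'resolve' URL for one file on one branch.

    Hypothetical helper that mirrors the URL pattern seen in the traceback.
    """
    return (
        f"https://huggingface.co/{repo_id}"
        f"/resolve/{quote(revision, safe='')}/{filename}"
    )


repo = "turboderp/Llama-3-70B-Instruct-exl2"

# What the server requested: the default branch, which matches the 404 URL.
print(resolve_url(repo, "config.json"))

# What would be requested if the revision were honoured.
print(resolve_url(repo, "config.json", revision="5.0bpw"))
```

If the revision is dropped anywhere along the way, the request silently falls back to the default branch, which is consistent with the error shown.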

18 Replies

Jason•12mo ago

Or if it's a bug, I suggest just opening an issue on the GitHub repo of the template.

houmieOP•12mo ago

Ah yes, there is already an issue about this and a patch has been proposed: https://github.com/PygmalionAI/aphrodite-engine/issues/318#issuecomment-2088895545 Do you see how simple the fix is? 😄 They just forgot to pass this parameter: revision=self.model_config.revision. The patch can be applied to the latest release, v0.5.2. Do you think you could do that? That would be amazing.
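As an illustration of this class of bug, here is a self-contained Python sketch. It is not the Aphrodite Engine code: ModelConfig, Loader and fetch_config are hypothetical stand-ins, and only the keyword argument revision=self.model_config.revision comes from the proposed patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the engine's model configuration."""
    model: str
    revision: Optional[str] = None


def fetch_config(repo_id: str, revision: Optional[str] = None) -> str:
    """Stand-in for a hub download; reports which branch would be read."""
    return f"{repo_id}@{revision or 'main'}"


class Loader:
    """Hypothetical loader showing the call with and without the argument."""

    def __init__(self, model_config: ModelConfig) -> None:
        self.model_config = model_config

    def load_without_revision(self) -> str:
        # Revision not forwarded, so the callee falls back to "main".
        return fetch_config(self.model_config.model)

    def load_with_revision(self) -> str:
        # The one-argument fix: forward the configured revision.
        return fetch_config(
            self.model_config.model,
            revision=self.model_config.revision,
        )


loader = Loader(ModelConfig("turboderp/Llama-3-70B-Instruct-exl2", "5.0bpw"))
print(loader.load_without_revision())
print(loader.load_with_revision())
```

Because the callee has a sensible default, nothing fails at the call site; the problem only surfaces when the default branch lacks the file.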

Jason•12mo ago

Do a quick PR if you're sure about that. Sorry, I don't want to do that.
Create a fork of the GitHub repo
Edit your own fork (using the web editor)
Then create a PR

Solution

houmie•12mo ago

Sure, I just made a PR. Please have a look: https://github.com/PygmalionAI/aphrodite-engine/pull/455 Do you think you could cherry-pick this fix for RunPod?

houmieOP•12mo ago

@nerdylive Thanks

houmieOP•12mo ago

Ah hold on, it seems someone replied asking me to rebase it onto dev.

Jason•12mo ago

Try to do that and they will merge your pr

houmieOP•12mo ago

Ah is that you? LOL

Jason•12mo ago

No no, that's not me.

houmieOP•12mo ago

Sure, what timing.

Jason•12mo ago

Yeah I just happened to open that just now

houmieOP•12mo ago

cool. I'll do it now. Thanks

Jason•12mo ago

Yep, np. Look at the cool green button on my screen, that doesn't look like I just replied, hahah.

houmieOP•12mo ago

yes 🙂 I was joking

Jason•12mo ago

Same. All good now on the engine?

houmieOP•12mo ago

OK, he just approved and merged it into the dev branch. Now we need to wait until he makes a release. 😦 It's OK. Hopefully he can do it in the next 2 weeks, before we go live.

Jason•12mo ago

nice
