Serverless 404
Hi there,
I'm getting a 404 error when sending requests to a development session (runpodctl project dev). Everything worked great locally using --rp_serve_api; the only difference is that I changed the URL from localhost to https://api.runpod.ai/v2/my_pod_id/runsync and added the authentication key for the deployment. I'm using Postman to send the request.
Has anyone faced this problem? I can't figure out what I'm doing wrong.
17 Replies
You can't use my_pod_id; it must be a serverless endpoint, not a pod.
you're right, thx @ashleyk
I misunderstood it
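For anyone landing here with the same 404, here is a minimal sketch of the request shape, assuming the URL pattern from the original post with a serverless endpoint ID in place of the pod ID. The names `build_runsync_request`, `my_endpoint_id` and `MY_API_KEY` are placeholders made up for illustration, and the `audio_base64` field is just the placeholder input mentioned in this thread, not a required schema. The request is only built here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def build_runsync_request(endpoint_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to a serverless endpoint's /runsync route.

    endpoint_id must be the ID of a serverless endpoint, not a pod ID.
    """
    url = f"{API_BASE}/{endpoint_id}/runsync"
    # The handler input is wrapped under the "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values, for illustration only.
req = build_runsync_request("my_endpoint_id", "MY_API_KEY", {"audio_base64": "<placeholder>"})
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen` (or mirror the same URL, header and body in Postman).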
@ashleyk not related but do you happen to know if it's mandatory to use rp_cuda? My worker is getting stuck and I don't see GPU usage ramping up
What is rp_cuda?
https://github.com/runpod-workers/worker-faster_whisper/blob/main/src/predict.py
a runpod replacement for torch.cuda
found it in this repo
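A rough sketch of how a worker could pick its device with or without rp_cuda. This assumes the helper is importable as `runpod.serverless.utils.rp_cuda` and exposes `is_available()`, which is how I remember the linked predict.py using it; please verify against the repo. `pick_device` is a made-up name, and the fallback to `torch.cuda` is my own addition, not something the thread confirms is equivalent.

```python
def pick_device() -> str:
    """Return "cuda" if a GPU check succeeds, otherwise "cpu"."""
    # Assumption: rp_cuda lives at this import path and has is_available().
    try:
        from runpod.serverless.utils import rp_cuda
        return "cuda" if rp_cuda.is_available() else "cpu"
    except ImportError:
        pass
    # Fallback: the plain PyTorch check, if torch is installed.
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        return "cpu"


device = pick_device()
print(f"Using device: {device}")
```

Logging the chosen device at worker startup makes it easier to tell a worker that silently fell back to CPU apart from one that is stuck.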
I'm also doing STT
rn I can't even runsync from here
(don't mind what's inside audio_base64, I put that as a placeholder only)
had to cancel all requests manually
@Marut
@JHenriP Are you still facing the issue with the worker ?
Haven't tried since then. Will try again later on today
Still facing the same issue
@Marut
Can you share error? Setup? It should be helpful.
There's no error actually, simply nothing happens and looking at the worker utilization nothing is ramping up.
Base image: runpod/base:0.6.1-cuda12.2.0
Requirements:
torch
hf_transfer
accelerate
flash-attn
transformers
runpod
Everything worked fine on local deployment
Let me check & try to reproduce!
@Marut any updates?
Hey, It works fine. I tested.
What is this? Is this something different than the faster whisper link you shared above?
@Merrell @Marut @ashleyk ended up being a problem with flash-attn 🙂
With base image runpod/base:0.6.1-cuda12.2.0 and an A4000, apparently you can't have flash-attn in the requirements.txt
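If you drop flash-attn from requirements.txt but still want to use it where it happens to be installed, one option is to choose the attention backend at startup. This is only a sketch: `choose_attn_implementation` is a made-up helper, and it assumes a transformers version whose `from_pretrained` accepts `attn_implementation` with the values `"flash_attention_2"` and `"sdpa"`. It checks only that the package is installed, not that it works on the GPU in use.

```python
import importlib.util


def choose_attn_implementation() -> str:
    """Pick an attention backend name based on whether flash_attn is installed."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # PyTorch's built-in scaled dot-product attention as the fallback.
    return "sdpa"


attn = choose_attn_implementation()
print(f"attn_implementation={attn}")

# Hypothetical usage when loading the model (model_id is a placeholder):
# model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, attn_implementation=attn)
```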