deanQ
RunPod
Created by Keffisor21 on 10/3/2024 in #⚡|serverless
Job timeout constantly (bug?)
Hi. Please file a support ticket and mention this thread so that you can share more info that would help us determine what's going on and how to fix it. Feel free to mention me on your tickets. Thank you.
21 replies
RunPod
Created by rougsig on 9/27/2024 in #⚡|serverless
Stuck IN_PROGRESS but job completed and worker exited
"I was referring to this "payload_size_bytes": 0 <-- seems sus?" It will always be zero for GET requests. A payload only exists for POST or PUT requests.
17 replies
RunPod
Created by rougsig on 9/27/2024 in #⚡|serverless
Stuck IN_PROGRESS but job completed and worker exited
Not the best of logs. That field actually refers to the size of the body payload on POST or PUT requests. GET requests have none.
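To illustrate why that field reads zero for a GET, here is a minimal Python sketch. `payload_size_bytes` is a hypothetical helper written for this example, not the platform's logging code; it assumes the field simply measures the encoded request body.

```python
import json
from typing import Optional


def payload_size_bytes(method: str, body: Optional[dict] = None) -> int:
    """Hypothetical helper: size of the request body in bytes.

    Only POST and PUT are treated as carrying a body here, matching
    the explanation above; every other method reports zero.
    """
    if method.upper() not in ("POST", "PUT") or body is None:
        return 0
    return len(json.dumps(body).encode("utf-8"))


# A status poll is a GET, so there is no body to measure.
get_size = payload_size_bytes("GET")
# A job submission is a POST with a JSON body, so the size is non-zero.
post_size = payload_size_bytes("POST", {"input": {"prompt": "hello"}})
```

Under that assumption, a zero on a status request says nothing about the job output; it only reflects that the request itself had no body.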
17 replies
RunPod
Created by rougsig on 9/27/2024 in #⚡|serverless
Stuck IN_PROGRESS but job completed and worker exited
I have looked at the logs of your endpoint 74jm2u3liu0pcy. They show it has been using 1.6.2 all week. Could it be a different endpoint ID?
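One way to confirm which SDK version a worker image actually has installed is to ask Python's package metadata from inside the container. This is only a sketch: `installed_version` and `version_tuple` are helper names made up for this example, and it assumes the package is installed under the distribution name `runpod`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str = "runpod") -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric components, e.g. '1.7.2' -> (1, 7, 2)."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev1' or 'rc1'
        parts.append(int(piece))
    return tuple(parts)


found = installed_version()
if found is None:
    print("runpod is not installed in this environment")
elif version_tuple(found) < (1, 7, 2):
    print(f"runpod {found} is older than 1.7.2")
else:
    print(f"runpod {found}")
```

Printing this at worker startup makes it easy to compare against what the endpoint logs report.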
17 replies
RunPod
Created by rougsig on 9/27/2024 in #⚡|serverless
Stuck IN_PROGRESS but job completed and worker exited
We recently fixed a bug and released the fix in 1.7.2. The bug caused our platform to disregard workers that were currently working on a job. So if a job took longer than an endpoint's idle timeout (for example), the platform would put that worker to sleep. By the time the job finished, there would be no worker to report back.
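The failure mode described above can be sketched as a toy model in Python. None of this is the platform's real code: the `Worker` class, its fields, and both functions are hypothetical and exist only to contrast the buggy check with the corrected one.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Toy worker state; all fields are illustrative only."""
    busy: bool                        # True while a job is running
    seconds_since_last_pickup: float  # time since it last took a job


def should_sleep_buggy(worker: Worker, idle_timeout: float) -> bool:
    # Mirrors the bug as described: the busy flag is ignored, so a long
    # job looks like idleness once it outlasts the idle timeout.
    return worker.seconds_since_last_pickup > idle_timeout


def should_sleep_fixed(worker: Worker, idle_timeout: float) -> bool:
    # A worker that is still running a job is never put to sleep.
    if worker.busy:
        return False
    return worker.seconds_since_last_pickup > idle_timeout


# A job that has been running for 30 s on an endpoint with a 5 s idle timeout.
long_job = Worker(busy=True, seconds_since_last_pickup=30.0)
buggy_decision = should_sleep_buggy(long_job, idle_timeout=5.0)
fixed_decision = should_sleep_fixed(long_job, idle_timeout=5.0)
```

In this model the buggy check sleeps the worker mid-job, which is how a job can finish with nothing left to report its result.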
17 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
This is exactly where flash-boot should help. I’ll investigate what I can about this.
56 replies
RunPod
Created by Keffisor21 on 10/3/2024 in #⚡|serverless
Job timeout constantly (bug?)
1.7.2 is officially the latest release as of today
21 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
With a setup like this, you will face cold start issues. For example, if you have a burst of consecutive jobs coming in, workers will stay alive and take those jobs. The moment there is a gap of a second or two without a job, your workers will go to sleep. Any job that comes in after that will have to wait in the queue until a worker is ready. By ready I mean flash-booted or fully booted as a new worker. A few extra seconds of idle timeout will not cost you more, and will guarantee quick job pickup across the gaps. Incurring cold start and boot times will end up costing you more time in total.
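The trade-off above can be made concrete with a small back-of-the-envelope model in Python. The gap lengths and the 10-second cold start are invented numbers for illustration, and `total_queue_delay` is a hypothetical helper; real boot times vary by image and by whether flash-boot kicks in.

```python
from typing import Iterable


def total_queue_delay(gaps: Iterable[float], idle_timeout: float,
                      cold_start: float) -> float:
    """Sum the extra wait caused by workers sleeping between jobs.

    gaps:         seconds of silence before each job arrives
    idle_timeout: seconds a worker stays awake with no job
    cold_start:   seconds to bring a sleeping worker back (assumed constant)
    """
    # A job pays the cold start only if the gap outlasted the idle timeout.
    return sum(cold_start for gap in gaps if gap > idle_timeout)


gaps = [0.5, 2.0, 1.5, 0.2, 3.0]  # made-up gaps between incoming jobs
delay_1s = total_queue_delay(gaps, idle_timeout=1.0, cold_start=10.0)
delay_5s = total_queue_delay(gaps, idle_timeout=5.0, cold_start=10.0)
```

With these assumed numbers, three of the five gaps exceed a 1-second timeout and each one triggers a cold start, while a 5-second timeout rides through all of them.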
56 replies
RunPod
Created by Keffisor21 on 10/3/2024 in #⚡|serverless
Job timeout constantly (bug?)
FYI: v1.7.2 is on pre-release while I do some final tests https://github.com/runpod/runpod-python/releases/tag/1.7.2
21 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
FYI: v1.7.2 is on pre-release while I do some final tests https://github.com/runpod/runpod-python/releases/tag/1.7.2
56 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
Today. I’m just running some final testing.
56 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
Yes. You can do that from the main branch if you’d like to test it out. Override the Container Start Command with something like:
/bin/bash -c "apt-get update && \
apt-get install -y git && \
pip install git+https://github.com/runpod/runpod-python && \
<insert Dockerfile CMD here>"
56 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
Alright. That should be fixed with the 1.7.2 release. I’ll let you know when it’s out.
56 replies
RunPod
Created by 1AndOnlyPika on 10/5/2024 in #⚡|serverless
Flashboot not working
Has it always been at a 1-second idle timeout? There’s a bug in 1.7.1 that affects tasks running longer than the idle timeout. That is being fixed in 1.7.2, which is releasing soon. See PR https://github.com/runpod/runpod-python/pull/362
56 replies