RunPod
•Created by deepblhe on 11/7/2024 in #⚡|serverless
(Flux) Serverless inference crashes without logs.
Hi All!
I've built a FLUX inference container on RunPod serverless.
It works sometimes, but I get a lot of random failures, and RunPod does not return the error logs.
E.g. this is the response:
'''
{
"delayTime": 151019,
"error": "job timed out after 1 retries",
"executionTime": 102002,
"id": "64de56ee-4af2-4c64-ab84-02d4a7e81593-u1",
"retries": 1,
"status": "FAILED",
"workerId": "1qjtmj861f1278"
}
'''
But no error log is reported, either in the console or in the response, about what caused the job to retry the first time.
Also, the timeout should be one hour, but I get this message after a few minutes.
I have also added a Telegram bot for logging, but no exception is captured there either. Did the machine just crash?
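One way to rule out silent Python-level crashes is to wrap the handler so every exception is logged with a full traceback before the job fails. A minimal sketch, where `run_flux` is a hypothetical stand-in for the real FLUX pipeline call:

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("flux-worker")

# Hypothetical inference function; replace with the real FLUX pipeline call.
def run_flux(prompt):
    return {"image": f"generated for: {prompt}"}

def handler(job):
    """Log the full traceback before re-raising, so a Python-level
    crash is never silent even if the platform drops the error."""
    try:
        prompt = job["input"]["prompt"]
        return run_flux(prompt)
    except Exception:
        log.error("job %s failed:\n%s", job.get("id"), traceback.format_exc())
        raise

print(handler({"id": "local", "input": {"prompt": "a red fox"}}))
```

Note that this only catches Python exceptions: an out-of-memory kill terminates the whole process and produces no log at all, which would match the symptoms of a silent crash.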
Have you experienced the same?
10 replies
RunPod
•Created by deepblhe on 10/16/2024 in #⚡|serverless
RunPods Serverless - Testing Endpoint in Local with Docker and GPU
I’m creating a custom container to run FLUX and LoRA on RunPod, using this Stable Diffusion example as a starting point. I successfully deployed my first pod on RunPod, and everything worked fine.
However, my issue arises when I make code changes and want to test my endpoints locally before redeploying. Constantly deploying to RunPod for every small test is quite time-consuming.
I found a guide for local testing in the RunPod documentation here. Unfortunately, it only provides a simple example that suggests running the handler function directly, like this:
python your_handler.py --test_input '{"input": {"prompt": "The quick brown fox jumps"}}'
This does not work for me, as it ignores the Docker setup entirely and runs the function in my local Python environment. I want to go beyond this and test the Docker image end-to-end locally, on my GPU, with the exact dependencies and setup used when deploying on RunPod.
Is there specific documentation for testing Docker images locally for RunPod, or a recommended workflow for this kind of setup?
I tried following the guidelines for local testing here: https://docs.runpod.io/serverless/workers/development/local-testing
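For reference, the end-to-end loop I have in mind looks like the sketch below: build the same image you would push, then run it with GPU passthrough. The image tag and handler filename are placeholders, and the Docker steps are guarded so the snippet is a no-op on machines without Docker or a Dockerfile:

```shell
#!/usr/bin/env bash
set -euo pipefail

IMAGE=flux-worker   # hypothetical local tag for the worker image
PAYLOAD='{"input": {"prompt": "The quick brown fox jumps"}}'

# Guard: only attempt the Docker steps where they can actually run.
if command -v docker >/dev/null 2>&1 && [ -f Dockerfile ]; then
  # Build exactly what you would push to RunPod.
  docker build -t "$IMAGE" .
  # --gpus all exposes the local GPU (requires the NVIDIA Container Toolkit);
  # the handler's own --test_input flag runs one job and exits.
  docker run --rm --gpus all "$IMAGE" \
    python -u your_handler.py --test_input "$PAYLOAD"
fi
```

This at least exercises the real image and dependencies rather than the local Python environment.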
11 replies
RunPod
•Created by deepblhe on 10/14/2024 in #⚡|serverless
Testing Endpoint in Local with Docker and GPU
I’m working on creating a custom container to run FLUX and LoRA on RunPod, using this Stable Diffusion example as a starting point. I successfully deployed my first pod on RunPod, and everything worked fine.
However, my issue arises when I make code changes and want to test my endpoints locally before redeploying. Constantly deploying to RunPod for every small test is quite time-consuming.
I found a guide for local testing in the RunPod documentation here (https://docs.runpod.io/serverless/workers/development/local-testing). Unfortunately, it only provides a simple example that suggests running the handler function directly, like this:
python your_handler.py --test_input '{"input": {"prompt": "The quick brown fox jumps"}}'
This does not work for me, as it ignores the Docker setup entirely and just runs the function in my local Python environment. I want to go beyond this and test the Docker image end-to-end locally, on my GPU, with the exact dependencies and setup that will be used when deploying on RunPod.
Does anyone know if there’s specific documentation for testing Docker images locally for RunPod, or a recommended workflow for this kind of setup? Any guidance on achieving this local environment would be greatly appreciated.
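A related trick that shortens the edit-test loop further: mount the local source into an already-built image, so code changes do not require a rebuild. A sketch, where the image tag and the `/app` working directory are assumptions (match your Dockerfile's WORKDIR), guarded to be a no-op when Docker or the image is absent:

```shell
#!/usr/bin/env bash
set -euo pipefail

IMAGE=flux-worker   # hypothetical tag of an image built once from this Dockerfile
SRC="$PWD"          # local checkout containing your_handler.py

if command -v docker >/dev/null 2>&1 \
   && docker image inspect "$IMAGE" >/dev/null 2>&1; then
  # Overlay the live source over the baked-in copy at /app, so edits to
  # the handler apply instantly without rebuilding the image.
  docker run --rm --gpus all -v "$SRC:/app" -w /app "$IMAGE" \
    python -u your_handler.py --test_input '{"input": {"prompt": "The quick brown fox jumps"}}'
fi
```

Heavy dependencies stay baked into the image; only the handler code is swapped in from the host.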
8 replies