andypotato
RunPod
Created by andypotato on 3/3/2025 in #⚡|serverless
Async workers not running
When I use the /run endpoint, I receive the usual response:
{
"id": "d0e6d88c-8274-4554-bb6a-0a469361ae20-e1",
"status": "IN_QUEUE"
}
However, the job never gets processed, even though workers are available. Some observations:
- A worker spins up and goes into "running" status, but rp_handler.py is never executed.
- When I check the status of the job via /status/<jobId>, the job immediately starts running.
- I can reproduce the exact same behavior using the local test version, so this is not limited to cloud usage.
- Running the exact same worker with /runsync works without problems.
I'm using the runpod SDK 1.7.7. How can I solve this issue?
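For readers following along, the /run then /status flow described above can be sketched as a small polling loop. This is only an illustration of the client side and says nothing about the root cause. The URL shape (https://api.runpod.ai/v2/<endpoint_id>/status/<jobId>), the Bearer authorization header, and the set of terminal status names are assumptions here, and `poll_job` and `make_http_fetcher` are hypothetical helper names, not part of the runpod SDK.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

# Assumed terminal states; only IN_QUEUE and COMPLETED appear in the posts.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def make_http_fetcher(endpoint_id: str, api_key: str) -> Callable[[str], Dict]:
    """Build a function that GETs /status/<job_id> (URL shape is an assumption)."""
    def fetch(job_id: str) -> Dict:
        request = urllib.request.Request(
            f"https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch


def poll_job(
    fetch_status: Callable[[str], Dict],
    job_id: str,
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_status until the job reaches a terminal status or polls run out."""
    for _ in range(max_polls):
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Usage would look like `poll_job(make_http_fetcher("<endpoint_id>", "<api_key>"), "<jobId>")`, with the id taken from the /run response. Passing the fetch function in as a parameter keeps the loop testable without network access.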
10 replies
RunPod
Created by andypotato on 2/22/2024 in #⚡|serverless
Returning error, but request has status "Completed"
Hello, I'm using validate() from rp_validator to validate my input data against a schema. The relevant code that triggers the error is:
validated_input = validate(input, INPUT_SCHEMA)
if "errors" in validated_input:
    return {"error": validated_input["errors"]}
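For context, here is a minimal sketch of how that check might sit inside a complete handler. So that it runs without the SDK installed, `validate` below is a simplified stand-in that only mimics the behavior the post relies on (returning a dict with an "errors" list, or a "validated_input" dict). The schema, the `prompt` field, the error message wording, and the echo output are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical schema with a single required string field.
INPUT_SCHEMA = {"prompt": {"type": str, "required": True}}


def validate(raw_input: Dict[str, Any], schema: Dict[str, Dict]) -> Dict[str, Any]:
    """Simplified stand-in for rp_validator.validate (not the real implementation)."""
    errors = []
    for name, rules in schema.items():
        if name not in raw_input:
            if rules.get("required"):
                errors.append(f"{name} is a required input.")
        elif not isinstance(raw_input[name], rules["type"]):
            errors.append(f"{name} should be of type {rules['type'].__name__}.")
    if errors:
        return {"errors": errors}
    return {"validated_input": dict(raw_input)}


def handler(job: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the job input and return an error dict when validation fails."""
    validated_input = validate(job["input"], INPUT_SCHEMA)
    if "errors" in validated_input:
        # Same shape as in the post: the list of messages under the "error" key.
        return {"error": validated_input["errors"]}
    return {"echo": validated_input["validated_input"]["prompt"]}
```

In a real worker, `validate` would instead be imported from `runpod.serverless.utils.rp_validator` and the handler registered with `runpod.serverless.start({"handler": handler})`. Calling `handler({"input": {}})` locally is a quick way to inspect exactly what gets returned when a parameter is missing.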
The validation works as expected when using a local test_input.json file: it passes when all parameters are present, and the job is marked as "Failed" when parameters are missing. However, when running on a serverless instance, the request is marked as completed even though the input validation returns an error. The log shows:
2024-02-22T08:11:23.867285503Z {"requestId": "sync-c03c0f59-b8d1-43e2-a0c9-c33466d24cd1-e1", "message": "Started.", "level": "INFO"}
2024-02-22T08:11:23.935203369Z {"requestId": "sync-c03c0f59-b8d1-43e2-a0c9-c33466d24cd1-e1", "message": "Failed to return job results. | 400, message='Bad Request', url=URL('https://api.runpod.ai/v2/1hl4lm8ob7fkf6/job-done/phuu0up50t2bqb/sync-c03c0f59-b8d1-43e2-a0c9-c33466d24cd1-e1?gpu=NVIDIA+RTX+A4500&isStream=false')", "level": "ERROR"}
2024-02-22T08:11:23.935242370Z {"requestId": "sync-c03c0f59-b8d1-43e2-a0c9-c33466d24cd1-e1", "message": "Finished.", "level": "INFO"}
This is the response:
{
"delayTime": 1026,
"executionTime": 169,
"id": "sync-c03c0f59-b8d1-43e2-a0c9-c33466d24cd1-e1",
"status": "COMPLETED"
}
Why is this COMPLETED and not FAILED? Thank you for any hints or pointers.
5 replies