Clone endpoint failing in UI

{
  "errors": [
    {
      "message": "Something went wrong. Please try again later or contact support.",
      "locations": [
        {
          "line": 1,
          "column": 23
        }
      ],
      "extensions": {
        "code": "BAD_USER_INPUT"
      }
    },
    {
      "message": "Something went wrong. Please try again later or contact support.",
      "locations": [
        {
          "line": 1,
          "column": 23
        }
      ],
      "extensions": {
        "code": "BAD_USER_INPUT"
      }
    }
  ]
}
User input, with sensitive information removed:
{"operationName":"saveEndpoint","variables":{"input":{"gpuIds":"AMPERE_48,ADA_48_PRO,-NVIDIA A40,-NVIDIA L40","gpuCount":1,"allowedCudaVersions":"","idleTimeout":5,"locations":null,"name":"myendpoint-dev (cloned)","networkVolumeId":null,"scalerType":"QUEUE_DELAY","scalerValue":4,"workersMax":1,"workersMin":0,"executionTimeoutMs":600000,"template":{"containerDiskInGb":25,"containerRegistryAuthId":"","dockerArgs":"","env":[{},{],"imageName":"my-imaage","name":"endpoint-dev (cloned)__template"}}},"query":"mutation saveEndpoint($input: EndpointInput!) {\n saveEndpoint(input: $input) {\n gpuIds\n id\n idleTimeout\n locations\n name\n networkVolumeId\n scalerType\n scalerValue\n templateId\n userId\n workersMax\n workersMin\n gpuCount\n __typename\n }\n}"}
17 Replies
zkreutzjanz (2mo ago)
seems to be related to "allowedCudaVersions" changing in some way
digigoblin (2mo ago)
I think it's better to remove allowedCudaVersions from your payload than to leave it empty
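For anyone sending the mutation from their own script rather than through the UI, the suggestion above could look roughly like the sketch below. The helper name `drop_empty_fields` is made up for illustration, the dict is a shortened excerpt of the `input` object from the request, and nothing in this thread confirms that dropping the field actually avoids the error.

```python
import json

def drop_empty_fields(variables_input, keys=("allowedCudaVersions",)):
    """Return a copy of the input without the listed keys when their value is an empty string."""
    return {
        name: value
        for name, value in variables_input.items()
        if not (name in keys and value == "")
    }

# Shortened excerpt of the "input" object from the request in this thread.
payload_input = {
    "gpuIds": "AMPERE_48,ADA_48_PRO,-NVIDIA A40,-NVIDIA L40",
    "gpuCount": 1,
    "allowedCudaVersions": "",
    "idleTimeout": 5,
}

cleaned = drop_empty_fields(payload_input)
print(json.dumps(cleaned, indent=2))
```

A non-empty value such as "12.1,12.2" is left in place, so this only changes requests where the field is blank.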
zkreutzjanz (2mo ago)
Can’t control that because it is how the UI is built
digigoblin (2mo ago)
oh damn, yeah sorry, oversight on my part. I think @nerdylive or @justin [Not Staff] can help by pinging the UI dev(s).
justin (2mo ago)
@zkreutzjanz @flash-singh Bug? @zkreutzjanz, if you want to provide more info, you could post it under #🧐|feedback as a [Bug]
flash-singh (2mo ago)
@zkreutzjanz can you share endpoint id? we can try to replicate using that
zkreutzjanz (2mo ago)
6fiz1j5a45xg0u. Looks like the only reason is that it now adds "allowedCudaVersions":"", which is invalid GraphQL maybe?
flash-singh (2mo ago)
oh empty one is not allowed, @Zeke
zkreutzjanz (2mo ago)
But FYI, it also fails when you add a value, see: "allowedCudaVersions":"12.1,12.2"
kaj (2mo ago)
hm, I can't seem to replicate this on my own, can you detail what steps in the UI you take to reproduce this issue?
kaj (2mo ago)
empty string works just fine for me
kaj (2mo ago)
doesn't seem to be allowedCudaVersions causing the issue. Might it be that the GPU IDs you're sending aren't quite right?
flash-singh (2mo ago)
is it possible you reached max workers?
digigoblin (2mo ago)
You never really know what the actual problem is with GraphQL, it never gives a useful error. It is a nightmare trying to debug GraphQL payloads.
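The response in this thread does carry one small hint: each error has a `locations` entry, and GraphQL error locations are 1-indexed positions in the query string. A minimal sketch for looking up what sits at that position is below. The helper names `text_at` and `describe_errors` are hypothetical, the response is trimmed to one error entry, and only the first line of the query is copied from the request (the selection set is shortened).

```python
import json

# First line copied from the request in this thread; selection set shortened.
QUERY = "mutation saveEndpoint($input: EndpointInput!) {\n saveEndpoint(input: $input) {\n id\n }\n}"

# One error entry copied from the response in this thread.
RESPONSE = json.loads("""
{"errors": [{"message": "Something went wrong. Please try again later or contact support.",
             "locations": [{"line": 1, "column": 23}],
             "extensions": {"code": "BAD_USER_INPUT"}}]}
""")

def text_at(query, line, column, width=30):
    """Return up to `width` characters of `query` starting at a 1-indexed line/column."""
    lines = query.split("\n")
    start = column - 1
    return lines[line - 1][start:start + width]

def describe_errors(response, query):
    """Pair each error's code with the query text found at each reported location."""
    found = []
    for err in response.get("errors", []):
        code = err.get("extensions", {}).get("code")
        for loc in err.get("locations", []):
            found.append((code, text_at(query, loc["line"], loc["column"])))
    return found

for code, snippet in describe_errors(RESPONSE, QUERY):
    print(code, "->", snippet)
```

Line 1, column 23 lands on the `$input` variable definition, which suggests the server rejected something in the variable's value as a whole. It does not say which field, so this narrows the search rather than ending it.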
zkreutzjanz (2mo ago)
No, I have plenty of workers left. I just hit the three dots, clone, deploy. And here is the exact request:
{"operationName":"saveEndpoint","variables":{"input":{"gpuIds":"AMPERE_48,ADA_48_PRO,-NVIDIA A40,-NVIDIA L40","gpuCount":1,"allowedCudaVersions":"","idleTimeout":5,"locations":null,"name":"facerecognition-dev (cloned)","networkVolumeId":null,"scalerType":"QUEUE_DELAY","scalerValue":4,"workersMax":1,"workersMin":0,"executionTimeoutMs":600000,"template":{"containerDiskInGb":5,"containerRegistryAuthId":"clhpo24ed0008le085dfb6npb","dockerArgs":"","env":[{"__typename":"EnvironmentVariable","key":"RUNPOD_BUCKET","value":"deep-test-bucket"},{"__typename":"EnvironmentVariable","key":"CREDS_PATH","value":"/app/creds.json"}],"imageName":"zachdeepshot/deep-facerecognition-service-dev:c44a24d9806136cb2df0a7a36e4d8bcfed5a8961-serverless","name":"facerecognition-dev (cloned)__template"}}},"query":"mutation saveEndpoint($input: EndpointInput!) {\n saveEndpoint(input: $input) {\n gpuIds\n id\n idleTimeout\n locations\n name\n networkVolumeId\n scalerType\n scalerValue\n templateId\n userId\n workersMax\n workersMin\n gpuCount\n __typename\n }\n}"}
And the response:
{
  "errors": [
    {
      "message": "Something went wrong. Please try again later or contact support.",
      "locations": [
        {
          "line": 1,
          "column": 23
        }
      ],
      "extensions": {
        "code": "BAD_USER_INPUT"
      }
    },
    {
      "message": "Something went wrong. Please try again later or contact support.",
      "locations": [
        {
          "line": 1,
          "column": 23
        }
      ],
      "extensions": {
        "code": "BAD_USER_INPUT"
      }
    }
  ]
}
Same response here, so it's not a GPU ID issue:
{"operationName":"saveEndpoint","variables":{"input":{"gpuIds":"ADA_24","gpuCount":1,"allowedCudaVersions":"","idleTimeout":5,"locations":null,"name":"facerecognition-dev (cloned)","networkVolumeId":null,"scalerType":"QUEUE_DELAY","scalerValue":4,"workersMax":1,"workersMin":0,"executionTimeoutMs":600000,"template":{"containerDiskInGb":5,"containerRegistryAuthId":"clhpo24ed0008le085dfb6npb","dockerArgs":"","env":[{"__typename":"EnvironmentVariable","key":"RUNPOD_BUCKET","value":"deep-test-bucket"},{"__typename":"EnvironmentVariable","key":"CREDS_PATH","value":"/app/creds.json"}],"imageName":"zachdeepshot/deep-facerecognition-service-dev:c44a24d9806136cb2df0a7a36e4d8bcfed5a8961-serverless","name":"facerecognition-dev (cloned)__template"}}},"query":"mutation saveEndpoint($input: EndpointInput!) {\n saveEndpoint(input: $input) {\n gpuIds\n id\n idleTimeout\n locations\n name\n networkVolumeId\n scalerType\n scalerValue\n templateId\n userId\n workersMax\n workersMin\n gpuCount\n __typename\n }\n}"}
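To double-check that the two failing requests really differ only in the GPU IDs, the `input` objects can be compared field by field. This is a rough sketch: `diff_inputs` is a made-up helper, and the two dicts are shortened excerpts of the requests posted above, not the full payloads.

```python
def diff_inputs(first, second):
    """Return {field: (value_in_first, value_in_second)} for every field whose value differs."""
    fields = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in fields
        if first.get(name) != second.get(name)
    }

# Shortened excerpts of the two "input" objects posted in this thread.
request_a = {
    "gpuIds": "AMPERE_48,ADA_48_PRO,-NVIDIA A40,-NVIDIA L40",
    "gpuCount": 1,
    "allowedCudaVersions": "",
    "idleTimeout": 5,
    "scalerType": "QUEUE_DELAY",
    "scalerValue": 4,
    "workersMax": 1,
    "workersMin": 0,
}
request_b = dict(request_a, gpuIds="ADA_24")

for name, (old, new) in diff_inputs(request_a, request_b).items():
    print(f"{name}: {old!r} -> {new!r}")
```

If both requests fail with the same error and `gpuIds` is the only field that changed, the GPU selection is unlikely to be the cause, which matches the conclusion above.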
flash-singh (2mo ago)
@kaj did we get to the bottom of this?
kaj (2mo ago)
not yet, looking into a possible solution