Soulmind
RunPod
Created by sluzorz on 8/23/2024 in #⛅|pods
Maximum number of A40s that can run at one time
Yeah, I will. I've been monitoring the values for a while, and it seems like totalCount and rentedCount for the same GPU in different datacenters show the same value:
Datacenter: CA-MTL-1
GPU Types: NVIDIA A40
{
  "data": {
    "gpuTypes": [
      {
        "lowestPrice": {
          "uninterruptablePrice": 0.7,
          "rentalPercentage": 0.8423,
          "rentedCount": 844,
          "totalCount": 1002,
          "stockStatus": "High"
        },
        "oneMonthPrice": 0.35,
        "threeMonthPrice": 0.35,
        "sixMonthPrice": null
      }
    ]
  }
}
Datacenter: EU-SE-1
GPU Types: NVIDIA A40
{
  "data": {
    "gpuTypes": [
      {
        "lowestPrice": {
          "uninterruptablePrice": 0.7,
          "rentalPercentage": 0.8423,
          "rentedCount": 844,
          "totalCount": 1002,
          "stockStatus": "Medium"
        },
        "oneMonthPrice": 0.35,
        "threeMonthPrice": 0.35,
        "sixMonthPrice": null
      }
    ]
  }
}
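(A rough sketch of this kind of per-datacenter check, in TypeScript; not my exact monitoring script. It assumes the RunPod GraphQL endpoint https://api.runpod.io/graphql with the API key passed as an api_key query parameter, so double-check the docs for the current endpoint and auth.)
// Sketch: fetch lowestPrice counts for the same GPU in two datacenters and compare them.
// Assumes Node 18+ (global fetch) and RUNPOD_API_KEY set in the environment.
const QUERY = `
query gpuAvailability($gpuTypesInput: GpuTypeFilter, $lowestPriceInput: GpuLowestPriceInput) {
  gpuTypes(input: $gpuTypesInput) {
    lowestPrice(input: $lowestPriceInput) {
      rentedCount
      totalCount
      stockStatus
    }
  }
}`;

async function lowestPrice(dataCenterId: string) {
  const res = await fetch(`https://api.runpod.io/graphql?api_key=${process.env.RUNPOD_API_KEY}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      query: QUERY,
      variables: {
        gpuTypesInput: { id: "NVIDIA A40" },
        lowestPriceInput: { gpuCount: 1, secureCloud: true, dataCenterId },
      },
    }),
  });
  const { data } = await res.json();
  return data.gpuTypes[0].lowestPrice;
}

async function main() {
  const [mtl, se] = await Promise.all([lowestPrice("CA-MTL-1"), lowestPrice("EU-SE-1")]);
  console.log("CA-MTL-1:", mtl);
  console.log("EU-SE-1 :", se);
  // If the counts were per-datacenter, rentedCount/totalCount should differ between the two;
  // in the responses above, only stockStatus changes.
  console.log("counts identical:", mtl.rentedCount === se.rentedCount && mtl.totalCount === se.totalCount);
}

main().catch(console.error);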
👍 the only thing is, it seems like the GraphQL API is responding with the combined # of GPUs, not the # of GPUs in the specific dc...
There is a way to do that if you use GraphQL. The docs state that there are totalCount and rentedCount fields. If you run the query:
query gpuAvailability($gpuTypesInput: GpuTypeFilter, $lowestPriceInput: GpuLowestPriceInput) {
  gpuTypes(input: $gpuTypesInput) {
    lowestPrice(input: $lowestPriceInput) {
      uninterruptablePrice
      rentalPercentage
      rentedCount
      totalCount
    }
  }
}
with the variables:
variables: {
  gpuTypesInput: {
    id: 'NVIDIA A40',
  },
  lowestPriceInput: {
    gpuCount: 1,
    secureCloud: true,
    dataCenterId: 'CA-MTL-1',
  },
}
you will be able to see the rented count and total count:
{
  "data": {
    "gpuTypes": [
      {
        "lowestPrice": {
          "uninterruptablePrice": 0.35,
          "rentalPercentage": 0.8745,
          "rentedCount": 885,
          "totalCount": 1012
        }
      }
    ]
  }
}
But it seems like the rented count and total count aren't strictly for that specific datacenter; they look aggregated.
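(In case it helps anyone else reading: a minimal sketch of actually sending that query and variables from Node/TypeScript. Same assumption as above, i.e. the RunPod GraphQL endpoint https://api.runpod.io/graphql with the API key as an api_key query parameter; check the docs for the current endpoint and auth.)
// Minimal sketch: POST the gpuAvailability query with variables to RunPod's GraphQL API.
// Assumes Node 18+ (global fetch) and RUNPOD_API_KEY set in the environment.
const query = `
query gpuAvailability($gpuTypesInput: GpuTypeFilter, $lowestPriceInput: GpuLowestPriceInput) {
  gpuTypes(input: $gpuTypesInput) {
    lowestPrice(input: $lowestPriceInput) {
      uninterruptablePrice
      rentalPercentage
      rentedCount
      totalCount
    }
  }
}`;

const variables = {
  gpuTypesInput: { id: "NVIDIA A40" },
  lowestPriceInput: { gpuCount: 1, secureCloud: true, dataCenterId: "CA-MTL-1" },
};

async function run() {
  const res = await fetch(`https://api.runpod.io/graphql?api_key=${process.env.RUNPOD_API_KEY}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  });
  const { data } = await res.json();
  // data.gpuTypes[0].lowestPrice holds rentedCount / totalCount as in the response above.
  console.log(data.gpuTypes[0].lowestPrice);
}

run().catch(console.error);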
We're pooling from CA-MTL-1 and EU-SE-1, as they are the only datacenters with network volume support for A40s.
@yhlong00000 any plans on adding more A40s to the pool?
Hope it's easy to debug! Seems like there are ~121 GPUs available now. Btw, which backend are you using for your batch job? I heard SGLang is pretty good for batch jobs.
and 🤞 for your batch job 😉
lol yeah, for sure it won't be an issue, and it's not your fault, so no need to say sorry! It's just that we've only added the A40 to our autoscaling pool for now, because there seemed to be plenty of A40s a couple of days/weeks back. We probably need to add more GPU types to the pool anyway so we can adapt to any case.
I got paged by the alert policy we set up internally for A40 availability, as our product currently relies on it. Of course you have every right to spin up as many as you want, but could you kindly let me know whether this is a one-off thing or something you'll be running long-term? We've been happily enjoying the high availability of A40s, but there are only ~27 GPUs left now lol