AC_pill
RunPod
Created by AC_pill on 5/15/2024 in #⚡|serverless
Model load time affected when PODs are running on the same server
I was trying to debug the latency on my test PODs, and I've now figured out that PODs running on the same physical machine lag badly on I/O access. After profiling, I got these results.

Example, initial test on a POD:
- Running on a single POD, load time for a 6 GB model is 2 s.
- When I pulled 2 GPUs from the same server, model load time increased to 40 s. Even inference is affected. RAM leaking?

On Serverless:
- The same GPU (4090) gets different inference and load times as well: 30 s vs. 4 s for loading, depending on the machine.
- Inference is non-uniform as well: 20 s on some machines, 10 s on others.

All running the same Docker image, the same scripts, and the same libraries. Do we have anything in place to ensure uniformity of hardware? Are we enforcing a separate SSD/NVMe for each GPU on the servers, with separate I/O channels? I need some idea of whether this is a persisting issue; I'm pretty sure the Mbps figures in the machine descriptors do not reflect reality at all.

EDIT: I'm using the US region now; with Global the problem is worse.
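When chasing a gap like 2 s vs. 40 s, it helps to separate raw disk-read time from everything else (deserialization, GPU transfer). A minimal sketch — `timed_load` is an illustrative helper, not a RunPod or framework API:

```python
import time

def timed_load(path):
    """Time a raw sequential read of a model file from disk.

    If this alone jumps from ~2 s to ~40 s when another GPU on the same
    host is busy, the bottleneck is shared storage I/O, not the framework.
    Returns (bytes_read, seconds_elapsed).
    """
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return len(data), time.perf_counter() - start
```

Logging `bytes_read / seconds_elapsed` per worker gives a MB/s figure you can compare directly against the throughput advertised in the machine descriptor.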
16 replies
Created by AC_pill on 5/3/2024 in #⚡|serverless
Flashboot mode: Need help or documentation
Hi, I'm trying to decrease the load time of our serverless endpoint after idle. I'm building for Stable Diffusion usage. I tried FlashBoot before, but all the output was repeated. Is there any documentation or guide on how to build the Docker image to use it? Does it work like hibernation, saving the memory state? If so, does it save the latest state after the task finishes?
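If FlashBoot does preserve the warm worker's memory state (an assumption based on the hibernation comparison above), one common way "repeated output" happens is caching a *result*, rather than a *model*, at module scope — the cached result then survives between jobs. A sketch of the anti-pattern and its fix (`run_inference` is a placeholder):

```python
def run_inference(job):
    # Placeholder for the real Stable Diffusion call.
    return {"echo": job["input"]}

# Anti-pattern: a result stored at module scope survives across warm
# invocations, so every later job returns the first job's output.
_last_result = None

def bad_handler(job):
    global _last_result
    if _last_result is None:
        _last_result = run_inference(job)
    return _last_result

def good_handler(job):
    # Compute fresh per job; keep only expensive *models* at module
    # scope, never per-request *results*.
    return run_inference(job)
```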
6 replies
Created by AC_pill on 4/30/2024 in #⚡|serverless
Idle timeout not working
Hi team. I'm setting my serverless endpoint with an idle timeout of 180 seconds, but it idles or goes back to sleep right after the task is done. It was working before, and this is hard to debug since it happens outside the handler. Does anyone have experience with this? How can I debug it? Notes:
- I didn't add a maximum execution timeout.
- I added "Per request" spawn.
9 replies
Created by AC_pill on 3/22/2024 in #⚡|serverless
Moving to production on RunPod: need to confirm information on serverless costs
Hi team. I'm working with my company to move our product to release, with a soft launch in April. We ran tests on serverless already, but we need to confirm some information:
1. On endpoint costs: the 40% discount for active workers applies to all GPU models, correct?
2. Is there a time frame for price updates?
3. Do we have more than 8 A100 GPUs available, and are they located in the US?
4. Is there a way to set a number of each GPU model as active, as a group? Say 8 A100 and 6 L4?
Thanks.
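For budget planning, the 40% active-worker discount mentioned above reduces to simple arithmetic — the hourly rates below are placeholders, not RunPod's actual prices:

```python
def active_hourly_cost(on_demand_hourly, discount=0.40):
    """Apply the active-worker discount (40% per the question above)."""
    return on_demand_hourly * (1.0 - discount)

def monthly_cost(on_demand_hourly, workers, hours=730):
    # 730 ~ average hours in a month; assumes workers stay active 24/7.
    return active_hourly_cost(on_demand_hourly) * workers * hours
```

For example, a placeholder $2.00/hr GPU kept active costs $1.20/hr, so 8 such workers run about `monthly_cost(2.0, 8)` per month.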
2 replies
Created by AC_pill on 2/23/2024 in #⚡|serverless
Idle time: High Idle time on server but not getting tasks from queue
I'm testing workers with a high idle time to keep them alive for new tasks, but the worker shows idle/finished without picking up new tasks from the queue. Is there any event or state I need to add to the handler?
13 replies
Created by AC_pill on 2/22/2024 in #⚡|serverless
Is there a programmatic way to activate workers for high-demand / peak-hour load?
We are testing serverless for a production deployment next month. I want to ensure we will have server capacity during peak hours. We'll have some active workers, but we need to guarantee capacity for certain peak hours. Is there a way to activate workers programmatically?
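RunPod endpoints are managed through a GraphQL API, so scheduled scaling should be achievable by firing a mutation from a cron job before peak hours. The mutation and field names below (`saveEndpoint`, `workersMin`) are assumptions — verify them against the current API schema before relying on this:

```python
import json

# RunPod's GraphQL endpoint (authenticated with your API key).
RUNPOD_API_URL = "https://api.runpod.io/graphql"

def build_scale_mutation(endpoint_id, workers_min):
    # Hypothetical mutation shape -- field names must be checked against
    # RunPod's GraphQL schema before production use.
    query = (
        'mutation { saveEndpoint(input: {id: "' + endpoint_id
        + '", workersMin: ' + str(workers_min) + "}) { id workersMin } }"
    )
    return json.dumps({"query": query})
```

A scheduler (cron, or a cloud scheduler service) can POST this payload with your API key shortly before peak hours and post a second mutation to scale back down afterwards.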
42 replies
Created by AC_pill on 2/20/2024 in #⚡|serverless
Serverless on Active State behaviour
Some APIs I was using on serverless worked in both active and idle state before; now the server seems to break when I switch to active: the response is always the same as the previous one, or just "finished". I want to debug what is happening. Can someone explain how state works internally in the handler after it wakes? What stays in memory? It runs entrypoint.sh only once, correct? Will it send the start signal once, or for every task?
runpod.serverless.start({ "handler": handler })
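A sketch of how warm-worker state is commonly reasoned about in Python serverless workers (the caching pattern is illustrative, not RunPod's documented internals): module-level code runs once per worker boot, while the handler runs once per job, so anything at module scope persists across jobs on the same warm worker.

```python
_CACHE = {}

def get_model():
    # Loaded once per warm worker and reused across jobs; a new cold
    # worker starts with an empty cache and loads again.
    if "model" not in _CACHE:
        _CACHE["model"] = object()  # placeholder for an expensive load
    return _CACHE["model"]

def handler(job):
    # Called once per job. entrypoint.sh and module-level code are
    # NOT re-run between jobs on the same worker.
    model = get_model()
    prompt = job["input"].get("prompt", "")
    return {"output": "processed: " + prompt, "model_id": id(model)}

# Registered once per worker process, exactly as in the post:
# runpod.serverless.start({"handler": handler})
```

If two consecutive jobs report the same `model_id`, they ran on the same warm worker — a handy way to confirm which state survived.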
11 replies
Created by AC_pill on 2/12/2024 in #⚡|serverless
[FEATURE REQUEST] Granular selection for Serverless Pod GPUs
Hi team, not sure if this is the correct place to post.
- Feature request: I'd like to select specific GPU pods for my server. For example, the grouped tier with L4, A5000, and 3090 could be expanded into individual selections.
Why? TensorRT models now take advantage of RTX GPUs, so being able to select 3090 and 4090, or other specific groups, would be highly beneficial for keeping Docker images that target a specific architecture. The same is true for old architectures versus new ones like L4, L40, etc. Thanks.
3 replies
Created by AC_pill on 2/8/2024 in #⛅|pods
RTX 4090 POD Cuda issue
Hi, I'm trying to load a community POD and I run into this issue with the CUDA drivers:

ERROR: The NVIDIA Driver is present, but CUDA failed to initialize. GPU functionality will not be available.
[[ Initialization error (error 3) ]]

POD is US, RTX 4090. It's an NVIDIA container, details below:

=============
== PyTorch ==
=============
NVIDIA Release 23.10 (build 71422337)
PyTorch Version 2.1.0a0+32f93b1

Is there any specific CUDA version for 4090s? The same Docker image runs without issues on A5000 and 3090.
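"Error 3" (initialization error) with the driver present usually points to a host driver too old for the container's CUDA build: NGC PyTorch 23.10 ships CUDA 12.2, which wants roughly driver 525.60.13 or newer (treat that threshold as an assumption to check against NVIDIA's CUDA compatibility table). A quick version gate you can run against `nvidia-smi` output:

```python
def cuda_driver_ok(driver_version, min_required="525.60.13"):
    """Compare an installed NVIDIA driver version string against the
    minimum a container's CUDA build needs. The default threshold is
    an approximation for CUDA 12.2 -- verify against NVIDIA's table.
    """
    def parse(v):
        return tuple(int(p) for p in v.split("."))
    return parse(driver_version) >= parse(min_required)
```

Since the same image works on A5000/3090 hosts, those machines may simply carry newer drivers than the 4090 host that failed.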
8 replies
Created by AC_pill on 1/30/2024 in #⚡|serverless
How do I select a custom template without creating a new Endpoint?
Hi, right now I need to create a new template to do a new release. The issue: the platform as it stands doesn't let you pick a different template for an endpoint, only modify the one attached to it. That's problematic if I have different endpoints that share the same template. Do I need to create a new endpoint every time I just want to select a template?
9 replies
Created by AC_pill on 1/26/2024 in #⛅|pods
Custom Templates are not loading on Secure Cloud
I have some templates that run without issue on Community Cloud, but none of them work on Secure Cloud. Serverless probably suffers from the same issue, since my custom endpoints do not load. The issue affects the A5000 GPU on Secure Cloud. Can someone please take a look?

- The issue: on container start, start.sh pulls updates from a GitHub repo, and it says /workspace/repository does not exist.
- The same template and start.sh work perfectly on Community Pods.

Output from the Pod log:
Updating ComfyUI repository...
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Failed to fetch from origin

- There are no issues with the repository, access, or anything related.
- Possible POD connection issues? Local filesystem issues?

We are testing a product.
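As a workaround while the mount issue is investigated, start.sh can guard the update so a missing or empty /workspace volume doesn't abort startup. A sketch assuming the path from the post (the clone line is left as a commented placeholder since the repo URL isn't given):

```shell
#!/bin/bash
# Guard the ComfyUI update: only pull if the checkout actually exists.
# REPO_DIR defaults to the path from the post; override via environment.
REPO_DIR="${REPO_DIR:-/workspace/repository}"

if [ -d "$REPO_DIR/.git" ]; then
    echo "Updating ComfyUI repository..."
    git -C "$REPO_DIR" pull --ff-only || echo "Pull failed; continuing with existing checkout"
else
    echo "$REPO_DIR is not a git checkout; skipping update"
    # git clone <your-repo-url> "$REPO_DIR"   # repo URL not given in the post
fi
```

This turns the fatal "not a git repository" error into a logged skip, so the container still comes up on hosts where the volume mounted late or empty.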
11 replies