TumbleWeed
RunPod • Created by TumbleWeed on 2/16/2024 in #⚡|serverless
Run LLM Model on Runpod Serverless (50 replies)

I will try the 2nd option and let you know the result.

So, it's possible to preload the model into the worker?

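(For context: the usual pattern is to load the model at module scope in the handler file, so it happens once per worker rather than once per request. A minimal sketch, assuming the runpod SDK and a transformers model; the model path is a placeholder.)

```python
# handler.py: sketch of preloading a model in a RunPod serverless worker.
import runpod
from transformers import AutoModelForCausalLM, AutoTokenizer

# Module scope: runs once when the worker starts, so every request
# after the cold start reuses the already-loaded weights.
MODEL_PATH = "/app/model"  # placeholder: a path baked into the image, or an HF repo id
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH).to("cuda")

def handler(job):
    # Per-request work only; the model is already in memory.
    prompt = job["input"]["prompt"]
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    output = model.generate(**inputs, max_new_tokens=256)
    return {"text": tokenizer.decode(output[0], skip_special_tokens=True)}

runpod.serverless.start({"handler": handler})
```
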
I have moved my LLM model to Docker Hub, so I don't get haunted by GCP egress costs lol
I have another question:
my cold start (loading the LLM model) is around 15s-30s, any way to optimize it? @ashleyk @justin

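(One thing that often cuts model load time, as a sketch assuming a transformers causal LM: read the weights in half precision and skip the intermediate full-size CPU copy. Whether it helps here depends on the model size and disk speed.)

```python
# Sketch: faster weight loading for a transformers causal LM.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "/app/model",               # placeholder path inside the image
    torch_dtype=torch.float16,  # half the bytes to read off disk and move to GPU
    low_cpu_mem_usage=True,     # stream weights instead of building a full CPU copy first
).to("cuda")
```
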
Alright, I will try Docker Hub then. Thank you @ashleyk @justin

But they limit the pull requests, right?

Hi @ashleyk @Alpay Ariyak
I have tried deploying my LLM model on RunPod serverless. The image is over 40GB, and it isn't cost-effective to use Google Artifact Registry, since they charge for egress outside the GCP network. Any recommendations for a container registry?
Thank you

I have successfully run my model, but it needs some adjustments, because inside the container I still run FastAPI for the endpoint.

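(The usual adjustment here is to swap the FastAPI route for a RunPod handler, since the serverless worker receives jobs through the SDK rather than over HTTP. A sketch; `generate_text` is a hypothetical stand-in for whatever the FastAPI endpoint was calling.)

```python
# Before (current setup): a FastAPI route the client called over HTTP.
#
#   from fastapi import FastAPI
#   app = FastAPI()
#
#   @app.post("/generate")
#   def generate(body: dict):
#       return {"text": generate_text(body["prompt"])}

# After: the same logic as a RunPod serverless handler.
import runpod

def handler(job):
    prompt = job["input"]["prompt"]          # the job input replaces the request body
    return {"text": generate_text(prompt)}   # generate_text: your existing inference function (hypothetical name)

runpod.serverless.start({"handler": handler})
```
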
Wow, okay

I see, do I get charged when the worker is in the initializing state?

Does it return an error if the image pull fails?
Let's say I have misconfigured the registry access.

It's stuck on "Initializing".

@ashleyk I have tried to set up the serverless endpoint. How do I check the logs of the pull?
How do I know if the worker successfully pulled the image?

Alright, thank you. I will try it first.

How about the image pulling strategy? Does RunPod cache the image in an internal registry, or does it pull the image every time a worker is spawned?

Is it okay to bake the model into the Docker image, @ashleyk?
Does it affect the cold start?

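(Baking the weights in means the worker skips the runtime download, at the cost of a larger image to pull. A sketch of the build-time download step, assuming a Hugging Face model; the repo id and paths are placeholders. It would be invoked from a RUN step in the Dockerfile.)

```python
# download_model.py: run at *build* time (e.g. `RUN python download_model.py`
# in the Dockerfile) so the weights ship inside the image instead of being
# fetched when a worker cold starts.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="your-org/your-llm",  # placeholder: the model to bake in
    local_dir="/app/model",       # the handler then loads from this local path
)
```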