Ammar Ahmed
RunPod
Created by Ammar Ahmed on 1/2/2025 in #⚡|serverless
Failed to load docker package.
I don't think v2 is the main issue here because other images and workers are working fine
13 replies
RunPod
Created by Ammar Ahmed on 1/2/2025 in #⚡|serverless
Failed to load docker package.
First it gave me "denied" with the new registry key. Then I made it public, but it's stuck here.
13 replies
RunPod
Created by Ammar Ahmed on 1/2/2025 in #⚡|serverless
Failed to load docker package.
5 minutes
13 replies
RunPod
Created by Ammar Ahmed on 1/2/2025 in #⚡|serverless
Failed to load docker package.
2025-01-02T09:41:15Z image pull: ghcr.io/ammarft-ai/img-remover:1.6: pending
13 replies
RunPod
Created by Ammar Ahmed on 1/2/2025 in #⚡|serverless
Failed to load docker package.
ahan yes working now. thanks
13 replies
RunPod
Created by Ammar Ahmed on 1/1/2025 in #⚡|serverless
Cannot send request to one endpoint
It's a simple runsync serverless template. I've been using it for a while. This was the very first time this type of issue occurred.
8 replies
RunPod
Created by Ammar Ahmed on 1/1/2025 in #⚡|serverless
Cannot send request to one endpoint
it's back to normal, i guess it was some bug. Thanks tho
8 replies
RunPod
Created by Ammar Ahmed on 1/1/2025 in #⚡|serverless
Cannot send request to one endpoint
it was working fine 5 minutes ago 🙂
8 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
import queue

import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler


class ModelPool:
    def __init__(self, max_models=3):
        # queue.Queue is already thread-safe, so no extra Lock is used.
        # Holding a separate lock around a blocking get() could deadlock,
        # because return_model() would then wait on that same lock.
        self.model_queue = queue.Queue()
        self.max_models = max_models

        # Initialize the pool with a set number of models
        for _ in range(max_models):
            self.model_queue.put(self._create_model())

    def _create_model(self):
        """ Load and return a new instance of the model. """
        model_id = "SG161222/Realistic_Vision_V2.0"
        scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")
        pipeline = DiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16, cache_dir="model_cache")
        # Prefer CUDA, then Apple MPS, then fall back to CPU
        if torch.cuda.is_available():
            device = "cuda"
        elif torch.backends.mps.is_available():
            device = "mps"
        else:
            device = "cpu"
        return pipeline.to(device)

    def get_model(self):
        """ Get a model from the pool, blocking until one is free. """
        return self.model_queue.get()

    def return_model(self, model):
        """ Return a model to the pool. """
        self.model_queue.put(model)


model_pool = ModelPool(max_models=10)
This is the pool, which is loaded into memory. Every time a request reaches the server, it gets a model from this pool.
38 replies
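A minimal sketch of how a request could borrow a model from a pool like this and always hand it back, even when the request fails. The names `Pool`, `borrow`, `available`, and `handle_request` are hypothetical and not from the thread, and a plain `object()` stands in for a loaded pipeline so the example runs without a GPU.

```python
import queue
from contextlib import contextmanager


class Pool:
    """Fixed-size pool of reusable objects (stand-ins for loaded pipelines)."""

    def __init__(self, factory, size):
        self._items = queue.Queue()
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks until an item is free; raises queue.Empty if the timeout expires.
        item = self._items.get(timeout=timeout)
        try:
            yield item
        finally:
            # Always hand the item back, even if the request raised.
            self._items.put(item)

    def available(self):
        return self._items.qsize()


# Assumption: object() stands in for an expensive model load.
pool = Pool(factory=object, size=2)


def handle_request(prompt):
    with pool.borrow(timeout=30) as model:
        # A real handler would run the pipeline here, e.g. model(prompt).
        return f"processed {prompt!r} with model {id(model)}"
```

Without the `try`/`finally`, a request that raises would never return its model, and the pool would slowly drain until every new request blocks.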
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
Fixed it. I created a model pool that keeps a number of models loaded according to the max concurrency. It reduced the time to below 10 seconds 😀
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
Also, processing a single request is fast, but when multiple requests are processed concurrently, processing is very slow.
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
yes i have flashboot enabled
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
It seems like concurrent requests are taking too long to get processed together.
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
It's taking time to load the model into memory
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
okay
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
No description
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
ohh okay. Thanks
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
yes
38 replies
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
Yes, I found it in the docs and was figuring out how to implement it in Python. Will it go with the input in handler.py?
38 replies
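A rough sketch of where the concurrency setting lives, as far as I understand the RunPod Python SDK docs: it is passed next to the handler when the worker starts, not inside the job input. The `MAX_CONCURRENCY` value, the `prompt` field, and the `start_worker` wrapper are assumptions for illustration. The SDK is imported lazily inside `start_worker` so the rest can be read and tried without it installed.

```python
import asyncio

# Assumed value; it would normally match the number of pooled models.
MAX_CONCURRENCY = 3


async def handler(job):
    # The request payload arrives under job["input"]; concurrency is not set here.
    prompt = job["input"].get("prompt", "")
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return {"echo": prompt}


def concurrency_modifier(current_concurrency):
    # Called by the worker to decide how many jobs it may run at once.
    # This sketch returns a constant; it could also scale with load.
    return MAX_CONCURRENCY


def start_worker():
    # Lazy import so this sketch does not require the SDK just to be loaded.
    import runpod

    runpod.serverless.start(
        {"handler": handler, "concurrency_modifier": concurrency_modifier}
    )
```

In a real handler.py, `start_worker()` would be called at the bottom of the file.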
RunPod
Created by Ammar Ahmed on 10/4/2024 in #⚡|serverless
How can I make a single worker handle multiple requests concurrently before starting the next worker
Essentially, I want to modify how the queue behaves so that premium requests jump ahead of regular ones in the processing order. Can I modify the queue behavior or set some priority rules for incoming requests in RunPod?
38 replies
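The thread does not say whether RunPod's own queue can be reordered, so the following is only a sketch of the ordering idea itself, done on the application side: premium jobs come out first, and jobs of the same tier keep their arrival order. `PriorityJobQueue` and the `premium` flag are hypothetical names, not part of any RunPod API.

```python
import heapq
import itertools


class PriorityJobQueue:
    """Premium jobs are popped before regular ones; ties keep FIFO order."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal-priority jobs stay in arrival order
        # and the job objects themselves are never compared.
        self._counter = itertools.count()

    def push(self, job, premium=False):
        priority = 0 if premium else 1  # lower number is served first
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

One way to use it would be a small service in front of the endpoint that collects incoming requests and forwards them in this order.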