ribbit Posts - Answer Overflow

ribbit

•Created by ribbit on 4/30/2024 in #⚡｜serverless

How do I handle both streaming and non-streaming requests in a serverless pod?

How can I handle both effectively? Is it okay to have a handler with both yield and return? i.e.

def handler(endpoint):
    if endpoint == "stream_response":
        yield stream_response()
    elif endpoint == "get_response":
        return get_response()

runpod.serverless.start(
    {
        "handler": handler,
        "return_aggregate_stream": True
    }
)

Will this work?
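
For context on the question above: in Python, any function whose body contains `yield` is a generator function, so a `return value` inside it ends iteration instead of handing the value back like a normal return. The sketch below shows this without RunPod at all. `stream_response` and `get_response` are hypothetical stand-ins for the functions in the post, and `unified_handler` is only one possible workaround (yield in both branches); how `return_aggregate_stream` treats either variant is not verified here.

```python
def stream_response():
    # Hypothetical stand-in: produces chunks one at a time.
    yield from ["chunk-1", "chunk-2"]


def get_response():
    # Hypothetical stand-in: produces one complete result.
    return "full-response"


def mixed_handler(endpoint):
    # Contains `yield`, so calling it always returns a generator.
    if endpoint == "stream_response":
        yield from stream_response()
    elif endpoint == "get_response":
        # Ends iteration; the value is never yielded to the consumer.
        return get_response()


def unified_handler(endpoint):
    # Yields in both branches, so the consumer always receives output.
    if endpoint == "stream_response":
        yield from stream_response()
    elif endpoint == "get_response":
        yield get_response()


print(list(mixed_handler("get_response")))      # []
print(list(unified_handler("get_response")))    # ['full-response']
print(list(unified_handler("stream_response"))) # ['chunk-1', 'chunk-2']
```

Note that `yield stream_response()` as written in the post yields the whole return value once; `yield from` is what forwards individual chunks when the stand-in is itself a generator.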

71 replies

RunPod

•Created by ribbit on 4/24/2024 in #⚡｜serverless

Can local development use Runpod Secrets?

I discovered that RunPod serverless has a Secrets feature. I want to use it to store values like environment variables.

{{ RUNPOD_SECRET_hello_world }}

How can I define secrets for my local development server as well, so that I can emulate accessing secrets in the actual serverless pod? Thanks
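
One way to approach this locally is sketched below. It assumes the deployed endpoint exposes the secret to the handler as an ordinary environment variable, so the handler code only ever reads that variable. The variable name `HELLO_WORLD`, the helper `get_secret`, and the `LOCAL_DEFAULTS` table are all hypothetical names chosen for illustration.

```python
import os


def get_secret(env_name, local_defaults=None):
    """Return the value of env_name, falling back to a local-only default."""
    value = os.environ.get(env_name)
    if value is not None:
        return value
    if local_defaults and env_name in local_defaults:
        return local_defaults[env_name]
    raise KeyError(f"secret {env_name!r} is not set")


# Dummy values for local runs only; never put real secrets here.
LOCAL_DEFAULTS = {"HELLO_WORLD": "dummy-value-for-local-runs"}

hello = get_secret("HELLO_WORLD", LOCAL_DEFAULTS)
```

Locally, the same variable can also be exported in the shell before starting the handler, in which case the fallback table is never consulted.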

4 replies

RunPod

•Created by ribbit on 4/17/2024 in #⚡｜serverless

Connection reset by peer

34 replies

RunPod

•Created by ribbit on 3/14/2024 in #⚡｜serverless

Knowing Which Machine The Endpoint Used

10 replies

RunPod

•Created by ribbit on 2/29/2024 in #⚡｜serverless

cudaGetDeviceCount() Error

When importing the exllamav2 library I got this error, which left the serverless worker stuck and repeatedly printing an error stack trace. The error is:

RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

What is this error about? Is it caused by the library, or is there something wrong with the worker hardware that I've chosen? And why doesn't the error stop the worker? It kept running for 5 minutes without me even realizing.
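
On the "why doesn't the error stop the worker" part, one defensive pattern is to run the GPU-dependent startup inside a guard that logs the traceback once and exits with a non-zero code. This is only a sketch: `load_or_exit` and `load_gpu_library` are hypothetical names, it does not diagnose Error 804 itself, and how the platform reacts to a worker process that exits is not something this example verifies.

```python
import importlib
import sys
import traceback


def load_or_exit(loader, exit_code=1):
    """Run loader(); on any exception, print the traceback once and exit."""
    try:
        return loader()
    except Exception:
        traceback.print_exc()
        sys.exit(exit_code)


def load_gpu_library():
    # Hypothetical loader: importing the library is where the CUDA error
    # from the post was raised.
    return importlib.import_module("exllamav2")


# At worker startup, before registering the handler:
# exllamav2 = load_or_exit(load_gpu_library)
```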

9 replies

RunPod

•Created by ribbit on 2/27/2024 in #⚡｜serverless

Is it possible to run fully on sync?

All the async functions and webhooks are such a pain. Can we just run fully on sync?

13 replies

RunPod

•Created by ribbit on 2/22/2024 in #⚡｜serverless

Can I emulate hitting serverless endpoints locally?

So far I've been testing my RunPod serverless worker locally by executing the Python handler:

python -u handler.py

but is there any way to emulate hitting the serverless endpoint locally?
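
A possible shape for this, assuming the handler can be started as a local HTTP server (the RunPod Python SDK has a `--rp_serve_api` flag for this, though check the flag and the default port against the SDK version in use), is to post the same JSON body the real endpoint would receive. The base URL `http://localhost:8000`, the `/runsync` route, and the helper names below are assumptions made for illustration.

```python
import json
import urllib.request


def build_local_request(payload, base_url="http://localhost:8000", route="/runsync"):
    """Build a POST request wrapping payload in the {"input": ...} envelope."""
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        base_url + route,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


req = build_local_request({"prompt": "hello"})
print(req.get_method(), req.full_url)
# With the local server running: result = send(req)
```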

16 replies

RunPod

•Created by ribbit on 2/20/2024 in #⚡｜serverless

LLM inference on serverless solution

Hi, I need some suggestions on serving an LLM on serverless. I have several questions:

1. Is there any guide or example project I can follow so that I can run inference effectively on RunPod serverless?
2. Is it recommended to use frameworks like TGI or vLLM with RunPod? If so, why? I'd like maximum control over the inference code, so I have not tried any of those frameworks.

Thanks!

9 replies
