ngagefreak05
RunPod
Created by ngagefreak05 on 6/24/2024 in #⚡|serverless
cannot stream openai compatible response out
I have the below code for streaming the response; the generator is working, but I cannot stream the response out:

```python
import runpod
from collections.abc import Generator
from llama_cpp import Llama

# ErrorResponse is a pydantic model defined elsewhere in the worker.

llm = Llama(
    model_path="Phi-3-mini-4k-instruct-q4.gguf",
    n_gpu_layers=-1,
    n_ctx=4096,
)


class JobInput:
    def __init__(self, job):
        # Default to "" so the "in" checks below don't fail on None.
        self.openai_route = job.get("openai_route", "")
        self.openai_input = job.get("openai_input", {})
        self.is_completion = "v1completions" in self.openai_route
        self.is_embedding = "embeddings" in self.openai_route
        self.embedding_format = self.openai_input.get("encoding_format", "unknown")
        self.is_chatcompletion = "chat" in self.openai_route


def infer(job_params):
    if "n" in job_params.openai_input:
        del job_params.openai_input["n"]
    if job_params.openai_route and job_params.is_embedding:
        yield [ErrorResponse(
            message="The embedding endpoint is not supported on this URL.",
            type="unsupported_endpoint",
            code=501,  # Not Implemented
        ).model_dump()]
    else:
        if job_params.openai_route and job_params.is_chatcompletion:
            llm_engine = llm.create_chat_completion
        else:
            llm_engine = llm.create_completion
        if not job_params.openai_input.get("stream", False):
            yield llm_engine(**job_params.openai_input)
        else:
            llm_op = llm_engine(**job_params.openai_input)
            yield llm_op


async def handler(event):
    inp = event["input"]
    job_input = JobInput(inp)
    for line in infer(job_input):
        if isinstance(line, Generator):
            for l in line:
                yield l
        else:
            yield line


if __name__ == "__main__":
    runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```

Need help to fix!
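No accepted answer is shown in the thread. One likely culprit with this pattern is that `infer` yields the llama-cpp generator object itself, so the handler has to special-case it; a common fix is to unroll the stream inside `infer` so each yielded item is a single OpenAI-style chunk dict, which RunPod's `/stream` endpoint can forward one event at a time. Below is a minimal sketch of that rework (not a confirmed solution), reusing the `llm` and `JobInput` definitions above and assuming llama-cpp-python's `stream=True` mode yields OpenAI-style chunk dicts:

```python
def infer(job_params):
    """Yield one OpenAI-style dict per output: the full response when
    stream is off, or one chunk per yield when stream is on."""
    job_params.openai_input.pop("n", None)  # llama-cpp does not support n > 1

    if job_params.openai_route and job_params.is_chatcompletion:
        llm_engine = llm.create_chat_completion
    else:
        llm_engine = llm.create_completion

    if job_params.openai_input.get("stream", False):
        # Unroll the generator here so the handler never sees a raw
        # generator object, only plain chunk dicts.
        for chunk in llm_engine(**job_params.openai_input):
            yield chunk
    else:
        yield llm_engine(**job_params.openai_input)


async def handler(event):
    for output in infer(JobInput(event["input"])):
        yield output
```

With the stream flattened this way, the `isinstance(line, Generator)` check in the original handler becomes unnecessary.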
7 replies
RunPod
Created by ngagefreak05 on 5/3/2024 in #⚡|serverless
raise error
In a serverless Docker worker, how can I raise an error at an API endpoint so that RunPod handles it properly?
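No answer was captured here, but as I understand the RunPod serverless SDK, there are two standard options: let an exception propagate out of the handler (the job is then marked FAILED with the traceback), or return a dict containing an `error` key, which also fails the job with a readable message. A minimal sketch; `run_model` is a hypothetical stand-in for your inference function:

```python
import runpod


def handler(event):
    prompt = event["input"].get("prompt")
    if prompt is None:
        # Returning a dict with an "error" key marks the job as FAILED
        # and surfaces the message in the job status response.
        return {"error": "Missing required field: input.prompt"}
    try:
        return {"output": run_model(prompt)}  # run_model: hypothetical inference fn
    except Exception as e:
        # An uncaught exception would also fail the job; catching it and
        # returning an error dict keeps the message readable for the caller.
        return {"error": f"Inference failed: {e}"}


runpod.serverless.start({"handler": handler})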
4 replies
RunPod
Created by ngagefreak05 on 4/30/2024 in #⚡|serverless
openai compatible endpoint for custom serverless docker image
How can I get an OpenAI-compatible endpoint for my custom Docker image on RunPod serverless? I am trying to create a llama.cpp Docker image.
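The thread's resolution isn't shown, but the pattern visible in the first post above suggests the approach: requests sent to an endpoint's `/openai/v1/...` URL appear to arrive in the handler's `event["input"]` with `openai_route` and `openai_input` fields, so a custom llama.cpp worker can dispatch on them. A minimal sketch under that assumption (the field names match the `JobInput` class in the first question; whether this forwarding applies to all custom images is not confirmed here):

```python
import runpod
from llama_cpp import Llama

llm = Llama(model_path="Phi-3-mini-4k-instruct-q4.gguf", n_gpu_layers=-1, n_ctx=4096)


async def handler(event):
    route = event["input"].get("openai_route", "")
    params = event["input"].get("openai_input", {})

    # Check the more specific route first: "completions" is a substring
    # of "chat/completions".
    if "chat/completions" in route:
        engine = llm.create_chat_completion
    elif "completions" in route:
        engine = llm.create_completion
    else:
        yield {"error": f"Unsupported route: {route}"}
        return

    if params.get("stream", False):
        for chunk in engine(**params):
            yield chunk  # one OpenAI-style chunk dict per stream event
    else:
        yield engine(**params)


runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```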
5 replies