RunPod
•Created by Rayboy on 8/12/2024 in #⚡|serverless
Using the vLLM RunPod worker image and the OpenAI endpoints, how can I get the executionTime?
Would it be possible, in a future update, to have the job ID sent back in the OpenAI endpoints? I'd also like to be able to cancel a job from our server, but currently I can't.
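For reference, the standard (non-OpenAI) serverless routes do expose job IDs. Assuming the usual RunPod serverless API shape (`/run`, `/status/<id>`, `/cancel/<id>` under `https://api.runpod.ai/v2/<endpoint_id>`; the endpoint ID and API key below are placeholders), a minimal sketch of submit / status / cancel might look like:

```python
import os
import requests

# Placeholders -- substitute your own endpoint ID; the API key is read
# from the RUNPOD_API_KEY environment variable.
ENDPOINT_ID = "your-endpoint-id"
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {os.environ.get('RUNPOD_API_KEY', '')}"}

def submit_job(payload: dict) -> str:
    """POST /run returns a job ID usable for status checks and cancellation."""
    resp = requests.post(f"{BASE_URL}/run", json={"input": payload}, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["id"]

def get_execution_time(job_id: str):
    """GET /status/<id>; completed jobs report an executionTime field (ms)."""
    resp = requests.get(f"{BASE_URL}/status/{job_id}", headers=HEADERS)
    resp.raise_for_status()
    return resp.json().get("executionTime")

def cancel_job(job_id: str) -> dict:
    """POST /cancel/<id> stops a queued or in-progress job."""
    resp = requests.post(f"{BASE_URL}/cancel/{job_id}", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()
```

This only works against the standard endpoints; as discussed below, the OpenAI-compatible routes don't return a job ID to feed into these calls.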
10 replies
It's good that one worker can process multiple requests, though that makes it hard to calculate costs this way, since it would actually be cheaper when multiple requests run on the same active worker. I may need to figure out another way to calculate it then.
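One way to handle the concurrency problem (this is an assumption about a fair split, not RunPod's billing model) is to apportion a worker's billed time across the requests it served, proportional to each request's own execution time:

```python
def apportion_cost(worker_seconds: float, price_per_second: float,
                   request_times: list) -> list:
    """Split a worker's billed cost across concurrent requests,
    proportional to each request's execution time.

    worker_seconds: total billed active time for the worker
    price_per_second: the endpoint's per-second rate
    request_times: per-request execution times (any consistent unit)
    """
    total = sum(request_times)
    if total == 0:
        return [0.0] * len(request_times)
    cost = worker_seconds * price_per_second
    return [cost * t / total for t in request_times]
```

For example, a worker billed 100 s at $0.01/s that served two requests taking 3 s and 1 s of execution time would attribute $0.75 and $0.25 to them respectively.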
That is a bummer. I really need to be able to calculate our cost per user when making requests, and total_tokens unfortunately can't help me with that.
I would use the standard endpoints, since they do return a job ID, but I need the guided_json field, and it seems only the OpenAI endpoints support that.
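For context, guided_json is a vLLM extra parameter that constrains generation to a JSON schema; on the OpenAI-compatible route it can be passed as an extra field in the request body. A sketch (base URL, model name, and schema are placeholders), using plain HTTP rather than the OpenAI client:

```python
import requests

def build_guided_payload(model: str, prompt: str, schema: dict) -> dict:
    """Chat-completions request body carrying vLLM's guided_json extra field."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "guided_json": schema,  # vLLM constrains output to this JSON schema
    }

def guided_completion(base_url: str, api_key: str, payload: dict) -> dict:
    """POST to the OpenAI-compatible chat/completions route of the endpoint.
    base_url is e.g. https://api.runpod.ai/v2/<endpoint_id>/openai/v1 (placeholder)."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        json=payload,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return resp.json()

# Hypothetical schema -- shape it to the structure you need back.
CITY_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "population": {"type": "integer"}},
    "required": ["city", "population"],
}
```

The trade-off described in the thread remains: this route accepts guided_json but, unlike the standard `/run` route, its response carries no job ID for later status or cancel calls.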
Yes, as I mentioned, I tried this, but the OpenAI API endpoints don't return a job ID I can use. They only return an ID like "chat-652d581fee6c4bffb771c43b371b444e", which doesn't seem to be a job ID.