RunPod · 5mo ago
Rayboy

Using the vLLM RunPod worker image and the OpenAI endpoints, how can I get the executionTime?

The standard endpoint returns executionTime along with a job ID that I can pass to /status:
{
"delayTime": 598,
"executionTime": 1276,
"id": "84407cd5-63c4-45d6-aa56-b1f136c44d14-u1",
"output": ...
}
The OpenAI API endpoints unfortunately do not provide this. They return only token usage and a "chat-" ID that I might be able to do something with, but I cannot find any documentation on it:
{
"choices": ...
"created": 1723501967,
"id": "chat-652d581fee6c4bffb771c43b371b444e",
"model": ...,
"object": "chat.completion",
"usage": {
"completion_tokens": 100,
"prompt_tokens": 17,
"total_tokens": 117
}
}
Any help would be appreciated!
5 Replies
yhlong00000 · 5mo ago
You can call https://api.runpod.ai/v2/endpoint_id/status/job_id to get the execution time.
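For anyone following along, here is a minimal Python sketch of that status lookup. It assumes the URL pattern above, with `endpoint_id` and `job_id` replaced by real values, and that the endpoint accepts a bearer-token `Authorization` header. The function names are hypothetical, and the response shape is taken from the standard-endpoint sample earlier in the thread. It only helps when you have a job ID, which the standard endpoints return.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def status_url(endpoint_id: str, job_id: str) -> str:
    """Build the status URL following the pattern quoted above."""
    return f"{API_BASE}/{endpoint_id}/status/{job_id}"


def fetch_status(endpoint_id: str, job_id: str, api_key: str) -> dict:
    """Fetch the job status as a dict (bearer-token auth is an assumption)."""
    request = urllib.request.Request(
        status_url(endpoint_id, job_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def execution_time(status: dict):
    """Return the executionTime field, or None if the response lacks it."""
    return status.get("executionTime")
```

With the sample response from the original post, `execution_time({"delayTime": 598, "executionTime": 1276})` gives back the raw `1276` value.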
Rayboy (OP) · 5mo ago
Yes, as I mentioned, I tried this, but the OpenAI API endpoints do not return a job ID for me to use. They only return "chat-652d581fee6c4bffb771c43b371b444e", which does not seem to be a job ID. I would use the standard endpoints, since they do return a job ID, but I need the guided_json field and it seems only the OpenAI endpoints support that.
nerdylive · 5mo ago
Ah yeah, it doesn't. Also, one worker can process multiple requests at the same time in vllm-worker.
Rayboy (OP) · 5mo ago
That is a bummer. I really need to be able to calculate our costs per user when making requests, and total_tokens can't help me with that, unfortunately. It's good that one worker can process multiple requests, though that makes costs hard to calculate this way, since it would actually be cheaper when multiple requests run on the same active worker. I may need to figure out another way to calculate it. Would it be possible in a future update to return the job ID from the OpenAI endpoints? I would also like to be able to cancel the job from our server, but I currently cannot.
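One possible workaround, sketched here purely as an assumption and not as anything RunPod or vllm-worker provides: record the wall-clock start and end of each OpenAI request on our own server, then split the worker's busy time evenly among whichever requests overlap at each moment. The function name and the input format are hypothetical, and client-side timings include network latency and queue delay, so the result is only an approximation of real execution time.

```python
def apportion_busy_time(intervals: dict) -> dict:
    """Split busy time evenly among overlapping requests.

    intervals maps a request ID to a (start, end) pair in seconds,
    as measured on the calling server (hypothetical input format).
    Returns a dict mapping each request ID to its share in seconds.
    """
    # Every start or end is a boundary; the set of concurrently
    # active requests is constant between two consecutive boundaries.
    boundaries = sorted({t for start, end in intervals.values() for t in (start, end)})
    shares = {request_id: 0.0 for request_id in intervals}

    for t0, t1 in zip(boundaries, boundaries[1:]):
        active = [
            request_id
            for request_id, (start, end) in intervals.items()
            if start <= t0 and end >= t1
        ]
        for request_id in active:
            shares[request_id] += (t1 - t0) / len(active)

    return shares
```

For example, two requests spanning seconds 0 to 4 and 2 to 6 keep a worker busy for 6 seconds in total, and each request is attributed 3 of them. Multiplying each share by the worker's per-second price would then give a rough per-request cost.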
yhlong00000 · 5mo ago
Thanks for the feedback, I've noted it internally.