Created by Rayboy on 8/12/2024 in #⚡|serverless
Using the vLLM RunPod worker image and the OpenAI endpoints, how can I get the executionTime?
The standard endpoint returns executionTime, along with a job ID that I can query /status with:
{
  "delayTime": 598,
  "executionTime": 1276,
  "id": "84407cd5-63c4-45d6-aa56-b1f136c44d14-u1",
  "output": ...
}
The OpenAI-compatible endpoints unfortunately do not provide this, only token usage and a "chat-" ID that I might be able to do something with, but I cannot find any documentation on:
{
  "choices": ...,
  "created": 1723501967,
  "id": "chat-652d581fee6c4bffb771c43b371b444e",
  "model": ...,
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 17,
    "total_tokens": 117
  }
}
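For context, here is roughly how I'm calling the OpenAI-compatible route (endpoint ID, API key, and model name are placeholders):

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_RUNPOD_API_KEY",  # placeholder
    base_url="https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/openai/v1",  # vLLM worker's OpenAI-compatible base URL
)

# The response only carries the chat completion fields shown above,
# with no delayTime/executionTime and no job ID I can pass to /status.
completion = client.chat.completions.create(
    model="YOUR_MODEL_NAME",  # placeholder
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.id, completion.usage)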
Any help would be appreciated!