RunPod vLLM Context Window
Hi, I've been using this template in my serverless endpoint:
https://github.com/runpod-workers/worker-vllm
I'm wondering what my context window is and how it's handling chat history.
{
  "conversation": {
    "id": "some_conversation_id", // This should be the ID of the conversation
    "messages": [
      {
        "source": "USER",
        "content": "Previous messages in the conversation..."
      }
      // ... other previous messages
    ]
  },
  "message": {
    "content": "Tell me why RunPod is the best GPU provider",
    "source": "USER"
  }
}
I follow the above as my input to the endpoint.
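For anyone reproducing this, here is a minimal Python sketch of how an input in the shape above could be assembled on the client, dropping the oldest messages once a size budget is exceeded. It is only an illustration under assumptions: `build_input` and `max_chars` are hypothetical names, the character count is a crude stand-in for a real token count, and the "ASSISTANT" source value is assumed (the example above only shows "USER"). It does not show what the worker does with the history, and the real limit depends on the model and the worker configuration.

```python
import json


def build_input(conversation_id, history, new_message, max_chars=8000):
    """Build an input dict in the shape shown above.

    max_chars is a hypothetical client-side budget measured in characters,
    not a setting of the worker or of the model.
    """
    kept = []
    used = len(new_message)
    # Walk the history newest-first so the most recent turns survive trimming.
    for msg in reversed(history):
        size = len(msg["content"])
        if used + size > max_chars:
            break
        kept.append(msg)
        used += size
    kept.reverse()  # restore chronological order
    return {
        "conversation": {"id": conversation_id, "messages": kept},
        "message": {"content": new_message, "source": "USER"},
    }


history = [
    {"source": "USER", "content": "a" * 50},
    {"source": "ASSISTANT", "content": "b" * 50},  # assumed source value
    {"source": "USER", "content": "c" * 50},
]
payload = build_input(
    "some_conversation_id",
    history,
    "Tell me why RunPod is the best GPU provider",
    max_chars=150,
)
print(json.dumps(payload, indent=2))
```

With the small budget used here, the oldest message is dropped and the two most recent ones are kept in their original order. A tokenizer matching the deployed model would give a more accurate budget than counting characters.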