How to configure a one-to-one mapping of client connections to worker/GPU instances
I am building an application where a client connects to a worker and the worker streams some content to the client over a WebSocket. I want to configure this setup to force a one-to-one mapping of clients to workers. In other words, I would like precise control over how individual client requests are allocated to workers. I tried setting the request count to 1 to force the endpoint to spin up one worker per client connection, but that didn't work. The endpoint does spin up one worker per endpoint, but it still routes multiple client connections through the same worker at least some of the time, because it handles load balancing with logic that doesn't seem to be accessible, as far as I've found.
3 Replies
Yes, you can do this.
Just create an endpoint with your template and use ports in serverless.
But jobs aren't for connecting; they're only for starting workers and then sending the IP/port details to your application backend.
So you connect to the WS using the IP/port details from the job.
When you're done, you can send some data/signal over the WS connection; the worker handles it and then returns valid JSON to exit the worker (turn it off) and stop charging.
You can connect however you like; just send as many requests as the number of workers you want to have.
Feel free to ask more if you're confused, this may be a little complicated.
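To make the backend side of this a bit more concrete, here is a rough Python sketch of how the one-to-one bookkeeping could look. Everything in it is an assumption for illustration: `WorkerRegistry`, `start_worker`, and the fake address factory are hypothetical names, not part of any platform SDK. In a real setup, `start_worker` would submit one job to the endpoint and return the IP/port that the new worker reports back.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Address = Tuple[str, int]  # (ip, port) reported by a worker


@dataclass
class WorkerRegistry:
    # Hypothetical callable: submits one job to the endpoint and returns
    # the (ip, port) that the freshly started worker reports back.
    start_worker: Callable[[], Address]
    _by_client: Dict[str, Address] = field(default_factory=dict, init=False)

    def acquire(self, client_id: str) -> Address:
        """Return this client's dedicated worker, starting one if needed."""
        if client_id not in self._by_client:
            self._by_client[client_id] = self.start_worker()
        return self._by_client[client_id]

    def release(self, client_id: str) -> None:
        """Drop the mapping after the client has sent its 'done' signal."""
        self._by_client.pop(client_id, None)


def fake_start_worker_factory() -> Callable[[], Address]:
    """Stand-in for real job submission: each call yields a new fake port."""
    ports = iter(range(9000, 9100))
    return lambda: ("10.0.0.1", next(ports))


registry = WorkerRegistry(start_worker=fake_start_worker_factory())
a = registry.acquire("client-a")  # one request, one worker
b = registry.acquire("client-b")  # a second request gets a second worker
```

The point is that the backend, not the endpoint's load balancer, decides which client talks to which worker: each client only ever receives the address of the worker that was started for it, and the mapping is dropped once that worker has been told to exit.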
If websocket works that can be an option
What do you mean?