Draining During Deployments

How long does Railway wait to drain old instances before tearing them down during deployments? We have requests that can take up to 60s to serve and want to make sure we're not dropping users during deployments. Is there any way to configure this?
14 Replies
Percy
Percy13mo ago
Project ID: 33c47f57-f4aa-4640-9b44-cd0a3f034b71
nootrality
nootrality13mo ago
33c47f57-f4aa-4640-9b44-cd0a3f034b71
Brody
Brody13mo ago
3 seconds from removal to killing the container.
nootrality
nootrality13mo ago
Oof, any way to increase that?
Brody
Brody13mo ago
no, however...
Brody
Brody13mo ago
nootrality
nootrality13mo ago
oh great, so setting the RAILWAY_DEPLOYMENT_OVERLAP_SECONDS var to e.g. 60 is sufficient to delay the teardown? i.e. the container stops getting requests but has 60s additional?
Brody
Brody13mo ago
I think it would stop getting requests, but you'd have to do some testing around that since I've never toyed around with that setting myself
nootrality
nootrality13mo ago
ok let me try it and i'll report back here for posterity thanks
Brody
Brody13mo ago
no problem
Adam
Adam13mo ago
It would still get requests, iirc. That env variable sets the time between when your new deployment goes live and when your old deployment goes down.
All requests will go to your old deployment until that time is up, after which the old deployment is killed. There's no way around your issue without some custom middleware.
Your downtime will be minimal. After your new deployment is active there will be no downtime, and you shouldn't be pushing to prod too often anyway.
If you have a large number of users who need constant access, you should release full version updates, not patches.
nootrality
nootrality13mo ago
yeah, i think the issue here is that we frequently have many requests still outstanding at 3 seconds, and disconnected sockets are hard to recover from
i won't debate the merits of deploying frequently, but regardless we can't eat 2-3 self-inflicted events a day
ok testing complete, @Adam seems to be right. the env variable does keep the old deploy around, but requests are still being routed to it, meaning we still end up dropping those connections
is there no way to make RAILWAY_DEPLOYMENT_OVERLAP_SECONDS start routing incoming traffic to the newer deploy as soon as it succeeds?
Adam
Adam13mo ago
No, that’s not a feature on Railway but you can add it as a feature request in #🤗|feedback
nootrality
nootrality13mo ago
ok final update here just in case anyone runs into this thread. DNS flips currently take between 5 and 15 seconds, so if your max request duration is, say, X, you need to set RAILWAY_DEPLOYMENT_OVERLAP_SECONDS to 15+X. that actually does work today, but if you're kicking off new requests within 15 seconds of a successful deployment, they may still be directed to the old deployment. thanks @Adam @Brody
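
For anyone who wants to sanity-check the 15+X rule above, here is a minimal Python sketch of the arithmetic. It is a simplified model, not Railway's actual routing logic: the 15-second worst-case flip time comes from the observation in this thread and may change, and the function names `overlap_seconds` and `is_dropped` are made up for illustration.

```python
# Worst-case time for traffic to flip to the new deployment.
# Assumption: taken from the 5-15 s range reported in this thread.
MAX_FLIP_SECONDS = 15


def overlap_seconds(max_request_seconds: float,
                    flip_seconds: float = MAX_FLIP_SECONDS) -> float:
    """Overlap needed so a request that starts just before the flip can finish."""
    if max_request_seconds < 0 or flip_seconds < 0:
        raise ValueError("durations must be non-negative")
    return flip_seconds + max_request_seconds


def is_dropped(start: float, duration: float,
               flip: float, overlap: float) -> bool:
    """Return True if a request would be cut off in this simplified model.

    `start` is seconds after the new deployment succeeds. Requests that
    arrive before the flip go to the old deployment, which is killed
    `overlap` seconds after the new deployment succeeds.
    """
    routed_to_old = start < flip
    return routed_to_old and start + duration > overlap


# Value to set for a 60 s maximum request duration.
print(f"RAILWAY_DEPLOYMENT_OVERLAP_SECONDS={overlap_seconds(60)}")
```

In this model, a 60s request that arrives 14 seconds after the new deploy succeeds still lands on the old deployment and only completes if the overlap is at least 74 seconds, which is why the overlap has to cover the flip time plus the longest request rather than the longest request alone.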