Railway•4w ago
hmh

Health check failing after minor code change

I made a very minor text change, and now my deployment isn't working, whereas it was working fine yesterday. The deploy logs look normal, but my build logs show that my health check isn't working (it works locally).
Solution:
gunicorn -b [::]:$PORT project.wsgi
59 Replies
Percy
Percy•4w ago
Project ID: N/A
hmh
hmh•4w ago
N/A My health check is /health which serves:
from django.http import HttpResponse

def health_check(request):
    return HttpResponse(status=200)
Brody
Brody•4w ago
what is the health check failing with?
hmh
hmh•4w ago
Attempt #6 failed with service unavailable. Continuing to retry for 6m31s
Brody
Brody•4w ago
are you on the legacy or v2 runtime? check your service settings
hmh
hmh•4w ago
legacy
Oh wait, it's V2?
Brody
Brody•4w ago
on the v2 runtime your app needs to listen on ::
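A quick way to see which addresses an app is actually reachable on is to try connecting over both IPv4 and IPv6. The snippet below is a minimal sketch, not anything from Railway's tooling: the helper name `accepts_connections` and the fallback port 8000 are assumptions made for illustration. Note that a successful connection to `::1` on your own machine only shows the app answers on the IPv6 loopback; it does not prove the app is bound to `::`.

```python
import os
import socket


def accepts_connections(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or the address family is unavailable.
        return False


if __name__ == "__main__":
    # PORT mirrors the variable used in the start command; 8000 is an assumed fallback.
    port = int(os.environ.get("PORT", "8000"))
    for host in ("127.0.0.1", "::1"):
        print(host, accepts_connections(host, port))
```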
hmh
hmh•4w ago
Did it get auto-switched or something?
Brody
Brody•4w ago
might have
hmh
hmh•4w ago
I don't love that. Would've been nice to know about a breaking change like this.
Is it safe for me to switch back to legacy? And where would I find docs for the difference between legacy and V2?
angelo
angelo•4w ago
You can indeed switch to legacy.
angelo
angelo•4w ago
We expect the legacy runtime to stay in place for as long as we get the expected behavior that our users need.
Brody
Brody•4w ago
https://docs.railway.app/reference/runtime
if the healthcheck issue is the only issue you face, I cannot recommend switching back to the legacy runtime
fwiw the health check issue has been reported to the team
hmh
hmh•4w ago
That's fair, but I'm not always looking at the changelog unless I'm interested to see what new features are available to me. I do get emails about the changelog with some basic bullet points, but it would've been nice to have, in this email or a separate email, a message along the lines of: "Starting 6/x/2024, all services will be switched from Legacy to V2, and here's what you need to do before then:"
hmh
hmh•4w ago
Ohhh. So it wasn't anticipated. That's fair. Ok. Thanks for the help! Will fix up my /health response 😄
Brody
Brody•4w ago
it's a fair assumption that the changelogs would only include new features, and they do, but they also mention migration timelines and such for new features, and new features always have the possibility to cause issues
hmh
hmh•4w ago
True, but imo known breaking changes should be communicated more directly. I don't always have the time to read changelogs for all of the services I use. My project is a hobby project, so no bigs, but for the enterprise customers, that could put a snag in their work. Luckily the support here is really on top of things!
Brody
Brody•4w ago
i dont think this was known tbh, but i have no way to know for sure
waiting to hear back from char on this issue
hmh
hmh•4w ago
Yeah, in that case, it's a hiccup. And good on the Railway team for testing with Hobby accounts first so they can find these issues before they reach enterprise customers.
I'm still having issues getting this to work. I've added [::1]:$PORT to my gunicorn command. I've confirmed this works locally, but I'm still having trouble with the health check.
So it was gunicorn project.wsgi and now it's gunicorn -b 127.0.0.1:$PORT -b [::1]:$PORT project.wsgi
Brody
Brody•4w ago
it needs to be :: not ::1
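The difference between `::` and `::1` is the same as the one between `0.0.0.0` and `127.0.0.1`: the first of each pair is the wildcard (all interfaces), the second is loopback (only reachable from inside the same host or container). As a small illustration using Python's standard `ipaddress` module (the helper name `describe` is made up for this sketch):

```python
import ipaddress


def describe(addr: str) -> str:
    """Classify a bind address as wildcard, loopback, or specific."""
    ip = ipaddress.ip_address(addr)
    if ip.is_unspecified:
        return "wildcard"  # all interfaces for that address family
    if ip.is_loopback:
        return "loopback"  # only reachable from the same host/container
    return "specific"


if __name__ == "__main__":
    for addr in ("::", "::1", "0.0.0.0", "127.0.0.1"):
        print(f"{addr:<10} {describe(addr)}")
```

A health check that arrives from outside the container cannot reach a loopback-only listener, which would be consistent with `[::1]` working locally but failing on the platform.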
hmh
hmh•4w ago
Ah, see, I tried that, but I get [ERROR] Connection in use: ('::', 65090)
angelo
angelo•4w ago
Dumb ask, and unsure if you did this already: can you confirm that switching to Legacy fixes the issue? Wanna make sure our network engineer can do a proper repro.
hmh
hmh•4w ago
I figured that there's already something running there.
Brody
Brody•4w ago
yes, check #🦸|conductor-chat
hmh
hmh•4w ago
I'll give it a try here and confirm. I don't have access to that channel.
angelo
angelo•4w ago
He is flagging me to another case 🙂
hmh
hmh•4w ago
ahhh
angelo
angelo•4w ago
We just wanna have more languages to test runtime with hence why I ask. The more cases the better.
hmh
hmh•4w ago
Sounds good! Yeah, I'll test and report back.
angelo
angelo•4w ago
And sorry to use you as a test pig, I can comp you the month since you are doing QA work.
Solution
Brody
Brody•4w ago
gunicorn -b [::]:$PORT project.wsgi
hmh
hmh•4w ago
Yup! Tried that, and got the "Connection in use" error.
Much appreciated!
angelo
angelo•4w ago
new role added, comped, test away, let us know when you have recovered the healthcheck
hmh
hmh•4w ago
Ah, looks like that only gets the newest logs. Here's what it shows:
[2024-06-11 19:11:19 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2024-06-11 19:11:19 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:19 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:20 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:20 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:21 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:21 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:22 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:22 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:23 +0000] [1] [ERROR] Connection in use: ('::', 65090)
[2024-06-11 19:11:23 +0000] [1] [ERROR] Retrying in 1 second.
[2024-06-11 19:11:24 +0000] [1] [ERROR] Can't connect to ('::', 65090)
container event container died
Brody
Brody•4w ago
ill try to reproduce
what version of gunicorn?
hmh
hmh•4w ago
I changed my gunicorn command back to what it was, and flipped the runtime to Legacy and it deployed successfully. And that's including the minor code change mentioned in the original post.
21.2.0
angelo
angelo•4w ago
Gotcha- that seems to be enough, going to add this case on the Runtime V2 blockers in the root thread.
hmh
hmh•4w ago
Great. Thanks!
Brody
Brody•4w ago
my start command is gunicorn -b [::]:$PORT main:app on the v2 runtime with the same gunicorn version you are using, so this new error doesnt look like a v2 vs legacy issue
hmh
hmh•4w ago
But what would already be running on that port? 🤔 In my case.
Brody
Brody•4w ago
does your container run gunicorn and only gunicorn?
hmh
hmh•4w ago
It runs a couple django commands before gunicorn. migrate and collectstatic
Brody
Brody•4w ago
can you provide the full command
hmh
hmh•4w ago
python manage.py migrate && python manage.py collectstatic --noinput && gunicorn project.wsgi
Brody
Brody•4w ago
and what was the command when you got this error?
hmh
hmh•4w ago
python manage.py migrate && python manage.py collectstatic --noinput && gunicorn -b [::]:$PORT project.wsgi
I may have had an extra -b 127.0.0.1:$PORT in there for IPv4. Testing just [::] atm
Brody
Brody•4w ago
that would do it, gunicorn supports dual stack binding anyway so that wouldnt be needed, 127.0.0.1 would also be the incorrect address
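One plausible reading of the "Connection in use" error: when a socket is already bound to an IPv4 address on a port, a dual-stack bind to `::` on the same port can be refused, because the IPv6 wildcard also covers IPv4. The sketch below tries to reproduce that with plain sockets rather than gunicorn itself. The function name and return strings are invented for illustration, and the outcome depends on the operating system and its IPv6 settings, so treat it as an experiment rather than a guarantee.

```python
import errno
import socket


def bind_v4_then_v6_wildcard(v4_host: str = "127.0.0.1") -> str:
    """Bind an IPv4 socket, then try a dual-stack [::] bind on the same port."""
    if not socket.has_ipv6:
        return "ipv6-unavailable"
    v4 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        v4.bind((v4_host, 0))  # port 0: let the OS pick a free port
        v4.listen()
        port = v4.getsockname()[1]
        try:
            v6 = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
        except OSError:
            return "ipv6-unavailable"
        try:
            # Dual stack: ask the IPv6 wildcard socket to accept IPv4 traffic too.
            v6.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            v6.bind(("::", port))
            return "both-bound"
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return "address-in-use"
            return "other-error"
        finally:
            v6.close()
    finally:
        v4.close()


if __name__ == "__main__":
    print(bind_v4_then_v6_wildcard())
```

If this reports `address-in-use` on your machine, it matches the behaviour in the logs above: the extra `-b 127.0.0.1:$PORT` takes the port first, and the `[::]` bind then fails.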
hmh
hmh•4w ago
Their documentation seems to suggest that you need to state both: https://docs.gunicorn.org/en/stable/settings.html#bind and I'm assuming the correct address is 0.0.0.0?
angelo
angelo•4w ago
Yes, binding on 127.0.0.1 won't bind properly. But wondering why legacy did it.
hmh
hmh•4w ago
I didn't have that for legacy. I was just adding it in because I assumed I needed it if I also needed to have IPv6. My bad.
Alright, well. It worked with python manage.py migrate && python manage.py collectstatic --noinput && gunicorn -b [::]:$PORT project.wsgi on V2
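If you would rather keep the bind address out of the start command, the same setting can live in a gunicorn config file, which is plain Python. This is a sketch under assumptions: it relies on gunicorn picking up `gunicorn.conf.py` from the working directory, and the fallback port 8000 is invented for local runs where `PORT` is not set.

```python
# gunicorn.conf.py
import os

# PORT is injected by the platform; 8000 is an assumed fallback for local use.
port = os.environ.get("PORT", "8000")

# A single IPv6 wildcard bind, equivalent to `-b [::]:$PORT` on the command line.
bind = f"[::]:{port}"
```

With a file like this in place, the last step of the start command could go back to `gunicorn project.wsgi`.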
Brody
Brody•4w ago
by default gunicorn binds to 0.0.0.0:$PORT so that would have worked for legacy as i suggested 🙂
hmh
hmh•4w ago
Yup! For some reason I thought I tested that. Sorry about that.
Brody
Brody•4w ago
no worries
hmh
hmh•4w ago
Guess this isn't a new bug then, Angelo! I apologize. New to messing with IPv6.
Brody
Brody•4w ago
it is a new bug, you should not need to listen on ipv6 just for the health check to work
angelo
angelo•4w ago
Yea, if any behavior is different vs. old, it's a bug. You did us a favor.
Brody
Brody•3w ago
technically solved
Update: health checks can now pass if your app only listens on 0.0.0.0, but if you have already changed it to :: there's no point in changing anything back, as listening on :: has no known drawbacks.