Railway•10mo ago
Floris

Can't seem to get the healthcheck to work; works fine locally with FastAPI and Flask

service ID: 2a262f8f-be17-475a-8463-21e12fafebbf I really hate opening this ticket guys, i'm sorry in advance, but i can't seem to figure it out. i'm sure it's something small i must have missed. i'm running a pretty big python repository with 7-10 seconds' worth of healthchecks being done before returning status 200, but when i deploy on railway it just keeps timing out (the API itself worked fine with the current config, it's just the healthcheck endpoint that is acting up). for context, i am running main.py from my procfile and my API is in another python file (both are being initialized though). Also, the API has to run on port 4242 as it's interacting with the stripe API via webhooks
171 Replies
Percy
Percy•10mo ago
Project ID: 2a262f8f-be17-475a-8463-21e12fafebbf
Floris
Floris•10mo ago
if anyone has some spare time and would be willing to try and help me out, i'd greatly appreciate it. @brody192 what are my options to run multiple processes concurrently if procfiles are off the board? i don't really wanna subprocess into different py files with popen
Brody
Brody•10mo ago
you wanna go over that before we get the health check working?
Floris
Floris•10mo ago
that is the root problem of my healthcheck, as i have a main file and a separate api file for my endpoints, and i was trying to init both of them separately via the procfile, hence the endpoint not working. deploying via 2 services is not really an option as that would defeat the point of the healthcheck
Brody
Brody•10mo ago
interesting setup you have
Floris
Floris•10mo ago
yes
Brody
Brody•10mo ago
what does the main.py file do on its own?
Floris
Floris•10mo ago
it's the main handler, as it's a monorepo. the repo is like 8 or 9k lines
Brody
Brody•10mo ago
how many services in this monorepo?
Floris
Floris•10mo ago
i wire everything through 1, or well, i did till the healthcheck had other ideas
Brody
Brody•10mo ago
how many different ports are in use?
Floris
Floris•10mo ago
1 pre-set for stripe and 1 that railway assigns itself, i believe?
Brody
Brody•10mo ago
I see, but unfortunately, per service you can only expose one port publicly, and ideally your app listens on $PORT
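As a rough illustration of the point about $PORT: a minimal sketch, assuming the server reads the `PORT` environment variable at startup and falls back to 4242 only for local runs. The helper name `resolve_port` is hypothetical and not part of any framework.

```python
import os

LOCAL_FALLBACK_PORT = 4242  # the port used for local Stripe webhook testing in this thread


def resolve_port(environ=None, fallback=LOCAL_FALLBACK_PORT):
    """Return the port the web server should bind to.

    Prefers the PORT environment variable and uses the fallback
    only when PORT is unset or empty.
    """
    env = os.environ if environ is None else environ
    raw = env.get("PORT", "").strip()
    return int(raw) if raw else fallback


if __name__ == "__main__":
    print(f"would bind to 0.0.0.0:{resolve_port()}")
    # With FastAPI and uvicorn installed, the server start would typically be:
    #   uvicorn.run(app, host="0.0.0.0", port=resolve_port())
```

With this shape, the health route and the Stripe webhook route would both live on the same app and the same port, distinguished only by path.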
Floris
Floris•10mo ago
i see, i suppose i can route everything over 1 port, no? as long as my endpoints are different. it's only internal traffic that goes over that api so it's not a big deal
Brody
Brody•10mo ago
yep, the usual solution for services that would otherwise listen on multiple ports is to use different endpoints on a single port
Floris
Floris•10mo ago
it's ok, i only run stripe over 4242 and the other port would be health
Brody
Brody•10mo ago
ideally in the future you would be able to map internal ports to different domains on port 443 externally
Floris
Floris•10mo ago
i have 3 assigned domains
Brody
Brody•10mo ago
but the healthcheck endpoint does need to be served on $PORT, since that check is made internally
Brody
Brody•10mo ago
yeah there's no native way to map those external domains to internal ports on your service, without running a proxy that does host matching
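The host-matching idea can be sketched as a lookup from the incoming `Host` header to an internal port. This is only the routing decision, not a working proxy, and the hostnames and ports below are placeholders rather than anything from this project.

```python
# Hypothetical hostnames and internal ports, for illustration only.
UPSTREAM_PORTS = {
    "api.example.com": 8000,
    "stripe.example.com": 4242,
    "health.example.com": 9000,
}
DEFAULT_PORT = 8000


def pick_upstream_port(host_header: str) -> int:
    """Map an incoming Host header to the internal port that should serve it.

    Strips an optional ":port" suffix and ignores case.
    IPv6 literals are not handled in this sketch.
    """
    hostname = host_header.split(":", 1)[0].strip().lower()
    return UPSTREAM_PORTS.get(hostname, DEFAULT_PORT)


if __name__ == "__main__":
    for host in ("stripe.example.com", "API.example.com:443", "unknown.example.com"):
        print(host, "->", pick_upstream_port(host))
```

A real proxy would listen on the single public port and forward each request to `127.0.0.1` on the port chosen this way.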
Floris
Floris•10mo ago
ahh okay okay, that's a shame. endpoints it is
Brody
Brody•10mo ago
if you want an example of that, I have one prepared just so you don't need to modify any of your code
Floris
Floris•10mo ago
yes sure if you want
Brody
Brody•10mo ago
okay one sec, let me find it, since I whipped up an example for someone else that wanted to map internal ports to different subdomains on the same service: https://discord.com/channels/713503345364697088/1154106744306421830/1154267922714345523 if you need any clarifications on anything I said in that thread just ask
Floris
Floris•10mo ago
quite impressive that you came up with that bro, jesus
Brody
Brody•10mo ago
everything I've learnt about railway is from being with the community
Floris
Floris•10mo ago
that's amazing, i've never been in any coding communities. but basically the crux of what you're saying is i can't have 4242 if i want the health path to work without a proxy server
Brody
Brody•10mo ago
yeah since your app listens on different ports
Floris
Floris•10mo ago
well guys that was 4 hours down the drain 💀 back to github actions i go. thanks for the help, i appreciate it
Brody
Brody•10mo ago
haha I was asleep 4 hours ago, wish I could have gotten to help you sooner
Floris
Floris•10mo ago
imma try 1 sketchy thing
Brody
Brody•10mo ago
ouuu what ya got in mind
Floris
Floris•10mo ago
imma subprocess the bitch, i'm sick of it. via daphne
Brody
Brody•10mo ago
does what you're going to subprocess need to access the same filesystem as the rest of the monorepo services?
Floris
Floris•10mo ago
i'm subprocessing the endpoint for the healthcheck
Brody
Brody•10mo ago
because you could just run the 3 things separately in 3 different services
Floris
Floris•10mo ago
it would defeat the point
Brody
Brody•10mo ago
unless your healthcheck actually does more than just return 200
Floris
Floris•10mo ago
the whole idea of having the health check halter is to stop any faulty commits coming through, so if i have the same repo on multiple services, one would always need to be out of sync with the commits of the other. that's not really practical. i believe it does like 39 healthchecks internally, or 37, i'm not sure
Brody
Brody•10mo ago
impressive
Floris
Floris•10mo ago
and that just returns 1 int if all pass. yeah i mean we can't afford our main branch being down for some stupid reason
Brody
Brody•10mo ago
that's a whole lot more thorough than return 200 in a /health route
Floris
Floris•10mo ago
yes, isn't that the point of it though, or am i wrong hahaha woops
Brody
Brody•10mo ago
no, you are definitely using a health check properly, though simply returning a fixed status code of 200 is still useful too. not for your case, but you know
Floris
Floris•10mo ago
yeah but then i could just use the webhooks from railway, no? for the deployment status change, if that's all i want to query. does a healthcheck have to be completed in between the retries?
Brody
Brody•10mo ago
for simple code bases it's often plenty to just return 200 so that railway knows when your new deployment is ready to start accepting requests, giving you less of a switchover period
Floris
Floris•10mo ago
ahhh i see i see. this is our main backend repo so 80% of our code is here
Brody
Brody•10mo ago
railway will just retry your healthcheck endpoint for up to 5 minutes
Floris
Floris•10mo ago
hence why it's big. yes i know, but it does like 20 retries or something. is there a time limit on how long the check can last in between the retries? that's what i wonder
Brody
Brody•10mo ago
and if it never gets a 200, it will never switch in your new deployment
Floris
Floris•10mo ago
yes i like that
Brody
Brody•10mo ago
oh I see what you mean
Floris
Floris•10mo ago
as github actions is slow
Brody
Brody•10mo ago
what is the timeout of the individual check?
Floris
Floris•10mo ago
healthcheck takes longer than 10 seconds. i don't know, that's what i wonder
Brody
Brody•10mo ago
that's a good question
Floris
Floris•10mo ago
i have tried coroutining everything already but i can't get below 7 seconds
Brody
Brody•10mo ago
I'm not 100% sure but I can't imagine it wouldn't wait for 10 seconds
Floris
Floris•10mo ago
yeah i mean you never know, could be something small
Brody
Brody•10mo ago
what are your deployment logs looking like during the healthcheck attempts?
Floris
Floris•10mo ago
attempt # 9392932 failed every few seconds
Brody
Brody•10mo ago
what does that correlate to in code? failed database connection or something?
Floris
Floris•10mo ago
i don't know, it's generated by railway. it doesn't get past building if it doesn't pass health
Brody
Brody•10mo ago
yes it does actually
Floris
Floris•10mo ago
ah ok, for me it doesn't
Brody
Brody•10mo ago
your deployment is still run; there should be deployment logs, since your deployment needs to be running for any type of healthcheck to work
Floris
Floris•10mo ago
oh i see, wtf, i didn't see that up until now
Floris
Floris•10mo ago
build
Brody
Brody•10mo ago
click deploy logs
Floris
Floris•10mo ago
it's clearly returning 200
Brody
Brody•10mo ago
well isn't that odd
Floris
Floris•10mo ago
but railway doesn't think so
Floris
Floris•10mo ago
that's my fastapi get
Floris
Floris•10mo ago
serving a non-200 before it's done with the coroutine, then serves 200 as you see here
Brody
Brody•10mo ago
you run the healthcheck on every request to that endpoint?
Floris
Floris•10mo ago
technically yeah but it's awaiting
Brody
Brody•10mo ago
okay, now throw in an early return 200 just for fun and skip the actual health check. as I understand your situation, you are not working on a live site right now so it doesn't matter if stuff crashes?
Floris
Floris•10mo ago
hahaha no, it is live, just not on railway, so it's ok
Brody
Brody•10mo ago
perfect, just as I thought
Floris
Floris•10mo ago
the whole idea of railway was to use it like a node with 2 more, but it would need to fit in 1 service then
Brody
Brody•10mo ago
gotcha
Floris
Floris•10mo ago
it takes ages to build though, give it 4 min
Brody
Brody•10mo ago
yep, let me know how that goes
Floris
Floris•10mo ago
also i really appreciate you helping me
Brody
Brody•10mo ago
is a 4 minute build normal?
Floris
Floris•10mo ago
i have been frustrated with this the entire day. yeah bro
Brody
Brody•10mo ago
are you deploying with a dockerfile?
Floris
Floris•10mo ago
yes, it's dockerizing it automatically
Brody
Brody•10mo ago
haha well yeah, I was more so asking if you were bringing your own Dockerfile to the party
Floris
Floris•10mo ago
ah no, it's just a repo with raw .py
Brody
Brody•10mo ago
gotcha
Floris
Floris•10mo ago
it went through the health check now, so that means there IS an individual time limit
Floris
Floris•10mo ago
whereas, as you saw, my long ass healthcheck DID return 200 eventually, so that's not cool
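One way to picture this behaviour is a toy model of a deploy gate with two limits: a total retry window (the 5 minutes mentioned earlier in the thread) and a timeout on each individual attempt. The per-attempt value is not stated anywhere in this thread, so the 5 seconds used below is purely an assumption, as is the retry interval. The function is an illustration, not Railway's actual logic.

```python
def wait_until_healthy(probe, per_attempt_timeout, total_window, retry_interval):
    """Simulate a deploy gate using simulated time (no real sleeping).

    `probe` is a callable returning (status_code, seconds_taken).
    Returns the attempt number that passed, or None if the window ran out.
    """
    elapsed = 0.0
    attempt = 0
    while elapsed < total_window:
        attempt += 1
        status, took = probe()
        # An attempt only counts if it answered 200 within the per-attempt limit.
        if took <= per_attempt_timeout and status == 200:
            return attempt
        # A slow response costs at most the timeout, because the caller gives up.
        elapsed += min(took, per_attempt_timeout) + retry_interval
    return None


if __name__ == "__main__":
    def slow_but_ok():
        return 200, 8.0      # answers 200, but only after 8 seconds

    def fast_ok():
        return 200, 0.05     # answers 200 almost immediately

    for probe in (slow_but_ok, fast_ok):
        result = wait_until_healthy(
            probe, per_attempt_timeout=5.0, total_window=300.0, retry_interval=2.0
        )
        print(probe.__name__, "->", result)
```

Under these assumed numbers, an endpoint that needs 8 seconds per request never passes, even though every request would eventually answer 200, which matches what was observed here.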
Brody
Brody•10mo ago
okay so there is a solution to this, how do I explain
Floris
Floris•10mo ago
celery all of em?
Brody
Brody•10mo ago
haha no, much simpler
Floris
Floris•10mo ago
oh WORD
Brody
Brody•10mo ago
instead of running your healthcheck on every request to that endpoint, only let health.healthcheck ever run once. you will see a few failed healthchecks in the deployment logs while your single healthcheck runs, then once your healthcheck has finally finished, update the status code that route returns to reflect a failed or successful healthcheck
Floris
Floris•10mo ago
so something like this or
Floris
Floris•10mo ago
oh woops wait, inverted
Brody
Brody•10mo ago
haha yeah
Floris
Floris•10mo ago
i haven't slept yet so my bad. that looks about right, no?
Brody
Brody•10mo ago
let me think this over
Floris
Floris•10mo ago
ok bro
Brody
Brody•10mo ago
nah, I can see this running a health check on every request, since the healthcheck value will be zero until the healthcheck returns
Floris
Floris•10mo ago
global cache it?
Brody
Brody•10mo ago
yeah just make sure you sync access to the global value
Floris
Floris•10mo ago
but it can only run once anyways, no?
Floris
Floris•10mo ago
asyncio sleep could be feasible too
Brody
Brody•10mo ago
you could also have a boolean flag named healthcheck_in_progress, and on the first request to your endpoint set it to true, then run the healthcheck in a thread, which sets the flag back to false and stores the correct status when it finishes. that way it's always an instant return of a non-successful status code until the moment your app finishes the background healthcheck; then the thread updates the status code the route returns and railway switches in your deployment
Floris
Floris•10mo ago
i haven't worked with threads that much. this is awaited / coroutined, does that matter?
Brody
Brody•10mo ago
nah, you could make it work. is health.healthcheck non-blocking without the await?
Floris
Floris•10mo ago
you can't run it without. it calls over 100 functions and there's stripe payments being processed, and well, that's self explanatory
Brody
Brody•10mo ago
fair, then you'd need to await it in a separate thread to turn it into a healthcheck that runs in the background
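The thread-based variant could look roughly like this: the coroutine is awaited inside a worker thread that owns a private event loop via `asyncio.run`, and a lock guards the shared status. `ThreadedHealth` and its methods are hypothetical names, and the check is again assumed to return a truthy value on success.

```python
import asyncio
import threading


class ThreadedHealth:
    """Awaits a slow async check once in a worker thread."""

    def __init__(self, check):
        self._check = check               # async callable, truthy means healthy
        self._lock = threading.Lock()     # guards _thread and _status_code
        self._thread = None
        self._status_code = 503           # reported while the check is pending

    def _worker(self):
        try:
            # asyncio.run creates a separate event loop just for this thread.
            ok = asyncio.run(self._check())
        except Exception:
            ok = False
        with self._lock:
            self._status_code = 200 if ok else 500

    def poll(self):
        """Start the worker on the first call; always return immediately."""
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._worker, daemon=True)
                self._thread.start()
            return self._status_code

    def join(self, timeout=None):
        """Wait for the worker to finish (mainly useful in tests)."""
        if self._thread is not None:
            self._thread.join(timeout)


if __name__ == "__main__":
    async def slow_check():
        await asyncio.sleep(0.05)         # stands in for the real healthcheck
        return True

    state = ThreadedHealth(slow_check)
    print("first poll:", state.poll())
    state.join(timeout=2)
    print("after check:", state.poll())
```

One caveat with this design: anything the check uses that is tied to the main event loop (some async database or HTTP clients, for instance) may not work from the worker's separate loop, in which case a background task on the main loop is the simpler route.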
Floris
Floris•10mo ago
coding is so wild sometimes. what a great first project to do
Brody
Brody•10mo ago
haha so what is this project anyway? and if you don't mind me asking (you don't have to answer), where do you currently have it deployed?
Floris
Floris•10mo ago
haaaa man you do not want to know. so technically this is just the wrapper, yes
Brody
Brody•10mo ago
GPUs
Floris
Floris•10mo ago
yezzor AI
Brody
Brody•10mo ago
it needs GPUs?
Floris
Floris•10mo ago
yes lots
Brody
Brody•10mo ago
are you aware railway doesn't offer GPU compute?
Floris
Floris•10mo ago
i have the GPUs myself, this is the wrapper on railway
Brody
Brody•10mo ago
ohhhhh
Floris
Floris•10mo ago
just as a node, remember
Brody
Brody•10mo ago
I see now, very cool
Floris
Floris•10mo ago
yes, we made 2 trading algorithms last year and they did really well over the year, so now this is a step further. or well, "we": back then i had developers and i traded and just instructed, but now i learned how to dev myself
Brody
Brody•10mo ago
that's awesome!
Floris
Floris•10mo ago
brody my friend i owe you something. what a brilliant idea
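Floris
Floris•10mo ago
# `app` (the FastAPI instance) and `health` (the repo's healthcheck module) are defined elsewhere
global_healthcheck = 0  # number of completed healthcheck runs

@app.get("/health")
async def health_repo():
    global global_healthcheck

    # only sent in the JSON body; the HTTP status of the response stays 200 by default
    status_code = 300

    # run the long healthcheck only until it has completed once
    if global_healthcheck == 0:
        _ = await health.healthcheck()
        global_healthcheck += 1

    return {"status_code": status_code}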
this was the solution. thank you so much for your time man, it really means a lot. can i buy you a coffee or something hahahahah fucking hell
Brody
Brody•10mo ago
looks good!
Brody
Brody•10mo ago
if you want to you can, but you absolutely don't have to
Floris
Floris•10mo ago
the commits of a dying man. send me your paypal, i'll send you a coffee bro
Brody
Brody•10mo ago
I hate uvicorn too, you should use hypercorn 🤣
Floris
Floris•10mo ago
bro anything with -corn i'm not touching it anymore holy fuck
Brody
Brody•10mo ago
that's why I do golang, no silly things needed to get a web server running in production. I actually have a buymeacoffee, it's in my bio, but seriously you don't need to
Floris
Floris•10mo ago
i just saw that yea, how coincidental. real question though bro, why do you NOT work for railway
Brody
Brody•10mo ago
not qualified
Floris
Floris•10mo ago
in what sense? it seems like you're doing a pretty good job
Floris
Floris•10mo ago
thanks again. fuck me bro, i refactored that entire file 6 times just for railway to time out in between retries 😭
Brody
Brody•10mo ago
haha I'll ask the team about increasing that time limit. in the sense that I have no work experience
Floris
Floris•10mo ago
or at least let them write it down somewhere, because the docs weren't really helpful. i'm sure i won't be the only one with an almost 1k line healthchecker. i see i see, how old are you?
Brody
Brody•10mo ago
good point, I will put it in the docs once I get more information. I think 22
Floris
Floris•10mo ago
u think? i'm 21
Brody
Brody•10mo ago
I stopped counting
Floris
Floris•10mo ago
whahwhahaw real. me with my bug fix commits today
Brody
Brody•10mo ago
🤣
Floris
Floris•10mo ago
good lord
Brody
Brody•10mo ago
thanks for the train btw, means a lot
Floris
Floris•10mo ago
i mean bro you saved me some time, that's for sure
Brody
Brody•10mo ago
haha maybe
Floris
Floris•10mo ago
alright bro i gotta go, imma be more active in this discord cus i learn a lot. cya
Brody
Brody•10mo ago
awesome, welcome to the server!!