Railway (13mo ago)
Railay

Memory Leak with Python Web App

I've got a web app that does a lot of I/O against external APIs and a Postgres DB. I've been troubleshooting this for a long time and can't find any issues in the code. However, if I disable the functions that save the data to Postgres, the memory leak stops.
While I realize the problem is most likely in the code somewhere (my code or a package I'm using), I'm wondering if something else is going on at the infra/container level. Are there any methods or tools to inspect the container and see if something else is consuming memory, or if the memory isn't getting released after garbage collection?
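One way to approach the question above is to log the process's resident set size (RSS) next to the amount of memory Python itself reports as allocated. This is a minimal sketch, assuming a Linux container (it reads `/proc/self/statm`); the helper names `rss_bytes` and `memory_report` are made up for illustration. If RSS keeps climbing while the traced figure stays flat, the growth is probably happening outside Python's own allocator, for example in native extension code or through allocator fragmentation.

```python
import gc
import os
import tracemalloc


def rss_bytes():
    """Current resident set size of this process (Linux only, via /proc)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])  # 2nd field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def memory_report():
    """Collect garbage, then report OS-level and Python-level memory use."""
    gc.collect()
    traced, _peak = tracemalloc.get_traced_memory()
    return {"rss": rss_bytes(), "python_traced": traced}


tracemalloc.start()
report = memory_report()
print(report)
```

Calling `memory_report()` periodically (for example once per batch of requests) and comparing the two numbers over time gives a rough idea of which side of the Python/native boundary is growing.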
31 Replies
Percy (13mo ago)
Project ID: 34108dc5-7795-469f-b853-2863bbedfa6f
Railay (13mo ago)
34108dc5-7795-469f-b853-2863bbedfa6f
Brody (13mo ago)
have you searched for known memory issues with the packages you use to save data into postgres?
Railay (13mo ago)
Yes, extensively. I've been troubleshooting this for over 50 hours.
For DB stuff, I'm using the most popular packages in the Python ecosystem: SQLAlchemy (DB & ORM toolkit) and Psycopg (DB driver). I've also tried asyncpg instead of Psycopg. All the code is async and uses context managers, which should automatically close and release resources back to the system. The API data gets stored correctly in Postgres, and no exceptions are thrown. As far as I can tell, the code works perfectly, but memory consumption grows steadily over time, and I have to restart the project twice a day.
I've also tried manually running garbage collection and even using del on the data objects. And maybe this is important for context: all of the code making external requests and storing the data to Postgres is running inside an ASGI API framework.
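For context on why `del` and a manual collection may not help: `del` only removes one name, and the object stays alive for as long as anything else still references it. A rough way to check whether Python objects are piling up is to count live objects by type before and after a unit of work. This is only a sketch; `Record` and `cache` are hypothetical stand-ins, and `gc.get_objects()` only lists objects tracked by the collector (containers and class instances, not plain strings or bytes).

```python
import gc
from collections import Counter


def type_counts():
    """Count live, GC-tracked objects by type name after a full collection."""
    gc.collect()
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=5):
    """Types whose instance count increased, largest increase first."""
    diff = after - before  # Counter subtraction keeps only positive counts
    return diff.most_common(limit)


class Record:
    """Hypothetical stand-in for a row or response object."""


cache = []  # hypothetical lingering reference, e.g. a module-level list

before = type_counts()
for _ in range(500):
    record = Record()
    cache.append(record)
    del record  # removes the local name only; `cache` still holds the object
after = type_counts()

result = growth(before, after)
print(result)
```

If no Python type shows meaningful growth between iterations while process memory still rises, that points away from ordinary Python objects being retained.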
Brody (13mo ago)
sounds like you are doing everything correctly and know what you're talking about
I'm leaning more towards a problem with a package, and not with your own code
does this problem not happen when running locally?
Railay (13mo ago)
Funny you ask, I had just made an MCVE for this which helps reproduce the issue quickly by making a lot of requests for a 5 MB JSON object. And yeah, memory consumption grows on my local machine too.
Brody (13mo ago)
are you running this in docker locally?
Railay (13mo ago)
Nope, just in my terminal
Brody (13mo ago)
well that definitely makes it easier to debug than if your app would only leak memory in a docker image
Railay (13mo ago)
Yeah, at least I've got something that's reproducible!
I've been trying to find memory profilers for python, but all the 3rd party packages are not supported anymore or don't work inside an ASGI framework.
Python's got the tracemalloc package in the standard lib... I'm probably not using it correctly, but so far, I haven't been able to pinpoint any objects getting accumulated in memory, or tracebacks to memory leaks.
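For reference, a common way to use `tracemalloc` is to take one snapshot before and one after the suspect work, then diff them by source line. The sketch below is a minimal illustration under assumptions: `leaky_step` and `retained` are hypothetical stand-ins for the real request handler and whatever holds on to its data. Note that `tracemalloc` only sees memory allocated through Python's allocators, so memory that native extension code obtains directly from the system allocator does not appear in these snapshots, which is one possible reason a leak fails to show up.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size changed the most."""
    stats = after.compare_to(before, "lineno")
    return stats[:limit]


def leaky_step(store, n=1000):
    # Hypothetical stand-in for the real work: the buffers stay referenced.
    store.extend(bytearray(1024) for _ in range(n))


tracemalloc.start(25)  # keep up to 25 frames per allocation traceback
retained = []
snapshot_before = tracemalloc.take_snapshot()
leaky_step(retained)
snapshot_after = tracemalloc.take_snapshot()
tracemalloc.stop()

growth = top_growth(snapshot_before, snapshot_after)
for stat in growth:
    print(stat)  # file:line, size delta, and allocation count delta
```

In a long-running ASGI app, the same pattern could be applied by taking snapshots some minutes apart instead of around a single call.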
Brody (13mo ago)
and really no one else has reported memory leaks with the packages that you are using to save the data?
Railay (13mo ago)
Nope, nothing from the last year, and I looked at a bunch of old SO threads to see if there were any ideas I could pull from other people's issues.
Brody (13mo ago)
have you tried simple stuff like using different versions of python, or different versions of the packages that save the data?
Railay (13mo ago)
That's why I commented on the Node JS ticket... I'm desperate 😂
Brody (13mo ago)
well do the simple thing of switching python and package versions
Railay (13mo ago)
Yeah, I did try a couple versions of SQLAlchemy... 1.4.x which has been out for years, and the new 2.0 version which came out earlier this year. I haven't tried other versions of Python, but that's a good idea, easy to try
Brody (13mo ago)
I'm sure there's other packages involved that deal with the data too
Railay (13mo ago)
Yeah, I'll look at the dependencies, but as far as storing the data, it should primarily be SQLAlchemy and the DB driver, Psycopg or asyncpg
Brody (13mo ago)
what if it's just something simple like the data is buffered in the request body and never released, that wouldn't even have anything to do with the database stuff
Railay (13mo ago)
So if I comment out the function to save the data to SQL, there's no memory leak
Brody (13mo ago)
ah right my bad
how hard would it be to try a different SQL library?
Railay (13mo ago)
There is another ORM I could try using. It's much less popular... SQLAlchemy in Python is like Prisma in JS... it's huge. But you're right, it's worth checking off the list
Brody (13mo ago)
there's prisma for python
it might be worth building out a minimal reproducible app that has the same memory leak, that way changes like swapping your orm would be much simpler
with everything I say, keep in mind that I am not a python developer, nor have I ever had the displeasure of having to debug a memory leak
Railay (13mo ago)
Hahaha fair. Thanks for chatting with me about this, it really is helpful
Brody (13mo ago)
just here to brainstorm 🙂
so like i don't even know if prisma for python is any good or not lol
Railay (13mo ago)
I'll try to plug in Prisma for Python and see if it helps
Yeah, at least it's part of a huge project in the JS world. I'll check it out and see if it's viable. The other ORM I know about has a plugin for my ASGI framework, so I'll see what's easy to try and see if it makes a difference
Brody (13mo ago)
sounds good, let me know how that goes!
Railay (13mo ago)
@Brody turns out it was Picologging, a logging package from Microsoft. A friend helped troubleshoot this and figured it out. I switched back to the logging module in the python standard lib and it's all good now
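A minimal sketch of what falling back to the standard library's `logging` module can look like. The logger name `app.db`, the format string, and the in-memory stream are illustrative assumptions, not details from the app discussed here. Picologging is designed to mirror the standard `logging` API, so in many cases the change may amount to little more than swapping the import.

```python
import io
import logging  # assumed to replace something like `import picologging as logging`

# In-memory stream so the example is self-contained; a real app would
# typically log to stderr or a file instead.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

logger = logging.getLogger("app.db")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the record out of the root logger's handlers

logger.info("saved %d rows", 42)
output = stream.getvalue()
print(output, end="")
```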
Brody (13mo ago)
😭 that's not what you are supposed to do
but I'm happy you found the cause of the memory leak
Railay (13mo ago)
Hahaha, I'm just glad to have a solution
Anyway, thanks again for all the good ideas, truly appreciate the time you spent
Brody (13mo ago)
no problem! 🙂