Memory Leak with Python Web App
I've got a web app that does a lot of I/O with external APIs and a Postgres DB. I've been troubleshooting this for a long time and can't find any issues in the code. However, if I disable the functions which save the data to Postgres, the memory leak stops.
While I realize the problem is most likely in the code somewhere (my code or a package I'm using), I'm wondering if something else is going on at the infra/container level. Are there any methods or tools to inspect the container and see if something else is consuming memory, or if the memory isn't getting released after garbage collection?
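One rough way to approach the "is memory not being released?" question is to compare the process's resident set size (what the container sees) with what Python's own allocators report. The sketch below is only an illustration: the helper names `current_rss_mib`, `traced_mib`, and `memory_report` are made up here, and the RSS reading assumes a Linux `/proc` filesystem. `tracemalloc` only sees memory allocated through Python's allocators, so if RSS keeps climbing while the traced figure stays flat, the growth is more likely in native code (a C extension, a driver) or allocator fragmentation than in Python objects.

```python
import gc
import tracemalloc


def current_rss_mib():
    """Resident set size in MiB read from /proc (Linux only); None elsewhere."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1]) / 1024  # the value is in kB
    except OSError:
        pass
    return None


def traced_mib():
    """Memory currently allocated through Python's allocators, in MiB."""
    current, _peak = tracemalloc.get_traced_memory()
    return current / (1024 * 1024)


def memory_report():
    """Collect garbage first so only live objects are counted."""
    gc.collect()
    return {"rss_mib": current_rss_mib(), "traced_mib": traced_mib()}


if __name__ == "__main__":
    tracemalloc.start()
    payload = [bytes(1024) for _ in range(10_000)]  # roughly 10 MiB of objects
    print("with payload:", memory_report())
    del payload
    print("after del:   ", memory_report())
```

Logging `memory_report()` periodically from inside the running app (for example once a minute) would give two curves to compare over time.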
31 Replies
Project ID:
34108dc5-7795-469f-b853-2863bbedfa6f
have you searched for known memory issues with the packages you use to save data into postgres
Yes, extensively. I've been troubleshooting this for over 50 hours.
For DB stuff, I'm using the most popular packages in the python ecosystem: SQLAlchemy (DB & ORM toolkit) and Psycopg (DB driver). I've also tried AsyncPG instead of Psycopg. All the code is async and using context managers, which should automatically close and release resources back to the system. The API data gets stored correctly in postgres, there's no exceptions thrown. As far as I can tell, the code works perfectly, but memory consumption grows steadily with time. I have to restart the project twice a day. I've also tried manually running garbage collection and even del on the data objects.
And maybe this is important for context, but all of the code making external requests and storing the data to postgres is running inside an ASGI API framework.
sounds like you are doing everything correctly and know what you're talking about
I'm leaning more towards a problem with a package, and not with your own code
does this problem not happen when running locally?
Funny you ask, I had just made an MCVE for this which helps reproduce the issue quickly by making a lot of requests for a 5 MB JSON object. And yeah, memory consumption grows on my local.
are you running this in docker locally?
Nope, just in my terminal
well that definitely makes it easier to debug than if your app would only leak memory in a docker image
Yeah, at least I've got something that's reproducible!
I've been trying to find memory profilers for python, but all the 3rd party packages are not supported anymore or don't work inside an ASGI framework.
Python's got the tracemalloc module in the standard lib... I'm probably not using it correctly, but so far, I haven't been able to pinpoint any objects getting accumulated in memory, or tracebacks to memory leaks.
and really no one else has reported memory leaks with the packages that you are using to save the data?
Nope, nothing from the last year and I looked at a bunch of old SO threads to see if there were any ideas I could pull from other people's issues.
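For reference, the usual way to use tracemalloc for the kind of hunt described above is to take one snapshot before a batch of requests and one after, then diff them. This is a minimal sketch, not the code from the thread: `leaky_handler` is a hypothetical stand-in for a request handler that keeps data alive, and `top_growth` is a helper name invented here. It will only surface allocations made through Python's allocators, so a leak inside native code may not show up at all.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size grew most between snapshots."""
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


def leaky_handler(store, payload_size=100_000):
    # Hypothetical handler that accidentally retains every payload it sees.
    store.append(bytearray(payload_size))


if __name__ == "__main__":
    tracemalloc.start(10)  # keep up to 10 frames per allocation traceback
    retained = []
    before = tracemalloc.take_snapshot()
    for _ in range(50):
        leaky_handler(retained)
    after = tracemalloc.take_snapshot()
    for stat in top_growth(before, after):
        print(stat)
```

If the diff stays flat while the process keeps growing, that is itself a hint that the leak sits outside Python-managed memory.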
have you tried simple stuff like using different versions of python, or different versions of the packages that save the data?
That's why I commented on the Node JS ticket... I'm desperate
well do the simple thing of switching python and package versions
Yeah, I did try a couple versions of SQLAlchemy... 1.4.x which has been out for years, and the new 2.0 version which came out earlier this year.
I haven't tried other versions of Python, but that's a good idea, easy to try
I'm sure there's other packages involved that deal with the data too
Yeah, I'll look at the dependencies, but as far as storing the data, it should primarily be SQLAlchemy and the DB driver, Psycopg or Asyncpg
what if it's just something simple like the data is buffered in the request body and never released, that wouldn't even have anything to do with the database stuff
So if I comment out the function to save the data to SQL, there's no memory leak
ah right my bad
how hard would it be to try a different SQL library?
There is another ORM i could try using. It's much less popular... SQLAlchemy in Python is like Prisma in JS... it's huge. But you're right, it's worth checking off the list
there's prisma for python
it might be worth building out a minimal reproducible app that has the same memory leak, that way changes like swapping your orm would be much simpler
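A minimal reproducible app of that kind can be reduced to a small harness that runs one variant of the workload many times and reports how much memory is still held afterwards, so that swapping a single component (ORM, driver, logger) becomes a one-line change. The sketch below is an assumption-laden illustration: `traced_growth`, `save_and_retain`, and `save_and_discard` are hypothetical names, and the two workloads are stubs rather than real database code. Because it measures with tracemalloc, it only counts Python-level allocations; comparing process RSS between variants would be needed to catch growth in native code.

```python
import gc
import tracemalloc
from typing import Callable


def traced_growth(workload: Callable[[], None], iterations: int = 200) -> int:
    """Run workload repeatedly and return the net bytes still allocated."""
    gc.collect()
    tracemalloc.start()
    try:
        workload()  # warm-up, so one-time caches are not counted as a leak
        gc.collect()
        baseline, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            workload()
        gc.collect()
        current, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current - baseline


if __name__ == "__main__":
    retained = []

    def save_and_retain():
        # Stub variant that keeps every payload alive (simulated leak).
        retained.append(bytes(10_000))

    def save_and_discard():
        # Stub variant that lets each payload be freed.
        bytes(10_000)

    for name, variant in [("retain", save_and_retain), ("discard", save_and_discard)]:
        print(name, traced_growth(variant), "bytes still allocated")
```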
with everything I say, keep in mind that I am not a python developer, nor have I ever had the displeasure of having to debug a memory leak
Hahaha fair. Thanks for chatting with me about this, it really is helpful
just here to brainstorm
so like i don't even know if prisma for python is any good or not lol
I'll try to plug in Prisma for python and see if it helps
Yeah, at least it's part of a huge project in the JS world. I'll check it out and see if it's viable. The other ORM I know about has a plugin with my ASGI framework, so I'll see what's easy to try and see if it makes a difference
sounds good, let me know how that goes!
@Brody turns out it was Picologging, a logging package from Microsoft. A friend helped troubleshoot this and figured it out. I switched back to the logging module in the python standard lib and it's all good now
that's not what you are supposed to do
but I'm happy you found the cause of the memory leak
Hahaha, I'm just glad to have a solution
Anyway, thanks again for all the good ideas, truly appreciate the time you spent
no problem!