Queries hanging indefinitely

I am experiencing a weird issue. If I enable PrismaInstrumentation, a small subset of queries in a large application begin to hang. The promises remain permanently unsettled. Enabling debug logging does not reveal much about what's happening, just that the query never runs; it shows that the last statement is DEALLOCATE, which usually runs right before a query does. I have confirmed that when this happens no query is ever sent to the database, despite a connection being established. Disabling PrismaInstrumentation by commenting out "new PrismaInstrumentation()" fixes the issue. Disabling any other code does not. Simplifying the SQL queries to a very basic query also seems to fix the issue. I've been investigating this for about a month now and have been working on a reproduction, but it's an extremely tricky problem to nail down. Any advice?

Node.js 22
Prisma 6.1.0 through 6.4.1 all exhibit this behavior
OS: Alpine 3.21 (Linux musl driver target)
OpenSSL 3.x
Postgres 15
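For reference, the tracing setup involved looks roughly like this (a minimal sketch assuming @opentelemetry/sdk-node and a console exporter; the exact exporter configuration in the real app isn't shown here):

```ts
// tracing.ts – minimal sketch, not the app's actual configuration
import { NodeSDK } from '@opentelemetry/sdk-node'
import { ConsoleSpanExporter } from '@opentelemetry/sdk-trace-node'
import { PrismaInstrumentation } from '@prisma/instrumentation'

const sdk = new NodeSDK({
  traceExporter: new ConsoleSpanExporter(),
  instrumentations: [
    new PrismaInstrumentation(), // commenting out this line makes the hangs disappear
  ],
})

sdk.start()
```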
7 Replies
Prisma AI Help
You decided to hold for human wisdom. We'll chime in soon! Meanwhile, #ask-ai is there if you need a quick second opinion.
Nurul
Nurul5d ago
Hey 👋 I think having a reproduction would indeed help me as well as the ORM team in figuring out what's going wrong. If I understand correctly, this started happening from Prisma version 6.1.0? It was working as expected before that?
Vinnymac
VinnymacOP4d ago
Yea, everything in production is running on 6.0.1, because I can't upgrade past it without a handful of specific queries hanging indefinitely when instrumentation is enabled. All of the queries that hang are using $queryRaw, and if I simplify the queries to basically not do anything, or if I disable instrumentation by commenting out new PrismaInstrumentation(), the issue disappears.

Reproducing it outside of my project isn't the easiest, but I'll keep working on it when I find time and see if I can get to the bottom of it. Also, I am running on Supabase with connection pooling for the database. I can reproduce the issue outside production in canary test environments, but haven't had luck making it happen outside my project.
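The affected calls use $queryRaw; a hypothetical example of the general shape (made-up table and column names, since the real queries are proprietary):

```ts
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

// Hypothetical shape of an affected query; the real ones join more tables
// and take more parameters, but are ordinary tagged-template $queryRaw calls.
const since = new Date('2024-01-01')
const rows = await prisma.$queryRaw<Array<{ id: string; total: number }>>`
  SELECT o.id, o.total
  FROM "Order" o
  JOIN "Customer" c ON c.id = o."customerId"
  WHERE o."createdAt" > ${since}
`
```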
Nurul
Nurul4d ago
Thanks for your efforts in reproducing this. Once we can reproduce it consistently, it will be of immense help to the ORM team in getting it fixed.
Vinnymac
VinnymacOP2d ago
I was able to reproduce the issue locally with very little code / dependencies involved. The only thing I haven't been able to do yet is reproduce it with a local database. Unfortunately the issue doesn't appear to occur if no data exists in the database tables, which is peculiar because it's not even running a db query, so why in the world would data existing or not have any impact? I populated about half the tables and it still didn't break, but I hadn't populated the main FROM table yet. So I am going to take a backup of the tables where I am seeing this issue and seed my local db with that data to see if it reproduces the issue fully locally. If the data is required to simulate the issue, I unfortunately won't be able to give you the data because it's proprietary information, so I am not sure what we would do next, or whether Prisma would be open to getting on a call or something at that point.

I was able to back up 100% of the data (it took a while as I had to wait for the full dump), seeded my local database, and ran the test. It ran successfully and failed to reproduce the bug. So either something about the database is different or my environment is different. If I use one of the live databases, the issue occurs without changing a single line of code. The entire reproduction is containerized and uses the same image as all of the other tests I've done. If it's not the environment, it must have something to do with the db. There are plenty of things that are different in a local database vs the live ones, even the hardware configuration. Perhaps Postgres extensions, or connection pooling, or one of the billion other things. I suppose next I can stop using a plain Postgres database and try an actual Supabase local instance with all the extensions installed, and see if that leads me anywhere. Nonetheless this is a really strange issue.

Okay, I tried something new just now, and this is a big lead. I found that using a direct connection instead of the database pooler (pgbouncer=true) fixes the issue as well. This wouldn't be the first time DEALLOCATE ALL calls have caused this type of hanging issue; a quick search online shows queries hanging around DEALLOCATE calls in past Postgres versions, as well as Prisma issues related to PgBouncer. I think we may have found the main culprit here: an incompatibility between telemetry changing the underlying query and the connection pooler.

We now know three things:
1) Disabling telemetry fixes the issue.
2) Simplifying the query fixes the issue; maybe it doesn't get treated as a prepared statement?
3) Using a direct connection fixes the issue, implying the Supabase pooler is problematic in some way.

What's weird is that with telemetry disabled, the DEALLOCATE ALL call is still made and it works fine over the pgbouncer pooler connection. So I think we've found that something in the interaction between connection pooling and telemetry is broken, but it's not clear exactly what just yet.
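For anyone following along, the two configurations differ only in which connection URL the client uses. A sketch with hypothetical hosts and credentials (6543 with pgbouncer=true being the usual pooler setup, 5432 the direct connection):

```ts
import { PrismaClient } from '@prisma/client'

// Hypothetical URLs – only the port and the pgbouncer flag matter here.
// Pooled connection (PgBouncer, transaction mode): hangs when PrismaInstrumentation is enabled.
const POOLED_URL =
  'postgresql://user:password@pooler.example.com:6543/postgres?pgbouncer=true'

// Direct connection to Postgres: no hang, even with instrumentation enabled.
const DIRECT_URL =
  'postgresql://user:password@db.example.com:5432/postgres'

// Swapping POOLED_URL for DIRECT_URL (or changing DATABASE_URL accordingly)
// is the only change needed to make the issue disappear.
const prisma = new PrismaClient({ datasourceUrl: POOLED_URL })
```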
Vinnymac
VinnymacOP2d ago
Possibly related to the following issue. Although it is about MySQL, I imagine that if instrumentation affects MySQL prepared statements it most assuredly also affects Postgres prepared statements. https://github.com/prisma/prisma/issues/18975
GitHub
mysql: Leaking of prepared statements when using opentelemetry · Is...
Bug description We started to experience this error in production: ServerError { code: 1461, message: "Can't create more than max_prepared_stmt_count statements (current value: 16382)"...
Vinnymac
VinnymacOP2d ago
I ran a suite of tests to see what happens in every configuration, to make it easier to understand. Basically everything I see here is expected, except the case where pgbouncer=true is enabled, which is how it should be configured for transaction mode with Supabase. Using transaction mode with Supabase without pgbouncer=true is known to cause issues, so that part isn't surprising at all. Each test was run 5 times.
[attached image: test results, no description provided]
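A sketch of what one axis of that test matrix could look like as code, assuming the hypothetical URLs from earlier and a stand-in query. Instrumentation on/off has to be decided at process start (via the tracing setup), so that axis is driven by restarting the process rather than by the harness itself:

```ts
// Hypothetical harness – the real tests run the proprietary query, not SELECT 1.
import { PrismaClient } from '@prisma/client'

const VARIANTS: Record<string, string> = {
  pooled: 'postgresql://user:password@pooler.example.com:6543/postgres?pgbouncer=true',
  direct: 'postgresql://user:password@db.example.com:5432/postgres',
}

async function runOnce(url: string): Promise<string> {
  const prisma = new PrismaClient({ datasourceUrl: url })
  try {
    // Race against a timeout so a hung promise shows up as a failure
    // instead of stalling the whole suite forever.
    await Promise.race([
      prisma.$queryRaw`SELECT 1`, // stand-in for the real raw query
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('query hung')), 10_000),
      ),
    ])
    return 'ok'
  } catch (e) {
    return (e as Error).message
  } finally {
    await prisma.$disconnect()
  }
}

for (const [name, url] of Object.entries(VARIANTS)) {
  for (let i = 1; i <= 5; i++) {
    console.log(name, i, await runOnce(url))
  }
}
```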
