Worker reaches max CPU time but only when running integration test suite
I have a worker with the following setup. I am using node-postgres with node_compat = true in the wrangler.toml. I am using Supabase as my Postgres database and Drizzle as my ORM.
I was using Turso before to just try it out, but I would like to move to Supabase because the project includes file hosting and time series data, which can all be handled by Supabase, so it would simplify things. With Turso I was able to run the integration test suite with no issues and all tests were passing. But after moving to node-postgres, the first half of the tests pass, then about halfway through the worker starts erroring that the max CPU time was reached and the rest of the suite fails. If I then re-run the tests without restarting the worker, every test fails from then on.
Another thing to note is I am doing some encryption and decryption stuff for some routes in this worker. But it is not for every route and wasn't causing issues before.
node-postgres does establish a direct TCP connection, while Turso is over HTTP, so I can understand why that may consume more CPU time, although I don't know the details.
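Since a TCP connection carries state that an HTTP call does not, one thing that may be worth ruling out is a client or pool created at module scope and reused across requests. Below is a minimal sketch of scoping a client to a single request instead, assuming node-postgres' `Client` (which exposes `connect()` and `end()`). `withClient` and `ClosableClient` are hypothetical names for illustration, not part of any library, and this is not confirmed to be the cause here.

```typescript
// Minimal shape this sketch relies on. node-postgres' Client has both methods;
// the interface itself is an assumption made for illustration.
interface ClosableClient {
  connect(): Promise<unknown>;
  end(): Promise<unknown>;
}

// Hypothetical helper: open a client for one unit of work and always close it,
// even when the handler throws, so no connection outlives the request.
async function withClient<C extends ClosableClient, R>(
  createClient: () => C,
  handler: (client: C) => Promise<R>,
): Promise<R> {
  const client = createClient();
  await client.connect();
  try {
    return await handler(client);
  } finally {
    // Runs on success and on failure.
    await client.end();
  }
}
```

Inside a fetch handler this would be called once per request, with the Drizzle instance built from the client inside the callback rather than at module scope.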
I tried profiling this using Chrome DevTools, but it seems that as soon as the max CPU limit is reached it disconnects or something and I can't capture the trace.
Switching between node-postgres and postgres.js does not change anything, so it is not specific to the driver.
I'm confident that the cause of going over the CPU time is the switch to a TCP-based connection. Maybe that is pushing my worker over the edge for CPU time and I've just been getting by until now.
The main question I'd like to ask out of curiosity is: why does it take a large number of simultaneous requests to reveal this? If I make requests via Postman, for example, the error never happens. Furthermore, once it has happened, why does it break every following request? If there are other ways I can debug this, I'd love to hear those as well. Thanks!
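One way to probe the "simultaneous requests" part would be to cap how many requests the test suite has in flight at once and raise the cap until the error appears. A minimal sketch, assuming the tests can be expressed as async functions; `runWithConcurrency` and `Outcome` are hypothetical names, not part of any test framework:

```typescript
// Result of one task: either its value or the error it threw.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Hypothetical helper: run the tasks with at most `limit` in flight at a time
// and collect every outcome instead of stopping at the first failure.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0;

  // Each lane pulls the next unclaimed task until none are left.
  async function lane(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await tasks[index]() };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

Running the same requests with `limit` set to 1, then 2, 4, and so on would show whether the failure depends on concurrency or simply on the total number of requests made since the worker started.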
1 Reply
I also increased cpu_ms to 30 seconds in the wrangler.toml and that does not seem to affect anything. It still eventually fails.
Don't know why I didn't do this before, but I deployed the application and ran the integration test suite against that and they all passed. Seems like it's a wrangler local issue.