Intermittent errors with wrangler deploy
I have a monorepo with a number of workers and pages apps and am running into issues with deployments via `wrangler deploy` ...
For now, I'm using `pnpm run -r deploy` to run the deploy tasks in each workspace and am getting `workers.api.error.unknown [code: 10013]`, although it's not always on the same app. When I run the deployments individually from each app, everything works fine.
Currently there are only 3 workers and 2 pages apps, but I'm planning on having several more of each (and using GHA).
Is it likely that I'm hitting a rate limit or is it more likely to be something else?
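For context, each workspace just has a deploy script that wraps wrangler, so the recursive run is roughly equivalent to this (workspace names are made up):
```sh
# Roughly what `pnpm run -r deploy` expands to here: one deploy script per
# workspace (workspace names are illustrative). If I understand pnpm right,
# it runs up to 4 of these at once by default (--workspace-concurrency).
pnpm --filter api-worker run deploy       # -> wrangler deploy
pnpm --filter auth-worker run deploy      # -> wrangler deploy
pnpm --filter marketing-pages run deploy  # -> wrangler pages deploy ./dist
```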
Rate limit is the same as the default CF API one, so 1200 requests/5 mins
But you'd not be getting an internal error for that, sounds like something is going wrong
You're deploying these concurrently?
yeah, they're all going within a second or so
(very roughly)
should i expect a more explicit error message for rate limits?
if it is something else, i'm a bit baffled, as running each deployment individually works fine
Yeah you'd see a 429 and a rate limit message
Yep, so deploying them non-concurrently works
Do you have the workers.dev enabled?
good to know, thanks
ah, no... is that likely to help?
Having it disabled is what would help... if you already do then that's not it
i've not set it at all... might it be good to explicitly set it to false?
Ah yes it defaults to on
So `workers_dev = false` should hopefully help you out
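i.e. at the top level of the worker's wrangler.toml, something like this (name/main are placeholders):
```toml
# wrangler.toml for one of the workers ("name" and "main" are placeholders)
name = "api-worker"
main = "src/index.ts"

# don't publish this worker to its <name>.<subdomain>.workers.dev route
workers_dev = false
```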
aha, will give it a whirl. thanks 🙂
@Walshy | Deploying you're a star, that's worked a charm! thanks for your help 🙌
Perfect!
Known issue with concurrent deploys with workers.dev right now - we have a fix planned but haven't yet had the time to properly dedicate to it
Apologies for the issue!
haha, no worries!
setting `workers_dev = false` seemed to work, but i'm actually getting intermittent failures... have successfully deployed a few times this morning, then started getting errors. Also, I misdescribed something yesterday: I was actually using `pnpm run --parallel deploy` when I was getting `workers.api.error.unknown [code: 10013]`. Today I went back to trying `pnpm run -r deploy` and am now getting a different error, which might be helpful: ✘ [ERROR] In a non-interactive environment, it's necessary to set a CLOUDFLARE_API_TOKEN environment variable for wrangler to work.
This seems like a strange error, given that the app has been successfully deployed in the past
also, if i disable the deployment for the worker that's erroring, i get the same error on the next available worker... the error seems to always be on the first worker in the deployment (since the first failure makes pnpm kill all the other deployment tasks before anything deploys)
just successfully deployed via `pnpm run --parallel deploy:preview`, then immediately tried `pnpm run -r deploy:preview`, which failed, so tried `pnpm run --parallel deploy:preview` again, which then failed
Another update... just successfully ran `pnpm run -r deploy:preview`, so tried `pnpm run --parallel deploy:preview` again, which failed 😦
For now, I can just run individual deployments manually as needed, but the plan is to get things into CI via GitHub Actions, where that won't be an option. Am having a bit of a wobble as this is pretty core to our design -- any advice would be very gladly received!
In case it helps, here's the log output from an earlier unsuccessful run of `pnpm run -r deploy`:
...I'm off today so I can't dig too much, but I think you're still hitting the same concurrency issue, just less frequently now that the subdomain is off
Let me escalate it and we can get a case made
Thanks, really appreciate your help. Enjoy your day off!
Hi team, this is Kiki from Cloudflare. Sorry to read about the issues you are having.
I will PM you now, @scruffy, for more info so we can create a case.
Awesome, thanks!
Hey @scruffy, sorry for the delay in getting back to you. We're pretty sure this is an issue that is already known to us about making deployments concurrently. That's something which we plan to attempt fixing soon, but in the meantime, all we can suggest is (a) disabling workers.dev routes (which you've already done in this case) and (b) adding a short sleep between deployments when making them concurrently. Likely just a second or two would be sufficient.
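A rough bash sketch of (b), with made-up workspace names:
```sh
#!/usr/bin/env bash
# Stagger the deploys: start each workspace's deploy a couple of seconds
# apart instead of all at once. Workspace names are illustrative.
set -euo pipefail

for app in api-worker auth-worker marketing-pages; do
  pnpm --filter "$app" run deploy &
  sleep 2   # short gap between deploy starts
done
wait   # let all background deploys finish
```
If staggering still proves flaky, fully serialising the recursive run (e.g. `pnpm -r --workspace-concurrency=1 run deploy`, if I've got the flag right) avoids the concurrency entirely, at the cost of a slower pipeline.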