Fetch getting infinite 308 redirects in worker but fine locally
Hey, we are trying to make an external call from the worker. When testing the same code in wrangler dev mode, everything works fine, and the same goes for running it via curl on the command line. However, as soon as it runs in production, the code fails with "Too many redirects."
Checking the headers, the Location header returned is the exact URL that we are trying to send the request to, and this continues in a loop until fetch errors out because of the infinite redirect loop. Again, this behaviour only shows up within workers, it doesn't happen from anywhere else. These responses also appear to be coming from cloudflare servers, even though the request is targeted at an origin that is not proxied through cloudflare. Not sure how to debug this further
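One way to see what each hop looks like is to stop fetch from following redirects and log the Location header yourself. This is only a rough sketch: `traceRedirects`, `classifyHop` and the `FetchLike` type are made-up names for illustration, and it assumes the runtime hands back the real 3xx response when `redirect: "manual"` is set.

```typescript
// Hypothetical debugging helper, not part of any Workers API.
type Hop = { url: string; status: number; location: string | null };
type HopKind = "final" | "redirect" | "self-redirect";

// Minimal shape of fetch that this sketch relies on (an assumption).
type FetchLike = (
  url: string,
  init: { redirect: "manual" },
) => Promise<{ status: number; headers: { get(name: string): string | null } }>;

// Decide whether a hop is a normal response, a redirect elsewhere,
// or a redirect back to the very URL that was requested.
function classifyHop(hop: Hop): HopKind {
  const isRedirect = hop.status >= 300 && hop.status < 400;
  if (!isRedirect || hop.location === null) return "final";
  // Resolve relative Location values against the request URL.
  const target = new URL(hop.location, hop.url).toString();
  return target === new URL(hop.url).toString() ? "self-redirect" : "redirect";
}

// Follow redirects by hand, recording every hop, and stop at the
// first final response, the first self-redirect, or after maxHops.
async function traceRedirects(
  startUrl: string,
  fetchImpl: FetchLike,
  maxHops = 5,
): Promise<Hop[]> {
  const hops: Hop[] = [];
  let url = startUrl;
  for (let i = 0; i < maxHops; i++) {
    const res = await fetchImpl(url, { redirect: "manual" });
    const hop: Hop = { url, status: res.status, location: res.headers.get("Location") };
    hops.push(hop);
    if (classifyHop(hop) !== "redirect" || hop.location === null) break;
    url = new URL(hop.location, url).toString();
  }
  return hops;
}
```

Inside a worker you would call something like `traceRedirects(url, fetch)` and `console.log` the result. A last hop classified as "self-redirect" matches the symptom described above.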
24 Replies
So.... we figured out the problem, I don't know if this needs more documentation or some kind of handling, but this was pretty crazy to debug
It's pretty much undocumented that workers fetch requests follow the same SSL/TLS rules from the domain settings "off/flexible/full/full(strict)"
I only found this from someone's reply in a forum post
https://community.cloudflare.com/t/does-cloudflare-worker-allow-secure-https-connection-to-fetch-even-on-flexible-ssl/68051/5
We follow best practices for TLS, so our origin auto-redirects http to https. However, because the default setting is flexible in cloudflare, our worker was making a request with https protocol, and then that was getting downgraded by the proxy to http before it made the call to the origin
This results in an infinite redirect loop where it's near impossible to understand what is actually happening. Fetch makes the https request, proxy downgrades to http, origin sends a 308 with a redirect to (what appears to be in fetch) the exact same URL, fetch redirects, repeat ad infinitum
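The loop is easier to see as a toy model. The sketch below is a simulation only and makes no network calls; `schemeSeenByOrigin`, `originRespond` and `simulateFetch` are invented names, the "flexible" and "full" modes are simplified stand-ins for the zone setting, and the limit of 20 redirects is an assumption about fetch's cap.

```typescript
type SslMode = "flexible" | "full";
type Scheme = "http:" | "https:";

// Model of the proxy: in "flexible" mode it talks to the origin over
// plain http no matter what scheme the caller asked for; in "full"
// mode it keeps the scheme that was requested.
function schemeSeenByOrigin(requested: URL, mode: SslMode): Scheme {
  if (mode === "flexible") return "http:";
  return requested.protocol === "https:" ? "https:" : "http:";
}

// Model of the origin: any plain-http request gets a 308 pointing at
// the https version of the same URL, anything else gets a 200.
function originRespond(requested: URL, scheme: Scheme): { status: number; location?: string } {
  if (scheme === "http:") {
    const target = new URL(requested.toString());
    target.protocol = "https:";
    return { status: 308, location: target.toString() };
  }
  return { status: 200 };
}

// Model of fetch following redirects until it gives up.
function simulateFetch(url: string, mode: SslMode, maxRedirects = 20): string {
  let current = new URL(url);
  for (let i = 0; i <= maxRedirects; i++) {
    const res = originRespond(current, schemeSeenByOrigin(current, mode));
    if (res.status !== 308 || !res.location) return `ok after ${i} redirect(s)`;
    // From fetch's point of view this is the same https URL again.
    current = new URL(res.location);
  }
  return "Too many redirects";
}

console.log(simulateFetch("https://origin.example/api", "flexible")); // Too many redirects
console.log(simulateFetch("https://origin.example/api", "full")); // ok after 0 redirect(s)
```

In the "flexible" run the caller only ever sees https URLs, which is why the Location header looks identical to the request URL.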
Honestly, to me it seems crazy that cloudflare would silently downgrade https requests to http when you explicitly call for https in fetch. A much more sensible behaviour to me would be to fail the fetch request outright if you try to do this while your domain is set to flexible for that origin.
Worker Subrequests go through the cdn with your zone settings, just like if you loaded https://yoursite.com yourself. The default isn't Flexible, it tries to auto sense your Origin's security level but that's only at setup if you've created dns records for your root then and not after (and from my experience seems a bit iffy), but yea
The behaviour is just extremely misleading with little to no documentation
yea idk where you'd put the docs for it that people would read but you can create an issue in the docs repo or put it somewhere and PR it
Never should an explicit fetch request with https:// written in the protocol end up getting transmitted over http, especially when the origin it's connecting to isn't proxied
I don't think Flexible should even be an option but yea people get tripped up over it and endless redirects semi-commonly sadly
My preferred solution would be that it doesn't try to proxy requests to the origin at all
I don't really see why it applies the SSL zone setting to requests coming out of the worker
because it's going through the cdn pipeline
just intrinsic to how subrequests work in workers
I understand that's why it happens now
But it's very misleading
If the server at the other end of the fetch request was also cloudflare proxied, and was set to use flexible SSL, then I would understand it
But it's not
It's downgrading requests to a server that normally isn't even proxied through the cloudflare CDN
Really this only makes sense with a deep understanding of how workers operate, which as I said, I now understand
But to someone just writing some code and using the fetch API as they normally do, this is extremely dangerous imo
You could be transmitting sensitive information in the clear without even realising it
yea people trip over the flexible option/endless redirects even outside of workers. I saw they were working on removing port restrictions with subrequests/modifying that, maybe will extend further to being less reliant on zone stuff in the future
should always be Full (Strict) though
Yea, my main concern is simply the number of people this probably catches out without them even knowing
yea, if you can find a good place to document it that people would read you could make an issue/pr about it. There are docs for fetch but I doubt many people check that
Yea, documentation is a good start, but realistically that's only going to help people that have a problem
If you just use the fetch API as you normally would, most people won't ever notice a problem, unless they don't allow HTTP at their origin, or they redirect HTTP to HTTPS
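For the one case that does surface (an origin that redirects HTTP to HTTPS), a thin wrapper can at least turn the loop into a readable error. This is a sketch under assumptions, not a fix: `isSelfRedirect` and `fetchOrExplain` are hypothetical names, it only catches the self-redirect symptom, and it cannot detect a silent downgrade when the origin happily answers over plain http.

```typescript
// Minimal response shape this sketch relies on (an assumption).
type MinimalResponse = {
  status: number;
  headers: { get(name: string): string | null };
};

// True when a 3xx response points straight back at the URL that was requested.
function isSelfRedirect(requestUrl: string, status: number, location: string | null): boolean {
  if (status < 300 || status >= 400 || location === null) return false;
  // Resolve relative Location values before comparing.
  return new URL(location, requestUrl).href === new URL(requestUrl).href;
}

// Make a single request without following redirects and fail loudly
// on a self-redirect. Any other response, including a redirect to a
// different URL, is returned to the caller unfollowed.
async function fetchOrExplain<R extends MinimalResponse>(
  url: string,
  fetchImpl: (url: string, init: { redirect: "manual" }) => Promise<R>,
): Promise<R> {
  const res = await fetchImpl(url, { redirect: "manual" });
  if (isSelfRedirect(url, res.status, res.headers.get("Location"))) {
    throw new Error(
      `${url} redirected to itself (${res.status}). ` +
        "If this only happens inside the worker, check the zone's SSL/TLS mode.",
    );
  }
  return res;
}
```

Passing the fetch function in as a parameter keeps the check easy to exercise with a stub; in a worker the call would look like `fetchOrExplain(url, fetch)`.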
yea I know what you mean, could make an issue in the workerd (runtime repo) about that, I think would be the best place, about not respecting zone ssl/tls if not fetching something within the zone
I wish flexible itself just went away/everyone was forced to Full (Strict) and has to pick to downgrade, we've tried suggesting that to Cloudflare for a long while but not much movement
Yea, I am not really sure why Full (Strict) is not the default in 2024
10 or even 5 years ago, I get it, certificates were less freely available
But Let's Encrypt and other options exist that make this stuff trivial nowadays
There is no good reason to be transmitting stuff in the clear in 2024
CF itself even offers up to 15 year origin ssl certs trusted by proxy
Yea, it's a bit crazy
But yea, the whole issue here is that realistically I see no reason for worker subrequests to obey the SSL setting anyway. If an origin you're trying to connect to has a cloudflare DNS record and has proxying enabled, by all means, follow those rules
But otherwise, just ship it as-is over the internet
yea idk if it ever had a good reason other than just being an implementation detail due to reusing the cdn pipeline
Yea. The problem is now, how many workers would it break if you just switched that setting
Workers has compatibility dates and such for that reason
I wonder if it'd be possible with the infrastructure for that to traverse down the CDN pipeline though
Obviously I have no idea 😛
breaking features get new compat flags to enable/disable, and are by default behind a new compat date (like 2024-08-10), so any new workers after then would have it by default
It's unlikely the worker itself is making this decision
yea it's def not part of workerd/the runtime but I'm sure if they wanted to implement it they could find a way to pass it down