R2 Files loading Blank Page
I've had a bunch of foldered files that make up a training course package stored in R2 for a while (over a year). At the root of the folder is an html file (index.html or story.html) that when accessed would load the training material using the other files within the folder structure. This has worked for over a year until this week it all stopped working.
I've tried:
-deleting the files and re-uploading
-messing with CORS
-creating a fresh bucket
-tried multiple methods of uploading (GUI, API, S3-Browser, CyberDuck)
Nothing seems to work and the only errors I get are 404s for select files that the index.html is trying to call within the same folder.
I suspect this is related to multipart upload issues but I cannot get the files to upload correctly, which I note was not an issue in the past. This also doesn't explain why the previous batch of uploaded folders/files that were working all of a sudden stopped.
Looking for guidance on how I should be uploading these files correctly so that it completes with all files in their correct place.
For reference the total folder size is 400+MB with about 200+ individual files ranging from bytes/KBs all the way to 1, 5, 10, or even 50MB for some videos and pictures. Hence the need for proper Multipart Uploading.
Outside of that, looking for any insight as to what might have changed on CF's side in how they are handling html files stored and served from R2?
24 Replies
Did you get this working? I saw you asked some questions in #r2
rclone is indeed a really helpful and useful tool that should just work, shouldn't even need to mess with its settings for simple transfers
"For references the total folder size is 400+MB with about 200+ individual files ranging from bytes/kbs all the way to 1, 5, 10, or even 50MB for some videos and pictures. Hence the need for proper Multipart Uploading."
You don't even need to use multipart up to 5 GiB, although some tools will use it ahead of that limit for performance and to avoid re-uploading an entire file after a failure.
"Outside of that, looking for any insight as to what might have changed on CF's and how they are handling html files stored and served from R2?"
Nothing recently
Hey, so far I haven't been able to get things working. Or at least working in a way like they were before with R2. I got all the rsync commands going but like you said there probably wasn't a need for that. I've resorted to using Linodes S3 obj storage for now as that seems to work with the same folders and same rsync upload cmd (plus needing to set files to public read). But for now I can't get those files to render and get 404 or 400's for select files in the folder structure when loading the index.html file.
It also seems like the folder or URL/URI path is not translating correctly from the root R2 domain to where these files are actually sitting within the R2 folder structure. Not sure if others have had something similar?
generally most of the issues I've seen with files/folders in R2 (or object storage in general) are just confusion about them, since object storage doesn't actually have folders, it's all virtual.
If you go to where you think the assets are, do they load fine? If you click on a file name in the R2 bucket, it'll give you the R2 Custom Domain it's reachable at, worth sanity checking "is this asset actually reachable, and if so what is the html page trying to load/is it any different"
ya for sure, using "folder" as a general term. And yes I have tried that, as far as I can tell the individual files load but it is hard to tell if they are working correctly since they are things like .css, .js and .woff files for elements that get referenced in an online training package output that I don't control. It's standard practice to store and deliver these packages via S3 storage and for over a year I had done that via S3, and via R2 for the last 6 months or so.
The odd thing is the package (folder) I uploaded 6 months ago that had been working just stopped, which is what caused me to delete and try re-uploading and that's where I'm at lol. Again Linode's S3 has continued to work the whole time including new uploads so it was my guess that something about the R2 config had changed at least in some way.
could you give the link of a broken page on r2, or look at the console (ctrl+shift+i -> console or right click -> inspect -> console) and look for errors about assets?
Sure, heres the link of the broken folder: https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html
Here for reference is the same folders/files on Linodes S3: https://acumen-training.us-east-1.linodeobjects.com/how-to-protect-your-data/content/index.html
Thanks for following up and having a look!
See how the slashes are different between r2 and s3 in your links? That's not R2 doing it (although I think the dashboard might show them wrong), and it looks like when you do that the browser takes them as escapes and thinks the relative directory is /, so it tries to load the resources with the wrong relative directory, looking for /lib/icomoon.css. But if you do the slashes right: https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html it works fine, and your relative assets work: /lms-test-rise-3/content/lib/icomoon.css
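A quick way to see the effect described above, as a minimal sketch: Python's urllib.parse.urljoin resolves relative references per RFC 3986, and the assumption here is that the browser resolves this particular case the same way. The two URLs are the ones from this thread; the variable names are just for illustration.

```python
from urllib.parse import urljoin

# Same object, two spellings of the URL (both taken from the thread).
encoded_page = "https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html"
plain_page = "https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html"

# "." resolves to the directory the page is considered to live in.
encoded_dir = urljoin(encoded_page, ".")
plain_dir = urljoin(plain_page, ".")

# The page references its stylesheet relative to the current path.
encoded_asset = urljoin(encoded_page, "lib/icomoon.css")
plain_asset = urljoin(plain_page, "lib/icomoon.css")

print(encoded_dir, "->", encoded_asset)
print(plain_dir, "->", plain_asset)
```

With %2F the whole key is one path segment, so the directory is the root and the stylesheet is looked up at /lib/icomoon.css.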
hmm, interesting, I still feel like this is a change or bug on the R2 side cause like I said I had files uploaded (same type of folder setup) which were working that all of a sudden stopped working. Then all future uploads like this have this same issue. I've tried uploading with multiple S3 clients or CLIs and on Mac and Windows so I don't think all of those are messing with the slash direction?
only thing I can think of outside of a bug with our R2 tenant is a setting in Cloudflare for the custom domain that's used that is changing those paths? Maybe the URI formatting, could that do it?
as far as I can see, this is just a browser thing/whatever is generating your links to them. You can mess up linode in exactly the same way
https://acumen-training.us-east-1.linodeobjects.com/how-to-protect-your-data%2Fcontent%2Findex.html
guess the real question is how do you generate those links/what is generating those links url encoded?
I copied the link i gave you directly from the R2 dashboard:
ahh yea the dash is messed with that and has been forever
but that wouldn't explain why it only broke now
maybe the better question is: If you fix up all your links to use / instead of %2F, are there any that are still broken/don't work?
ya, i have had a few weird things happen with links from there but nothing like this and nothing that stays after multiple refreshes, relogs, etc
I only have 2 folders right now for testing and i just tried both and they work with the swap
in prod or even dev I will be grabbing these links via the R2 API or Cf workers so i don't think it will be an issue but this weirdness was stopping me from even starting that lol
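If the links end up being built from object keys in code, a rough sketch of doing it safely could look like this. The function name public_url is hypothetical (not part of any R2 or Workers API), and the domain is just the custom domain from this thread. The only important part is safe="/", which leaves the slashes in the key as real path separators while still escaping things like spaces.

```python
from urllib.parse import quote

def public_url(domain: str, key: str) -> str:
    """Build a public URL for an object key (hypothetical helper).

    safe="/" keeps "/" in the key as a literal path separator, so the
    browser sees real directories and relative asset links keep working.
    """
    return f"https://{domain}/{quote(key, safe='/')}"

print(public_url("lmsfiles.aesi-inc.com", "lms-test-rise-3/content/index.html"))
print(public_url("lmsfiles.aesi-inc.com", "docs/my file.pdf"))
```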
idk how your other things are setup, but they wouldn't break like this unless they relied on relative directory links like that specific page does
normally you see relative from base (/index.css) or absolute (https://r2.example.com/index.css), which would both be fine
Ya i have another bucket for PDF docs so single files and they work fine.
yea single files would work fine too, no relative linking at all in that case
well either way sounds like the issue was just the r2 dash showing slashes in the preview urls as %2F (url escaped) and then the browser breaking on relative asset links in that exact setup. Maybe could try to push for it to be fixed now, as before it was just a harmless display thing, but it breaking webpages in some cases is super annoying
I wonder if this setting is playing into it?
nah this is r2 display + browser behavior
normalization would change what cf gets/r2 gets, which maybe could mess with the html being served or something, but not the underlying issue
R2 quite simply just shouldn't show slashes as %2F/url encoded
ya i just wonder if it's because it's being filtered through the custom domain with these settings? Cause I have a personal CF R2 account and that one also shows the % in urls but i've tried folders there as well and it pastes with the % but when the browser loads it converts it correctly.
no, it just depends on how you are loading the other resources (if you are loading any at all)
is it the exact same content/setup? Do you have a link to it?
no its not the exact same but maybe i will try an upload there to test. I just know that for all files in either account the dashboard always has shown the % slashes and then it always converted fine.
right, the issue isn't purely conversion
maybe I did a poor job explaining
there's three general ways html can link to other resources
1. absolute urls (https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/lib/icomoon.css)
2. relative url off hostname/root path (/lms-test-rise-3/content/lib/icomoon.css)
3. relative url off current path (lib/icomoon.css) <--- this is what you do
When you do %2F, it looks like the browser (at least Firefox) takes it as you are trying to escape the url, and that it isn't a proper path separator (I suppose it isn't). As such:
https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html -> base is /, at root
https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html#/ -> base is /lms-test-rise-3/content/
So lib/icomoon.css with the right base becomes /lms-test-rise-3/content/lib/icomoon.css which works, however if you visit the broken page, you can see the icomoon.css path it tries is /lib/icomoon.css which is broken.
So this bug:
-Only occurs with html linking to other resources
-Only occurs with links to other resources with relative urls off the current path (pretty rare as far as I know, most of the time you just use relative off root)
so this has to do with the way the browser is resolving those relative urls, and the way R2 displays the url encoded/with %2F, nothing else. Nothing you do in CF would change those relative urls to fully qualified ones, or make /lib/icomoon.css know it's supposed to be under /lms-test-rise-3/content
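To make the "only the third link style breaks" point concrete, here is a small sketch that resolves all three link styles against both spellings of the page URL. It uses Python's urljoin as a stand-in for the browser's resolver (an assumption, though both follow the same RFC 3986 rules for these cases); resolve_all and the dictionary names are made up for illustration.

```python
from urllib.parse import urljoin

ENCODED_PAGE = "https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html"
PLAIN_PAGE = "https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/index.html"

# The three ways a page can reference the same stylesheet.
LINK_STYLES = {
    "absolute": "https://lmsfiles.aesi-inc.com/lms-test-rise-3/content/lib/icomoon.css",
    "root-relative": "/lms-test-rise-3/content/lib/icomoon.css",
    "path-relative": "lib/icomoon.css",
}

def resolve_all(page_url: str) -> dict:
    """Resolve every link style against the given page URL."""
    return {style: urljoin(page_url, href) for style, href in LINK_STYLES.items()}

encoded = resolve_all(ENCODED_PAGE)
plain = resolve_all(PLAIN_PAGE)

for style in LINK_STYLES:
    status = "same" if encoded[style] == plain[style] else "DIFFERENT"
    print(f"{style:14} {status:10} {encoded[style]}")
```

Absolute and root-relative links ignore the page's directory, so they resolve identically either way; only the path-relative one depends on where the browser thinks the page lives.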
ah ok ya that makes sense
I did also find this forum post after searching for the % slash term with R2. very recent and looks like they are aware of this issue and working on it. This also points to this being a more recent issue which lines up with the timeline.
https://community.cloudflare.com/t/r2-keys-with-multiple-slashes-in-key-require-url-encoding-or-it-404s/641375/4
been an issue ever since those preview urls have been around afaik
it's purely just a display bug (well, that then causes other issues)
anyway tldr would be %2F are the enemy, replace with /. I'll see if I can't escalate the issue with the knowledge that it can in edge cases break webpages
That would be awesome, thanks for taking the time on this!
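For links that have already been copied out of the dashboard, the swap can be scripted. This is only a sketch: fix_dashboard_url is a hypothetical helper, and it assumes every %2F in the path stands for a separator in the object key, which holds for the URLs in this thread but may not for keys that intentionally contain an escaped slash.

```python
from urllib.parse import urlsplit, urlunsplit

def fix_dashboard_url(url: str) -> str:
    """Replace escaped slashes in the path with real ones (hypothetical helper).

    Only the path is touched; query string and fragment are left alone.
    Assumes each %2F in the path is a key separator.
    """
    parts = urlsplit(url)
    path = parts.path.replace("%2F", "/").replace("%2f", "/")
    return urlunsplit(parts._replace(path=path))

print(fix_dashboard_url(
    "https://lmsfiles.aesi-inc.com/lms-test-rise-3%2Fcontent%2Findex.html"
))
```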