How to validate a file upload in Workers before saving it to R2?
I've spent ages trying to resolve this. I know how to upload a file and save it to R2. What I can't figure out is best practice to validate the file first.
Lots of resources out there say use something like Formidable, but I can't believe I need a third-party library just to enforce things like "must be a PDF" or "must be under 100 KB".
I know the Workers R2 API allows the stipulation of conditions on PUT operations, but these conditions don't seem to relate to the sort of conditions I mention above.
So what's best practice here?
10 Replies
Validating the size isn't too difficult. You can use a FixedLengthStream to throw an error if the file goes over the max size (maybe?).
Cloudflare Docs
TransformStream · Cloudflare Workers docs
A transform stream consists of a pair of streams: a writable stream, known as its writable side, and a readable stream, known as its readable side. …
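For what it's worth, here is a minimal sketch of a size check that uses only standard streams rather than FixedLengthStream. The helper name readCapped is made up for illustration, and the 100 KB figure is just the limit from the question. It stops reading as soon as the running total goes over the cap, so it does not depend on any header the client sends.

```typescript
// Hypothetical helper: read a request body into memory, but give up as soon
// as more than maxBytes have arrived.
async function readCapped(
  body: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<Uint8Array> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;

  while (true) {
    const result = await reader.read();
    if (result.done) break;
    const chunk = result.value;
    total += chunk.byteLength;
    if (total > maxBytes) {
      // Stop pulling data from the client before rejecting.
      await reader.cancel("upload too large");
      throw new RangeError(`Upload exceeds ${maxBytes} bytes`);
    }
    chunks.push(chunk);
  }

  // Concatenate the collected chunks into a single buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}
```

In a Worker this might be called as `const bytes = await readCapped(request.body, 100 * 1024)` (assuming the request has a body), with the result then handed to the bucket binding's put(). Buffering the whole file like this only seems reasonable because the limit is small.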
As for validating file types: unless you blindly trust the file type given by the user, this can be very difficult, since you actually need to inspect the file's contents to confirm it follows the standard for that file type.
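A cheap middle ground between trusting the client and fully parsing the file is to check the leading "magic bytes". The sketch below assumes a PDF starts with the ASCII signature %PDF- at offset 0; looksLikePdf is a hypothetical name, and passing this check does not prove the rest of the file is a well-formed PDF.

```typescript
// ASCII "%PDF-", the signature expected at the start of a PDF file.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Hypothetical helper: true if the buffer begins with the PDF signature.
// This only inspects the first five bytes; it is not a full validation.
function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.byteLength < PDF_SIGNATURE.length) return false;
  return PDF_SIGNATURE.every((byte, i) => bytes[i] === byte);
}
```

For example, `looksLikePdf(new Uint8Array(await file.arrayBuffer()))` could be run on a File taken from the parsed form data before anything is written to R2.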
Thank you. This is in a trusted environment so not a huge deal. Presumably I can get filesize by checking the length of the base64 representation?
Probably, yeah
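One thing to watch: the base64 string is roughly a third longer than the file itself, so its length needs converting before comparing against a byte limit. A rough sketch of that arithmetic, assuming standard base64 with no whitespace and no data: URL prefix (base64DecodedLength is a hypothetical helper name):

```typescript
// Hypothetical helper: number of bytes a base64 string decodes to.
// Every 4 base64 characters carry 3 bytes; trailing "=" padding carries none.
function base64DecodedLength(b64: string): number {
  let unpadded = b64.length;
  while (unpadded > 0 && b64[unpadded - 1] === "=") unpadded--;
  return Math.floor((unpadded * 3) / 4);
}

const MAX_BYTES = 100 * 1024; // the 100 KB limit from the question

// Hypothetical helper: size check on the encoded string, without decoding it.
function isWithinLimit(b64: string): boolean {
  return base64DecodedLength(b64) <= MAX_BYTES;
}
```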
It might also make sense to do some of those validation checks after upload.
You mean after upload to my app (which is what I'm already doing), or upload to R2? If the latter, could you possibly elaborate? Thank you.
After you upload the file to R2 and get the upload event. I wouldn't necessarily do this for upload size (the other option seems good). But checking other things can be done after it's uploaded.
https://developers.cloudflare.com/r2/buckets/event-notifications/
Cloudflare Docs
Event notifications · Cloudflare R2 docs
Event notifications send messages to your queue when data in your R2 bucket changes. You can consume these messages with a consumer Worker or pull …
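To make the validate-after-upload idea concrete, here is a rough sketch of what a consumer Worker's queue handler could look like. Everything named here is an assumption for illustration: the UPLOADS binding, the 100 KB and PDF rules from the question, and the message shape (an object field carrying key and size), which should be checked against the event notifications docs. The small interfaces stand in for the real Workers types so the sketch is self-contained.

```typescript
// Minimal stand-ins for the Workers types (assumed shapes, for illustration).
interface BucketLike {
  get(
    key: string,
    options?: { range: { offset: number; length: number } },
  ): Promise<{ arrayBuffer(): Promise<ArrayBuffer> } | null>;
  delete(key: string): Promise<void>;
}
interface UploadEvent { object: { key: string; size?: number } }
interface MessageLike { body: UploadEvent; ack(): void }
interface Env { UPLOADS: BucketLike } // hypothetical binding name

const MAX_BYTES = 100 * 1024;
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d]; // ASCII "%PDF-"

// In a real Worker this object would be the module's default export.
const consumer = {
  async queue(batch: { messages: MessageLike[] }, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      const { key, size } = msg.body.object;
      // Fetch only the first few bytes rather than the whole object.
      const head = await env.UPLOADS.get(key, {
        range: { offset: 0, length: PDF_SIGNATURE.length },
      });
      if (head !== null) {
        const bytes = new Uint8Array(await head.arrayBuffer());
        const isPdf = PDF_SIGNATURE.every((b, i) => bytes[i] === b);
        const tooBig = (size ?? 0) > MAX_BYTES;
        // Remove anything that fails the checks.
        if (!isPdf || tooBig) await env.UPLOADS.delete(key);
      }
      // A null result means the object is already gone; nothing to validate.
      msg.ack();
    }
  },
};
```

The trade-off with this pattern is that an invalid file does exist in the bucket for a short time, so anything reading from the bucket should not assume an object has been validated just because it is there.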
Thanks for this. So just so I'm 100% with this, the pattern here is to upload to R2 without any prior validation (except perhaps filesize, as per the method I mentioned) and THEN validate the file once it's uploaded to R2, via the information R2 provides? This is a new way of thinking for me, coming from PHP land!
Bonus question: is it a poor pattern in Node to upload the file to my back-end and then have my back-end transfer it to R2, rather than have my front-end upload direct to R2 via a signed link? The issue I have with the latter approach is it involves different parts of my form going to different places via different requests (the "normal" data to my back-end, and the file upload to R2).
Really appreciate the help.
So just to confirm, if the client sends a content-length header, do you trust that it is actually that size?
No, that's not how I'm determining length - I mentioned I was doing this by counting the string length of the base64 representation of the file, which I think is sound? The part I'm missing (coming from PHP) is this idea of first save to R2, then ask R2 whether the file is valid, and delete if necessary.