I recently used the Slurper tool to move data from S3 to R2

I recently used the Slurper tool to move data from S3 to R2. AWS shows me 2.73 TB, but Cloudflare shows 7 TB for the bucket. I am using a script to list all the files and add up their sizes, and it reports 2.73 TB, which matches what I expected. The dashboard seems to be way off, and I am worried this will impact billing. The initial move had some errors (about 10% of files could not be moved), so I ran it again 2-3 times with the skip-files option turned on, and those later runs only attempted the files that had errored. I also set a rule to remove failed multipart uploads after 1 day. Anyone else facing this?
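For reference, a cleanup rule like the one described above can also be set through R2's S3-compatible lifecycle API rather than the dashboard. This is a minimal sketch, assuming the AWS SDK for JavaScript v3; the account ID, credentials, and bucket name are placeholders, not values from this thread.

```ts
// Sketch: a lifecycle rule that aborts incomplete multipart uploads after
// 1 day, applied via R2's S3-compatible endpoint. <ACCOUNT_ID>, the
// credentials, and "my-bucket" are placeholders.
import {
  S3Client,
  PutBucketLifecycleConfigurationCommand,
} from "@aws-sdk/client-s3";

const r2 = new S3Client({
  region: "auto",
  endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

async function setMultipartAbortRule(bucket: string): Promise<void> {
  await r2.send(
    new PutBucketLifecycleConfigurationCommand({
      Bucket: bucket,
      LifecycleConfiguration: {
        Rules: [
          {
            ID: "abort-incomplete-multipart-after-1-day",
            Status: "Enabled",
            Filter: { Prefix: "" }, // apply to the whole bucket
            AbortIncompleteMultipartUpload: { DaysAfterInitiation: 1 },
          },
        ],
      },
    })
  );
}

setMultipartAbortRule("my-bucket").catch(console.error);
```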
cmaddox · 5d ago
Hey there, the 7 TB looks to be the correct amount of data for that account. A few questions:

1. Roughly how many objects do you expect in the bucket?
2. Do you see any objects in the bucket that are much larger than expected?
3. Is your script pulling every object in the bucket, or only a particular prefix?

I did a few spot checks, and object sizes appear to be correct based on what was received from the original source bucket.
Shankar_sq (OP) · 5d ago
1. There are around 17,200 objects in the bucket, and that matches the source bucket.
2. No, the sizes add up to 2.7 TB, and all objects are similar to the source.
3. I am using listObjectsV2 to list all the objects and then add up their sizes.

So the source bucket is only 2.7 TB, I have not uploaded any other files, and I have only used the migration tool via the Cloudflare dashboard. Could there be object versions from running the migration tool multiple times? I did use the 'skip' option, and I can confirm there are no duplicates on the dashboard or in Cyberduck.
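For reference, a minimal sketch of the kind of size-tally script described above, again assuming the AWS SDK for JavaScript v3 against R2's S3 endpoint (account ID, credentials, and bucket name are placeholders):

```ts
// Sketch: page through listObjectsV2 and sum object sizes.
import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

const r2 = new S3Client({
  region: "auto",
  endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

async function bucketTotals(bucket: string) {
  let objects = 0;
  let bytes = 0;
  let token: string | undefined;
  do {
    const page = await r2.send(
      new ListObjectsV2Command({ Bucket: bucket, ContinuationToken: token })
    );
    for (const obj of page.Contents ?? []) {
      objects += 1;
      bytes += obj.Size ?? 0;
    }
    token = page.NextContinuationToken;
  } while (token);
  return { objects, bytes, terabytes: bytes / 1e12 };
}

bucketTotals("my-bucket").then(console.log).catch(console.error);
```

Worth noting: listObjectsV2 only returns completed objects, so parts of incomplete multipart uploads never show up in this tally, which is one way a script total can come in below the dashboard's stored-data figure.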
cmaddox · 4d ago
Dug a little more into this. TL;DR: the data stored for the account is the 2.7 TB and ~17,000 objects as mentioned. The extra data that is showing is from the failed uploads, which should be automatically cleaned up when the multipart-abort object lifecycle rule runs for them (likely today or tomorrow, based on when you ran the migrations). I'll keep an eye on your account to ensure that occurs.
Shankar_sq (OP) · 4d ago
This is what I thought too, but:

- I set the multipart abort to 1 day, and it has already been over 6 days.
- I only ran the migration against a source bucket that is 2.7 TB, so for R2 to show 2.5x that as occupied space would require an extremely high failure rate. Only ~5-7% of files show as errors in the Slurper migration tool, and I have used no client-side tools or manual uploads.
- Overall usage on the dashboard is shown as 11 TB for the account, which also does not make sense, as there is no other data except this.

One other aspect is that I set the bucket to Infrequent Access when creating it. I have not removed any objects except via the rule that deletes failed multipart uploads. My guess is that the Infrequent Access setting, together with something about the 'skip' setting in Slurper, might be leading to the higher reported numbers.
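One way to check whether stale multipart uploads are still being counted is to list them directly, since they never appear in listObjectsV2. A minimal sketch under the same assumptions as the earlier snippets (client setup with placeholder endpoint and credentials):

```ts
// Sketch: list incomplete multipart uploads still sitting in the bucket.
// These count toward stored data but never appear in listObjectsV2.
import {
  S3Client,
  ListMultipartUploadsCommand,
  AbortMultipartUploadCommand,
} from "@aws-sdk/client-s3";

const r2 = new S3Client({
  region: "auto",
  endpoint: "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

async function listIncompleteUploads(bucket: string): Promise<void> {
  let keyMarker: string | undefined;
  let uploadIdMarker: string | undefined;
  let truncated = true;
  while (truncated) {
    const page = await r2.send(
      new ListMultipartUploadsCommand({
        Bucket: bucket,
        KeyMarker: keyMarker,
        UploadIdMarker: uploadIdMarker,
      })
    );
    for (const u of page.Uploads ?? []) {
      console.log(u.Initiated, u.Key, u.UploadId);
      // To clean up immediately instead of waiting on the lifecycle rule:
      // await r2.send(new AbortMultipartUploadCommand({
      //   Bucket: bucket, Key: u.Key!, UploadId: u.UploadId!,
      // }));
    }
    truncated = page.IsTruncated ?? false;
    keyMarker = page.NextKeyMarker;
    uploadIdMarker = page.NextUploadIdMarker;
  }
}

listIncompleteUploads("my-bucket").catch(console.error);
```

If this listing comes back empty while the dashboard figure stays inflated, that would point away from multipart leftovers as the cause.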
