How to Speed Up S3 Upload or Make it Async in RunPod Serverless Deployments

I am currently exploring using RunPod as our primary in-house model deployment platform instead of Replicate (our current preferred platform). Our in-house models mostly are txt2img/img2img custom models.

One of the issues I'm facing while testing RunPod is long S3 upload times. For example, for one of our processes, the prediction time is ~1 second, but the S3 upload is taking up to 4-5 seconds (depending on image size), significantly increasing the overall prediction time.

This causes two main problems:
- Long prediction times despite the GPU being free after just 1 second of actual processing
- Increased queue times as workers remain occupied during these long uploads

Is there a way to speed up S3 uploads? Is there a way to make the S3 upload async so that the server can handle multiple concurrent requests?

For comparison, Replicate provides temporary File URLs (persistent for ~30 mins) that avoid the S3 upload overhead, resulting in much faster overall request times.

Note: I am already using RunPod's specific S3 upload function that uses chunking.
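To make the "async upload" idea concrete, here is a minimal sketch of overlapping one request's upload with the next request's GPU work. It is not RunPod's API: `run_model`, `upload_blocking`, `handle`, `serve`, and the `example-bucket` name are all hypothetical stand-ins, and the sleeps only simulate the prediction and upload times. It assumes the worker process is allowed to handle several jobs concurrently, which would have to be confirmed for the platform in question.

```python
import asyncio
import time


def run_model(prompt: str) -> bytes:
    """Hypothetical stand-in for the GPU prediction step."""
    time.sleep(0.01)  # simulated inference time
    return f"image-for-{prompt}".encode()


def upload_blocking(key: str, data: bytes) -> str:
    """Hypothetical stand-in for a blocking S3 upload call."""
    time.sleep(0.05)  # simulated network time
    return f"s3://example-bucket/{key}"


async def handle(job_id: str, prompt: str, gpu_lock: asyncio.Lock) -> dict:
    # Only one job may use the GPU at a time.
    async with gpu_lock:
        image = await asyncio.to_thread(run_model, prompt)
    # The lock is released here, so the next job can start its prediction
    # while this job's upload is still running in a worker thread.
    url = await asyncio.to_thread(upload_blocking, f"{job_id}.png", image)
    return {"id": job_id, "url": url}


async def _serve_async(jobs):
    gpu_lock = asyncio.Lock()
    tasks = [handle(job_id, prompt, gpu_lock) for job_id, prompt in jobs]
    # gather() returns results in the same order as the input jobs.
    return await asyncio.gather(*tasks)


def serve(jobs):
    """Run a batch of (job_id, prompt) pairs with overlapped uploads."""
    return asyncio.run(_serve_async(jobs))


if __name__ == "__main__":
    for result in serve([("a", "cat"), ("b", "dog"), ("c", "fox")]):
        print(result)
```

Each request still waits for its own upload before it returns, so per-request latency is unchanged in this sketch; what should improve is throughput and queue time, because the GPU is no longer idle during uploads.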
2 Replies
ToonyGen, 2d ago
I just encode everything as base64, but it would be great if the RunPod SDK could support background S3 upload and download. It would also be nice if its serverless SDK supported other S3-compatible buckets.
garg-aayush (OP), 20h ago
I would actually like to avoid base64 images, since in my case the images can at times be 10 MB or more. Such large base64 payloads can create other issues. It would be great if RunPod could support temporary File URLs (persistent for ~30 mins), similar to Replicate. This would avoid the S3 upload overhead and would be of great help.
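As a rough illustration of the size concern with base64, the snippet below compares a raw 10 MiB payload with its base64 encoding. The helper name `base64_size` is made up for this example, and the zero-filled buffer merely stands in for real image bytes; the encoded size depends only on the input length, not its contents.

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length of the padded base64 encoding of n_bytes of input."""
    # Every 3 input bytes (rounded up) become 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


if __name__ == "__main__":
    raw = bytes(10 * 1024 * 1024)  # placeholder for a 10 MiB image
    encoded = base64.b64encode(raw)
    assert len(encoded) == base64_size(len(raw))
    print(f"raw: {len(raw)} bytes, base64: {len(encoded)} bytes")
    print(f"overhead: {len(encoded) / len(raw) - 1:.1%}")
```

So a 10 MB image grows by about a third before any JSON wrapping, which is one reason large images tend to be returned by URL instead of inline.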
