RunPod · 6mo ago
tzk

How to get worker to save multiple images to S3?

Hey all - my ComfyUI workflow saves multiple images at various points throughout the workflow... however, in the S3 upload, the worker is only saving one image. Do you know how I can have it save the multiple images into the same directory in S3?
36 Replies
nerdylive · 6mo ago
I think you need to change the code for uploading files to S3 in the handler - it only uploads 1 result, if I'm not wrong
tzk (OP) · 6mo ago
oh I see I'll dig into it - thanks!
nerdylive · 6mo ago
Sure, you're welcome!
Madiator2011 (Work)
zip it then upload
tzk (OP) · 6mo ago
Oh that’s a great suggestion - I wasn’t sure if ComfyUI had a zip node?
nerdylive · 6mo ago
Oh wait, what, doesn't RunPod support multi-file upload? Btw, how does your code call the upload?
tzk (OP) · 6mo ago
I'm using the runpod-worker-comfy repo by blib-la - these are the lines of code that handle the upload:

image = rp_upload.upload_image(job_id, local_image_path)
print(
    "runpod-worker-comfy - the image was generated and uploaded to AWS S3"
)
nerdylive · 6mo ago
Ah, let me check
tzk (OP) · 6mo ago
FWIW, I mentioned the following in the blib-la support discord too:
I suspect the issue is on line 240 in rp_handler.py, where the dict output_images is overwritten in the for statement? I unfortunately have no way to test, as I can't set up the Docker image locally on my M1 Mac (it needs an Nvidia GPU, I'm guessing?)
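(For illustration, a minimal sketch of the kind of overwrite being suspected here - hypothetical variable names, not the actual rp_handler.py code:)

# Hypothetical sketch of the suspected bug: assigning inside the loop
# keeps only the last node's images...
output_images = {}
for node_id, node_output in outputs.items():
    if "images" in node_output:
        output_images = node_output["images"]  # overwritten on every iteration

# ...whereas accumulating per node keeps all of them:
output_images = {}
for node_id, node_output in outputs.items():
    if "images" in node_output:
        output_images[node_id] = node_output["images"]  # one entry per node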
nerdylive · 6mo ago
Doesn't have any for loops here? What's local_image_path?
tzk (OP) · 6mo ago
The path where the output images are stored in the worker's instance of ComfyUI, I believe
nerdylive · 6mo ago
so a directory, and it only uploads one file?
tzk (OP) · 6mo ago
correct
nerdylive · 6mo ago
I'll try to check on this. Maybe RunPod only uploads one of the files - yeah, I think that upload function only uploads one file
Madiator2011 (Work)
nah you can upload multi
nerdylive · 6mo ago
Did you mean the upload function automatically retrieves multiple images or files if the input path is a directory?
Encyrption · 6mo ago
It's not magic. It depends on how you code your serverless handler to run. If you need to upload multiple files, then call rp_upload.upload_image (included in the runpod Python module) multiple times. If you don't like how rp_upload.upload_image works, you can write your own upload routine. Here is an example of uploading 2 files using rp_upload.upload_image:

from runpod.serverless.utils import rp_upload
import runpod

def handler(job):
    input = job['input']
    JOB_ID = job['id']

    imageOne_url = rp_upload.upload_image(JOB_ID, "./imageOne.png")
    imageTwo_url = rp_upload.upload_image(JOB_ID, "./imageTwo.png")

    return {'imageOne_url': imageOne_url, 'imageTwo_url': imageTwo_url}

You will have to modify this for your specific needs.
nerdylive · 6mo ago
Yeah, that's why I'm confused about why Madiator said this
nerdylive · 6mo ago
(image attachment)
nerdylive · 6mo ago
Yeah, it receives a file path in args. Orr... you can use another utility function in the same Python file as that upload_image() to upload multiple files
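(If nerdylive means the helper I'm thinking of, the runpod SDK's rp_upload module also exposes a function that accepts a list of paths - something like the sketch below. The name files() and its exact signature are an assumption; check rp_upload.py in your installed runpod version before relying on it:)

from runpod.serverless.utils import rp_upload

# Assumed helper: rp_upload.files(job_id, file_list) uploads each listed file
# and returns their URLs. Verify the name/signature in your runpod SDK version.
urls = rp_upload.files(job['id'], ['./imageOne.png', './imageTwo.png'])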
Encyrption · 6mo ago
I do not believe that rp_upload.upload_image handles multiple files with its default code. Here is code for a multi-file upload routine you could add and use to upload multiple files, with a wildcard:
import glob

def uploadMulti(job_id, path_with_wildcard):
    # Get a list of all files matching the wildcard path
    file_list = glob.glob(path_with_wildcard)

    # Initialize an empty list to store the URLs of uploaded files
    uploaded_urls = []

    # Iterate through each file in the list
    for file_path in file_list:
        # Upload the file and get the URL
        file_url = rp_upload.upload_image(job_id, file_path)
        # Append the URL to the list
        uploaded_urls.append(file_url)

    # Return the list of URLs
    return uploaded_urls
Here is how you call such a function:
# Example usage
def handler(job):
    input = job['input']
    JOB_ID = job['id']

    path_with_wildcard = "./images/*.png"
    uploaded_urls = uploadMulti(JOB_ID, path_with_wildcard)

    return {'uploaded_urls': uploaded_urls}
nerdylive · 6mo ago
ikr yeah
tzk (OP) · 5mo ago
Thank you @Encyrption - this is super helpful. I’m actually looking to download - rather than upload - multiple images from the output folder. Basically, my workflow is producing multiple images and I want to present them all to the user so they can choose the one they want.
nerdylive · 5mo ago
How do you want to download them? Where / how? Maybe, if you want to download inside serverless, there is boto3, or you can use rp_download from runpod's pip package
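(A minimal boto3 sketch for downloading from S3 inside a worker - the bucket, key, and local path below are hypothetical, and credentials are assumed to be in the usual AWS environment variables:)

import boto3

# boto3 picks up AWS credentials from env vars / config by default
s3 = boto3.client("s3")
# Hypothetical bucket, key, and destination path for illustration
s3.download_file("my-bucket", "outputs/image_001.png", "/tmp/image_001.png")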
Encyrption · 5mo ago
I think what you are describing is uploading. Your workflow is likely producing image files: PNG, JPG, or similar. They are stored on the worker's disk as files. For the user to get access to them, you will have to upload those files somewhere that will host them on the Internet. My code above uploads them to an S3 bucket. With proper configuration of the S3 bucket, its URL will be available for anyone on the Internet to access. Your handler needs to return those URLs to the user in JSON. If you are presenting a web interface to the user, then you will need to include the images in <img> tags so they can see them, or use JavaScript to make it so the user downloads them.
nerdylive · 5mo ago
Yep, you're correct, Encyrption. Upload -> serverless worker to S3; download is S3 to serverless worker
Encyrption · 5mo ago
If you do actually need to download something into your worker, you can add this function:
"""Downloads a file from a URL to a local path."""
def download_file(url, local_filename):
try:
print(f'[Enhancer]: Downloading {url}')
if os.path.exists(local_filename):
return local_filename, None
with requests.get(url, stream=True) as r:
r.raise_for_status()

with open(local_filename, 'wb') as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)

return local_filename, None

except Exception as e:
return None, e
"""Downloads a file from a URL to a local path."""
def download_file(url, local_filename):
try:
print(f'[Enhancer]: Downloading {url}')
if os.path.exists(local_filename):
return local_filename, None
with requests.get(url, stream=True) as r:
r.raise_for_status()

with open(local_filename, 'wb') as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)

return local_filename, None

except Exception as e:
return None, e
You can call it like this:
result, error = download_file(url, local_path)
tzk (OP) · 5mo ago
Ahh, thank you both! Btw, from a performance point of view, is it better to go via S3 or bring it directly to the local device? I.e., export as a list of base64 strings?
nerdylive · 5mo ago
Try both, but if you do many images at one time, I guess S3 is better. S3 is cleaner, and depending on your provider it can also provide speed, if your provider allows you to connect a CDN to S3 buckets
Encyrption · 5mo ago
When you run a serverless worker, the only thing that will be returned is JSON (text). Below is an example of a run of one of my ToonCrafter workers:
{
    "delayTime": 491,
    "executionTime": 135484,
    "id": "your-unique-id-will-be-here",
    "output": {
        "output_video_url": "https://mybucket.nyc3.digitaloceanspaces.com/ToonCrafter/2024_06_16_16.20.48.mp4"
    },
    "status": "COMPLETED"
}
You can see how I returned an S3 URL to a video file. None of the serverless disk will persist, so any files not uploaded somewhere are lost. One other option would be to convert your image to BASE64. Since BASE64 is just text, you can return that in the actual JSON response. Although, there is a size limit on how much data you can include in a response. You could likely return an image encoded as BASE64, but I wouldn't suggest that route for multiple images. But again, nothing persists after an API call to a serverless worker. You have to move the results somewhere that will persist and give a link to it in the JSON that is returned.
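(For completeness, a minimal sketch of the BASE64 route - hypothetical handler and output paths, and subject to the response size limit mentioned above:)

import base64

def handler(job):
    # Hypothetical ComfyUI output paths for illustration
    image_paths = ["/comfyui/output/img_0.png", "/comfyui/output/img_1.png"]
    encoded = []
    for path in image_paths:
        with open(path, "rb") as f:
            # b64encode returns bytes; decode to get a JSON-safe string
            encoded.append(base64.b64encode(f.read()).decode("utf-8"))
    return {"images_base64": encoded}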
tzk (OP) · 5mo ago
Thank you thank you both!!
nerdylive · 5mo ago
Yep, might be slower too if you return base64, depending on your network connection
Encyrption · 5mo ago
I've just finished building a WebSocket proxy that creates a WebSocket connection between the serverless worker and the user's browser. With that, you could send the results directly to the user's browser. I'm not giving that code out yet though... I am considering if I want to run it as a service.
nerdylive · 5mo ago
Wew, how's the speed?
Encyrption · 5mo ago
Hard to say.... it depends on a lot of factors: upload/download speed at the given RunPod region, upload/download speed of the user's browser. I have done tests with streaming logs from the worker to the browser. I am currently working on code that will stream webcam video from the browser to the worker, and video in the reverse direction. Transferring media (images, videos) should be no problem in most scenarios.... but if a user was on a slow link, it could be.
briefPeach · 5mo ago
@tzk I noticed this bug too with blib-la's repo. This is what I'm doing to grab all the ComfyUI output images:
for node_id, node_output in outputs.items():
    for output_type, output in node_output.items():
        for image in output:
            if not isinstance(image, dict) or "filename" not in image:
                continue
            try:
                subfolder = image.get("subfolder", "")
                type = image.get("type", "output")
                image_path = os.path.join(COMFYUI_PATH, type, subfolder, image.get("filename"))
                if image_path not in output_images and type == "output":  # only process output images, no temp images
                    output_images.append(image_path)
            except Exception as e:
                print(f"Error processing output in: node [{node_id}] {image} - {e}")
                print(traceback.format_exc())
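(Tying the thread together: once output_images holds every generated file path, each one can be pushed to S3 with the per-file upload shown earlier - a sketch assuming output_images and job are in scope inside the handler:)

from runpod.serverless.utils import rp_upload

# Upload every collected ComfyUI output and gather the resulting URLs
uploaded_urls = [rp_upload.upload_image(job["id"], path) for path in output_images]
# The handler would then return them, e.g.: return {"uploaded_urls": uploaded_urls}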