RunPod5mo ago
tzk

How to get worker to save multiple images to S3?

Hey all - my ComfyUI workflow is saving multiple images from throughout the workflow... however, in the S3 upload, the worker is only saving one image. Do you know how I can have it save the multiple images into the same directory in S3?
36 Replies
nerdylive
nerdylive5mo ago
I think you need to change the code for uploading files to S3 in the handler; if I'm not wrong, it only uploads one result.
tzk
tzkOP5mo ago
oh I see I'll dig into it - thanks!
nerdylive
nerdylive5mo ago
Sure, you're welcome!
Madiator2011 (Work)
zip it then upload
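Something like this, roughly (a minimal sketch; the paths and function name are placeholders, not from any repo):
import zipfile
from pathlib import Path

def zip_outputs(output_dir, archive_path):
    # Bundle every file under the ComfyUI output directory into one zip,
    # so a single-file upload helper can ship all results at once.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for file in Path(output_dir).rglob("*"):
            if file.is_file():
                zf.write(file, arcname=file.relative_to(output_dir))

# e.g. zip_outputs("/comfyui/output", "/tmp/results.zip"), then upload the zip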
tzk
tzkOP5mo ago
Oh, that's a great suggestion. I wasn't sure if ComfyUI had a zip node?
nerdylive
nerdylive5mo ago
Oh wait, doesn't RunPod support multi-file upload? Btw, how does your code call the upload?
tzk
tzkOP5mo ago
I'm using the runpod-worker-comfy repo by blib-la; these are the lines of code that handle the upload:
image = rp_upload.upload_image(job_id, local_image_path)
print(
    "runpod-worker-comfy - the image was generated and uploaded to AWS S3"
)
nerdylive
nerdylive5mo ago
Ah, let me check
tzk
tzkOP5mo ago
FWIW, I mentioned the following in the blib-la support discord too:
I suspect the issue is on line 240 in rp_handler.py, where the dict output_images is overwritten in the for statement? I unfortunately have no way to test, as I can't set up the Docker image locally on my M1 Mac (it needs an Nvidia GPU, I'm guessing?)
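To illustrate what I mean, a minimal sketch of that kind of overwrite-in-a-loop bug (simplified stand-in names, not the actual rp_handler.py code):
# Assume `outputs` is ComfyUI's history dict of node results, e.g.:
outputs = {
    "9": {"images": [{"filename": "a.png"}]},
    "12": {"images": [{"filename": "b.png"}]},
}

# Buggy pattern: the result variable is reassigned on every iteration,
# so only the last node's images survive the loop.
output_images = {}
for node_id, node_output in outputs.items():
    if "images" in node_output:
        output_images = node_output["images"]  # overwrites previous nodes

# Fixed sketch: accumulate instead of reassigning.
output_images = []
for node_id, node_output in outputs.items():
    for image in node_output.get("images", []):
        output_images.append(image)  # keeps every node's images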
nerdylive
nerdylive5mo ago
It doesn't have any for loops here? What's local_image_path?
tzk
tzkOP5mo ago
the path where the output images are stored in the worker's instance of ComfyUI I believe
nerdylive
nerdylive5mo ago
so a directory, and it only uploads one file?
tzk
tzkOP5mo ago
correct
nerdylive
nerdylive5mo ago
I'll try to check on this maybe. RunPod only uploads one of the files; yeah, I think that upload function only uploads one file.
Madiator2011 (Work)
nah, you can upload multiple
nerdylive
nerdylive5mo ago
Did you mean the upload function automatically retrieves multiple images or files if the input path is a directory?
Encyrption
Encyrption5mo ago
It's not magic. It depends on how you code your serverless handler to run. If you need to upload multiple files, then call rp_upload.upload_image (included in the runpod Python module) multiple times. If you don't like how rp_upload.upload_image works, you can write your own upload routine. Here is an example of uploading 2 files using rp_upload.upload_image:
from runpod.serverless.utils import rp_upload
import runpod

def handler(job):
    input = job['input']
    JOB_ID = job['id']

    imageOne_url = rp_upload.upload_image(JOB_ID, "./imageOne.png")
    imageTwo_url = rp_upload.upload_image(JOB_ID, "./imageTwo.png")

    return {'imageOne_url': imageOne_url, 'imageTwo_url': imageTwo_url}
You will have to modify this for your specific needs.
nerdylive
nerdylive5mo ago
Yeah, that's why I'm confused about why Madiator said this.
nerdylive
nerdylive5mo ago
Yeah, it receives a file path in args. Or... you can use another utility function in the same Python file as that upload_image() to upload multiple files.
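If your SDK version has it (worth double-checking in rp_upload's source, since I'm going from memory here), calling that multi-file helper might look like this:
from runpod.serverless.utils import rp_upload

def handler(job):
    # Assumed helper: rp_upload.files takes the job id and a list of local
    # file paths. Verify the name and signature against your installed
    # runpod version before relying on it.
    urls = rp_upload.files(job['id'], ["./imageOne.png", "./imageTwo.png"])
    return {'urls': urls}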
Encyrption
Encyrption5mo ago
I do not believe that rp_upload.upload_image handles multiple files with its default code. Here is code for a multi-file upload routine you could add and use to upload multiple files, with a wildcard:
import glob

from runpod.serverless.utils import rp_upload

def uploadMulti(job_id, path_with_wildcard):
    # Get a list of all files matching the wildcard path
    file_list = glob.glob(path_with_wildcard)

    # Initialize an empty list to store the URLs of uploaded files
    uploaded_urls = []

    # Iterate through each file in the list
    for file_path in file_list:
        # Upload the file and get the URL
        file_url = rp_upload.upload_image(job_id, file_path)
        # Append the URL to the list
        uploaded_urls.append(file_url)

    # Return the list of URLs
    return uploaded_urls
Here is how you call such a function:
# Example usage
def handler(job):
    input = job['input']
    JOB_ID = job['id']

    path_with_wildcard = "./images/*.png"
    uploaded_urls = uploadMulti(JOB_ID, path_with_wildcard)

    return {'uploaded_urls': uploaded_urls}
nerdylive
nerdylive5mo ago
ikr yeah
tzk
tzkOP4mo ago
Thank you @Encyrption, this is super helpful. I'm actually looking to download, rather than upload, multiple images from the output folder. Basically my workflow is producing multiple images and I want to present them all to the user so they can choose the one they want.
nerdylive
nerdylive4mo ago
How do you want to download them? Where/how? If you want to download inside serverless, there is boto3, or you can use rp_download from RunPod's pip package.
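For example, a minimal boto3 sketch (the endpoint, bucket, key, and credentials below are all placeholders):
import boto3

# Download one object from S3 to the worker's local disk.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.your-region.amazonaws.com",  # or your provider's endpoint
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)
s3.download_file("your-bucket", "outputs/image_01.png", "/tmp/image_01.png")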
Encyrption
Encyrption4mo ago
I think what you are describing is uploading. Your workflow is likely producing image files (PNG, JPG, or similar). They are stored on the worker's disk as files. For the user to get access to them, you will have to upload those files somewhere that will host them on the Internet. My code above uploads them to an S3 bucket. With proper configuration of the S3 bucket, its URL will be available for anyone on the Internet to access. Your handler needs to return those URLs to the user in JSON. If you are presenting a web interface to the user, then you will need to include the images in <img> tags so they can see them, or use JavaScript to make it so the user downloads them.
nerdylive
nerdylive4mo ago
Yep, you're correct, Encyrption. Upload is serverless worker to S3; download is S3 to serverless worker.
Encyrption
Encyrption4mo ago
If you do actually need to download something into your worker, you can add this function:
"""Downloads a file from a URL to a local path."""
def download_file(url, local_filename):
try:
print(f'[Enhancer]: Downloading {url}')
if os.path.exists(local_filename):
return local_filename, None
with requests.get(url, stream=True) as r:
r.raise_for_status()

with open(local_filename, 'wb') as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)

return local_filename, None

except Exception as e:
return None, e
"""Downloads a file from a URL to a local path."""
def download_file(url, local_filename):
try:
print(f'[Enhancer]: Downloading {url}')
if os.path.exists(local_filename):
return local_filename, None
with requests.get(url, stream=True) as r:
r.raise_for_status()

with open(local_filename, 'wb') as f:
for chunk in r.iter_content(chunk_size=8192):
f.write(chunk)

return local_filename, None

except Exception as e:
return None, e
You can call it like this:
result, error = download_file(url, local_path)
tzk
tzkOP4mo ago
Ahh, thank you both! Btw, from a performance point of view, is it better to go via S3 or bring it directly to the local device? I.e., export as a list of base64 strings?
nerdylive
nerdylive4mo ago
Try both, but if you do many images at one time, I guess S3 is better. S3 is cleaner too; depending on your provider, it can also be faster if your provider allows you to connect a CDN to S3 buckets.
Encyrption
Encyrption4mo ago
When you run a serverless worker, the only thing that will be returned is JSON (text). Below is an example of a run of one of my ToonCrafter workers:
{
    "delayTime": 491,
    "executionTime": 135484,
    "id": "your-unique-id-will-be-here",
    "output": {
        "output_video_url": "https://mybucket.nyc3.digitaloceanspaces.com/ToonCrafter/2024_06_16_16.20.48.mp4"
    },
    "status": "COMPLETED"
}
You can see how I returned an S3 URL to a video file. None of the serverless disk will persist, so any files not uploaded somewhere are lost. One other option would be to convert your image to base64. Since base64 is just text, you can return it in the actual JSON response. Although there is a size limit on how much data you can include in a response, you could likely return a single image encoded as base64, but I wouldn't suggest that route for multiple images. Again: nothing persists after an API call to a serverless worker. You have to move the results somewhere that will persist and return a link to them in the JSON.
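As a rough sketch of the base64 route (the file path below is a placeholder):
import base64

def handler(job):
    # Read one output image and return it inline as base64 text in the JSON.
    # Fine for a single small image; unwieldy for many or large files.
    with open("/comfyui/output/image_01.png", "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {"image_base64": encoded}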
tzk
tzkOP4mo ago
Thank you thank you both!!
nerdylive
nerdylive4mo ago
Yep, it might be slower too if you return base64, depending on your network connection.
Encyrption
Encyrption4mo ago
I've just finished building a WebSocket proxy... it creates a WebSocket connection between the serverless worker and the user's browser. With that you could send the results directly to the user's browser. I'm not giving that code out yet though... I am considering whether I want to run it as a service.
nerdylive
nerdylive4mo ago
Wew, how's the speed?
Encyrption
Encyrption4mo ago
Hard to say... it depends on a lot of factors: upload/download speed at the given RunPod region, and upload/download speed of the user's browser. I have done tests with streaming logs from the worker to the browser. I am currently working on code that will stream webcam video from the browser to the worker, and video in the reverse direction. Transferring media (images, videos) should be no problem in most scenarios... but if a user was on a slow link, it could be.
briefPeach
briefPeach4mo ago
@tzk I noticed this bug too with blib-la's repo. This is what I'm doing to grab all ComfyUI output images:
for node_id, node_output in outputs.items():
    for output_type, output in node_output.items():
        for image in output:
            if not isinstance(image, dict) or "filename" not in image:
                continue
            try:
                subfolder = image.get("subfolder", "")
                type = image.get("type", "output")
                image_path = os.path.join(COMFYUI_PATH, type, subfolder, image.get("filename"))
                if image_path not in output_images and type == "output":  # only process output images, no temp images
                    output_images.append(image_path)
            except Exception as e:
                print(f"Error processing output in: node [{node_id}] {image} - {e}")
                print(traceback.format_exc())
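Then, assuming the rp_upload loop shown earlier in the thread, uploading the collected paths could look something like this (upload_all is a hypothetical name, not from the repo):
from runpod.serverless.utils import rp_upload

def upload_all(job_id, output_images):
    # Push each collected path to S3 with the same upload_image call used
    # above, returning the list of URLs for the handler's JSON response.
    return [rp_upload.upload_image(job_id, path) for path in output_images]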