How to parallelize batch processing of photos?

I’m trying to batch run ~60,000 images from my photo library through Topaz for filtering, but it only processes one image at a time. With an M2 Ultra there are plenty of resources available to process multiple photos in parallel… currently it’s taking forever. I don’t see an option for parallel threads in Topaz; is there any way to enable this?

EDIT:

After an rm -rf of the .DS* files I got this command to work as intended, which means there are a hundred ways to do this, but I have no immediate solution for managing memory. I got 4 processes that each consume 10 GB of memory, and it usually crashes after the first photo is done. I will check whether making sure that each thread does not get piped the same images helps, since I think I am using 4 threads to run 4 Topaz instances that each need as much memory as a single thread.
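One way to check that (untested so far) is GNU Parallel’s --dry-run flag, which prints each command it would run without actually running it, so I can see whether every job gets a distinct file. The topaz path and flags are just my setup:

$ find . -maxdepth 1 -type f -print0 | parallel -0 -j 4 --dry-run topaz --cli {} --output ./test

If every printed line shows a different filename, the images are being distributed correctly and the memory use is simply four full Topaz instances.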

GNU Parallel has a built-in option to not start a process unless a certain amount of RAM is free, but after that the memory maxes out again.

$ find . -maxdepth 1 -type f -print0 | parallel -0 -j 2 --memfree 4G topaz --cli {} --output ./test

Where -j 2 --memfree 4G starts 2 jobs and checks for at least 4 GB of available memory before starting each one. It works, but it takes a minute before my 2 topaz processes consume 20 GB of memory each :neutral_face:
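One thing that might help here (I haven’t verified it yet) is combining --memfree with Parallel’s --delay option, which waits a fixed time between starting jobs, so each Topaz instance can reach its real memory footprint before the free-memory check runs for the next one. The 20G and 30-second values below are guesses based on the ~20 GB per process I’m seeing:

$ find . -maxdepth 1 -type f -print0 | parallel -0 -j 2 --memfree 20G --delay 30 topaz --cli {} --output ./test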

This command got it technically working after ‘rm -rf .DS*’:

  • I have linked the CLI binary to /usr/local/bin so that running “$ topaz” works anywhere. This command is run while in the same folder as the images I wanted to process, and the results are to be placed in a new “test/” directory.
$ find . -maxdepth 1 -type f -print0 | parallel -0 -j 4 topaz --cli {} --output ./test --verbose

I got the original .tiff at 3024 × 4032 px and 37 MB.
I also got a new .tiff under “test/” at 9072 × 12096 px and 658 MB, which is exactly as intended.


I got one of the 4 .tiffs as intended before it crashed. I think it would be very helpful if Topaz just offered this as an argument option themselves. I also thought about using a cloud service like RunPod, but I can’t get it to work on Linux even though it should be perfect for this.
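For what it’s worth, the rm -rf step can probably be avoided by filtering the Finder metadata out in find itself, something like this (untested on my library):

$ find . -maxdepth 1 -type f ! -name '.DS_Store' -print0 | parallel -0 -j 4 topaz --cli {} --output ./test --verbose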

from chatgpt:

Here’s a basic way to parallelize your binary using shell scripting:

1. Using xargs:

The xargs command can be used to run processes in parallel. For example, if you have a list of images in a file called images.txt and your binary is named mybinary, you can use the following command to process 4 images concurrently:


cat images.txt | xargs -I {} -P 4 ./mybinary {}

Here’s what each argument means:

  • -I {}: This replaces {} in ./mybinary {} with each line from images.txt.
  • -P 4: This tells xargs to run up to 4 processes in parallel.
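Note that reading images.txt line by line breaks if any path contains spaces or newlines. A more robust variant (assuming your xargs supports -0 and -P, which both GNU and macOS/BSD xargs do) feeds null-delimited paths straight from find:

find . -maxdepth 1 -type f -print0 | xargs -0 -P 4 -I {} ./mybinary {}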

2. Using GNU Parallel:

GNU Parallel is a powerful tool for executing jobs in parallel. If you have it installed, you can use it to achieve parallelism as well:


cat images.txt | parallel -j 4 ./mybinary {}

Here, -j 4 tells GNU Parallel to run 4 jobs in parallel.
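With ~60,000 images it may also be worth adding GNU Parallel’s --joblog and --resume options, so a crashed or interrupted run can be restarted without redoing images that already finished (jobs.log is just an arbitrary log filename):

cat images.txt | parallel -j 4 --joblog jobs.log --resume ./mybinary {}

This assumes images.txt stays the same between runs, since --resume matches jobs by their position in the input.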

3. Using a Scripting Language:

You can also use scripting languages like Python to run the binary in parallel. Here’s an example using Python’s concurrent.futures:


import subprocess
from concurrent.futures import ProcessPoolExecutor

def run_binary(image_path):
    subprocess.run(['./mybinary', image_path])

images = [line.strip() for line in open('images.txt')]

with ProcessPoolExecutor(max_workers=4) as executor:
    executor.map(run_binary, images)

This Python script reads the image paths from images.txt, and processes up to 4 images concurrently.

Remember to adjust the number of concurrent processes based on the capabilities of your Apple M2 and the nature of your task. Running too many processes might not necessarily speed things up, and in some cases, might even slow things down due to contention for resources. Always test different configurations to find the optimal number of processes for your specific use case.


Ok, that makes sense. I do this sort of thing all the time but wasn’t aware this could be launched via the CLI – I just did the drag/drop of 40,000 photos into it and hit go.


Thanks so much for sharing!