Running Parallel Processes - FPS not dropping much when more processes added in parallel?

robert.moulton · November 16, 2022, 9:36pm

Hi, new user here.

I’ve been processing some files one at a time and they usually process at about .7 fps.
I then tried running three files in parallel. I was expecting the fps processing speed to drop by a factor of about three but instead the fps only dropped to about .6 fps.

I was surprised by that. I must not understand how resources are used when processing a file. What is the usual bottleneck in processing?
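A rough way to read these numbers is to add up the throughput of all running processes. This is only a sketch: it assumes the per-process fps readouts are accurate and that each process works on a different file.

```python
def aggregate_fps(per_process_fps: float, processes: int) -> float:
    """Total frames per second across identical parallel processes."""
    return per_process_fps * processes


single_fps = 0.7    # one file at a time (figure from the post)
parallel_fps = 0.6  # each of three files in parallel (figure from the post)
processes = 3

total = aggregate_fps(parallel_fps, processes)
speedup = total / single_fps  # how much more work gets done per second overall
print(f"aggregate: {total:.2f} fps, speedup: {speedup:.2f}x")
```

A speedup well above 1 suggests that a single process was not saturating whatever resource limits throughput, even if one usage meter already read 100%.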

If relevant, I am running with Stabilization off, the Chronos Fast model, deinterlacing, and Dione: TV 2X FPS. My output FPS matches the input of 29.97. My input files are crappy VHS captures at 720*480 (as is the output).

(I need to process some large VHS files and am now planning to split them apart, process the pieces in parallel, and then rejoin them. This will save me days of time.)
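One possible way to plan the split is sketched below. It assumes ffmpeg is used for cutting, which nobody in this thread has confirmed; the file names and the 2-hour duration are hypothetical. With stream copy, cuts generally land on keyframes, so the actual segment boundaries may shift slightly from the requested times.

```python
def split_points(duration_s: float, parts: int) -> list[tuple[float, float]]:
    """Return (start, length) pairs that cover duration_s in equal parts."""
    length = duration_s / parts
    return [(i * length, length) for i in range(parts)]


def cut_command(src: str, start: float, length: float, dst: str) -> list[str]:
    """Build an ffmpeg stream-copy cut, so nothing is re-encoded."""
    return ["ffmpeg", "-ss", f"{start:.3f}", "-i", src,
            "-t", f"{length:.3f}", "-c", "copy", dst]


# Hypothetical example: a 2-hour capture split into three parts.
for n, (start, length) in enumerate(split_points(7200, 3), start=1):
    print(" ".join(cut_command("tape.mp4", start, length, f"part{n}.mp4")))
```

The commands are only printed here; they could be passed to `subprocess.run` once the times and names are checked.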

TPX · November 16, 2022, 11:05pm

That’s the smartest thing you can do.
We’ll see if anything changes in the future.
In principle, I personally have nothing against running multiple processes in parallel.

robert.moulton · November 17, 2022, 12:33am

I now notice that when I ran only one process at a time my GPU would go to 100% and my CPU was unaffected. Now with three processes running in parallel I see the GPU continue at 100% and now the CPU is running at about 50%.

I am encoding with VP9 Best as MP4, auto bitrate, and Audio Settings as Copy.

Any tips, or pointers to a link, for minimizing loss on re-encodings (when re-assembling a split file)? I need to use something I can be reasonably sure other people have access to on their PCs (the results will be played on PCs). So should I encode as MP4 on the highest quality settings? (If expressed as a bitrate, I’m thinking 4 Mbps is sufficient for VHS-sourced material.) Happy to be told otherwise; I’m new to all this.
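If the processed segments all share the same codec and encoding settings, they can usually be joined without re-encoding at all, which avoids a further generation of loss. The sketch below assumes ffmpeg and its concat demuxer are used (an assumption, not something from this thread); the file names are hypothetical, and audio at the joins should still be checked by ear.

```python
def concat_list(paths: list[str]) -> str:
    """Text for an ffmpeg concat-demuxer list file, one segment per line.

    Paths containing a single quote would need escaping; not handled here.
    """
    return "".join(f"file '{p}'\n" for p in paths)


def join_command(list_file: str, dst: str) -> list[str]:
    """Join the listed segments with stream copy (no re-encode)."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


parts = ["part1_out.mp4", "part2_out.mp4", "part3_out.mp4"]  # hypothetical names
print(concat_list(parts), end="")  # contents to save as parts.txt
print(" ".join(join_command("parts.txt", "joined.mp4")))
```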

ForSerious · November 17, 2022, 6:11pm

Personally, I like outputting to image files. TVAI even numbers them correctly by default, so you can just drag and drop all the images into one folder then encode them back into a video and add the sound back in.
If you don’t do it that way, I don’t know how to not add sound gaps in the final movie.

TPX · November 17, 2022, 9:22pm

Exporting the video as single frames is also a good way to edit the images with other software.

michaelpfost · November 17, 2022, 9:49pm

Originally on V2.6.4 I got .10 spf on Mac.

Now I am running 3 processes at .07 spf.
Mac Mini M1 16GB. I haven’t tried 4 yet.

That’s a 30% increase in speed if my math is right?

Screenshot 2022-11-17 at 1.46.53 PM

TPX · November 17, 2022, 10:06pm

0.10 / 0.07 ≈ 1.43, so about 143% of the original throughput.

That is roughly a 43% increase in processing performance.*

*If I’m not wrong.

michaelpfost · November 17, 2022, 10:30pm

Actually, I got lazy: it was .20 spf per process with three processes running, which works out to .067 spf.

There does seem to be some variation with the total remaining time left so I’d need an independent timer.

I’m thinking the delay every couple of seconds may be adding additional time that is not being reported in the spf. This delay seems to be slowly increasing now and I’m not sure now if I am actually gaining or not. Perhaps I should just test 2 processes.

I am using SSD drives, but maybe the ProRes LT is starting to bog it down.
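The seconds-per-frame arithmetic above can be written out as a small sketch. It assumes the three processes run at the same rate and that the reported spf values are trustworthy, which is not guaranteed.

```python
def combined_spf(per_process_spf: float, processes: int) -> float:
    """Effective seconds per frame when identical processes run in parallel."""
    return per_process_spf / processes


def speedup(old_spf: float, new_spf: float) -> float:
    """Throughput ratio; fewer seconds per frame means faster."""
    return old_spf / new_spf


baseline = 0.10                    # single process on V2.6.4 (from the post)
effective = combined_spf(0.20, 3)  # three processes at .20 spf each
print(f"effective: {effective:.3f} spf, "
      f"speedup: {speedup(baseline, effective):.2f}x")
```

Because spf is the inverse of fps, the speedup is the old value divided by the new one, not the other way round.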

TPX · November 17, 2022, 10:42pm

Three processes at 0.20 spf each work out to about 0.067 spf together.

If I’m not wrong, that is about 150% of the single-process throughput at 0.10 spf, so roughly 50% faster (when it’s stable).

A 50% gain is still more than a whole GPU or M1/M2 generation. (We sometimes gained only 34% - 40% in performance in recent years.)

Parallelisation is King.

michaelpfost · November 18, 2022, 3:22am

Well, I spoke too soon.

Screenshot 2022-11-17 at 7.05.46 PM

Each clip is one hour in length. I am just going from 480p to 720p. Same settings as my previous test (with three threads).

This new software is unpredictable. I don’t get why the time has increased so much when I’m running two processes instead of three.

Any ideas?

michaelpfost · November 18, 2022, 7:51am

Sorry, my seconds-per-frame calculations are not accurate: the estimated-time clock goes up and then down, and the render ends up taking more time than it estimates.

For a 60 minute clip going from 480p to 720p using Artemis (med), it estimated about 4.5 hours throughout the whole render. The first clip was 60 minutes and the second clip was 64 minutes.

Screenshot 2022-11-17 at 7.05.46 PM

But in reality it took over 5.5 hrs based on the file creation date and screen shot date. You can see the clock jumps all over the place over just a 40 second time interval.

So if it were indeed .15 spf then the completion time would be 4.5 hours. But the completion time was actually over 5.5 hours making it .20 spf.
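A way to cross-check the reported rate is to derive seconds per frame from wall-clock time instead. This sketch assumes a 30 fps clip, which is not stated in the post but is consistent with .15 spf giving 4.5 hours for 60 minutes of video.

```python
def measured_spf(elapsed_hours: float, frame_count: int) -> float:
    """Seconds per frame derived from total elapsed time."""
    return elapsed_hours * 3600 / frame_count


frames = 60 * 60 * 30  # 60-minute clip, ASSUMING 30 fps (not stated in the post)
print(f"estimated: {measured_spf(4.5, frames):.3f} spf")
print(f"measured:  {measured_spf(5.5, frames):.3f} spf")
```

At exactly 5.5 hours the measured figure comes out a little above .18 spf, so "over 5.5 hours" is in the neighbourhood of the rounded .20 quoted above.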

I don’t know what point I am trying to make, but the average frame rate is wrong.

ezgif-1-b7b0dd450a

TPX · November 18, 2022, 2:40pm

Did you work on the machine with other software at the same time?

I know from Nvidia that they slow down programs that work in the background to give the user a smooth experience.

This makes the process slower.

michaelpfost · November 18, 2022, 4:17pm

No, this was a dedicated machine.

Mac Mini M1 16GB - nothing else on it. Except the required internet.

reiner · November 18, 2022, 8:53pm

As mentioned above: output to single images, leave out the audio, and encode afterwards to your desired end format (is VP9 your goal?). Then remux the audio stream back into it. That’s the most crash-resilient, safest way to do it, and it also gives you the opportunity to tune the encoding afterwards.
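The encode-then-remux step could look roughly like the sketch below, assuming ffmpeg is the tool (not confirmed in this thread). The frame naming pattern, file names, H.264 encoder choice, and CRF value are all illustrative assumptions; TVAI’s actual image names may differ, and copying audio only works if the MP4 container supports the source audio codec.

```python
def encode_command(frame_pattern: str, fps: str,
                   audio_src: str, dst: str) -> list[str]:
    """Encode an image sequence and take the audio from the original file."""
    return [
        "ffmpeg",
        "-framerate", fps, "-i", frame_pattern,  # input 0: image sequence
        "-i", audio_src,                         # input 1: original file (audio)
        "-map", "0:v:0", "-map", "1:a:0",        # video from frames, audio from original
        "-c:v", "libx264", "-crf", "18", "-pix_fmt", "yuv420p",
        "-c:a", "copy",                          # keep the audio stream as-is
        dst,
    ]


# Hypothetical names; 30000/1001 is the exact form of 29.97 fps.
cmd = encode_command("frames/%06d.png", "30000/1001", "original.mp4", "out.mp4")
print(" ".join(cmd))
```

Keeping the frames on disk means the encode can be repeated with different settings without running the upscale again.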

07broom.fibulae · November 27, 2022, 2:00pm

Hi

I use LosslessCut to split a video into 4 segments (taking just a few seconds) for parallel processing in TVAI and then to join the up-scaled segments afterwards. I have discovered that I can up-scale 480p / 25fps to 1080 a bit faster than real-time on a base spec Mac Studio! Each of the 4 processes runs at around 6.5 fps when in parallel. If I run just one process, the best I can get is around 10 fps. I’m SO much looking forward to TVAI speed / efficiency improvements so I don’t have to go through these extra steps to take advantage of parallel processing.

Thanks.

Andy

ArtndFun · November 27, 2022, 8:39pm

Mr. reiner, that sounds interesting. Will that image creation be as fast as building a .264?

What images in 2.6.4 are lossless, and can you recommend a simple muxer?

reiner · November 28, 2022, 9:47am

Yes, image creation can even be the fastest output method.
PNG and TIF are lossless, JPG is not. Look up “image compression” on Wikipedia to get a grasp of the lossless/lossy concept.
Of course you need a lot of disk space, but you will keep most of the quality. 16-bit TIF will keep the most possible; PNG and TIF stay at the same level of quality as each other (PNG being a compressed format, while TIF in this case is purely uncompressed; it’s just there to give you a choice).
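To get a feel for "a lot of disk space", here is a back-of-the-envelope estimate for uncompressed RGB frames. It ignores file headers and assumes a one-hour clip at 30 fps and a 1280x720 output; these figures are illustrative, not taken from the posts.

```python
def frame_bytes(width: int, height: int,
                channels: int = 3, bits_per_channel: int = 16) -> int:
    """Raw pixel data for one uncompressed frame, in bytes."""
    return width * height * channels * bits_per_channel // 8


def sequence_gb(width: int, height: int, frames: int,
                bits_per_channel: int = 16) -> float:
    """Approximate size of an uncompressed image sequence in GB (10^9 bytes)."""
    per_frame = frame_bytes(width, height, bits_per_channel=bits_per_channel)
    return per_frame * frames / 1e9


frames = 60 * 60 * 30  # one hour, ASSUMING 30 fps
print(f"{sequence_gb(1280, 720, frames, 16):.0f} GB as 16-bit uncompressed")
print(f"{sequence_gb(1280, 720, frames, 8):.0f} GB as 8-bit uncompressed")
```

PNG output would normally be smaller than this because it is compressed, by an amount that depends on the content.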

Putting single files back into a video can be done in a lot of ways. One very simple and free method would be VirtualDub2. Of course you can try whatever video software you have at hand; most can import an image sequence and output it in whatever format you want.

Muxers: a very robust example for the MKV container would be MKVToolNix; for the MP4 container, there are a lot of MP4 tool GUIs for every operating system.