Video AI 5.3.X - User Benchmarking Results

z1nonly · September 6, 2024, 12:47pm

If we are both running the same benchmark resolution, we should either both get “N/A” or both get results. I cannot think of a reason why a 13900K / 4080S rig would get results but a 13900K / 4090 would get N/A. (Other than system setup)

I think the TVB frequency clipping is a safety thing that lowers frequency (and thus voltage) at higher temperatures to avoid degradation. It’s called something different on my Asus board, but the result is lower boost clocks and lower voltage as temperatures increase.

The higher memory speed does kick the IMC voltage up to 1.4V but Intel seems to think the upper voltage limit for the cores needs to be 1.55V or less so I’m okay with 1.4V on the IMC. I know AMD mentioned their X3D cache layer is more sensitive than the CCD silicon, but the 32mb of on-die cache didn’t require any special voltage considerations.

I don’t know that 1.4V is safe long term, but Intel didn’t mention IMC voltage in their investigation of CPU failures and didn’t change it in the 0x129 microcode that’s supposed to save these processors from self-destruction, so I’m going with it.

Imo · September 6, 2024, 12:59pm

When I copy the benchmark results directly from Video AI, those results are “missing” as well, because certain models can’t run at those scales. It only looks a bit chaotic this way. As always.

Topaz Video AI Beta  v5.3.2.0.b
System Information
OS: Windows v11.23
CPU: 13th Gen Intel(R) Core(TM) i9-13900K  31.775 GB
GPU: NVIDIA GeForce RTX 4090  22.082 GB
GPU: Intel(R) UHD Graphics 770  0.125 GB
Processing Settings
device: 0 vram: 1 instances: 1
Input Resolution: 1920x1080
Benchmark Results
Artemis		1X: 	42.97 fps 	2X: 	19.67 fps 	4X: 	05.25 fps 	
Iris		1X: 	36.46 fps 	2X: 	23.68 fps 	4X: 	05.96 fps 	
Proteus		1X: 	41.49 fps 	2X: 	22.52 fps 	4X: 	05.93 fps 	
Gaia		1X: 	14.73 fps 	2X: 	10.14 fps 	4X: 	05.23 fps 	
Nyx		1X: 	17.65 fps 	2X: 	13.87 fps 	
Nyx Fast		1X: 	32.38 fps 	
Rhea		4X: 	04.52 fps 	
4X Slowmo		Apollo: 	37.56 fps 	APFast: 	78.54 fps 	Chronos: 	32.21 fps 	CHFast: 	34.18 fps 	
16X Slowmo		Aion: 	32.33 fps

kyle.topazlabs · September 6, 2024, 10:08pm

Was running multiple tests and added the wrong one to this thread after posting the v5.3.x results above it.

kyle.topazlabs · September 6, 2024, 10:09pm

Yeah, yeah trying to do too many things at once. It happens.

Alpheratz · September 7, 2024, 12:28am

I have no choice, my expensive license runs out very quickly

I just tried Rhea for the first time, on a bullshit 720x400 video upscaled to 4K, and I am completely stunned by the result on some test pictures. Going to encode the full video now.

I say “pictures” because several videos are VERY resistant, with a disappointing result that is impossible to patch before encoding: some show bright spots at random locations, ruining the stream.
It is very hard to reproduce; I will post it in the bugs section.

The X670E Aorus Extreme has arrived! A weekend of tests ahead. It’s so heavy, omg, a panzer.
I also received my NVMe Gen4 x4 to PCIe Gen4 x16 adapter; I will see how much bandwidth it loses. If it is little, I will order tons of them at $10 each, very worth it to get a PCIe x16 slot.

There’s also a fu**** confirmed bug on the X670E Crosshair with Ryzen 99xx, involving the thermal management workaround and CPU current delivery. The CPU instantly jumps to a sky-high 100% current load and the fans scream to death trying to cool this … $500, incredible.
It is impossible to reproduce; it is random, whenever the Asus chip (NCT6796D) decides.

Imo · September 7, 2024, 4:26pm

With my current setup (CPU Lite Load slightly adjusted from 6 to 5, a CPU Adaptive Offset of +0.015 V, and the CPU temperature limited to 95 °C), I manage to stay at around 5 GHz on the P-cores during processing from HD to Full HD via Iris MQ. The average VCore during processing is 1.216 V. The 4090 is undervolted to 0.875 V @ 2.4 GHz. Temperature at home is 28.0 °C today.

z1nonly · September 8, 2024, 5:30pm

I discovered the “tweaked” RAM timings were not fully stable, and when I tried using generic XMP I or XMP II the performance loss was bad enough that I decided to try and find a way to get some of it back. After some experimentation, it seems that the timings were more important than the raw MT/s.
By selecting the “tweaked” XMP profile and then manually reducing the speed from 7200 down to 7000, I was able to get performance on par with the 7200 speed. My IMC VDD is down to 1.350 V too.

Topaz Video AI  v5.3.1
System Information
OS: Windows v10.22
CPU: 13th Gen Intel(R) Core(TM) i9-13900K  47.781 GB
GPU: NVIDIA GeForce RTX 4080 SUPER  15.671 GB
Processing Settings
device: 0 vram: 1 instances: 0
Input Resolution: 1920x1080
Benchmark Results
Artemis		1X: 	36.69 fps 	2X: 	24.59 fps 	4X: 	05.99 fps 	
Iris		1X: 	33.45 fps 	2X: 	20.52 fps 	4X: 	05.96 fps 	
Proteus		1X: 	34.78 fps 	2X: 	25.39 fps 	4X: 	06.51 fps 	
Gaia		1X: 	11.53 fps 	2X: 	08.17 fps 	4X: 	05.21 fps 	
Nyx		1X: 	13.98 fps 	2X: 	11.79 fps 	
Nyx Fast		1X: 	26.50 fps 	
Rhea		4X: 	04.27 fps 	
4X Slowmo		Apollo: 	40.45 fps 	APFast: 	87.42 fps 	Chronos: 	24.78 fps 	CHFast: 	34.59 fps 	
16X Slowmo		Aion: 	53.26 fps
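
For anyone who wants to compare result blocks like the two posted so far without eyeballing them, here is a minimal Python sketch. It assumes the pasted lines keep the tab-separated `label: value fps` layout shown above; `parse_benchmark_line` and `compare` are hypothetical helper names, not part of Video AI.

```python
import re

# Matches one "label: value fps" pair, e.g. "2X: \t19.67 fps".
PAIR = re.compile(r"(\w+):\s*([\d.]+)\s*fps")

def parse_benchmark_line(line):
    """Return (model_name, {label: fps}) for one line of a Benchmark Results block."""
    # Assumption: the model name is everything before the first tab.
    name, _, rest = line.partition("\t")
    return name.strip(), {label: float(fps) for label, fps in PAIR.findall(rest)}

def compare(lines_a, lines_b):
    """Ratio a/b for every model and label present in both result sets."""
    a = dict(parse_benchmark_line(line) for line in lines_a)
    b = dict(parse_benchmark_line(line) for line in lines_b)
    return {
        (model, label): round(a[model][label] / b[model][label], 2)
        for model in a if model in b
        for label in a[model] if label in b[model]
    }

# Two lines from each result block in this thread, tabs written as \t.
rtx_4090 = ["Artemis\t\t1X: \t42.97 fps \t2X: \t19.67 fps \t4X: \t05.25 fps \t",
            "Nyx Fast\t\t1X: \t32.38 fps \t"]
rtx_4080s = ["Artemis\t\t1X: \t36.69 fps \t2X: \t24.59 fps \t4X: \t05.99 fps \t",
             "Nyx Fast\t\t1X: \t26.50 fps \t"]

ratios = compare(rtx_4090, rtx_4080s)
for key, ratio in sorted(ratios.items()):
    print(key, ratio)
```

A ratio above 1 means the first system was faster for that model and scale. Models or scales missing from either block are simply skipped.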

Alpheratz · September 8, 2024, 5:33pm

I discovered great instability over all releases since 5.0.4. The latest release just did something completely new: when adding a 4th GPU, it goes completely nuts.

One of the GPUs crashes and becomes invisible, and before that, instead of using “All GPU”, it uses only 2, and the worst of them …
I tried using the same generation, mixed generations, exactly identical cards (same board w/ same firmware), and multiple platforms; always the same problem: everything went wrong with multi-GPU after 5.0.4, and I think not many of us are seeing that.

vincentabel42 · September 8, 2024, 5:47pm

Yup, it usually is. Great job, btw!

vincentabel42 · September 8, 2024, 5:54pm

Intel will not recall already-sold CPUs for RMA.

Wow! I’m sorry to know that, Imo.

phew I got dehydrated just thinking about those thermals. j/k

vincentabel42 · September 8, 2024, 6:17pm

Wow! Interesting. That does seem weird.

Alpheratz · September 8, 2024, 6:30pm

Indeed, I returned to 5.0.4 because I was unable to address the issue.

The load is absolutely not what is advertised, but I will give it a try with Nyx under the exact Topaz conditions.
Here I am on Proteus (x4) + Artemis.

There’s a bottleneck, but where … I would be very, very happy if it were on my side, but I did so many tests, on so many platforms, GPUs and mobos; always the same issue.

So we are very far from almost perfect scalability.

gene-8240 · September 8, 2024, 9:04pm

5.0.4 was less prone to crashing on my system with multi-GPU, but my multi-GPU processing speed was actually slower than it was with a single GPU.

Alpheratz · September 8, 2024, 9:25pm

I’ve got the same problem here 1 RTX 4090 >> 1 RTX 4090 + 2 RTX 3070 …

gene-8240 · September 8, 2024, 9:49pm

Multi-GPU usually doesn’t work well unless the processors are equivalent in speed. Mismatched processors usually produce a combined speed somewhere between the fastest and slowest processor. Which pretty much negates the whole point of multi-GPU.
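
A rough way to see where "somewhere between the fastest and slowest" comes from is to compare two idealized ways of distributing frames. This is only a sketch: the per-GPU rates are hypothetical numbers, both function names are made up for illustration, and nothing here claims to describe how Video AI actually schedules work.

```python
def static_split_fps(speeds):
    """Frames are divided equally up front, so the job ends when the slowest GPU finishes.

    With N frames and k GPUs, each GPU gets N/k frames; the slowest one needs
    (N/k) / min(speeds) seconds, which gives k * min(speeds) frames per second overall.
    """
    return len(speeds) * min(speeds)

def shared_queue_fps(speeds):
    """Each GPU pulls the next frame as soon as it is free, so the rates add up.

    This ignores transfer and synchronization overhead (an idealization).
    """
    return sum(speeds)

# Hypothetical per-GPU rates in fps, not measurements from this thread:
# one fast card and two slow ones.
speeds = [20.0, 5.0, 5.0]

print(static_split_fps(speeds))   # 15.0, slower than the fast card alone
print(shared_queue_fps(speeds))   # 30.0, the idealized upper bound
```

With an equal up-front split, the fast card sits idle while the slow ones finish, which lands the combined speed between the slowest and fastest card. A shared queue avoids that in principle, but only if nothing else in the pipeline is the limit.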

Alpheratz · September 8, 2024, 10:53pm

With a well-handled queue this shouldn’t occur, especially when using tiles or GOPs.

Above is 4x MSI RTX 3070 GAMING Z TRIO LHR
One of them runs well, 99% load
With two of them, problems start. One is almost fully loaded, the second is jerky
With three to four, they run at 30-60%, very jerky, unable to get a full load.

It was said elsewhere that the CPU is unable to feed them. Well … if so, the CPU should be at 100% load and agonizing. That is not the case, whichever CPU I use (9950X / 2990WX), and on paper neither of them is pure bullshit.

Found it back

8K input … ok … building an 8K sample test file with massive noise

Got it ! It FULLY depends on the model
Gaia full 100% everywhere (the useless model )

Proteus : no 100%, Artemis no 100%, so going to test it all and publish true results

I edited my previous post with a long-term estimate, not just the first 5 minutes.

EDIT: Chances are the advert is TRUE. I got a 100% mega full load over all GPUs while the CPU is sleeping or almost, so Nyx IS multi-GPU optimized; the others … benchmarks soon.

So there is nothing to hunt for on my hardware; it runs fine. It is mostly very model-dependent, and it also depends on the scale factor you use.

z1nonly · September 9, 2024, 1:08am

I missed this detail when I first replied, but CEP doesn’t like it when AC and DC LLC are not 1 to 1. Before 0x129, my Asus board was defaulting to 0.4 and 1.1. I had to manually match them. 0x129 defaults to 0.7 something for both. (On my board at least)

topaz257 · September 9, 2024, 3:28pm

The main bottleneck when doing any CPU-related task (e.g. upscaling) is usually the RAM speed. Run, for example, Proteus with manual settings and no scaling and you will get much better GPU utilization compared to using auto values and upscaling.

Alpheratz · September 9, 2024, 5:17pm

I never use automatic values.
But you are right about one thing: scaling SHOULD NOT be CPU-bound, yet that seems to be the case.

I don’t know if bilinear or bicubic is used, but GPUs should handle it like at least 100x faster for the same quality.

My RAM is top on the market, did not find better for Zen5 arch. 30-36-36-72 @ 6000MT/s

84K read / 81K write / 78K copy, isn’t that enough?!? Or the algo is crap …
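
For reference, the 6000 MT/s and 30-36-36-72 figures can be put into perspective with some quick arithmetic. This is a sketch under two assumptions that are not stated in the post: that the kit runs in dual channel (128-bit total bus width), and that "84K" means 84,000 MB/s. The function names are only illustrative.

```python
def peak_bandwidth_mb_s(mt_per_s, bus_width_bits=128):
    """Theoretical peak in MB/s: mega-transfers per second times bytes per transfer.

    bus_width_bits=128 assumes a dual-channel desktop setup (assumption).
    """
    return mt_per_s * bus_width_bits // 8

def cas_latency_ns(cl, mt_per_s):
    """CAS latency in nanoseconds: CL clock cycles at a memory clock of (MT/s / 2) MHz."""
    return cl * 2000 / mt_per_s

peak = peak_bandwidth_mb_s(6000)    # 96000 MB/s theoretical peak
latency = cas_latency_ns(30, 6000)  # 10.0 ns for CL30 at 6000 MT/s
efficiency = 84000 / peak           # 0.875, assuming "84K" is 84,000 MB/s

print(peak, latency, efficiency)
```

Under those assumptions the measured read figure is already close to the theoretical ceiling, so there would be little headroom left on the RAM side itself.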

topaz257 · September 9, 2024, 6:56pm

It seems there is just no CPU+RAM combination that can upscale the image fast enough for a 4090. The 4090 can simply process a single frame faster than any CPU+RAM combo out there can upscale it, at least with the current code TVAI is using for upscaling. Maybe better utilization of many CPU cores, by upscaling multiple frames in parallel or something similar, could help.
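
The argument can be written down as a tiny two-stage pipeline model: the slower stage sets the overall frame rate. This is a sketch with hypothetical numbers and hypothetical function names, and it assumes CPU upscaling scales perfectly across workers, which real code rarely does.

```python
def pipeline_fps(cpu_fps_per_worker, workers, gpu_fps):
    """Throughput of a CPU-upscale + GPU-model pipeline, capped by the slower stage."""
    cpu_fps = cpu_fps_per_worker * workers  # assumes perfect scaling across cores
    return min(cpu_fps, gpu_fps)

def gpu_utilization(cpu_fps_per_worker, workers, gpu_fps):
    """Fraction of the time the GPU has a frame to work on."""
    return pipeline_fps(cpu_fps_per_worker, workers, gpu_fps) / gpu_fps

# Hypothetical rates, not measurements: one CPU worker upscales 6 fps,
# while the GPU could run the model at 24 fps.
for workers in (1, 2, 4, 8):
    print(workers,
          pipeline_fps(6.0, workers, 24.0),
          gpu_utilization(6.0, workers, 24.0))
```

In this model a single upscaling worker leaves the GPU at 25% utilization, and four workers are enough to saturate it. Past that point, adding more workers changes nothing because the GPU becomes the limit.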