Performance issues

Hey,

I haven’t used TVAI in a while as I had no need for it, and only came back to it today because a new upscale project is at hand.

I am using the newest TVAI version.

Source material is 480p animation/cartoon, upscaled to 1080p with Artemis Medium Quality, so nothing out of the ordinary.

My system consists of a 7950X3D, 7900 XTX and 32GB DDR5 6000.

I am quite certain that in past projects I got well above 20 fps with this hardware.

Now, however, I am stuck at 11-12 fps on a single encode.

Weirdly enough, running multiple encodes increases the total fps. Running 3 encodes gives around 5.5 fps each, about 16.5 fps in total, while 4 encodes sit at around 4.5 fps each, about 18 fps in total. Both outdo a single encode.

Running 3 encodes puts my CPU at around 65~75% usage and my GPU at around 55~65% usage according to AMD’s overlay, (in)accurate as that may be. So neither my CPU nor my GPU is being maxed out. In fact, they are chilling, relatively speaking.

I am at a complete loss as to what’s going on. The problem exists with an NVMe drive as well as an external USB HDD, so the disks aren’t the bottleneck. The XTX is set as the AI accelerator.

Are there any known issues or some kind of direction I could be pointed towards?

EDIT: Benchmark results

Topaz Video AI  v5.3.6
System Information
OS: Windows v11.23
CPU: AMD Ryzen 9 7950X3D 16-Core Processor            31.632 GB
GPU: AMD Radeon RX 7900 XTX  23.938 GB
Processing Settings
device: 0 vram: 0.9 instances: 1
Input Resolution: 640x480
Benchmark Results
Artemis		1X: 	126.98 fps 	2X: 	59.24 fps 	4X: 	16.28 fps 	
Iris		1X: 	118.37 fps 	2X: 	73.63 fps 	4X: 	24.74 fps 	
Proteus		1X: 	133.20 fps 	2X: 	75.95 fps 	4X: 	25.11 fps 	
Gaia		1X: 	57.09 fps 	2X: 	39.09 fps 	4X: 	21.19 fps 	
Nyx		1X: 	42.05 fps 	2X: 	30.34 fps 	
Nyx Fast		1X: 	71.23 fps 	
Rhea		4X: 	16.85 fps 	
4X Slowmo		Apollo: 	142.26 fps 	APFast: 	273.74 fps 	Chronos: 	62.32 fps 	CHFast: 	98.34 fps 	
16X Slowmo		Aion: 	162.05 fps 	

I agree that it should be running faster… But there are no benchmarks of that card on the current version, and I do know that Nvidia cards have a speed advantage from the Tensor versions of the AI models. So maybe that speed is what you are limited to?

I also thought that it might’ve been performance regressions - so to speak - as models progressed and advanced.

However, it still seems nonsensical that neither the CPU nor the GPU is pushed to 100% utilization with 3 simultaneous encodes.

Assuming there are no bugs at play here, there must be some form of bottleneck which I don’t see.

It is also weird that multiple encodes produce a higher total fps than a single one. I’d expect both to hit the same overall cap, but going from 11.5 to 16.5 fps is an increase of roughly 43%.
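For anyone who wants to redo that math, here is a quick sketch in Python. It only uses the rough figures quoted in this thread (about 11.5 fps for a single encode, about 5.5 fps each at 3 encodes, about 4.5 fps each at 4). The function names are made up for illustration and have nothing to do with TVAI itself.

```python
def aggregate_fps(per_encode_fps, encodes):
    """Combined throughput when `encodes` jobs each run at `per_encode_fps`."""
    return per_encode_fps * encodes


def percent_gain(baseline_fps, new_fps):
    """Relative improvement of `new_fps` over `baseline_fps`, in percent."""
    return (new_fps - baseline_fps) / baseline_fps * 100.0


# Approximate numbers reported in this thread, not fresh measurements.
single_fps = 11.5
three_total = aggregate_fps(5.5, 3)
four_total = aggregate_fps(4.5, 4)

for label, total in (("3 encodes", three_total), ("4 encodes", four_total)):
    gain = percent_gain(single_fps, total)
    print(f"{label}: {total:.1f} fps total, {gain:.0f}% over a single encode")
```

If the pipeline were truly capped by the CPU or GPU, the totals would be expected to stay near the single-encode figure instead of rising with each extra job.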

In both instances total performance ranges from atrocious to quite bad, though.

As of the latest release, 5.5.0, I am back to around 16 to 17 fps per encode with 3 concurrent encodes, for nearly 50 fps in total, without having made any changes to the system.

So I think it is fair to assume that it was something TVAI-related.

That’s how it was on Nvidia RTX cards before they made the tensor versions of the models. I could get double or triple the speed by doing more at a time. Now I don’t seem to get more than a 10% speed increase if I do two at a time. That’s not really worth it to me right now.

I also stand corrected: it’s still slow.

The difference between quite slow and weirdly fast seems to have resided with the source material’s resolution.

Artemis Medium Quality 480p → 1080p = slow. (1080/480 = 2.25)

Artemis Medium Quality 540p → 1080p = lightning fast (in comparison) (1080/540 = 2.00)
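The two ratios above can be double-checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and whether an integer versus a fractional factor really explains the speed gap is a guess, not confirmed TVAI behaviour.

```python
from fractions import Fraction


def scale_factor(src_height, dst_height):
    """Exact upscale ratio between source and target height."""
    return Fraction(dst_height, src_height)


def is_integer_scale(src_height, dst_height):
    """True when the target height is a whole multiple of the source height."""
    return scale_factor(src_height, dst_height).denominator == 1


for src in (480, 540):
    factor = scale_factor(src, 1080)
    kind = "integer" if is_integer_scale(src, 1080) else "fractional"
    print(f"{src}p -> 1080p: x{float(factor):.2f} ({kind})")
```

Only the 540p source lands on a whole-number factor; the 480p source needs 9/4, which does not match the 1X/2X/4X steps listed in the benchmark.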

The fractional scaling seems to be the culprit, for reasons beyond my personal understanding.

It might just be that fractional scaling needs that much more math to get right. No idea.