TVAI friendlier with RDNA 3?

Apologies if this has been addressed elsewhere in the forums and I just haven’t found it, but could someone help me understand what’s “going on behind the scenes” in the software that seems to have (previously) resulted in the RDNA 3 architecture (meaningfully) outperforming Ada Lovelace?

Specifically referencing this analysis by Puget Systems from earlier this year:

I’m really just curious to understand, broadly, why TVAI performed better with RDNA 3 despite there having been various TensorRT optimizations noted in the release changelogs.

Yes, I’m aware of the benchmarking section, but given the multitude of variables that affect performance from one system to another, I find those numbers less insightful. I’d just like to understand whether TVAI is fundamentally friendlier to the RDNA 3 architecture, despite having functionality that is exclusive to NVIDIA hardware.

The Puget Systems benchmark for TVAI is outdated: they were testing TVAI v3.0.11.

TVAI has used a new processing pipeline since v3.1, and speed has improved significantly in newer versions.
