TVAI 3.1.x performance on AMD GPUs

Users with AMD GPUs: please let us know the average performance improvement you are seeing between TVAI 3.0.x and 3.1.x. Please include the following information in your reply:
GPU utilization (from GPU-Z or a similar app), CPU, RAM, GPU, VRAM, input resolution, model used, output resolution, and fps or slow-motion amount.

4 Likes

I have to look at the logs again. What I meant is that it runs at 36 fps for a second or two right after processing starts, and then it collapses.

But I may have been mistaken, which is why I want to verify it.

The performance on my AMD GPU is so slow that I rarely test it, but I might give it a try again. Here are my specs: AMD Ryzen 7 5800U (8 cores, 16 threads), 16 GB 4.1 GHz RAM, AMD RX Vega 8 (5000) 2 GHz, 2 GB VRAM.

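| Model (V2.6.4) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load |
|---|---|---|---|---|---|
| Proteus | 0.12 | 15300 | 2629 | 10-20% | 2-94% |
| Artemis LQ | 0.09 | 15300 | 2690 | 8-30% | 5-90% |
| Chronos v3 f | 0.03 | 14500 | 1718 | 19-38% | 6-8% |
| Stabilisation | | | | | |
| Gaia HQ | 0.12 | 15500 | 3847 | 5-20% | 20-90% |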

| Model (V3.0.12) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load |
|---|---|---|---|---|---|
| Proteus | 0.15 | 16300 | 3911 | 9-15% | 5-45% |
| Artemis LQ | 0.12 | 15900 | 3802 | 13-20% | 8-70% |
| Chronos v3 f | 0.12 | 16300 | 4054 | 10-18% | 6-60% |
| Stabilisation | 0.41 | 16500 | 5421 | 5-15% | 9-60% |
| Gaia HQ | 0.07 | 16000 | 4817 | 12-20% | 6-100% |

| Model (V3.1.1) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load |
|---|---|---|---|---|---|
| Proteus | 0.12 | 15400 | 4726 | 6-13% | 6-80% |
| Artemis LQ | 0.08 | 15300 | 4581 | 3-10% | 30-70% |
| Chronos v3 f | 0.07 | 15600 | 5643 | 8-15% | 30-70% |
| Stabilisation | 0.11 | 16300 | 6727 | 5-15% | 30-80% |
| Gaia HQ | 0.09 | 15300 | 5956 | 5-10% | 50-98% |
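To make the version-to-version change easier to read, here is a minimal sketch that converts the seconds-per-frame figures from the V3.0.12 and V3.1.1 tables above into fps and a speed-up factor. The dictionary and the helper names `fps` and `speedup` are made up for illustration; only the numbers come from the tables.

```python
# Seconds per frame copied from the V3.0.12 and V3.1.1 tables (lower is faster).
SEC_PER_FRAME = {
    "Proteus":       {"3.0.12": 0.15, "3.1.1": 0.12},
    "Artemis LQ":    {"3.0.12": 0.12, "3.1.1": 0.08},
    "Chronos v3 f":  {"3.0.12": 0.12, "3.1.1": 0.07},
    "Stabilisation": {"3.0.12": 0.41, "3.1.1": 0.11},
    "Gaia HQ":       {"3.0.12": 0.07, "3.1.1": 0.09},
}


def fps(seconds_per_frame: float) -> float:
    """Frames per second is the reciprocal of seconds per frame."""
    return 1.0 / seconds_per_frame


def speedup(old_spf: float, new_spf: float) -> float:
    """Values above 1.0 mean the newer version is faster."""
    return old_spf / new_spf


if __name__ == "__main__":
    for model, spf in SEC_PER_FRAME.items():
        old, new = spf["3.0.12"], spf["3.1.1"]
        print(f"{model:<14} {fps(old):5.2f} fps -> {fps(new):5.2f} fps "
              f"(x{speedup(old, new):.2f})")
```

On these figures Stabilisation improves the most (roughly 3.7x), while Gaia HQ is slightly slower in V3.1.1 than in V3.0.12.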


| State | RAM (MB) | VRAM (MB) |
|---|---|---|
| Closed, no active app | 13424 | 1215 |

Input File:
Res: 720x576
Format: .mov

Output file:
Res: 1080p
Format: .mp4
188% scale H.264 (AMD)

System specs:
CPU: AMD Threadripper 3960X, 128 GB RAM (ECC on), SMT off (24 cores only)
GPU: AMD Radeon PRO W6800, 32 GB (ECC on)
OS: Microsoft Windows 11 Pro for Workstations
Version 10.0.22621 Build 22621
Mainboard: PRIME TRX40-PRO
GPU driver: Radeon PRO V22.11.2, dated 30.11.22


GPU Memory Visualisation while Proteus is processing.



Visualisation of Proteus during export.

4 Likes

This is outstanding, a very information-rich post. Thanks for sharing. I am on a 6900 XT LC myself and will try to provide results in a similar format. Please keep it coming.

1 Like

Thanks,

If you enter the data in a spreadsheet and copy it here, the tables will be created automatically in the post.

1 Like

Thank you @TPX
Still waiting on AMD to get back to me with some details. Hopefully will have the first alphas out in a few weeks.

4 Likes

OK,
what’s the problem?

DirectML or something else?

Sometimes it feels like it’s the driver, because there is no way to change it.

1 Like

Most likely. I’m hoping it is something we can work around. The gist is that we need to copy data from RAM to VRAM, process it, and copy the output from VRAM back to RAM in a staggered manner. With the ORT+DirectML implementation that is not happening, which leaves the GPU waiting. The multiple-process approach overcomes this, but that’s not really the solution.
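For readers unfamiliar with the idea, the staggered copy/process/copy pattern can be sketched as a three-stage pipeline. This is only an illustration in plain Python with threads and queues, not how TVAI, ORT, or DirectML are implemented: the `upload`, `process`, and `download` functions are hypothetical stand-ins that just sleep, and `STAGE_SECONDS` is a made-up cost.

```python
import queue
import threading
import time

STAGE_SECONDS = 0.01  # made-up cost per stage, for illustration only
_DONE = object()      # sentinel that tells each stage to shut down


def upload(frame):    # stand-in for the RAM -> VRAM copy
    time.sleep(STAGE_SECONDS)
    return frame


def process(frame):   # stand-in for running the model on the GPU
    time.sleep(STAGE_SECONDS)
    return frame * 2


def download(frame):  # stand-in for the VRAM -> RAM copy
    time.sleep(STAGE_SECONDS)
    return frame


def run_serial(frames):
    """Each frame finishes all three steps before the next one starts."""
    return [download(process(upload(f))) for f in frames]


def _stage(fn, inbox, outbox):
    """Apply fn to items from inbox until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)
            return
        outbox.put(fn(item))


def run_staggered(frames, depth=2):
    """Overlap the three steps so one frame uploads while another is processed."""
    q_in, q_up, q_proc, q_out = (queue.Queue(maxsize=depth) for _ in range(4))

    def feed():
        for f in frames:
            q_in.put(f)
        q_in.put(_DONE)

    workers = [
        threading.Thread(target=feed),
        threading.Thread(target=_stage, args=(upload, q_in, q_up)),
        threading.Thread(target=_stage, args=(process, q_up, q_proc)),
        threading.Thread(target=_stage, args=(download, q_proc, q_out)),
    ]
    for w in workers:
        w.start()

    results = []
    while True:
        item = q_out.get()
        if item is _DONE:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    frames = list(range(30))
    for name, fn in (("serial", run_serial), ("staggered", run_staggered)):
        start = time.perf_counter()
        fn(frames)
        print(f"{name}: {time.perf_counter() - start:.2f} s")
```

In the serial version the processing stage sits idle during every copy, which is the "GPU waiting" situation described above. In the staggered version each stage works on a different frame at the same time, so the total time approaches the cost of the slowest stage per frame.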

There also seems to be some kind of waste within ORT+DirectML. There are lots of possibilities; the hope is to get the same level of performance from a single process that we currently get from multiple processes. We will also be looking at RML to see if it provides a further boost.

6 Likes

Thanks for the Info!

I turned on L1 prefetch again, but it does not seem to matter for TVAI.
It is more like putting more power on the CPU.

With L1 prefetch enabled, the only thing you notice while working is that the screen builds up faster here and there.


CPU: AMD Threadripper 3960X, 128 GB RAM (ECC on), SMT off (24 cores only)
GPU: AMD Radeon PRO W6800, 32 GB (ECC on)
OS: Microsoft Windows 11 Pro for Workstations
Version 10.0.22621 Build 22621
Mainboard: PRIME TRX40-PRO
GPU driver: Radeon PRO V22.11.2, dated 30.11.22

Mem = 50%

| Model (V3.1.2.0.b) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load | 10 sec preview time | AI processor |
|---|---|---|---|---|---|---|---|
| Proteus | 0.09 | 15700 | 2657 | 20-35% | 2-94% | 56 sec | Auto |
| Artemis LQ | 0.08 | 15700 | 2625 | 8-30% | 5-90% | 49 sec | Auto |
| Chronos v3 f | 0.08 | 15800 | 4945 | 19-38% | 2-94% | 47 sec | Auto |
| Stabilisation | 0.22 | 16123 | 3861 | 12-30% | 60-80% | 1m45sec | Auto |
| Gaia HQ | 0.08 | 15700 | 6204 | 5-20% | 60-90% | 43 sec | Auto |


Just for Comparison
Nvidia Quadro RTX 5000 / 16GB VRAM (like 2080S), Driver: 528.24
Intel 7820X, 8C/16T, 64 GB RAM
Win 11 Pro


Mem = 50%

| Model (V3.1.2.0.b) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load | 10 sec preview time | AI processor |
|---|---|---|---|---|---|---|---|
| Proteus | 0.05 | 11715 | 2584 | 12-20% | 40-50% | 30 sec | Auto |
| Artemis LQ | 0.04 | 11234 | 2461 | 12-20% | 60-66% | 22 sec | Auto |
| Chronos v3 f | 0.04 | 11584 | 3075 | 12-24% | 55-62% | 37 sec | Auto |
| Stabilisation | 0.12 | 12200 | 4816 | 16-30% | 30-45% | 1m40sec | Auto |
| Gaia HQ | 0.08 | 10899 | 3763 | 9-16% | 95-97% | 45 sec | Auto |



Mem = 50%

| Model (V3.1.1, AMD system) | Sec/frame | RAM (MB) | VRAM (MB) | CPU load | GPU load | 10 sec preview time | AI processor |
|---|---|---|---|---|---|---|---|
| Proteus | 0.11 | 16100 | 5183 | 6-13% | 6-80% | 58 sec | Auto |
| Artemis LQ | 0.08 | 15900 | 5071 | 13-30% | 30-70% | 46 sec | Auto |
| Chronos v3 f | 0.07 | 16200 | 6100 | 20-30% | 30-70% | 58 sec | Auto |
| Stabilisation | 0.11 | 17200 | 7200 | 20-30% | 30-80% | 1m45sec | Auto |
| Gaia HQ | 0.08 | 15400 | 6420 | 10-20% | 50-98% | 48 sec | Auto |


Output file:
Res: 1080p
Format: .mp4
188% scale H.264 (AMD)


2 Likes

Can Topaz distribute test videos and benchmarks to beta testers so that the bottlenecks can be investigated with a common test?

Currently, each tester is using different videos and different methods to investigate, so the test results are inconsistent.

For example, we could prepare 480i, 480p, 1080i, and 1080p test videos, run common tests such as 2x and 4x scale, and have testers report the specs of each PC (CPU, GPU, MEM, SSD, HDD…). If the results could then be compared across machines, it would be easier to grasp the cause of the problem.

The benchmark results would also help users work out what they can do to improve the speed of their PCs.

5 Likes

I agree. Suggested something similar as well some time ago.

2 Likes

I have always wanted it too :slight_smile:
Will add a benchmark menu item that just runs a few models and reports the numbers, along with system info, so they are easy to post.
Will do this once all the performance improvements are finished.
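Until such a menu item exists, a report in a common format can be approximated by hand. The following is a minimal sketch using only the Python standard library; it is not the planned TVAI feature, and `system_info`, `seconds_per_frame`, `format_report`, and the placeholder workloads are all hypothetical names invented for this example. It does not query the GPU, which would need a vendor tool.

```python
import os
import platform
import time


def system_info():
    """Collect basic system details from the standard library (no GPU data)."""
    return {
        "OS": f"{platform.system()} {platform.release()}",
        "Machine": platform.machine(),
        "CPU": platform.processor() or "unknown",
        "Logical cores": os.cpu_count(),
    }


def seconds_per_frame(process_frame, n_frames):
    """Time n_frames calls of process_frame and return the average per frame."""
    start = time.perf_counter()
    for i in range(n_frames):
        process_frame(i)
    return (time.perf_counter() - start) / n_frames


def format_report(results, info):
    """Render system info plus a table that can be pasted into a forum post."""
    lines = [f"{key}: {value}" for key, value in info.items()]
    lines.append("")
    lines.append("| Model | Sec/frame | FPS |")
    lines.append("|---|---|---|")
    for model, spf in results.items():
        lines.append(f"| {model} | {spf:.3f} | {1.0 / spf:.2f} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder workloads standing in for real models; they only sleep.
    fake_models = {
        "model A": lambda i: time.sleep(0.002),
        "model B": lambda i: time.sleep(0.005),
    }
    results = {name: seconds_per_frame(fn, 10) for name, fn in fake_models.items()}
    print(format_report(results, system_info()))
```

Reporting both seconds per frame and fps in one table keeps posts comparable with the figures already shared in this thread.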

8 Likes

TVAI will become a new benchmark tool for GPUs and CPUs. :joy:

At the same time as you bring back the model manager menu, right? :stuck_out_tongue:

Here is a version for anyone with the issue to test:

1 Like
