How much GPU should Video AI be using?

So I just bought this, and after running it I'm noticing only 2GB of VRAM being used.
I have some experience doing ML myself, and for something involving image generation this seems low to me.
Is the GPU being fully utilized? I have a 4090.


I have a 12GB 6700 XT, and when the software actually works it only uses about 40% of the GPU and a third of the RAM. The CPU (16 threads) sits at around 10%.

I sit there going “Why??? Why??? Use ALL THE POWER DAMMIT!”

Why can’t they just add a ‘Turbo Mode’? The same goes for Gigapixel. I don’t get why such potentially intensive apps are so timid about using the available horsepower. I don’t want to run three instances side by side slowly; I just want to churn through one really fast.


It’s a common part of personal-computer history that applications, and at times operating systems, don’t perform well, and that the user has to buy a new PC every year to make things run acceptably again. This started in the 1990s, when people thought a personal computer would be better than a Macintosh, an Atari, or an Amiga because they expected to buy a PC once and keep it up to date by replacing parts forever. In reality they bought a new PC every year, because the latest Windows and applications kept getting slower and slower, and the software developers didn’t care much, often giving the same tip: buy a new PC… :neutral_face:

Oh indeed. I’ve seen developers use advances in hardware to cover up their poor skills many a time.

Unfortunately, they have now been overtaken and they can’t harness the power properly.

“What? Customers want the code to recognise more than two CPU cores and more than 2GB of RAM in 2022???”


It’s just that with a 4090 I thought it would go faster than it’s doing right now, but it doesn’t seem to make a difference.

I have a 4090 too, combined with a 5950X, and it doesn’t really go much faster than the 3080 I had before.
But it can do more work in parallel, which is also the only way to get anywhere close to full utilization: 3x Artemis 1080p denoising tasks in parallel are no problem. It’s also about 50% more power efficient than the 3080 was.
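
If it helps anyone reproduce that, here is a minimal command-line sketch of the “several exports in parallel” idea. The per-clip script is just a placeholder for however you drive your exports (GUI presets or your own ffmpeg line), not an actual Topaz command:

```
# Hypothetical wrapper: run_1080p_denoise.sh <clip> runs one Artemis export.
for clip in tape1.mp4 tape2.mp4 tape3.mp4; do
  ./run_1080p_denoise.sh "$clip" &   # start each export as a background job
done
wait   # return once all three parallel exports have finished
```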


My 4090 gets either 100% utilisation or 44%, depending on the settings.

What are the settings where you get 100% utilization, and what’s your CPU?

I only manage close to 100% utilization of my 4090 when I, for example, run 3x 1080p Artemis High denoise with the NVENC H.265 encoder (needs a driver patch for 3 parallel H.265 encode streams), using my 5950X locked @ 4.6GHz @ 1.3V.

Note: all are 1080p tasks, but the 2 tasks running @ 3.8fps are using a 130 pixel crop at top and bottom (using ffmpeg pad to re-add black borders for essentially no cost)
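
For anyone wanting to try that crop/pad trick, here is a minimal ffmpeg sketch; the filenames and the middle enhancement step are placeholders, not the exact commands from the run above:

```
# Drop the 130 px bars before processing: 1080 - 2*130 = 820 active lines.
ffmpeg -i in.mp4 -vf "crop=1920:820:0:130" cropped.mp4
# ... run the enhancement on cropped.mp4, producing enhanced.mp4 ...
# Pad the black bars back afterwards (essentially free).
ffmpeg -i enhanced.mp4 -vf "pad=1920:1080:0:130:black" out.mp4
```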

The great thing about the 4090 (2700MHz @ 0.925V) is that it needs less than 200W for these 3 parallel tasks.

The PC I just built last week has a 7950X CPU, 128GB of DDR5, and a 4090. To be clear and not mislead you: there are still some combinations that don’t fully utilize the GPU for me. Mainly it seems to be related to frame interpolation while deinterlacing, but not always.

There is a test file and recommended settings in this thread: Comparison thread: How many seconds per frame for various processors/GPUs?
When I did that test I was getting 100% utilization of the 4090. Could you try that test and see if you are also getting 100%? Settings: 2x upscale, Gaia HQ, computer generated, no FPS change. I just reran that test and it completed in 3 minutes 44 seconds (0.096551 sec/frame).
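
For reference, sec/frame is just the total processing time divided by the number of frames: 3 min 44 s is 224 s, and 224 ÷ 0.096551 ≈ 2320, so the test clip works out to roughly 2,300 frames. Divide your own total seconds by the same frame count to compare.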

I was doing a back-to-back test between AV1, H.264, and H.265 yesterday at varying bitrates. The input footage was XDCAM50 MXF (that is, interlaced 25i, 1920x1080, 4:2:2, 50Mbps). Using just the Dione DV deinterlace to 1080p25 I was seeing 100% utilisation.
I’ve just repeated that XDCAM50 test, but with 4K 50p ProRes output I was getting 25%-ish 4090 utilization. Deinterlacing to 1080p25 using Dione DV, it’s 100% GPU utilization.

This morning I took a noisy 1920x1080 25p 11.5Mbps H.264 MOV. I had a constant 80%-100% utilization of the 4090 while converting it to 4K 50p using Chronos and the Artemis Low Quality (noisy) setting. Then, trying to match your settings with that same file, I chose Artemis High Quality with the same output resolution and framerate (1920x1080 25p), used the NVIDIA H.265 Main encoder, and got the same 90%+ utilisation for the entire file at 0.13 sec/frame.

I haven’t got the ffmpeg commands to work yet, so I’ve been using the GUI.

I tested the sample movie with Gaia 2x and got a processing time of 4:52 (5950X + 4090). GPU utilization was only fluctuating between 50% and 60%, though (according to MSI Afterburner). Your time in the comparison thread for that task was 4:17. That ~14% difference still wouldn’t bring GPU utilization close to 100%. But did you say that you managed 3:44 in the same test on another occasion? That would be about 30% better performance and would mean noticeably higher GPU utilization (roughly 65%-78%, i.e. 50%-60% × 1.3).

I could have had other stuff running in the background during the other test; there was nothing else open when I got that lower time this morning.

I’ve noticed a performance regression compared to before.
It feels like “2022-11 Cumulative Update for Windows 11 Version 22H2 for x64-based Systems (KB5019980)” was the only change I had made, besides updating to 3.0.2.
Instead of my previous personal best of 0.09 sec/frame, I’m now getting in the region of 0.14-0.22 sec/frame and a maximum of 60% utilization.

I tried rolling back that Windows update as well as going back to 3.0.2, but I can’t get the old performance back.

I’ve also tried 3 different NVIDIA drivers since noticing the issue, and none of them affect the result.

Not sure what to do

Edit:
In the NVIDIA Control Panel: let the 3D application decide quality, and Manage 3D settings → Global Settings → Power management mode = “Prefer maximum performance”.

That seems to have got it back to 100% usage. It would be good to see if some of the other 4090 users can recreate this and maybe get themselves more performance.
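
If you want to check whether the card is actually holding its clocks during a run, one option is nvidia-smi, which ships with the driver (run it from cmd or PowerShell while a job is processing); the columns below are just a few that seem useful:

```
# Print GPU utilization, SM clock, power draw and P-state once per second.
nvidia-smi --query-gpu=utilization.gpu,clocks.sm,power.draw,pstate --format=csv -l 1
```

If the power-management setting is doing anything, the SM clock should stop sagging during the lighter parts of the run.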


Something has always been fishy here (even with the 2.x series). I have an RTX 3080 Ti and have set memory usage to the default 90% (you always want a little left for your YouTube experience in between). To my astonishment, though, GPU-Z only reports about 4GB of memory in use (and less when not upscaling to 4K), whereas I have 12GB available.

As long as that doesn’t equate to ‘Always use the GPU at 100% load’, then this sounds interesting. :slight_smile:

No, I don’t think it’s like that; more likely it stops the GPU from downclocking for smaller workloads or something. However, I think it’s too early to call whether it fixes things. I ran the same test again and was getting 0.11 sec/frame, which is fine compared to my best ever of 0.09 sec/frame for the same test. However, this time it was at 10-12% GPU utilisation, instead of the 100% I saw when I scored 0.09 sec/frame…

But immediately before I posted my edit above, I had gone back to 100% GPU utilization and 0.09 sec/frame. So there’s a definite issue somewhere, and I’m not sure if it’s Topaz, Windows, or the NVIDIA drivers.

I’ve also seen other issues tied to the GPU scheduler (I’ve tried with and without that option turned on), so maybe I should just wait a few weeks and see if a Windows update fixes it.

It’s unfortunate that, depending on which way the wind blows, it can be either 0.09 sec/frame or 0.4 sec/frame if there’s some slight change somewhere along the chain…


In the past, NVIDIA support told me that if I want full performance, I should set everything to maximum in the 3D settings panel.

The question is, of course, what the point is of running the card at full performance when VEAI doesn’t supply enough data to the graphics card to process. It looks to me like VEAI is simply underusing the GPU.

The best way to compare TVAI is with Premiere Pro or Resolve.

The graphics card is only an accelerator; the CPU prepares the data, i.e. the CPU is more important than the GPU.

Theoretically, a whole DVD (4GB) would fit into the graphics card’s memory, but the data must first be prepared for the GPU, as described.
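
As a generic illustration of that split (plain ffmpeg here, not how TVAI is wired internally): you can move the decode step onto the GPU so the CPU has less preparation work to do, for example:

```
# Generic sketch: decode with CUDA, encode with NVENC; the CPU mostly moves data.
ffmpeg -hwaccel cuda -i input.mp4 -c:v hevc_nvenc -b:v 20M output.mp4
```

If the CPU side can’t keep frames flowing, the GPU sits idle no matter how fast it is.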

[claim, Assumption]
I’ll just say that in the future we will see a higher GPU load but the speed will remain the same and the quality will be better.
[/claim, Assumption]

What we do here is not rendering but GPGPU processing.

(Translated from the German Wikipedia site about GPGPU processing)

"GPGPU emerged from the shaders of graphics processors. Its strength lies in the simultaneous execution of uniform tasks, such as coloring pixels or multiplying large matrices. Since the speed increase of modern processors can currently no longer be achieved (primarily) by increasing the clock speed, parallelization is an important factor in achieving higher computing performance of modern computers. The advantage of using the GPU over the CPU is the higher computing power and memory bandwidth. The speed is mainly achieved by the high degree of parallelism of the GPU’s computing operations.

The disadvantage compared to conventional CPUs is the massive parallelism with which programs have to be executed to take advantage of them. GPUs are also limited in their range of functions."


Just look at the Puget Systems benchmarks for the 4080.

It’s just the same with Resolve and Premiere as with TVAI.

https://www.pugetsystems.com/labs/articles/nvidia-geforce-rtx-4080-16gb-content-creation-review/
