Select any combination of GPUs for processing

In case someone has multiple GPUs in a system - or a card with 2 GPUs on it (Titan Z, some Teslas, GTX 690, Radeon Pro Duo, etc…) - it would be very helpful to make them selectable individually.

At the moment, we can only select “all”, “CPU” or “one”…

Example: If one has a system with a built-in iGPU and also two separate dedicated GPUs, it’s only possible to select “all” - in this case, the much slower iGPU drags down the speed of the triplet, because the data is not divided according to each card’s speed but evenly, making the faster GPU cores wait for the slower iGPU.

Example two: Someone has a workstation with a small GPU for the display and some headless GPUs for rendering, calculations, whatever… With Nvidia cards, one can set them to WDDM mode and use them in VEAI - but the slower display card will also slow down the whole process.

Instead of the dropdown list, a simple “tick each GPU” field would do the job…
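
To illustrate the scheduling point from the first example, here is a minimal sketch (a hypothetical helper, not VEAI’s actual code) of splitting frames in proportion to each card’s measured speed instead of evenly, so the fast cards never wait on the iGPU:

# Hypothetical sketch: assign frame ranges proportionally to measured
# throughput instead of splitting the work evenly across devices.
def split_frames(total_frames, throughputs):
    # throughputs: frames/sec per selected GPU, e.g. from a warm-up pass
    total = sum(throughputs)
    ranges, start = [], 0
    for i, t in enumerate(throughputs):
        # the last device takes the remainder to avoid rounding gaps
        if i == len(throughputs) - 1:
            count = total_frames - start
        else:
            count = round(total_frames * t / total)
        ranges.append((start, start + count))
        start += count
    return ranges

# Example: two dedicated GPUs (24 and 20 frames/sec) plus a slow iGPU (4 frames/sec)
print(split_frames(1000, [24, 20, 4]))  # [(0, 500), (500, 917), (917, 1000)]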

As far as I know, headless GPUs will not work - DirectX (DirectML) is needed.

When I set my second Quadro RTX 5000 to TCC, VEAI does not calculate.

Yes, it works, I do it all the time:

Set your GPU to WDDM mode from an admin command prompt via:

nvidia-smi -dm 0
(you might have to add your card number, depending on the system)
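
For example, to see the index of each card and then switch only a specific one:

nvidia-smi -L
nvidia-smi -i 1 -dm 0

(-L lists the GPUs with their indices; -i selects which card to change)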

Then restart - and your GPU now works in Windows mode, offering Vulkan, OpenCL, CUDA, DirectML, DirectX, etc… Depending on which GPU you have, even ray tracing…

You could even game on this …

Of course, IF the GPU lacks DX support at the silicon level (like some data center cards), this won’t work, but most are able to do it. I only know of the PowerPC-driven ones in supercomputers and some Aldebaran variants from AMD, which have different silicon than the datacenter/consumer/pro cards…

I did this on Kepler/Maxwell/Pascal/Turing Teslas as well as Grid cards, all headless…

0 = WDDM

1 = TCC
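
You can also check which mode each card is currently in (and which mode it will switch to after the restart) via:

nvidia-smi --query-gpu=index,name,driver_model.current,driver_model.pending --format=csv

(as far as I know, the driver_model fields are only reported on Windows)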

Exactly - actually I wanted to run the 2nd GPU in TCC, but then VEAI and other Topaz Labs software don’t work.

I hadn’t realized that with the Teslas it’s exactly the other way around.


Someone on YouTube said that the MI100 would also get an actively cooled model.

This would then replace the Pro VII - which I find very cool.

TCC mode only offers CUDA (and in some cases OpenCL) and does not use the Windows driver mode. For CUDA it can have some speed benefits, since the Windows overhead is not present…
But if you want to do anything “Windows related”, you have to set it to WDDM mode :slight_smile:

So it’s perfectly normal that it doesn’t show up in Windows if it’s set to TCC mode…

Some cards default to TCC (compute cards like the Teslas), others default to WDDM (like Quadros or GeForce).

As long as the chip on the card is the same as one present on a consumer or workstation (Quadro) card, the chances of being able to use it in Windows in WDDM mode are 99.9%.

I’ve tested the following, and all worked in WDDM mode (I’ll only list those which make sense in VEAI; those no longer supported or simply too slow I’ll leave out - like the older Fermi ones, which used to work too…):

Tesla M40
Tesla M60
Grid K520
Grid K320
Tesla K40
Tesla K20(x/c, etc…)
Tesla K80 (here is one example where my proposition will be important - 2 Chips on the card…)
Tesla P100
Tesla P40
Grid K2
Grid K1 (ok, way too slow, but works…)
Tesla M6
Tesla M4

Chips like the GA100 (the Ampere card for compute clusters) have special variants with NO circuits/firmware other than CUDA/OpenCL - those can’t be set to WDDM (the hardware does not support it).

Looking at the cards “normal people” will get their hands on, the chances that a headless GPU will work just fine when set to WDDM are VERY high.

The difficulty is simply that many people don’t know how to set them up, have the wrong motherboard (lacking “Above 4G Decoding” for bigger cards), have the UEFI set wrong, or simply don’t know how to set them to WDDM… The fact that most guides on the internet are either old, wrong, or too complicated only adds to the impression that this is “way too complicated” to get running - and the cooling solution doesn’t “help” either…

Aside from headless GPUs, my proposition also applies to people having more than two chips in the system. Of course, everyone using only one GPU doesn’t care about that - but I know enough people at offices with three Quadros in their workstation for computational stuff: one being a small card only to feed the displays, the other two being quite powerful Turing Quadros - which can’t be put to good use in VEAI because they can’t be selected individually (only together with the smaller card)…
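
To make the “tick each GPU” idea concrete, here is a minimal sketch using onnxruntime’s DirectML execution provider - purely illustrative, I don’t know what Topaz uses internally, and the model path is made up. DirectML addresses adapters by index, so a checkbox per card could simply map to one session per ticked index:

import onnxruntime as ort

# Hypothetical: the indices the user ticked in the UI - e.g. the two
# big Turing Quadros (1, 2), but not the small display card (0).
selected_gpus = [1, 2]

opts = ort.SessionOptions()
opts.enable_mem_pattern = False  # recommended when using the DirectML provider

sessions = [
    ort.InferenceSession(
        "model.onnx",  # placeholder model path
        sess_options=opts,
        providers=[("DmlExecutionProvider", {"device_id": gpu_id})],
    )
    for gpu_id in selected_gpus
]
# Each session runs on its own adapter; frames can then be dispatched to
# whichever ticked card is free, and the slow display GPU never joins in.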


Nice that you are able to test so many GPUs.

Someone tested two Quadro RTX 6000s for me with a bridge (on the old Denoise engine back in 2020), but the performance was not as good as with one card: the first card wrote into the memory of the second one over the bridge, and this is much slower than simply using them in parallel.

The memory pooling would only be a performance benefit if VEAI used huge amounts of memory.

SLI and NVLink rarely help in these scenarios and often slow things down.
In Gigapixel I also encountered a lot more slowdowns when more than one card was used. The overhead of file handling/opening/etc… doesn’t seem optimal for “two card scenarios” at the moment with Topaz’s “photo” software.
But with VEAI, especially at the larger resolutions, quite a nice speedup is possible with multiple cards.

I am quite satisfied with my two Quadro RTX 5000s; the speed in Denoise is very good.

But yes, for single images or when the models change, it takes a long time to load the data for both GPUs.

I’m thinking about switching to a W6800.