Nvidia RTX 3090 vs RTX A6000 VEAI benchmark (UPDATED)

Hi guys. I finally got my hands on an RTX A6000, so here's a quick benchmark on VEAI 2.2.0. These numbers may change if the developers change something.

Setup:
5950X + RTX 3090/RTX A6000
64GB DDR4 3600MHz CL18

*1080p => 2160p (200%) (TIF)

  • Single RTX 3090:
    Artemis models => 0.32
    Dione models => 0.31-0.36 (fluctuates)
    Gaia models => 0.55
    Theia models => 0.48

  • Dual RTX 3090:
    Artemis models => 0.32
    Dione models => 0.29-0.34 (fluctuates)
    Gaia models => 0.36 (with Reduce Machine Load on)
    Theia models => 0.43 (with Reduce Machine Load on)

  • Single RTX A6000 (after reinstalling VEAI 2.2.0 and the drivers):
    Artemis models => 0.34
    Dione models => 0.31-0.36 (fluctuates)
    Gaia models => 0.55
    Theia models => 0.48

  • Single RTX A6000 (v2.3.0 with Nvidia driver 471.41):
    Artemis models => 0.30
    Dione/Gaia/Theia/Proteus models => untested (estimated ~10% performance gain).

As you can see, a single RTX 3090 is the winner, and RTX 3090 SLI is only good for Gaia. With a roughly 50% throughput gain in the Gaia models (0.55 → 0.36 s/frame), is it worth spending 2x? You be the judge. But based on the price tag, a single RTX 3090 beats the RTX A6000. From my research, Topaz products use OpenGL rather than OpenCL, which gives consumer GeForce GPUs more raw performance than Quadro GPUs. So if you're going to spend all day in Topaz products, a GeForce GPU gets you there for less money.
But the RTX A6000 is smaller and can run 24/7 (you know what I mean if you're working on DVD or 4K Blu-ray projects). It seems to run smoother and faster, without the thermal throttling the RTX 3090 shows after 2 hours (about a 10% performance drop). Power consumption is 50-75W lower at higher sustained performance. And it's a much better fit for NVLink.
However, if your budget doesn't stretch to an RTX A6000, stopping at the RTX 3090 is enough. Unless you really need ECC and the other things that come with it (drivers, design, compatibility…), you shouldn't get the RTX A6000 at all.
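If you want to turn the numbers above into wall-clock estimates, here's a minimal sketch in Python, assuming the benchmark figures are VEAI's seconds-per-frame readout (lower is better) and a 29.97fps source; both assumptions are mine, not from the benchmark itself:

```python
# Convert VEAI's seconds-per-frame readout into total processing time.
# Assumes the figures above are s/frame, as VEAI 2.x displays while rendering.

def job_time_hours(sec_per_frame: float, duration_min: float, fps: float = 29.97) -> float:
    """Total processing time in hours for a clip of the given length."""
    total_frames = duration_min * 60 * fps
    return total_frames * sec_per_frame / 3600

# Example: a 90-minute DVD on a single RTX 3090, Artemis (0.32) vs Gaia (0.55)
for model, spf in [("Artemis", 0.32), ("Gaia", 0.55)]:
    print(f"{model}: {job_time_hours(spf, 90):.1f} h")  # ~14.4 h vs ~24.7 h
```

At these rates, even a 0.03s/frame difference between cards adds up to well over an hour on a feature-length disc.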

p/s: The RTX A6000 has both optimized and unoptimized drivers for VEAI; with an unoptimized one, VEAI crashes a lot due to a VRAM issue, just like with the Game Ready vs Studio Driver split on GeForce cards. I had to downgrade to an older stable driver and it doesn't crash anymore. This is an Nvidia driver issue: they only put these optimizations in the Studio-style driver branch.


RTX A6000
Advantages over RTX 3090:

  • Lower power consumption: ~400W total system (A6000) vs ~500W (single 3090) and ~700W (3090 SLI). I have to mention that most PSUs under 1200W won't work with RTX 3090 SLI; you need a 1200W PSU or above (2021 revisions). They're the same PSU models, but the 2021 revisions are usually fixed to handle the RTX 3090's transient spikes, which push consumption to 450-500W per card. Contact the manufacturer for an RMA if you need to swap for a 2021 revision. If you run RTX 3090 SLI, prepare to buy a 1600W Platinum PSU (good luck with that), or your system will shut off at any minute (see the PSU sketch after this list).
  • Lower, more consistent temperature: 75-80°C (3090) vs 70-75°C (A6000). I tested with 4 instances and the temperature goes up to 85°C, but the clock speed stays very consistent, which is pretty good. An RTX 3090 with 2 instances jumps past 80°C instantly and the clock speed drops drastically.
  • I can now run 2-3 instances at the same time with less of a speed penalty (Artemis). If time is important to you, though, don't run 2 instances: each video slows down enough that the total time works out about the same.
  • The A6000 sits at 100% GPU usage without stuttering, and performance stays the same for hours. No thermal throttling after 30 minutes like the RTX 3090 (you need to ramp up the fans to avoid that, which is loud and annoying). On the RTX 3090, the more VRAM you use, the more speed you lose, so more instances don't save you time once thermal throttling kicks in. You need a watercooling system to keep an RTX 3090, let alone RTX 3090 SLI/NVLink, cool.
  • The Tensor cores on the RTX A6000 perform a bit better than on the RTX 3090.
  • The A6000 runs SLI/NVLink better than RTX 3090 SLI/NVLink, and multi-GPU is easier thanks to its dual-slot design. RTX 3090 NVLink needs 4-slot spacing between PCIe slots, and good luck with the thermals.
  • IMPORTANT: GPU usage is very high on the A6000 (65% at the lowest, up to 100%) because most of its components are designed for professional workloads. The reason people's GeForce GPUs aren't being used at 100% is that they're not workstation cards, so a Studio Driver is a must to get as much as possible out of the card's resources. Some people don't want to admit that GeForce cards are built for gaming first.
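As a sanity check on the PSU advice in the power bullet above, here's a back-of-the-envelope sketch; the 1.5x transient headroom factor and the 150W allowance for the rest of the system are my own assumptions, while the per-card wattages are the stock board-power figures:

```python
# Back-of-the-envelope PSU sizing. The 1.5x spike factor (to absorb Ampere's
# millisecond transients) and the 150W rest-of-system figure are assumptions
# for illustration only.

def min_psu_watts(gpu_watts: float, n_gpus: int, rest_of_system: float = 150,
                  spike_factor: float = 1.5) -> float:
    """Smallest PSU rating that still covers worst-case GPU transient spikes."""
    return n_gpus * gpu_watts * spike_factor + rest_of_system

print(min_psu_watts(350, 1))  # single 3090:  675W -> a quality 850W unit is comfortable
print(min_psu_watts(350, 2))  # 3090 SLI:    1200W -> matches the "1200W and above" advice
print(min_psu_watts(300, 1))  # single A6000: 600W even with the same margin applied
```

The multiplier is the whole story here: a 3090's sub-millisecond spikes trip a PSU's overcurrent protection long before the average draw looks scary, which is exactly the random-shutdown behavior described above.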

Disadvantages over RTX 3090:

  • Expensive, but the A6000 runs multiple instances better than the RTX 3090.
  • Artemis/Dione/Theia models crash more than Gaia models due to a VRAM issue in VEAI. The newest Quadro driver sucks; only the older stable Production Branch driver works. Just don't install the New Feature Branch driver (the same way you shouldn't install the Game Ready driver for VEAI; use the Studio Driver instead).
  • Turning on "Reduce Machine Load" will instantly crash VEAI.
  • Despite slightly better hardware, it's still slower in some applications (maybe; newer drivers help). Unless you run VEAI 24/7 like me, you won't need the A6000. Go with an RTX 3090 on the Studio Driver and you'll be fine.
  • Costs twice as much as an RTX 3090, or even triple if you can get one at MSRP. RTX 3090 SLI still gives you more raw performance, but VEAI doesn't scale properly with the Artemis/Dione/Theia models. If you use Artemis the most, don't waste money on SLI, NVLink, or any multi-GPU setup; only the Gaia models scale across GPUs properly thanks to 100% GPU usage.
  • ECC is useless for VEAI. I hope VEAI will someday use ECC in a way that fixes artifacts (idk if that's possible, just my own thoughts lol).
  • Slower VRAM, but no performance impact at all compared to a stock RTX 3090 (and GDDR6X has been reported to run hotter than the A6000's plain GDDR6).
  • No overclocking ability. Then again, some people won't overclock their GPU anyway, and with a multi-GPU setup you won't want to because of thermal and PSU limits. A stock A6000 can sometimes be on par with an overclocked RTX 3090.

UPDATED: v2.2.0 Clean Install did NOT clean install at all.

After reinstalling the older drivers (to prevent the Artemis VRAM crash) and VEAI v2.2.0, it just runs faster (back at v2.1.1 speed). Maybe I had installed too many versions of VEAI and that caused the instability. But I always chose Clean Install, which should clean everything, right? This time I uninstalled/removed VEAI completely through Control Panel, then reinstalled it. Maybe the speed I lost in VEAI v2.2.0 was because the Clean Install option in VEAI didn't really do a clean install? Earlier I had switched back and forth between v2.2.0 and v2.1.1 and the speed was significantly different (30% performance lost in v2.2.0), and I saw other people with the same issue. If that's what happened, then the RTX 3090 and RTX A6000 should actually have the same performance.


Why "Reduce Machine Load" in SLI/NVLink mode? Here's the bug in VEAI:
In SLI mode, if you don't use Reduce Machine Load, you lose 15-20% performance. Weird, right? No idea why.
But VEAI (the Gaia and Theia models specifically) runs faster that way: lower power consumption, lower temps, faster. Works for me!!!
So it scales okay for me, just not the performance I'm looking for. Saving time is what I need, given a large volume of DVDs and Blu-rays, but dual 3090s are still cheaper than an RTX A6000 in many ways.

If money is not your concern, dual RTX A6000 is better than dual RTX 3090.


Very comprehensive and informative experiments, thanks @viktorz3008!


You're very welcome.

Have you tried testing RTX 3090 x 2 without SLI, but rather using the VEAI All GPUs setting?

Yes. I tested SLI without the NVLink bridge.

Thanks for compiling this list @viktorz3008. Regarding the RTX 3090: can a single card run two instances of Gaia upscaling to 4K (from, say, 720p) without significant slowdown? I know this is something my RTX 3080 cannot do with Gaia but can with Artemis. Nice setup btw.

Yes. Gaia models consume around 5.5 to 6GB of VRAM each, so you need at least 12GB of VRAM (it might crash if you also have YouTube or something else that needs VRAM open; that's the 3080 Ti case). A single RTX 3090 can handle 2 instances just fine, but each video takes about twice as long, which is why running multiple instances isn't worth it with the Gaia models. Artemis models are okay to run that way as long as you don't hit 100% CPU usage (480p => 4K is fine, but 1080p => 4K already uses 50% CPU). So I'd say multiple instances aren't worth it, and the upgrade isn't worth it if you use Gaia only. For Artemis, each instance needs around 4-8 CPU cores.

In conclusion, Gaia is very heavy and depends mostly on the GPU (with about 20% CPU usage as well). If you plan to upgrade to an RTX 3090 just to run multiple Gaia instances, it's not worth it; a rough capacity estimate is sketched below.
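To put the VRAM side of that in rough numbers, here's a minimal sketch using the ~6GB-per-Gaia-instance figure above; the 2GB reserved for the desktop and other apps is my own allowance, not a measured value:

```python
# Estimate how many Gaia instances fit in VRAM, at ~6GB per instance
# (figure from the reply above). The 2GB desktop/browser reserve is a
# rough assumption.

def max_instances(vram_gb: float, per_instance_gb: float = 6.0,
                  reserved_gb: float = 2.0) -> int:
    return int((vram_gb - reserved_gb) // per_instance_gb)

for card, vram_gb in [("RTX 3080", 10), ("RTX 3090", 24), ("RTX A6000", 48)]:
    print(f"{card} ({vram_gb}GB): {max_instances(vram_gb)} Gaia instance(s)")
```

VRAM is only the ceiling, though: as noted above, each instance also wants several CPU cores, and two Gaia instances simply double the per-video time.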


Thanks for the reply; it doesn't sound like Gaia would benefit from switching to the 3090. As you say, it's not just a GPU memory limitation, you need the GPU processing cores too. You saved me a lot of trouble, so thank you. Enjoy the RTX A6000!


Sorry for asking for clarification: I have a dual 3090 setup and was considering getting a bridge for possible improvements.
When you say you tested SLI without the bridge, did you just have the cards as 2 separate units in the system, or did you force SLI on without the bridge?

I am running the cards as 2 separate devices and use the All GPUs setting in the menu. I also see little performance improvement in scaling.

But Chronos has almost perfect scaling: 2 GPUs will halve processing time compared to 1 GPU.

Each GPU handles its job separately: one renders the odd frames, the other renders the even frames. But if you don't have a 1600W PSU, I recommend not running both GPUs at once; if you do have one, no problem at all. An SLI bridge isn't necessary to run SLI (it depends on the software), and SLI won't help anything in this case.
Multiple GPUs only speed up the Gaia models; Artemis models run about the same as on a single GPU. Since I mostly use models other than Gaia, investing in dual GPUs is a waste for me, so I went for the A6000, which handles the job much better than the 3090.
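For anyone wondering what that split looks like, here's a toy sketch of the even/odd scheme described above; it's an illustration of the idea only, not VEAI's actual implementation:

```python
# Toy sketch of the frame split described above: GPU 0 takes the even frames,
# GPU 1 takes the odd frames, and the halves are stitched back into order at
# the end. Illustration only -- not VEAI's actual code.

from concurrent.futures import ThreadPoolExecutor

def upscale(frame: int, gpu: int) -> tuple[int, str]:
    # Stand-in for the real per-frame model inference on the given GPU.
    return frame, f"frame {frame:03d} upscaled on GPU {gpu}"

frames = range(6)
with ThreadPoolExecutor(max_workers=2) as pool:
    evens = pool.submit(lambda: [upscale(f, 0) for f in frames if f % 2 == 0])
    odds = pool.submit(lambda: [upscale(f, 1) for f in frames if f % 2 == 1])
    merged = [text for _, text in sorted(evens.result() + odds.result())]

print("\n".join(merged))
```

Notice that no frame data ever has to cross between the cards, which is exactly why an NVLink bridge buys nothing in this workload.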

About Chronos, I don't use it much, but it looks promising to me. If you see a major improvement with Chronos, it may scale the same way Gaia does.

OK cool, thanks for the clarification.
I can really back up your PSU claim: I now have an 1850W PSU (partly for the Threadripper (TRX) CPU as well), but I originally had a 1300W Platinum PSU and suffered random power-off moments with the 3090s. Swapped the PSU after that and no problemo.

I did consider the A6000 as well, but in the end 24GB is more than enough for my workloads, and fortunately, in things like Resolve, scaling with the dual 3090s is very good.
Also, when I'm not working: digital coin gathering. That's a workload where the 3090 blows past the A6000.
Only the A100 sits above it :space_invader:

Chronos is certainly very promising; I really hope they implement something like the Proteus 6-parameter model for manual fine-tuning.

Proteus is almost the only model I use; I got way better results with some tuning on restoration projects, high-quality real-life footage, and anime. I haven't really compared its performance against the other models, though.

The only issue that made me want to leave the 3090 behind is the power. It draws too much power for the same performance as the A6000, and the A6000's power limit is very accurate and reliable. Even though I capped the 3090 at 300W, it always seemed to draw more than that. The electricity bill was noticeably higher than normal; it's like I'm mining coins, really. So I decided to go for the A6000, and I'm now running 10 instances of VEAI at the same time with 30GB of VRAM in use. Each video takes longer, but at least they're all processing at once. Massive VRAM and reliability (it never crashes) are exactly what I'm looking for in a GPU.

Yeah, the power draw is really insane, and no matter how you tune it, it makes almost no difference. I suspect it's mostly down to the GDDR6X, which is also one of the reasons the A6000 uses normal GDDR6.
Also, the cooling on most 3090 models is horribly insufficient, especially on the backside.
Two weeks after purchase, an HWiNFO update came out that could read the memory T-junction temperature: 110 degrees, full blown.
I took apart all the 3090s and replaced the super-cheapo thermal pads on both sides, which saved 20 degrees. Some extra fin stacks on the backplate and now they stay cool for a long lifespan.
Although they heat the house to the point that it's not funny anymore and extra cooling is required.

For the on-the-road, mission-critical systems I use at live events, I'm currently running Quadro RTX 4000 cards, but now that the A4000 is somewhat available, I'll soon upgrade to those.
They seem decently powerful: 3070-like performance in a single-slot card.

You can also replace the thermal paste with a good one and it will never hit 80°C even with an OC. That's what I always do, except on the Quadro cards. Blower cards are dead, I believe, and Nvidia just doesn't want people running SLI anymore. Running 2x 3090 makes my room hotter than a sexy girl, honestly. The crashes also came more frequently due to the power limit. I wish SLI could do something, but it only works with Gaia. If they release blower-style GPUs in the future, I might switch and use SLI again. But the insane power draw sucked and the lights in my house were flickering every day. I really don't want to run a PC with 1500W of power draw.

Haha, that gave me a good laugh.
In my main rig I have the Gigabyte blower versions; I just love blowers, and I mostly use rack-server cases with all slots full of different PCIe cards, so it's often the most elegant solution.
I did a repaste on one and put liquid metal on the other's core, and they're both keeping cool; it's really only the memory and VRM on the backside that stay hot.

In my gaming/VR rig I have the ASUS 3090 blower version, but that one is really, really trash: a very small backplate and it overheats like crazy. I ended up just sticking a 1U SP3 cooler on the backside to keep it cool.

If I may give some advice: never touch the power limit when going for efficiency/stability. It's really a blunt limiting solution that just squeezes the card. I get far better results locking the card at a certain point on the volt/frequency curve in Afterburner and leaving the power limit at 100%.
You can get well under 300W with zero stability issues, and locking the volt/frequency curve strangely also works on Quadros.
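If you'd rather script it than click through Afterburner, nvidia-smi can lock the core clock range from the command line; here's a minimal sketch, assuming admin rights, and the 1400-1500MHz range is just a placeholder, not a recommendation for any specific card:

```python
# Lock the GPU core clock to a fixed range via nvidia-smi (needs admin rights).
# This is a CLI analogue of pinning the volt/freq curve in Afterburner; the MHz
# values below are placeholders -- find the stable point for your own card.

import subprocess

def lock_clocks(gpu_index: int, min_mhz: int, max_mhz: int) -> None:
    subprocess.run(
        ["nvidia-smi", "-i", str(gpu_index), "--lock-gpu-clocks", f"{min_mhz},{max_mhz}"],
        check=True,
    )

def reset_clocks(gpu_index: int) -> None:
    subprocess.run(["nvidia-smi", "-i", str(gpu_index), "--reset-gpu-clocks"], check=True)

lock_clocks(0, 1400, 1500)  # power limit stays untouched at 100%, as advised above
```

Like the Afterburner method, this caps voltage indirectly by pinning frequency, so the card never climbs into the inefficient top end of its curve.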

I did order an NVLink bridge today because I've been busy with NDI video; the new version can also do video encoding of xx amounts of streams, scaling across multiple GPUs for low-latency video-over-IP distribution.
I have unlocked the NVENC session limit in the GPU kernel drivers so the 3090s can do Quadro-level numbers of encode sessions, so hopefully the bridge gives some benefit in that specific application while scaling.
Because of the low-latency, real-time nature of it, having the data flows copied between GPUs creates a lot of overhead in the system at the moment.
I will of course do some extra Topaz software testing after that, but I'll probably come to the same conclusion you did.
Although if I find something interesting, I'll let you know :v::grin:

The reason I told you the NVLink bridge is a waste is that it's only useful when the 2 GPUs "talk" to each other. In this case, VEAI lets each GPU work separately: each one completes its own job, GPU 1 renders the even frames, GPU 2 renders the odd frames, and then everything is combined at the end (though a frame error might occur sometimes). So the bridge won't help anything here. I just want to save you the $80, because I have a brand-new bridge on my shelf that I will never use. You can go for it if you want, but don't waste your money on that useless bridge. At least for VEAI.

Hmm, my use case is a bit more complex. I have 3x 4K camera feeds coming in over the network, and I use facial recognition and tracking on the feeds to create 3 different shots per feed. I use the Nvidia Maxine SDK, but unfortunately this part of the process still only runs on the main GPU. After processing, the 9 Full HD feeds that are created need to be encoded and sent back out to the network, all in real time; the whole process has about 2 frames of latency. Because I use NDI HX for the outgoing feeds, which uses HEVC in an interframe encoding style, alternating frame rendering will not work. So GPU 1 does the heavy 3D work and GPU 2 does most of the stream encoding, like 7 of the 9.