What does VEAI performance look like between AMD/NVIDIA?

Compiled Information (Work In Progress): https://docs.google.com/spreadsheets/d/1ix-c0s93FNXo2xDO4pfw-0o4e56KRY9XkHDliirPipo/edit?usp=sharing

Contrary to the title: based on the user input below, we should also be looking at the CPU’s impact on performance, even though the program is GPU-accelerated.

vvv Original Post vvv

I see that VEAI now supports both AMD and NVIDIA GPUs, including Tensor Core support for RTX cards.

For anyone willing to help answer my question, and for the sake of consistent results across hardware, I’d ask that you give results from using this old clip of mine. https://drive.google.com/file/d/1fwEPcoMwRJ3mZZTB_qYFjrPqb9-fns-a/view?usp=sharing

Preferably with Gaia-CG at 2x scale, plus any other results you’re willing to submit. I’ve read the patch notes and they say Gaia-CG has some quality issues, but we’re just looking for numbers here.

My trial has already expired, so I’m unable to give results from my GTX 1080. I’m also curious whether RTX owners can toggle between CUDA and Tensor Cores, or if it defaults/locks to one of them, since properly optimized Tensor Cores could be really fast.


There is definitely no toggle between CUDA and Tensor Cores, and the consensus among those with the tools to tell is that the Tensor Cores are NOT being used yet, despite the release notes saying they are.


So I downloaded your video and ran some tests with a 3060 Ti; full specs and results below.

CPU: Ryzen 5 3600
MEM: 32GB
GPU: RTX 3060 Ti @ 1905MHz core and 8000MHz memory
Topaz VEAI v1.7.1

Gaia-CG v5 : 200% : output to MOV: 0.40s/frame : about 15mins to complete
Theia-Fidelity v4 : output to MOV: 0.36s/frame : about 13mins to complete
Theia-Fidelity v4 : output to JPEG: 0.36s/frame : about 13mins to complete
Theia-Detail v3: output to MOV : 0.36s/frame
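As a sanity check on the list above, here is the arithmetic relating s/frame to total runtime (a quick Python sketch; the implied frame count of the test clip is my back-calculation from the posted numbers, not a stated figure):

```python
def total_minutes(sec_per_frame: float, frame_count: int) -> float:
    """Estimate total processing time in minutes from a s/frame reading."""
    return sec_per_frame * frame_count / 60.0

# Back-calculating from 0.40 s/frame taking ~15 min: the clip is ~2250 frames.
frames = round(15 * 60 / 0.40)
print(frames)                                  # 2250
print(round(total_minutes(0.36, frames), 1))   # 13.5, close to the ~13 min posted
```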

I also did some benchmarking with a 1660 Super before I sold it, compared against the 3060 Ti; those results are below.

Theia-Fidelity v4 200%

Res     1660S    3060Ti
480p    0.34s    0.16s
720p    0.94s    0.33s
1080p   2.22s    0.72s
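For what it’s worth, the per-resolution speedups implied by those numbers (just arithmetic on the figures above, in Python):

```python
# (1660 Super s/frame, 3060 Ti s/frame) from the Theia-Fidelity v4 200% runs above.
times = {
    "480p":  (0.34, 0.16),
    "720p":  (0.94, 0.33),
    "1080p": (2.22, 0.72),
}
for res, (slow, fast) in times.items():
    print(f"{res}: {slow / fast:.2f}x")  # 2.12x, 2.85x, 3.08x
```

The speedup grows with input resolution, which is consistent with the bigger card being less fully utilized on small frames.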


I asked my friend, who has pretty much the exact same specs as me (except half the system RAM at 16GB, which I doubt is an issue), to run the test for me.

So he’s equipped with
CPU: Intel 6700K
RAM: 16GB 2133Mhz
GPU: GTX 1080 @ (I’m assuming) stock speeds, which are ~1750MHz core and 10000MHz memory
VEAI 1.7.1

He only ran Gaia-CG for me, which I assume is v5 since the models are updated along with VEAI versions, @ 200% scale: around 2 seconds/frame.

I could’ve confirmed the exact details with him but he’s now asleep. Still, I think this is close to accurate, since we built nearly identical PCs at the same time.

Comparing against the 3060 Ti above:
A 3060 Ti has almost double the CUDA core count of a 1080, yet it’s roughly five times faster here (0.40 s/frame vs ~2 s/frame), and I can’t imagine performance scaling more than linearly with core count at best.
So either it’s the newer generation of CUDA cores, the benefit of the GPU architecture as a whole, simply being an 8nm die, the extra 2GB of VRAM and GDDR6 bandwidth mattering more than we realize, or, contrary to the ‘consensus of those with the tools’, VEAI can actually make use of the RTX Tensor Cores.

But it could also be all of the above, minus Tensor Cores, working in conjunction to produce such performance. We’d need a meter that can read such usage to be sure. I’m aware that Task Manager can show various compute usages such as 3D rendering, video decoding/encoding, CUDA usage, VR output, and some other vague metrics.

EDIT:
The 1660S vs 3060 Ti comparison confuses me though.
The 3060 Ti is roughly 3x faster with a little over 3x the CUDA count. Given the difference against the 1080, I’d have imagined the gap would be bigger, since the 1660S has 2GB less VRAM and, while it’s GDDR6, a slower memory interface.
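A quick per-core sanity check on the 720p Theia-Fidelity numbers (CUDA core counts are from Nvidia’s published specs; this ignores clock and architecture differences, so treat it as a rough sketch only):

```python
# CUDA core counts per Nvidia's spec sheets.
cores = {"GTX 1660 Super": 1408, "RTX 3060 Ti": 4864}
# 720p Theia-Fidelity v4 s/frame figures from the table earlier in the thread.
spf = {"GTX 1660 Super": 0.94, "RTX 3060 Ti": 0.33}

speedup = spf["GTX 1660 Super"] / spf["RTX 3060 Ti"]          # ~2.85x
core_ratio = cores["RTX 3060 Ti"] / cores["GTX 1660 Super"]   # ~3.45x
print(f"speedup {speedup:.2f}x vs core ratio {core_ratio:.2f}x")
# The speedup trails the core ratio, i.e. scaling is slightly sub-linear per core.
```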

So now I’m thinking maybe it is just all CUDA and no Tensor Cores, but that VRAM generation plays a part: not the speed specifically, but the generational benefits, i.e. GDDR5X vs GDDR6. I don’t know; it’s late and I’m probably just spouting nonsense at this hour (it’s 3 a.m.).
More results would be greatly appreciated. I see version 1.8.0 has released, but the notes don’t say Gaia-CG was updated, so hopefully we get more results while it’s still on v5.

CPU: Ryzen 7 1700
MEM: 32GB
GPU: RX580
VRAM: 4GB
Topaz VEAI v1.8.0

Gaia-CG v5 : 200%
720p 2x -> 1440p: 0.98 sec/frame

Theia-Fidelity v4: 200%
720p 2x -> 1440p: 0.67 sec/frame

Operating System: Windows 10 Version 1903
Graphics Hardware: Radeon RX 580 Series
OpenGL Driver: 3.3.14736 Core Profile Forward-Compatible Context 20.9.1 27.20.12029.1000


So, lucky me: Topaz had a 45% off holiday sale and I now own a copy of VEAI.

My results are about the same as previously mentioned, since I share the same rig as my friend, although my memory is overclocked to 11000MHz, which only brings me down to ~1.90 sec/frame on my GTX 1080 with Gaia-CG v5 @ 200% scaling and MOV output. I’ll try to make a compiled results chart and do more tests with other models later.

I have mostly been using Artemis LQ and Gaia HQ for upsizing low-quality videos to about 200%. They turn out fine, without artifacts. The average is 0.14-0.20 sec/frame in both cases.

Hardware is: i7 9700F, all 8 cores constantly at 4.5GHz, 64GB DDR4-3000, Samsung SSD 850 EVO, AMD Radeon RX 480 Sapphire OC with 8GB VRAM (GPU 1306MHz, memory 2000MHz). Video driver is Radeon Pro Software for Enterprise, revision 20.Q4.


For science:

CPU: Ryzen 9 3900X (Stock settings)
RAM: 32 GB DDR4 @ 2133 MHz (Base settings, no XMP enabled)
Memory Fabric: 1200 MHz (Default)
GPU: Radeon 6800 XT (Stock Settings)

Gaia v5: 0.29-0.30 s/frame [11 mins] (highest memory usage setting, 200%)
Gaia v5: 0.37-0.38 s/frame [13 mins] (lowest memory usage setting, 200%)

CPU: Ryzen 9 3900X (Stock settings)
RAM: 32 GB DDR4 @ 3600 MHz (XMP enabled)
Memory Fabric: 1800 MHz
GPU: Radeon 6800 XT (Stock Settings)

Gaia v5: 0.25-0.26 s/frame [9 mins] (highest memory usage setting, 200%)
Gaia v5: 0.33-0.34 s/frame [12 mins] (lowest memory usage setting, 200%)

It made no discernible difference whether I chose MP4 or MOV output.
For AMD users, it looks like memory speed and memory fabric clock have a fairly significant impact on performance.
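Quantifying that: taking the midpoints of the two "highest mem usage" runs above (the midpoint choice is mine), the RAM and fabric overclock works out to roughly a 14% per-frame improvement:

```python
# Midpoints of the highest-memory-usage runs:
# 0.29-0.30 s/frame at 2133 MHz vs 0.25-0.26 s/frame at 3600 MHz (XMP).
slow, fast = 0.295, 0.255
improvement = (slow - fast) / slow
print(f"{improvement:.0%}")  # 14%
```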

OS: Windows 10 Pro Build 19041
OpenGL API Version: 4.6
GPU Driver: 20.12.1


It looks like CPU performance matters much more than GPU performance, and the Intel platform seems to perform better overall than the AMD platform, even with GPU acceleration enabled.

@charoldpctech got Ryzen 5 3600 with RTX 3060Ti, and the performance was around 0.36~0.4s/frame.

@ahilecostas got i7 9700F all cores at 4.5GHz with RX480 8G, and the performance was around 0.14~0.2s/frame.

@jasonkeopka got a Ryzen 9 3900X at stock clock speed with a 6800XT, and the performance, depending on memory and Fabric speed, was around 0.25~0.38s/frame.

I also gave it a quick try on several of my PCs and a laptop (VEAI 1.8.1 trial version):

Software settings: Artemis LQ, 200%, highest VRAM usage, ‘reduce GPU load’ disabled, model download enabled.

  1. Intel 7980XE (18 cores), all cores at 4.6GHz, 4000MHz memory speed, RTX 3090 at 2100MHz GPU and 10000MHz VRAM: around 0.12~0.17s/frame.

  2. Intel 6950X (10 cores), all cores at 4.3GHz, 3200MHz memory speed, RTX 3090 at 2000MHz GPU and 10000MHz VRAM: around 0.16~0.2s/frame.

  3. AMD Ryzen 5 3600X (6 cores), all cores at 4.3GHz, 3266MHz memory speed (synced Fabric speed), RX580 8GB at 1350MHz GPU and stock VRAM speed: around 0.5~0.56s/frame.

  4. Intel 8809G (Intel Hades Canyon NUC, 4 cores, power limit unlocked), all cores at 4.5GHz, 3200MHz memory speed, RTX 2080Ti (connected via a Razer Core X Thunderbolt 3 eGPU box) at 2100MHz GPU and 8500MHz VRAM: around 0.3~0.35s/frame.

  5. Intel 8750H (ThinkPad X1 Extreme Gen1, 6 cores), all cores at 3.9GHz, 2666MHz memory speed, GTX 1050Ti Max-Q at stock speed: around 1.9~1.95s/frame.

The ThinkPad’s poor performance is probably due to the laptop’s overall power limit: it kept CPU usage at around 20%, and ThrottleStop indicated the CPU was hitting its power limit at only 15W while VEAI was running. I’d guess the result would be much better with an external GPU, since the CPU power limit alone can be unlocked with Intel XTU and held at 65~70W when running Cinebench.

Of the other four PCs, the 7980XE at 4.6GHz got the best performance, along with much higher CPU power consumption; the 6950X at 4.3GHz was second.

The Hades Canyon NUC was surprisingly the third fastest with an external GPU, even though it has fewer CPU cores than the Ryzen 5 3600X, albeit a slightly higher all-core clock speed.

Another unusual thing I noticed is that CPU usage on the Intel CPUs usually stayed at 45~60%, while on my Ryzen 5 3600X platform it stayed around 25~30%, almost half that of the Intel platforms.

On the other hand, with the Ryzen 5 and RX580 8G, GPU usage stayed at almost 100%, but on the Intel platforms with the RTX 3090, GPU usage only stayed around 15% or so.


Interesting, @nfsking2.
I recently got a Ryzen 7 5800X, which beats the best Intel 8c/16t CPU in pretty much everything, so once this current job is finished, I’ll test again.

I also noticed that in 1.8.x I can adjust GPU power, and my average s/frame went down significantly. The new Artemis model is also faster than Theia in 1.8.x, while being very close (or better) in quality.

I still haven’t figured out which matters more for this application: clock speed, core count, or both.

Theoretically, the Ryzen 7 5800X has among the best single-core performance of any ‘normal’ CPU at stock speed, but it cannot beat a 10900K, a 7980XE, or even an overclocked 6950X in multi-threaded benchmarks like Cinebench.

The weirdest thing was that an overclocked R5 3600X was slower in VEAI than the 8809G, which has only 4 cores. The same thing happened on others’ PCs: as I quoted, an R5 3600 with an RTX 3060 Ti got 0.36~0.4s/frame, and a Ryzen 9 3900X with a 6800XT only got 0.25~0.38s/frame, much slower than it should be. I think there could be huge room for optimization on AMD CPUs.


I also want to mention the constant instability and poor performance I had on my Radeon GPU with the Adrenalin gaming drivers; it all magically went away once I switched to Radeon Pro Software for Enterprise. The symptom was uneven clock and memory speeds. Also, all 8 CPU cores load at about 50% (parking disabled) during the rendering process, and the GPU now stays at 100% the whole time (a very stable RX 480 made by Sapphire, excellent cooling).
I can run everything 24h with the same results.

The Ryzen 7 5800X is only slower than the Intel 7980XE at multi-threaded tasks because the 7980XE obviously has more than double the cores. You might want to check benchmarks again: the 5800X will beat any 8c/16t Intel CPU, except in AVX-512 and Quick Sync optimized software and very specific tasks where Intel has an advantage.

I also just noticed you’re comparing Artemis times to Gaia-CG times, which makes a huge difference (about double).

Using the exact same settings on VEAI 1.8.1 with the following specs:

Ryzen 7 5800X w/ PBO2 enabled - 32GB DDR4-3200 - RTX 3060 Ti @ stock with 108% power - 0.18-0.19s/frame

On Gaia-CG I got only 0.33s/f, which is still quite an improvement over the previous result of 0.4s/f, but that was on both a Ryzen 5 3600 and VEAI 1.7.1, so without being able to isolate one of those variables, it’s hard to say how much of the difference comes from the CPU vs the software.

A compiled spreadsheet has been made.
I don’t remember much about using spreadsheets anymore, so I don’t know if filtering will work as expected. You’ll have to clone it, though.

There are some observations to be made. But I think some of that has already been said here.

For example, the CPU seems to have quite a performance impact, and within that, Intel seems to dominate AMD, due to what appears to be a lack of proper support for Ryzen’s multi-chiplet design.

However, AMD GPUs seem to have an edge over Nvidia; even older-generation AMD cards can beat the latest RTX 30 series. But if AI interpolation means anything here (CUDA code being converted to another variant, in this case possibly OpenGL or similar), the visual output could differ: it could be faster but also look worse.

To check, a mix of AMD and Nvidia users would have to agree to compare, meaning we’d upscale the same proper high-bitrate 1080p clip to 4K and upload the results, unless of course someone owns both types.

You can’t make much from the compiled list we have here. Several “tests” are missing the VEAI version number; I’m not entirely sure what the numbers @ahilecostas quoted mean, since “low quality videos” could mean several things; quite a few different settings were used; etc. If anything, we should have folks re-run their tests in a more controlled manner: using the same video file, on the latest VEAI version, with the same set of test parameters for versions older than 1.8.1, and matching hardware setups as closely as possible.
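If results were re-collected, one way to keep comparisons apples-to-apples would be to group entries by model, version, and scale before ranking anything. A minimal sketch (the entries here are illustrative placeholders loosely based on this thread, not verified data):

```python
from collections import defaultdict

# Hypothetical normalized entries: (model+version, scale, cpu, gpu, s/frame).
results = [
    ("Gaia-CG v5",    "200%", "Ryzen 5 3600", "RTX 3060 Ti", 0.40),
    ("Gaia-CG v5",    "200%", "Ryzen 7 1700", "RX 580",      0.98),
    ("Artemis LQ v9", "200%", "Intel 7980XE", "RTX 3090",    0.15),
]

groups = defaultdict(list)
for model, scale, cpu, gpu, spf in results:
    groups[(model, scale)].append((spf, cpu, gpu))

# Rank only within a (model, scale) group; never compare across models,
# since e.g. Artemis times run about half of Gaia-CG times.
for (model, scale), entries in sorted(groups.items()):
    print(f"{model} @ {scale}")
    for spf, cpu, gpu in sorted(entries):
        print(f"  {spf:.2f} s/frame  {cpu} + {gpu}")
```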

I have a fairly decent selection of hardware, including a Ryzen 7 2700, R5 3600, R7 5800X, Core i5-3550, AMD RX 570 and RX 580, and the RTX 3060 Ti.

I think the next test I’ll run is pairing the RX 580 with my R7 5800X and comparing the results, to see whether AMD GPUs really are faster relative to their Nvidia counterparts. I wish I had a faster Intel CPU to test as well, but other than the R7 5800X, the AMD CPUs offered too much value to pass up for my needs.

I don’t want to compare apples with oranges, but for my part, I have the impression that I get the same smooth performance I get from Magix Vegas 18 (formerly Sony Vegas) with an Intel CPU and AMD GPU. I do think it’s OpenCL-based, with a big push forward from the CPU. Vegas favors AMD GPUs, and I’m not talking about render times but the ease and smoothness of editing regardless of how complicated the project is. For me, After Effects runs horribly even with 64GB of RAM, except for the final export encoding part, which uses the OpenCL Mercury engine.

Low quality video? 320x240 upscaled to 1920x1080 (450%) using Theia-Fidelity v4 gives me a constant 0.14 sec/frame.
Same speed for 640x480, 720x480, and 960x720 to full HD… it doesn’t get better or worse.

Don’t get me wrong. Arguing about which CPU has better overall performance is not necessary in this case.

With the video file provided by @shikuteshi, running the 200% Gaia-CG model in VEAI 1.8.1 on my 8809G @ 4.5GHz with the 2080Ti eGPU, I got 0.30~0.31s/frame, while the 200% Artemis LQ v9 model got me 0.18~0.20s/frame, almost the same as your result.

Both the single-core and multi-core performance of a Ryzen 7 5800X with PBO2 enabled will no doubt be much, much better than a 4.5GHz 8809G (which gets almost the same performance as a stock 7700K).

And correct me if I’m wrong, but you got only slightly faster processing with the R7 5800X - 3060 Ti than with the R5 3600 - 3060 Ti. In my experience, the processing speed of VEAI 1.8.1 on the 6950X at stock speed versus 4.3GHz was significantly different, so CPU performance really does matter.

Maybe a better GPU helped in my case, but still, a 9700F @ 4.5GHz with an RX480 could get 0.14~0.2s/frame on the Artemis LQ model, almost the same speed as my 8809G - 2080Ti combination.

All these results seem to support my opinion that performance on AMD CPUs might be abnormal when running VEAI. Or maybe it’s meant to be this way for some reason; I’m really curious about it.


In case anyone missed it: suraj, a Topaz developer, has posted a thread about VEAI performance: VEAI Performance


I think everyone should record their computer’s configuration comprehensively: disk read/write speed, memory frequency, motherboard model, PCIe 4.0 support, power supply, and cooling. These can all affect the overall efficiency of video conversion. Many people focus only on the CPU and graphics card; those are important, but other factors should be considered too. Thanks!