Does / will VEAI use AVX 512 instructions? Zen 4 implications

Gigapixel AI does, and Zen 4 CPUs are showing very impressive speed boosts using it.

I’m considering whether a new Zen 4 PC might be worthwhile and VEAI performance is an important deciding factor. Similar speed increases to those in Gigapixel would be awesome.

3 Likes

The thing is that GPUs are much faster and they are already being used.

AVX-512 support won’t come that fast; it would also have to be supported on Intel, and then it would be something else entirely.

AVX2 is currently used in Photo AI for the autopilot (at least, my CPU crashes when I set the offset very high), so I assume it will be used here as well.

The devs just said today that it isn’t being used (they mistakenly thought it was), so the next installer won’t look for AVX2-compatible CPUs.

1 Like

The biggest problem for me with Zen 4 is that it’s a total overhaul of my system.
Basically, the only parts I could reuse are the storage drives and the case. (The GPU too, but I might as well get a new one along with all the rest of the new parts.)

The next issue is Zen 4’s power usage. It draws more power than Zen 3 and runs much hotter.
It would be more tempting to me if they came out with an energy-focused CPU.

And it’s not that much faster.

It also seems like it does not scale as well with more GHz the way Zen 2 did.

To be fair, Zen 4 (5 nm) is more power efficient than Zen 3 (7 nm); it just runs hot by design for better benchmark scores. Compared to Intel’s 12th/13th Gen, Zen 4 still uses less power.

You can undervolt the Zen 4 processor in the BIOS by setting “Curve Optimizer” to a negative value in PBO2. It should lower the CPU temperature without a performance hit.

You can also set “Platform Thermal Throttle Temperature” manually in PBO2.

Lastly, you can put Zen 4 into Eco Mode by adjusting the PPT (package power) limit; that can lower Zen 4’s temperature to around 60 degrees. The biggest surprise is that Zen 4 still gets a pretty high benchmark score in Eco Mode.

1 Like

Ah okay. I’ve only seen default settings as done by Gamers Nexus.
That’s actually really good to hear.

To be clear, one of the main, best-case-scenario uses of (some) AVX-512 instructions is video upscaling.

‘Those instructions can provide a substantial boost in applications such as video upscaling.’

As someone who does a lot of video upscaling I am highly likely to buy whichever program and CPU utilise these.

My understanding is that VEAI 3.xx is in the feature-development phase rather than efficiency/optimisation. When the time for optimisation comes, I look forward to learning whether AVX-512 instructions are something that will be considered.
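If they do look into it, the usual approach is to detect support at runtime and only then take the fast path. Here is a rough C sketch of that idea using the GCC/Clang `__builtin_cpu_supports` builtin; it’s purely illustrative, not anything from VEAI’s actual code, and the `upscale_*` functions are made-up placeholders:

```c
/* Purely illustrative: runtime dispatch to the widest available SIMD path.
 * Not VEAI code; the upscale_* functions are hypothetical placeholders.
 * Build with gcc or clang on x86-64. */
#include <stdio.h>

static void upscale_scalar(void) { puts("using scalar path"); }
static void upscale_avx2(void)   { puts("using AVX2 path"); }
static void upscale_avx512(void) { puts("using AVX-512 VNNI path"); }

int main(void)
{
    void (*upscale)(void) = upscale_scalar;

    /* __builtin_cpu_supports checks the CPUID feature flags at runtime. */
    if (__builtin_cpu_supports("avx2"))
        upscale = upscale_avx2;
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vnni"))
        upscale = upscale_avx512;

    upscale();
    return 0;
}
```

That way nothing breaks on CPUs without the instructions; they just fall back to the slower path.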

It seems that Zen 4 CPUs work pretty well with VEAI. :partying_face:

Here is the result from Coreteks:

That’s CPU only; who actually uses CPUs for VEAI?

I did wonder; someone on the forum said that the 12600K was the fastest CPU for VEAI at the time.

Found a good article about this:

“We asked Topaz Labs directly about this and received a confirmation that these programs do indeed use VNNI on Zen 4. And these instructions also, despite the fact that AMD implemented AVX-512 using 256-bit units, clearly have enough performance to make it worthwhile. So these scores are not some weird anomaly and do show a legitimate result – the speed boost is so anomalous because it is a case of accelerating specific operation and not general code performance.”
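For anyone wondering what VNNI actually is: it’s essentially a fused int8 multiply-accumulate, which is the core operation of quantized neural-network inference, i.e. exactly what an AI upscaler spends most of its time doing. Here’s a toy C sketch of the kind of loop it accelerates (my own illustration, not code from VEAI; it assumes the element count is a multiple of 64 and a compiler with AVX-512 VNNI support, e.g. gcc -O2 -mavx512f -mavx512bw -mavx512vnni):

```c
/* Toy example of an int8 dot product with int32 accumulation, the kind of
 * operation AVX-512 VNNI accelerates. Not code from VEAI. */
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* n is assumed to be a multiple of 64 for brevity. */
static int32_t dot_u8s8(const uint8_t *a, const int8_t *b, size_t n)
{
    __m512i acc = _mm512_setzero_si512();
    for (size_t i = 0; i < n; i += 64) {
        __m512i va = _mm512_loadu_si512(a + i); /* 64 unsigned 8-bit values */
        __m512i vb = _mm512_loadu_si512(b + i); /* 64 signed 8-bit values   */
        /* One instruction: 64 multiplies summed into 16 int32 accumulators. */
        acc = _mm512_dpbusd_epi32(acc, va, vb);
    }
    return _mm512_reduce_add_epi32(acc); /* add up the 16 partial sums */
}

int main(void)
{
    uint8_t a[64];
    int8_t  b[64];
    for (int i = 0; i < 64; i++) { a[i] = 1; b[i] = 2; }
    printf("%d\n", dot_u8s8(a, b, 64)); /* prints 128 */
    return 0;
}
```

Without VNNI the same work takes several separate widen/multiply/add instructions per 64 bytes, which is why accelerating this one operation can move the needle far more than a general code speedup would.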

2 Likes

I do a lot of x265 encoding, and originally looked forward to having AVX 512 instructions… until I actually got them. And let me tell you, they’re crap. So much so, even, that Intel has demanded motherboard manufacturers disable them even on existing i9 12900K processors (via microcode, via BIOS updates). And no, that’s not just a commercial decision.

Okay, but why are they crap, you ask? Two words: power consumption. They consume so much power that you need to run them with a negative speed offset in your BIOS to begin with. And even then they are too power costly. There are limits on how much power an Intel CPU can draw at once, as a peak, and continuously (sustained). With AVX 512 on, your CPU will start to throttle ere long. Not because the package is getting too hot, but simply because power limits are being reached. Simple, just bypass the power limits then, right? (Allowed in the better BIOSes.) No, again, as then heat starts to kick in, and the throttling will recur after all. I did extensive testing on my own rig, encoding x265 with --asm avx512 on and off; and, lo and behold, performance drops to ca. 80% when AVX 512 is on!

Tl;dr: AVX 512 is crap. Badly designed, too power hungry, running your CPU too hot, and overall dragging down your entire CPU performance. The fact alone that they have to run with something like a -8 multiplier should have already told them they were on the wrong track from the get-go.

1 Like

I don’t think you are right overall.

Intel still supports AVX-512 instructions outside of the mainstream parts. E-cores cannot fit AVX-512 units, AVX-512 was never validated on 12th gen, and only 12th-gen chips manufactured before 2022 don’t have AVX-512 fused off, so the implementation there was never going to be good.

AVX-512 also consumes less power and produces less heat than AVX2 if you adjust for the work done: it’s ~10% more power, but you’re doing twice as many operations per cycle compared to AVX2. Not that it matters in this case. Not all AVX-512 hardware has the issues you hit on your non-validated CPU.

Zen 4 doesn’t have “actual” AVX-512; it’s double-pumped 256-bit, with four 256-bit units compared to two 512-bit units on Intel CPUs (those that support AVX-512 instructions). There isn’t an AVX offset on Zen 4 either.
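To make the “twice as many operations per cycle” point concrete, here’s a toy comparison of the same loop written with AVX2 and AVX-512 intrinsics (my own illustration, not from VEAI or x265): the 512-bit version touches 16 floats per instruction instead of 8. On Zen 4 each 512-bit op is internally split over the 256-bit units, but you still halve the instruction count and get the extra registers, masking, and VNNI.

```c
/* Toy width comparison, AVX2 vs AVX-512 (not code from VEAI or x265).
 * y[i] += a * x[i]; n assumed to be a multiple of 16 for brevity.
 * Build with e.g. gcc -O2 -mfma -mavx2 -mavx512f. */
#include <immintrin.h>
#include <stddef.h>

void axpy_avx2(float a, const float *x, float *y, size_t n)
{
    __m256 va = _mm256_set1_ps(a);
    for (size_t i = 0; i < n; i += 8) {      /* 8 floats per iteration */
        __m256 vy = _mm256_loadu_ps(y + i);
        vy = _mm256_fmadd_ps(va, _mm256_loadu_ps(x + i), vy);
        _mm256_storeu_ps(y + i, vy);
    }
}

void axpy_avx512(float a, const float *x, float *y, size_t n)
{
    __m512 va = _mm512_set1_ps(a);
    for (size_t i = 0; i < n; i += 16) {     /* 16 floats per iteration */
        __m512 vy = _mm512_loadu_ps(y + i);
        vy = _mm512_fmadd_ps(va, _mm512_loadu_ps(x + i), vy);
        _mm512_storeu_ps(y + i, vy);
    }
}
```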

Oh, I think I am right. :slight_smile:

  1. Naturally, E-Cores don’t have AVX 512, so my tests have always been on P-Cores only (keeping all E-Cores parked; see more on that below),

  2. A parallelized process, such as x265, is always best run on P-Cores only, regardless of AVX 512. If you don’t, what will happen is that all E-Cores get saturated, as they’re much slower, while the P-Cores, as a result, are just idling:

What this means is that a full x265 render will run slower when all cores are enabled! Yes, due to how the parallelization works, such a process can really only be run on the P-Cores alone, lest you sabotage your own video job.

In a server environment, Intel has indeed allowed AVX 512 to persist. A regular consumer CPU, such as the i9 10900K (or 12900K), gets insanely hot as it is, though. Even with hefty watercooling, it’s difficult to keep the i9 12900K below 95C at full load. Perhaps a server farm, with extreme professional cooling solutions, can run AVX 512 at full speed, sustained. VEAI isn’t targeting such a market, though.

We can agree to disagree, but I did specifically say that you aren’t right overall. At best, your observations of the poor AVX-512 implementation on 12th gen are irrelevant here. The thread is about the implications of AVX-512 on Zen 4 for VEAI, and you’re telling people to completely disregard AVX-512 because of your experience of it on the 12900K. I’m just replying to your comment, which will be read by people who don’t know anything about the topic, and they’ll pick up the wrong info if it goes unchallenged. Someone could read what you wrote and spend the rest of their life parroting that AVX-512 is crap, all because you were using it on the 12900K, where it has since been fused off or disabled in the BIOS because the implementation there wasn’t ideal.

It is just not correct to imply that AVX-512 overall is crap, or that it has been disabled on consumer CPUs for any reason other than, basically, market segmentation. The direct cause is the hybrid big/little architecture, which itself came from needing to claw back efficiency while stuck on ancient node sizes. You’ll still see AVX-512 on workstation-class CPUs, so it isn’t just some server-farm accelerator for specific workloads.

Intel took a gamble on dropping AVX-512 support on the consumer side, nothing more, nothing less, and there’s just no need to spread incorrect information that AVX-512 is crap.

3 Likes

I had written a lengthy reply about why I feel AVX 512 is still crap, well, at least on Intel. Reading up on AMD’s implementation of it, though, it looks like theirs is much more efficiently done. So I’ll just resolve to wait and see whether VEAI can use these instructions in a beneficial manner.

1 Like