RTX 5070 AI TOPS VS 4080

So the new RTX 5070 coming out next month has 988 AI TOPS and the RTX 4080 has 780. Will this translate to an advantage for the 5070 in TVAI, or will the lack of CUDA cores limit its performance? I am debating between the 5070 plus a second card after the refresh cycle, or just getting a 5080.

TVAI has never scaled according to what other companies say their products can do with AI. I have no idea what benchmark or program Nvidia ran to get their AI TOPS number, but TVAI is vastly different.

I want to make two assumptions here: one, that whatever metric Nvidia used is the same across all of their cards, and two, since the difference in AI TOPS across my three cards (4080, 4070 Ti and 4070S) does seem to translate roughly to TVAI FPS, maybe it will in the 5000 series as well. However, those three cards also have very different tensor/CUDA ratios compared with the 5000 series. My understanding is that TVAI does leverage tensor cores effectively.
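Just to make that assumption concrete, here's a rough back-of-envelope sketch (Python). The baseline FPS is hypothetical, and the whole thing only holds if TVAI throughput really does scale with advertised AI TOPS, which is exactly what we don't know yet:

```python
# Rough back-of-envelope estimate: scale a measured TVAI FPS by the AI TOPS ratio.
# Assumes (big if) that TVAI throughput scales linearly with advertised AI TOPS
# and that Nvidia measured all cards the same way. Baseline FPS is made up.

ai_tops = {
    "RTX 4080": 780,   # Nvidia's advertised AI TOPS
    "RTX 5070": 988,
}

baseline_card = "RTX 4080"
baseline_fps = 10.0  # hypothetical measured TVAI FPS on the baseline card

for card, tops in ai_tops.items():
    est_fps = baseline_fps * tops / ai_tops[baseline_card]
    print(f"{card}: ~{est_fps:.1f} FPS (estimated from {tops} AI TOPS)")
```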

I agree on that. TVAI is probably the most optimized for Nvidia tensor flow. (I think that's what it's called.)
It’s probably safe to assume the numbers will match up.
The numbers don't seem to match up with Apple's claims of AI speed or power.

Let's hope so, because that would mean a 5070 will absolutely fly in TVAI and the 5080 would be insane.

The AI TOPS of RTX 5XXX are FP4 AI TOPS; the AI TOPS for RTX 4XXX are FP8 AI TOPS. I don't know if TL generates images in FP8, but it clearly doesn't in FP4 at the moment.

Apart from FP4, the AI TOPS are not that different between RTX 4XXX and 5XXX.

RTX 4XXX can't do FP4.

Rumors put the RTX 5090 at roughly 40% faster than the 4090 (the other 5XXX cards are much slower).
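To show what that precision mismatch means for the comparison, here's a tiny sketch of the normalization being implied. It assumes FP4 tensor throughput is roughly double FP8 on cards that support it (my assumption, not something Nvidia or Topaz has confirmed for these exact figures), so the FP4-quoted number gets halved before comparing:

```python
# Rough precision normalization: the 5070's AI TOPS are quoted at FP4, the 4080's
# at FP8. Assuming FP4 throughput is ~2x FP8 (an assumption, not a spec quote),
# halve the FP4 figure to get a very rough FP8-equivalent number.

rtx_5070_fp4_tops = 988   # advertised, quoted at FP4
rtx_4080_fp8_tops = 780   # advertised, quoted at FP8

rtx_5070_fp8_equiv = rtx_5070_fp4_tops / 2  # ~494, assuming FP4 is ~2x FP8 rate

print(f"RTX 5070 FP8-equivalent AI TOPS: ~{rtx_5070_fp8_equiv:.0f}")
print(f"RTX 4080 FP8 AI TOPS:            {rtx_4080_fp8_tops}")
```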

Quote:

For Inference (what we do when using TL software)

  • INT8 is a reliable standard for most practical scenarios
  • 4-bit formats are emerging but require careful testing ←
  • FP16/BF16 still play a role in tasks with tight precision demands

https://www.reddit.com/r/StableDiffusion/comments/1b4x9y8/comparing_fp16_vs_fp8_on_a1111_180_using_sdxl/
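As a small illustration of what those precision levels mean in practice (a generic sketch, not how TVAI actually quantizes its models), here's symmetric INT8 quantization of a weight tensor and the reconstruction error it introduces; 4-bit works the same way but with only 16 levels, which is why it needs the "careful testing":

```python
import numpy as np

# Generic symmetric quantization sketch (not TVAI's actual pipeline):
# map float weights to n-bit integers and back, then measure the error.

def quantize_dequantize(weights: np.ndarray, bits: int) -> np.ndarray:
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for INT8, 7 for 4-bit
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.05, size=10_000).astype(np.float32)

for bits in (8, 4):
    err = np.abs(weights - quantize_dequantize(weights, bits)).mean()
    print(f"{bits}-bit mean abs error: {err:.6f}")
```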



@dakota.wixom ← I added you to get your attention on this (the bit-depth topic) - maybe it's not clear.

Quote from the first link again:
Some of the understanding in it is totally wrong!!!

For Non-Technical Readers: It’s All Around Us!

Think about how we handle precision in everyday life:

  1. Photography:
  • Your phone’s “RAW” photos are huge (20-40MB)
  • Regular JPEGs are much smaller (2-4MB)
  • Can you spot the difference? Usually not!
    (TOTALLY WRONG!!! RAWs are for editing - JPEGs are for viewing.)
    (You need the bit depth to avoid tones ripping apart during editing (posterisation) - see the sketch after this quote.)

  2. Streaming Video:
  • Blu-ray movies are massive (40GB+)
  • Netflix 4K streams use way less data
  • Yet both look fantastic on your TV
  3. Music:
  • Studio recordings: Super high-quality (192kHz/24-bit)
  • Your streaming service: Standard quality (44.1kHz/16-bit)
  • Both sound great because 44.1kHz already exceeds what human ears can detect
    (AGAIN TOTALLY WRONG, SAME AS FOR RAWs - it's for EDITING.)
  4. AI Models Follow the Same Logic:
  • Full precision: Like having a professional camera
  • Reduced precision (4-bit, 8-bit): Like having a really good phone camera
  • For most uses, the difference is hardly noticeable

It’s like buying a car - you don’t need a Formula 1 racer to get groceries. A regular car works perfectly fine and is much more practical. The same goes for AI models - we don’t always need the highest precision to get great results.
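To back up the point about editing headroom (a toy sketch, not tied to any particular editor): apply the same heavy tone edit to a smooth gradient stored at full precision versus one already quantized to 8-bit, and count how many distinct output levels survive. Fewer levels is exactly the banding/posterisation you see when editing a JPEG instead of a RAW:

```python
import numpy as np

# Toy illustration of why bit depth matters for EDITING, not just viewing.
# Edit a smooth tonal ramp in full precision vs. from an 8-bit source,
# then count how many distinct output levels survive (fewer = posterisation).

gradient = np.linspace(0.0, 1.0, 4096)            # a smooth tonal ramp

def edit(x):
    return np.clip(x * 4.0, 0.0, 1.0) ** 0.5      # aggressive exposure + curve

# Edit in full precision, quantize to 8-bit only at the end (like editing a RAW)
high_depth = np.round(edit(gradient) * 255)

# Quantize to 8-bit first (like a JPEG), then apply the same edit
low_depth = np.round(edit(np.round(gradient * 255) / 255) * 255)

print("distinct levels, edited from full precision:", len(np.unique(high_depth)))
print("distinct levels, edited from 8-bit source:  ", len(np.unique(low_depth)))
```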

Well crap, that makes me wonder if the 5070 is going to work at all without an update to TVAI.

The GPU will work.

Thanks