How does Topaz Video AI work? (particularly in relation to ffmpeg)

Hi all,

I know there are proprietary things that can’t be shared, but I have trouble explaining how this product works to folks even at a high level. I was wondering if someone could explain it, particularly in relation to ffmpeg.

My understanding:

  • ffmpeg is used to decode video
  • individual frames or a set of video frames are processed by Topaz Video AI’s secret sauce (we don’t need to know this :wink:)
  • ffmpeg is used for resizing, upscaling, and encoding into the desired output formats

Does that sound about right?
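For illustration, the three steps above can be sketched with stock ffmpeg, using the ordinary `scale` filter as a stand-in for the proprietary enhancement step. This is only a sketch of the pipeline shape; TVAI ships its own modified ffmpeg with internal filters that are not public.

```shell
# Decode -> filter -> encode: the same pipeline shape TVAI uses.
# testsrc generates a 1-second synthetic clip so the command is
# self-contained; "scale" stands in for the proprietary AI filter.
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=10 \
  -vf "scale=640:480:flags=lanczos" \
  -c:v libx264 -crf 18 out_demo.mp4
```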

Use of NVENC

Also, per this thread, it appears that NVENC ffmpeg integration is not used due to:

  • no need for the speed as the frames aren’t rendered that fast by the secret sauce
  • lack of quality

It’s enabled as a complex filter.
As I understand it, it processes in the RGB48le colorspace.
Currently NVENC is the only way to output to H.265 or H.264.
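You can poke at what an RGB48le intermediate means using stock ffmpeg. This is just an illustration of the pixel format, not TVAI’s internal pipeline:

```shell
# Force a synthetic clip through the rgb48le pixel format (16 bits per
# channel), then dump one frame as a 16-bit PNG (PNG stores rgb48be):
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=5 \
  -vf "format=rgb48le" -frames:v 1 -pix_fmt rgb48be frame16.png
```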

(I see you read my post about Topaz Video Enhance AI 2.6.4. That’s different from Topaz Video AI.)


Not really: the quality of NVENC is indistinguishable to the naked eye, unless you do a paranoid frame-by-frame, zoomed-in comparison against CPU encoding (as many encoding radicals do). (Note that QSV is not CPU encoding; it’s Intel’s hardware encoder, while CPU encoding means software encoders like libx265.) In my experience, NVENC output also plays back better: the player doesn’t freeze when scrubbing through the timeline, and its quality is indistinguishable at a glance.


You must not have seen libx265 encoding yet. In general, NVENC and QSV are blurry compared to it, though they are passable if you’re not trying to enhance the quality of your video.
Even if you couldn’t visually tell the difference, libx265 can create smaller files at the same visual quality (depending on how long you are willing to let the compression take), and it does not suffer from being overloaded when there are too many details to fit into a set bitrate.


I don’t know why you say that… Have you used the right parameters and presets?

That statement comes from several tests. Again, even if it is possible to get them to match the visual quality, the files come out much bigger than libx265’s. Here’s a very helpful chart of just H.264 vs. H.265 from the libx implementations in ffmpeg. I usually use H.265 at preset Slower with CRF 22; from that chart, you can see that that’s one of the larger file-size outputs. To keep all the grain in a Blu-ray source, I need to lower the CRF to about 18.
All that being said, it is hard to figure out how those parameters compare to the hardware implementations’ parameters, mainly because they don’t map directly. The best I could do was pick a movie with a lot going on and, using a scene with a lot of tiny detailed motion, compare the original against libx265 against NVENC, changing the settings until I could no longer see any differences. I never found a setting that passed on NVENC, but that was years ago with a GTX 1060. Maybe they’ve improved the implementation on newer cards.
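For concreteness, the libx265 settings mentioned above look like this in plain ffmpeg (testsrc stands in for a real input file):

```shell
# Preset "slower" at CRF 22 - the settings mentioned above; lower the CRF
# (e.g. to 18) to retain heavy film grain at the cost of a bigger file:
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=10 \
  -c:v libx265 -preset slower -crf 22 -pix_fmt yuv420p out_crf22.mp4
```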


At this point in 2023, storage is not an issue for most users. Personally, I don’t care if NVENC makes the files bigger; I have plenty of storage.

I refer you to what I could find about the subject on Reddit and SuperUser.
The Reddit comments do not take into account the B-frame support of the RTX 4000-series GPUs, but the SuperUser one does.
Personally, I don’t care about how big the file is… but direct-streaming that file at source quality through Plex does. Since that is my main use case, I opt for the smallest file size at the best visual quality.

You could have specified that your particular use case was for the web and saved us the discussion lmao. I prefer VP9 for the web because of its compatibility, ease of decoding, and good quality-to-size ratio.

Wow awesome discussion and links about encoding efficiency / efficacy. Thanks :slight_smile:

Not web. It’s accessed through a Roku player.
I feel like all the points are still valid. I can find no tests or proof that it is possible to get visually lossless results from hardware encoding.
People seem not to care because it’s for streaming, or they assume it is visually lossless and tests are not needed. There seems to be a special case with the B-frame support in the NVENC included in RTX 4000-series cards (might be the 3000 series too). People have good things to say about it, but it’s never clear whether that’s in the “don’t care because it’s for streaming” context or not. Again: I have found no tests.

It would be a shame if people reading this topic got the impression that hardware encoding was good enough for enhanced videos when it’s not. Or worse: if it is good enough, but only on newer hardware that they don’t have, and we have not made that clear.

What about doing what I do? Export it as lossless FFV1 in TVAI, then run it through Handbrake (or your other favorite tool) with libx265…
Best of both worlds
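In plain ffmpeg terms, that two-step workflow looks roughly like this (testsrc stands in for TVAI’s enhanced output; Handbrake’s x265 presets boil down to the same libx265 options):

```shell
# Step 1: lossless FFV1 intermediate (what the TVAI FFV1 export produces):
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=10 \
  -c:v ffv1 -level 3 intermediate.mkv

# Step 2: compress the intermediate with libx265 for the final file:
ffmpeg -y -i intermediate.mkv \
  -c:v libx265 -preset medium -crf 22 -pix_fmt yuv420p final.mp4
```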


That is the only approach I am confident recommending at this time: TVAI to a lossless codec, then to something more usable. I go PNG to libx265, but that’s because of how I like to do frame interpolation, and because the aforementioned RGB48le colorspace that TVAI processes in matches PNG’s 16-bit color format.
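A sketch of that PNG route with stock ffmpeg (testsrc stands in for a TVAI PNG export; 16-bit PNG, i.e. rgb48be, preserves the RGB48le precision):

```shell
# Render a short sequence of 16-bit PNGs:
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=24 \
  -frames:v 24 -pix_fmt rgb48be frame_%05d.png

# Assemble the sequence into a 10-bit H.265 file:
ffmpeg -y -framerate 24 -i frame_%05d.png \
  -c:v libx265 -preset medium -crf 18 -pix_fmt yuv420p10le out_seq.mp4
```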

If anyone really wants to get into hardware encoding, AV1 seems to be the best of all worlds, except the playback world. It might take a few more decades to get decoders into the hands of most people.


I personally use ProRes 422 HQ* to H.265 (libx265)** and really like the results :sunglasses:

*TVAI to ProRes 422 HQ on my PC with an RTX 4090
**ProRes 422 HQ to H.265 (libx265) on my MacBook Pro (M1 Max) using Shutter Encoder

libx265 is fast as hell (300 MB/s+ over 10GbE to my NAS) and efficient on the M1 Max: full speed even on battery, with minimal heat, even though libx265 is a software encoder rather than one of the M1’s dedicated hardware encoders.
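If anyone wants to reproduce the first leg of this with stock ffmpeg, ProRes 422 HQ corresponds to profile 3 of the prores_ks encoder (a sketch with a synthetic input standing in for TVAI’s export):

```shell
# ProRes 422 HQ intermediate (prores_ks profile 3, 10-bit 4:2:2):
ffmpeg -y -f lavfi -i testsrc=duration=1:size=320x240:rate=10 \
  -c:v prores_ks -profile:v 3 -pix_fmt yuv422p10le intermediate.mov
```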

I tried AV1, but I too have problems playing it, and the only rig that can encode it in hardware is my 4090 rig… which I need for TVAI :laughing: