Topaz Video AI v3.0.9

johnnystar · December 24, 2022, 10:15pm

In your examples, it does look like it is trying to reconstruct a face from just a blurry mess of pixels, but just not very well. It’s trying to draw eyes, eye brows, a nose and a mouth, which is odd, because Topaz has stated before that it doesn’t recognize faces at all.

meimeiriver · December 24, 2022, 10:29pm

And that is precisely what I, and others, have been suggesting: if TVAI can’t properly reconstruct a face from such a small area (yet), then… don’t. Just use some bicubic stuff there, or lanczos resize, or whichever is quick and dirty. But don’t create these monstrosities from hell.

youwanwa02 · December 24, 2022, 2:25pm

Hey guys, is there a bug in this program?
I opened Topaz Video AI (3.0.9) today.
The playback speed in Topaz Video AI is wrong: every video I import plays back very slowly.
When I play these videos in Windows Media Player, their playback speed is normal.
What’s going on with Topaz Video AI? I never saw this bug in older versions! It even makes my exports slow too!

This is the example (GIF)

TicoRodriguez · December 25, 2022, 2:26am

TVAI’s encoding settings are recorded in “encoders.json”, which in the case of AV1 (NVIDIA) is as follows.

    "text": "AV1 (NVIDIA)",
    "encoder": "-c:v av1_nvenc -preset p4 -tile-columns 2 -tile-rows 2 -pix_fmt yuv420p",
    "ext": [
      "mkv",
      "mp4"

yuv420p = YUV 4:2:0 8bit

The pixel formats available for AV1 (NVIDIA) are as follows:

ffmpeg -h encoder=av1_nvenc

Encoder av1_nvenc [NVIDIA NVENC av1 encoder]:
    General capabilities: dr1 delay hardware
    Threading capabilities: none
    Supported hardware devices: cuda cuda d3d11va d3d11va
    Supported pixel formats: yuv420p nv12 p010le yuv444p p016le yuv444p16le bgr0 bgra rgb0 rgba x2rgb10le x2bgr10le gbrp gbrp16le cuda d3d11

p010le = NV12 10bit
NV12 = YUV 4:2:0

Maybe if we change the “pix_fmt yuv420p” part to “pix_fmt p010le”, it will be YUV420 10bit.
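
A minimal sketch of that edit, assuming the entry has the three keys quoted above. The rest of the structure of encoders.json is not shown in this thread, so the `entry` dict below is only a stand-in, and `set_pix_fmt` is a hypothetical helper name. It swaps the value that follows `-pix_fmt` and leaves every other flag untouched.

```python
def set_pix_fmt(encoder_args: str, new_fmt: str) -> str:
    """Return encoder_args with the value after -pix_fmt replaced by new_fmt."""
    tokens = encoder_args.split()
    if "-pix_fmt" not in tokens:
        raise ValueError("no -pix_fmt flag found")
    idx = tokens.index("-pix_fmt")
    if idx + 1 >= len(tokens):
        raise ValueError("-pix_fmt has no value")
    tokens[idx + 1] = new_fmt
    return " ".join(tokens)


# Stand-in for the AV1 (NVIDIA) entry quoted above (assumed shape).
entry = {
    "text": "AV1 (NVIDIA)",
    "encoder": "-c:v av1_nvenc -preset p4 -tile-columns 2 -tile-rows 2 -pix_fmt yuv420p",
    "ext": ["mkv", "mp4"],
}

entry["encoder"] = set_pix_fmt(entry["encoder"], "p010le")
print(entry["encoder"])
```

It is probably wise to keep a backup copy of the original file before editing it by hand or by script.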

TicoRodriguez · December 25, 2022, 2:54am

The original resolution is too low.
TVAI’s models only upscale by 2x or 4x. Going from 360p to 2160p is 6x, so everything beyond 4x is just Lanczos.
Also, since 4x can have poorer picture quality than 2x, I think it is safe to use 2x and Lanczos and stay at 1080p.
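
The arithmetic behind this can be sketched as follows. It assumes, as the post says, that the model handles only a 2x or 4x step and that a plain resize covers whatever factor is left. The function name `split_upscale` is made up for illustration.

```python
def split_upscale(src_height, dst_height, ai_factors=(2, 4)):
    """Return the total factor and, per AI factor, (height after AI step, leftover resize factor)."""
    total = dst_height / src_height
    plan = {}
    for factor in ai_factors:
        ai_height = src_height * factor
        plan[factor] = (ai_height, dst_height / ai_height)
    return total, plan


# 360p -> 2160p: 6x in total; a 4x model step reaches 1440p, leaving 1.5x for the resizer.
total_2160, plan_2160 = split_upscale(360, 2160)

# 360p -> 1080p: 3x in total; a 2x model step reaches 720p, leaving 1.5x for the resizer.
total_1080, plan_1080 = split_upscale(360, 1080)

print(total_2160, plan_2160[4])
print(total_1080, plan_1080[2])
```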

meimeiriver · December 25, 2022, 4:20am

Of course it is. But a face this small can occur in 1080p too (given that it’s too far away in the background).

Which is really why I support the idea that TVAI should define a pixel threshold beyond which ‘not to bother.’

Then why is it trying so hard?

What is needed, with faces, is the introduction of some basic sanity checks. When that woman’s face (in the middle image) looks like she fell from 6 stories high, flat on her face, a human immediately gets that these distortions are highly abnormal. In practice, this means Models should be trained against baseline faces too. And even when the restored face may not fully resemble the original face, just make it a bland, generic one, and not something you’d find in a rarity cabinet. Or… simply do nothing at all, beyond trivial upscaling, is what I’m suggesting: if the ‘i’ isn’t there yet to accomplish a successful restoration, just don’t try.

meimeiriver · December 25, 2022, 4:26am

I think that should work.

In VapourSynth, I scale my ProRes 422 HQ back to 10-bit 4:2:0 too:

vid = core.resize.Point(vid, format = vs.YUV420P10)

(My nVidia Shield Pro doesn’t do 444)

TicoRodriguez · December 25, 2022, 6:54am

Unlike facial restoration in still images, facial restoration in video requires continuity with the previous and following frames, so simply porting PhotoAI functions may cause problems in terms of picture continuity. Alternatively, the amount of computation may increase, resulting in a significant decrease in effective speed.

I think the reason there are negative opinions about the face restoration function is that, while new functions such as image stabilization have been added, there have been no significant improvements in scaling and noise reduction, which have always been the core functions.

I fear that development resources will be wasted on features that will not be beneficial.

TicoRodriguez · December 25, 2022, 7:06am

TVAI(3.x.x) seems to be internally processed in rgb48le(R:G:B 16bit), so the processing flow is as follows.

yuv420p(YUV 4:2:0 8bit) → (FFMPEG) → TVAI rgb48le(R:G:B 16bit) → (FFMPEG) → yuv422p10le(YUV 4:2:2 10bit)

Also, VEAI(2.x.x) was internally processed in 8bit, so the processing flow is as follows.

yuv420p(YUV 4:2:0 8bit) → VEAI(R:G:B 8bit) → yuv422p10le(YUV 4:2:2 10bit)

I think 10bit output was meaningless because it was just adding bulk.
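
A rough illustration of why padding 8-bit data into a 10-bit container adds size but not precision. This is a simplification: it ignores the RGB to YUV conversion and any dithering, and bit replication is only one common way to widen a code value, assumed here for the example.

```python
def expand_8_to_10(v8: int) -> int:
    """Widen an 8-bit code value to 10 bits by bit replication (one common convention)."""
    return (v8 << 2) | (v8 >> 6)


# Every possible 8-bit value, widened to 10 bits.
levels_10bit = {expand_8_to_10(v) for v in range(256)}

# Still only 256 distinct levels, although a 10-bit channel can hold 1024.
print(len(levels_10bit), "of", 2 ** 10)
```

With 16-bit internal processing, as described for TVAI 3.x above, the intermediate values are not limited to those 256 levels, so a 10-bit output can carry real extra precision.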

meimeiriver · December 25, 2022, 7:44am

Oh, I agree. So, why is TVAI trying to reconstruct those small faces then? I’d rather it didn’t.

karol.guzik · December 25, 2022, 9:05am

Thank you!
I did as you directed me and now it saves files in 10 bits.
I wonder if it is also possible to replace the proteus3 model with proteus2. I have to check it out.

TicoRodriguez · December 25, 2022, 9:40am

It cannot be done in the GUI, but it can probably be done from the command line.

Process - Show Export Command
FFMPEG encoding settings will appear, so edit them.

ffmpeg "-hide_banner" "-nostdin" "-y" "-nostats" "-i" "input.mkv" "-sws_flags" "spline+accurate_rnd+full_chroma_int" "-color_trc" "1" "-colorspace" "1" "-color_primaries" "1" "-filter_complex" "veai_up=model=prob-3:preblur=0:noise=0:details=0:halo=0:blur=0:compression=0:device=0:vram=0.9:instances=1,scale=out_color_matrix=bt709" "-c:v" "prores_ks" "-profile:v" "1" "-vendor" "apl0" "-bits_per_mb" "8000" "-pix_fmt" "yuv422p10le" "-map_metadata" "0" "-movflags" "frag_keyframe+empty_moov+delay_moov+use_metadata_tags+write_colr " "-map_metadata:s:v" "0:s:v" "-map_metadata:s:a" "0:s:a" "-c:a" "aac" "-b:a" "192k" "-ac" "2" "-metadata" "videoai=Enhanced using prob-3 with recover details at 0, dehalo at 0, reduce noise at 0, sharpen at 0, revert compression at 0, and anti-alias/deblur at 0" "output.mov"

veai_up=model=prob-3 → veai_up=model=prob-2

I haven’t tried it myself.
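
If you would rather script the substitution than edit the command by hand, here is a small untested-against-TVAI sketch. The argument list is a shortened stand-in for the export command above, and `swap_model` is a hypothetical helper. It only touches the `veai_up=model=` part, so the `videoai=` metadata text that also mentions prob-3 is left as it is.

```python
import re
from typing import List


def swap_model(args: List[str], new_model: str) -> List[str]:
    """Replace the model name inside any veai_up=model=... filter argument."""
    pattern = re.compile(r"(veai_up=model=)[^:,]+")
    return [pattern.sub(lambda m: m.group(1) + new_model, arg) for arg in args]


# Shortened stand-in for the exported command's arguments.
args = [
    "-filter_complex",
    "veai_up=model=prob-3:preblur=0:noise=0,scale=out_color_matrix=bt709",
    "-c:v",
    "prores_ks",
    "-metadata",
    "videoai=Enhanced using prob-3",
]

new_args = swap_model(args, "prob-2")
print(new_args[1])
```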

karol.guzik · December 25, 2022, 10:25am

It seems that the parameters in the files are different, and in the original proteus2 as well.
I also tried replacing them.
Also without success.

TicoRodriguez · December 25, 2022, 10:55am

Proteus - Auto(estimate=1) seems to be using prap-3.
If Auto is turned off(estimate=0 or estimate none), it seems that prob-2 or prob-3 can be selected.
If the prob-2 setup worked correctly, you will find “prob-v2-?net-fp16-???x???-?x-ox.tz” downloaded in the models folder.
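
One way to check for such files is a wildcard match on the file names. In this sketch the `?` placeholders from the name above are loosened to `*`, because the exact characters and their lengths are not given here. The sample names are invented for illustration. For a real check you would pass in the listing of your own models folder.

```python
from fnmatch import fnmatchcase

# Loosened form of the name pattern quoted above.
PATTERN = "prob-v2-*net-fp16-*x*-*x-ox.tz"


def proteus2_files(names):
    """Return the names that look like downloaded Proteus v2 model files."""
    return [name for name in names if fnmatchcase(name, PATTERN)]


# Hypothetical file names, for illustration only.
sample = [
    "prob-v2-fnet-fp16-576x672-2x-ox.tz",
    "prob-v3-fnet-fp16-576x672-2x-ox.tz",
    "readme.txt",
]

found = proteus2_files(sample)
print(found)
```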

masashi.sahara · December 25, 2022, 11:16am

There may be an audio problem. I have no sound on AI-processed video in the Windows player.
Once I re-encode it with another video converter, the sound is recovered.

menditsa · December 25, 2022, 11:32am

Use mpv or VLC, not WMP.
The audio is there; WMP just can’t play it.
Re-encoding is not recovering it.

karol.guzik · December 25, 2022, 12:16pm

Yes, the model is downloaded.
But the effects are like in version 3. ;/

roki · December 25, 2022, 4:21pm

It must be a model+GPU dependent situation. I had the poor utilization issue with 3.x, but as of a few versions ago, my GPU is now running at 90% +/- 5%; CPU at 50%.

NVIDIA RTX A4000, Intel Xeon Silver 4216, Windows 11

karol.guzik · December 25, 2022, 4:42pm

The last version of 3.X.X that ran at a similar speed to 2.X.X was alpha 3.0.40a and I’m still keeping it.

ewa.kretowicz · December 25, 2022, 8:36pm

How do you revert back? V3.0.9 doesn’t even work! Thanks in advance : )