Topaz Video AI v3.0.9

In your examples, it does look like it is trying to reconstruct a face from just a blurry mess of pixels, just not very well. It’s trying to draw eyes, eyebrows, a nose and a mouth, which is odd, because Topaz has stated before that it doesn’t recognize faces at all.

2 Likes

And that is precisely what I, and others, have been suggesting: if TVAI can’t properly reconstruct a face from such a small area (yet), then… don’t. Just use some bicubic stuff there, or lanczos resize, or whichever is quick and dirty. But don’t create these monstrosities from hell.

2 Likes

Hey guys, is there a bug in this program?
I opened Topaz Video AI today (3.0.9).
The video playback speed in Topaz Video AI is wrong: every video I import plays back very slowly.
When I play these videos in Windows Media Player, their playback speed is normal.
What’s going on with Topaz Video AI? I never saw this bug in older versions! It even makes my exports slow too!

This is the example (GIF)

TVAI’s encoding settings are recorded in “encoders.json”, which in the case of AV1 (NVIDIA) is as follows.

    "text": "AV1 (NVIDIA)",
    "encoder": "-c:v av1_nvenc -preset p4 -tile-columns 2 -tile-rows 2 -pix_fmt yuv420p",
    "ext": [
      "mkv",
      "mp4"

yuv420p = YUV 4:2:0 8bit

The pixel formats available for AV1 (NVIDIA) are as follows:

ffmpeg -h encoder=av1_nvenc

Encoder av1_nvenc [NVIDIA NVENC av1 encoder]:
    General capabilities: dr1 delay hardware
    Threading capabilities: none
    Supported hardware devices: cuda cuda d3d11va d3d11va
    Supported pixel formats: yuv420p nv12 p010le yuv444p p016le yuv444p16le bgr0 bgra rgb0 rgba x2rgb10le x2bgr10le gbrp gbrp16le cuda d3d11

p010le = NV12 10bit
NV12 = YUV 4:2:0

Maybe if we change the “-pix_fmt yuv420p” part to “-pix_fmt p010le”, the output will be YUV 4:2:0 10bit.
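For anyone who would rather script that edit than do it by hand, here is a minimal Python sketch. It only works on the encoder string quoted above; the `entry` dict is a hypothetical stand-in for one record of “encoders.json” (the overall layout of the real file is an assumption), and `set_pix_fmt` is a helper name made up for this example.

```python
import json

# Hypothetical stand-in for one record of encoders.json, copied from the
# fragment quoted above. The real file's overall layout is an assumption.
entry = {
    "text": "AV1 (NVIDIA)",
    "encoder": "-c:v av1_nvenc -preset p4 -tile-columns 2 -tile-rows 2 -pix_fmt yuv420p",
    "ext": ["mkv", "mp4"],
}


def set_pix_fmt(encoder_args: str, new_fmt: str) -> str:
    """Return encoder_args with the value following -pix_fmt set to new_fmt."""
    tokens = encoder_args.split()
    if "-pix_fmt" in tokens:
        i = tokens.index("-pix_fmt")
        if i + 1 < len(tokens):
            tokens[i + 1] = new_fmt      # replace the existing value
        else:
            tokens.append(new_fmt)       # flag was present but had no value
    else:
        tokens += ["-pix_fmt", new_fmt]  # flag was missing entirely
    return " ".join(tokens)


patched = set_pix_fmt(entry["encoder"], "p010le")
entry["encoder"] = patched
print(json.dumps(entry, indent=2))
```

Back up the original file before changing anything; whether TVAI accepts the edited entry is something to verify on your own install.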

5 Likes

The original resolution is too low.
TVAI’s models only upscale by 2x or 4x. Going from 360p to 2160p is 6x, so everything beyond 4x is just a Lanczos resize.
Also, since 4x can give poorer picture quality than 2x, I think it is safer to use 2x plus Lanczos and stay at 1080p.
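The arithmetic behind that can be sketched in a few lines of Python. This is only an illustration of the reasoning in this post: `plan_upscale` is a made-up helper, and the rule "take the largest model factor that does not overshoot, then resize the rest" is an assumption about how the split works, not something Topaz has documented here.

```python
def plan_upscale(src_height: int, dst_height: int, ai_factors=(2, 4)):
    """Split a total upscale into (AI model factor, leftover resize ratio).

    Assumption: the largest model factor that does not overshoot the target
    is used, and the remainder is a plain (e.g. Lanczos) resize.
    """
    total = dst_height / src_height
    usable = [f for f in ai_factors if f <= total]
    ai = max(usable) if usable else 1   # 1 means no AI upscale fits
    return ai, total / ai


# 360p -> 2160p: 6x in total, 4x from the model, 1.5x from the resize
print(plan_upscale(360, 2160))

# 360p -> 1080p: 3x in total, 2x from the model, 1.5x from the resize
print(plan_upscale(360, 1080))
```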

1 Like

Of course it is. But a face this small can occur in 1080p too (if it’s far enough away in the background).

Which is really why I support the idea that TVAI should define a pixel threshold beyond which ‘not to bother.’

Then why is it trying so hard?

What is needed, with faces, is the introduction of some basic sanity checks. When that woman’s face (in the middle image) looks like she fell from 6 stories high, flat on her face, a human immediately gets that these distortions are highly abnormal. In practice, this means Models should be trained against baseline faces too. And even when the restored face may not fully resemble the original face, just make it a bland, generic one, and not something you’d find in a rarity cabinet. Or… simply do nothing at all, beyond trivial upscaling, is what I’m suggesting: if the ‘i’ isn’t there yet to accomplish a successful restoration, just don’t try.

1 Like

I think that should work. :+1:

In VapourSynth, I scale my ProRes 422 HQ back to 10-bit 4:2:0 too:

vid = core.resize.Point(vid, format = vs.YUV420P10)

(My nVidia Shield Pro doesn’t do 444)

1 Like

Unlike facial restoration in still images, facial restoration in video requires continuity with the previous and following frames, so simply porting PhotoAI functions may cause problems in terms of picture continuity. Alternatively, the amount of computation may increase, resulting in a significant decrease in effective speed.

I think the reason there are negative opinions about the face restoration function is that, while functions such as image stabilization have been added, there have been no significant improvements in scaling and noise reduction, which have always been the core functions.

I fear that development resources will be wasted on features that will not be beneficial.

3 Likes

TVAI (3.x.x) seems to process internally in rgb48le (R:G:B 16bit), so the processing flow is as follows.

yuv420p(YUV 4:2:0 8bit) → (FFMPEG) → TVAI rgb48le(R:G:B 16bit) → (FFMPEG) → yuv422p10le(YUV 4:2:2 10bit)

VEAI (2.x.x), on the other hand, processed internally in 8bit, so its processing flow was as follows.

yuv420p(YUV 4:2:0 8bit) → VEAI(R:G:B 8bit) → yuv422p10le(YUV 4:2:2 10bit)

With VEAI, I think 10bit output was meaningless: the data had already passed through 8bit, so the extra bits just added bulk.
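A tiny Python sketch of why that is. It is a deliberate simplification: it expands 8bit values to 10bit with a plain bit shift and ignores the RGB conversion and chroma resampling that FFMPEG actually performs, so treat it as an illustration of the idea only.

```python
# Every possible 8bit sample value.
levels_8bit = range(256)

# Expand to 10bit with a plain left shift (assumed here for simplicity;
# the real conversion in FFMPEG may scale differently).
as_10bit = [v << 2 for v in levels_8bit]

distinct = len(set(as_10bit))
print(f"distinct 10bit levels actually used: {distinct} of 1024")
print(f"highest value reached: {max(as_10bit)} of 1023")
```

The container is 10bit, but only the steps that existed in 8bit are ever used, which is the "bulk" described above.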

2 Likes

Oh, I agree. :slight_smile: So, why is TVAI trying to reconstruct those small faces then? I’d rather it didn’t.

Thank you!
I did as you directed me and now it saves files in 10 bits.
I wonder if it is also possible to replace the proteus3 model with proteus2. I have to check it out.

1 Like

This cannot be done in the GUI, but it can probably be done from the command line (CUI).

Go to Process - Show Export Command.
The FFMPEG encoding settings will appear, so edit them.

ffmpeg "-hide_banner" "-nostdin" "-y" "-nostats" "-i" "input.mkv" "-sws_flags" "spline+accurate_rnd+full_chroma_int" "-color_trc" "1" "-colorspace" "1" "-color_primaries" "1" "-filter_complex" "veai_up=model=prob-3:preblur=0:noise=0:details=0:halo=0:blur=0:compression=0:device=0:vram=0.9:instances=1,scale=out_color_matrix=bt709" "-c:v" "prores_ks" "-profile:v" "1" "-vendor" "apl0" "-bits_per_mb" "8000" "-pix_fmt" "yuv422p10le" "-map_metadata" "0" "-movflags" "frag_keyframe+empty_moov+delay_moov+use_metadata_tags+write_colr " "-map_metadata:s:v" "0:s:v" "-map_metadata:s:a" "0:s:a" "-c:a" "aac" "-b:a" "192k" "-ac" "2" "-metadata" "videoai=Enhanced using prob-3 with recover details at 0, dehalo at 0, reduce noise at 0, sharpen at 0, revert compression at 0, and anti-alias/deblur at 0" "output.mov"

veai_up=model=prob-3 → veai_up=model=prob-2

I haven’t tried it myself.
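If you want to do that substitution on the copied command without hunting through it by hand, a minimal Python sketch follows. `swap_model` is a name invented for this example, and it only rewrites text: it says nothing about whether TVAI will actually run prob-2 with the result.

```python
import re


def swap_model(filter_complex: str, new_model: str) -> str:
    """Replace the model name in the first veai_up=model=... occurrence."""
    return re.sub(
        r"(veai_up=model=)[^:,]+",           # model name ends at ':' or ','
        lambda m: m.group(1) + new_model,    # keep the prefix, swap the name
        filter_complex,
        count=1,
    )


# The -filter_complex value from the export command shown above.
original = (
    "veai_up=model=prob-3:preblur=0:noise=0:details=0:halo=0:blur=0:"
    "compression=0:device=0:vram=0.9:instances=1,scale=out_color_matrix=bt709"
)

print(swap_model(original, "prob-2"))
```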

It seems that the parameters in the files are different from those in the original proteus2.
I also tried replacing them,
also without success.

Proteus - Auto (estimate=1) seems to be using prap-3.
If Auto is turned off (estimate=0, or no estimate parameter), it seems that prob-2 or prob-3 can be selected.
If the prob-2 setup worked correctly, you will find “prob-v2-?net-fp16-???x???-?x-ox.tz” downloaded in the models folder.
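A quick way to check for such a file is a wildcard match. The Python sketch below assumes the “?” characters in that file name stand for single unknown characters; `find_prob2_models` is a made-up helper and the file names in the example list are hypothetical, not taken from a real models folder.

```python
from fnmatch import fnmatchcase

# Pattern from the post above; '?' is assumed to mean "any one character".
PATTERN = "prob-v2-?net-fp16-???x???-?x-ox.tz"


def find_prob2_models(filenames):
    """Return the names that match the prob-2 model file pattern."""
    return [name for name in filenames if fnmatchcase(name, PATTERN)]


# Hypothetical directory listing, for illustration only.
listing = [
    "prob-v2-fnet-fp16-480x384-2x-ox.tz",
    "prob-v3-fnet-fp16-480x384-2x-ox.tz",
    "notes.txt",
]

print(find_prob2_models(listing))
```

To run it against a real folder, pass in `os.listdir(...)` for wherever your install keeps its models.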

1 Like

There may be an audio problem. I have no sound on AI-processed video in the Windows player.
Once I re-encode it with another video converter, the sound is recovered.

Use mpv or VLC, not WMP.
The audio is there; WMP just can’t play it.
Re-encoding is not recovering it. :wink:

1 Like

Yes, the model is downloaded.
But the effects are like in version 3. ;/

2 Likes

It must be a model+GPU dependent situation. I had the poor utilization issue with 3.x, but as of a few versions ago, my GPU is now running at 90% +/- 5%; CPU at 50%.

NVIDIA RTX A4000, Intel Xeon Silver 4216, Windows 11

The last version of 3.X.X that ran at a similar speed to 2.X.X was alpha 3.0.40a, and I’m still keeping it. :stuck_out_tongue:

How do you revert back? V3.0.9 doesn’t even work! Thanks in advance : )