Manual seems better with very poor videos

I have tested poor videos over and over again with Auto and Relative to Auto, but the best result I get is by running the estimate at the worst part of the video (mostly the beginning) and then keeping it on Manual.

Close-ups are always fine, but for some reason re-estimating every 8 frames is not working well at all; keeping it on Manual seems to avoid the context switching and analyzes each frame better.

That is for old, very bad VHS MP4 videos.

Apparently, context switching has a price.

My preferred scheme for poor videos is Proteus manual/estimate at the worst point in the source + Artemis default. This seems to work great as long as the source is not interlaced. If it is, I need to do double runs (deinterlace + Proteus, followed by Artemis HQ).

I wonder if asking Topaz for a third enhancement step would break something. :grinning:

A third enhancement would definitely not make it faster; maybe when the Nvidia 5000 cards are released.

It would be faster in that it would be automated. There would be no extra time spent waiting for the first two passes to finish before setting up the third pass…

That’s the question. Would a three-stage enhancement be faster than a two-stage followed by a single? A two-stage doesn’t take twice as long as a single, so I think it probably would be. Assuming it doesn’t crash the app altogether.

1 Like

I’ve concluded that as well. I’ve found no documentation stating what the “auto” feature uses as source when trying to decide the “slider” values (running a set of frames through their hyper-parameter estimation model), but it’s an unsolvable problem to find a global optimum without searching the entire parameter space. So regardless of approach (beginning of clip, middle of clip or any other choice), the probability of any chosen segment of frames being representative of the overall clip distortion is slim to none.

What I would love to know is whether they’re using an online update of the model parameters, such as once every N frame sets or similar. That would be a working solution.

Anyone from @TopazLabs who can help demystify a bit how the Auto feature works, so we know whether we’re doing something really stupid by wasting time finding poor segments manually, or if the auto heuristic is a zero-shot inference used only once when video processing begins, as hypothesized above?

Auto is similar to clicking Estimate in the Manual mode every 8 frames or so.
I’ll try to look for where that was stated. Not sure if I’ll find it. It was back when TVAI 3 was launched.

2 Likes

That would explain why I’m never happy with the results from Auto. I could understand it happening at scene changes, but I wouldn’t want to see settings change in the middle of a scene.

For me, it is useless because you cannot force it to not change Anti-Alias/Deblur. In Relative to Auto, I think you can disable the other sliders by setting them to -100, but I need Anti-Alias/Deblur to hover within 5 points of -80, and there’s no way to do that. It always estimates to something like 30.

This is really interesting. If you do find the mention, please share it here.
It’s a bit disconcerting that it doesn’t evaluate all hyperparameters (like deblur). I wonder if there are any others it silently ignores. It would explain why “Relative to Auto” is so hit and miss.

As for gene’s comment: why would you want per-scene settings rather than a rolling update?
The only reason I can think of is if there are some artistic choices in a scene, such as changing the focus or similar, which would indeed throw any model off (fighting against the artistic vision).
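
By a rolling update I just mean something like an exponential moving average over the successive estimates, so the settings drift smoothly instead of jumping. A toy sketch (the alpha and the six-value layout are my own assumptions, nothing Topaz has published):

import numpy as np

def rolling_update(estimates: np.ndarray, alpha: float = 0.05) -> np.ndarray:
    """Each new per-frame estimate nudges the running settings instead of
    replacing them outright, so there are no hard jumps mid-scene."""
    smoothed = np.empty_like(estimates, dtype=float)
    current = estimates[0].astype(float)
    for i, est in enumerate(estimates):
        current = (1.0 - alpha) * current + alpha * est
        smoothed[i] = current
    return smoothed

# estimates: shape (n_frames, 6), one row per estimate from the model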

No. My problem with it is that it does evaluate all parameters. I need it to ignore some.
Even worse, Nyx 2 has some parameters that are hidden by not having a slider, but they still get evaluated and applied when in Auto and Relative to Auto mode.

I don’t think I want any changes happening during the run of a video. I might be ok with scene-by-scene changes if I had the option to tweak them scene-by-scene. But I definitely do not want settings to change mid-scene.

Real world analogy: imagine trying to shoot a movie of a sunset where the camera is automatically adjusting exposure settings in response to the changing light level.

Some really interesting findings from piecing together what you mentioned in this thread, along with some tidbits that Topaz engineers have let slip in the past about some of the tvai filter parameters.

  1. It seems the “estimate” button triggers an ffmpeg run using the “tvai_pe” filter.
  2. “tvai_pe” in turn uses the prap-3 model to perform hyper-parameter estimation, or in layman’s terms, to figure out the right slider values to set in the GUI when clicking Estimate.
  3. The tvai_pe filter can be run for the entire clip, but Topaz people have mentioned in the past that 20 frames were used for estimation, so it’s clearly some subset of all frames being used. It’s also unclear what they meant by “20 frames”: whether that was for the whole clip, for every N-frame interval, or some recurring sampling (still to be clarified).
  4. When producing a sample clip (1) and then running the tvai_pe filter on the first 20 frames, the metrics in (2) are produced.
  5. Looking at the summary statistics of those values, we get (3).
  6. Opening the clip in the TVAI UI, selecting Proteus v3 > Manual and clicking Estimate at the beginning of the clip produces the following slider values (4).
  7. Referring back to the summary statistics of the tvai_pe output, we see that the 75th percentile almost exactly matches the slider values. As such I was able to name the columns, and we now understand which slider each of the six “parameter” columns corresponds to in the output arrays of tvai_pe.

Implications

Now, this opens up a lot of interesting possibilities.

  • It means we can run the tvai_pe model through the entire clip and get a holistic understanding of the predicted nature of our artifacts.
  • With our knowledge of where scenes start and end, or using our own custom scene detectors, we can configure different settings for different scenes, such as taking a sample at the start of each scene and locking the values for that segment, like gene wanted to achieve.
  • We can also write our own functions and apply any scaling to any of the values.
  • We can decide to ignore certain values by freezing them, just like you wanted, FS.
  • To apply different settings under our control we’d likely need to chunk our clip, say one chunk per second, or per 10 seconds or the like. The precision is fully under our control (a rough sketch follows this list).
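
As a rough sketch of the last few points (the column names, frame rate, chunk size and clamp range below are my own assumptions, not anything Topaz defines):

import pandas as pd

# df: one row per frame from tvai_pe, columns named as in (5) below,
# values in the model's -1..1 range.
FPS, CHUNK_SECONDS = 25, 10

def chunk_settings(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse per-frame estimates into one settings row per chunk."""
    chunk_id = df.index // (FPS * CHUNK_SECONDS)
    settings = df.groupby(chunk_id).median()

    # Freeze a parameter entirely (ignore the model's opinion)...
    settings['sharpen'] = 0.0
    # ...or clamp one to a narrow band, e.g. keep deblur near -0.80
    # (the "within 5 points of -80" requirement mentioned earlier).
    settings['deblur'] = settings['deblur'].clip(-0.85, -0.75)
    return settings

# per_chunk = chunk_settings(df)   # one row of slider values per chunk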

The optimal way would be to write an ffmpeg filter specifically designed to take an instruction file. It would then be added to the filter graph just before the tvai filters, or between “tvai_pe” and the enhancement filters.

This is really cool. I’ll have to play with these possibilities a bit. Awesome that Topaz has created such a modular architecture, since that’s precisely what we need to create our own pipelines for our own respective needs. Just sad they’ve not documented it though.

References:

(1) Sample clip generation

ffmpeg -f lavfi -i testsrc=duration=12:size=320x180:rate=15 -pix_fmt yuv420p -c:v libx264 sample.mp4

(2) Defect correction estimation using the ffmpeg tvai_pe filter and prap-3 model

ffmpeg -v error -h filter=tvai_pe
  Filter tvai_pe
    Apply Topaz Video AI parameter estimation models.
      Inputs:
         #0: default (video)
      Outputs:
         #0: default (video)
  tvai_pe AVOptions:
     model             <string>     ..FV....... Model short name (default "prap-3")
     device            <int>        ..FV....... Device index (Auto: -2, CPU: -1, GPU0: 0, ...) (from -2 to 8) (default -2)
     download          <int>        ..FV....... Enable model downloading (from 0 to 1) (default 1)

  This filter has support for timeline through the 'enable' option.


ffmpeg -v error -i sample.mp4 -vf tvai_pe -vframes 20 -f null -
  Parameter values:[-0.163206 ,0.0181217 ,0.229532 ,0.427336 ,0.0986777 ,0.266223 , ]
  Parameter values:[-0.166424 ,0.0185162 ,0.231425 ,0.419165 ,0.0973274 ,0.278938 , ]
  Parameter values:[-0.166728 ,0.0185982 ,0.233558 ,0.409359 ,0.0961565 ,0.274892 , ]
  Parameter values:[-0.168796 ,0.019691 ,0.240075 ,0.406629 ,0.0975815 ,0.289282 , ]
  Parameter values:[-0.169663 ,0.0201042 ,0.238102 ,0.397581 ,0.0972078 ,0.288823 , ]
  Parameter values:[-0.17035 ,0.0201033 ,0.241191 ,0.395169 ,0.0981346 ,0.291275 , ]
  Parameter values:[-0.167102 ,0.0200989 ,0.239995 ,0.394212 ,0.099988 ,0.290737 , ]
  Parameter values:[-0.165109 ,0.0198792 ,0.236747 ,0.395309 ,0.0999681 ,0.287588 , ]
  Parameter values:[-0.164466 ,0.0195244 ,0.235611 ,0.393794 ,0.0993702 ,0.280433 , ]
  Parameter values:[-0.163594 ,0.0192311 ,0.236228 ,0.396524 ,0.0995546 ,0.273717 , ]
  Parameter values:[-0.168258 ,0.0195393 ,0.234674 ,0.391004 ,0.0963783 ,0.271823 , ]
  Parameter values:[-0.173479 ,0.0186817 ,0.236368 ,0.401487 ,0.0944276 ,0.276925 , ]
  Parameter values:[-0.169703 ,0.0187057 ,0.235371 ,0.396484 ,0.0951625 ,0.271464 , ]
  Parameter values:[-0.169483 ,0.0186276 ,0.237225 ,0.404117 ,0.0957704 ,0.267977 , ]
  Parameter values:[-0.154646 ,0.0191898 ,0.236846 ,0.390147 ,0.0961715 ,0.2302 , ]
  Parameter values:[-0.162518 ,0.0175994 ,0.235073 ,0.39477 ,0.094886 ,0.242646 , ]
  Parameter values:[-0.159807 ,0.017557 ,0.231505 ,0.38917 ,0.0952248 ,0.238809 , ]
  Parameter values:[-0.160943 ,0.0164608 ,0.231884 ,0.399115 ,0.0962811 ,0.242576 , ]
  Parameter values:[-0.157057 ,0.0161958 ,0.232442 ,0.397282 ,0.0978506 ,0.240852 , ]
  Parameter values:[-0.157441 ,0.0157485 ,0.234216 ,0.399434 ,0.0992506 ,0.238351 , ]
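
The DataFrame used in (3) and (5) below was built by parsing those “Parameter values” lines; a minimal sketch in Python/pandas, assuming the ffmpeg output above was redirected to pe_output.txt:

import re
import pandas as pd

rows = []
with open('pe_output.txt') as fh:          # the ffmpeg output shown above
    for line in fh:
        m = re.search(r'Parameter values:\[(.*)\]', line)
        if m:
            rows.append([float(v) for v in m.group(1).split(',') if v.strip()])

df = pd.DataFrame(rows)                    # one row per sampled frame, six values each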

(3) Summary statistics of defect estimates

df.describe()

               0          1          2          3          4          5
count  20.000000  20.000000  20.000000  20.000000  20.000000  20.000000
mean   -0.164939   0.018609   0.235403   0.399904   0.097268   0.267177
std     0.005033   0.001316   0.003117   0.009580   0.001781   0.020475
min    -0.173479   0.015748   0.229532   0.389170   0.094428   0.230200
25%    -0.168968   0.017991   0.233279   0.394630   0.096060   0.242628
50%    -0.165766   0.018694   0.235491   0.396903   0.097268   0.272770
75%    -0.162124   0.019577   0.236941   0.402145   0.098821   0.282222
max    -0.154646   0.020104   0.241191   0.427336   0.099988   0.291275

(4) TVAI Estimate run in GUI

(5) tvai_pe dimension identification (column labeling)

df.columns=['deblur', 'reduce_noise', 'improve_detail', 'dehalo', 'sharpen', 'revert_compression']
df.describe()

          deblur  reduce_noise  improve_detail     dehalo    sharpen  revert_compression
count  20.000000     20.000000       20.000000  20.000000  20.000000           20.000000
mean   -0.164939      0.018609        0.235403   0.399904   0.097268            0.267177
std     0.005033      0.001316        0.003117   0.009580   0.001781            0.020475
min    -0.173479      0.015748        0.229532   0.389170   0.094428            0.230200
25%    -0.168968      0.017991        0.233279   0.394630   0.096060            0.242628
50%    -0.165766      0.018694        0.235491   0.396903   0.097268            0.272770
75%    -0.162124      0.019577        0.236941   0.402145   0.098821            0.282222
max    -0.154646      0.020104        0.241191   0.427336   0.099988            0.291275

Note: Compare the mean, or even the 75th percentile values, with the GUI slider values in (4). Just multiply by 100: values in the GUI are percentages, while the model produces a fraction in the -1 to 1 range, as is common in ML.
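
For example, in pandas the conversion back to GUI-style numbers is just a scaling (a throwaway one-liner):

(df.quantile(0.75) * 100).round(1)   # compare against the sliders in (4)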

4 Likes

Yeah, that’s a variant of my “artistic focus adjustment” effect; stuff changing the global nature of the image either intentionally or because that’s the phenomenon captured. An ML model isn’t able to discern between intentional and non-intentional. It could be trained on natural events like sunsets though, but that would be a huge effort, and we’d not be willing to pay in cycle time (wait time) while such a model would run through the millions of “is this a sunset event? A space shuttle launch? … else, fix it” sort of decision making. Thanks for clarifying, I see your point.

Yes, Auto used to be 20 frames, but Topaz indeed changed/improved this to 8 frames somewhere in the 3.x series.

I learned about the 20 frames from Ina, back in the days when I thought uploading log files was helpful.

So it is not documented, and it raises the question: if a video is old and bad, the whole digitized VHS MP4 will be significantly bad, and auto-tuning is the least of the problems. Not to mention how smart it is to cut in at just the right moment: if the MP4 is 50 frames per second, that is 50 divided by 8, roughly six estimates every second; chances are they will be fine maybe 85 percent of the time but wrong 10 percent of the time, which is still a high number compared to a very conservative manual setting that just wants to clean up and bring vague figures back to life without glossy iris eyes or artificial faces.

At the moment I am trying a combination of Proteus 4 at 2X and Nyx 2, both first using the Estimate button that does NOT exist (but in Manual I can click on it, and hey, it has 5 sliders), and then setting it to Relative to Auto.

Is there a drawback? Yes. Apart from the fact that the result is promising, it is soooooo slow if I copy the temp files, even with an RTX 4080 and a 5 GHz Intel CPU.

Maybe it is worth the wait, maybe not.

I am finding that with Proteus V4 the Reduce Noise slider does a good enough job that I’m only using Nyx if noise reduction is the only enhancement I want.

I think I’m giving up on Auto and Relative to Auto, at least for Proteus.

1 Like

I’m not ready to figure out how to make an ffmpeg filter. I’m willing to do the work to make a script that will run that estimation filter and modify the values as I want. They just need to expose a way for me to use those values as an override in the tvai_up filter.

Yes, since they’re building their own version of ffmpeg and shipping it with each update, it’d be a pain having to re-compile the ffmpeg glue each time they ship a new version. One option would be to get it accepted upstream, but that’s an order of magnitude more painful, since ffmpeg core devs aren’t keen on adding just any vendor-specific code. Topaz is still a niche player, and a filter like this would be super specialized for not only this niche product, but also the way Topaz currently communicates between tvai_pe and tvai_up. The only real solution would be for Topaz to either provide such an override filter for us, or come up with some way to allow dynamic overrides from an external source.

Like you, I’m not going to spend the time to create a filter that is basically a reverse engineered protocol implementation between two proprietary filters, filters for which there are currently zero guarantees regarding interface stability.

EDIT: That’s why I mentioned the likely need to segment the clip, since most pipelines already do that today to speed up the overall process in our transcoding pipelines. A known problem for which we have known solutions. “Split & Stitch”
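
For reference, stock ffmpeg can already do both the split and the stitch; a sketch with made-up chunk names and a 10-second segment length:

ffmpeg -i input.mp4 -c copy -f segment -segment_time 10 -reset_timestamps 1 chunk_%03d.mp4

# ...enhance each chunk with its own settings, producing enhanced_NNN.mp4 files, then:
printf "file '%s'\n" enhanced_*.mp4 > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy stitched.mp4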

3 Likes