Thoughts about TVAI and what path the developers should follow IMO

I think it’s time to say it like it is. Improvements in artificial intelligence have grown immensely over the last five years, and the pace is accelerating. The thing with Topaz is that, unless you have literally ideal footage, you are going to get artifacts. The biggest challenge for AI today, in my opinion, is to deliver good results regardless of the input. Generative Fill in Photoshop and Stable Diffusion are great examples of this: you either end up with an impressive result (because the algorithm was trained well enough for it) or a total humanoid or geometric mess. Topaz does a fair job, considering that video is a whole different and much more complicated field.
For the purpose of upscaling a video, I believe first of all that the AI algorithm has to learn what we humans consider a beautiful viewing experience; that is what we are searching for in the first place. Artifacts are simply the result the algorithm is able to deliver given the scope of data it has learned from. But we cannot train AI on every single feature of the visible world, which is why I’m making this post to suggest something I believe will be much simpler and more effective for AI upscaling.

  • Let’s suppose I download a 144p video from YouTube. Yes, it’s an extreme example, but keep reading. I think Topaz should analyze the video’s characteristics and know what kind of improvement can be done to that specific source, considering the final resolution I’m looking for. In this case, I’d urge the developers to build a light upscaling model, similar to what Artemis Anti-Alias does, but combined with the denoising power of Proteus. The AI could identify the edges of the subject and the rest of the scene, then apply intelligent sharpening and a denoise filter that is aware of what is actual noise, not texture. That alone would be great.
  • For high-res inputs, the AI should analyze whether there is enough detail in the scene to restore something that was present in its training database: body parts, objects, textures, geometric figures, etc., and of course LETTERS. The fonts are already out there; we just need training.
  • I highly suggest that the developers take inspiration from the avisynth/vapoursynth filter lists to solve the variety of problems that can come with a video:
    External filters - Avisynth wiki
    Internal filters - Avisynth wiki
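To make the edge-aware denoising idea in the first bullet concrete, here is a minimal sketch in plain NumPy. This is purely illustrative and not how Topaz works internally; the function name, threshold, and the box blur standing in for a real denoiser are all my own assumptions:

```python
import numpy as np

def edge_aware_denoise(img, edge_threshold=0.15, blur_radius=1):
    """Blur only flat regions; leave strong edges untouched.

    img: 2-D float array in [0, 1] (one grayscale frame).
    edge_threshold and blur_radius are illustrative parameters,
    not anything from TVAI.
    """
    # Gradient magnitude via simple finite differences:
    # large values mean "edge/texture", small values mean "flat area".
    gy, gx = np.gradient(img)
    grad = np.hypot(gx, gy)

    # Box blur as a stand-in for a real denoiser.
    k = 2 * blur_radius + 1
    pad = np.pad(img, blur_radius, mode="edge")
    blurred = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            blurred += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    blurred /= k * k

    # Edge mask: 1 near edges (keep the original pixel),
    # 0 in flat areas (use the blurred pixel).
    mask = np.clip(grad / edge_threshold, 0.0, 1.0)
    return mask * img + (1.0 - mask) * blurred
```

On a noisy flat frame this suppresses the noise, while a hard step edge passes through almost untouched, which is exactly the “denoise the noise, not the texture” behavior the bullet asks for.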

Thanks for compiling your thoughts :slight_smile:

From the early days of TVAI, there has been a wish for some automation that somehow manages to analyze the footage and automatically tunes all parameters in order to get “good results”.

Sadly, this is about the hardest thing to achieve, and to a perfect extent impossible (because “good” is subjective)…

Of course, analysis passes can figure out a few things and tune them automatically, but at the current state of development and research, in most cases the results are far from perfect.

That’s not a limitation of Topaz software; it’s rather the current state of development in this field of science. Topaz relies on research others do as a starting point for its products, then refines it, trains models, and puts everything together in a usable program for end users…

So for years, advanced users have realized that TVAI is just another “tool” or “filter” that has capabilities and limits, just like every other knob we can turn in any other software. Every approach to sharpening a picture in any Photoshop-like software has its pros and cons and can only do so much.

This led to the realization that using external filters and combining TVAI with other processing chains/tools/apps is the only way to get better results than simply throwing footage into TVAI.

That’s where all the xsynth filters often appear in workflows, and many of us have combined video-server systems with TVAI in the past. But just as not every avisynth deinterlace filter in existence is equal to the next one, TVAI also only has certain capabilities: it shines in some situations and fails in others…

The wish/proposal to include more filter options or enable avisynth/vapoursynth in TVAI is almost as old as TVAI itself… Years of discussion have led to the most recent versions with ffmpeg as a basis, doing piping and serving, and using TVAI as an ffmpeg plugin… This enables advanced users to tweak TVAI into almost anything one wants: compile your own TVAI ffmpeg, enable filters in the ffmpeg chain, pipe xsynth in/out…
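As a rough sketch of what such a pipeline can look like, assuming VapourSynth’s `vspipe` and an ffmpeg build that exposes the TVAI filters: the script name and the filter string here are placeholders from my own setup, not official syntax, so check the filter list of your particular build before copying anything.

```shell
# Pre-filter with a VapourSynth script (denoise.vpy is a placeholder name),
# stream the result as Y4M into ffmpeg, and upscale there.
# The "tvai_up=..." filter string is an assumption about the TVAI
# filter syntax -- verify it against your own TVAI ffmpeg build.
vspipe -c y4m denoise.vpy - \
  | ffmpeg -i - -vf "tvai_up=model=prob-3:scale=2" -c:v prores_ks out.mov
```

The point is the shape of the chain, not the exact flags: xsynth does the cleanup it is good at, and TVAI sits in the ffmpeg filter graph like any other filter.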

Some of the basic filters (like cropping) did not exist in early versions; Topaz has given us some of the things we used to do externally (add noise, crop, trim, etc.). Whether it is a good idea to include more or not is up for debate, and the result of this discourse will alter the way TVAI is composed over time.

To get back to your initial point about “more analysis of the source”: in general, I second it… But I also realize that this is much more to ask than one would think.
