Sampling better quality video/photo for detailed upscale

Hello everybody,

I am not sure if this topic has been discussed already, but from a scroll through the list I couldn’t find it.

I guess this is somewhat of a niche case, but lately I had a project where I wanted to join video material that was recorded at the same time but with different devices, and therefore with very different video quality. My goal was to make the differences as unnoticeable as possible, which was a bit of a nightmare, since TVAI produced a whole plethora of different artifacts and looks for each source. Unfortunately, in the end I had no choice but to trim detail and quality from the best source to match the worst one.

While I was working to match a similar look, I thought this could actually be a perfect job for the AI itself. I would hope a model like this would be able to not only match the scenes but also use the better sources as a reference for the bad ones. Besides video, I would also love an opportunity to include pictures in the source mix, since most older photos are way more detailed than their video counterparts from the same era.

What you are talking about is training a new model on your own source materials. Topaz ain’t doing that, at least not anytime soon.

Mmhh… I have to admit the only AI product I have come into contact with so far is TVAI, so I don’t really know what is technically possible. I would have thought this kind of process calls for a very different model, one that incorporates features similar to something like Midjourney. But now that I think a bit more about it, I guess in that case it also couldn’t be run locally, at minimum. Not only that, it sounds basically impossible, since one can’t train a model with just 2 samples that are also only similar but not 1:1 comparable, right?

My AI model knowledge is also very limited, but I do know that, the way things are, models are normally trained on massive datasets. Since you want to improve a very specific image, the dataset could probably be a lot smaller, but I doubt it could be only one image.

Of course we are still very early in AI, and who knows in a few years it might be possible to feed in 2 images and tell the AI to improve 1 based on the other.

Thanks for the Midjourney mention btw, it got me googling it and it’s very interesting!

I guess I watched too many videos from Two Minute Papers on Youtube. He has already presented so many models that I would have deemed technically impossible that it makes it feel like everything is just a matter of time and money. :slight_smile:

> Thanks for the Midjourney mention btw, it got me googling it and it’s very interesting!

You’re welcome! :slight_smile: BTW, I think Midjourney is just one of the more famous ones out there. I can’t give any recommendations, but there should be a couple more models like it.

Anyhow, if there is no technical way to achieve this, I might as well close this topic. Is there an expert in here who could give a conclusive answer to that? I mean, I would also pay Topaz for an online version of such a model if they offered one. But there seems to be no way around the training problem of fitting two sources together.


I have taken classes detailing the inner workings of AI machine learning logic. I do not claim to understand everything about AI though.

The general concept is: you take thousands of samples, each clearly labeled with the outcome it is supposed to produce. For example, handwriting: each handwritten letter needs an associated correct answer. Once the model is trained on enough letters, you can start passing in letters with no associated answer, and it will use all the logic it has built up from the training data to calculate what the correct answer should be.
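To make that labeled-training idea a bit more tangible, here is a toy sketch in Python. It is not how TVAI or any real handwriting recognizer works; the "features" and labels are invented, and the "model" is just a 1-nearest-neighbour lookup, which is about the simplest way to answer from labeled examples:

```python
# Toy supervised learning: each training sample is paired with a known label.
# A 1-nearest-neighbour "model" answers for an unseen sample by finding the
# closest training sample and reusing its label.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(training_data, sample):
    """Return the label of the training sample closest to `sample`."""
    _, best_label = min(
        training_data, key=lambda pair: distance(pair[0], sample)
    )
    return best_label

# Hand-made "handwriting" dataset: made-up 2D features -> letter label.
training_data = [
    ((0.0, 0.1), "a"),
    ((0.2, 0.0), "a"),
    ((0.9, 1.0), "b"),
    ((1.0, 0.8), "b"),
]

print(predict(training_data, (0.1, 0.05)))  # lands in the "a" cluster
print(predict(training_data, (0.95, 0.9)))  # lands in the "b" cluster
```

Real models replace the lookup with learned weights, but the shape of the problem is the same: labeled pairs in, a rule for answering unlabeled inputs out.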

Of course models and methods can get much more complex than this.

There are two main ways I can think of to make AI models for videos.
The first way would be to use video sources. Take a ton of DVD releases and match them with Blu-ray releases as the answers. (You’d probably need to modify them so that each frame matches.)
The second way would be to build a video-enhancing tool with all the best filters, train it on low-quality source videos, and have the filter settings and so on be the answers.
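As a rough sketch of the first approach, here is how such input/answer frame pairs might be assembled (NumPy only, random arrays standing in for real Blu-ray frames; in practice you would degrade or align real footage rather than average-pool like this):

```python
import numpy as np

def degrade(frame, factor=2):
    """Create a low-quality version of a frame by average-pooling.
    This is a stand-in for the DVD source; real DVD/Blu-ray pairs would
    need careful per-frame alignment instead."""
    h, w = frame.shape
    h, w = h - h % factor, w - w % factor
    blocks = frame[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def build_pairs(hr_frames, factor=2):
    """Pair each degraded frame (the input) with its original (the answer)."""
    return [(degrade(f, factor), f) for f in hr_frames]

# Hypothetical "Blu-ray" frames: random 8x8 grayscale images.
rng = np.random.default_rng(0)
hr_frames = [rng.random((8, 8)) for _ in range(3)]

pairs = build_pairs(hr_frames)
print(len(pairs))         # 3 training pairs
print(pairs[0][0].shape)  # (4, 4) low-quality input
print(pairs[0][1].shape)  # (8, 8) high-quality "answer"
```

A model trained on enough pairs like these learns a mapping from the degraded look to the clean one, which is essentially what commercial upscalers do at scale.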

Of course you can make a model that’s a mix of those approaches.