Idea for all of the AI products: "Copy And Specialize" for the models

I was chatting with a friend of mine about our post-production workflow. He had a fairly cool idea that I’ve not seen in commercial image processing products (it’s not that uncommon in audio processing products, though, and it works great there):

He has a studio he works in, that he’s had for years - he leaves it deliberately bland and same-ish because he wants the model he’s shooting to pop out and be the highlight of the set, not the random chair or office item that distractingly found its way into some of the pictures. :slight_smile: So he was trying to find a product that would have great pre-trained AI models, but where he could make a copy of the model and, in a first pass, feed it the thousands of pictures he has of just the scenery, kinda just to absorb the vibe of the place. In fact, for “specializing” it would also be useful as a “try to unsee this sort of stuff cause it’s not relevant” pass, fed with literally just wall pictures or the pre/post roll shot to warm up the camera, with no actual content… So kinda like the minute of “empty room and background noise” that audio techs tend to record before filming begins, so they can edit that out specifically and get rid of background noise in the scenes themselves.
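To make the “empty room” analogy a bit more concrete, here is a very rough sketch of the same trick done on pixels instead of audio. It is not how any real product or AI model does it, just the simplest version I could think of: build a per-pixel “background profile” from frames of the empty studio, then flag whatever differs from it. The function names, the tiny grayscale frames and the threshold are all made up for illustration.

```python
from statistics import median


def build_background_profile(empty_frames):
    """Per-pixel median over frames that show only the empty set."""
    height = len(empty_frames[0])
    width = len(empty_frames[0][0])
    return [
        [median(frame[y][x] for frame in empty_frames) for x in range(width)]
        for y in range(height)
    ]


def foreground_mask(shot, profile, threshold=20):
    """True where a pixel differs from the background profile by more than threshold."""
    return [
        [abs(pixel - background) > threshold
         for pixel, background in zip(row, profile_row)]
        for row, profile_row in zip(shot, profile)
    ]


# Hypothetical 2x3 grayscale frames of the empty studio (values 0-255).
empty_frames = [
    [[100, 102, 101], [99, 100, 98]],
    [[101, 100, 103], [100, 101, 99]],
    [[99, 101, 102], [98, 99, 100]],
]
profile = build_background_profile(empty_frames)

# A shot where the middle column contains the actual subject.
shot = [[100, 180, 101], [99, 175, 100]]
mask = foreground_mask(shot, profile)
print(mask)
```

A specialized model would presumably learn something much richer than a per-pixel median, but the idea is the same: show it what “nothing interesting” looks like first.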

The opposite would also be super useful: say you want to touch up the pictures from a wedding. There are very often literally thousands of pictures now, between a professional photographer taking pictures all day and half of the assembled extended family doing what we all do now: taking more pictures. There are usually only 2 or 3 locales and they all share a “vibe” of sorts, so even across the dozens of different cameras there’s still a lot of similar, good quality data to refine a model with. I’ve not seen this at all with commercial image products, but it is not uncommon with some audio post-processing apps. Even with plain AVR products, it is useful to get a voice recognition module to home in on specific speakers - especially for people who may want to use a product that’s not in their first language, or, say, an English person who finds themselves trying to use voice control software trained on Americans (hilarity ensues - but the retraining is very effective).

I think it would be a really useful feature. It’s usually a core feature of open source products for this, and despite the original training set for the sample model not being very extensive or good, just doing a tuning pass over the actual data set you want to enhance, and only applying the model to the set after feeding it all of that, produces interesting and sometimes impressive results. Sometimes they are very uncanny valley, like when the images are all very inconsistently composed - if the model was not really trained well before that, it can produce very weird ghosts and time-erased Martys. But with the really well trained models in, say, Gigapixel, I’d be pretty sure it would kind of just apply a connecting vibe to the set - and it should be good data to base upscales on. :slight_smile:
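For anyone wondering what “copy and specialize” means mechanically, here is a toy sketch. It assumes nothing about how Gigapixel or any other product works internally: `TinyModel` is a made-up one-line stand-in for a pretrained model, and `specialize` is a hypothetical helper that deep-copies it and runs a few gradient descent steps on your own data, leaving the original untouched.

```python
import copy


class TinyModel:
    """Made-up stand-in for a pretrained model: y = weight * x + bias."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def predict(self, x):
        return self.weight * x + self.bias


def mean_squared_error(model, data):
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


def specialize(base_model, data, learning_rate=0.05, epochs=500):
    """Fine-tune a COPY of base_model on data; base_model is never modified."""
    model = copy.deepcopy(base_model)
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to weight and bias.
        grad_w = sum(2 * (model.predict(x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (model.predict(x) - y) for x, y in data) / n
        model.weight -= learning_rate * grad_w
        model.bias -= learning_rate * grad_b
    return model


base = TinyModel(weight=1.0, bias=0.0)         # the generic "pretrained" model
my_set = [(0.0, 0.5), (1.0, 2.5), (2.0, 4.5)]  # hypothetical data from one shoot
tuned = specialize(base, my_set)

print("base error: ", mean_squared_error(base, my_set))
print("tuned error:", mean_squared_error(tuned, my_set))
```

The point of working on a copy is that the shared, well-trained model stays exactly as shipped, and each shoot or wedding could get its own throwaway specialized version.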

I don’t feel this would be a super important feature for me, I already love the products, but it would be pretty unique and might be something many folks would be a fan of if they saw a demonstration.
