I was thinking, why not have the AI utilise fractal patterns that dynamically resize in conjunction with the resolution of the original footage?
As in: very small patterns for low-res footage, and a combo of low and high for higher resolutions, depending of course on just how much of the original detail has been lost in the original footage.
Another idea would be to use the 2D info the AI model already has to generate facial features and the like on simple geometry, similar to the tech that can extrapolate motion from 2D video footage (whether shot or found footage) and transfer it to 3D models, but in reverse.
You could train the model to position simple 3D proxy geometry to match the head’s position, say a sphere for the head, cubes for the body and cylinders for the limbs, on which to project the 2D info for facial/body generation…
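The proxy-geometry idea above could be sketched roughly like this. Everything here is hypothetical on my part (the keypoint names, the shape choices, and the size ratios are all assumptions, not anyone's actual pipeline): given 2D pose keypoints tracked from footage, place simple primitives scaled relative to the shoulder width.

```python
from dataclasses import dataclass

@dataclass
class Primitive:
    shape: str                   # "sphere", "cube", "cylinder"
    center: tuple[float, float]  # 2D position in frame coordinates
    scale: float                 # size in pixels

def fit_proxies(keypoints: dict[str, tuple[float, float]]) -> list[Primitive]:
    """Place simple proxy shapes from 2D pose keypoints.

    Hypothetical layout: a sphere for the head, a cube for the torso,
    both sized relative to the tracked shoulder width.
    """
    head = keypoints["head"]
    lsh, rsh = keypoints["l_shoulder"], keypoints["r_shoulder"]
    shoulder_w = abs(rsh[0] - lsh[0])
    torso = ((lsh[0] + rsh[0]) / 2, (lsh[1] + rsh[1]) / 2)
    return [
        Primitive("sphere", head, shoulder_w * 0.4),  # head proxy (ratio is a guess)
        Primitive("cube", torso, shoulder_w),         # torso proxy
    ]
```

A real version would presumably add depth estimation and per-limb cylinders, but the point is just that 2D tracking data is enough to position the geometry the 2D generation would then be projected onto.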
If I understand your point about combining low- and high-resolution scenes: I have done this before where the subject's focus was soft in HD footage. I downscaled the footage significantly, for example to 25% or 33% of its height, using ffmpeg and a standard scaler, then used one of the Topaz models to upscale it again. Then I recomposited the face onto the full-sized frame in After Effects, sometimes with the background upscaled separately. Given a slightly out-of-focus face, this sometimes gives better results, and the extra control over the blending is also helpful. It would be nice if this could be automated to restore sharpness in the slightly out-of-focus case where existing models may struggle. Preserving identity is another matter, especially for very small faces, although arguably I am not throwing away much in terms of facial features if I do not downscale beyond what is necessary to make the defocus blend into the lack of detail from low resolution.
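The downscale step above has one practical wrinkle worth spelling out: most video codecs want even frame dimensions, so a naive 25% or 33% scale can produce odd sizes ffmpeg will reject. A minimal sketch of the size calculation (the percentages are just the examples from above; the rounding rule is the only real constraint):

```python
def even(n: int) -> int:
    """Round down to the nearest even number; many codecs require even dimensions."""
    return n - (n % 2)

def downscale_size(width: int, height: int, factor: float) -> tuple[int, int]:
    """Target frame size for a given downscale factor, clamped to even dimensions."""
    return even(round(width * factor)), even(round(height * factor))

# e.g. a 1080p source at the 33% and 25% marks mentioned above
print(downscale_size(1920, 1080, 0.33))  # (634, 356)
print(downscale_size(1920, 1080, 0.25))  # (480, 270)
```

In practice ffmpeg's scale filter can handle this for you: passing `-2` for one dimension (e.g. `scale=-2:356`) keeps the aspect ratio while forcing that dimension to be divisible by two.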
Thanks for the info on your methodology; it helps put this in perspective and practical context. I’m a visual artist, so my description is somewhat abstract, and I’m glad it came across well enough.
If you’ve ever stippled an image (a drawing built up from varying densities of dots), you’d get a pretty clear picture of my description. Iris’s generated detail seems to be some sort of fractal-like repeating pattern, which looks like ass with low-resolution footage.
So a varying pattern that dynamically resizes depending on just how much detail exists, or not, in the footage would most definitely break up the uniform patterns in Iris’s current iteration, which look superimposed on the image instead of like naturally occurring detail.