Just thinking out loud here…
Something very far down the line could be similar to these AI selfie apps that are blowing up right now. Under the hood they may be using Stable Diffusion and Dreambooth (not 100% sure, but let's assume so). Dreambooth lets them fine-tune an SD model on a specific person's face; once trained, ordinary Stable Diffusion prompts containing the learned identifier render new images of that person, which they sell back to the uploader. (Textual inversion is a related but different technique that learns only a new token embedding rather than fine-tuning the whole model.) This is all open-source code, so anyone can try it, but running it locally needs an expensive GPU with 10-12+ GB of VRAM. You can also do it on Google Colab for free. However, the training takes a long time (tens of minutes to hours), so it's probably not feasible in the short term for a use case like video restoration by an average, impatient user.
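For concreteness, the fine-tuning step the apps presumably run looks roughly like the Dreambooth example script shipped with Hugging Face's diffusers library. This is a configuration sketch, not their actual pipeline: the paths, the `sks` placeholder token, and the hyperparameter values are illustrative assumptions.

```shell
# Sketch of a Dreambooth fine-tune using the diffusers example script.
# "sks" is a rare token commonly used as the subject identifier;
# directories and step counts here are made up for illustration.
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --instance_data_dir="./photos-of-person" \
  --instance_prompt="a photo of sks person" \
  --output_dir="./dreambooth-person" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --max_train_steps=800
```

After this, prompting the saved checkpoint with "a photo of sks person ..." generates new images of that specific face, which is the part the apps sell back. The hours-long wall-clock cost mentioned above is essentially this training loop.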
The popular app right now (Lensa) is creating fake paintings because that is its goal, and it uses an artwork-focused SD model for whole-image generation, but the Dreambooth fine-tuning could just as well be done on a realism-focused SD model. That model could then be used to paint the specific person's face over the small, low-quality face in a video frame. I believe you can already do this on still images in Stable Diffusion via inpainting with the right settings (a low denoising strength, so the generated image stays close to the original).
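The inpainting idea could be sketched per-frame with diffusers. This is a hypothetical sketch, not a working restorer: the checkpoint path, the `sks` prompt, the face box, and the 0.3 strength value are all assumptions, and a real tool would get the face box from a detector and keep the pipeline loaded across frames.

```python
from PIL import Image, ImageDraw

def face_mask(frame_size, face_box):
    """White-on-black inpainting mask covering just the face region."""
    mask = Image.new("L", frame_size, 0)          # black = keep original pixels
    ImageDraw.Draw(mask).rectangle(face_box, fill=255)  # white = re-synthesize
    return mask

def restore_face(frame, face_box, prompt="a photo of sks person"):
    # Heavy import kept local so the mask helper stays cheap to test.
    from diffusers import StableDiffusionInpaintPipeline
    # Assumed path to a Dreambooth-fine-tuned, realism-focused checkpoint.
    pipe = StableDiffusionInpaintPipeline.from_pretrained("./dreambooth-person")
    mask = face_mask(frame.size, face_box)
    # Low strength keeps the output close to the degraded input,
    # so only the masked face is meaningfully regenerated.
    return pipe(prompt=prompt, image=frame, mask_image=mask,
                strength=0.3).images[0]
```

The key knob is the low denoising strength mentioned above: near 1.0 the model invents a face from scratch (a deepfake), near 0 it mostly sharpens what is already there, which is the restoration behavior you'd actually want.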
There is a very real ethical dilemma here, though, related to enabling easy creation of deepfakes. Imagine a Topaz product of the future that lets a user provide a video with a low-quality face and then tell it the face belongs to someone else. That would open the door to all kinds of misuse, and it's a potential liability for a commercial company like Topaz. The capability already exists to some degree, but it's an underground thing built on open-source code, so there is no commercial stakeholder to be sued.
Edit: browsing around more, I found this thread discussing similar ideas: