The process of improving video quality with Project Starlight

We’re releasing our new Project Starlight video enhancement model on cloud only. Why?

The short answer is that Starlight is huge and won’t run locally unless you have server-grade hardware. The longer answer is: to make technological progress, we need to push quality first, then focus on speed and size later.

We would all love a new model that is fast, small enough for your laptop, and higher quality than anything we’ve ever seen before. But the reality is that we can only optimize for a single attribute at a time.

In 2023, we tried improving quality on our video models while holding speed and size constant, but this approach was ineffective. While later iterations (Rhea, RXL) do offer improvements over earlier versions (Proteus, Artemis), you’ve probably noticed that the changes have been getting more incremental.

So in 2024, we asked the question: “How much video quality could we achieve if we didn’t care about speed or size?” The result is Project Starlight. It requires huge VRAM and currently takes 20 minutes to process 10 seconds of footage, which we know is sort of ridiculous. The results, however, are truly mind-blowing, and achieving them was quite a challenge in itself.

We’ve actually seen this story before. When we released Gigapixel for image upscaling, it had great results but crashed for one out of five users and took hours to run on 2018 hardware. But because we first achieved the quality we wanted, we were able to focus on making it faster and more efficient. Nowadays Gigapixel models run in milliseconds and are deployable on mobile platforms (including iOS).

In 2025, we’ll focus on optimizing Starlight in a similar way. We’ve only focused on quality for this release, and we wanted to get it usable as soon as possible. You can expect smaller and less expensive models in the near future, based on the same technology that we’ve developed with Starlight. In the meantime, we’re initially pricing this model at cost to help make it more accessible.

We’re really excited about Starlight; it’s truly a breakthrough, and we wanted to get it in front of people as soon as possible. Thanks for both your support and your patience!

18 Likes

Some of us do have server-grade hardware, so that answer doesn’t hold, especially since the approach mainly suggests brute force rather than any optimization. We see exactly the same approach emerging with LLM AI.

Many of us are also keen on local processing: cloud costs always pile up a bit more, whereas locally we at least have control over them!

I bought a license 10 months ago, spent at least 5 months effectively debugging, and got stuck with a largely unfinished 5.3.6.

You won’t reach many enthusiast customers this way anymore, and professionals will certainly expect something more stable than what we’ve had so far…

Multi-GPU processing, which absolutely no one mentions because it doesn’t work, proves that the limitation isn’t memory or power but coding.

Topaz keeps being excited, and customers keep being disappointed the more they see.

27 Likes

Hey Eric. I get what you want to achieve. However, how about coming up with a good version that works with no bugs and a usable GUI for each Topaz product? At least give us something that works, and then move on with your new projects. Otherwise we have no incentive. Let me tell you, Eric, word of mouth is a very powerful marketing tool. I learned a long time ago that if you anger one customer, you lose 10 potential customers.

10 Likes

We definitely thought about this when planning out the release of Starlight, and I will try to answer this point-by-point, because we truly believe we are taking a balanced and considered approach to the cloud/local processing question:

You may have seen in the beta thread, but the current Starlight model was developed on/designed for the NVIDIA H100. Here’s the section from Eric where we’re committing to optimizing and packaging these advancements for more GPUs.

Completely agreed here – even with the current set of models in the Video AI app, many of our larger business customers prefer to deploy Video AI server on-site.

If your license is on month 10, you have access to all updates, including today’s 6.1 beta.

One of the benefits of Starlight’s new architecture is that the model can operate across multiple GPUs with very little overhead.

I have ZERO interest in cloud based rendering. I have been a customer for a long time, and I feel as though the options, the UI, the product itself isn’t getting better – but being more engineered so that users have no option but a paid cloud rendering solution in addition to the subscription model you put in place.

Subscription models will eventually kill your company.

24 Likes

In the case of Starlight, we’re just offering free access to anyone with any type of license. Soon, we’ll allow at-cost rendering for up to 5 minutes for users that want more access right away. This is priced at cost, because we consider research advancement important enough to fully cover the cost of GPU compute for as many users as we can support.

As for subscriptions, users can currently purchase Video AI, receive one year of updates, and use any version released in that year indefinitely. I think that’s a fair arrangement for an application that provides ongoing value to its users, and benefits from active improvements to performance and compatibility.

Sounds good. Would I like a local model with this same quality? Obviously yes, but it sounds like the tech isn’t there yet. As long as this eventually comes to Topaz Video AI and I can run it locally, I will be satisfied. Hopefully that will happen once the model is optimized to use less memory and run faster. Any estimates on how long this could take? 6 months…a year, longer?

It does feel kind of bad to be paying for something that is worse than what we know is possible, but focusing on quality rather than tiny incremental improvements is the right decision, as long as this eventually comes to Topaz Video AI. It always seemed like a diffusion-based upscaler would be the way to go, and now it has happened.

I have no interest in tokens or online cloud processing, so I’m willing to wait. Awesome work and I’m excited to see where this tech goes!

6 Likes

@Moebius

As long as this eventually comes to Topaz Video AI and I can run it locally, I will be satisfied

I seriously doubt that will ever happen. This “new direction” appears to be a fundamental change in Topaz’s business model and their supposed future customer base. (Just my opinion).


9 Likes

Thanks for the update on the vision and goals.

More competitors are catching up, so it makes sense to take things to the next level. That said, as a few others have already noted, this is a very different approach. Does it mean end-of-line, as far as model development goes, for the current standalone version?

I’m forced to agree. There’s been no meaningful performance boost for any model in any Topaz app for years at this point - what are the chances they’re going to be able to massively optimize this one anytime soon? This is likely always going to be a cloud-only service, at least if you’re working on anything HD and/or more than a few minutes in length. Heck, I’ve got a respectable system (not stellar, but solidly mid-range) and it still takes me multiple minutes to run Gigapixel’s Redefine on a single high-res image.

2 Likes

I don’t agree with this. Certain diffusion models have been tough to run locally because of memory requirements, but devs have found optimizations that allow the models to run on more hardware. This is happening with text to video models right now. A few months ago, this stuff could only be run in the cloud, but now it generally can be run locally.

I doubt that they will keep this model cloud only forever. If they do, it will be far less popular than Topaz Video AI. A lot of people don’t want to use the cloud or don’t trust it.

2 Likes

I’m certainly not suggesting that optimisation is impossible (or even unlikely). I’m just sceptical that it can be optimised enough to be a viable resource for the average home user with ambitious goals. Could we reach a point where a 5-minute home movie could be given the Starlight treatment on a modest home computer? Probably. Could we do the same thing with a 26-episode series of a 90s TV show? As nice as that would be, I doubt it.

Look at Gigapixel’s Redefine. It’s been out for about a year now and it still takes an age to process a single high-res image on my respectable system (and absolutely hammers my GPU doing it). That’s one image - your average 45-minute show has about 70,000 of them and Starlight would have to be doing a lot more than just fixing each of those images in isolation.

As for the likely unpopularity of the cloud-only approach, I certainly agree with you there, and I don’t doubt that Topaz hope to get some version of the model into the desktop app at some point. Having said that, they may well see cloud processing as the future in terms of maintaining their income stream. That would make sense, much as it would suck for their existing customer base.

Of course, this is all just speculation. I could be completely wrong about everything (and actually hope that I am!).

1 Like

When I first started out with AI video rendering, I was using the Real-ESRGAN project on GitHub.

What those models do is exactly how you’d use Gigapixel: one picture gets enhanced at a time.

I had to do it in three steps. First, extract every frame as a JPG into a folder. Then run a command that enhanced one frame at a time and moved the results to a different folder.

Finally, you stitched the frames together again and used the source video file to extract the audio.

However, 100,000 JPGs from a single movie would use around 100-200 GB of data before enhancement and upscaling. After being upscaled and enhanced, they’d occupy up to 2 TB before being stitched back into a video.

It also took forever. I don’t think I ever used the largest model, as that would have taken several days to finish a one-hour video, and that model could only do 4x upscaling. That meant frames from the original video had to stay within roughly 1000x1000 px, as the hardware encoder on my 3080 couldn’t process input larger than 4K x 4K.
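For anyone curious, the three-step pipeline described above can be sketched with ffmpeg and the Real-ESRGAN command-line tool. This is just an illustrative sketch; the exact binary name, model name, and frame rate depend on your Real-ESRGAN build and source footage:

```shell
# 1. Extract every frame of the source video as high-quality JPGs
mkdir -p frames upscaled
ffmpeg -i input.mp4 -qscale:v 2 frames/frame_%06d.jpg

# 2. Enhance the frames one directory at a time
#    (binary and model names vary by Real-ESRGAN release)
realesrgan-ncnn-vulkan -i frames -o upscaled -n realesrgan-x4plus

# 3. Stitch the upscaled frames back together at the source frame rate,
#    copying the audio track from the original file
ffmpeg -framerate 23.976 -i upscaled/frame_%06d.jpg -i input.mp4 \
       -map 0:v -map 1:a -c:v libx264 -pix_fmt yuv420p -c:a copy output.mp4
```

The intermediate JPG folders are what balloon to hundreds of gigabytes, which is why this workflow chews through disk space as described.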

1 Like

I’m a bit disappointed. I’m not a fan of cloud computing. If I wanted to use it, it would have to handle videos longer than 10 seconds. That will probably be prohibitively expensive for one-hour DV cassettes.

9 Likes

Why not distill the model down to various sizes? Diffusion models will already run on 24GB cards that many of us have. The fact that you are saying it can run across multiple GPUs as well makes it even more frustrating that you are limiting it to the cloud. Clearly, it COULD be made to run on our cards, but that would be less profitable since we already paid for the app.

I have been hoping something like this would eventually come to Video AI, but seeing it is cloud-only is INCREDIBLY disappointing. Time to move on to setting something else up to do this work. It would take fewer resources to just do a pass to add fine details after a Topaz render, anyway. You are really shooting yourself in the foot going this direction.

9 Likes

We are on the way to solutions like you are describing. This is an iterative process, and we would not be able to make progress on visual quality if we only released models that run quickly on consumer hardware.

3 Likes

This is absolutely a tech demo for now – we know users will need much more than 10 seconds at a time for actual projects.

2 Likes

I got added to your early access list on Twitter. When does access come into effect, and do you notify us on Twitter? What will the cost be: do you just purchase credits, or is it a subscription?

I really hope this is a step in progress toward what we really want. It’s hard to see the future from here. That’s all.

The results are not convincing… Faces are distorted, structures are too smooth…
It’s the usual story: in the examples shown it works fabulously; in reality it works poorly or not at all.