Same issue we had with 7.0.0.1b, although this time it took longer to crash: the messages “Initializing Viewer” and “loading model” stayed onscreen for over a minute with very high GPU load, and then it crashed.
Hey, starlight mini works on an rtx 5070 now. I updated, and I changed a system setting:
-> NVIDIA Control Panel
-> Manage 3D Settings
-> CUDA - Sysmem Fallback Policy
Set to Prefer Sysmem Fallback
Processing an old 4-second 720p 29.75fps clip of my dog worked, and it ended up using just a little shared memory. I’ve got another card with more VRAM on order, but I’m happy that what I have now works!
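If anyone else wants to watch how full the card gets while a clip renders, a throwaway watcher like this works in a separate Python session with torch installed (the 5-second interval is just my choice). It only shows dedicated VRAM; any spill into shared system memory shows up in Task Manager under “Shared GPU memory” instead.

```python
# Device-wide dedicated-VRAM watcher (runs outside TVAI; opening a CUDA
# context here itself uses a few hundred MB, so treat the numbers as rough).
import time
import torch

while True:
    free_b, total_b = torch.cuda.mem_get_info()  # free/total bytes on GPU 0
    used_gib = (total_b - free_b) / 2**30
    print(f"dedicated VRAM in use: {used_gib:.1f} / {total_b / 2**30:.1f} GiB")
    time.sleep(5)
```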
Doing more testing, it seems like it’s not good at grass, like Dilson noticed. My old home videos of playing fetch with my dog ended up with grass that looks just like that train picture. But overall it seems pretty good at recovering 320x240 RAZR-era videos. It’s much more realistic than the other models, in my non-expert opinion.
I’m trying to understand how VRAM impacts both output quality and render speed.
Is it similar to LLMs, where more RAM lets you run larger models with more parameters?
If so, what are the actual quality differences between running a model on an 8GB VRAM GPU versus 12GB, 16GB, or 24GB? Does extra VRAM mainly affect finer detail, or is there more to it?
I’m considering a 16GB 5060 Ti, but I could stretch for a 3090 with 24GB for around the same price. What kind of differences in quality and speed should I expect between those two VRAM sizes?
My hunch is that 16GB covers most practical needs and 24GB might be overkill, but I want to be sure before deciding.
This is with an interlaced source video; look at the marked area (it’s causing artifacts).
Apart from this, it looked quite well recovered. Starlight Mini looks promising, but it still has room to grow; if the performance on a local PC can also be improved, it’s a win-win situation.
You cracked the code! It is now working on an RTX 2080, slow but sure. That “Prefer Sysmem Fallback” switch works. Saving as JPGs so I can confirm its results along the way. Looks like about 5-6 frames at a time every few minutes.
Assuming you can use shared system memory as vram, the only difference is render speed. Having enough memory or not is a pass/fail test. The models have memory requirements to run; if you have enough memory and your chip is compatible, it’ll get there. It might be 0.4fps (like in my case with starlight mini) but if you can load it, it will chug through.
Fancier models, e.g. full starlight, move the pass/fail memory requirement to numbers that not many users can pass. That said, with enough patience and system memory I bet lots of people could run a much larger starlight-medium or something. Some of us have hundreds of gigabytes of system memory, and could leave a video processing for a long time to use full starlight with Prefer Sysmem Fallback.
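To put the pass/fail idea in concrete terms, here’s a rough sketch (purely my own illustration, not anything TVAI actually runs, and the 14 GiB requirement is a made-up placeholder):

```python
# Pass/fail sketch: does the model's working set fit in free dedicated VRAM?
# With Prefer Sysmem Fallback the driver can spill past VRAM into system RAM,
# so "fail" becomes "runs, but at PCIe speed" instead of an outright error.
import torch

MODEL_REQUIREMENT_GIB = 14  # placeholder figure, not an official number

free_b, total_b = torch.cuda.mem_get_info()
free_gib = free_b / 2**30

print(f"{free_gib:.1f} GiB of {total_b / 2**30:.1f} GiB dedicated VRAM free")
if free_gib >= MODEL_REQUIREMENT_GIB:
    print("Fits in VRAM: full render speed")
else:
    print("Needs sysmem fallback: it will still chug through, just slower")
```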
12GB of VRAM is not enough to run Starlight Mini on Windows without touching system memory (see my earlier post). It looks to me like 16GB would do it. However, I’d love a big 40GB model that does even better, regardless of the fact that it would be slow on account of using system memory across the PCIe bus. More memory will always be better, at least up to the memory sizes likely to exist in the foreseeable future. I have a 5070 Ti in the mail, as that’s the right compromise for me, but I’m under no illusions: I will always want more VRAM than I can afford.
12GB is running starlight-mini fine all day so far with Prefer Sysmem Fallback and ample system memory. I don’t know if an older card with more memory is faster than a newer card with less memory. Either appears able to work though. Do with this whatever is best for you!
Problem with high VRAM usage in Beta 7.0.0.2b:
In Beta 7.0.0.1b, it used about 14GB of VRAM, and an upscale of a 320x240 source ran at 0.6 fps.
In Beta 7.0.0.2b, it uses all of the VRAM plus part of the system RAM, running slowly at 0.2 to 0.3 fps.
(RTX 4090 24GB, Ryzen 9 7950X3D, 64GB RAM)
(because of uncapped VRAM?)
When I set the NVIDIA Control Panel to not use system RAM, I get an error.
Using Starlight Mini:
In Beta 7.0.0.1b it worked well (RTX 4090), about 0.6 fps, using about 12GB VRAM.
In Beta 7.0.0.2b it uses the whole VRAM and some system RAM, and it is slow (0.2-0.3 fps).
I can fully confirm that. Upscaling on my RTX 4090 is noticeably slower as well.
While I was getting up to 0.6 FPS in the previous beta, I’m now only reaching around 0.2 to 0.3 FPS in Beta 7.0.0.2b.
Also, despite limiting VRAM usage to 90% within the program, my 4090’s memory is almost completely maxed out. So it seems like the limit isn’t working properly.
That said, in my short tests with the new beta, I do feel like the upscaling quality has improved compared to Beta 7.0.0.1b.
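About the 90% limit: I have no insight into how TVAI implements it, but for comparison, a per-process cap in plain PyTorch looks like the sketch below, and it only restricts PyTorch’s own allocator, not memory held by the driver, the viewer, or other processes, which might be part of why the card still looks maxed out.

```python
# Illustrative only: a per-process VRAM cap in plain PyTorch (no claim that
# TVAI's 90% setting works this way). The cap applies to PyTorch's caching
# allocator; driver overhead and other processes are not counted against it.
import torch

device = torch.device("cuda:0")
torch.cuda.set_per_process_memory_fraction(0.9, device=device)  # 90% of total VRAM

try:
    # Deliberately oversized request (~30 GB, well past the cap on a 24GB card)
    # to show the cap being enforced.
    x = torch.empty(7_500_000_000, dtype=torch.float32, device=device)
except torch.cuda.OutOfMemoryError as err:
    print("Allocator refused to go past the 90% cap:", err)
```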
From what I’ve found out so far, 1280 x 960 is the minimum resolution — it can’t be set any lower.
It also seems that the resolution is automatically adjusted if the original aspect ratio can’t be maintained.
Yesterday, during a quick test, I selected 1920 x 1080 as the target resolution, but the output video was actually rendered at 1920 x 1440.
It’s like it ignores what you set it to on the first render… or it is limited by the aspect ratio.
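That would line up with aspect-ratio locking rather than the setting being ignored: a 4:3 source scaled to 1920 wide comes out 1440 tall. Here’s the arithmetic as a small sketch (my guess at the behaviour, not anything confirmed):

```python
# Guess at the snapping behaviour (unconfirmed): keep the source aspect ratio
# at the requested width, so the height gets adjusted instead.
def snapped_output(src_w, src_h, target_w):
    target_h = round(target_w * src_h / src_w)
    return target_w, target_h

print(snapped_output(640, 480, 1920))   # 4:3 source  -> (1920, 1440), not 1920x1080
print(snapped_output(1280, 720, 1920))  # 16:9 source -> (1920, 1080)
```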
I’m still able to get 16:9 aspect ratio DVDs to upscale to 1280x720.
I just tried with a 720p video and a 720x480 video, and it errored out both times. Interestingly, I scrolled a little further up in the log file and saw this:
2025-05-01 20-08-20.430 Thread: 46100 Info FF Process Output: 5 [ERROR] exception during
run(): CUDA out of memory. Tried to allocate 3.52 GiB. GPU 0 has a total capacity of 24.00 GiB
of which 4.71 GiB is free. Of the allocated memory 14.66 GiB is allocated by PyTorch, and 1.61
GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try
setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.
See documentation for Memory Management
(https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
That is WITH Sysmem Fallback enabled, though I’m not sure if I need to reboot or restart TVAI for it to go into effect?
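As far as I know, the fallback setting only applies to processes started after you change it, so restarting TVAI (no reboot) should be enough. For what it’s worth, the fields in that error are PyTorch caching-allocator statistics; they map onto PyTorch’s query functions roughly as in the sketch below (the allocated/reserved values are per-process, so a separate script only sees its own, while mem_get_info is device-wide). The log’s PYTORCH_CUDA_ALLOC_CONF suggestion would likewise have to be set as a system environment variable before TVAI launches.

```python
# How the fields in the OOM message map to PyTorch allocator queries
# (illustrative; memory_allocated/memory_reserved are per-process stats,
# so this cannot read TVAI's numbers from outside that process).
import torch

free_b, total_b = torch.cuda.mem_get_info()   # "X GiB is free" / "total capacity" (device-wide)
allocated_b = torch.cuda.memory_allocated()   # "allocated by PyTorch" (this process)
reserved_b = torch.cuda.memory_reserved()     # cached by the allocator (this process)
unallocated_b = reserved_b - allocated_b      # "reserved by PyTorch but unallocated"

gib = 2**30
print(f"total {total_b/gib:.2f} GiB, free {free_b/gib:.2f} GiB")
print(f"allocated {allocated_b/gib:.2f} GiB, reserved-but-unallocated {unallocated_b/gib:.2f} GiB")
```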