Topaz Video Enhance AI keeps crashing in bizarre ways

On a laptop.
CPU is an i7-10875H. GPU is an NVIDIA GeForce RTX 2070 Super Max-P (115W limit).

Temps on the GPU never exceed 72C, temp limit of 100C.
Temps on the CPU never exceed 95C (and rarely stay anywhere near it), temp limit of 100C.

Topaz VEAI crashes in bizarre ways depending on what version.
2.6.4 seems to crash the computer completely: it freezes on the last screen it was showing and stops processing. One crash was particularly scary because it shut my computer down entirely…
2.3.0 just closes. In the middle of an upscale, the program simply exits.

The model used doesn’t seem to matter either.

It’s not like Topaz VEAI uses much of my specs here. The GPU maintains ~60% usage at best even during an intense live-action upscale, and my CPU doesn’t go much past ~80%.

Some people say to downgrade. I clearly tried that, and it still doesn’t work.
Some people say it’s a “power regulation issue”, which doesn’t make sense on a laptop. If a program hypothetically needed more power than my AC adapter provides (which at 240W has loads of headroom), it’d draw from the battery. It doesn’t do that, though.

I’ve seen others have this issue on their top-of-the-line desktops rocking 3080 Tis, where Topaz can’t even break 10-15% GPU usage. It’s not a specs issue. But it also seems to be a long-standing problem for Topaz judging from Google searches.

Disabling “ThrottleStop” entirely, which I use to undervolt my CPU for better performance and temperatures, seems to have fixed it. I completed an upscale (which took about 1h30m), something I had no longer been able to do.

One of the Topaz crashes last night finally (instead of shutting down or just hanging) yielded a BSOD with the WHEA_UNCORRECTABLE_ERROR message. This was actually very helpful, because for those who are undervolting, this message usually means the undervolt is at fault. Specifically, it reports a hardware error, and too little voltage reaching the CPU is a common cause.

Which makes the earlier suggestions of “your power supply” ring a bit truer. I would not be at all surprised if some PSUs, whether absolutely bleeding-edge ones or, conversely, sloppier ones that deliver too much to begin with or aren’t as variable as they should be, are actually able to accommodate less-stable undervolting (or overvolting, for enthusiasts) settings.

Undervolting is great, especially for laptop CPUs or modern desktop GPUs (I’ve seen some 3090s cut power consumption by over 100W and temps by over 20C by undervolting!), but it can indeed be unstable even if it “passes tests”, because power requirements vary from workload to workload.

I have encountered this problem since first purchasing, on all but the basic Artemis and Proteus Fine Tune models for progressive videos, and even then I’m limited to 200% resolution enhancement. I have reduced voltage on the CPU and throttled the GPU. It makes no difference. In my case it appears to be a memory limitation. My GTX 970 has 4GB of RAM. For some models that do work, I have to set “memory usage” to about medium. Too low and it crashes. Too high and it crashes. For Gaia and Theia it crashes as soon as the model downloads and attempts to run. I’ve been a programmer for years. It seems like it should be possible to predict memory requirements BEFORE crashing my computer. The same defect occurs in Topaz Gigapixel and Topaz Sharpen AI. Rule 1: don’t crash my computer. Rule 2: provide useful error messages.
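
To illustrate what "predict memory requirements first" could look like, here is a rough sketch. It is not how Topaz works: the function names, the candidate tile sizes and especially `overhead_factor` (a stand-in multiplier for a model's intermediate buffers) are all assumptions made up for the example. The idea is only to estimate the memory one tile needs and pick a tile size that fits, or report that nothing fits instead of crashing.

```python
def estimate_tile_bytes(tile_w, tile_h, scale, channels=3,
                        bytes_per_value=4, overhead_factor=8.0):
    """Rough upper bound on the memory needed to upscale one tile.

    overhead_factor is a hypothetical multiplier for the model's
    intermediate buffers; a real tool would have to measure it per model.
    """
    input_bytes = tile_w * tile_h * channels * bytes_per_value
    output_bytes = (tile_w * scale) * (tile_h * scale) * channels * bytes_per_value
    return int((input_bytes + output_bytes) * overhead_factor)


def largest_safe_tile(free_bytes, scale,
                      candidates=(1024, 768, 512, 384, 256, 128), **kwargs):
    """Return the largest square tile size that fits, or None if none fit."""
    for size in candidates:
        if estimate_tile_bytes(size, size, scale, **kwargs) <= free_bytes:
            return size
    return None


if __name__ == "__main__":
    free = 4 * 1024**3  # 4GB card, ignoring memory already in use
    tile = largest_safe_tile(free, scale=2)  # scale=2 means 200%
    if tile is None:
        print("Not enough memory for this model; refusing to start.")
    else:
        print(f"Using {tile}x{tile} tiles.")
```

With a check like this, the "too high and it crashes" case would become an error message rather than a crash.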

I have monitored temperatures on CPU and GPU and the crashes occur well within normal range. I have removed and reapplied heat sink to GPU and CPU. I have replaced my i5 CPU with an i7 CPU. I have monitored power usage and have 40% headroom when processing with models that don’t crash my computer. This is a programming problem that many have reported and Topaz does not seem interested in fixing. I have not upgraded to the latest version because I cannot afford to pay hundreds of dollars on the off-chance that this problem has been addressed. Same goes for paying a thousand dollars to upgrade my whole computer and video card.

I’m having constant crashing issues when processing 8K content to 100% output (also 8K output). It runs for many hours on my 16-core + 3090 and then the machine reboots. I can only solve this problem by running on image sequences, but then I have to deal with terabytes of data to get through a video.

Will Topaz engage on these topics? Can we send in logs to help?

Sounds like a power supply issue to me, but might not be.

It’s possible, but it’s unlikely. The machine has a giant power supply that is not being taxed, and the CPU/GPU are also not running at high % during the enhance. Also, I can run it on the same video if I export the video first as an image sequence.

I’ve heard that with 3090s, even 1600 watt PSUs can cause restarts. I think it has to be rated higher than gold to not be susceptible. Anyway, I’m just going off my terrible memory of unproven claims I read in random places on the internet. It really may be something else. For a computer to restart, though, it has to be a hardware issue (though programs can be written in such a way that they cause hardware to crash).

For old versions of VEAI (the ones that included a command-line version) I even made a script that detected a VEAI crash and restarted where it left off.
But after the command-line version was removed I had to go back to restarting VEAI manually.
Still, with newer versions I switched from PNG output (which can be restarted at any point) to H.264 CRF 1 output, and crashes became painful because it’s hard to recover a partial file (impossible for MOV).
What turned out to help my 3080 is undervolting it while still running overkill voltage. E.g. I reduce my clock speed all the way to 1400 MHz but leave the voltage at 0.8 V (which is reasonable for 1700 MHz). That almost completely removed the crashes.
But yeah, newer versions don’t crash, they bluescreen your PC instead. I don’t know why :)
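
For anyone wanting to rebuild something like the crash-and-restart script described above, a minimal sketch could look like this. It assumes PNG output with zero-padded frame numbers as file names and a command-line upscaler that accepts a start frame. `build_command` is a hypothetical placeholder, because the actual VEAI command-line options are not given in this thread. It only covers the program exiting on its own, not the whole PC bluescreening.

```python
import re
import subprocess
from pathlib import Path

FRAME_NAME = re.compile(r"^(\d+)\.png$")  # assumed naming, e.g. 000123.png


def next_start_frame(output_dir):
    """Return the frame number to resume from, based on PNGs already on disk."""
    highest = None
    for path in Path(output_dir).iterdir():
        match = FRAME_NAME.match(path.name)
        if match:
            number = int(match.group(1))
            if highest is None or number > highest:
                highest = number
    # Redo the highest frame: a crash may have left that file half-written.
    return 0 if highest is None else highest


def run_until_done(build_command, output_dir, max_restarts=20):
    """Re-launch the upscaler after each crash until it exits cleanly.

    build_command(start_frame) is a hypothetical callback that returns the
    argument list to run. Returns True on a clean exit, False if the
    restart budget is used up.
    """
    for _ in range(max_restarts + 1):
        start = next_start_frame(output_dir)
        result = subprocess.run(build_command(start))
        if result.returncode == 0:
            return True
    return False
```

This is also why PNG output is so convenient for recovery: the resume point can be read straight off the file names, which is not possible with a half-written H.264 or MOV file.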

Just to stop the snowball: the white, bronze, silver, gold, etc. stickers do NOT rate a PSU in terms of “output power & stability”.
The stickers are supposed to show how efficient the PSU is regarding consumption vs. output. In other words: how much electricity doesn’t get wasted (lost) between the input of the power supply (wall socket) and the output it generates for your components. It gives a very vague idea of the effect on your future electricity bills.

I’m saying this so nobody rushes to buy a new PSU based on wrong info.
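
To put rough numbers on the efficiency point, here is a small sketch. The load and the efficiency figures are made-up round values for illustration, not the official thresholds of any sticker tier.

```python
def wall_draw_watts(output_watts, efficiency):
    """Power pulled from the wall socket for a given load on the PSU outputs."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return output_watts / efficiency


def wasted_watts(output_watts, efficiency):
    """Power lost as heat inside the PSU at that load."""
    return wall_draw_watts(output_watts, efficiency) - output_watts


if __name__ == "__main__":
    load = 450  # watts delivered to the components (illustrative)
    for label, eff in (("lower-rated unit", 0.85), ("higher-rated unit", 0.90)):
        print(f"{label}: {wall_draw_watts(load, eff):.0f} W from the wall, "
              f"{wasted_watts(load, eff):.0f} W lost as heat")
```

Both units deliver the same 450 W to the components. The only thing the rating changes here is how much extra is drawn from the wall, which is why it says nothing about whether a PSU can keep a 3090 from restarting.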

I’ve seen some weird behavior with Intel processors. I don’t know the specifics, but a fix that I’ve been able to reproduce consistently is to disable the XMP profile in the BIOS. This fixed 99% of all random errors. I feel like it might be an issue with pipelining code that creates memory overflow, but what do I know, right? Anyway, I hope this helps you.

3.01 and 3.02 crash on one of my movie files while doing a preview.
Mac mini M1
Input file is HD H.264
I tried the same file on my MacBook Air M1, and it does not crash on that machine.
The Mac mini and an Intel iMac are remote; I use them via Apple Remote Desktop. That was the machine where it crashed. Maybe it’s ARD-related?
It also crashes when running, 3 times now. This is so buggy I can’t use it.

By the way: all other audio tracks get lost; only the first (main) audio track gets copied.

I believe you can still download version 2. In my opinion it is more stable than version 3

Just got a new 7900 XTX (not the reference model with cooler issues) and VEAI will often BSOD, crash, or restart the computer after hours of use, and even sooner if I try to run multiple processes at the same time. I thought it might be my PSU, so I went out and bought a new Corsair 1200W Platinum PSU, but I’m still experiencing the crashes. Very surprising, given that my GPU is never at more than 50% load and temps stay below 70C.

I tried doing some testing and nothing else seems to crash the computer; I even ran FurMark for a day with the GPU pinned at 100%, and still got no crashes.