Gigapixel 1.3.0 — FaceRecoveryV3 fails on dual-GPU systems (RTX 5090 + RTX PRO 6000 Blackwell)

Hi Topaz team,

I’m hitting a consistent, reproducible crash in Gigapixel 1.3.0 on a dual-GPU workstation. Posting full details so you can fix it.

System

  • OS: Windows 11
  • CPU: Intel Core i9-13900K
  • RAM: 128 GB
  • GPU 0 (per Gigapixel): NVIDIA RTX PRO 6000 Blackwell Workstation Edition (94 GB VRAM)
  • GPU 1 (per Gigapixel): NVIDIA GeForce RTX 5090 (31 GB VRAM)
  • App: Topaz Gigapixel 1.3.0
  • AI Engine: 2.9.23

What happens
Any enhancement that triggers Face Recovery V3 (Standard V2, High Fidelity V3, etc. with face recovery enabled) fails with:

AI Engine Runtime Exception: Could not run model

The neuroserver subprocess logs:

FileNotFoundError: Input image not found: C:/Users/<user>/AppData/Local/Temp/neuroserver_input.png
ERROR:local_model_service:Error running video restoration: Input image not found...

Root cause (from the log)
The host app and the neuroserver disagree on GPU indices, even though the two enumerations in the log look identical:

  • Host Gigapixel order: [0] RTX PRO 6000, [1] RTX 5090
  • Neuroserver order: [0] RTX PRO 6000, [1] RTX 5090

Despite the matching order, translateDeviceId keeps remapping 0 → 1 and 1 → 0 and restarts the runner each time.

For every face recovery call I see this sequence repeated:

  1. NeuroserverProcessor::setupAndLoad model=FaceRecoveryV3 device=0
  2. Runner starts, lists GPUs
  3. translateDeviceId: translated deviceId 0 -> 1, restarting runner
  4. Runner is killed and a new one started on the “correct” GPU
  5. Gigapixel writes the input PNG to %TEMP%\neuroserver_input.png for the first runner
  6. The new runner starts, but by the time it tries to read the input file, it’s gone → FileNotFoundError
  7. Task fails with MODEL_INFERENCE_ERROR

This looks like a classic race condition between the input-file write and the runner restart triggered by GPU-index translation.
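
To make the suspected ordering problem concrete, here is a minimal Python sketch of it. Everything in it is a hypothetical stand-in: the Runner class, process_with_restart, and in particular the assumption that stopping a runner removes the shared input file are my guesses from the log, not actual Gigapixel or neuroserver code.

```python
import os

# Hypothetical stand-ins; none of these names come from Gigapixel or the neuroserver.
INPUT_NAME = "neuroserver_input.png"


class Runner:
    """Toy runner that shares one fixed input path with every other runner."""

    def __init__(self, workdir, device):
        self.input_path = os.path.join(workdir, INPUT_NAME)
        self.device = device

    def stop(self):
        # Assumption: shutting a runner down also cleans up its input file.
        if os.path.exists(self.input_path):
            os.remove(self.input_path)

    def run(self):
        if not os.path.exists(self.input_path):
            raise FileNotFoundError(f"Input image not found: {self.input_path}")
        return f"ok on device {self.device}"


def process_with_restart(workdir):
    first = Runner(workdir, device=0)
    # The host writes the input while the first runner is still the target.
    with open(first.input_path, "wb") as f:
        f.write(b"fake png bytes")
    # Device translation then decides 0 -> 1 and restarts the runner.
    first.stop()
    second = Runner(workdir, device=1)
    # The replacement runner looks for the same path, which is already gone.
    return second.run()
```

Calling process_with_restart on an empty temp directory raises FileNotFoundError every time in this toy version. In the real app the timing presumably varies, which would explain why a retry occasionally gets through.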

Secondary issue (same log)
Throughout the run there are dozens of:

Error | Attempting to set stage that does not exist in stage weights map: "FaceRecoveryV3"
Warn | Progress stage missing, attempting to auto-fix (imageai bug?)

So the progress/stage tracker doesn’t know about FaceRecoveryV3 either. This is likely related, and at minimum it makes the progress bar useless during face recovery.
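
For illustration, a weighted progress tracker that tolerates an unregistered stage could look roughly like the sketch below. I have no visibility into how the real stage weights map is implemented, so StageProgress, its method names, and the weights are all hypothetical.

```python
import logging

log = logging.getLogger("progress")


class StageProgress:
    """Hypothetical weighted progress tracker (not the app's real class)."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.done = {name: 0.0 for name in self.weights}

    def set_stage(self, name, fraction, default_weight=1.0):
        if name not in self.weights:
            # Register the unknown stage once instead of erroring on every update.
            log.warning("stage %r missing from weights map, registering it", name)
            self.weights[name] = default_weight
        # Clamp to [0, 1] so a bad report cannot push the bar out of range.
        self.done[name] = min(max(fraction, 0.0), 1.0)

    def overall(self):
        total = sum(self.weights.values())
        if total == 0:
            return 0.0
        return sum(w * self.done[n] for n, w in self.weights.items()) / total
```

The cleaner fix is still to list FaceRecoveryV3 in the map up front, e.g. StageProgress({"Enhance": 3.0, "FaceRecoveryV3": 1.0}) with made-up weights. The fallback only keeps the bar moving and logs a single warning if a stage is ever missed again.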

Reproducibility
100% on the first attempt on my machine, for any image that triggers face recovery. Occasionally a retry succeeds when the runner happens to land on the right GPU before the temp file is wiped, but most retries loop and fail.

Workaround
Forcing a single GPU in Preferences avoids the translateDeviceId restart and the crash goes away — but obviously that defeats the point of having two GPUs.

What I’d ask you to fix

  1. Make GPU enumeration consistent between the host app and the neuroserver, so translateDeviceId never has to remap and restart the runner.
  2. If a runner restart is needed, write the input file after the final runner is up — not before.
  3. Register FaceRecoveryV3 in the stage weights map so the progress tracker stops spamming errors.
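
To show what I mean by points 1 and 2, here is a rough Python sketch of the ordering I would expect: settle the device first, then write the input under a per-job file name. All names (translate_device, run_job, the file name pattern) are hypothetical, and matching devices by name is a simplification. A real implementation would need a stable identifier such as a device UUID or PCI bus ID, since two identical cards share a name.

```python
import os
import uuid


def translate_device(host_devices, runner_devices, host_index):
    """Map a host index to a runner index by device name, not by position."""
    name = host_devices[host_index]
    return runner_devices.index(name)


def run_job(image_bytes, host_devices, runner_devices, host_index, workdir):
    # 1. Resolve the device before anything is written, so no restart can
    #    happen while an input file is already on disk.
    device = translate_device(host_devices, runner_devices, host_index)

    # 2. Only now write the input, under a per-job name so a stale runner
    #    or a parallel job cannot delete or overwrite it.
    path = os.path.join(workdir, f"neuroserver_input_{uuid.uuid4().hex}.png")
    with open(path, "wb") as f:
        f.write(image_bytes)

    try:
        # Stand-in for the runner reading its input and doing inference.
        with open(path, "rb") as f:
            size = len(f.read())
        return device, size
    finally:
        # The job that created the file is the only one that removes it.
        os.remove(path)
```

With host order ["RTX PRO 6000", "RTX 5090"] and a runner that happens to list the cards the other way round, run_job for host index 0 resolves to runner index 1 without any restart, and the input file exists for the whole lifetime of the job.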

Thanks!