Topaz Video AI does not use GPU on Linux

michael.de_marliave · May 13, 2024, 9:32am

I want to use Video AI from the command line on Linux (Ubuntu 22.04 LTS).
I have tried various NVIDIA GPUs (GeForce RTX 40xx/30xx) and TVAI versions (3.x, 4.x, 5.x, alpha/beta).

I always run into the same problem: it uses the CPU, not the GPU. (The GPU drivers are up to date.)

The ffmpeg command is:

ffmpeg \
-v verbose \
-y \
-i "input.mp4" \
-profile:v high \
-preset medium \
-pix_fmt rgb48le \
-c:v h264_nvenc \
-filter_complex "tvai_up=model=prob-3:scale=2:preblur=-0.6:noise=0:details=1:halo=0.03:blur=1:compression=0:estimate=20:blend=0.8:device=0:vram=1:instances=1" \
-b:v 0 \
output.mp4

The logs look like this:

ffmpeg version git-2024-02-12-2e8b679fc Copyright (c) 2000-2024 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.4.0-1ubuntu1~22.04)
  configuration: --enable-shared --disable-ffplay --disable-libxcb --disable-sdl2 --disable-xlib --enable-tvai --enable-libvpx --enable-libaom --enable-nvenc --enable-nvdec --extra-cflags='-I./conan/include -I./conan/include/videoai' --extra-ldflags='-Wl,-rpath,./conan/lib/ -Wl,-rpath,/home/vsts/work/1/b/Qt/6.6.1/gcc_64/lib -L./conan/lib' --prefix=./output-conan/
  libavutil      58. 36.101 / 58. 36.101
  libavcodec     60. 37.100 / 60. 37.100
  libavformat    60. 20.100 / 60. 20.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 17.100 /  9. 17.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
[h264 @ 0x55a6e945edc0] Reinit context to 1920x1088, pix_fmt: yuv420p
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2024-05-01T10:29:01.000000Z
  Duration: 00:00:04.44, start: 0.000000, bitrate: 2128 kb/s
  Stream #0:0[0x1](eng): Video: h264 (Main), 1 reference frame (avc1 / 0x31637661), yuv420p(progressive, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 1784 kb/s, 29.97 fps, 29.97 tbr, 30k tbn (default)
      Metadata:
        creation_time   : 2024-05-01T10:29:01.000000Z
        handler_name    : ?Mainconcept Video Media Handler
        vendor_id       : [0][0][0][0]
        encoder         : AVC Coding
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 316 kb/s (default)
      Metadata:
        creation_time   : 2024-05-01T10:29:01.000000Z
        handler_name    : #Mainconcept MP4 Sound Media Handler
        vendor_id       : [0][0][0][0]
[out#0/mp4 @ 0x55a6e948b580] No explicit maps, mapping streams automatically...
[vost#0:0/h264_nvenc @ 0x55a6e946fd40] Created video stream from input stream 0:0
[Parsed_tvai_up_0 @ 0x55a6e9483700] Here init with params: prob-3 2 -2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
Using auto hwaccel type cuda with new default device.
[aost#0:1/aac @ 0x55a6eaf51f80] Created audio stream from input stream 0:1
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
[vost#0:0/h264_nvenc @ 0x55a6e946fd40] Starting thread...
[aost#0:1/aac @ 0x55a6eaf51f80] Starting thread...
[vf#0:0 @ 0x55a6e948c800] Starting thread...
[af#0:1 @ 0x55a6eafdd880] Starting thread...
[vist#0:0/h264 @ 0x55a6e9461300] Starting thread...
[aist#0:1/aac @ 0x55a6e94b0040] Starting thread...
[in#0/mov,mp4,m4a,3gp,3g2,mj2 @ 0x55a6e945c600] Starting thread...
Press [q] to stop, [?] for help
[h264 @ 0x55a6ea22d3c0] NVDEC capabilities:
[h264 @ 0x55a6ea22d3c0] format supported: yes, max_mb_count: 65536
[h264 @ 0x55a6ea22d3c0] min_width: 48, max_width: 4096
[h264 @ 0x55a6ea22d3c0] min_height: 16, max_height: 4096
[h264 @ 0x55a6ea22d3c0] Reinit context to 1920x1088, pix_fmt: cuda
[graph_1_in_0_1 @ 0x7fdae4003c40] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:stereo
[Parsed_tvai_up_0 @ 0x7fdaf0003a80] Here init with params: prob-3 2 -2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
[graph 0 input from stream 0:0 @ 0x7fdaf0004480] w:1920 h:1080 pixfmt:nv12 tb:1/30000 fr:30000/1001 sar:1/1 csp:unknown range:unknown
[auto_scale_0 @ 0x7fdaf0004280] w:iw h:ih flags:'' interl:0
[Parsed_tvai_up_0 @ 0x7fdaf0003a80] auto-inserting filter 'auto_scale_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_tvai_up_0'
[auto_scale_1 @ 0x7fdaf0015c00] w:iw h:ih flags:'' interl:0
[format @ 0x7fdaf00058c0] auto-inserting filter 'auto_scale_1' between the filter 'Parsed_tvai_up_0' and the filter 'format'
[auto_scale_0 @ 0x7fdaf0004280] w:1920 h:1080 fmt:nv12 csp:unknown range:unknown sar:1/1 -> w:1920 h:1080 fmt:rgb48le csp:gbr range:pc sar:1/1 flags:0x00000004
[Parsed_tvai_up_0 @ 0x7fdaf0003a80] Here init with perf options: model: a�
� scale: 1080 device: 1 vram: 0.000000 threads: 0 downloads: 0

2024-05-02 16:05:06 140578482241536  INFO:  Creating file fetcher for: https://veai-models.topazlabs.com REMOTE DIR: /
2024-05-02 16:05:06 140578482241536  INFO:  AIEngine mode: Normal 3.2.6
2024-05-02 16:05:06 140578482241536  INFO:  model dir path from env TVAI_MODEL_DIR and TVAI_MODEL_DATA_DIR /opt/TopazVideoAIBETA/models/models
2024-05-02 16:05:06 140578482241536  INFO:  Resetting to model directory to local directory 0/models0
2024-05-02 16:05:06 140578482241536  INFO:  ModelManager: setting modeldirPath: /opt/TopazVideoAIBETA/models
2024-05-02 16:05:06 140578482241536  INFO:  File Fetcher setting localDirPath to /opt/TopazVideoAIBETA/models
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/thm-1.json
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/video-encoders.json
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/audio-codecs.json
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/ref-1.json
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/aaa-10.json
2024-05-02 16:05:06 140578482241536  WARNING:  ModelManager: Ignoring model: /opt/TopazVideoAIBETA/models/benchmarks.json
2024-05-02 16:05:06 140578482241536  INFO: Loaded models info map
2024-05-02 16:05:06 140578482241536  INFO:  OS VER: 5.15
2024-05-02 16:05:06 140578482241536  INFO:  VNNI: 0 AVX: 1 CPUName: AMD Ryzen 9 5950X 16-Core Processor Thread Count: 16
2024-05-02 16:05:06 140578482241536  INFO:  Machine id: 536b6871c5484a56b85b0ad9ced3fc3e
2024-05-02 16:05:06 140578482241536  INFO:  Adding default GPU
2024-05-02 16:05:06 140578482241536  INFO: === System Information ===
2024-05-02 16:05:06 140578482241536  INFO: OS Linux Version 5.15
2024-05-02 16:05:06 140578482241536  INFO: CPU AMD Ryzen 9 5950X 16-Core Processor Threads 16 AVX 1 AVX2 1 VNNI 0
2024-05-02 16:05:06 140578482241536  INFO: Is Apple Processor 0
2024-05-02 16:05:06 140578482241536  INFO: RAM 31.2973 GB Total / 2.53529 Free/Used
2024-05-02 16:05:06 140578482241536  INFO: Machine ID: 536b6871c5484a56b85b0ad9ced3fc3e
2024-05-02 16:05:06 140578482241536  INFO: Device count: 1
2024-05-02 16:05:06 140578482241536  INFO: - Index 0 Name Default GPU Cores 0
2024-05-02 16:05:06 140578482241536  INFO:  VRAM 2 GB Total / 4.58659e-41 GB Used DataType 1223861280
2024-05-02 16:05:06 140578482241536  INFO:  Serial  ComputeLevel: 0 CUDA ID: -92360447 Visible: 1 Legacy: 1
2024-05-02 16:05:06 140578482241536  INFO: ==========================
2024-05-02 16:05:06 140578482241536  INFO:  Checking for authentication at /opt/TopazVideoAIBETA/models/auth.tpz
2024-05-02 16:05:06 140578482241536  INFO:  Successfully authenticated for user: michael@micorp.studio
2024-05-02 16:05:06 140578482241536  INFO:  Model found in list prob-3
2024-05-02 16:05:06 140578482241536  INFO: Input Size: 1920x1080 device auto changes index -1 extra thread count 8 max memory 1 original index -2 extra threads 0 max memory 1 max instances 0
2024-05-02 16:05:06 140578482241536  INFO:  Available Backends 52
2024-05-02 16:05:06 140578482241536  INFO:  Blocks total: 1 N 18 Blocks count 18
2024-05-02 16:05:06 140578482241536  INFO: Checking block sizes 1920x1080 penalty 0.05 overlap 48
2024-05-02 16:05:06 140578482241536  INFO: Make backend 6 input size 1920x1080 block 672x576 openvino device -1 memory 1
2024-05-02 16:05:06 140578482241536  INFO: Computing device instances
2024-05-02 16:05:06 140578482241536  INFO:  Apple 0 DEVICES 0
2024-05-02 16:05:06 140578482241536  INFO:  DEVICE INFO 1
2024-05-02 16:05:06 140578482241536  INFO:  Creating model name pru instance 0x7fdaf0b62b10
2024-05-02 16:05:06 140578482241536  INFO: NETS ["fgnet-fp16-[H]x[W]-[S]x-ox.tz"] OUTPUTS ["generator/output"]
2024-05-02 16:05:06 140578482241536  INFO:  COMPUTED FILE NAME fgnet-fp16-[H]x[W]-[S]x-ox.tz prob-v3-fgnet-fp16-576x672-2x-ox.tz
2024-05-02 16:05:06 140578482241536  INFO:  Locating local model: "/opt/TopazVideoAIBETA/models/prob-v3-fgnet-fp16-576x672-2x-ox.tz" status: 1
2024-05-02 16:05:06 140578482241536  INFO: CACHING MAY NOT WORK"/opt/TopazVideoAIBETA/models/prob-v3-fgnet-fp16-576x672-2x-ox.tz"
2024-05-02 16:05:07 140578482241536  INFO:  Loading filename prob-v3-fgnet-fp16-576x672-2x-ox.tz fgnet-fp16-[H]x[W]-[S]x-ox.tz ["generator/output"]
2024-05-02 16:05:07 140578482241536  INFO:  Total Devices: 1
2024-05-02 16:05:07 140578482241536  INFO:  DEVICE DETAILS: -1 INSTANCES 9
2024-05-02 16:05:07 140578482241536  INFO:  PARAM DEVICE: -1
2024-05-02 16:05:07 140578482241536  INFO: ModelBackend loading from file
2024-05-02 16:05:07 140578482241536  INFO: CACHING MAY NOT WORK"/opt/TopazVideoAIBETA/models/prob-v3-fgnet-fp16-576x672-2x-ox.tz"
2024-05-02 16:05:07 140578482241536  INFO:  Reading model file duration: 0.019 ms
2024-05-02 16:05:07 140578482241536  INFO:  Network Name: Model1 Size: 41607390
2024-05-02 16:05:07 140578482241536  INFO:  Transpose: 0
2024-05-02 16:05:07 140578482241536  INFO:  Inputs: 1
2024-05-02 16:05:07 140578482241536  INFO:  Input name: fgwnet/input:0
2024-05-02 16:05:07 140578482241536  INFO:  Outputs: 1
2024-05-02 16:05:07 140578482241536  INFO:  Output name: generator/output:0
2024-05-02 16:05:07 140578482241536  INFO:  Loading time for model file: /opt/TopazVideoAIBETA/models/prob-v3-fgnet-fp16-576x672-2x-ox.tz is 67.818 ms
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 2
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 3
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 4
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 5
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 6
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 7
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 8
2024-05-02 16:05:07 140578482241536  INFO:  Shared PTR: 9
2024-05-02 16:05:07 140578482241536  INFO:  TOTAL INSTANCES: 9
2024-05-02 16:05:07 140578482241536  INFO:  Init backend runner 9
2024-05-02 16:05:07 140578482241536  INFO: [Pipeline]Start layer: model
2024-05-02 16:05:07 140578482241536  INFO: [Pipeline]Connecting model to post
2024-05-02 16:05:07 140578482241536  INFO:  Auth Check Watermark: 0002022-10-182024-11-16
2024-05-02 16:05:07 140578482241536  INFO: [Pipeline]Output layer: post
2024-05-02 16:05:07 140578482241536  INFO:  Video Processor setup successfully for model prob-3
2024-05-02 16:05:07 140578482241536  INFO:  With input dimension (width x height) 1920 x 1080
2024-05-02 16:05:07 140578482241536  INFO:  With output dimension (width x height) 3840 x 2160[auto_scale_1 @ 0x7fdaf0015c00] w:3840 h:2160 fmt:rgb48le csp:gbr range:pc sar:1/1 -> w:3840 h:2160 fmt:yuv420p csp:unknown range:unknown sar:1/1 flags:0x00000004

2024-05-02 16:05:07 140578482241536  INFO: Preflight frame index 1 4
2024-05-02 16:05:07 140578482241536  INFO:  ---TBlockProc::TBlockProc W: 672 H: 576 C: 3 R: 2 X: 624 Y: 504
2024-05-02 16:05:07 140578482241536  INFO:  ---TBlockProc::TBlockProc W: 672 H: 576 C: 3 R: 2 X: 624 Y: 504
2024-05-02 16:05:07 140578482241536  INFO: Preflight frame index 2 4
2024-05-02 16:05:07 140578482241536  INFO: Preflight frame index 3 4
2024-05-02 16:05:07 140578482241536  INFO: Preflight frame index 2 4
2024-05-02 16:05:07 140574386835456  INFO:  OVM INITING
2024-05-02 16:05:07 140574386835456  INFO:  OV device selection: CPU 
2024-05-02 16:05:07 140578482241536  INFO: Preflight frame index 1 4AMD Ryzen 9 5950X 16-Core Processor            
2024-05-02 16:05:07 140574386835456  INFO:  OpenVino device string is CPU index -1
2024-05-02 16:05:07 140574386835456  INFO:  - Loaded network successfully for deviceid CPU index -1
2024-05-02 16:05:07 140574386835456  INFO:  ExeNetwork optimal requests: 1
2024-05-02 16:05:07 140574386835456  INFO:  OVM FINISHED INITING

Any help would be appreciated.
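
The line that matters in a log this long is the one where the OpenVINO backend reports the device it picked ("OV device selection: CPU" above). A minimal sketch for pulling that value out of a saved log, assuming the line format shown above; the function name `tvai_ov_device` and the file name `tvai.log` are hypothetical and not part of Topaz Video AI, and other backends may log the device differently:

```shell
#!/bin/sh
# Print the device named on every "OV device selection:" line read from stdin.
# Assumption: the log line looks like the one in the log above.
tvai_ov_device() {
    sed -n 's/.*OV device selection: *\([A-Za-z0-9_.]*\).*/\1/p'
}

# Usage (hypothetical file name):
#   ffmpeg ... 2>&1 | tee tvai.log
#   tvai_ov_device < tvai.log
```

If this prints CPU, the run fell back to the CPU backend regardless of what the tvai_up filter was asked to use.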

ForSerious · May 13, 2024, 2:15pm

Did you try the Linux beta?

michael.de_marliave · May 13, 2024, 2:30pm

Yes, both the beta and alpha versions.

gregory.maddra · May 13, 2024, 10:15pm

Is this in a Docker container, by any chance? If so, you could try setting ENV NVIDIA_DRIVER_CAPABILITIES=${NVIDIA_DRIVER_CAPABILITIES},video,graphics in your Dockerfile.

If not, what’s the version number of the driver you currently have installed?

eh117 · May 14, 2024, 12:02am

This is usually a CUDA issue. What is the output of nvidia-smi?

BTW, most programs have no problem accessing the GPU through wine.

michael.de_marliave · May 14, 2024, 3:27pm

I did try it in a Docker container, with the ENV variable you suggested:

$ echo $NVIDIA_DRIVER_CAPABILITIES
> compute,utility,video,graphics

I also tried without any container. The installed package versions are libnvidia-*-550.
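
To check the variable from inside the container without reading it by eye, something like the following could be used. It is only a sketch: `has_capability` is a hypothetical helper, and the set of capabilities checked (compute, video, graphics) follows the suggestion earlier in the thread; which of them TVAI actually requires is an assumption.

```shell
#!/bin/sh
# Succeed if the comma-separated capability list $1 contains capability $2.
# A value of "all" is treated as containing every capability.
has_capability() {
    case ",$1," in
        *,all,*|*",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report each capability of interest for the current environment.
for cap in compute video graphics; do
    if has_capability "${NVIDIA_DRIVER_CAPABILITIES:-}" "$cap"; then
        echo "$cap: present"
    else
        echo "$cap: missing"
    fi
done
```

With the value shown above (compute,utility,video,graphics), all three are reported as present, so the variable itself does not look like the culprit here.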

michael.de_marliave · May 14, 2024, 3:29pm

Here is the nvidia-smi output (inside my Docker container):

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.10              Driver Version: 551.61         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    On  |   00000000:0C:00.0  On |                  N/A |
| 32%   53C    P2            139W /  285W |    8582MiB /  16376MiB |     70%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        25      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+
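
The banner of this output carries three separate version numbers (the nvidia-smi tool, the driver, and the CUDA version). A small sketch for extracting them from the banner line so they can be compared or pasted into a report; `smi_versions` is a hypothetical helper name, and the pattern assumes the banner layout shown above:

```shell
#!/bin/sh
# Read nvidia-smi output on stdin and print the three versions from its banner
# as "smi=<version> driver=<version> cuda=<version>".
smi_versions() {
    sed -n 's/.*NVIDIA-SMI *\([0-9.]*\).*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/smi=\1 driver=\2 cuda=\3/p'
}

# Usage:
#   nvidia-smi | smi_versions
# The driver version alone can also be queried directly:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```

For the banner above this yields smi=550.54.10 driver=551.61 cuda=12.4.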

eh117 · May 15, 2024, 12:10am

Several things…

Where did you specify the GPU in your command? There is no -hwaccel or -hwaccel_device. You should specify CUDA or device 0. Try running a command like

ffmpeg -hwaccel cuda -i "input.mp4" -c:v h264_nvenc -profile high -preset medium -global_quality 23 -pix_fmt yuv420p -vf "tvai_up=model=prob-3:scale=2" "output.mp4"

Have you used prime-select to choose the Nvidia GPU first before running the command?

Also… you have the wrong NVIDIA driver. The current Linux driver shipped with CUDA 12.4.1 is 550.54.15, and the driver shipped with CUDA 12.4.0 was 550.54.14. The recommended version for your GPU on a Linux system is 535.179.
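
Comparing dotted version strings like these by eye is error-prone. A minimal sketch of the comparison, assuming GNU coreutils (`sort -V`) is available, as it is on Ubuntu; `version_lt` is a hypothetical helper, and the two values are simply the numbers quoted in this thread:

```shell
#!/bin/sh
# Succeed if version $1 is strictly older than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

installed=550.54.10   # NVIDIA-SMI version from the output earlier in the thread
wanted=550.54.14      # driver version quoted above for CUDA 12.4.0
if version_lt "$installed" "$wanted"; then
    echo "$installed is older than $wanted"
else
    echo "$installed is not older than $wanted"
fi
```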

gregory.maddra · May 16, 2024, 9:57pm

Do you get different results if you use the videoaiBETA-run wrapper script that the package installs?