Enabling TensorRT for an older NVIDIA GPU (GeForce 940M)

I am trying to improve the processing speed of the Proteus model. I run it from the command line with ffmpeg and a DOS batch script that reads a sequence of images from an Input folder and writes the enhanced files to an Output folder. To get the syntax right, my batch job uses the command line generated by the GUI. I’m using TIFF at 1440 x 1080, so it’s a very heavy task (circa 28,000 frames). Right now the model is enhancing well under 1 frame per second.

From time to time the batch script also crashes with out-of-memory errors, but it’s designed to pick up where it left off and so covers all the images. (It would be impossible to make this work from the Topaz GUI, btw.)
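For readers who want the same pick-up-where-it-left-off behaviour, here is a minimal Python sketch of the idea. It is not the poster's batch script: the function name `pending_frames` is made up for illustration, and it assumes that each enhanced file is written to the Output folder under the same name as its input file.

```python
from pathlib import Path


def pending_frames(input_dir, output_dir, pattern="*.tif*"):
    """Return the names of input frames that have no same-named output yet.

    Assumption: the enhanced file keeps the input file's name.
    """
    done = {p.name for p in Path(output_dir).glob(pattern)}
    return sorted(
        p.name for p in Path(input_dir).glob(pattern) if p.name not in done
    )
```

After a crash, a restart would only need to feed the returned names back into the ffmpeg call. A crash in the middle of a write can leave a truncated output file, so the last file written before the crash may be worth deleting first.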

But now I am ambitious and want to go one step further: I want to force the Topaz AI model to use TensorRT. Initially I imagined that just rebuilding ffmpeg with --enable-tensorrt would be enough, but then I realized that only Topaz’s own ffmpeg works, because it enables the -TVAI parameter, which requires Topaz’s code and libs.

Below are my CUDA, TensorRT and cuDNN paths and versions. Can somebody help me a bit?

CUDA

export CUDA_HOME=/c/PROGRA~1/NVIDIA~2/CUDA/v11.7
export CUDA_INCLUDE_DIR=/c/PROGRA~1/NVIDIA~2/CUDA/v11.7/include
export CUDA_LIBRARY=/c/PROGRA~1/NVIDIA~2/CUDA/v11.7/lib/x64
export NV_CODEC_INCLUDE=/c/nv-codec-headers/include/ffnvcodec

TENSORRT

export TENSORRT_HOME="/c/Program Files/NVIDIA GPU Computing Toolkit/TensorRT-8.4.1.5"
export TENSORRT_INCLUDE_DIR="/c/Program Files/NVIDIA GPU Computing Toolkit/TensorRT-8.4.1.5/include"
export TENSORRT_LIBRARY="/c/Program Files/NVIDIA GPU Computing Toolkit/TensorRT-8.4.1.5/lib"

CUDNN

export CUDNN_HOME=/c/PROGRA~1/NVIDIA/CUDNN/v9.6
export CUDNN_BIN_DIR=/c/PROGRA~1/NVIDIA/CUDNN/v9.6/bin/11.8
export CUDNN_INCLUDE_DIR=/c/PROGRA~1/NVIDIA/CUDNN/v9.6/include/11.8
export CUDNN_LIBRARY=/c/PROGRA~1/NVIDIA/CUDNN/v9.6/lib/11.8/x64

Considering TensorRT requires specific hardware that does not exist on the GeForce 940M… your endeavor is erroneous.
Maybe I don’t know what I’m talking about. I’ve only really listened to the marketing that makes it sound like it’s something that uses the RT cores in RTX class cards.

I think ForSerious is right: weren’t the tensor cores introduced with the RTX series of NVIDIA cards?

Thank you @ForSerious

Any chance you could give me alternative tips on how to speed things up in that case? I don’t want to run a job for 72 hours.
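To put the 72 hours in context, here is a rough back-of-the-envelope calculation using the circa 28,000 frames mentioned earlier in the thread. The helper names are made up for illustration, and the 0.5 fps figure is only a hypothetical target, not a measured speed.

```python
def required_fps(frames, hours):
    """Average frames per second needed to finish `frames` in `hours`."""
    return frames / (hours * 3600)


def hours_needed(frames, fps):
    """Hours needed to process `frames` at a steady `fps`."""
    return frames / fps / 3600


print(round(required_fps(28000, 72), 3))   # 0.108
print(round(hours_needed(28000, 0.5), 1))  # 15.6 (0.5 fps is hypothetical)
```

So a 72-hour run corresponds to roughly one frame every nine seconds, and even reaching half a frame per second would cut the job to well under a day.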

Process with two instances: one processes the first half of the video, the other the second half. I made a video on how to do this, though the version shown in it is a bit dated: https://youtu.be/kMw8Oo_pWFY
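For an image sequence like the one in the original post, the same split can be worked out per frame range. Below is a small Python sketch; the function name `split_frames` is hypothetical, and how each range is then passed to an ffmpeg instance depends on the command line the GUI generates, which is not shown in this thread.

```python
def split_frames(total_frames, instances=2):
    """Split frame indices 0..total_frames-1 into contiguous ranges.

    Returns a list of (start, end) pairs with `end` exclusive. Any
    remainder is spread over the first ranges, one frame each.
    """
    base, extra = divmod(total_frames, instances)
    ranges = []
    start = 0
    for i in range(instances):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


print(split_frames(28000))  # [(0, 14000), (14000, 28000)]
```

Each instance holds its own copy of the model in GPU memory, so on a card that already hits out-of-memory errors it is worth testing two instances on a short range first.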

I can’t think of anything that would help the speed of the 940M. I do know that I was getting bottlenecked by my storage drive. I use TIFF too. I upscale from DVD to FHD resolution, so using 4 TB or more of space is something that can happen.
My solution was to get two drives that can write consistently at 400 MB/s. (I joined several drives using the Windows version of RAID 0.) I have it read from one drive and write to the other. You might not need that though. I could tell I needed something because it would always start off fast, then slow down to a crawl after about 20 minutes.
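A quick way to judge whether storage is likely to matter is to estimate the write rate the output frames need. The sketch below assumes uncompressed RGB TIFF output at 1440 x 1080 and ignores file headers; the function name and the bit depths are illustrative assumptions, not details from the original post.

```python
def tiff_write_rate_mb_s(width, height, fps, channels=3, bytes_per_sample=1):
    """Approximate write rate in MB/s for uncompressed frames.

    Assumptions: no compression, no alpha channel, headers ignored.
    """
    return width * height * channels * bytes_per_sample * fps / 1e6


# 8-bit RGB at 1 frame per second
print(round(tiff_write_rate_mb_s(1440, 1080, 1), 2))  # 4.67
# 16-bit RGB at 1 frame per second
print(round(tiff_write_rate_mb_s(1440, 1080, 1, bytes_per_sample=2), 2))  # 9.33
```

Under those assumptions, a job running below 1 frame per second writes less than 10 MB/s, far under the 400 MB/s mentioned above, which fits the remark that the two-drive setup might not be needed here.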