I will not tire of saying it: this programme is not properly optimised, and it does not exploit the performance of the RTX 4090 at all.
I wish developers would finally focus on making performance normal for high-end graphics cards.
It can’t be the fault of anything external to the programme, as other AI programmes I use run like the wind.
This program (called enhancr, in case you’re interested) ships TensorRT (TRT) models, NVIDIA’s inference runtime that dramatically speeds up processing, and with them a music video takes about 1 minute to process. With TVAI and Apollo (or any other model), the same job takes about 15 minutes, or 30–45 minutes when combined with 4K upscaling at the same time. In the latest update the developer also added DirectML versions; DirectML is Microsoft’s counterpart to TensorRT and likewise speeds up performance drastically, much more than TVAI manages.
As I understand it, and as I read some time ago, TVAI has TRT models, so I don’t see the logic in their being so absurdly slow. They are simply poorly optimised, and perhaps increasing their VRAM usage would accelerate them. Are the models FP16? That would drastically increase performance on graphics cards with Tensor cores; if the models are FP32, it’s no wonder they are so slow.
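For context on why FP16 vs FP32 matters so much: FP16 halves both the storage and the memory bandwidth per weight, and Tensor cores run FP16 maths natively, which is where the speedup comes from. A minimal standard-library Python sketch of the size and precision trade-off (the 40M-parameter model is a made-up illustrative figure, not TVAI’s actual size):

```python
import struct

# 'e' = IEEE 754 half precision (FP16), 'f' = single precision (FP32)
fp16_bytes = struct.calcsize('e')  # 2 bytes per value
fp32_bytes = struct.calcsize('f')  # 4 bytes per value
print(fp16_bytes, fp32_bytes)      # 2 4

# Weight footprint of a hypothetical 40-million-parameter model:
params = 40_000_000
print(f"FP32: {params * fp32_bytes / 2**20:.0f} MiB")  # FP32: 153 MiB
print(f"FP16: {params * fp16_bytes / 2**20:.0f} MiB")  # FP16: 76 MiB

# FP16 trades precision for speed: a round-trip shows the rounding loss
x = 3.141592653589793
fp16_roundtrip = struct.unpack('e', struct.pack('e', x))[0]
print(fp16_roundtrip)  # 3.140625, close but not exact
```

The small rounding error is usually invisible in video-model output, which is why FP16 inference is considered a near-free 2x on Tensor-core GPUs.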
Your forum profile has been set to hidden, but I’m guessing you must be a new user of VEAI / VAI.
Otherwise you would remember how slow VEAI was in the old days and be grateful for how “fast” VAI is at the moment.
Since v1.7.0 VEAI has supported Tensor cores, and they started converting models from FP32 to FP16 on supported GPUs; many users saw their speed improve by a factor of two.
My point is that performance on a 4090 has always been woeful. The software is better suited to other graphics cards, as I’ve always said. That makes no sense: the 4090 is currently the most powerful graphics card for AI inference, and there is no reason for it to have lousy performance in TVAI.
I understand the frustration of having the most powerful GPU but not being able to use it to its full potential.
However, the same thing happened when the RTX 3090 was released; many users bought it for VEAI, only to find it offered little improvement over their previous card.
When I began with Topaz Video Enhance AI we only had Gaia HQ and Gaia CG. Processing a full movie took around 18 hours. Now, with the latest version, I can do the same work in between 4 and 8 hours, depending on the mode I choose.
Thank you for the correction; I had a typo in my last post: 24 fps is for 1x Artemis, 12 fps for 2x Artemis.
The real-life result might be slower than the benchmark, but it is already a big improvement compared to older versions of VEAI.
I guess CPU speed contributes to the benchmark’s calculated fps, since your newer CPU + 3000-series GPU gets a better benchmark score than my older CPU + 4000-series GPU. Either that, or Topaz cannot make use of the 4090’s full capability.
I’m trying to figure out how significant the CPU bottleneck is, or whether the low fps I get is due to poor optimisation for the 4090. I can’t just drop hundreds of dollars on a new CPU, RAM, and motherboard to test. Aside from buying a new CPU (which may or may not give a meaningful fps gain), I have tried a variety of ways to offload work from the CPU. Using the GPU encoder reduced CPU usage to roughly 50–60%, yet gave no noticeable improvement in speed (maybe 0.1 or 0.2 fps). I have also tried restricting ffmpeg’s affinity to a smaller number of cores; again, no noticeable reduction in fps until I limited ffmpeg to a very small number of cores (1–4). This testing suggests that even if I switched to the newest top-of-the-line CPU, the gain may not be big (like going from 6 fps to 15 fps for 1080p 2x, as the benchmarks suggest).
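For anyone wanting to repeat the affinity experiment: on Windows you can launch a process pinned to specific logical cores with `start /affinity <hexmask> ffmpeg.exe …`, where the mask has one bit per core. A small helper to compute that mask (the function name is my own, purely illustrative):

```python
def affinity_mask(cores):
    """Build the hex affinity mask used by Windows' `start /affinity`.

    Each logical core is one bit: core 0 -> bit 0, core 1 -> bit 1, ...
    """
    mask = 0
    for core in cores:
        mask |= 1 << core
    return format(mask, 'X')

# Pin to the first 4 logical cores (cores 0-3):
print(affinity_mask(range(4)))      # F
# Pin to the first 8 logical cores (cores 0-7):
print(affinity_mask(range(8)))      # FF
# Only even-numbered cores 0, 2, 4, 6 (e.g. skipping SMT siblings):
print(affinity_mask([0, 2, 4, 6]))  # 55
```

So `start /affinity F ffmpeg.exe …` would reproduce the 4-core case described above.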
The difference in quality is barely noticeable on CGI/IRL content. Sometimes CAIN produces artefacts in scenes where Apollo does not, and vice versa, but the artefacts are of the same kind: distorted, blurred movement, especially at scene changes.
With the Chronos model I’m quite satisfied: with a 1080p source I reach 50 fps, not bad at all. Comparing with RIFE (similar to Chronos, as both are optical-flow models), the speed I get with the latest 4.6 model in its TRT version, with TTA/Ensemble enabled (which increases quality at the cost of speed), is 160 fps and climbing. And the quality is exactly the same as the latest Chronos: it produces exactly the same small artefacts in the same places as RIFE 4.6 Ensemble, so I guess the devs brought this technology into TVAI.
Do the comparison yourself and draw your own conclusions. Mine are that the two models (CAIN and Apollo) are similar, because they produce the same types of artefacts, although Apollo gives an extra smoothness that CAIN lacks. Maybe I haven’t configured enhancr correctly, since I didn’t enable frame de-duplication as I did in TVAI, where it was on by default. I forgot to mention that both CAIN and Apollo handle static elements such as film credits without any problem: no distortion, no shaking letters. And RIFE and Chronos are exactly the same, at least in how the results look.
If possible, run both tests yourself and see how much better CAIN and RIFE use the GPU than the TVAI models do. Although to use CAIN and RIFE in their TRT versions you would have to pay at least €7 for the paid version of the program.
The 4090 has a tiny hardware switch that toggles between a “gaming” and a “silent” BIOS. Please check that yours is set to “gaming” for the best performance.
That would depend on the model. Mine does not have that switch; I can control the mode in software. Gaming/silent mode makes no difference in terms of speed.
Okay, please check your power-saving options in Windows next, especially the advanced settings. If “Maximum processor state” is set below 99% you lose half of your fps!