AI Audio Upscaler - Low quality audio sounds high quality

Mayday · March 17, 2025, 8:00am

Which form of enhancement are you thinking of? There are dedicated audio programs, even free ones like Audacity. When it comes to editing DTS or Atmos, that would hardly be possible for licensing reasons.

tomas.fojtik · March 17, 2025, 8:42am

You know, I’m just a guy who is utilizing the idle power of my desktop to enhance old movies I like and cannot find anymore.

While the picture looks good now, it still sounds like s****

And if the models for that are out there anyway, some kind of sound enhancement in one pass would be nice as well.

AND-E · March 17, 2025, 9:33am

Hi.

Mayday gave the correct answer to your original post: there are free sound editors, as well as paid professional ones that can edit DTS and Atmos.

Have you considered editing your videos in the free version of DaVinci Resolve first? It has a professional DAW sound editor called Fairlight, as well as professional colour grading features.

Then use Video AI for your final enhancements.

Hope this helps

tomas.fojtik · March 17, 2025, 10:26am

ok, noted with thanks

Mayday · March 23, 2025, 11:00am

I have an idea, but I’m not sure if it works. Theoretically: when an audio tool can only enhance stereo, you could separate each channel of a surround track. For example, when you have 5.1, create 6 different files, upload each one (for example here), and then join all the channels together again.
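A rough sketch of the split-and-rejoin idea, assuming the surround track has already been decoded to interleaved 16-bit PCM samples. The function names are made up for illustration, and `enhance_mono` is a hypothetical placeholder for whatever tool would process each per-channel file; nothing here decodes DTS or Atmos.

```python
from array import array


def split_channels(interleaved, num_channels):
    """De-interleave PCM samples into one array per channel."""
    if len(interleaved) % num_channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return [interleaved[ch::num_channels] for ch in range(num_channels)]


def join_channels(channels):
    """Re-interleave per-channel arrays into a single sample stream."""
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same length")
    out = array(channels[0].typecode)
    for frame in zip(*channels):
        out.extend(frame)
    return out


def enhance_mono(samples):
    # Hypothetical placeholder: a real enhancer would go here.
    return samples


# Two frames of 5.1 audio (6 samples per frame), as signed 16-bit values.
frames = array("h", range(12))
channels = split_channels(frames, 6)
processed = [enhance_mono(ch) for ch in channels]
restored = join_channels(processed)
```

The rejoin step only works if every processed channel keeps its original length, so an enhancer that trims or pads the audio would need its output realigned first.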

harald.thingelstad · May 19, 2025, 12:00am

I have video I haven’t been able to use due to serious wind distortion, and while RX may be able to do filtering, I would like to have something that recreates sound from the remaining information rather than just filtering it. For example, it could recognise what is being said, perhaps even with the help of a text prompt if necessary, pick up the voice from portions of the audio that are relatively noise free, and then put these things together again using a vocoder of some sort.
It would be a new product on the market, and as we video people often have more than enough to handle with the camera and all, it could be of great help sometimes.

Mike.M · May 19, 2025, 12:04am

Post a portion of the clip. I’d be curious to check it out with the software I have.

harald.thingelstad · May 19, 2025, 12:49am

Well, this audio clip here should be pretty average. I’m sure there is worse to be found. And the sound of the stream will further complicate things, I guess, because I would actually want some of it for realism, but it could be mixed back in. Just being able to salvage the voice would be great.
And the forum doesn’t accept audio files, only images, so here’s a link:

ForSerious · May 19, 2025, 3:11pm

You can zip them to upload directly on this site.
UVR5 Main.zip (14.2 MB)
UVR5 HQ4.zip (15.7 MB)
Here’s what a couple of the models did with it in Ultimate Vocal Remover 5. I put them in flac, but can upload them in wav if you want. i is for ‘instruments’ and v is for vocals.

harald.thingelstad · May 19, 2025, 10:48pm

That’s a huge improvement, thank you! And this Ultimate Vocal Remover even claims to run on Linux!
I can mix in some of the noise and hope that will tone down the artifacts, the important thing is that the voice doesn’t drown. I have some work to do, it seems!

ForSerious · May 19, 2025, 11:15pm

Best of luck! I really love what I have been able to get UVR to do, and it’s free!

meimeiriver · May 31, 2025, 7:11pm

Conversion to PCM would more than suffice for me.

Thing is, though, ‘upscaling’ audio is a lot harder than doing so for video. The latter consists of thousands of pixels, all offered in parallel per frame, which you can ‘delightfully’ interpolate between; whereas audio is essentially just a serialized stream of ‘1-pixel’ per frame. What you’d have to do to get something useful out of that is beyond the abilities of TVAI, I fear; but, more importantly, beyond the scope of TVAI.

ForSerious · June 1, 2025, 3:04am

What does upscaling audio mean to you? I’ve been using Stereo Tool to fill in the lost high frequencies in compressed audio and even in 80s recordings that didn’t have anything above 15kHz. I think it’s a form of interpolation, but maybe it’s called something else. Anyway, making an AI to do the same kind of thing would be ultra easy—so much so that it’s probably already been done.
Making one to go from oldie-moldy to clean and clear would be harder since you need a correct way to degrade clean recordings in the same way.
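To illustrate the degradation step, here is a minimal sketch that builds a (degraded, clean) training pair by rolling off the highs of a clean recording. It assumes a simple one-pole low-pass filter, which is far gentler than real codec or tape losses; the function names and the 15 kHz default are chosen only for illustration.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Attenuate content above cutoff_hz with a one-pole IIR filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)  # move a fraction of the way to the input
        out.append(state)
    return out


def make_training_pair(clean, cutoff_hz=15000.0, sample_rate=48000):
    """Return (degraded, clean); a model would learn degraded -> clean."""
    return one_pole_lowpass(clean, cutoff_hz, sample_rate), clean


# A 440 Hz test tone, one tenth of a second at 48 kHz.
tone = [math.sin(2.0 * math.pi * 440.0 * i / 48000) for i in range(4800)]
degraded, target = make_training_pair(tone)
```

The hard part described above is exactly what this sketch skips: a convincing ‘oldie-moldy’ degradation would also need noise, distortion, and whatever else the old medium added.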

meimeiriver · June 1, 2025, 6:09am

If so, I will go look for it. Haven’t really seen it, though, yet; but I would definitely be interested in ‘upscaling’ old AC3 5.1 DVD tracks, for instance.

pleochroic · September 11, 2025, 6:16pm

Good sound can make a video seem as much as 2X better on a subjective basis with human subjects.

pleochroic · September 11, 2025, 6:22pm

Here is just a high-level summary of the research from Human Factors and Perceptual Cognitive Science studies over several decades. It is, in my opinion, irrefutable.

If you have any doubts, a simple way to test this is to compare how you feel when you have a crystal-clear, hear-a-pin-drop telephone conversation versus a scratchy phone connection with snaps, pops, background noise, and possibly the sound of other people’s voices. In the first instance, you are calm and relaxed, and you feel like the person you are talking to is in the same room. In the second instance, you are already concentrating so hard just to understand the words that engaging in a meaningful, stress-free conversation is next to impossible.

Scientific evidence from perceptual cognitive psychology and human factors studies consistently shows that audio quality can influence the subjective perception of video quality, sometimes making a video seem higher resolution or more pleasant when the audio track is improved—even when the video resolution itself remains unchanged.

Key Findings from Experimental Studies

Cross-modal Influence: Studies find that viewers rate the video content as higher quality when the accompanying audio is of high fidelity, even if the video resolution remains the same. The perceived video quality can improve in parallel with audio quality due to the integration of sensory modalities in cognitive processing.
Attention and Task Dependence: Research demonstrates that when viewers focus on audio-related tasks or content, their ratings of video quality tend to follow perceptions of the audio—sometimes regardless of the actual video fidelity.
Subjective Quality Ratings: Experiments using simultaneous manipulations of audio and video codecs show changes in mean opinion scores (MOS) for both modalities, recording instances where improved audio raises the rating of perceived video quality, especially in immersive scenarios.
Content and Comprehension Effects: Changes in audio quality (such as reduction in noise or improved clarity) have also been shown to enhance comprehension and overall enjoyment of audiovisual media, which leads to higher subjective ratings for the combined experience, including visual clarity.

Representative Research

Långvik (2015), collaborating with Ericsson, found that “results show a perceived difference in video quality following the perception of audio quality, although the video quality was never altered”—demonstrating the cognitive bias toward rating overall AV quality higher when audio is improved.
Rimell et al. (2008) and others report that “the perceived quality of the other modality (video) seems to follow the quality of the audio, regardless of the video’s true quality,” highlighting a robust cross-modal effect in subjective judgment.

In summary: There is strong, peer-reviewed evidence that an improved audio track can make a video subjectively appear higher resolution or quality in human ratings and perceptual studies, due to cognitive integration of multimodal sensory input and rating biases.

ForSerious · September 11, 2025, 7:02pm

I never have the audio playing when I’m evaluating how well the AI enhancement model did.

I agree that better audio quality can be key, but who knows if Topaz will go into that.

pleochroic · September 11, 2025, 7:37pm

The fact that this idea has 29 votes, one of the higher numbers I have seen for a user proposal, and after about 1335 days nothing has been done, no clear justification for inaction, is not a good sign.

Voice of the Customer (VOC) matters.

Not listening to the VOC results in bad karma.

pleochroic · September 11, 2025, 8:32pm

Topaz Labs LLC should endeavor to improve the sound quality of the video that it visually enhances and upscales.

Companies that ignore audio improvements or test only isolated video metrics risk missing critical opportunities for perceptual optimization and competitive advantage.

pleochroic · September 11, 2025, 8:58pm

Thanks for the tip on ffmpeg, ForSerious.

My question is more mundane. If I burn a DVD with the MP4, should I consider re-encoding the soundtrack with a different codec so it will play on the now mostly forgotten DVD players that some relatives of mine still have? [I for one no longer have a DVD player, except one for the PC and the Mac, and I am not convinced that it is representative of the consumer-grade, Joe-average DVD players connected to TVs.]

Best,
Michael
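One possible starting point for the DVD question, sketched as a small Python helper that only builds an ffmpeg command line. It assumes ffmpeg is installed and relies on its `-target` preset (`pal-dvd` or `ntsc-dvd`), which selects DVD-compatible video and audio settings together rather than changing the audio codec alone. The helper name and file names are hypothetical, and the resulting file would still have to be authored into a DVD-Video disc with a separate tool.

```python
def dvd_transcode_command(source, output, standard="pal"):
    """Build an ffmpeg command that re-encodes a file with the DVD preset."""
    if standard not in ("pal", "ntsc"):
        raise ValueError("standard must be 'pal' or 'ntsc'")
    return ["ffmpeg", "-i", source, "-target", f"{standard}-dvd", output]


# Hypothetical file names; pass the list to subprocess.run() to execute it.
command = dvd_transcode_command("movie.mp4", "movie.mpg", standard="ntsc")
print(" ".join(command))
```

Whether a given player also accepts an MP4 burned as a plain data disc depends on the player, so testing on the actual device is the only reliable check.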