AI Audio Upscaler - Low quality audio sounds high quality

Which form of enhancement are you thinking of? There are dedicated audio programs, even free ones like Audacity. When it comes to editing DTS or Atmos, that would hardly be possible for licensing reasons.

You know, I’m just a guy who is using the idle power of my desktop to enhance old movies I like and cannot find anymore.

While the picture looks good now, it still sounds like s****

And if the models for that are out there anyway, some kind of sound enhancement in the same pass would be nice as well.

Hi.

Mayday gave the correct answer to your original post: there are free sound editors, as well as paid professional ones that can edit DTS and Atmos.

Have you considered editing your videos first in the free version of DaVinci Resolve, which has a professional DAW sound editor called Fairlight and professional colour grading features?

Then use Video AI for your final enhancements.

Hope this helps

ok, noted with thanks

I have an idea, but I’m not sure if it works. Theoretically: when an audio tool can only enhance stereo, you could separate each channel of a surround track. For example, when you have 5.1, create 6 different files, upload each one (here, for example), and then join all the channels together again :wink:
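
To make the split-and-rejoin idea concrete, here is a minimal Python sketch. It assumes the surround track is already decoded to interleaved PCM samples (one value per channel per frame). The names `split_channels`, `join_channels`, `enhance_mono` and `process_surround` are made up for illustration, and `enhance_mono` is only a placeholder for whatever tool you would actually run on each channel.

```python
def split_channels(interleaved, num_channels):
    """Deinterleave a flat sample list into one list per channel."""
    if len(interleaved) % num_channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return [interleaved[ch::num_channels] for ch in range(num_channels)]


def join_channels(channels):
    """Interleave per-channel lists back into one flat sample list."""
    if len({len(ch) for ch in channels}) != 1:
        raise ValueError("all channels must have the same length")
    return [sample for frame in zip(*channels) for sample in frame]


def enhance_mono(samples):
    """Hypothetical placeholder: a real enhancer would go here.

    It must return exactly as many samples as it receives,
    otherwise the channels drift out of sync when rejoined.
    """
    return list(samples)


def process_surround(interleaved, num_channels=6):
    """Split (5.1 = 6 channels by default), process each channel, rejoin."""
    channels = split_channels(interleaved, num_channels)
    processed = [enhance_mono(ch) for ch in channels]
    return join_channels(processed)


if __name__ == "__main__":
    # Two frames of fake 5.1 audio: samples 0-5 are frame 1, 6-11 are frame 2.
    frames = list(range(12))
    print(split_channels(frames, 6))
    print(process_surround(frames) == frames)
```

One caveat with this approach: each channel is processed without knowledge of the others, so a tool that changes timing or phase on one channel could upset the surround image after rejoining.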

I have video I haven’t been able to use due to serious wind distortion, and while RX may be able to do filtering, I would like to have something that recreates sound from the remaining information rather than just filtering it. Something that can recognise what is being said, perhaps even with the help of a text prompt if necessary, as well as the voice from portions of the audio that are relatively noise free, and then put these things together again using a vocoder of some sort.
It would be a new product on the market, and as we video people often have more than enough to handle with the camera and all, it could be of great help sometimes.

Post a portion of the clip. I’d be curious to check it out with the software I have.

Well, this audio clip here should be pretty average. I’m sure there is worse to be found. And the sound of the stream will further complicate things, I guess, because I would actually want some of it for realism, but it could be mixed back in. Just being able to salvage the voice would be great.
And the forum doesn’t accept audio files, only images, so here’s a link:

You can zip them to upload directly on this site.
UVR5 Main.zip (14.2 MB)
UVR5 HQ4.zip (15.7 MB)
Here’s what a couple of the models did with it in Ultimate Vocal Remover 5. I put them in flac, but can upload them in wav if you want. i is for ‘instruments’ and v is for vocals.

That’s a huge improvement, thank you! And this Ultimate Vocal Remover even claims to run on Linux!
I can mix in some of the noise and hope that will tone down the artifacts; the important thing is that the voice doesn’t get drowned out. I have some work to do, it seems! :smiley:

Best of luck! I really love what I have been able to get UVR to do, and it’s free!

Conversion to PCM would more than suffice for me.

Thing is, though, ‘upscaling’ audio is a lot harder than doing so for video. The latter consists of thousands of pixels, all offered parallel per frame, for which you can ‘delightfully’ interpolate between them; whereas audio is essentially just a serialized stream of ‘1-pixel’ per frame. What you’d have to do to get something useful out of that is beyond the abilities of TVAI, I fear; but, more importantly, beyond the scope of TVAI.

What does upscaling audio mean to you? I’ve been using Stereo Tool to fill in the lost high frequencies in compressed audio and even in 80s recordings that didn’t have anything above 15kHz. I think it’s a form of interpolation, but maybe it’s called something else. Anyway, making an AI to do the same kind of thing would be ultra easy—so much so that it’s probably already been done.
Making one to go from oldie-moldy to clean and clear would be harder since you need a correct way to degrade clean recordings in the same way.
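
As a rough illustration of what "degrading clean recordings" for training could look like, here is a small Python sketch. It is only an assumption about the general approach, not how Stereo Tool or any existing product works: a crude moving-average low-pass dulls the highs, a little random noise is added, and the result is paired with the untouched original. The function names and the `window`, `noise_level` and `seed` parameters are invented for the example. A real pipeline would have to imitate the actual damage (tape, codec, old microphones), which is the hard part.

```python
import math
import random


def moving_average_lowpass(samples, window):
    """Crude low-pass: average each sample with up to `window - 1` predecessors."""
    out = []
    acc = 0.0
    for i, sample in enumerate(samples):
        acc += sample
        if i >= window:
            acc -= samples[i - window]  # drop the sample that left the window
        out.append(acc / min(i + 1, window))
    return out


def degrade(clean, window=8, noise_level=0.01, seed=0):
    """Dull the highs, then add uniform noise in [-noise_level, noise_level]."""
    rng = random.Random(seed)  # seeded so the degradation is repeatable
    dull = moving_average_lowpass(clean, window)
    return [s + rng.uniform(-noise_level, noise_level) for s in dull]


def make_training_pair(clean, **kwargs):
    """Return (degraded input, clean target) for a hypothetical model."""
    return degrade(clean, **kwargs), list(clean)


if __name__ == "__main__":
    # A 12 kHz tone sampled at 48 kHz repeats every 4 samples.
    tone = [math.sin(math.pi * n / 2) for n in range(64)]
    degraded, target = make_training_pair(tone, window=4, noise_level=0.0)
    # With a 4-sample window the tone averages out almost completely.
    print(max(abs(x) for x in degraded[4:]))
```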

If so, I will go look for it. :slight_smile: Haven’t really seen it, though, yet; but I would definitely be interested in ‘upscaling’ old AC3 5.1 DVD tracks, for instance.

Good sound can make a video seem as much as 2X better on a subjective basis with human subjects.

Here is just a high-level summary of the research from Human Factors and Perceptual Cognitive Science studies over several decades. It is, in my opinion, irrefutable.

If you have any doubts, a simple way to test this is to compare how you feel when you have a crystal clear, hear-the-pin-drop telephone conversation versus a scratchy phone connection with snaps, pops, background noise, and possibly the sound of other people’s voices. In the first instance, you are calm and relaxed, and feel like the person you are talking to is in the same room. In the second instance, you are already concentrating so hard just to understand the words that being able to engage in a meaningful, stress-free conversation is next to impossible.

Scientific evidence from perceptual cognitive psychology and human factors studies consistently shows that audio quality can influence the subjective perception of video quality, sometimes making a video seem higher resolution or more pleasant when the audio track is improved—even when the video resolution itself remains unchanged.

Key Findings from Experimental Studies

  • Cross-modal Influence: Studies find that viewers rate the video content as higher quality when the accompanying audio is of high fidelity, even if the video resolution remains the same. The perceived video quality can improve in parallel with audio quality due to the integration of sensory modalities in cognitive processing.

  • Attention and Task Dependence: Research demonstrates that when viewers focus on audio-related tasks or content, their ratings of video quality tend to follow perceptions of the audio—sometimes regardless of the actual video fidelity.

  • Subjective Quality Ratings: Experiments using simultaneous manipulations of audio and video codecs show changes in mean opinion scores (MOS) for both modalities, recording instances where improved audio raises the rating of perceived video quality, especially in immersive scenarios.

  • Content and Comprehension Effects: Changes in audio quality (such as reduction in noise or improved clarity) have also been shown to enhance comprehension and overall enjoyment of audiovisual media, which leads to higher subjective ratings for the combined experience, including visual clarity.

Representative Research

  • Långvik (2015), collaborating with Ericsson, found that “results show a perceived difference in video quality following the perception of audio quality, although the video quality was never altered”, demonstrating the cognitive bias toward rating overall AV quality higher when audio is improved.

  • Rimell et al. (2008) and others report that “the perceived quality of the other modality (video) seems to follow the quality of the audio, regardless of the video’s true quality,” highlighting a robust cross-modal effect in subjective judgment.

In summary: There is strong, peer-reviewed evidence that an improved audio track can make a video subjectively appear higher resolution or quality in human ratings and perceptual studies, due to cognitive integration of multimodal sensory input and rating biases.

I never have the audio playing when I’m evaluating how well the AI enhancement model did.

I agree that better audio quality can be key, but who knows if Topaz will go into that.

The fact that this idea has 29 votes, one of the higher numbers I have seen for a user proposal, and that after about 1335 days nothing has been done, with no clear justification for inaction, is not a good sign.

Voice of the Customer (VOC) matters.

Not listening to the VOC results in bad karma.

Topaz Labs LLC should endeavor to improve the sound quality of the video that it visually enhances and upscales.

Companies that ignore audio improvements or test only isolated video metrics risk missing critical opportunities for perceptual optimization and competitive advantage.

Thanks for the tip on ffmpeg, ForSerious.

My question is more mundane. If I burn a DVD with the MP4, should I think of recoding the sound track to a different codec so it will play on the now mostly forgotten DVD players that some relatives of mine still have? [I for one no longer have a DVD player, except the ones in the PC and the Mac, and I am not convinced that they are representative of the consumer-grade, Joe-average DVD players connected to TVs.]

Best,
Michael