
AI Services for audio separation? Experiences?

Hi everyone,

I'm new to all of this and have something I have wanted to do for years. Most recently I learned about this forum, but have been a fan of the SW OG project editing for a very long time.

Anyway, has anyone used any AI services for audio separation? I am not opposed to separating the tracks myself, but there may be some effects I want to keep, and trace amounts of OST left over in the center channel will not work for what I am going to do.

The only service I am aware of at the moment is Audionamix, and for all I know they are not even a real company. I am sure it is far from free, but hopefully not totally unreasonable for a single film.

For now that is my only question for audio editing. If all goes well perhaps then I'll have other questions for improving the mixes once the OST is gone.

Thanks!
 
I use Ultimate Vocal Remover, and have had some extremely good results from some very tricky scenes. You can get good results from a single pass using one of the various models, but the best results come from using the ensemble mode and experimenting, which can quite effectively leave you with clean – and separate – dialogue, music, and SFX tracks.

Best of all, it's free.
 
Thank you for the suggestion, that might be my ticket. I want to do whatever is absolutely best. This project has been with me at heart for almost 6 years now. To me it'll be the most epic thing ever; to everyone else, who knows.

BTW I found someone quoted $50 a month for the one I first named, so I think no matter what I do the technical experience will be a bigger challenge than the price.
 
Only one I use is melody.ml

I haven't used many others so I can't say how it compares. It works for my requirements.

Bit of advice: separate your audio tracks. It may seem like a lot of work, but you are going to get a far superior result that way, as you will be retaining as much audio data as possible.
 
I've been doing a lot of audio separation recently, using both the website lalal.ai and the downloadable program RipX. The intended purpose of both of these is separating stems/instruments in music (eg piano/vocals/drums etc), but I've been using them both for separating dialogue/SFX and score from movies, cleaning up background noise etc.

RipX is a program with a 30 day trial. I used it extensively for those 30 days, getting what I thought I would need, but then didn't purchase.

Lalal.ai gives about 10 free minutes of audio processing to start, but then you have to pay for more. I paid about £30 for 600 minutes of audio processing time, which I consider to be great value. I've still only used about 40 minutes and there's no shelf-life on using your minutes.

Both services gave pretty good results, though not perfect – it really depends on the source and what you're trying to do. Some audio is just more complex and harder to separate than others. When there was something I needed to separate, I would often do it using both services, and pick and choose the bits that gave the best results from either. Lalal is web based, so the processing is done on their servers and is very quick, clean, and convenient. RipX is offline, so I have to do the processing on my own computer, but it gives you a lot of control over which parts of the audio you keep. You have a side-scrolling timeline of all the elements in the audio, and you can turn individual notes on and off as you see fit (it will be a familiar interface if you've ever used any music production/sequencing programs). It's hard to describe, but sometimes it's really useful to have that more fine-grained control.

I also found I could get better results by being careful with what audio I used to start with. For example, my project has 5.1 audio, with the music that I wanted to remove in all channels, but often it would be relatively quiet in the centre channel, which had the dialogue/SFX that I wanted. I would run the separation on that mono channel alone, since the music there was about half as loud relative to the dialogue as it was in the overall mix.
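
For anyone who wants to try the centre-channel trick without an audio editor, here is a rough sketch of pulling one channel out of a multichannel WAV using only Python's standard library. The function name and file names are made up for illustration. It assumes an uncompressed PCM WAV in the usual 5.1 channel order (FL, FR, FC, LFE, then the surrounds), which would put the centre at index 2; check your own file's layout before trusting that. Some 5.1 WAVs use an extended header that older Python versions may refuse to open, in which case you would need to re-export the file first.

```python
import wave


def extract_channel(src_path, dst_path, channel_index=2):
    """Copy one channel of a PCM WAV file into a new mono WAV file.

    channel_index=2 is the centre channel ONLY if the file uses the
    common FL, FR, FC, LFE, ... ordering (an assumption, not a given).
    Returns the number of samples written.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()  # bytes per sample
        frame_rate = src.getframerate()
        if not 0 <= channel_index < n_channels:
            raise ValueError(
                f"channel {channel_index} not present in a {n_channels}-channel file"
            )
        frames = src.readframes(src.getnframes())

    # Samples are interleaved: each frame holds one sample per channel.
    frame_size = n_channels * sample_width
    offset = channel_index * sample_width
    mono = bytearray()
    for start in range(0, len(frames), frame_size):
        mono += frames[start + offset : start + offset + sample_width]

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(bytes(mono))

    return len(mono) // sample_width


if __name__ == "__main__":
    # Hypothetical file names, replace with your own.
    extract_channel("film_51.wav", "film_centre.wav")
```

The resulting mono file is what you would then upload or feed into whichever separation tool you prefer. This reads the whole file into memory, so for a full-length film you may want to process it in chunks instead.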
 
Just going to add MVSEP.com to my recommendations. UVR5 is really good, but the models haven't been updated in over a year and they generally struggle with separating sound effects from scores. MVSEP now has several models trained explicitly to separate SFX and they work excellently – while working on my Deep Impact edit I was able to split one of Morgan Freeman's speeches, the hushed whispering and footsteps around him, and the quiet score for the scene (which UVR5 couldn't filter out without taking the lower pitches of Freeman's voice with it) into three separate tracks.

I recommend registering. Using the site without registration restricts you to a maximum quality of 320 kbps MP3 and subjects you to an hour+ wait in a queue for your file to be processed, while registering gives you WAV and FLAC output and brings the wait down to anywhere between 30 seconds and a couple of minutes. You can pay to skip the queue entirely, but I hardly think that's necessary.
 
Thanks for this, just gave it a whirl, seems pretty good!
 
I use DeMIX Pro for music production and find it excellent at separating vocal and other instrument stems.
 
Which model did you use? Bandit plus?
A lot of their models are the same ones that UVR uses, and I'd rather do the processing locally on my machine than upload somewhere. Any idea where the model itself comes from, so it could be installed for local use?
 
I've found that MVSep Demucs4HT DNR (dialog, sfx, music) cleans up most of the stuff I throw at it, and I use Bandit Plus to clean up any of the output files where it doesn't catch everything.

Best I can tell the newer models – specifically the ones with the MVSep prefix – are proprietary forks. Looks like you can get an offline version here, though it seems to only run a handful of models.
 
Vocal Remover and Isolation (if you google it) works for my purposes, though to be fair, the soundtrack of my source is pretty sparse. I do like it because it separates things like footsteps and thumps into the drums track, though this only works if the music playing doesn't have drums in it.
 