5.1 surround sound dialog normalization

seciors

I'm trying to finalize my surround sound mix, and am not sure what to put in for "dialog normalization." The default is -27. I'm not sure if this value affects the overall volume or not, but right now it seems to me that my overall volume is lower than normal. Is this what is causing that to happen or would it be something else?

Thanks for any help/advice!
 
EDIT: While the information below is not inaccurate, it is somewhat less relevant in the modern digital era and to fanediting vs. original editing. The following is good educational information but be sure to read the discussion that follows it, or if you just want the ultra simple answer for making a dolby digital audio file for a fanedit: Set dynamic range compression to film standard and dialogue normalization to -27dB.

-----

I have spent a great deal of time researching this subject and have attempted to put together a guide here to help others wade through this very confusing subject. I hope this guide helps my fellow faneditors to produce professional-grade sound and reduce some of the headaches that this extremely challenging topic invariably gives to most of us.

What is dialogue normalization (N)? First off, this question cannot be properly answered without also talking about dynamic range compression (DRC). The goal Dolby had when implementing the Dolby Digital standard many, many years ago was "to make all AC-3 encoded audio files have the same listening level, regardless of the source file." Put simply, N "specifies the average volume of the dialogue, using decibels" (http://documentation.apple.com/en/compressor/usermanual/index.html#chapter=10&section=1&tasks=true). This essentially establishes a center point for a sound range, which Dolby defined as being -31 dB (decibels) below the rated output of a decoder, which is defined as 0 dB. Dolby's goal was to make it so that all video would have the same average output levels. It was a noble goal that humans found a way to thwart, as always, but we will discuss what I shall refer to as the Loud Wars later.
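
To make the arithmetic concrete, here is a minimal Python sketch of how a decoder is generally understood to use N: it turns playback down by the gap between the N value and -31 dB, so -31 means no level shift and anything closer to 0 means more attenuation. The function name is made up for illustration, and the -31 to -1 dB range is the usual AC-3 convention rather than something stated in this post.

```python
def decoder_attenuation_db(dialnorm_db: int) -> int:
    """Attenuation (in dB) a decoder applies for a given dialogue normalization value.

    Assumption: valid values run from -31 to -1 dB, and playback is turned
    down by the gap between the value and -31 dB.
    """
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialogue normalization must be between -31 and -1 dB")
    return 31 + dialnorm_db


for n in (-31, -27, -19):
    print(f"N = {n} dB: decoder turns playback down by {decoder_attenuation_db(n)} dB")
```

Read this way, the default of -27 costs 4 dB of playback level compared with -31, which is one possible reason an encode can sound quieter than expected.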

DRC was implemented by Dolby to address two other issues. The first is squeezing the range of sound levels into one that still sounds rich on a home theater but isn't so wide that you have to wake your neighbors across the street with the sound of explosions in order to hear the quieter sounds of footsteps. This is a subjective issue that depends on preference, listening environment, and type of audio being listened to, which is why there are different options for line mode and RF mode (RF mode is for broadcast; just set RF mode to whatever you set line mode to, which is what we care about here). The second is protection against decoder overload, which can result in audio clipping and distortion. This can occur when downmixing from 5.1 to stereo, and unfortunately these days it can occur even without downmixing because of the Loud Wars.

Both N and DRC settings are encoded into the AC-3 file metadata. In other words, the raw audio stream is unaffected by these settings. The metadata simply suggests to a Dolby Digital decoder how to pre-process the audio stream. Thus if you take a wave file, encode it to an AC-3, and then transcode back to a wave file, the wave files should sound the same (minus any transcoding data loss), but the AC-3 file may sound different depending on its metadata settings.

Let's say that you have a w64 or PCM audio file you have finished editing and now you want to encode this to ac3. If your only goal is to make the ac3 sound identical to the w64, then here is what you should do. Set dialogue normalization to -31 dB and set the dynamic range compression to none. Voila.

Now let me try to convince you why this is wrong. Here is what Dolby has to say on the subject:

"None is strictly a professional mode that defeats both dynamic range compression and Dialog Normalization. This mode (None) is used to hear the full level and dynamics in the program material being encoded. It is never allowed in a consumer product and may not represent either the level balance or dynamic range heard in a consumer decoder. Encoding judgments based on this compression mode should not be made." (http://www.dolby.com/uploadedFiles/...lish_PDFs/Professional/L.mn.0002.5.1guide.pdf).

First off, unless you went through your entire audio file in your editor and adjusted the master mix levels to prevent any clipping, -31 dB should never be used. In all likelihood you will find that some clipping of at least 3-6 dB is occurring in the raw waveform, and so at minimum dialogue normalization should be set to -27, which is the default setting. This is the essence of the Loud Wars. These Dolby standards, designed to ensure quality sound, have been somewhat defeated by studios catering to the lowest common denominator: people who define good sound as the loudest possible sound at the lowest possible volume setting and who listen on cheap sound systems that don't highlight differences in sound quality very well. (Is it so hard to turn the volume knob up?) The unfortunate result is the compression of the audio's dynamic range, not by Dolby's settings, but by the studio, and they hard-coded it into the audio file so we can never get that data back. The reality is that having a good healthy dynamic range will make an audio file sound far richer and more vibrant than increasing the bitrate ever will, but there is a trade-off that one can't escape. All decoders have a limit at which the decoder overloads and sound winds up clipping and distorting. This ceiling implies that in order to have a deeper, richer dynamic range, you must have an average sound level (which for film we define as the level of normal dialogue) that is quieter.

Ok, moving on. Here is what Dolby and I recommend (and this is how my Matrix edit was encoded, so you can listen to that and decide whether you like or don't like this setup). Set your N number to the level of normal dialogue in the movie. Unfortunately this is easier said than done. Professional tools exist that will automatically analyze a clip and provide a normalization number. You can also download tools that will analyze the average RMS amplitude in dB (decibels) and you can use this number (the doom9 link at the bottom discusses some of these options further). With a piece of music this might work (though I can't imagine why anyone would ever want to use ac3 for music), but with film both methods will only yield an accurate number if you have it analyze a clip that is completely void of any sound other than the dialogue! Personally, I think the best way to do it is to say, 'fuck the computer,' and use your ears. Here is how I do it using Sony Vegas Pro 10 and Sound Forge Studio 10:

1) I open Vegas and set my audio properties to stereo.
2) I render a short piece of audio with normal dialogue, containing as little background noise as possible, as a wave file.
3) I open it in Sound Forge, go to the "Process" menu, and click on "Normalize". Then I preview the audio at different normalize settings and compare this to the same clip of audio in my Vegas timeline (both as stereo!!!). I start at -31 dB and work my way up until the two clips sound equal in volume to my ears.
4) I switch back to 5.1 surround in my Vegas timeline (if needed, of course!) and set the Dolby Digital Pro AC-3 encoder N value to the value I determined.

For me it was -19 dB with my Matrix edit and -13 dB for the Animatrix. An N of -15 dB or closer to 0 is really starting to push it in terms of having a nice healthy range, and we can pretty much directly interpret this to mean that the studio fucked us on range for the sake of loudness.
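
For anyone who wants a starting number from the computer before trusting their ears, the RMS measurement mentioned earlier can be sketched in a few lines of Python. This is only an unweighted RMS over samples scaled to the -1.0 to 1.0 range, not what any particular analysis tool does, and the function names and the synthetic test tone are stand-ins for a real dialogue-only clip.

```python
import math


def rms_dbfs(samples):
    """Unweighted RMS level in dB relative to full scale (1.0)."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    return 10 * math.log10(mean_square)


def suggest_dialnorm(level_db):
    """Clamp a measured level to the -31..-1 dB range and round to a whole dB."""
    return round(max(-31.0, min(-1.0, level_db)))


# Stand-in for a dialogue clip: one second of a 1 kHz tone sampled at 48 kHz.
rate = 48000
amplitude = 0.1
tone = [amplitude * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]

level = rms_dbfs(tone)
print(f"RMS level: {level:.1f} dBFS, suggested N: {suggest_dialnorm(level)}")
```

As noted above, a number like this only means something if the clip contains nothing but conversational dialogue, so treat it as a sanity check on the by-ear comparison rather than a replacement for it.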

Now go to the pre-processing tab and decide what to do with the DRC. I personally feel that the only option for what we do here is to use the 'film light' setting. This gives you a large null (unmodified) range around your center value and provides the lightest curve (greatest dynamic range) to the audio outside of the null range. What happens is that, past the null range around your center point, the Dolby Digital decoder begins to apply an algorithm that boosts quiet sounds and attenuates loud sounds. This might sound scary to you, as it did to me when I first looked at those settings, but it really is a good thing, unless maybe you are planning on playback in a true theater. (Then an argument could be made to encode at -31 dB N, after eliminating any clipping in your audio editor, and set DRC to none.) This datasheet from Dolby gives the specific curve values for each DRC setting, as well as other general information. (http://www.dolby.com/uploadedFiles/...133_m.ch.0002.DP569Guide_Chart.QuickStart.pdf).
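
The shape of such a curve can be sketched in Python as follows. This is a simplified model with a single ratio on each side of the null range, and every number in the two profiles is an illustrative placeholder, not Dolby's actual preset values; take the real breakpoints from the chart linked above. The names `DrcProfile`, `LIGHT_LIKE`, `TIGHTER`, and `drc_gain_db` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrcProfile:
    null_low_db: float   # bottom of the untouched band, relative to dialogue level
    null_high_db: float  # top of the untouched band, relative to dialogue level
    ratio: float         # compression ratio outside the band (2.0 means 2:1)
    max_boost_db: float  # cap on how far quiet sounds are raised


# Placeholder numbers for illustration only.
LIGHT_LIKE = DrcProfile(null_low_db=-10.0, null_high_db=10.0, ratio=2.0, max_boost_db=6.0)
TIGHTER = DrcProfile(null_low_db=0.0, null_high_db=5.0, ratio=2.0, max_boost_db=6.0)


def drc_gain_db(level_db, dialnorm_db, profile):
    """Gain (dB) the simplified curve applies to a sound at level_db."""
    rel = level_db - dialnorm_db          # position relative to the dialogue level
    slope = 1.0 - 1.0 / profile.ratio     # dB of gain change per dB outside the band
    if rel < profile.null_low_db:         # quiet sound: boost, up to the cap
        return min(profile.max_boost_db, (profile.null_low_db - rel) * slope)
    if rel > profile.null_high_db:        # loud sound: attenuate
        return -(rel - profile.null_high_db) * slope
    return 0.0                            # inside the null range: untouched


for level in (-50.0, -27.0, -5.0):
    light = drc_gain_db(level, -27.0, LIGHT_LIKE)
    tight = drc_gain_db(level, -27.0, TIGHTER)
    print(f"{level:6.1f} dB: light-like {light:+.1f} dB, tighter {tight:+.1f} dB")
```

Everything in this model is measured relative to the N value, so changing N moves the null range and changes which sounds get boosted or cut.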

Now what you may find yourself thinking, despite not liking the idea of range compression, is that your dialogue sounds too quiet relative to the loud stuff going on for your taste and listening environment. There are two ways to fix this. The best way is to make sure that you have properly set your N and DRC values. Bad things happen if you choose to use DRC and don't have N set properly. Now your dolby digital decoder is trying (or not trying) to boost or attenuate portions of the audio that it should or shouldn't in order to produce properly balanced audio. This can wreak havoc on the audio levels on playback when DRC is trying to apply an algorithm to audio that isn't properly centered. It has no way of knowing on its own what the center is. It relies on you to tell it with N. So you really only have two choices. You can either properly set N and use DRC, or use whatever you want for N, -31 dB being the loudest option because no level shifting to normalize will be done, but then you must set DRC to none. Use those settings at your own risk and get ready to man the volume button on your remote control.

If you still think your dialogue sounds too quiet, the 2nd way to fix it is to choose the film standard DRC setting. It applies a much steeper curve to the range compression than film light does. The steepest of all curves is the 'speech' setting. Case in point: I have received a couple complaints from people who felt that the dialogue in my matrix edit was "too quiet." This might be because their systems are not doing proper dolby digital decoding, or it might be because Film Light DRC is still too much range for their personal tastes and sound system. It is your call as the editor which setting to use. In truth you will probably have more people happy with your sound design if you choose film standard than film light. Whether to please the masses or the snobs is the never-ending battle. What I may try to do as a compromise in the future (if I have the room for an extra audio track) is do a 5.1 track as film light DRC and then a separate stereo track as film standard.

The other settings on that preprocessing tab that I check for fanediting are 'DC-highpass filter', 'bandwidth low-pass filter', and '90 degree phase shift'. LFE lowpass filter is really only useful if you are doing original audio mastering (i.e. your audio tracks consist of sound effects and music instruments vs. speaker channel outputs). EDIT: None of these really matter in a fanedit.

For further explanation, the following is a great forum thread that goes into a bit more of a technical description with math, graphs, and all:

http://forum.doom9.org/showthread.php?t=56020
 
Now it's my turn to ask a question! Does anyone know the scoop regarding the "set copyright bit" and "set as original copy" settings in the AC-3 Pro settings?
 
geminigod,
Thanks for your excellent explanation! I can understand now why, in my own setup, I have the center and surround speaker outputs boosted (by the decoder) when I watch movies.

I did end up using -27db for my Return of the Sith release, but for my next one I might try -24 during my testing to see if it makes a difference. I mostly stuck with -27db since I figured the material was already properly normalized for this setting and that would be the safest path.

BTW, I do edit using AIFF PCM files, with each channel group as its own file, so it's doubtful I would hit the 4GB size...though it's good to know about that!

Sorry I don't have an answer to your questions, though you might want to start a new thread since the topic of mine might prevent people from finding it and thus answering it!
 
see edited initial post in this thread by me.
 
geminigod,

Thanks for giving me/us such useful information! If there is a FAQ or sticky this info can be put into, I really think it should be done!

Question for you -- if the center channel only contains dialog, can you analyze the center channel by itself to determine the N value, or do you need to analyze dialog within the context of all the channels downmixed to stereo?

Also, just to make sure I understand, by "normal dialog level" do you mean a scene where people are talking at "conversational" levels? Such as a conversation inside a room? Do you then try to determine what the average db is during that scene?
 
Yes, "normal dialogue" was meant to imply conversational dialogue. Sorry. A scene where a married couple are arguing may not be where you want your center point to be. ;-)

If the center channel is only dialogue it should be fine if software is analyzing it for you, but if you are trying to do it using your ears as I outlined, then you have to be careful that you are comparing apples to apples. Mono, stereo, and 5.1 playback may all sound different.
 
@geminigod - you are a fount of knowledge. Thanks for seeking out the info, experimenting and reporting the results. The sound in my firstling edit would have been better if I'd known this stuff.
 
For Mac users, I found the following free software called AudioLeak, which will give you both the unweighted and A-weighted RMS value of any size file.
http://www.channld.com/audle.html

According to the Dolby AC-3 Metadata Guide, the value to use is the A-weighted one.
The Dialogue Level setting represents the long-term A-weighted average level of dialogue within a presentation, Leq(A).
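
For anyone wondering what "A-weighted" means in practice: the measurement de-emphasizes low and very high frequencies, where hearing is less sensitive, before averaging. Below is a small Python sketch of the standard A-weighting curve as it is commonly published (the function name is my own). It only shows the curve itself and is not a description of how AudioLeak or Dolby's tools compute Leq(A).

```python
import math


def a_weighting_db(freq_hz):
    """Standard A-weighting gain in dB at a frequency, roughly 0 dB at 1 kHz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to about 0 dB at 1 kHz.
    return 20 * math.log10(numerator / denominator) + 2.00


for f in (100, 1000, 10000):
    print(f"{f:>5} Hz: {a_weighting_db(f):+.1f} dB")
```

Low-frequency rumble and effects therefore count for much less than speech-range content in an A-weighted figure, which is presumably why it suits a dialogue measurement better than the unweighted value.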
 
Great links Seciors.

It is interesting looking at the algorithm parameters for the DRC music presets. There is hardly any difference between these and film except that music light has a steeper attenuation on the levels close to the null range. Maybe this might be useful for some dvd menu music or something. The more important thing, I think, is to understand the difference between film light and film standard. The difference is huge and 'standard' is hardly standard when it comes to DVD and blu-ray movies. For these mediums the 'film light' preset is most definitely the standard. What would be better is if there were a 'film broadcast' option for line mode in which people could normalize to -20dB with the tighter DRC range, but then this would defeat Dolby's objective of having a normalized level of sound for any ac3 output. Personally I think the normalization issue is more relevant to broadcast and wish I had more control over my options than Dolby gives, but c'est la vie.
 
geminigod said:
Both N and DRC settings are encoded into the ac-3 file metadata.

Is there a tool that will rewrite the metadata of an AC-3 file without re-encoding it?
 
Seems like that should be possible, though I haven't come across anything like that. That would be a nice project for some programmer out there.
 
It would be awesome if there were a great open source app for handling and encoding dolby digital. Then I could just export as w64 from Vegas. This would be especially useful for people using the Studio version of Vegas.

PS: In my previous posts, I really emphasized using Film Light DRC setting over Film Standard, but on further consideration after receiving feedback from people about my Matrix edit, there is a strong argument to be made for using Film Standard for these fan edits. The narrower range can be more pleasing for the average home listener who doesn't want it to be loud and doesn't want to man the volume control constantly.
 
geminigod said:
It would be awesome if there were a great open source app for handling and encoding dolby digital.

The closest that is available is probably WAV to AC3 Encoder, which is a GUI for Aften. I have used it quite often. It will not accept w64 files, but it might be possible to convert to another lossless format that it will accept.
 
I stumbled across this wikipedia article that gives more info related to the loudness wars and a good history of the issue. http://en.wikipedia.org/wiki/Loudness_war

The way in which I am now handling some of these issues on my edits has dramatically changed since I posted my initial guide at the beginning of this thread. Once I am done with my current fanedit, I will post some of my new thoughts and methods here.
 
Captain Khajiit said:
Is there a tool that will rewrite the metadata of an AC-3 file without re-encoding it?

I have done a little research on this issue. As near as I can tell there is no way to change the DRC without re-encoding the audio, however it is possible to change the DN value. Also, eac3to can remove the metadata entirely from the command line version.
 
geminigod said:
As near as I can tell there is no way to change the DRC without re-encoding the audio, however it is possible to change the DN value

That's the conclusion that I came to as well. There was an old app called VOBDNorm that could change the DN value, but I've heard no mention of changing DRC, and I don't think we'll now see anything developed that can change it. Some knowledgeable people on Doom9 think dialnorm is obsolete these days anyway. After all, DTS doesn't use it.
 
Captain Khajiit said:
Some knowledgeable people on Doom9 think dialnorm is obsolete these days anyway. After all, DTS doesn't use it.

That is the conclusion I have come to as well. It had its uses I think at one point in time, but at this juncture the digital audio editing tools are such that I have decided it makes more sense for me to apply my own compression and normalization to the exact level I want rather than monkeying around with Dolby's highly limited algorithms. This is what I am doing with Two Towers Rebuilt. The Dolby Digital AC3 tracks that I encode will be set to no DRC and -31dB DN. EDIT: I am only using these settings over those posted at the beginning of the first post on this thread because I have manually hardcoded in my own balancing, range compression, and normalization prior to rendering my AC3 file.

Also, since we are working with audio tracks that have already been mastered by people essentially doing what I just described so that they can have more control and because they are working with formats other than dolby digital, I am finding that this issue is not such a big deal with many audio tracks. Understanding the concepts though is important because even mastered tracks are still variable. In truth, I could do nothing with the Two Towers True HD audio that I pulled off blu-ray and it would sound great. The audio engineers did a great job. The Matrix audio on the other hand had a huge dynamic range, levels weren't balanced properly, and it frankly needed even more work than what could be done by just changing dolby digital settings.

Lastly, the original Dolby Digital argument that too much dynamic range can be a bad thing (which I furthered in my original post to this thread) has, I think, proven to be even more true than Dolby originally anticipated. One way or another, either by compressing the entire range further or by handling the center track separately, I have decided that sticking dialogue right in the middle of the range at -31 dB is just too quiet.
 