EDIT: While the information below is not inaccurate, it is somewhat less relevant in the modern digital era, and less relevant to fanediting than to original editing. The following is good educational information, but be sure to read the discussion that follows it. If you just want the ultra-simple answer for making a Dolby Digital audio file for a fanedit: set dynamic range compression to Film Standard and dialogue normalization to -27 dB.
-----
I have spent a great deal of time researching this subject and have attempted to put together a guide here to help others wade through this very confusing subject. I hope this guide helps my fellow faneditors to produce professional-grade sound and reduce some of the headaches that this extremely challenging topic invariably gives to most of us.
What is dialogue normalization (N)? First off, this question cannot be properly answered without also talking about dynamic range compression (DRC). The goal Dolby had when implementing the Dolby Digital standard many, many years ago was "to make all AC-3 encoded audio files have the same listening level, regardless of the source file." Put simply, N "Specifies the average volume of the dialogue, using decibels." (http://documentation.apple.com/en/compressor/usermanual/index.html#chapter=10&section=1&tasks=true). This essentially establishes a center point for a sound range, which Dolby defined as sitting at -31 dB (decibels) relative to the rated output of a decoder, with that rated output defined as 0 dB. In practice the decoder turns the audio down by the difference between your N value and -31 dB, so an N of -27 dB means 4 dB of attenuation. Dolby's goal was to make it so that all video would have the same average output levels. It was a noble goal that humans found a way to thwart, as always, but we will discuss what I shall refer to as the Loud Wars later.
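To make the center point concrete, here is a minimal Python sketch of the level shift that an N value implies. The function name is my own invention for illustration; the only thing it encodes is that N ranges from -31 to -1 dB and that the decoder attenuates by the distance between N and -31 dB.

```python
def dialnorm_attenuation_db(dialnorm_db: int) -> int:
    """Return how many dB a decoder turns the audio down for a given N.

    Hypothetical helper for illustration: N (dialnorm) is a whole number
    from -31 to -1 dB, and the shift moves dialogue down to -31 dB.
    """
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dB")
    return dialnorm_db + 31


# N values mentioned in this guide, with the attenuation each implies.
for n in (-31, -27, -19, -13):
    print(f"N = {n} dB -> decoder attenuates by {dialnorm_attenuation_db(n)} dB")
```

So an N of -31 dB means "leave the level alone", and every step toward 0 asks the decoder to turn the whole track down by one more dB.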
DRC was implemented by Dolby to address two other issues. The first is fitting the range of sound levels into a range that still sounds rich on a home theater but isn't so wide that you have to wake your neighbors across the street with the sound of explosions in order to hear the quieter sounds of footsteps. This is a subjective issue that depends on preference, listening environment, and the type of audio being listened to, which is why there are different options for line mode and RF mode. (RF mode is for broadcast. Just set RF mode to whatever you set line mode to, which is what we care about here.) The second is protection against decoder overload, which can result in audio clipping and distortion. Overload can occur when downmixing from 5.1 to stereo, and unfortunately these days it can occur even without downmixing because of the Loud Wars.
Both N and DRC settings are encoded into the AC-3 file's metadata. In other words, the raw audio stream is unaffected by these settings. The metadata simply suggests to a Dolby Digital decoder how to pre-process the audio stream. Thus if you take a wave file, encode it to AC-3, and then transcode back to a wave file, the two wave files should sound the same (minus any transcoding data loss, and as long as the transcoder ignores the metadata), but the AC-3 file may sound different on playback depending on its metadata settings.
Let's say that you have a w64 or PCM audio file you have finished editing and now you want to encode this to ac3. If your only goal is to make the ac3 sound identical to the w64, then here is what you should do. Set dialogue normalization to -31 dB and set the dynamic range compression to none. Voila.
Now let me try to convince you why this is wrong. Here is what Dolby has to say on the subject:
"None is strictly a professional mode that defeats both dynamic range compression and Dialog Normalization. This mode (None) is used to hear the full level and dynamics in the program material being encoded. It is never allowed in a consumer product and may not represent either the level balance or dynamic range heard in a consumer decoder. Encoding judgments based on this compression mode should not be made." (http://www.dolby.com/uploadedFiles/...lish_PDFs/Professional/L.mn.0002.5.1guide.pdf).
First off, unless you went through your entire audio file in your editor and adjusted the master mix levels to prevent any clipping, -31 dB should never be used. In all likelihood you will find that some clipping of at least 3-6 dB is occurring in the raw waveform, and so at minimum dialogue normalization should be set to -27 dB, which is the default setting.
This is the essence of the Loud Wars. These Dolby standards, designed to ensure quality sound, have been somewhat defeated by studios catering to the lowest common denominator: listeners who define good sound as the loudest possible sound at the lowest possible volume setting, and who listen on cheap sound systems that don't reveal differences in sound quality very well. (Is it so hard to turn the volume knob up?) The unfortunate result is compression of the audio's dynamic range, not by Dolby's settings but by the studio, and it is hard-coded into the audio file, so we can never get that data back. The reality is that a good, healthy dynamic range will make an audio file sound far richer and more vibrant than increasing the bitrate ever will, but there is a trade-off that one can't escape. Every decoder has a ceiling above which it overloads and the sound clips and distorts. This ceiling implies that in order to have a deeper, richer dynamic range, you must have an average sound level (which for film we define as the level of normal dialogue) that is quieter.
Ok, moving on. Here is what Dolby and I recommend (and this is how my Matrix edit was encoded, so you can listen to that and decide whether you like or don't like this setup). Set your N number to the level of normal dialogue in the movie. Unfortunately this is easier said than done. Professional tools exist that will automatically analyze a clip and provide a normalization number. You can also download tools that will analyze the average RMS amplitude in dB (decibels), and you can use this number (the doom9 link at the bottom discusses some of these options further). With a piece of music this might work (though I can't imagine why anyone would ever want to use AC-3 for music), but with film both methods will only yield an accurate number if you have them analyze a clip that is completely devoid of any sound other than the dialogue! Personally, I think the best way to do it is to say, 'fuck the computer,' and use your ears. Here is how I do it using Sony Vegas Pro 10 and Sound Forge Studio 10:
1) I open Vegas and set my audio properties to stereo.
2) I render a short piece of audio with normal dialogue, containing as little background noise as possible, as a wave file.
3) I open it in Sound Forge, go to the "Process" menu, and click on "Normalize". Then I preview the audio at different normalize settings and compare this to the same clip of audio in my Vegas timeline (both as stereo!!!). I start at -31 dB and work my way up until the two clips sound equal in volume to my ears.
4) I switch back to 5.1 surround in my Vegas timeline (if needed, of course!) and set the N value in the Dolby Digital Pro AC-3 encoder to the value I determined.
For my Matrix edit it was -19 dB, and -13 dB for the Animatrix. Anything at -15 dB or closer to 0 is really starting to push it in terms of having a nice healthy range, and we can pretty much directly interpret this to mean that the studio fucked us on range for the sake of loudness.
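If you would rather sanity-check your ears against a number, the RMS approach mentioned earlier can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: it handles 16-bit PCM wave files only, it lumps all channels together, it measures plain unweighted RMS rather than whatever weighting a professional dialogue meter applies, and the function names are made up for this example. As noted above, the number only means something if the clip contains nothing but dialogue.

```python
import array
import math
import sys
import wave


def rms_dbfs(samples, full_scale=32768.0):
    """Unweighted RMS level of integer PCM samples, in dB relative to full scale."""
    if len(samples) == 0:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (full_scale * full_scale))


def peak_dbfs(samples, full_scale=32768.0):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / full_scale)


def read_pcm16(path):
    """Read every sample (channels interleaved) from a 16-bit PCM wave file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # wave files store samples little-endian
    return samples


if __name__ == "__main__" and len(sys.argv) > 1:
    clip = read_pcm16(sys.argv[1])
    print(f"RMS:  {rms_dbfs(clip):.1f} dB")
    print(f"Peak: {peak_dbfs(clip):.1f} dB")
```

Run it on the short dialogue-only clip you rendered in step 2, for example `python measure.py dialogue_clip.wav` (both file names are placeholders). Treat the RMS figure as a starting guess for N, not a replacement for the listening comparison, and a peak sitting at or very near 0 dB is a hint that the source is already clipping.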
Now go to the pre-processing tab and decide what to do with the DRC. I personally feel that the only option for what we do here is to use the 'film light' setting. This gives you a large null (unmodified) range around your center value and applies the lightest curve (greatest dynamic range) to the audio outside of the null range. What happens is that, past the null range on either side of your center point, the Dolby Digital decoder begins to apply an algorithm that boosts quiet sounds and attenuates loud sounds. This might sound scary to you, as it did to me when I first looked at those settings, but it really is a good thing, unless perhaps you are planning on playback in a true theater. (Then an argument could be made to encode at -31 dB N, after eliminating any clipping in your audio editor, and set DRC to none.) This datasheet from Dolby gives the specific curve values for each DRC setting, as well as other general information. (http://www.dolby.com/uploadedFiles/...133_m.ch.0002.DP569Guide_Chart.QuickStart.pdf).
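For anyone who finds the "null range plus curve" idea easier to follow as code, here is a toy Python model of the general shape. To be clear, every number in it (the width of the null range, the ratio, the boost and cut limits) is a placeholder I picked for illustration and NOT Dolby's actual film light or film standard values; those are in the datasheet linked above. Levels are expressed relative to the dialogue level you declared with N.

```python
def drc_gain_db(level_db, null_low=-10.0, null_high=10.0,
                ratio=2.0, max_boost=6.0, max_cut=-12.0):
    """Toy DRC curve: gain in dB for a sound at `level_db` relative to dialogue.

    All default parameters are hypothetical placeholders, not Dolby's
    published profile values.
    """
    if null_low <= level_db <= null_high:
        return 0.0  # inside the null range: audio is left alone
    slope = 1.0 - 1.0 / ratio  # dB of gain change per dB outside the null range
    if level_db < null_low:
        # Quiet sounds get boosted, up to a ceiling.
        return min((null_low - level_db) * slope, max_boost)
    # Loud sounds get attenuated (negative gain), down to a floor.
    return max((null_high - level_db) * slope, max_cut)


for level in (-30, -20, 0, 20, 40):
    print(f"{level:+4d} dB relative to dialogue -> gain {drc_gain_db(level):+.1f} dB")
```

A "lighter" setting in this toy model would mean a wider null range, and a "steeper" one would mean a narrower null range, so more of the track gets pushed toward the dialogue level.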
Now what you may find yourself thinking, despite not liking the idea of range compression, is that your dialogue sounds too quiet relative to the loud stuff going on for your taste and listening environment. There are two ways to fix this. The best way is to make sure that you have properly set your N and DRC values. Bad things happen if you choose to use DRC and don't have N set properly: your Dolby Digital decoder ends up boosting or attenuating portions of the audio that it shouldn't, and leaving alone portions that it should adjust. This can wreak havoc on the audio levels on playback, because DRC is applying its algorithm to audio that isn't properly centered. It has no way of knowing on its own what the center is. It relies on you to tell it with N. So you really only have two choices. You can either properly set N and use DRC, or use whatever you want for N (-31 dB being the loudest option, because no level shifting to normalize will be done), but then you must set DRC to none. Use those settings at your own risk and get ready to man the volume button on your remote control.
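Here is a small Python sketch of why a wrong N throws DRC off. It is a deliberately simplified model: the 5 dB half-width of the null range is a made-up placeholder, the function names are mine, and the real profiles are more involved. The point is only that DRC judges every sound by its distance from the center you declared, not from where the dialogue really is.

```python
def relative_to_center(level_db, dialnorm_db):
    """Distance (dB) of a sound from the center that the metadata declares."""
    return level_db - dialnorm_db


def drc_action(level_db, dialnorm_db, null_half_width_db=5.0):
    """Say what a toy DRC curve would do to a sound at `level_db`.

    `null_half_width_db` is a hypothetical placeholder, not a Dolby value.
    """
    rel = relative_to_center(level_db, dialnorm_db)
    if rel > null_half_width_db:
        return "attenuated"
    if rel < -null_half_width_db:
        return "boosted"
    return "untouched"


# Suppose normal dialogue in the edit really sits at -19 dB.
true_dialogue_db = -19.0
for declared_n in (-19, -27, -31):
    action = drc_action(true_dialogue_db, declared_n)
    print(f"N = {declared_n} dB -> dialogue is {action}")
```

With N set correctly the dialogue lands in the null range and is left alone. Declare -27 or -31 dB instead and the same dialogue looks 8 or 12 dB "too loud" to the curve, so it gets squashed along with the explosions.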
If you still think your dialogue sounds too quiet, the second way to fix it is to choose the film standard DRC setting. It applies a much steeper curve to the range compression than film light does. The steepest of all curves is the 'speech' setting. Case in point: I have received a couple of complaints from people who felt that the dialogue in my Matrix edit was "too quiet." This might be because their systems are not doing proper Dolby Digital decoding, or it might be because film light DRC still leaves too much range for their personal tastes and sound system. It is your call as the editor which setting to use. In truth, you will probably have more people happy with your sound design if you choose film standard than film light. Whether to please the masses or the snobs is the never-ending battle. What I may try to do as a compromise in the future (if I have the room for an extra audio track) is do a 5.1 track with film light DRC and then a separate stereo track with film standard.
The other settings on that pre-processing tab that I check for fanediting are 'DC-highpass filter', 'bandwidth low-pass filter', and '90 degree phase shift'. The LFE lowpass filter is really only useful if you are doing original audio mastering (i.e. your audio tracks consist of sound effects and musical instruments rather than speaker channel outputs). EDIT: None of these really matter in a fanedit.
For further explanation, the following is a great forum thread that goes into a bit more of a technical description with math, graphs, and all:
http://forum.doom9.org/showthread.php?t=56020