Blog/Audio normalization with FFmpeg

From Forza's ramblings

2020-08-03: Audio normalization with ffmpeg


FFmpeg is a great tool for audio and video encoding. Here we'll go through audio normalization using the loudnorm filter.

The loudnorm filter works best when you run it in two passes. The first pass checks the audio properties of the source file. Then you use those properties as input in the second pass.

FFmpeg option Description
-i Input file.
Options for the loudnorm filter:
-filter:a loudnorm Activates the loudnorm audio filter.
linear=true Normalize by linearly scaling the source audio. This needs all four measured_ values below. If the target LRA is lower than the source LRA, or the gain change would push the true peak above the tp target, the filter reverts to dynamic normalization.
i= Set integrated loudness target. Range is -70.0 to -5.0. Default value is -24.0. For EBU R128 normalization a target of -23 LUFS should be used.
lra= Set loudness range target. Range is 1.0 to 20.0. Default value is 7.0.
tp= Set maximum true peak. Range is -9.0 to +0.0. Default value is -2.0.
offset= Set offset gain. Gain is applied before the true-peak limiter. Range is -99.0 to +99.0. Default is +0.0.
measured_I= Measured integrated loudness of input file. Range is -99.0 to +0.0.
measured_TP= Measured true peak of input file. Range is -99.0 to +99.0.
measured_LRA= Measured loudness range of input file. Range is 0.0 to 99.0.
measured_thresh= Measured threshold of input file. Range is -99.0 to +0.0.
print_format=json Enables JSON output, which is needed to get the parameters for the second pass.
Options for the aresample filter:
aresample= Enable sample rate conversion.
resampler= Selects the swr or soxr sample rate conversion engine.
out_sample_rate= Output audio sample rate.
precision= soxr internal precision in bits. 28 to 33 is considered very high quality and suitable for 24-bit audio. 20 is the default and is suitable for 16-bit output with noise shaping/dithering filters. Read the FAQ.
FFmpeg options
-ac Audio channels. Use 2 for stereo, 6 for 5.1 and 8 for 7.1 sources. Only needed with the aresample filter.
-ar Audio sample rate.
-c:a Audio codec. AAC, MP3, FLAC, WAV, etc.
-b:a Audio bitrate.

Pass 1:

# ffmpeg -i audio.dts -filter:a loudnorm=print_format=json -f null -
 [Parsed_loudnorm_0 @ 00000236cfdc12c0]
{
        "input_i" : "-23.01",
        "input_tp" : "-10.11",
        "input_lra" : "18.80",
        "input_thresh" : "-34.44",
        "output_i" : "-24.39",
        "output_tp" : "-5.99",
        "output_lra" : "10.70",
        "output_thresh" : "-35.00",
        "normalization_type" : "dynamic",
        "target_offset" : "0.39"
}
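Copying these numbers by hand gets tedious, so here is a minimal shell sketch for pulling them out of the log. The function names loudnorm_json and loudnorm_field are made up for this example, and it assumes the JSON object sits on its own lines exactly as in the output above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ffmpeg.

# Print only the JSON object from an ffmpeg log read on stdin.
# Assumes the object starts on a line holding just "{" and ends on a
# line holding just "}", as in the pass 1 output above.
loudnorm_json() {
    sed -n '/^[[:space:]]*{[[:space:]]*$/,/^[[:space:]]*}[[:space:]]*$/p'
}

# Print the value of one field (for example input_i) from that JSON on stdin.
# Assumes one "key" : "value" pair per line.
loudnorm_field() {
    sed -n 's/^[[:space:]]*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p'
}
```

Since ffmpeg writes its log to stderr, the first pass could be piped as ffmpeg -i audio.dts -filter:a loudnorm=print_format=json -f null - 2>&1 | loudnorm_json, and a single value read with loudnorm_field input_i.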

Pass 2:

The loudnorm filter resamples the audio to 192 kHz. To bring the output back down to the source sample rate, we use -ar 48000 for 48 kHz or -ar 44100 for 44.1 kHz.

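We also want the overall loudness target to be -23 LUFS (Loudness Units relative to Full Scale). Note that the first pass above ran with the default targets. Running it with the same i, lra and tp values as the second pass keeps the reported target_offset consistent with the targets used here.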

# ffmpeg -i audio.dts -filter:a loudnorm=linear=true:i=-23.0:lra=7.0:tp=-2.0:offset=0.39:measured_I=-23.01:measured_TP=-10.11:measured_LRA=18.80:measured_thresh=-34.44 -ar 48000 -c:a aac -b:a 500k audionorm.mp4
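The mapping from the first-pass JSON to the second-pass options is mechanical: input_i goes to measured_I, input_tp to measured_TP, input_lra to measured_LRA, input_thresh to measured_thresh and target_offset to offset. A rough sketch of a helper that builds the filter string follows. The names ln_field and loudnorm_filter are invented for this example, and it assumes the JSON has one "key" : "value" pair per line as ffmpeg prints it.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ffmpeg.

# Print the value of field $2 from loudnorm JSON text $1.
ln_field() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p'
}

# Build the second-pass loudnorm filter string.
# stdin: JSON from the first pass.
# $1 = integrated loudness target, $2 = loudness range target, $3 = true peak.
loudnorm_filter() {
    json=$(cat)
    printf 'loudnorm=linear=true:i=%s:lra=%s:tp=%s:offset=%s:measured_I=%s:measured_TP=%s:measured_LRA=%s:measured_thresh=%s\n' \
        "$1" "$2" "$3" \
        "$(ln_field "$json" target_offset)" \
        "$(ln_field "$json" input_i)" \
        "$(ln_field "$json" input_tp)" \
        "$(ln_field "$json" input_lra)" \
        "$(ln_field "$json" input_thresh)"
}
```

Assuming the JSON from the first pass was saved to a file, here hypothetically called pass1.json, the second pass could then be written as ffmpeg -i audio.dts -filter:a "$(loudnorm_filter -23.0 7.0 -2.0 < pass1.json)" -ar 48000 -c:a aac -b:a 500k audionorm.mp4.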

While FFmpeg's built-in sample rate conversion (swr) is generally very good, there is a better sample rate converter called the SoX Resampler (soxr). If your FFmpeg build has soxr support, you can enable it with the aresample filter: -filter:a aresample=resampler=soxr:out_sample_rate=48000:precision=28. Combined with the loudnorm filter, the full command line would be:

# ffmpeg -i audio.dts -filter:a loudnorm=linear=true:i=-23.0:lra=7.0:tp=-2.0:offset=0.39:measured_I=-23.01:measured_TP=-10.11:measured_LRA=18.80:measured_thresh=-34.44,aresample=resampler=soxr:out_sample_rate=48000:precision=28 -ac 6 -c:a aac -b:a 500k audionorm.mp4
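Whether a given FFmpeg build has soxr support can be guessed from its build configuration. The sketch below assumes that a soxr-enabled build lists --enable-libsoxr among its configure flags. The function name has_soxr is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of ffmpeg.

# Succeeds (exit status 0) if the build configuration text read on stdin
# mentions the libsoxr configure flag.
has_soxr() {
    grep -q -e '--enable-libsoxr'
}
```

It could be used as ffmpeg -hide_banner -buildconf 2>&1 | has_soxr && echo "soxr available". If the check fails, leaving the aresample part out and using -ar 48000 as in the earlier command falls back to the default resampler.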

There is a tool that automates all of this. ffmpeg-normalize supports batch processing, copying of the video and subtitle streams, retaining the original audio and has all the options for exact control :). Head over to https://github.com/slhck/ffmpeg-normalize/ for more details.

# ffmpeg-normalize *.mkv -c:a aac -b:a 500k