[FFmpeg-devel] CNG (consistent noise generation) patch for AC-3 decoder

Sat Sep 3 19:27:07 EEST 2016

On 09/03/2016 04:09 AM, Carl Eugen Hoyos wrote:
> Hi!
> 
> 2016-09-03 12:50 GMT+02:00 Jonathan Campbell <jonathan at impactstudiopro.com>:
> 
>> Here you go (as attachments).
> 
> The changes to lfg and the version bump must be one patch.
> 
>> +    { "cons_noisegen", "enable consistent noise generation", OFFSET(consistent_noise_generation), AV_OPT_TYPE_BOOL, {.i64 = 0 }, 0, 1, PAR },
> 
> If this change makes sense why is it not the default?
> 
> Carl Eugen

The option is intended for non-linear editing (NLE) software that needs to decode AC-3 starting from an arbitrary point in the stream. Making the dithering noise consistent means the AC-3 decoder produces the same output samples no matter where you begin decoding or how many frames you have already decoded.

If the dithering noise is allowed to vary, consider decoding from point A to B, freeing the decoder context, then later allocating a new context and resuming the decode from point B to C. The result has a slight discontinuity at point B, where the two pieces of decoded audio are joined, because the dithering noise applied to some frequency bands came out differently between the two decodes. If the noise is made consistent, the decoded audio at point B comes out exactly the same as if you had decoded continuously from point A to C.
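To make the difference concrete, here is a minimal toy sketch in C. It is not the code from the patch: ToyRng, toy_seed_from_frame(), toy_rng_next() and toy_decode_dither() are hypothetical names, and the FNV-1a hash and xorshift generator are stand-ins I picked for illustration. The only point is that reseeding the noise generator from each frame's own bytes makes the dither for a frame independent of whatever was decoded before it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy state; stands in for the decoder's dither generator. */
typedef struct { uint32_t state; } ToyRng;

/* FNV-1a hash over the frame bytes (an assumption for this sketch):
 * the same bytes always give the same seed. */
static uint32_t toy_seed_from_frame(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h ? h : 1u; /* xorshift must not start from zero */
}

/* One step of a 32-bit xorshift generator. */
static uint32_t toy_rng_next(ToyRng *r)
{
    uint32_t x = r->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    r->state = x;
    return x;
}

/* "Decode" one frame by filling out[] with n dither values.
 * consistent != 0: reseed from the frame's own bytes first.
 * consistent == 0: keep running the generator from wherever it was. */
static void toy_decode_dither(ToyRng *r, int consistent,
                              const uint8_t *frame, size_t len,
                              uint32_t *out, size_t n)
{
    if (consistent)
        r->state = toy_seed_from_frame(frame, len);
    for (size_t i = 0; i < n; i++)
        out[i] = toy_rng_next(r);
}
```

With consistent set, a fresh ToyRng that starts at frame B yields the same values for B as one that decoded A first. With it cleared, the values for B depend on how many values were drawn before B, which is the discontinuity described above.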

But if you just want to play a media file and don't care about that consistency, then the CPU work needed to ensure a consistent decode is wasted effort. Playing, streaming and transcoding sequentially is probably 99 percent of FFmpeg's general use, right?

I'm well aware the AC-3 decoder has a window delay in the frequency domain. Decoding from point B to C in practice means seeking to point B, stepping back 1 to 2 AC-3 frames, decoding up to B and discarding that audio, then keeping the decoded audio from point B on.
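The pre-roll arithmetic is simple, but for completeness here is a small sketch. toy_preroll_start() and toy_frames_to_discard() are hypothetical helpers, not FFmpeg functions, and frames are assumed to be addressed by a zero-based index.

```c
#include <stddef.h>

/* Hypothetical helper: the frame index to start decoding at so that the
 * output from `target` onward is valid. `preroll` is the number of
 * earlier frames to decode and throw away (1 to 2 in the case above).
 * Clamped at 0 when the target is near the start of the stream. */
static size_t toy_preroll_start(size_t target, size_t preroll)
{
    return target > preroll ? target - preroll : 0;
}

/* Number of decoded frames to discard before keeping any output. */
static size_t toy_frames_to_discard(size_t target, size_t preroll)
{
    return target - toy_preroll_start(target, preroll);
}
```

For example, with a target of frame 10 and a pre-roll of 2, decoding starts at frame 8 and the first 2 decoded frames are discarded.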

Do you understand now why this is useful for NLE software, but should not be enabled by default?

Jonathan Campbell