[FFmpeg-devel] [PATCH 1/6] ac3enc_fixed: convert to 32-bit sample format

Tue Jan 12 10:05:26 EET 2021

Jan 12, 2021, 08:59 by andreas.rheinhardt at gmail.com:

> Lynne:
>
>> The AC3 encoder used to be a separate library called "Aften", which
>> got merged into libavcodec (literally, SVN commits and all).
>> The merge preserved as much features from the library as possible.
>>
>> The code had two versions - a fixed point version and a floating
>> point version. FFmpeg had floating point DSP code used by other
>> codecs, the AC3 decoder including, so the floating-point DSP was
>> simply replaced with FFmpeg's own functions.
>> However, FFmpeg had no fixed-point audio code at that point. So
>> the encoder brought along its own fixed-point DSP functions,
>> including a fixed-point MDCT.
>>
>> The fixed-point MDCT itself is trivially just a float MDCT with a
>> different type and each multiply being a fixed-point multiply.
>> So over time, it got refactored, and the FFT used for all other codecs
>> was templated.
>>
>> Due to design decisions at the time, the fixed-point version of the
>> encoder operates at 16-bits of precision. Although convenient, this,
>> even at the time, was inadequate and inefficient. The encoder is noisy,
>> does not produce output comparable to the float encoder, and even
>> rings at higher frequencies due to the badly approximated winow function.
>>
>> Enter MIPS (owned by Imagination Technologies at the time). They wanted
>> quick fixed-point decoding on their FPUless cores. So they contributed
>> patches to template the AC3 decoder so it had both a fixed-point
>> and a floating-point version. They also did the same for the AAC decoder.
>> They however, used 32-bit samples. Not 16-bits. And we did not have
>> 32-bit fixed-point DSP functions, including an MDCT. But instead of
>> templating our MDCT to output 3 versions (float, 32-bit fixed and 16-bit fixed),
>> they simply copy-pasted their own MDCT into ours, and completely
>> ifdeffed our own MDCT code out if a 32-bit fixed point MDCT was selected.
>>
>> This is also the status quo nowadays - 2 separate MDCTs, one which
>> produces floating point and 16-bit fixed point versions, and one
>> sort-of integrated which produces 32-bit MDCT.
>>
>> MIPS weren't all that interested in encoding, so they left the encoder
>> as-is, and they didn't care much about the ifdeffery, mess or quality - it's
>> not their problem.
>>
>> So the MDCT/FFT code has always been a thorn in anyone looking to clean up
>> code's eye.
>>
>> Backstory over. Internally AC3 operates on 25-bit fixed-point coefficients.
>> So for the floating point version, the encoder simply runs the float MDCT,
>> and converts the resulting coefficients to 25-bit fixed-point, as AC3 is inherently
>> a fixed-point codec. For the fixed-point version, the input is 16-bit samples,
>> so to maximize precision the frame samples are analyzed and the highest set
>> bit is detected via ac3_max_msb_abs_int16(), and the coefficients are then
>> scaled up via ac3_lshift_int16(), so the input for the FFT is always at least 14 bits,
>> computed in normalize_samples(). After FFT, the coefficients are scaled up to 25 bits.
>>
>> This patch simply changes the encoder to accept 32-bit samples, reusing
>> the already well-optimized 32-bit MDCT code, allowing us to clean up and drop
>> a large part of a very messy code of ours, as well as prepare for the future lavu/tx
>> conversion. The coefficients are simply scaled down to 25 bits during windowing,
>> skipping 2 separate scalings, as the hacks to extend precision are simply no longer
>> necessary. There's no point in running the MDCT always at 32 bits when you're
>> going to drop 6 bits off anyway, the headroom is plenty, and the MDCT rounds
>> properly.
>>
>> This also makes the encoder even slightly more accurate over the float version,
>> as there's no coefficient conversion step necessary.
>>
>> SIZE SAVINGS:
>> ARM32:
>> HARDCODED TABLES:
>> BASE           - 10709590
>> DROP  DSP      - 10702872 - diff:   -6.56KiB
>> DROP  MDCT     - 10667932 - diff:  -34.12KiB - both:   -40.68KiB
>> DROP  FFT      - 10336652 - diff: -323.52KiB - all:   -364.20KiB
>> SOFTCODED TABLES:
>> BASE           -  9685096
>> DROP  DSP      -  9678378 - diff:   -6.56KiB
>> DROP  MDCT     -  9643466 - diff:  -34.09KiB - both:   -40.65KiB
>> DROP  FFT      -  9573918 - diff:  -67.92KiB - all:   -108.57KiB
>>
>> ARM64:
>> HARDCODED TABLES:
>> BASE           - 14641112
>> DROP  DSP      - 14633806 - diff:   -7.13KiB
>> DROP  MDCT     - 14604812 - diff:  -28.31KiB - both:   -35.45KiB
>> DROP  FFT      - 14286826 - diff: -310.53KiB - all:   -345.98KiB
>> SOFTCODED TABLES:
>> BASE           - 13636238
>> DROP  DSP      - 13628932 - diff:   -7.13KiB
>> DROP  MDCT     - 13599866 - diff:  -28.38KiB - both:   -35.52KiB
>> DROP  FFT      - 13542080 - diff:  -56.43KiB - all:    -91.95KiB
>>
>> x86:
>> HARDCODED TABLES:
>> BASE           - 12367336
>> DROP  DSP      - 12354698 - diff:  -12.34KiB
>> DROP  MDCT     - 12331024 - diff:  -23.12KiB - both:   -35.46KiB
>> DROP  FFT      - 12029788 - diff: -294.18KiB - all:   -329.64KiB
>> SOFTCODED TABLES:
>> BASE           - 11358094
>> DROP  DSP      - 11345456 - diff:  -12.34KiB
>> DROP  MDCT     - 11321742 - diff:  -23.16KiB - both:   -35.50KiB
>> DROP  FFT      - 11276946 - diff:  -43.75KiB - all:    -79.25KiB
>>
>> PERFORMANCE (10min random s32le):
>> ARM32 - before -  39.9x - 0m15.046s
>> ARM32 - after  -  28.2x - 0m21.525s
>>                        Speed:  -30%
>>
>> ARM64 - before -  36.1x - 0m16.637s
>> ARM64 - after  -  36.0x - 0m16.727s
>>                        Speed: -0.5%
>>
>> x86   - before - 184x -    0m3.277s
>> x86   - after  - 190x -    0m3.187s
>>                        Speed:   +3%
>>
>> Patch attached.
>>
>>
>> _______________________________________________
>> ffmpeg-devel mailing list
>> ffmpeg-devel at ffmpeg.org
>> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>>
>> To unsubscribe, visit link above, or email
>> ffmpeg-devel-request at ffmpeg.org with subject "unsubscribe".
>>
>>  static av_cold int ac3_fixed_mdct_init(AC3EncodeContext *s)
>>  {
>>  int ret = ff_mdct_init(&s->mdct, 9, 0, -1.0);
>> -    s->mdct_window = ff_ac3_window;
>> +    if (ret < 0)
>> +        return ret;
>> +
>> +    int32_t *iwin = av_malloc_array(AC3_WINDOW_SIZE, sizeof(*iwin));
>>
>
> This will lead to declaration-after-statement warnings.
>
>> +    if (!iwin)
>> +        return AVERROR(ENOMEM);
>> +
>> +    float fwin[AC3_WINDOW_SIZE/2];
>> +    ff_kbd_window_init(fwin, 5.0, AC3_WINDOW_SIZE/2);
>> +
>> +    for (int i = 0; i < AC3_WINDOW_SIZE/2; i++)
>> +        iwin[i] = lrintf(fwin[i] * (1 << 22));
>> +
>> +    for (int i = 0; i < AC3_WINDOW_SIZE/2; i++)
>> +        iwin[AC3_WINDOW_SIZE-1-i] = iwin[i];
>> +
>> +    s->mdct_window = iwin;
>> +
>> +    s->fdsp = avpriv_alloc_fixed_dsp(s->avctx->flags & AV_CODEC_FLAG_BITEXACT);
>> +    if (!s->fdsp)
>> +        return AVERROR(ENOMEM);
>> +
>>  return ret;
>>
>
> Maybe replace this by return 0; after all, you already checked for
> errors above (which the earlier code did not). Or you could move
> initializing the mdct to the very end of this function, via a tail call,
> which would then also automatically fix the declaration-after-statement
> warning above.
>
Done.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ac3enc_fixed-convert-to-32-bit-sample-format.patch
Type: text/x-patch
Size: 16887 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20210112/48e4fb7e/attachment.bin>