[Ffmpeg-devel] [PATCH][RFC] ac3 decoder

Sun Sep 24 09:06:06 CEST 2006

On Sat, 23 Sep 2006, Michael Niedermayer wrote:
>
> could you (or someone else) provide
> 1. a benchmark of it and liba52 (both with mmx/sse optims enabled and
>   disabled)
> 2. lines of code / bytes of both decoder source
> 3. compiled object sizes (and run strip over both)
> 4. max and mse difference between liba52 and soc_ac3 output or if
>   available against some reference decoder with reference bitstreams

versions tested:
a52: liba52 as present in ffmpeg svn
a52mp: liba52 as present in mplayer svn
soc: Kartikey's codec, including the attached patch to enable simd mdct
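
For readers skimming the attachment: the patch stops calling ff_imdct_calc directly and instead calls through a function pointer held in the transform context, so whichever implementation was selected at init time gets used. Below is a minimal sketch of that dispatch pattern only. ToyTransform, toy_init, calc_c and calc_unrolled are hypothetical names for illustration, not the libavcodec structures, and the "fast" path is a plain unrolled loop standing in for a SIMD routine.

```c
#include <stddef.h>

/* Hypothetical context: holds a pointer to whichever implementation
 * was chosen at init time. */
typedef struct ToyTransform {
    void (*calc)(float *out, const float *in, size_t n);
} ToyTransform;

/* Reference C path: doubles every input sample. */
static void calc_c(float *out, const float *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 2.0f;
}

/* Stand-in for an optimized path: same result, four samples per step,
 * with a scalar loop for the tail. */
static void calc_unrolled(float *out, const float *in, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i]     = in[i]     * 2.0f;
        out[i + 1] = in[i + 1] * 2.0f;
        out[i + 2] = in[i + 2] * 2.0f;
        out[i + 3] = in[i + 3] * 2.0f;
    }
    for (; i < n; i++)
        out[i] = in[i] * 2.0f;
}

/* Pick the implementation once; callers then always go through t->calc. */
void toy_init(ToyTransform *t, int have_fast_path)
{
    t->calc = have_fast_path ? calc_unrolled : calc_c;
}
```

A caller that hard-codes calc_c never benefits from the selection made in toy_init, which is the situation the attached patch addresses for the imdct calls.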

test content: the audio track of The Matrix. AC3 stereo, 192 kbps, 8179 sec.
cpu: Athlon64 2.2GHz

decoding time (mean and stddev of 10 runs):
27.448 \pm .012  sse_a52mp
28.076 \pm .008  c_a52
28.657 \pm .009  sse_soc
35.132 \pm .015  c_a52mp
36.748 \pm .006  c_soc

 lines words  bytes
  2138  7854  69316  wc ac3_decoder.[ch]
  4143 17078 120861  wc a52dec.c liba52/*.[ch]
  7494 28796 216275  wc a52dec.c liba52mp/*.[ch]

object size in bytes:
22432  ac3_decoder.o
51064  a52dec.o liba52/*.o
94672  a52dec.o liba52mp/*.o

pairwise differences:
psnr:101.06 mse:    0.34 max:   91  c_a52.wav   c_a52mp.wav
psnr: 78.63 mse:   58.85 max: 6647  c_a52.wav   sse_a52mp.wav
psnr: 78.66 mse:   58.52 max: 6647  c_a52mp.wav sse_a52mp.wav
psnr: 53.16 mse:20758.59 max:26788  c_soc.wav   c_a52.wav
psnr: 53.16 mse:20750.29 max:26788  c_soc.wav   c_a52mp.wav
psnr: 53.16 mse:20745.67 max:26788  c_soc.wav   sse_a52mp.wav
psnr:   inf mse:    0.00 max:    0  c_soc.wav   sse_soc.wav

--Loren Merritt
-------------- next part --------------

--- libavcodec/ac3_decoder.c~	2006-09-23 18:08:55.000000000 -0700
+++ libavcodec/ac3_decoder.c	2006-09-23 21:42:16.000000000 -0700
@@ -1617,8 +1617,8 @@
         x2[k] = ctx->transform_coeffs[chindex][2 * k + 1];
     }
 
-    ff_imdct_calc(&ctx->imdct_256, ctx->tmp_output, x1, ctx->tmp_imdct);
-    ff_imdct_calc(&ctx->imdct_256, ctx->tmp_output + 256, x2, ctx->tmp_imdct);
+    ctx->imdct_256.fft.imdct_calc(&ctx->imdct_256, ctx->tmp_output, x1, ctx->tmp_imdct);
+    ctx->imdct_256.fft.imdct_calc(&ctx->imdct_256, ctx->tmp_output + 256, x2, ctx->tmp_imdct);
 
     o_ptr = ctx->output[chindex];
     d_ptr = ctx->delay[chindex];
@@ -1646,8 +1646,8 @@
 {
     float *ptr;
 
-    ff_imdct_calc(&ctx->imdct_512, ctx->tmp_output,
-            ctx->transform_coeffs[chindex], ctx->tmp_imdct);
+    ctx->imdct_512.fft.imdct_calc(&ctx->imdct_512, ctx->tmp_output,
+                                  ctx->transform_coeffs[chindex], ctx->tmp_imdct);
     ptr = ctx->output[chindex];
     ctx->dsp.vector_fmul_add_add(ptr, ctx->tmp_output, ctx->window, ctx->delay[chindex], 384, BLOCK_SIZE, 1);
     ptr = ctx->delay[chindex];