[FFmpeg-devel] [PATCH] Made wavpack decoder output 32 bits samples.

Kostya kostya.shishkov
Wed Apr 22 20:00:11 CEST 2009


On Tue, Apr 21, 2009 at 11:52:58PM +0200, Laurent Aimar wrote:
> Hi,
> 
> On Mon, Apr 20, 2009, Michael Niedermayer wrote:
> > On Mon, Apr 20, 2009 at 10:17:41PM +0200, Laurent Aimar wrote:
> > > Hi,
> > > 
> > > On Mon, Apr 20, 2009, Kostya wrote:
> > > > On Sat, Apr 18, 2009 at 09:47:08PM +0200, Laurent Aimar wrote:
> > > > > This patch is needed to extend wavpack to support files with more than 16 bits per sample.
> > > > 
> > > > Well, in principle this looks fine along with the 24-bit support patch, but how does
> > > > it affect performance? Adding lossy compression support put it back on par with
> > > > the reference decoder already, and this may slow it down a bit more.
> > 
> > >  I have not checked for performance losses,
> > 
> > please do check and post the results
> As the ffmpeg executable does not seem to support S32 codec output correctly, I
> have used VLC to do all the benchmarks.
> 
>  For 16-bit audio output I have used:
> time ./vlc -I dummy /mnt/mem/long-16bit.wv --sout '#transcode{acodec=s16l}:std{access=file,mux=raw,dst=/dev/null}' --play-and-exit
>  And for 32-bit audio output I have used:
> time ./vlc -I dummy /mnt/mem/long-16bit.wv --sout '#transcode{acodec=s32l}:std{access=file,mux=raw,dst=/dev/null}' --play-and-exit
> 
>  where:
>  - /mnt/mem is a tmpfs mount
>  - long-16bit.wv is a wavpack file with 16-bit samples and a duration of 00:20:52.84,
> created using the wavpack executable from Debian lenny.
>  - gcc --version gives "gcc (Debian 4.3.3-3) 4.3.3"
> 
>  The command line in itself tells VLC to convert to raw audio samples (using
> libavcodec).
> 
> 
> * Result without any patch (16 bits output): 
> 17.94s user 0.18s system 94% cpu 19.183 total
> 17.92s user 0.17s system 97% cpu 18.501 total
> 17.99s user 0.16s system 97% cpu 18.547 total
> 17.90s user 0.19s system 94% cpu 19.210 total
> 17.90s user 0.19s system 95% cpu 18.885 total
> 18.00s user 0.13s system 95% cpu 18.937 total
> 
> * Result for 32 bits output without 24 bit support:
> 18.12s user 0.16s system 91% cpu 19.888 total
> 18.08s user 0.20s system 94% cpu 19.371 total
> 18.10s user 0.18s system 94% cpu 19.353 total
> 18.17s user 0.16s system 94% cpu 19.356 total
> 18.10s user 0.22s system 90% cpu 20.186 total
> 18.15s user 0.17s system 94% cpu 19.440 total
> 18.15s user 0.19s system 97% cpu 18.780 total
> -> Really small speed loss (~ 1%)
> 
> * Result for 32 bits output with 24 bit support:
> 19.27s user 0.15s system 96% cpu 20.207 total
> 19.18s user 0.08s system 94% cpu 20.404 total
> 19.24s user 0.19s system 96% cpu 20.132 total
> 19.27s user 0.15s system 95% cpu 20.386 total
> 19.34s user 0.15s system 96% cpu 20.273 total
> 19.19s user 0.14s system 91% cpu 21.125 total
> 19.13s user 0.14s system 91% cpu 21.095 total
> -> Greater speed loss caused by the int64_t cast.
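
As a hedged illustration of why the cast matters once 24-bit samples are allowed: the decorrelation step multiplies a weight by a sample before shifting right by 10. The sketch below assumes weights stay within +/-1024, which is an assumption not stated in this thread, and checks in 64-bit arithmetic whether the product plus the rounding term still fits in an int32_t. The helper name product_fits_int32 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if weight * sample + 512 is representable
 * as int32_t, 0 otherwise. The product is computed in 64 bits so that the
 * check itself cannot overflow. */
static int product_fits_int32(int32_t weight, int32_t sample)
{
    int64_t p = (int64_t)weight * sample + 512;
    return p >= INT32_MIN && p <= INT32_MAX;
}
```

With the assumed maximum weight of 1024, a full-scale 16-bit sample (32767) fits, while a full-scale 24-bit sample (8388607) does not, which is consistent with keeping the int64_t path for the wider formats.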
> 
> So I have created a small patch (attached to this mail) that adds
> av_always_inline to wv_unpack_mono/stereo, with a use64 parameter
> that selects whether the int64_t casts are used, to help gcc optimize:
> 
> * Result for 32 bits output with 24 bit support with av_always_inline and use64:
> 17.67s user 0.19s system 90% cpu 19.666 total
> 18.03s user 0.14s system 94% cpu 19.137 total
> 17.66s user 0.24s system 91% cpu 19.646 total
> 17.78s user 0.16s system 95% cpu 18.766 total
> -> Speed gain (even against 16 bits case).
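
The idea behind the extra parameter can be sketched in isolation. This is a minimal sketch under stated assumptions, not the decoder code: predict, predict_narrow, predict_wide and the SKETCH_ALWAYS_INLINE macro are hypothetical names, and the macro only stands in for av_always_inline on GCC-compatible compilers. When the flag is a compile-time constant at the call site, the compiler can drop the unused branch after inlining; what it does with a value only known at run time depends on the optimizer.

```c
#include <stdint.h>

/* Stand-in for av_always_inline (hypothetical macro for this sketch). */
#if defined(__GNUC__)
#define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

/* One prediction step: in + round(weight * sample / 1024).
 * Like the patch, this relies on arithmetic right shift of negative
 * values, which is implementation-defined in C.
 * The narrow path is only valid when weight * sample + 512 fits in int. */
static SKETCH_ALWAYS_INLINE int32_t predict(int32_t in, int32_t weight,
                                            int32_t sample, int use64)
{
    if (use64)
        return in + (int32_t)((weight * (int64_t)sample + 512) >> 10);
    else
        return in + ((weight * sample + 512) >> 10);
}

/* Wrappers passing a constant flag, so each one keeps a single path. */
static int32_t predict_narrow(int32_t in, int32_t weight, int32_t sample)
{
    return predict(in, weight, sample, 0);
}

static int32_t predict_wide(int32_t in, int32_t weight, int32_t sample)
{
    return predict(in, weight, sample, 1);
}
```

For inputs whose product fits in 32 bits both wrappers return the same value; only predict_wide is safe for larger samples.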
> 
> To verify that av_always_inline wasn't the culprit, I have rechecked the 16-bit
> case by adding it to wv_unpack_mono/stereo (no extra parameter needed here):
> 
> * Result for 16 bits without any patches except av_always_inline to wv_unpack_mono/stereo.
> 17.81s user 0.14s system 90% cpu 19.795 total
> 17.65s user 0.13s system 90% cpu 19.579 total
> 17.85s user 0.18s system 95% cpu 18.815 total
> 17.69s user 0.14s system 91% cpu 19.597 total
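
The user-time columns reported above can be compared with a short calculation. This is only a sketch: it uses the user times copied from the runs listed in this mail, ignores system and wall-clock time, and the names mean_of and relative_change are illustrative.

```c
#include <stddef.h>

/* User times in seconds, copied from the runs listed in this mail. */
static const double user_s16[]        = { 17.94, 17.92, 17.99, 17.90, 17.90, 18.00 };
static const double user_s32[]        = { 18.12, 18.08, 18.10, 18.17, 18.10, 18.15, 18.15 };
static const double user_s32_24[]     = { 19.27, 19.18, 19.24, 19.27, 19.34, 19.19, 19.13 };
static const double user_s32_24_inl[] = { 17.67, 18.03, 17.66, 17.78 };
static const double user_s16_inl[]    = { 17.81, 17.65, 17.85, 17.69 };

/* Arithmetic mean of n values. */
static double mean_of(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Relative change of `other` against `base`; 0.01 means 1% slower. */
static double relative_change(double base, double other)
{
    return (other - base) / base;
}
```

On these numbers the plain 32-bit output is about 1% slower than the 16-bit baseline, the 24-bit variant is about 7% slower, and both always-inline variants come out slightly faster than the baseline.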
> 
>  So unless there is something wrong with my methodology, I think that applying
> all 3 patches I have posted and the extra one attached to this email will not
> degrade the performance of the libavcodec wavpack decoder.

OK, I'll test it a bit myself and apply all four patches tomorrow
if I find no issues with them.
 
> Regards,
> 
> -- 
> fenrir

> diff --git a/libavcodec/wavpack.c b/libavcodec/wavpack.c
> index 28dd3d8..7d6e977 100644
> --- a/libavcodec/wavpack.c
> +++ b/libavcodec/wavpack.c
> @@ -337,7 +337,7 @@ static int wv_get_value(WavpackContext *ctx, GetBitContext *gb, int channel, int
>      return sign ? ~ret : ret;
>  }
>  
> -static int wv_unpack_stereo(WavpackContext *s, GetBitContext *gb, int32_t *dst)
> +static av_always_inline int wv_unpack_stereo(WavpackContext *s, GetBitContext *gb, int32_t *dst, int use64 )
>  {
>      int i, j, count = 0;
>      int last, t;
> @@ -371,22 +371,36 @@ static int wv_unpack_stereo(WavpackContext *s, GetBitContext *gb, int32_t *dst)
>                      B = s->decorr[i].samplesB[pos];
>                      j = (pos + t) & 7;
>                  }
> -                L2 = L + ((s->decorr[i].weightA * (int64_t)A + 512) >> 10);
> -                R2 = R + ((s->decorr[i].weightB * (int64_t)B + 512) >> 10);
> +                if(use64) {
> +                    L2 = L + ((s->decorr[i].weightA * (int64_t)A + 512) >> 10);
> +                    R2 = R + ((s->decorr[i].weightB * (int64_t)B + 512) >> 10);
> +                } else {
> +                    L2 = L + ((s->decorr[i].weightA * A + 512) >> 10);
> +                    R2 = R + ((s->decorr[i].weightB * B + 512) >> 10);
> +                }
>                  if(A && L) s->decorr[i].weightA -= ((((L ^ A) >> 30) & 2) - 1) * s->decorr[i].delta;
>                  if(B && R) s->decorr[i].weightB -= ((((R ^ B) >> 30) & 2) - 1) * s->decorr[i].delta;
>                  s->decorr[i].samplesA[j] = L = L2;
>                  s->decorr[i].samplesB[j] = R = R2;
>              }else if(t == -1){
> -                L2 = L + ((s->decorr[i].weightA * (int64_t)s->decorr[i].samplesA[0] + 512) >> 10);
> +                if(use64)
> +                    L2 = L + ((s->decorr[i].weightA * (int64_t)s->decorr[i].samplesA[0] + 512) >> 10);
> +                else
> +                    L2 = L + ((s->decorr[i].weightA * s->decorr[i].samplesA[0] + 512) >> 10);
>                  UPDATE_WEIGHT_CLIP(s->decorr[i].weightA, s->decorr[i].delta, s->decorr[i].samplesA[0], L);
>                  L = L2;
> -                R2 = R + ((s->decorr[i].weightB * (int64_t)L2 + 512) >> 10);
> +                if(use64)
> +                    R2 = R + ((s->decorr[i].weightB * (int64_t)L2 + 512) >> 10);
> +                else
> +                    R2 = R + ((s->decorr[i].weightB * L2 + 512) >> 10);
>                  UPDATE_WEIGHT_CLIP(s->decorr[i].weightB, s->decorr[i].delta, L2, R);
>                  R = R2;
>                  s->decorr[i].samplesA[0] = R;
>              }else{
> -                R2 = R + ((s->decorr[i].weightB * (int64_t)s->decorr[i].samplesB[0] + 512) >> 10);
> +                if(use64)
> +                    R2 = R + ((s->decorr[i].weightB * (int64_t)s->decorr[i].samplesB[0] + 512) >> 10);
> +                else
> +                    R2 = R + ((s->decorr[i].weightB * s->decorr[i].samplesB[0] + 512) >> 10);
>                  UPDATE_WEIGHT_CLIP(s->decorr[i].weightB, s->decorr[i].delta, s->decorr[i].samplesB[0], R);
>                  R = R2;
>  
> @@ -395,7 +409,10 @@ static int wv_unpack_stereo(WavpackContext *s, GetBitContext *gb, int32_t *dst)
>                      s->decorr[i].samplesA[0] = R;
>                  }
>  
> -                L2 = L + ((s->decorr[i].weightA * (int64_t)R2 + 512) >> 10);
> +                if(use64)
> +                    L2 = L + ((s->decorr[i].weightA * (int64_t)R2 + 512) >> 10);
> +                else
> +                    L2 = L + ((s->decorr[i].weightA * R2 + 512) >> 10);
>                  UPDATE_WEIGHT_CLIP(s->decorr[i].weightA, s->decorr[i].delta, R2, L);
>                  L = L2;
>                  s->decorr[i].samplesB[0] = L;
> @@ -419,7 +436,7 @@ static int wv_unpack_stereo(WavpackContext *s, GetBitContext *gb, int32_t *dst)
>      return count * 2;
>  }
>  
> -static int wv_unpack_mono(WavpackContext *s, GetBitContext *gb, int32_t *dst)
> +static int wv_unpack_mono(WavpackContext *s, GetBitContext *gb, int32_t *dst, int use64)
>  {
>      int i, j, count = 0;
>      int last, t;
> @@ -445,7 +462,10 @@ static int wv_unpack_mono(WavpackContext *s, GetBitContext *gb, int32_t *dst)
>                  A = s->decorr[i].samplesA[pos];
>                  j = (pos + t) & 7;
>              }
> -            S = T + ((s->decorr[i].weightA * (int64_t)A + 512) >> 10);
> +            if(use64)
> +                S = T + ((s->decorr[i].weightA * (int64_t)A + 512) >> 10);
> +            else
> +                S = T + ((s->decorr[i].weightA * A + 512) >> 10);
>              if(A && T) s->decorr[i].weightA -= ((((T ^ A) >> 30) & 2) - 1) * s->decorr[i].delta;
>              s->decorr[i].samplesA[j] = T = S;
>          }
> @@ -702,9 +722,9 @@ static int wavpack_decode_frame(AVCodecContext *avctx,
>      }
>  
>      if(s->stereo_in)
> -        samplecount = wv_unpack_stereo(s, &s->gb, samples);
> +        samplecount = wv_unpack_stereo(s, &s->gb, samples, s->post_shift < 16 );
>      else{
> -        samplecount = wv_unpack_mono(s, &s->gb, samples);
> +        samplecount = wv_unpack_mono(s, &s->gb, samples, s->post_shift < 16 );
>          if(s->stereo){
>              int32_t *dst = samples + samplecount * 2;
>              int32_t *src = samples + samplecount;
