[FFmpeg-devel] [PATCH] use AV_RB16 in cabac refill
Måns Rullgård
mans at mansr.com
Thu Mar 25 15:11:32 CET 2010
Alexander Strange <astrange at ithinksw.com> writes:
> Measured 1 cycle faster decode_cabac_residual on x86-64. Didn't try
> anywhere else, but I'd be a little interested in what arm does.
>
>
> From 539b4c39981a32f4de2c0cbccc54bf540bda398f Mon Sep 17 00:00:00 2001
> From: Alexander Strange <astrange at ithinksw.com>
> Date: Wed, 17 Mar 2010 06:06:15 -0400
> Subject: [PATCH 3/4] cabac: Use AV_RB16 instead of two byte loads in refill
>
> Less than 1 cycle faster.
> ---
> libavcodec/cabac.h | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/cabac.h b/libavcodec/cabac.h
> index 2794626..3aed9fb 100644
> --- a/libavcodec/cabac.h
> +++ b/libavcodec/cabac.h
> @@ -262,7 +262,7 @@ static void put_cabac_ueg(CABACContext *c, uint8_t * state, int v, int max, int
>
> static void refill(CABACContext *c){
> #if CABAC_BITS == 16
> - c->low+= (c->bytestream[0]<<9) + (c->bytestream[1]<<1);
> + c->low+= AV_RB16(c->bytestream)<<1;
> #else
> c->low+= c->bytestream[0]<<1;
> #endif
> @@ -280,7 +280,7 @@ static void refill2(CABACContext *c){
> x= -CABAC_MASK;
>
> #if CABAC_BITS == 16
> - x+= (c->bytestream[0]<<9) + (c->bytestream[1]<<1);
> + x+= AV_RB16(c->bytestream)<<1;
> #else
> x+= c->bytestream[0]<<1;
> #endif
This is probably faster on machines with unaligned load support. The
others I'm not so sure about. If the compiler is clever enough, it
shouldn't make a difference, but you know, gcc...
On the other hand, many of those systems are probably not particularly
relevant here.
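
For reference, a minimal sketch of the two forms being compared, written as standalone C rather than against the FFmpeg headers. The names load_two_bytes, load_be16 and refill_forms_agree are hypothetical, and load_be16 only approximates what AV_RB16(p)<<1 might compile to on a target with unaligned load support (one 16-bit load plus a byte swap on little-endian hosts); it is not the actual macro definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old form from the patch: two separate byte loads, shifted and added. */
static unsigned load_two_bytes(const uint8_t *p)
{
    return ((unsigned)p[0] << 9) + ((unsigned)p[1] << 1);
}

/* Assumed stand-in for AV_RB16(p)<<1: one 16-bit load, read as big-endian. */
static unsigned load_be16(const uint8_t *p)
{
    uint16_t v;
    const uint16_t probe = 1;

    memcpy(&v, p, sizeof(v));            /* safe at any alignment */
    if (*(const uint8_t *)&probe)        /* little-endian host: swap bytes */
        v = (uint16_t)((v >> 8) | (v << 8));
    return (unsigned)v << 1;
}

/* Returns 1 if both forms agree at every offset, odd ones included. */
int refill_forms_agree(void)
{
    uint8_t buf[64];
    size_t i;

    for (i = 0; i < sizeof(buf); i++)
        buf[i] = (uint8_t)(i * 37 + 11); /* arbitrary test pattern */
    for (i = 0; i + 1 < sizeof(buf); i++)
        if (load_two_bytes(buf + i) != load_be16(buf + i))
            return 0;
    return 1;
}
```

Both forms give the same value, so the only question is what the compiler emits for each on a given target.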
--
Måns Rullgård
mans at mansr.com