[FFmpeg-devel] [PATCH] use AV_RB16 in cabac refill

Alexander Strange astrange
Fri Mar 26 02:17:40 CET 2010


On Mar 25, 2010, at 4:08 AM, David Conrad wrote:

> On Mar 25, 2010, at 3:30 AM, Alexander Strange wrote:
> 
>> Measured 1 cycle faster decode_cabac_residual on x86-64. Didn't try anywhere else, but I'd be a little interested in what arm does.
> 
> It ought to be 2 instructions fewer and faster. However, both llvm and gcc decide to zero-extend from 16 bits twice, and (llvm-)gcc-4.2 decides to load bytestream twice.

Hmm, zero-extending in bswap_16 isn't really surprising, since asm operands are always extended to int.
The only solution there is to write AV_RB16 in asm too.

The --disable-asm build is remarkably bad. I think it should be using (p[0] << 8 | p[1]) instead of __attribute__((packed)) and bswap_16 when FAST_UNALIGNED isn't defined.
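
To make the two approaches concrete, here is a rough sketch of a byte-wise big-endian 16-bit read next to a load-then-swap read. The names rb16_bytewise, rb16_load_swap, swap16 and rb16_selfcheck are made up for illustration and are not the actual libavutil macros. The load-then-swap variant uses memcpy as a stand-in for the packed-struct load, and it detects endianness at run time only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a big-endian 16-bit value one byte at a time.
 * Needs no unaligned access and no byte swap, on any host endianness. */
static inline unsigned rb16_bytewise(const uint8_t *p)
{
    return (unsigned)p[0] << 8 | p[1];
}

/* Hypothetical helper: plain C 16-bit byte swap. */
static inline unsigned swap16(unsigned x)
{
    return (x >> 8 | x << 8) & 0xffff;
}

/* Hypothetical helper: unaligned native load, then swap on little-endian
 * hosts. memcpy stands in for the __attribute__((packed)) load here. */
static inline unsigned rb16_load_swap(const uint8_t *p)
{
    uint16_t v;
    const uint16_t probe = 1;

    memcpy(&v, p, sizeof v);
    if (*(const uint8_t *)&probe)   /* first byte is 1: little-endian host */
        return swap16(v);
    return v;
}

/* Both reads should agree, including at an odd (unaligned) offset. */
static void rb16_selfcheck(void)
{
    static const uint8_t buf[] = { 0x00, 0x12, 0x34, 0xff };

    assert(rb16_bytewise(buf + 1) == 0x1234);
    assert(rb16_load_swap(buf + 1) == 0x1234);
    assert(rb16_bytewise(buf + 2) == rb16_load_swap(buf + 2));
}
```

Whether the byte-wise form actually compiles to better code than the load-then-swap form depends on the compiler and target, so it would need measuring rather than assuming.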

This isn't a really important change, so it can wait.


