[FFmpeg-devel] LIBMPEG2_BITSTREAM_READER vs. golomb.h

Siarhei Siamashka siarhei.siamashka
Mon Jul 14 02:14:30 CEST 2008


On Monday 14 July 2008, Måns Rullgård wrote:
[...]
> >> This is all annoying because LIBMPEG2_BITSTREAM_READER is slightly
> >> faster on ARM.
> >
> > What about just using ALT_BITSTREAM_READER for ARMv6 and newer (cores
> > that support unaligned memory accesses)?
>
> I tried enabling HAVE_FAST_UNALIGNED, and it didn't make any
> significant difference.
>
> > It could be the fastest bitstream reader when implementing unaligned
> > 32-bit bigendian load as:
> >
> > setend be
> > ldr ...
> > setend le
>
> ldr; rev is only two instructions.

But it's 6 cycles on ARM11: an unaligned read has 4 cycles of latency, and the
rev instruction takes its argument as an 'early reg' (+1 more cycle of penalty).
The "ldr"+"rev" sequence is a dependency chain and you can't do much about it,
so it's a bad choice.

On the other hand, the "setend be"/"ldr"/"setend le" sequence is 3 cycles, plus
some latency before the load result is available. In the worst case it is
5 cycles, which is already better than what you suggest. And you still have some
freedom to reorder instructions for better results.
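
For reference, here is a rough portable C sketch of the operation both
sequences implement: an unaligned 32-bit big-endian load. The function names
(load_be32_bytes, load_be32_swap, bswap32, host_is_little_endian) are made up
for illustration and are not taken from FFmpeg. The comments only indicate
which ARM sequence each variant roughly corresponds to; what a compiler
actually emits is not guaranteed.

```c
#include <stdint.h>
#include <string.h>

/* Reference version: assemble the value byte by byte.
 * Works at any alignment and on any host endianness. */
static uint32_t load_be32_bytes(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Plain byte swap; on ARMv6 this is the job of a single "rev". */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Runtime check of host byte order (illustrative helper). */
static int host_is_little_endian(void)
{
    const uint16_t one = 1;
    uint8_t first;
    memcpy(&first, &one, 1);
    return first == 1;
}

/* "ldr" + "rev" style: native-order unaligned load, then swap.
 * The swap depends on the load result, which is the dependency
 * chain discussed above. memcpy keeps the unaligned access legal C.
 * A "setend be"/"ldr"/"setend le" version would do the load
 * directly in big-endian mode and skip the swap; that needs
 * ARMv6 inline assembly and is not shown here. */
static uint32_t load_be32_swap(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return host_is_little_endian() ? bswap32(v) : v;
}
```

Both variants should return the same value for any byte offset, so the first
one can serve as a check when experimenting with faster versions.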

-- 
Best regards,
Siarhei Siamashka



