[FFmpeg-devel] [PATCH] VBLE Decoder
Derek Buitenhuis
derek.buitenhuis at gmail.com
Wed Nov 9 17:39:43 CET 2011
On 09/11/2011 9:14 AM, Michael Niedermayer wrote:
> please remove trailing whitespace, they arent allowed in ffmpeg nor
> the fork.
Woops. Fixed.
>
> the following should be quite a bit faster and prettier:
>
> static av_always_inline int vble_bitscan(GetBitContext *gb)
> {
> static const uint8_t len[]={ ... };
> int i= len[ show_bits(gb, 8) ];
> skip_bits(gb, i+1);
> return i;
> }
You're right. I discussed this with Kostya as well,
and am rewriting it.
> the sum of lengths should be checked against get_bits_left()
Yup. Thanks.
> reading 4 at a time with get_bits_long() is likely slower than reading
> 2 at a time of just 1, later also would simplify the code
I'll look into this.
>
> these should be outside the innermost loop, which should be easy
> to do when 1 value is read at a time but can be done with multiple
> values too, though then messier
I'll check into this. Thanks.
> if(lengths[p]) pix=get_bits(gb, lengths[p])
> (for the 1 sample at a time case)
Given context above.
> in case gcc compiles this to a conditional branch then the
> following should be faster:
> (pix>> 1) ^ -(pix&1);
>
> Note: you can check speed with START/STOP_TIMER
I'll check this, but this seems to be correct.
> this could use add_hfyu_median_prediction()
> or is there some reason why its use is not possible ?
I had trouble figuring out exactly how to call
add_hfyu_median_prediction() properly. I'll double check it,
as it would likely be much, much faster (I believe it even has ASM).
> this has to be set before get_buffer() as it may affect the kind
> of buffer you get
Thanks. I wasn't aware of this. Fixed.
> you cannot change the linesize of buffers allocated by get_buffer()
> it also would break alignment for outside code like some video filter
Yeah I is mistaken about this. Fixed.
> you should check that the input contains at least 4 + (w*h*3)/16 bytes
Extra checks can't be a bad thing (usually). Added.
>
> it should be faster to interleave the last get_bits() vble_decode() and
> vble_median_restore() linewise due to better L1 data cache use
> alternatively the 3 median_restore could be interleaved linewise with
> vble_decode() and draw_horiz_band() be implemented
> (CODEC_CAP_DRAW_HORIZ_BAND)
>
> also you could add CODEC_FLAG_GRAY support to allow faster but
> grayscale only decoding.
>
> These are all of course just ideas, its perfectly fine if you dont
> implement them or do them as seperate patches.
Interesting ideas nonetheless. I'll try these out. Though, not
terribly sure what a decent use-case for VBLE and CODEC_FLAG_GRAY
is. ;)
> the reading and use can be combined making this array unneeded
> the casts are also unneeded
Yeah, this was leftover from an earlier implementation I had written.
> If you want you can also add yourself to the MAINTAINERS file,
> that is if you want to maintain this code in/for ffmpeg. And it would
> be very welcome if you maintain it!
Sure.
> [...]
Thanks for the feedback!
- Derek
More information about the ffmpeg-devel
mailing list