[FFmpeg-devel] [Patch] QT RLE encoder, bis
Alexis Ballier
alexis.ballier
Mon Jun 25 23:18:15 CEST 2007
> > There are some things that might need comments :
> > - I'm using the AVContext rc_max_rate variable to control whether to
> > use the optimal coding algorithm or the heuristic, that's the best
> > option I found but someone might have a better idea / not like this
> > one.
>
> such missuse of rc_max_rate is not ok, there are many better options
> also maybe we can simply make the optimal coder fast enough to avoid
> the heuristic one ...
+1 for faster code, see below
> > + for(i=width-1; i>=0; i--) {
> > +
> > + if((s->frame).key_frame)
> > + skipcount=0;
> > + else skipcount=scan_equal(this_line, prev_line, FFMIN(width - i,MAX_RLE_SKIP), s->pixel_size, 0);
> > +
> > + if(skipcount>0)
> > + total_skip_cost = (s->length_table[i+skipcount] + 2);
> > + else total_skip_cost = width;
> > +
> > +
> > + if(i<width-1) repeatcount=scan_equal(this_line , this_line + s->pixel_size, FFMIN(width-i-1,MAX_RLE_REPEAT-1), s->pixel_size, 0) + 1;
> > + else repeatcount=1;
>
> you dont have to run scan_equal() for each pixel, you already know how many
> are equal following the next pixel
Nice catch, shamely I didn't think about optimizing that, should be
better now, basic benchs showed no less than 40% perf gain ^^
I also removed the "total_skip_cost = width" total nonsense
Also added a new table to keep skipcount computed values, so that
there is no need to compute it again while writing to buf
That allows the removal of scan_equal
> [...]
> > + for(j=2;j<=limit;j++){
> > + temp_cost += s->pixel_size;
> > + if(s->length_table[i+j] + temp_cost < total_bulk_cost){
> > + /* We have found a better bulk copy ... */
> > + total_bulk_cost = s->length_table[i+j] + temp_cost;
> > + bulkcount = j;
> > + }
> > + }
>
> its not needed to do this from 2..limit
> this can instead be stoped much earlier if its sufficiently worse compared
> to any other method
> also you dont even need to start the bulk search if there are enough repeats
> or skips as the case of repeat/skip + later raw will then always be better
Good point, I thought there were special cases where this did not
work, but as even "skip one pixel" or "repeat twice" + bulk is cheaper
(or the same) than having the whole bulk, I modified the code to match
this. [note that skip one pixel + bulk is cheaper only because
pixel_size >= 2 (the cost of a skip sequence), thus if it is used to
encode something with strictly less that 16 bits per pixel it might
lose its optimality]
Again, there were perf gains here.
> > + while(i<width){
> > + rlecode = (signed char)s->rlecode_table[i];
>
> the cast is unneeded or your code is broken
was most likely the later, as in the spec a rlecode is signed, I
changed the rlecode_table to match this
>
> [...]
> > +/** Dumps frame including header */
> > +static int dump_frame(QtrleEncContext *s, AVFrame *p, uint8_t *buf)
>
> encode_frame()
not sure what you meant there, I just renamed it to encode_frame.
Do you mean that this should be merged into qtrle_encode_frame() ?
> > +{
> > + int i;
> > + int start_line=0;
> > + int end_line=s->avctx->height;
> > + uint8_t *orig_buf=buf;
> > +
> > + if(!((s->frame).key_frame)){
>
> superflous () and there are many more in the patch
Yep, most likely that I'm too paranoid with the parser that might
understand it wrongly, I removed some other () aswell
> > +
> > + /* save the current frame */
> > + memcpy(s->previous_frame, p->data[0], avctx->height*avctx->width*s->pixel_size);
>
> this is wrong if linesize != width*pixel_size
Indeed
I changed the "previous_frame" to an AVPicture, copying it with
av_picture_copy, I hope I got it right this time [one of the previous
patches used img_copy which I didn't want to use due to the attribute
deprecated]
The other minor things should also be fixed in enclosed patch
In the end, the encoding time is almost divided by 2, so I just
removed the heuristic and the hack to switch between the optimal &
heuristic. Basic tests showed that the speed gain is even more
important with screencasts (difference between pictures is very small
thus not searching for every possible bulk copy helps a lot), I can
now do realtime encoding with this algorithm at 1024x768 at 25fps :)
Regards,
Alexis
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qtrle_encoder_try2.patch
Type: text/x-patch
Size: 14037 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070625/18ae4569/attachment.bin>
More information about the ffmpeg-devel
mailing list