[Ffmpeg-devel] [RFC] [PATCH] FLC/FLX/DTA encoder

Tue Feb 27 02:28:53 CET 2007

Hi

On Tue, Feb 27, 2007 at 01:48:05AM +0100, Alex Beregszaszi wrote:
[...]
> > [...]
> > > +static int encode_brun(AVCodecContext *avctx, AVFrame *p, unsigned char *buf) {
> > > +    int lines, pos = 6, y_ptr = 0, pixel_ptr;
> > > +    unsigned char *pixels = p->data[0];
> > > +    const int stride = avctx->width*((avctx->bits_per_sample+7)/8);
> > > +//    const int stride = avctx->width;
> > > +
> > > +    AV_WL16(buf+4, FLI_BRUN); // type
> > > +    
> > > +    for (lines = 0; lines < avctx->height; lines++) {
> > > +	int nppos = pos, npackets = 0;
> > > +	int pixel_countdown = stride;
> > > +	int copy_pos = pos; 
> > > +	unsigned int run_count = 0;
> > > +
> > > +	pixel_ptr = y_ptr;
> > > +
> > > +	pos++; // number of packets, patched later
> > > +
> > > +	while (pixel_countdown > 0) {
> > > +	    // no point checking for a run if only 1 byte left
> > > +	    if ((pixel_countdown > 2) && 
> > > +		(pixels[pixel_ptr] == pixels[pixel_ptr+1]) &&
> > > +		(pixels[pixel_ptr] == pixels[pixel_ptr+2])) {
> > > +		// 3 bytes the same in a row, so start counting the run
> > > +		// 3 is the threshold, because a run of two forces and extra byte
> > > +		
> > > +		// flush old run
> > > +		if (run_count != 0) {
> > > +		    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > > +		    run_count = 0;
> > > +		}
> > > +
> > > +		run_count = 3;
> > > +		while ((pixels[pixel_ptr] == pixels[pixel_ptr+run_count]) && 
> > > +		       (run_count < pixel_countdown) && 
> > > +		       (run_count < 127)) {
> > > +		    run_count++;
> > > +		}
> > > +		npackets++;
> > > +		AV_WL8(buf+pos, (signed char)(run_count));
> > 
> > what does the signed char cast do here?
> 
> The field must be signed, as negative run_count means copy, while
> positive means run. Or vice versa. 

let me ask the question differently
why do you write code which is effectively
uint8_t *buf
buf[pos]= (signed char)run_count;
instead of
buf[pos]= run_count;

> Better if I add AV_WL8S ? I dont
> think.

no of course not, iam already unhappy about the existence of AV_*8 as it is
nothing but a simple *buf= v

> 
> > > +		pos++;
> > > +		AV_WL8(buf+pos, pixels[pixel_ptr]);
> > > +		pos++;
> > > +		pixel_ptr       += run_count;
> > > +		pixel_countdown -= run_count;
> > > +		run_count = 0;
> > > +	    } else {
> > > +	        // 3 bytes in a row not able to be RLE'd, so emit the copy
> > > +
> > > +		// flush old run
> > > +	        if (run_count == 128) {
> > > +		    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > 
> > AV_WL8(buf+copy_pos, 128);
> 
> -128

(uint8_t)128 == (uint8_t)-128

> 
> > > +		    run_count = 0;
> > > +                }
> > > +
> > > +		// start of a copy run
> > > +		if (run_count == 0)
> > > +		{
> > > +		    copy_pos = pos; // written later 
> > > +		    pos++;
> > > +		    npackets++;
> > > +		}
> > > +		run_count++;
> > > +		AV_WL8(buf+pos, pixels[pixel_ptr]);
> > > +		pos++;
> > > +		pixel_ptr++;
> > > +		pixel_countdown--;
> > > +	    }
> > > +        }
> > > +
> > > +	// flush old run
> > > +	if (run_count != 0) {
> > > +	    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > > +	    run_count = 0;
> > > +        }
> > > +
> > > +	// this frame would be invalid this way, so just report failed
> > > +	// FIXME: try solvin it in a better way
> > > +	if (npackets > 255) 
> > > +	    return -1;
> > > +
> > > +	AV_WL8(buf+nppos, npackets);
> > 
> > what about
> > 
> > while (pixel < end) {
> >     int run=1;
> > 
> >     while(pixel+1 < end && run < 127 && pixel[0] == pixel[1]){
> >         run++;
> >         pixel++;
> >     }
> >     if(run > limit){
> >         *buf++= run;
> >         *buf++= *pixel++;
> >     }else{
> >         pixel -= run - 2;
> >         run=1;
> >         while(pixel+1 < end && run < 127 && pixel[0] != pixel[1]){
> >             run++;
> >             pixel++;
> >         }
> >         *buf++= - run;
> >         memcpy(buf, pixel-run, run);
> >         buf+= run;
> >     }
> >     npackets++;
> >     if(npackets>255){
> >         limit++;
> >         goto retry_from_begin;
> >     }
> > }
> 
> I was not using gotos.

no but your code didnt work, it just returns -1 if the "goto case" is reached
also you can just put the whole in a function in a loop and use a specific
return code to avoid the goto
iam not suggesting that you should use a goto but rather that you should
simplify your code, my code above is nothing but a suggestion on how to do
that

> 
> > > +static int encode_dta_brun(AVCodecContext *avctx, AVFrame *p, unsigned char *buf) {
> > > +    int lines, pos = 6, y_ptr = 0, pixel_ptr;
> > > +    unsigned short *pixels = (unsigned short*)p->data[0];
> > > +
> > > +    // FIXME: fix this to handle 24bit input too
> > > +    if (avctx->codec_id != CODEC_ID_FLIDTA ||
> > > +	avctx->bits_per_sample == 8 ||
> > > +	avctx->bits_per_sample == 24)
> > > +	return -1;
> > > +
> > > +    AV_WL16(buf+4, FLI_DTA_BRUN); // type
> > > +    
> > > +    for (lines = 0; lines < avctx->height; lines++) {
> > > +	int nppos = pos, npackets = 0;
> > > +	int pixel_countdown = avctx->width;
> > > +	int copy_pos = pos; 
> > > +	unsigned int run_count = 0;
> > > +
> > > +	pixel_ptr = y_ptr;
> > > +
> > > +	pos++; // number of packets, patched later
> > > +
> > > +	while (pixel_countdown > 0) {
> > > +	    // no point checking for a run if only 1 byte left
> > > +	    if ((pixel_countdown > 1) && 
> > > +		(pixels[pixel_ptr] == pixels[pixel_ptr+1])) {
> > > +		// 2 bytes the same in a row, so start counting the run
> > > +		
> > > +		// flush old run
> > > +		if (run_count != 0) {
> > > +		    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > > +		    run_count = 0;
> > > +		}
> > > +
> > > +		run_count = 2;
> > > +		while ((pixels[pixel_ptr] == pixels[pixel_ptr+run_count]) && 
> > > +		       (run_count < pixel_countdown) && 
> > > +		       (run_count < 127)) {
> > > +		    run_count++;
> > > +		}
> > > +		npackets++;
> > > +		AV_WL8(buf+pos, (signed char)(run_count));
> > > +		pos++;
> > > +		AV_WL16(buf+pos, pixels[pixel_ptr]);
> > > +		pos += 2;
> > > +		pixel_ptr       += run_count;
> > > +		pixel_countdown -= run_count;
> > > +		run_count = 0;
> > > +	    } else {
> > > +		// 2 bytes in a row not able to be RLE'd, so emit the copy
> > > +		
> > > +		// flush old run
> > > +	        if (run_count == 128) {
> > > +		    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > > +		    run_count = 0;
> > > +                }
> > > +
> > > +                // start of a copy run
> > > +		if (run_count == 0)
> > > +		{
> > > +		    copy_pos = pos; // written later 
> > > +		    pos++;
> > > +		    npackets++;
> > > +		}
> > > +		run_count++;
> > > +		AV_WL16(buf+pos, pixels[pixel_ptr]);
> > > +		pos += 2;
> > > +		pixel_ptr++;
> > > +		pixel_countdown--;
> > > +	    }
> > > +        }
> > > +
> > > +	// flush old run
> > > +	if (run_count != 0) {
> > > +	    AV_WL8(buf+copy_pos, (signed char)(0-run_count));
> > > +	    run_count = 0;
> > > +        }
> > > +
> > > +	// this frame would be invalid this way, so just report failed
> > > +	// FIXME: try solvin it in a better way
> > > +	if (npackets > 255) 
> > > +	    return -1;
> > > +
> > > +	AV_WL8(buf+nppos, npackets);
> > > +	
> > > +	y_ptr += p->linesize[0]/((avctx->bits_per_sample+7)/8);
> > > +    }
> > > +
> > > +    // padding
> > > +    if ((pos % 2) == 1) {
> > > +	AV_WL8(buf+pos, 0);
> > > +	pos++;
> > > +    }
> > > +        
> > > +    AV_WL32(buf, pos); // size field
> > > +
> > > +    return pos;
> > > +}
> > 
> > very similar to encode_brun, this should be merged
> 
> With two exceptions: the run_count field is exchanged (- means X, +
> means Y) and it works on pixels and not bytes. Its splitted for speed
> reason.

well code duplication at source level is unacceptable, you can use a
always_inline function like:

always_inline encode_dta_brun(AVCodecContext *avctx, AVFrame *p, unsigned char *buf, int pixel_size){
...
}

and call it like
encode_dta_brun(AVCodecContext *avctx, AVFrame *p, unsigned char *buf, 1);
...
encode_dta_brun(AVCodecContext *avctx, AVFrame *p, unsigned char *buf, 0);

gcc will inline them and optimize the sign flip and 1/2 byte R/W out 

> 
> > > +static int encode_lc(AVCodecContext *avctx, AVFrame *p, unsigned char *buf) {
> > 
> > this too and several other functions are very similar to the previous rle
> > encoders and must be merged
> 
> Similar to encode_brun in principle, as all the RLE is based on the same
> principle. I may add put_run() and put_copy() if that factorizing
> satisfies you.

iam only satisfied if there i no code duplication left and if i fail to
be able to simplify the code any further :)

> 
> > > +// 8 bit FLC DELTA
> > > +static int encode_delta8(AVCodecContext *avctx, AVFrame *p, unsigned char *buf) {
> > [...]
> > > +			    while (run_count > 254) {
> > > +				npackets++;
> > > +				AV_WL8(buf+pos, 254); // skip
> > > +				pos++;
> > > +				AV_WL8(buf+pos, 0x00); 
> > > +				pos++;
> > 
> > these really should be *bug++= X or bytestream_put_be16() could be used both
> > should be more readable
> 
> I thought about moving to bytestream_ already, but the time I wrote it
> there was no such layer yet.

now there is so use it, sorry yes iam an asshole i know but if i dont force
people to simplify their code before inclusion then it will in most cases
never happen

[...]

> > > +    // FIXME: this is rather unoptimal, but simple
> > > +    for (i = 0; flic_encoders[i>>1].encode != NULL; i++) {
> > > +        if ((avctx->codec_id == CODEC_ID_FLIC && !flic_encoders[i>>1].is_flc) ||
> > > +            (avctx->codec_id == CODEC_ID_FLIX && !flic_encoders[i>>1].is_flx) ||
> > > +            (avctx->codec_id == CODEC_ID_FLIDTA && !flic_encoders[i>>1].is_dta))
> > > +            continue;
> > > +        if (i & 1) {
> > > +            if (avctx->bits_per_sample == 8 && remapped)
> > > +                sizes[i] = flic_encoders[i>>1].encode(avctx, &(s->remap_frame), buf+chunkpos);
> > > +            else
> > > +                sizes[i] = -1;
> > 
> > using INT_MAX should avoid a few checks below
> 
> You prefer it that way?

yes, as its simpler IMHO

> 
> > > /*
> > >  * Palette optimisations
> > >  * Copyright (c) 2006 Sakura Industries Ltd.
> > >  * Author: Steven Johnson
> > 
> > the palette stuff belongs into a seperate patch!
> 
> Anyway, would you accept this (palette.c) theoretically? 

yes certainly its usefull for paletted codecs

> If yes, where
> should it reside? swscaler? libavcodec?

i dont know i would rather say lavc but its a hard decission ...

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In a rich man's house there is no place to spit but his face.
-- Diogenes of Sinope
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20070227/8c97d024/attachment.pgp>