[FFmpeg-devel] [PATCH] vf_overlay: add support to RGBA packed input and output
Stefano Sabatini
stefasab at gmail.com
Sun Oct 30 22:34:41 CET 2011
On date Sunday 2011-10-30 14:42:38 +0100, Michael Niedermayer encoded:
> On Sat, Oct 29, 2011 at 04:47:41PM +0200, Stefano Sabatini wrote:
[...]
> > + switch (alpha) {
> > + case 0:
> > + break;
> > + case 255:
> > + d[dr] = s[sr];
> > + d[dg] = s[sg];
> > + d[db] = s[sb];
> > + break;
> > + default:
> > + // main_value = main_value * (1 - alpha) + overlay_value * alpha
>
> > + // apply a fast approximation: X/255 ~ (X+128)/256
>
> please use +128*257>>16 (which is exact)
Uhm I suppose you meant:
((X * 257) + 257) >> 16
For the interested reader:
research.swtch.com/2008/01/division-via-multiplication.html
(or read TAOCP if you want the long version ;-)).
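
For readers who want to check the arithmetic, here is a minimal sketch in C that compares the three forms mentioned in this thread against plain integer division, over every value a blend sum can take (0 to 255*255). All function names are hypothetical and are not taken from the patch. The thread leaves open whether the rounding or the truncating form was intended, so the sketch checks both: (X+128)*257>>16 against division rounded to nearest, and (X*257+257)>>16 against truncating division.

```c
#include <stdint.h>

/* Reference: truncating division by 255. */
static inline uint32_t div255_trunc_ref(uint32_t x) { return x / 255; }

/* Reference: division by 255 rounded to nearest (255 is odd, so no ties). */
static inline uint32_t div255_round_ref(uint32_t x) { return (x + 127) / 255; }

/* Approximation from the patch comment: X/255 ~ (X+128)/256. */
static inline uint32_t div255_approx(uint32_t x) { return (x + 128) >> 8; }

/* Multiply-and-shift with rounding: (X+128)*257 >> 16. */
static inline uint32_t div255_round_mul(uint32_t x) { return ((x + 128) * 257) >> 16; }

/* Multiply-and-shift with truncation: (X*257 + 257) >> 16. */
static inline uint32_t div255_trunc_mul(uint32_t x) { return (x * 257 + 257) >> 16; }

/* Count the inputs in [0, 255*255] on which f and ref disagree. */
static unsigned count_mismatches(uint32_t (*f)(uint32_t), uint32_t (*ref)(uint32_t))
{
    unsigned n = 0;
    for (uint32_t x = 0; x <= 255u * 255u; x++)
        n += f(x) != ref(x);
    return n;
}

/* Hypothetical per-channel blend using the rounding form:
 * main_value = main_value * (1 - alpha) + overlay_value * alpha,
 * with alpha in 0..255. */
static inline uint8_t blend_channel(uint8_t dst, uint8_t src, uint8_t alpha)
{
    return (uint8_t)div255_round_mul(dst * (255 - alpha) + src * alpha);
}
```

Under these assumptions, count_mismatches() should return 0 for both multiply-and-shift forms against their respective references, and a non-zero count for the (X+128)/256 approximation (for example X = 255*255 yields 254 instead of 255).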
Then I tested with the plain version:
22001580 dezicycles in first, 2 runs, 0 skips
22377187 dezicycles in first, 4 runs, 0 skips
22358670 dezicycles in first, 8 runs, 0 skips
22430178 dezicycles in first, 16 runs, 0 skips
27048690 dezicycles in first, 32 runs, 0 skips
24722512 dezicycles in first, 64 runs, 0 skips
23467227 dezicycles in first, 128 runs, 0 skips
22707239 dezicycles in first, 256 runs, 0 skips
22325824 dezicycles in first, 512 runs, 0 skips
22106139 dezicycles in first, 1024 runs, 0 skips
22007162 dezicycles in first, 2048 runs, 0 skips
21959926 dezicycles in first, 4096 runs, 0 skips
21978105 dezicycles in first, 8192 runs, 0 skips
21927611 dezicycles in first, 16384 runs, 0 skips
21889967 dezicycles in first, 32768 runs, 0 skips
With the optimized variant:
20987625 dezicycles in first, 2 runs, 0 skips
20781405 dezicycles in first, 4 runs, 0 skips
20581886 dezicycles in first, 8 runs, 0 skips
20787228 dezicycles in first, 16 runs, 0 skips
21084062 dezicycles in first, 32 runs, 0 skips
21028600 dezicycles in first, 64 runs, 0 skips
20786884 dezicycles in first, 128 runs, 0 skips
20671322 dezicycles in first, 256 runs, 0 skips
20563223 dezicycles in first, 512 runs, 0 skips
20527375 dezicycles in first, 1024 runs, 0 skips
20481658 dezicycles in first, 2048 runs, 0 skips
20452863 dezicycles in first, 4096 runs, 0 skips
20535609 dezicycles in first, 8192 runs, 0 skips
20503526 dezicycles in first, 16384 runs, 0 skips
20465800 dezicycles in first, 32768 runs, 0 skips
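
To put the two tables side by side, a small helper like the following (hypothetical, not part of the patch) expresses the difference between two timer readings as a percentage. It only restates the figures above and says nothing about how an optimized build would behave.

```c
/* Relative reduction, in percent, of `optimized` compared to `plain`. */
static double percent_saved(double plain, double optimized)
{
    return 100.0 * (plain - optimized) / plain;
}
```

For the 32768-run rows, percent_saved(21889967, 20465800) comes out at roughly 6.5.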
But I confess that I always build ffmpeg with optimizations disabled
(to ease debugging), and I suppose that most decent compilers
already know these numerical tricks, so I'm not sure
these hand-crafted optimizations are worth the code obfuscation.
I'll push soon if I read no more comments.
--
FFmpeg = Fabulous Fantastic Mysterious Pitiless Evil Genius
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-vf_overlay-enable-RGB-path.patch
Type: text/x-diff
Size: 9946 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20111030/ac64f041/attachment.bin>