[FFmpeg-devel] MMX accelerated DSP functions for VC1/WMV3 decoders
Michael Niedermayer
michaelni
Tue Jul 3 18:08:37 CEST 2007
Hi
On Sun, Jul 01, 2007 at 04:50:46PM +0200, Christophe GISQUET wrote:
> Hello,
>
> here's a new version, where the license header now specifies strictly
> v2.0 of the _Library_ General Public License, which is probably the most
> I would accept. Its text is not directly available anymore from the
> Licenses pages of the fsf and gnu sites.
>
> The question of whether I accept version 2.1 or 3.0 of the LGPL or GPL is
> moot: as you pointed out, their texts offer a kind of loophole to select
> another version, which I disagree with.
>
> Michael Niedermayer wrote:
> > currently the code runs a dummy do-nothing filter in 6 out of 15 cases
> > this is not good, if a general variable tap offset were supported
> > then i think it should be easier to skip this dummy filter_0 copy thing
>
> So I implemented something that should be akin to your suggestion. Here
> are the execution times for 5 runs: 5.26 5.27 5.32 5.16 5.25. Therefore
> it's a tiny bit faster (around 1%).
>
> As for the code size of the object file, not sure what is the best
> indication:
> ls: 75696 -> 42480 (with debug symbols though)
> size (only text is present): 32869 -> 3552
[...]
> Index: libavcodec/i386/dsputil_mmx.c
> ===================================================================
> --- libavcodec/i386/dsputil_mmx.c (revision 9451)
> +++ libavcodec/i386/dsputil_mmx.c (working copy)
> @@ -410,7 +410,7 @@
> );
> }
>
> -static void put_pixels8_mmx(uint8_t *block, const uint8_t *pixels, int line_size, int h)
> +void put_pixels8_mmx(uint8_t *block, const uint8_t *pixels, int line_size, int h)
> {
non-static functions need an ff_ prefix
[...]
> +#if defined(CONFIG_VC1_DECODER) || defined(CONFIG_WMV3_DECODER)
> +    ff_vc1dsp_init_mmx(c, avctx);
> +#endif
shouldn't these be enabled for a VC1/WMV3 encoder too?
[...]
> +/** Interpolates fractional pel values using MMX */
> +static void vc1_mspel_mc_mmx(uint8_t *dst, const uint8_t *src, int stride, int mode, int rnd)
> +{
> +    const uint8_t *tptr;
> +    int tptrstr;
> +    int mode1 = mode & 3;
> +    int mode2 = (mode >> 2) & 3;
you could pass mode1 and mode2 as parameters, this would avoid the
calculation above
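A minimal sketch of that idea, assuming each quarter-pel position has its own thin wrapper (the patch mentions put_vc1_mspel_mc00_mmx, so something like this presumably exists). All names ending in _sketch are hypothetical, and the inner routine only records its arguments instead of filtering:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the real interpolation routine: it only records
 * the shifts it receives, so the calling convention can be checked. */
static int last_hshift, last_vshift;

static void mspel_mc_sketch(uint8_t *dst, const uint8_t *src, int stride,
                            int hshift, int vshift, int rnd)
{
    (void)dst; (void)src; (void)stride; (void)rnd;
    /* Both shifts are quarter-pel offsets in the range 0..3. */
    assert(hshift >= 0 && hshift <= 3 && vshift >= 0 && vshift <= 3);
    last_hshift = hshift;
    last_vshift = vshift;
}

/* One wrapper per position: the shifts are compile-time constants, so no
 * "mode & 3" / "(mode >> 2) & 3" has to be computed at run time. */
#define DECLARE_MSPEL_WRAPPER(h, v)                                        \
static void put_mspel_mc##h##v##_sketch(uint8_t *dst, const uint8_t *src,  \
                                        int stride, int rnd)               \
{                                                                          \
    mspel_mc_sketch(dst, src, stride, h, v, rnd);                          \
}

DECLARE_MSPEL_WRAPPER(1, 2)
DECLARE_MSPEL_WRAPPER(3, 0)
```

Calling put_mspel_mc12_sketch(dst, src, stride, rnd) then reaches the inner routine with hshift 1 and vshift 2 already split out.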
[...]
> +    /* Translation: tmp=src-stride, tmp+8=src, ... */
> +    if (mode1) { /* Horizontal filter to apply */
> +        if (mode2) { /* Vertical filter to apply, output to tmp */
> +            vc1_put_shift[mode1-1](tmp, 8, src-stride, stride, 11, rnd, 1);
> +            tptr = tmp+8;
> +            tptrstr = 8;
> +        }
> +        else { /* No vertical filter, output 8 lines to dst */
> +            //fprintf(stderr, "mode1 noV\n"); fflush(stderr);
> +            vc1_put_shift[mode1-1](dst, stride, src, stride, 8, rnd, 1);
> +            return;
> +        }
> +    }
> +    else {
> +        /* No horizontal filter, use directly src as input */
> +        tptr = src;
> +        tptrstr = stride;
> +        /* put_vc1_mspel_mc00_mmx directly calls put_pixels8_mmx */
> +    }
> +
> +    vc1_put_shift[mode2-1](dst, stride, tptr, tptrstr, 8, 1-rnd, tptrstr);
> +}
    dst_stride = stride;
    if (mode1) { /* Horizontal filter to apply */
        if (mode2) { /* Vertical filter to apply, output to tmp */
            vc1_put_shift[mode1-1](tmp, 8, src-stride, stride, 11, rnd, 1);
            src = tmp+8;
            stride = 8;
        }
        else { /* No vertical filter, output 8 lines to dst */
            //fprintf(stderr, "mode1 noV\n"); fflush(stderr);
            vc1_put_shift[mode1-1](dst, stride, src, stride, 8, rnd, 1);
            return;
        }
    }
    vc1_put_shift[mode2-1](dst, dst_stride, src, stride, 8, 1-rnd, stride);
also, the -1 in [...-1] can be avoided by putting a NULL at index 0 of the
array
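A minimal sketch of the NULL-at-index-0 table, with hypothetical stand-in filters (names ending in _sketch are not from the patch, and the "filters" just add the shift value so the dispatch is easy to check):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*shift_fn)(uint8_t *dst, const uint8_t *src, int n);

/* Hypothetical stand-ins for the three real shift filters. */
static void shift1_sketch(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++) dst[i] = (uint8_t)(src[i] + 1);
}

static void shift2_sketch(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++) dst[i] = (uint8_t)(src[i] + 2);
}

static void shift3_sketch(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++) dst[i] = (uint8_t)(src[i] + 3);
}

/* Slot 0 is NULL: shift 0 means "no filter", so it is never called and the
 * table can be indexed with the shift itself instead of shift-1. */
static const shift_fn shift_table_sketch[4] = {
    NULL, shift1_sketch, shift2_sketch, shift3_sketch
};

static void apply_shift_sketch(uint8_t *dst, const uint8_t *src, int n, int shift)
{
    /* The caller must handle shift == 0 (plain copy) before getting here. */
    assert(shift >= 1 && shift <= 3);
    shift_table_sketch[shift](dst, src, n);
}
```

The cost is one unused pointer slot; in exchange every call site drops the subtraction, provided the shift == 0 case is handled before the lookup, as the early return and the mc00 special case in the patch appear to do.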
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
No snowflake in an avalanche ever feels responsible. -- Voltaire