[FFmpeg-devel] [PATCH] faster vp6 decoding
Sebastien Lucas
sebastien.lucas
Thu Feb 12 13:43:10 CET 2009
On Thu, Feb 12, 2009 at 12:23 AM, Aurelien Jacobs <aurel at gnuage.org> wrote:
> Sebastien Lucas wrote:
>
>> On Tue, Feb 10, 2009 at 2:42 PM, Aurelien Jacobs <aurel at gnuage.org> wrote:
>> > Sebastien Lucas wrote:
>> >
>> >> On Tue, Feb 10, 2009 at 1:14 AM, Aurelien Jacobs <aurel at gnuage.org> wrote:
>> >> > I've slightly modified your patch (see attached):
>> >> > - uses a vp6dsp_mmx.c file
>> >> > - some cosmetics (no trailing whitespace, etc...)
>> >> > - gcc 2.95 compat: you can't do for(int i=....)
>> >> > - rough attempt at x86_64 compatibility
>> >> >
>> >> > Unfortunately it doesn't work (totally garbled output), but I've only
>> >> > tested it on x86_64.
>> >>
>> >> I tested further: mplayer (latest svn vanilla build) doesn't manage to
>> >> decode my vp6 sample properly (it was working fine 1 month ago). So I
>> >> built ffmpeg, applied your modified patch, and using -f framecrc (only
>> >> ssh access to my dev computer for now) I get 17 frames with a changed
>> >> CRC out of the 2611 frames of my sample (I was previously testing only
>> >> the first 1000 frames with mplayer and there was no difference).
>> >> Logically there should be no difference at all.
>> >>
>> >> But the "totally garbled output" you see could (~should~) be due to
>> >> x86_64. Can you point me to the sample you used?
>> >
>> > http://samples.mplayerhq.hu/FLV/flash8/harrypotter-480x272-450-vp6.flv
>> > (Any sample in the same directory should do too)
>> >
>>
>> I just tested your sample with the following command line:
>> ./ffmpeg -i ../../harrypotter-480x272-450-vp6.flv -an -f framecrc - >
>> hout_ref.txt
>> with or without the patch, and the CRCs always match.
>>
>> So either my computer is blessed or the problem lies with x86_64. I'll
>> try to test it with another computer at home (no 64-bit machine there).
>
> OK. I gave it a deeper look and found the culprit. You had a problem
> in the asm constraints. %0 and %1 are modified by the asm code but they
> were using an "r" constraint. They must in fact use a "+r" constraint.
> Without it, the second asm block was using the same register as the
> first asm block, without setting it again to a correct value (because
> the compiler assumed its value hadn't changed in between).
> Attached is an updated patch which works fine on x86_64 too. I verified
> it to be bitexact. I also cleaned it up.
> I intend to apply it soon, unless some asm guru has other suggestions.
>
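For illustration, the constraint problem described in the quoted message
can be reproduced in a few lines. This is only a minimal sketch, assuming
GCC-style extended asm on x86; the function names are hypothetical and
this is not code from the patch. The first asm block modifies the
register holding the pointer, so that operand is declared "+r"; the
second block only reads it, so "r" is enough there.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical illustration (not code from the patch): add two
 * consecutive 32-bit words using two separate asm blocks.
 */
static uint32_t sum_two_words(const uint32_t *src)
{
    uint32_t a, b;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* First block: load *src, then advance src by 4 bytes inside the asm.
     * The asm writes to the register holding src, so src must be a
     * read-write operand ("+r"). With an input-only "r" the compiler may
     * assume the register still holds the original pointer afterwards. */
    __asm__ volatile(
        "movl (%1), %0 \n\t"
        "add  $4, %1   \n\t"
        : "=&r"(a), "+r"(src)
        :
        : "memory", "cc");
    /* Second block only reads src, so an input-only "r" is correct here. */
    __asm__ volatile(
        "movl (%1), %0 \n\t"
        : "=r"(b)
        : "r"(src)
        : "memory");
#else
    /* Portable fallback for non-x86 targets or non-GNU compilers. */
    a = src[0];
    b = src[1];
#endif
    return a + b;
}

/* Usage example: the two words 40 and 2 add up to 42. */
static void check_sum_two_words(void)
{
    const uint32_t words[2] = { 40u, 2u };
    assert(sum_two_words(words) == 42u);
}
```

If the first block declared src as "r" instead, the compiler would be
free to reuse the same register for the second block without reloading
it, which matches the failure mode described above.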
Thanks Aurélien,
I wrote a small test program yesterday that feeds the same test values
to both the C and MMX versions, and I found a few small problems
(saturation and arithmetic shifts), so here is a fixed (and hopefully
last) patch.
It's basically your latest patch with some paddw <-> paddsw and
psrlw <-> psraw swaps.
I tested 9 FLV test files and always got bit-exact output.
I'll update Zuxy's SSE2 patch accordingly and test it.
Thanks in advance for applying this patch.
Sébastien.
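To make the paddw/paddsw and psrlw/psraw distinction above concrete, here
is a minimal sketch in plain C that emulates a single 16-bit lane of each
instruction and counts the inputs on which two variants disagree. It is
not the test program mentioned in this message; all function names are
hypothetical, and shift counts are assumed to be in the range 0..15.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* paddw on one lane: 16-bit add that wraps around on overflow.
 * (The final cast assumes the usual two's complement conversion.) */
static int16_t add_wrap16(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* paddsw on one lane: signed add clamped to [INT16_MIN, INT16_MAX]. */
static int16_t add_sat16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* psrlw on one lane: logical shift, vacated bits are filled with zeros. */
static int16_t shr_logical16(int16_t a, unsigned n)
{
    return (int16_t)((uint16_t)a >> n);
}

/* psraw on one lane: arithmetic shift, vacated bits copy the sign bit.
 * Written so that a negative value is never shifted, since C leaves
 * that implementation-defined. */
static int16_t shr_arith16(int16_t a, unsigned n)
{
    return (int16_t)(a < 0 ? ~(~a >> n) : a >> n);
}

typedef int16_t (*binop16)(int16_t, int16_t);

/* Feed every pair of test values to both implementations and count
 * the pairs on which they disagree. */
static unsigned count_mismatches(binop16 ref, binop16 alt,
                                 const int16_t *vals, size_t n)
{
    size_t i, j;
    unsigned bad = 0;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            if (ref(vals[i], vals[j]) != alt(vals[i], vals[j]))
                bad++;
    return bad;
}

/* Small self-check using edge-case inputs. */
static void check_lane_ops(void)
{
    static const int16_t vals[] = { 0, 1, -1, 100, -100, INT16_MAX, INT16_MIN };
    size_t n = sizeof(vals) / sizeof(vals[0]);

    assert(add_wrap16(INT16_MAX, 1) == INT16_MIN);  /* wraps around */
    assert(add_sat16(INT16_MAX, 1) == INT16_MAX);   /* clamps */
    assert(shr_logical16(-8, 1) == 32764);          /* sign bit cleared */
    assert(shr_arith16(-8, 1) == -4);               /* sign preserved */
    /* An implementation always agrees with itself, but the wrapping and
     * saturating adds differ once the sum leaves the 16-bit range. */
    assert(count_mismatches(add_sat16, add_sat16, vals, n) == 0);
    assert(count_mismatches(add_sat16, add_wrap16, vals, n) > 0);
}
```

The two adds agree on small values and only diverge near the 16-bit
limits, and the two shifts only diverge on negative inputs, which may be
why a short decoding test can pass while a longer sample shows a handful
of frames with a different CRC.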
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vp6dsp_mmx_fixed.diff
Type: application/octet-stream
Size: 8948 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090212/93efe060/attachment.obj>