[FFmpeg-devel] [PATCH] improve yuv422p to rgb in libswscale
Michael Niedermayer
michaelni
Wed Dec 1 04:34:06 CET 2010
On Tue, Nov 30, 2010 at 07:26:42PM -0800, Baptiste Coudurier wrote:
> On 11/30/2010 07:13 PM, Michael Niedermayer wrote:
>> On Tue, Nov 30, 2010 at 07:05:05PM -0800, Baptiste Coudurier wrote:
>>> On 11/30/2010 06:53 PM, Michael Niedermayer wrote:
>>>> On Tue, Nov 30, 2010 at 06:29:32PM -0800, Baptiste Coudurier wrote:
>>>>> On 11/30/2010 06:15 PM, Michael Niedermayer wrote:
>>>>>> On Tue, Nov 30, 2010 at 05:20:42PM -0800, Baptiste Coudurier wrote:
>>>>>>> On 11/30/2010 05:13 PM, Michael Niedermayer wrote:
>>>>>>>> On Tue, Nov 30, 2010 at 04:07:20AM -0800, Baptiste Coudurier wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> $subject, use full vertical data when convert 422p, improve quality a lot.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Baptiste COUDURIER
>>>>>>>>> Key fingerprint 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
>>>>>>>>> FFmpeg maintainer http://www.ffmpeg.org
>>>>>>>>
>>>>>>>>> x86/yuv2rgb_template.c | 25 ++++-------
>>>>>>>>> yuv2rgb.c | 109 ++++++-------------------------------------------
>>>>>>>>> 2 files changed, 26 insertions(+), 108 deletions(-)
>>>>>>>>> 16f384c9b114c76572a539511c42267ce2942c67 yuv422p_to_rgb.patch
>>>>>>>>
>>>>>>>> this looks like it would make 420p->rgb quite a bit slower.
>>>>>>>
>>>>>>> Do you think changing>>1 to>>vshift would make is quite a bit slower ?
>>>>>>
>>>>>> no but that stuff:
>>>>>> @@ -152,134 +144,102 @@
>>>>>> YUV2RGBFUNC(yuv2rgb_c_48, uint8_t, 0)
>>>>>> LOADCHROMA(0);
>>>>>> PUTRGB48(dst_1,py_1,0);
>>>>>> - PUTRGB48(dst_2,py_2,0);
>>>>>>
>>>>>> LOADCHROMA(1);
>>>>>> - PUTRGB48(dst_2,py_2,1);
>>>>>> PUTRGB48(dst_1,py_1,1);
>>>>>>
>>>>>> LOADCHROMA(2);
>>>>>> PUTRGB48(dst_1,py_1,2);
>>>>>> - PUTRGB48(dst_2,py_2,2);
>>>>>>
>>>>>> LOADCHROMA(3);
>>>>>> - PUTRGB48(dst_2,py_2,3);
>>>>>> PUTRGB48(dst_1,py_1,3);
>>>>>> ENDYUV2RGBLINE(48)
>>>>>> LOADCHROMA(0);
>>>>>> PUTRGB48(dst_1,py_1,0);
>>>>>> - PUTRGB48(dst_2,py_2,0);
>>>>>>
>>>>>> LOADCHROMA(1);
>>>>>> - PUTRGB48(dst_2,py_2,1);
>>>>>> PUTRGB48(dst_1,py_1,1);
>>>>>> ENDYUV2RGBFUNC()
>>>>>>
>>>>>> and then running the code twice
>>>>>
>>>>> This is the C code, the mmx routine for yuv420 to rgb24 is already present.
>>>>>
>>>>> I don't understand running the code twice, can you please clarify ?
>>>>
>>>> yes, iam seeing this:
>>>> - for (y=0; y<srcSliceH; y+=2) {\
>>>> + for (y=0; y<srcSliceH; y++) {\
>>>>
>>>> and thus iam thinking it runs the code twice after the patch,
>>>> do i miss something?
>>>
>>> The old code was processing 2 lines at once, see py_1 and py_2, that's
>>> why a lot of code is removed in the macros.
>>
>> yes but i think its slower after the patch
>
> I guess it is a bit slower because of LOADCHROMA.
yes
>
>>>
>>> Do you agree that the mmx change looks trivial ?
>>
>> yes mmx looks good
>
> Well, problem now is that mmx routine will have different behaviour, and
> will be better.
they are already different
> I mean, the C routine is not supposed to be that fast is it ?
our code is supposed to be fast
>
> Also adding a check for some user parameters would require adding tests
> in a lot of places I fear.
just a seperate video filter that plays a bit with linesize and turns
422 in 420 as a sideeffect
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Its not that you shouldnt use gotos but rather that you should write
readable code and code with gotos often but not always is less readable
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101201/1fc35f23/attachment.pgp>
More information about the ffmpeg-devel
mailing list