[FFmpeg-devel] [PATCH] h264: assembly version of get_cabac for x86_64 with PIC
Roland Scheidegger
rscheidegger_lists at hispeed.ch
Sat Apr 14 01:50:21 CEST 2012
Am 13.04.2012 20:24, schrieb Roland Scheidegger:
> Am 13.04.2012 19:34, schrieb Reimar Döffinger:
>> On Fri, Apr 13, 2012 at 04:34:34PM +0200, Roland Scheidegger wrote:
>>> There are similar functions which could get the same treatment but they
>>> are less frequently used and since this isn't very nice as we can't use the
>>> same assembly template focus on this function alone for now.
>>
>> Huh? The only other ones seem to be decode_significance as far as I can
>> tell, and the only reason seems to be that they use
>> BRANCHLESS_GET_CABAC.
>> Have you tried making them use your BRANCHLESS_GET_CABAC macro?
>> Code is in libavcodec/x86/h264_i386.h.
> Yes those are the ones I was talking about. decode_significance_8x8_x86
> (but not decode_significance_x86) also uses MANGLE itself so needs a
> change there. Other than that, the macro arguments are quite different
> hence these need some more ifdefs to make it work.
> If it's worth it I can give it a try - I just didn't see them in a
> profile because they are part of decode_cabac_residual_internal (which
> itself does show up quite prominently).
>
>> since the macros are used multiple times the label might cause issues.
>> So far that reason and
>> for better readability it might be better to require the surrounding
>> code to have such a label instead of having it in the macro?
>
> I don't think it should cause problems, the code already had a label
> before. Maybe if moving the label outside the lea should be moved out of
> the macro too (and instead the reg holding the offset passed in) which
> would save the decode_significance functions using BRANCHLESS_GET_CABAC
> twice from performing this lea twice. I don't think that helps
> readability, though...
>
So something like the attached patch maybe?
I think though the use of labels in different asm blocks is fishy - the
only reason it is in a different block is because otherwise I want it to
be an output argument, which means all input arguments get renumbered
and hence would need completely separate significance functions instead
of the same ones with some ifdefs (I actually tried to use c to get the
rip-relative offset but failed pretty miserably trying to pass the label
to the inline assembly - the offset itself isn't a problem).
In any case, I'm not sure of the performance difference yet. I can say
though now these functions indeed show up very high on the list with
profiling so I guess at least potentially it could help.
Roland
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cabac_x86_64_pic.diff
Type: text/x-patch
Size: 10352 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20120414/edbbc3f9/attachment.bin>
More information about the ffmpeg-devel
mailing list