[FFmpeg-soc] [soc]: r5419 - als/alsdec.c

Thilo Borgmann thilo.borgmann at googlemail.com
Fri Oct 23 16:10:41 CEST 2009


Michael Niedermayer wrote:
> On Fri, Oct 23, 2009 at 01:03:56PM +0200, Thilo Borgmann wrote:
>> Michael Niedermayer wrote:
>>> On Thu, Oct 22, 2009 at 08:26:12PM +0200, Thilo Borgmann wrote:
>>>> Michael Niedermayer wrote:
>>>>> On Thu, Oct 22, 2009 at 01:32:15PM +0200, Thilo Borgmann wrote:
>>>>>> Michael Niedermayer wrote:
>>>>>>> On Thu, Oct 22, 2009 at 11:29:55AM +0200, Thilo Borgmann wrote:
>>>>>>>> Thilo Borgmann wrote:
>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>> On Wed, Oct 21, 2009 at 12:33:21PM +0200, Thilo Borgmann wrote:
>>>>>>>>>>> Michael Niedermayer wrote:
>>>>>>>>>>>> On Tue, Oct 20, 2009 at 03:00:40PM +0200, thilo.borgmann wrote:
>>>>>>>>>>>>> Author: thilo.borgmann
>>>>>>>>>>>>> Date: Tue Oct 20 15:00:40 2009
>>>>>>>>>>>>> New Revision: 5419
>>>>>>>>>>>>>
>>>>>>>>>>>>> Log:
>>>>>>>>>>>>> Splits reading of block data and decoding of block data.
>>>>>>>>>>>>> Introduces ALSBlockData struct.
>>>>>>>>>>>> You are missing the "why" part; that should be explained in the commit
>>>>>>>>>>>> message
>>>>>>>>>>> Yes, sorry.
>>>>>>>>>>>
>>>>>>>>>>>> Also, this needs a benchmark, as there are many additional dereferences
>>>>>>>>>>>> added
>>>>>>>>>>> It is a necessary evil to support MCC. If the "old" way turns out to be
>>>>>>>>>>> faster for non-MCC files, would that be reason enough to have both: a
>>>>>>>>>>> split read & decode function pair and an all-in-one function?
>>>>>>>>>> I think a benchmark is useful to judge whether we should spend time thinking
>>>>>>>>>> about alternatives to the many dereferences or not
>>>>>>> [...]
>>>>>>>>> This is a 30% difference, which makes me think these alternatives are
>>>>>>>>> worth trying.
>>>>>>>>>
>>>>>>>>> What comes to mind is to use local copies, thus dereferencing each
>>>>>>>>> field of *bd just twice: once at the top and once at the bottom of the
>>>>>>>>> function.
>>>>>>>>>
>>>>>>>> I tested using local copies instead of dereferencing:
>>>>>>> [...]
>>>>>>>> That's only a 4% gain, so I think local copies don't pay off...
>>>>>>>>
>>>>>>>> Other alternatives?
>>>>>>> I would first confirm that gcc did not do something stupid with inlining,
>>>>>>> or more precisely, that it did not stop inlining some random unrelated
>>>>>>> function after the file got bigger ...
>>>>>> How to do that?
>>>>> nm -S foobar.o
>>>>> will show you things that are not inlined and how large they are
>>>> The BSD version does not have the option to show the size instead of the
>>>> 'value'...
>>>>
>>>>
>>>>> you can also use gcc to compile code to .s files and use grep on them
>>>> ... so this seems to be necessary. What am I grepping for?
>>> here
>>> egrep '(^[^.]*.:|call[[:space:]]*[a-z])'
>>> shows function names and all calls to non-inlined functions;
>>> a diff of this output between the old and new code could be interesting
>> OK, reading this thread again, I am no longer sure whether we are
>> talking about the diff between "local variables <-> dereferences" or
>> "combined <-> separate" functions.
>>
>> So I grepped both files for both cases. I'm not practiced at reading
>> diffs, so I included the grep output for all cases.
> 
> You need a different grep line;
> it's supposed to look like (for huffyuv, for example)
> [...]

Here they are.

-Thilo
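
As a rough illustration of the "local copies" idea discussed above: the
sketch below contrasts a function that dereferences *bd on every access
with one that copies the fields into locals at the top and writes the
result back at the bottom. The BlockData struct, its fields, and both
function names are hypothetical and are not taken from alsdec.c. Whether
the second form is actually faster depends on the compiler, so it would
still need a benchmark.

```c
/* Hypothetical stand-in for a per-block struct; NOT the real ALSBlockData. */
typedef struct {
    int          *samples; /* sample buffer, modified in place          */
    unsigned int  length;  /* number of samples in the buffer           */
    unsigned int  shift;   /* scale factor expressed as a power of two  */
    long          sum;     /* result written back by the functions      */
} BlockData;

/* Variant A: every field access goes through the bd pointer. */
static void scale_deref(BlockData *bd)
{
    bd->sum = 0;
    for (unsigned int i = 0; i < bd->length; i++) {
        bd->samples[i] *= 1 << bd->shift;
        bd->sum        += bd->samples[i];
    }
}

/* Variant B: read the fields once at the top, work on local copies,
 * and write the result back once at the bottom. */
static void scale_local(BlockData *bd)
{
    int *samples              = bd->samples;
    const unsigned int length = bd->length;
    const unsigned int shift  = bd->shift;
    long sum                  = 0;

    for (unsigned int i = 0; i < length; i++) {
        samples[i] *= 1 << shift;
        sum        += samples[i];
    }

    bd->sum = sum;
}
```

Both variants compute the same result; the only difference is how often
the struct is accessed through the pointer inside the loop.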
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ALS_inlining_2.zip
Type: application/zip
Size: 72490 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-soc/attachments/20091023/9e0ed66e/attachment.zip>