[FFmpeg-devel] [PATCH 1/2] swresample: Refactor resample asm and port it to yasm
James Almer
jamrial at gmail.com
Wed Mar 19 19:00:30 CET 2014
On 19/03/14 2:34 PM, Michael Niedermayer wrote:
> On Wed, Mar 19, 2014 at 02:24:05PM -0300, James Almer wrote:
>> On 19/03/14 9:18 AM, Michael Niedermayer wrote:
>>> On Wed, Mar 19, 2014 at 02:49:33AM -0300, James Almer wrote:
>>>> This reduces code duplication and makes it easier to implement new asm
>>>> functions in the future
>>>>
>>>> Signed-off-by: James Almer <jamrial at gmail.com>
>>>> ---
>>>> libswresample/resample.c | 96 ++++++++++---------------------------
>>>> libswresample/resample_template.c | 49 +++++++------------
>>>> libswresample/swresample_internal.h | 24 ++++++++++
>>>> libswresample/x86/Makefile | 1 +
>>>> libswresample/x86/resample.asm | 64 +++++++++++++++++++++++++
>>>> libswresample/x86/resample_mmx.h | 74 ----------------------------
>>>> libswresample/x86/swresample_x86.c | 16 +++++++
>>>> 7 files changed, 148 insertions(+), 176 deletions(-)
>>>> create mode 100644 libswresample/x86/resample.asm
>>>> delete mode 100644 libswresample/x86/resample_mmx.h
>>>
>>> what effect does this have on speed?
>>> you are adding a function call in an inner loop
>>
>> At least on my end it seems to be a couple of cycles faster (measured with
>> timer.h macros surrounding the c->scalarproduct() function call, and the
>> COMMON_CORE inline asm in the pre-patch version).
>
> IMO to measure the call overhead the timer macros should be
> farther out, that is, outside that loop
> we want to know how the extra function call interacts with the
> code before and after
> the timer code will affect this interaction if it's between
Pre-patch:
300068 decicycles in swri_resample_int16_sse2, 65438 runs, 98 skips
Post-patch:
291174 decicycles in swri_resample_int16, 65414 runs, 122 skips
This was converting a 44100 Hz 16-bit stereo stream to 22050 Hz.
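
For anyone comparing the two figures above, here is a minimal sketch of the arithmetic in C. It assumes only that a decicycle is one tenth of a CPU cycle, as printed by the timer macros. The helper names decicycles_to_cycles() and relative_change_percent() are hypothetical and are not part of libswresample or libavutil.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a decicycle is one tenth of a CPU cycle,
 * so the printed value is divided by 10 to get cycles. */
static double decicycles_to_cycles(int64_t decicycles)
{
    return (double)decicycles / 10.0;
}

/* Hypothetical helper: relative change of the post-patch figure versus
 * the pre-patch figure, in percent. A negative result means the
 * post-patch code took fewer cycles per run. */
static double relative_change_percent(int64_t pre, int64_t post)
{
    assert(pre > 0); /* a zero or negative baseline makes no sense here */
    return 100.0 * (double)(post - pre) / (double)pre;
}
```

Fed the numbers from this mail, relative_change_percent(300068, 291174) comes out slightly below -2.9, i.e. the post-patch function takes roughly 3% fewer cycles per run in this one measurement. The run and skip counts differ a little between the two lines, so this is best read as an indication rather than an exact figure.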