[FFmpeg-devel] [PATCH] x86: hevc: adding transform_add
Mickaël Raulet
mraulet at gmail.com
Wed Jul 30 21:38:19 CEST 2014
On 30 Jul 2014, at 16:35, Ronald S. Bultje <rsbultje at gmail.com> wrote:
> Hi!
>
> On Wed, Jul 30, 2014 at 9:33 AM, Pierre Edouard Lepere <
> Pierre-Edouard.Lepere at insa-rennes.fr> wrote:
>
>> Here's a patch adding ASM transform_add functions for HEVC.
>
>
> Yay! I'll try to review soon. Do you have rough performance metrics? I know
> it's faster :-p but it's nice to document by how much.
>
Quite a bit faster, yes: going by the benchmarks below, roughly 3.6x for 8-bit with SSE2, and up to about 10x for 10-bit with AVX2. Note that these optimizations do not include the IDCTs.
8 bits
without assembly:
- 1753 decicycles in transform add, 975751 runs, 72825 skips
SSE2:
- 490 decicycles in transform add, 1048545 runs, 31 skips

10 bits
without assembly:
- 3534 decicycles in transform add, 1046294 runs, 2282 skips
SSE2:
- 527 decicycles in transform add, 1048525 runs, 51 skips
AVX:
- 483 decicycles in transform add, 1048542 runs, 34 skips
AVX2:
- 338 decicycles in transform add, 1048558 runs, 18 skips
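
For anyone unfamiliar with what is being timed: a transform add step adds the decoded residual block to the prediction already stored in the destination picture and clips each result to the valid sample range. Below is a minimal scalar sketch in C for the 8-bit case. It assumes a square block with residuals stored row by row; the names clip_pixel and transform_add_8bit_sketch are hypothetical, and this is an illustration of the operation, not the code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp v to the valid sample range [0, (1 << bit_depth) - 1]. */
static inline int clip_pixel(int v, int bit_depth)
{
    const int max = (1 << bit_depth) - 1;
    if (v < 0)
        return 0;
    if (v > max)
        return max;
    return v;
}

/*
 * Hypothetical scalar reference (8-bit samples only):
 * add a size x size block of residuals to the prediction in dst.
 * stride is the distance in bytes between two rows of dst;
 * residual is assumed to be packed, size coefficients per row.
 */
static void transform_add_8bit_sketch(uint8_t *dst, const int16_t *residual,
                                      ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = (uint8_t)clip_pixel(dst[x] + residual[x], 8);
        dst      += stride;  /* next row of the picture */
        residual += size;    /* next row of the residual block */
    }
}
```

The inner loop is independent per sample, which is why it maps well onto SIMD: a vector version can use saturating adds to process many samples per instruction instead of one.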
Mickaël