[FFmpeg-devel] [PATCH] x86: hevc: adding transform_add

Mickaël Raulet mraulet at gmail.com
Wed Jul 30 21:38:19 CEST 2014


On 30 Jul 2014, at 16:35, Ronald S. Bultje <rsbultje at gmail.com> wrote:

> Hi!
> 
> On Wed, Jul 30, 2014 at 9:33 AM, Pierre Edouard Lepere <
> Pierre-Edouard.Lepere at insa-rennes.fr> wrote:
> 
>> Here's a patch adding ASM transform_add functions for HEVC.
> 
> 
> Yay! I'll try to review soon. Do you have rough performance metrics? I know
> it's faster :-p but it's nice to document by how much.
> 

Quite a bit faster, yes. Some benchmarks are below. Note that these optimizations do not include the IDCTs.

8 bits
Without assembly:
- 1753 decicycles in transform add, 975751 runs, 72825 skips
SSE2:
- 490 decicycles in transform add, 1048545 runs, 31 skips


10 bits
Without assembly:
- 3534 decicycles in transform add, 1046294 runs, 2282 skips
SSE2:
- 527 decicycles in transform add, 1048525 runs, 51 skips
AVX:
- 483 decicycles in transform add, 1048542 runs, 34 skips
AVX2:
- 338 decicycles in transform add, 1048558 runs, 18 skips
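
For context on what is being timed: transform add is the step that adds the decoded residual block to the predicted pixels and saturates the result to the pixel range. A minimal scalar sketch for the 8-bit case could look like the following. The function and helper names are hypothetical and this is not the code from the patch, only an illustration of the operation under the assumption of a square block of 16-bit residuals.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range [0, 255]. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical scalar reference (not the patch's code): add a
 * size x size block of residuals to the predicted pixels in dst,
 * saturating each sum to 8 bits. stride is the distance in bytes
 * between two rows of dst; res is stored contiguously, row by row. */
static void transform_add_8bit_sketch(uint8_t *dst, const int16_t *res,
                                      ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]);
        dst += stride;  /* next row of the picture */
        res += size;    /* next row of the residual block */
    }
}
```

The inner loop is independent per pixel, which is presumably why it maps well onto SIMD: a vector version can process a whole row of sums and saturations at once.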

Mickaël

