[FFmpeg-devel] [PATCH 09/11] avcodec/x86: allow future 8-bit simple idct to have "DC only hack"

Henrik Gramner henrik at gramner.com
Sat Jun 24 21:01:25 EEST 2017


On Mon, Jun 19, 2017 at 5:11 PM, James Darnley <jdarnley at obe.tv> wrote:
> +    por     m1, m8, m13
> +    por     m1, m12
> +    por     m1, [blockq+ 16]       ; { row[1] }[0-7]
> +    por     m1, [blockq+ 48]       ; { row[3] }[0-7]
> +    por     m1, [blockq+ 80]       ; { row[5] }[0-7]
> +    por     m1, [blockq+112]       ; { row[7] }[0-7]

Using a single destination register here creates a serial dependency
chain, so at most one of these instructions can execute per cycle.
Splitting the ORs across two destination registers and merging them at
the end would double the (local) IPC.

Out-of-order execution might alleviate the dependency, but there is no
reason to rely on it unnecessarily.
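
A rough, untested sketch of such a split. It assumes m2 is free as a
scratch register at this point, which has not been checked against the
rest of the patch; any other unused register would do:

```asm
    por     m1, m8, m13            ; chain A: starts from m8 | m13
    por     m2, m12, [blockq+ 16]  ; chain B (m2 is an assumed free register): m12 | { row[1] }[0-7]
    por     m1, [blockq+ 48]       ; chain A: { row[3] }[0-7]
    por     m2, [blockq+ 80]       ; chain B: { row[5] }[0-7]
    por     m1, [blockq+112]       ; chain A: { row[7] }[0-7]
    por     m1, m2                 ; merge both chains, same result in m1 as before
```

The instruction count stays at six, but the longest dependency chain
drops from six ORs to four, since the two chains can issue in parallel
until the final merge.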

