[MPlayer-dev-eng] Help with MMX asm code

Jason Tackaberry tack at auc.ca
Fri Oct 24 00:27:59 CEST 2003


On Thu, 2003-10-23 at 13:07, Billy Biggs wrote: 
>   If your Cb/Cr channels (U and V) are only width*height/4, then you
> have a 4:2:0 image not a 4:2:2 image.  For some explanation see:
> 
>   http://www.poynton.com/PDFs/Chroma_subsampling_notation.pdf

Thanks for the reference.  I think I understand that now. :)

>   I would recommend instead: a = layer_alpha/256 * img.A[pos] as
> division by 255 is expensive and it's cheap to keep around 4 bytes
> instead of only 1 byte for your layer alpha.

In my implementation I pulled layer_alpha/255 out of the inner loops, so
although the division is expensive, its cost is barely noticeable.
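
As a rough sketch of what hoisting the division looks like (this is not
the code from vf_bmovl2.c; the function name scale_alpha_row and its
parameters are made up for illustration), the layer alpha can be turned
into an 8.8 fixed-point factor once per row, leaving only a multiply and
a shift per pixel:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine a per-layer alpha (0..255) with a row of
 * per-pixel alpha values. The only division happens once, outside the loop. */
void scale_alpha_row(uint8_t *dst, const uint8_t *img_a, size_t n,
                     uint8_t layer_alpha)
{
    /* layer_alpha/255 as an 8.8 fixed-point factor in the range 0..256. */
    uint32_t scale = ((uint32_t)layer_alpha * 256 + 127) / 255;

    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((img_a[i] * scale) >> 8);  /* multiply + shift */
}
```

The result is truncated rather than rounded, so it can be off by one
from the exact layer_alpha * img_a / 255 for some inputs.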

>   First, this seems wrong.  If we look at a block of four pixels:
> 
>   A B
>   C D
> 
>   You're using the alpha from pixel D to apply to the Cb/Cr components.
> For MPEG2, the chroma samples are positioned halfway between A and C, so
> if you want to be really correct, you should filter the alpha channel,
> for example by taking the average alpha value between A and C.  If this
> is expensive, at least use the alpha of pixel A and not pixel D.

Would averaging the alpha between B and D also be acceptable?  Also, why
would taking the alpha of pixel A for Cb/Cr be any better than using D,
or for that matter B or C?  At least, it's not clear to me why it's any
less correct (or more incorrect, in this case), and certainly my eye
can't tell the difference.
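
For reference, the filtering Billy suggests could be sketched like this,
assuming a full-resolution alpha plane, even image dimensions, and the
MPEG2 chroma position he describes (halfway between A and C).  The
function name and parameters are hypothetical:

```c
#include <stdint.h>

/* Alpha to apply to the Cb/Cr sample that covers the 2x2 luma block whose
 * top-left pixel is (2*cx, 2*cy). Averages pixels A and C, rounding up. */
uint8_t chroma_alpha_mpeg2(const uint8_t *alpha, int stride, int cx, int cy)
{
    const uint8_t *top = alpha + (2 * cy) * stride + 2 * cx;  /* pixel A */
    const uint8_t *bot = top + stride;                        /* pixel C */

    return (uint8_t)((top[0] + bot[0] + 1) >> 1);
}
```

Under that assumption, A and C are the two luma positions closest to the
chroma sample, which is the argument for preferring them over B and D.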
 
>   This memory layout is fine and you can optimize it like it is.  My
> code might help as a starting point ...  I can definitely edit any code
> you come up with too :)

The latest version of the code is at 
   http://people.auc.ca/tack/archive/2003-10/vf_bmovl2.c

The code that needs optimizing is the nested loop in put_image(), which
is essentially the C code for the algorithm I described earlier.  Aside
from the incorrectness of the blending math (which isn't fatally
problematic, as far as I can tell), I'm not sure I can make that
appreciably faster without MMX.
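
For comparison, here is a minimal scalar sketch of an 8-bit blend that
avoids the division entirely.  It is not the loop from vf_bmovl2.c; the
names div255 and blend8 are illustrative, and the rounding trick is the
commonly used add-and-shift form of division by 255:

```c
#include <stdint.h>

/* Rounded x/255 for 0 <= x <= 255*255, using only adds and shifts. */
static inline uint8_t div255(uint32_t x)
{
    x += 128;
    return (uint8_t)((x + (x >> 8)) >> 8);
}

/* Blend one 8-bit sample: (src*a + dst*(255 - a)) / 255, with a in 0..255. */
uint8_t blend8(uint8_t dst, uint8_t src, uint8_t a)
{
    return div255((uint32_t)src * a + (uint32_t)dst * (255 - a));
}
```

Because it needs only multiplies, adds and shifts on 16-bit
intermediates, this form should carry over to packed 16-bit MMX
arithmetic fairly directly, though I have not written that version.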

>   If you want help let me know.

You might regret offering. :)  Really, right now I just need to read
through your code and experiment a bit more to try to come up with the
MMX instructions for the blend math.  Naturally any advice or bones
you'd like to throw me would be welcome, but for now I don't have any
specific questions.

Jason.

-- 
Jason Tackaberry  ::  tack at auc.ca  :: 705-949-2301 x330 
Academic Computing Support Specialist
Information Technology Services
Algoma University College  ::  www.auc.ca
