[Ffmpeg-devel] few remarks for h264 decoder

Michael Niedermayer michaelni
Sat Dec 31 02:55:47 CET 2005

Hi

On Fri, Dec 30, 2005 at 04:00:14PM -0800, Loren Merritt wrote:
> On Fri, 30 Dec 2005, Gábor Kovács wrote:
[...]
> >5.
> >I didn't check it thoroughly but mmx h264_qpel4_hv_lowpass macro uses
> >approximation (comment says ((a-b)/4-b)/4). I know a proper solution
> >needs 32bit range, but this is why God created "pmaddwd" :) I think the
> >performance loss needed for accurate calculation would be minimal.
>
> >input:  abccba
> >what it does: (((a-b)/4-b)/4+c+32)>>6
> >
> >reformulated (not bothering for a moment with 16bit limit)
> >((a-b)-b*4+c*16+32*16)>>(6+4)
> >(a-5*b+c*16+512)>>10
> >
> >what it should be:
> >(a-5*b+c*20+512)>>10
> >So there are two problems. One is the factor of c (16 vs 20). Other is
> >the loss of precision (lower bits) using the /4 right shifts.
>
> The code is right, the comment was wrong.
> I have not performed a detailed error analysis of the loss in precision,
> but since it does not appear to introduce any deviation from JM, I have
> to assume that those lsbs don't actually matter.

they don't matter, the trick is that (4*x + y)>>2 == x + (y>>2) for
integers x and y, so each line below is exactly equal to the one before it:

(20*c - 5*b + a + 512)>>10
(4*(5*c - b + 128) + a - b)>>10
(5*c - b + 128 + ((a - b)>>2))>>8
(4*c + c - b + 128 + ((a - b)>>2))>>8
(c + ((c - b + 128 + ((a - b)>>2))>>2))>>6
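
For anyone who wants to convince themselves numerically, here is a minimal sketch in C that compares the first and last forms of the derivation over a range of inputs. The names asr, hv_exact, hv_staged and count_mismatches are made up for this illustration and are not taken from the ffmpeg source. The helper asr assumes two's complement integers and exists only because >> on a negative int is implementation-defined in C. This checks the arithmetic identity, not the actual MMX code.

```c
#include <assert.h>

/* Floor-style right shift. Avoids relying on the implementation-defined
 * result of >> applied to a negative int (assumes two's complement). */
static int asr(int v, int n)
{
    return v >= 0 ? v >> n : ~(~v >> n);
}

/* First line of the derivation: (20*c - 5*b + a + 512) >> 10 */
static int hv_exact(int a, int b, int c)
{
    return asr(20 * c - 5 * b + a + 512, 10);
}

/* Last line of the derivation, evaluated in three small shifts. */
static int hv_staged(int a, int b, int c)
{
    int t = asr(a - b, 2);        /* (a - b) >> 2            */
    t = asr(c - b + 128 + t, 2);  /* (c - b + 128 + t) >> 2  */
    return asr(c + t, 6);         /* (c + t) >> 6            */
}

/* Counts inputs in [-lim, lim]^3 (sampled every `step`) where the two
 * forms disagree. */
static long count_mismatches(int lim, int step)
{
    long bad = 0;
    for (int a = -lim; a <= lim; a += step)
        for (int b = -lim; b <= lim; b += step)
            for (int c = -lim; c <= lim; c += step)
                if (hv_exact(a, b, c) != hv_staged(a, b, c))
                    bad++;
    return bad;
}
```

Calling count_mismatches(64, 1) from a test harness should return 0. The staged form never multiplies by 20 or 5, so its intermediate values stay much smaller than those of the direct formula.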

[...]

--
Michael