[Ffmpeg-devel] [PATCH] H.264 deblocking mmx

Loren Merritt lorenm
Thu Apr 28 23:51:07 CEST 2005

On Thu, 28 Apr 2005, Skal wrote:
> On Mon, 2005-04-25 at 00:39, Loren Merritt wrote:
>> I noticed that the inloop deblocking filter was taking a large fraction of
>> the decode time, and it is inherently parallel, so...
> just some remarks about the patch:
> a) chroma deblocking filters 4 pixels at a time, whereas
> it seems to me only 2 chroma pixels share the same
> strength (deduced from the contents of the co-located 4x4 luma block).
> And for MBAFF, you sometimes even have to filter only 1
> vertical chroma sample at a time (in case of Field->Frame or
> Frame->Field vertical filtering).

It's ok that only 2 samples share a strength: the tc0 vector contains 
one value per sample, and they don't all have to be the same.
One could deal with MBAFF the same way (add heterogeneous alpha and 
beta vectors, since QP can differ too). That only breaks in the case 
that one of the two colocated field MBs is intra (bS=4) and the other 
isn't, so that they use different algorithms in addition to different 
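
To illustrate the per-sample tc0 point, here is a minimal scalar C sketch of the normal (bS<4) chroma edge filter with one clipping value per sample. It is not the MMX code from the patch: the function name, the stride parameters, and the convention that tc[i] <= 0 means "skip this sample" are assumptions made for illustration, and tc[] is assumed to already include the chroma +1 adjustment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int clip3(int lo, int hi, int v)
{
    return v < lo ? lo : v > hi ? hi : v;
}

/* Hypothetical helper: filter n samples across one edge.
 * pix points at q0 of the first sample; xstride steps across the edge
 * (p1, p0 | q0, q1) and ystride steps along it.
 * tc[i] is the clipping bound for sample i and may differ per sample. */
static void chroma_edge_filter(uint8_t *pix, int xstride, int ystride, int n,
                               int alpha, int beta, const int8_t *tc)
{
    assert(pix != NULL && tc != NULL);
    for (int i = 0; i < n; i++, pix += ystride) {
        if (tc[i] <= 0)
            continue;                      /* assumed convention: no filtering */
        int p1 = pix[-2 * xstride], p0 = pix[-xstride];
        int q0 = pix[0],            q1 = pix[xstride];
        if (abs(p0 - q0) < alpha && abs(p1 - p0) < beta && abs(q1 - q0) < beta) {
            /* relies on arithmetic right shift for negative values */
            int delta = clip3(-tc[i], tc[i],
                              ((q0 - p0) * 4 + (p1 - q1) + 4) >> 3);
            pix[-xstride] = (uint8_t)clip3(0, 255, p0 + delta);
            pix[0]        = (uint8_t)clip3(0, 255, q0 - delta);
        }
    }
}
```

With tc = {1, 1, 3, 0} over four samples, the first pair and the third sample are clipped differently and the fourth is left untouched, all in one pass over the edge.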

> b) the ASM code computes the ABS(a-b) value and afterwards
> compares it to Alpha/Beta, using 16-bit words.
> But in fact, only the result of the test (not the abs value itself)
> matters, and it could advantageously be computed using
> unsigned 8-bit values only,

I'll try it.
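
One way the 8-bit variant could look, sketched in scalar C rather than MMX: the test |a-b| < thresh can be built from unsigned saturating subtracts alone, which is what an instruction like psubusb does per byte lane. The helper names below are hypothetical, and this is only a sketch of the idea, not what the patch currently does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar stand-in for one byte lane of an unsigned saturating subtract. */
static uint8_t subus8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* Returns 1 when |a - b| < thresh, using only unsigned 8-bit operations. */
static int absdiff_lt(uint8_t a, uint8_t b, uint8_t thresh)
{
    if (thresh == 0)
        return 0;                            /* |a - b| < 0 never holds */
    /* One of the two differences saturates to 0, the other is |a - b|. */
    uint8_t d = subus8(a, b) | subus8(b, a);
    /* d - (thresh - 1) saturates to 0 exactly when d <= thresh - 1. */
    return subus8(d, (uint8_t)(thresh - 1)) == 0;
}

/* Exhaustive comparison against the wider-integer formulation. */
static void absdiff_lt_selfcheck(void)
{
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            for (int t = 0; t < 256; t++)
                assert(absdiff_lt((uint8_t)a, (uint8_t)b, (uint8_t)t)
                       == (abs(a - b) < t));
}
```

In a SIMD version the final comparison against zero would yield a per-byte mask, so 8 samples fit in one MMX register instead of 4.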

> Before you ask: why don't I supply a patch for that? Simply because I 
> very much dislike inline ASM code. I can hardly read it, let alone write 
> some. But fortunately, Michael is around here ;)

I prefer nasm syntax, but OTOH I also like being able to mix ASM with C. 
So I'm undecided.

--Loren Merritt
