[Ffmpeg-devel] a little optim for a SSE version of H263_LOOP_FILTER

Stefan Gehrer stefan.gehrer
Sun Nov 12 09:32:43 CET 2006


Kostya wrote:
> On Sat, Nov 11, 2006 at 10:48:19AM +0100, Stefan Gehrer wrote:
>   
>> I am still surprised about the input to overlap being uint8_t
>> as my understanding of VC1 was that the overlap has to be
>> done with the pixels before clipping, which can be both
>> negative and beyond 255. I remember someone brought this
>> up before on the list but I think there was no response?
>>     
>
> The logic is simple: while VC1 standard demands processing of
> at least 10-bit samples lavc implies 8-bit samples. Any workaround
> will be too messy and slow (and I don't think quality will
> significantly degrade).
>
> I think supporting 16 bit per sample formats (grayscale is already
> supported) is nice but here it would be an overkill.
>   
Rather than keeping the complete picture in 16 bit, I was thinking of
leaving the pictures in 8-bit samples and adding two extra horizontal
lines of 16-bit storage. At the end of processing a block, you store its
bottom two lines, still unclipped, in that storage, then clip the whole
block and store it in the framebuffer. When it is the turn of the block
below, you do the overlap with the values in the extra storage.
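
A rough sketch of what I mean, in C. Everything here is made up for
illustration: OverlapStore, finish_block and the 8x8 block size are
hypothetical names and assumptions, not existing lavc code, and overlap4
is only a stand-in smoothing filter, not claimed to be the normative VC1
one. Only the vertical (block above / block below) case is shown.

```c
#include <stdint.h>
#include <string.h>

#define BLK 8  /* block size assumed for this sketch */

/* Hypothetical side storage: the bottom two rows of the block above,
 * kept unclipped in 16 bit. */
typedef struct {
    int16_t rows[2][BLK];
    int     valid;          /* 0 until a block above has been stored */
} OverlapStore;

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Stand-in smoothing over four samples across a horizontal edge
 * (a, b above the edge; c, d below it). Illustration only.
 * Assumes arithmetic right shift for negative values. */
static void overlap4(int16_t *a, int16_t *b, int16_t *c, int16_t *d)
{
    int d1 = (*a - *d + 4) >> 3;
    int d2 = (*a - *d + *b - *c + 4) >> 3;
    *a = (int16_t)(*a - d1);
    *b = (int16_t)(*b - d2);
    *c = (int16_t)(*c + d2);
    *d = (int16_t)(*d + d1);
}

/* Called once per reconstructed (still unclipped) block.
 * dst points at the block's top-left pixel in the 8-bit framebuffer.
 * If st->valid, the two framebuffer rows above dst must exist. */
static void finish_block(uint8_t *dst, int stride,
                         int16_t blk[BLK][BLK], OverlapStore *st)
{
    int x, y;

    if (st->valid) {
        /* Overlap with the unclipped rows saved from the block above,
         * then rewrite those two rows in the framebuffer. */
        for (x = 0; x < BLK; x++) {
            overlap4(&st->rows[0][x], &st->rows[1][x],
                     &blk[0][x], &blk[1][x]);
            dst[-2 * stride + x] = clip_u8(st->rows[0][x]);
            dst[-1 * stride + x] = clip_u8(st->rows[1][x]);
        }
    }

    /* Save this block's bottom two rows before clipping. */
    memcpy(st->rows[0], blk[BLK - 2], sizeof(st->rows[0]));
    memcpy(st->rows[1], blk[BLK - 1], sizeof(st->rows[1]));
    st->valid = 1;

    /* Clip the whole block and store it as 8-bit samples. */
    for (y = 0; y < BLK; y++)
        for (x = 0; x < BLK; x++)
            dst[y * stride + x] = clip_u8(blk[y][x]);
}
```

With this stand-in filter, a block of all 300 above a block of all -40
ends up as 215 and 45 on either side of the edge. Clipping to 255 and 0
first and then filtering would give 191 and 64 instead, which is the kind
of mismatch I am worried about.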

IMHO it would be nice to have it like in H.264: a mode with strict
compliance which can be compared bit-wise against reference decodes
(AFAIK this should be possible in VC1 too, as there is no IDCT whose
result varies between implementations), and some "fast" flags which
would then trade compliance/quality for speed.
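
Something along these lines, as a minimal sketch. The flag name and the
enum are hypothetical and only illustrate the idea of defaulting to the
exact path and opting in to the fast one:

```c
#include <stdint.h>

/* Hypothetical decoder flag: allow the cheaper 8-bit overlap. */
enum { DEC_FLAG_FAST_OVERLAP = 1 << 0 };

typedef enum {
    OVERLAP_EXACT_16BIT,  /* strict mode, comparable to reference decodes */
    OVERLAP_FAST_8BIT     /* overlap on already clipped samples */
} OverlapPath;

/* Strict compliance is the default; speed has to be asked for. */
static OverlapPath choose_overlap_path(unsigned flags)
{
    return (flags & DEC_FLAG_FAST_OVERLAP) ? OVERLAP_FAST_8BIT
                                           : OVERLAP_EXACT_16BIT;
}
```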

Regards
Stefan Gehrer



