[FFmpeg-devel] [GSoC] [WIP] [RFC] FLIF Encoder & Decoder Project

Jai Luthra me at jailuthra.in
Wed May 6 03:04:14 EEST 2020


Hi Kartik,

On Mon, Mar 30, 2020, at 4:50 AM, Kartik K. Khullar wrote:
> This is my WIP patch for GSoC and I worked on transformations involved 
> in encoding/decoding FLIF images. I have created a module under 
> libavcodec and as guided by my mentors I have tried to use pixel data 
> from AVFrame structs.
> Module covers all the chunks of code for YCoCg Transformation that will 
> be used in final encoder/decoder. Necessary structs and methods have 
> been made as suggested by my mentors. The module compiles/builds 
> successfully.
> Also I have attached a small code 'transformtest.c' which I wrote for 
> testing the implementation of pixel transformation.
> The variable AVFrame::data[] is of type uint8_t* and I was initially 
> unaware of this so I stored the YCoCg values in 'data'. So the negative 
> values which were the output of transformation of RGB -> YCoCg were not 
> stored properly and thats where the output is wrong.

I tested your code, and it is a good initial attempt (of course, the negative values overflow the uint8_t range, which is wrong).

Your understanding of the problem is correct: transforming an RGB value that lies in (0-255, 0-255, 0-255) can result in a YCoCg value anywhere in (0-255, -255-255, -255-255), which does not fit in AVFrame.data, an array of uint8_t *.

> Just wanted to ask, if I should be using some other structs for storing 
> the YCoCg values and not AVFrame, because AVFrame seems to be the 
> standard struct in FFmpeg where the raw media resides.

The YCoCg values don't need to go into AVFrame: your testcase (RGB->YCoCg) is the encoding phase, which reads RGB values from **AVFrame** and ultimately outputs binary encoded data (after entropy coding) into **AVPacket**. Sorry if this was not clear before.

It is OK to use a bigger buffer with 16 bits per color value for the intermediate transform stages. The only invariant is that the original frame has 8-bit RGB values and the final encoder output is binary data; what the encoder uses internally to represent pixel values doesn't matter to the FFmpeg API.
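Such an interim buffer could look like the following sketch. The struct and function names here (FLIFPlanes etc.) are hypothetical, not part of any FFmpeg API; int16_t is chosen because Co/Cg can be negative:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical interim pixel buffer for the transform stages.
 * Names are illustrative only. Signed 16-bit planes hold the
 * post-transform values, which range down to -255. */
typedef struct FLIFPlanes {
    int width, height;
    int16_t *plane[3];   /* Y, Co, Cg (or R, G, B before the transform) */
} FLIFPlanes;

static FLIFPlanes *flif_planes_alloc(int width, int height)
{
    FLIFPlanes *p = calloc(1, sizeof(*p));
    if (!p)
        return NULL;
    p->width  = width;
    p->height = height;
    for (int i = 0; i < 3; i++) {
        p->plane[i] = malloc(sizeof(int16_t) * width * height);
        if (!p->plane[i]) {          /* clean up on partial failure */
            while (i--)
                free(p->plane[i]);
            free(p);
            return NULL;
        }
    }
    return p;
}

static void flif_planes_free(FLIFPlanes *p)
{
    if (!p)
        return;
    for (int i = 0; i < 3; i++)
        free(p->plane[i]);
    free(p);
}
```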

But in practice you will never use the full 16-bit range of each component with YCoCg, as the Co range is conditional on Y, and the Cg range is conditional on both Y and Co. It is *crucial* for your project that you thoroughly understand the "new ranges" section in the spec [1].

Unlike YCbCr (and other common transforms), which maps 0-255 to 0-255 (or an even narrower range), YCoCg works differently. If the Y value is very low or very high, the color components are roughly equal, so Co and Cg will definitely lie in a small range. This is what the animation [2] in the spec illustrates: Y/luminance varies from 0-255, and the working range of Co/Cg is shown as the size of the square.

I.e. the transform may map a Co value anywhere in -255-255, but that will not happen for every value of Y; it only happens when `origmax4 - 1 <= Y <= 3*origmax4 - 1`. Similar rules apply to Cg.

So your next steps should be:
1. Use a signed 16-bit type (e.g. int16_t, since Co/Cg can be negative) to store interim pixel values for all transformations (it doesn't need to be part of AVFrame; it is a structure internal to the codec)
2. Figure out how to implement the crange functions/API, as this will be crucial for the MANIAC encoder phase (it needs to know the conditional ranges to effectively entropy-code the pixel values)

> Attaching some testcases of RGB and its corresponding YCoCg values for 
> testing purposes.
> 
> Thanks
> Kartik K. Khullar

Cheers,
Jai

[1]: https://flif.info/spec.html#_new_ranges
[2]: https://www.youtube.com/watch?v=-v-xoKZBnhI
