[FFmpeg-devel] [PATCH] JPEG-XL : Image Format Parser

Jai Luthra me at jailuthra.in
Wed Mar 18 12:46:53 EET 2020


Hi Varun,

Set [RFC]/[WIP]/[GSOC] and other subject labels for patches that are not 
intended for merge review. From your first email it seems like your mailer 
mangled it. You can edit the .patch file before sending it via git send-email.

On Mon, Mar 16, 2020 at 12:11:54AM +0530, Varun Gupta wrote:
> [...]
>+
>+typedef struct JPEGXLParseContext {
>+    ParseContext pc;
>+    int state;
>+    int index;  // keeps track of number of bits read from the media file
>+    SizeHeader size;
>+    PreviewHeader preview;
>+    ImageMetadata metadata;
>+    AnimationHeader animation;
>+} JPEGXLParseContext;

Most of the decoding specific headers should be read in the decoder itself. 
The parser should only find the end of frames, and decoder can initalize other 
parameters from first packet (which would be header + frame) and use same 
initialized contexts for the subsequent frame packets. Take a look at other 
video and image parsers.

If you think that it won't be possible to find frame ends without reading all 
the headers then that is a different case. Even then you need to make sure 
your parameters reach the decoder module via AVCodecContext->priv_data.

But it is good that you read the spec and figured out the bitstream 
organization.

>+
>+static unsigned int bits_offset_enum(GetBitContext gb, int n, int offset) {
>+    unsigned int read_n_bits = get_bits(&gb, n);
>+    return read_n_bits + offset;
>+}
>+
>+// TODO -> add checks for buffer size overflow in between and ill formed checks
>+static int jpegxl_find_frame_end(JPEGXLParseContext *context, const uint8_t *buf,
>+                                 int buf_size) {
>+    int index, next = END_NOT_FOUND;
>+    GetBitContext gb;
>+    init_get_bits(&gb, buf, buf_size*8);

init_get_bits8 can be used here as your buf_size is in bytes.

>+    for (index = 0; index < buf_size*8; ) {
>+        if (!context->state) {
>+            if (get_bits(&gb, 8) == JPEG_XL_SIG_FF) {
>+                context->state = JPEGXL_SIG;

Any particular reason for choosing state as type int and not the enum type it 
is being set to?

You can add a state called JPEGXL_UNDEFINED=0 if you are using int just to 
handle that case

>+            } else {
>+                // TODO -> Bitstream is ill formed
>+            }
>+            index += 8;

GetBitContext maintains an internal index as well, look into how you can use 
it for your case.

>+            context->index += 8;
>+        } else if (context->state == JPEGXL_SIG) {
>+            if (get_bits(&gb, 8) == JPEG_XL_SIG_TYPE) {
>+                context->state = JPEGXL_SIZE_HEADER;
>+            } else {
>+                // TODO -> Bitstream is ill formed
>+            }
>+            index +=8;
>+            context->index += 8;
>+        } else if (context->state == JPEGXL_SIZE_HEADER) {
>+            // default values
>+            context->size.ysize_div8_minus_1 = 0;
>+            context->size.ysize_minus_1 = 0;
>+            context->size.xsize_div8_minus_1 = 0;
>+            context->size.xsize_minus_1 = 0;

To simplify code you can 0 initialize all structs at start by setting context 
struct to 0. Then you only need to change the non-zero init values.

>+
>+            unsigned int small = get_bits(&gb, 1);

I do not think this will compile with FFmpeg which is C89 standard. Declare 
all variables at top of block. Same with your usage of loop variables, define 
them beforehand.

>+            index++;
>+            context->index++;
>+            if (small) {
>+                context->size.ysize_div8_minus_1 = get_bits(&gb, 5);
>+                index += 5;
>+                context->index += 5;
>+            } else {
>+                unsigned int input = get_bits(&gb, 2); // U32() first reads 2 bits
>+                index += 2;
>+                context->index += 2;
>+                if (input == 0) {   // d0 = Bits(9)
>+                    context->size.ysize_minus_1 = get_bits(&gb, 9);
>+                    index += 9;
>+                    context->index += 9;
>+                } else if (input == 1) {   // d1 = Bits(13)
>+                    context->size.ysize_minus_1 = get_bits(&gb, 17);
>+                    index += 13;
>+                    context->index += 13;
>+                } else if (input == 2) {   // d2 = Bits(18)
>+                    context->size.ysize_minus_1 = get_bits_long(&gb, 18);
>+                    index += 18;
>+                    context->index += 18;
>+                } else {   // d3 = Bits(30)
>+                    context->size.ysize_minus_1 = get_bits_long(&gb, 30);
>+                    index += 30;
>+                    context->index += 30;
>+                }
>+            }

you could simplify this a lot by setting a temporary variable to hold the read 
bit size n using a switch case, and then read n bits into the ysize_minus_1. 

similar simplifications can be done at a lot of places.

>+            unsigned int ratio = get_bits(&gb, 3);
>+            index += 3;
>+            context->index += 3;
>+            if (ratio == 0) {
>+                if (small) {
>+                    context->size.xsize_div8_minus_1 = get_bits(&gb, 5);
>+                    index += 5;
>+                    context->index += 5;
>+                } else {
>+                    unsigned int input = get_bits(&gb, 2); // U32() first reads 2 bits
>+                    index += 2;
>+                    context->index += 2;
>+                    if (input == 0) {   // d0 = Bits(9)
>+                        context->size.xsize_minus_1 = get_bits(&gb, 9);
>+                        index += 9;
>+                        context->index += 9;
>+                    } else if (input == 1) {   // d1 = Bits(13)
>+                        context->size.xsize_minus_1 = get_bits(&gb, 17);
>+                        index += 13;
>+                        context->index += 13;
>+                    } else if (input == 2) {   // d2 = Bits(18)
>+                        context->size.xsize_minus_1 = get_bits_long(&gb, 18);
>+                        index += 18;
>+                        context->index += 18;
>+                    } else {   // d3 = Bits(30)
>+                        context->size.xsize_minus_1 = get_bits_long(&gb, 30);
>+                        index += 30;
>+                        context->index += 30;
>+                    }
>+                }
>+            }
>+            context->state = JPEGXL_IMAGE_METADATA;
>+        } else if (context->state == JPEGXL_IMAGE_METADATA) {
>+            // setting up default values
>+            context->metadata.have_icc = 0;
>+            context->metadata.alpha_bits = 0;
>+            context->metadata.bits_per_sample = 8;
>+            context->metadata.target_nits_div50 = 5;
>+            context->metadata.colour_encoding.received_icc = 0;
>+            context->metadata.colour_encoding.opaque_icc = 0;
>+            context->metadata.colour_encoding.colour_space = kRGB;
>+            context->metadata.colour_encoding.white_point = kD65;
>+            context->metadata.colour_encoding.primaries = kSRGB;
>+            context->metadata.colour_encoding.have_gamma = 0;
>+            context->metadata.colour_encoding.gamma = 0;
>+            context->metadata.colour_encoding.transfer_function = kSRGBTransferFunction;
>+            context->metadata.colour_encoding.rendering_intent = kRelative;
>+            context->metadata.m2.have_preview = 0;
>+            context->metadata.m2.have_animation = 0;
>+            context->metadata.m2.orientation_minus_1 = 0;
>+            context->metadata.m2.depth_bits = 0;
>+            context->metadata.m2.depth_shift = 0;
>+            context->metadata.m2.num_extra_channels = 0;
>+            context->metadata.m2.extra_channel_bits = 0;
>+
>+            context->metadata.all_default = get_bits(&gb, 1);
>+            index++;
>+            context->index++;
>+            if (!context->metadata.all_default) {
>+                context->metadata.have_icc = get_bits(&gb, 1);
>+                index++;
>+                context->index++;
>+
>+                unsigned int input = get_bits(&gb, 2); // U32() first reads 2 bits
>+                index += 2;
>+                context->index += 2;
>+                if (input == 0) {   // d0 = Val(8)
>+                    context->metadata.bits_per_sample = 8;
>+                } else if (input == 1) {  // d1 = Val(16)
>+                    context->metadata.bits_per_sample = 16;
>+                } else if (input == 2) {  // d2 = Val(32)
>+                    context->metadata.bits_per_sample = 32;

if (input >= 0 && input <= 2)
     context->metadata.bits_per_sample = 1 << (3 + input);

but current way wayis also OK for this case.

>+                } else {  // d3 = Bits(5)
>+                    context->metadata.bits_per_sample = get_bits(&gb, 5);
>+                    index += 5;
>+                    context->index += 5;
>+                }
>+
>+                context->metadata.colour_encoding.all_default = get_bits(&gb, 1);
>+                index++;
>+                context->index++;
>+                if(!context->metadata.colour_encoding.all_default) {
>+                    context->metadata.colour_encoding.received_icc = get_bits(&gb ,1);
>+                    index++;
>+                    context->index++;
>+                }
>+                if(context->metadata.colour_encoding.received_icc) {
>+                    context->metadata.colour_encoding.opaque_icc = get_bits(&gb, 1);
>+                    index++;
>+                    context->index++;
>+                }
>+                int use_desc = !context->metadata.colour_encoding.all_default &&
>+                               !context->metadata.colour_encoding.opaque_icc;
>+                if(use_desc) {  // colour_space enum
>+                    unsigned int input = get_bits(&gb, 2);
>+                    index += 2;
>+                    context->index += 2;
>+                    unsigned int enum_value;
>+                    if (input == 0) {
>+                        enum_value = 0;
>+                    } else if (input == 1) {
>+                        enum_value = 1;
>+                    } else if (input == 2) {
>+                        enum_value = bits_offset_enum(gb, 4, 2);
>+                        index += 4;
>+                        context->index += 4;
>+                    } else {
>+                        enum_value = bits_offset_enum(gb, 6, 18);
>+                        index += 6;
>+                        context->index += 6;
>+                    }
>+
>+                    if (checkIfValueInColourSpace(enum_value)) {
>+                        context->metadata.colour_encoding.colour_space = enum_value;
>+                    } else {
>+                        // TODO -> Bitstream is ill formed
>+                    }
>+                }
>+                int not_xy = context->metadata.colour_encoding.colour_space != kXYZ &&
>+                             context->metadata.colour_encoding.colour_space != kXYB;
>+                if(use_desc && not_xy) {  // white_point enum
>+                    unsigned int input = get_bits(&gb, 2);
>+                    index += 2;
>+                    context->index += 2;
>+                    unsigned int enum_value;
>+                    if (input == 0) {
>+                        enum_value = 0;
>+                    } else if (input == 1) {
>+                        enum_value = 1;
>+                    } else if (input == 2) {
>+                        enum_value = bits_offset_enum(gb, 4, 2);
>+                        index += 4;
>+                        context->index += 4;
>+                    } else {
>+                        enum_value = bits_offset_enum(gb, 6, 18);
>+                        index += 6;
>+                        context->index += 6;
>+                    }
>+
>+                    if (checkIfValueInWhitePoint(enum_value)) {
>+                        context->metadata.colour_encoding.white_point = enum_value;
>+                    } else {
>+                        // TODO -> Bitstream is ill formed
>+                    }
>+                }
>+                if (use_desc && context->metadata.colour_encoding.white_point == kCustom) {
>+                    // TODO -> Implement custom xy for white
>+                }
>+                if (use_desc && not_xy && context->metadata.colour_encoding.colour_space != kGrey) {   // primaries enum
>+                    unsigned int input = get_bits(&gb, 2);
>+                    index += 2;
>+                    context->index += 2;
>+                    unsigned int enum_value;
>+                    if (input == 0) {
>+                        enum_value = 0;
>+                    } else if (input == 1) {
>+                        enum_value = 1;
>+                    } else if (input == 2) {
>+                        enum_value = bits_offset_enum(gb, 4, 2);
>+                        index += 4;
>+                        context->index += 4;
>+                    } else {
>+                        enum_value = bits_offset_enum(gb, 6, 18);
>+                        index += 6;
>+                        context->index += 6;
>+                    }

this exact chunk of code was seen before (and seen later as well), consider 
moving it to a function.

>+
>+                    if (checkIfValueInPrimaries(enum_value)) {
>+                        context->metadata.colour_encoding.primaries = enum_value;
>+                    } else {
>+                        // TODO -> Bitstream is ill formed
>+                    }
>+                }
>+                if (use_desc && context->metadata.colour_encoding.primaries == kCustomPrimary) {
>+                    // TODO -> Implement custom xy for red
>+                }
>+                if (use_desc && context->metadata.colour_encoding.primaries == kCustomPrimary) {
>+                    // TODO -> Implement custom xy for green
>+                }
>+                if (use_desc && context->metadata.colour_encoding.primaries == kCustomPrimary) {
>+                    // TODO -> Implement custom xy for blue
>+                }
>+                if (use_desc && not_xy) {
>+                    context->metadata.colour_encoding.have_gamma = get_bits(&gb, 1);
>+                    index++;
>+                    context->index++;
>+                }
>+                if (use_desc && context->metadata.colour_encoding.have_gamma) {
>+                    get_bits_long(&gb, 24);
>+                    index += 24;
>+                    context->index += 24;
>+                }
>+                if (use_desc && !context->metadata.colour_encoding.have_gamma && not_xy) { // transfer_function enum
>+                    unsigned int input = get_bits(&gb, 2);
>+                    index += 2;
>+                    context->index += 2;
>+                    unsigned int enum_value;
>+                    if (input == 0) {
>+                        enum_value = 0;
>+                    } else if (input == 1) {
>+                        enum_value = 1;
>+                    } else if (input == 2) {
>+                        enum_value = bits_offset_enum(gb, 4, 2);
>+                        index += 4;
>+                        context->index += 4;
>+                    } else {
>+                        enum_value = bits_offset_enum(gb, 6, 18);
>+                        index += 6;
>+                        context->index += 6;
>+                    }
>+
>+                    if (checkIfValueInTransferFunction(enum_value)) {
>+                        context->metadata.colour_encoding.transfer_function = enum_value;
>+                    } else {
>+                        // TODO -> Bitstream is ill formed
>+                    }
>+                }
>+                if (use_desc && context->metadata.colour_encoding.colour_space != kGrey && not_xy) { // rendering_intent enum
>+                    unsigned int input = get_bits(&gb, 2);
>+                    index += 2;
>+                    context->index += 2;
>+                    unsigned int enum_value;
>+                    if (input == 0) {
>+                        enum_value = 0;
>+                    } else if (input == 1) {
>+                        enum_value = 1;
>+                    } else if (input == 2) {
>+                        enum_value = bits_offset_enum(gb, 4, 2);
>+                        index += 4;
>+                        context->index += 4;
>+                    } else {
>+                        enum_value = bits_offset_enum(gb, 6, 18);
>+                        index += 6;
>+                        context->index += 6;
>+                    }
>+
>+                    if (checkIfValueInRenderingIntent(enum_value)) {
>+                        context->metadata.colour_encoding.rendering_intent = enum_value;
>+                    } else {
>+                        // TODO -> Bitstream is ill formed
>+                    }
>+                }
>+
>+                input = get_bits(&gb, 2); // U32() first reads 2 bits
>+                index += 2;
>+                context->index += 2;
>+                if (input == 0) {   // d0 = Val(0)
>+                    context->metadata.alpha_bits = 0;
>+                } else if (input == 1) {  // d1 = Val(8)
>+                    context->metadata.alpha_bits = 8;
>+                } else if (input == 2) {  // d2 = Val(16)
>+                    context->metadata.alpha_bits = 16;
>+                } else {  // d3 = Bits(4)
>+                    context->metadata.alpha_bits = get_bits(&gb, 4);
>+                    index += 4;
>+                    context->index += 4;
>+                }

again very similar to a block before. it is ok if you wrote everything 
explicitly for testing, but do clean it out at some point.

> [...]


Cheers

--
Jai (darkapex)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20200318/64dad87a/attachment.sig>


More information about the ffmpeg-devel mailing list