[FFmpeg-devel] [PATCH v4 1/5] avcodec/jpegxl: add Jpeg XL image codec and parser

Sat Jan 8 05:59:24 EET 2022

On 1/7/22 08:51, Andreas Rheinhardt wrote:
> Leo Izen:
>> +    int level;
>> +
>> +} JpegXLHeader;
> Why is this structure exported in the header when it is only used by the
> parser?

This was a relic from before I was told to remove the struct from the 
exported function signature. I'll move it over to the parser.

>
>> +#include <stdint.h>
> Unnecessary, as inttypes.h is guaranteed by the spec to include stdint.h.
Will remove.
> +#include <inttypes.h>
> +#include <stdlib.h>
> +
> +#ifndef BITSTREAM_READER_LE
> +#define BITSTREAM_READER_LE
> +#endif
>
> The check here is nonsense.
Will remove.
>> +#ifdef CACHED_BITSTREAM_READER
>> +#undef CACHED_BITSTREAM_READER
>> +#endif
>> +#define CACHED_BITSTREAM_READER 1
> Is there a reason you insist on the cached bitstream reader?
I was under the impression it was needed for get_bits64. Is this not the 
case?
>> +
>> +#include "libavutil/error.h"
>> +#include "libavutil/intreadwrite.h"
>> +#include "libavutil/mem.h"
>> +
>> +#include "codec_id.h"
>> +#include "config.h"
>> +#include "get_bits.h"
>> +#include "jpegxl.h"
>> +#include "parser.h"
>> +
>> +#if CONFIG_JPEGXL_PARSER
>> +
>> +typedef struct JpegXLParseContext {
>> +    ParseContext pc;
>> +    GetBitContext gb;
>> +    const uint8_t *buf;
>> +    size_t buflen;
>> +    size_t bits_read;
> What exactly is the point of duplicating the GetBitContext's internal
> status?
I wasn't sure if any of the code ever added to gb->buffer. If it 
doesn't, I can remove the duplication.
> +{
> +    memset(&jxlr->gb, 0, sizeof(GetBitContext));
> Unnecessary, as you reset gb below.
Will remove.
> +    memset(&jxlr->gb, 0, sizeof(GetBitContext));
> Unnecessary, as you are initializing gb below.
Will remove.
>
>> +    if ((jxlr->bits_read + 1) / 8 + 8 > jxlr->buflen - 1)
> This check looks weird. E.g. why (jxlr->bits_read + 1) / 8 and not
> (jxlr->bits_read + 7) / 8? And why the extra buffer size of eight bytes?
I'm pretty sure this is a relic from before I was using get_bits.h and 
simply called AV_RL64() on the buffer, and I wanted to make sure that 
would not overflow. I missed the change here, and I should fix this.
>
>> +        /* overflowing buffer */
>> +        return 0;
>> +    while (bits > 64) {
> I do not see anything where you would read more than 64 bits at a time.
> It would also make no sense (the format would be wasting bits in this case).
When parsing the ISOBMFF-like container, I call jxl_bits() to skip boxes 
I do not intend to parse after determining their size.
>> +}
>> +
>> +static uint32_t jpegxl_u32(JpegXLParseContext *jxlr,
>> +        uint32_t *constants, uint32_t *ubits)
> Both arrays can be made const. And actually, both arrays should be
> declared as const uint32_t [4].
Will change.
>
>> +    memcpy(&ret, &mantissa, sizeof(float));
> return av_int2float(mantissa);
Huh, so that's a thing. Will change.
>> +                    (uint32_t[]){0, 0, 0, 0}, (uint32_t[]){9, 13, 18, 30});
> h can be up to 2^30 here. This means that the multiplications in
> jpegxl_width_from_ratio() can overflow and need to be performed in 64bit
> (although the end result always fits into an uint32_t).
> (E.g. if h were 2^30, then jpegxl_width_from_ratio() will return 0 in
> cases 2, 3 and 5.)
Good catch, I missed this overflow issue. Will fix.
>> +    header = av_mallocz(sizeof(JpegXLHeader));
> This allocation is completely unnecessary, just put a JpegXLHeader on
> the stack in avpriv_jpegxl_verify_codestream_header() and jpegxl_parse().
The parse function requires the header to be zeroed out. I could stack 
allocate it and assign { 0 } to it. I'm pretty sure this was from when 
the header used to be exported, before I was told it should be hidden, 
making this now unnecessary, although it sifted through. Thanks, will 
change.
>> +        header->extensions = jpegxl_u64(jxlr);
>> +        if (header->extensions) {
>> +            header->extension_bits = av_calloc(64, sizeof(uint64_t));
> This array is write-only.
There's many fields I parse and assign to the header, but do not do 
anything with because the eventual goal was to integrate this into an 
internal decoder. If I simply skipped them, then they'd just need to be 
added later.
>
>> +            if (!header->extension_bits) {
>> +                av_log(avctx, AV_LOG_ERROR, "Could not allocate extension bit array");
>> +                status = AVERROR(ENOMEM);
>> +                goto fail;
>> +            }
>> +            for (int i = 0; i < 64; i++) {
>> +                if (header->extensions & (UINT64_C(1) << i))
>> +                    header->extension_bits[i] = jpegxl_u64(jxlr);
>> +            }
>> +        }
>> +
>> +    } else {
>> +        header->modular_16bit_buffers = 1;
>> +        header->xyb_encoded = 1;
>> +    }
>> +
>> +    header->default_transform = jxl_bits(1);
>> +
>> +    /* lazy && works with this macro */
>> +    if (!header->default_transform && header->xyb_encoded && !jxl_bits(1)) {
>> +        header->opsin_inverse_matrix = av_malloc_array(16, sizeof(float));
>> +        if (!header->opsin_inverse_matrix) {
>> +            av_log(avctx, AV_LOG_ERROR, "Could not allocate Opsin Inverse Matrix");
>> +            status = AVERROR(ENOMEM);
>> +            goto fail;
>> +        }
>> +        for (int i = 0; i < 16; i++) {
>> +            header->opsin_inverse_matrix[i] = jpegxl_f16(jxlr);
>> +        }
>> +    }
>> +
>> +    if (!header->default_transform) {
>> +        header->cw_mask = jxl_bits(3);
>> +    }
>> +
>> +    if (header->cw_mask & 1) {
>> +        header->up2_weight = av_malloc_array(15, sizeof(float));
>> +        if (!header->up2_weight) {
>> +            av_log(avctx, AV_LOG_ERROR, "Could not allocate up2_weight");
>> +            status = AVERROR(ENOMEM);
>> +            goto fail;
>> +        }
>> +        for (int i = 0; i < 15; i++) {
>> +            header->up2_weight[i] = jpegxl_f16(jxlr);
>> +        }
>> +    }
>> +    if (header->cw_mask & 2) {
>> +        header->up4_weight = av_malloc_array(55, sizeof(float));
>> +        if (!header->up4_weight) {
>> +            av_log(avctx, AV_LOG_ERROR, "Could not allocate up4_weight");
>> +            status = AVERROR(ENOMEM);
>> +            goto fail;
>> +        }
>> +        for (int i = 0; i < 55; i++) {
>> +            header->up4_weight[i] = jpegxl_f16(jxlr);
>> +        }
>> +    }
>> +    if (header->cw_mask & 4) {
>> +        header->up8_weight = av_malloc_array(210, sizeof(float));
> These arrays are never used at all, so you can just avoid the
> allocation. In fact I wonder whether it is necessary to parse this stuff
> here at all (AFAIK no other parser does such deep parsing).
The parser parses pretty deeply in order to check for validity. Without 
very deep parsing, it fails the probetest fuzzer because it gives too 
many false positives. The codestream format is very permissive in this 
regard.
>> +    if (status >= 0 && (jxlr->bits_read + 1) / 8 - 1 > jxlr->buflen)
>> +        return FFMIN(jxlr->bits_read, INT_MAX);
>> +    if (status > 0)
>> +        return -status;
>> +    return status || -1;
> This is equivalent to "return 1;"
The goal was to use the return value to distinguish between a parser 
error caused by an unexpected end of codestream (by how many bytes) 
verses a particular error. I don't believe I ended up using it anywhere, 
so I suppose I could change this to "return 1;"
>> +}
>> +
>> +static int jpegxl_parse_header(void *avctx, JpegXLParseContext *jxlr, JpegXLHeader **headerp, int level)
>> +{
>> +    uint64_t sig = jxl_bits(64);
>> +    jpegxl_reset_pc(jxlr);
>> +    if (sig == FF_JPEGXL_CONTAINER_SIGNATURE_LE) {
>> +        for (;;) {
>> +            uint64_t size = 0;
>> +            uint32_t tag = 0;
>> +            for (int k = 0; k < 4; k++)
>> +                size = (size << 8) | jxl_bits(8);
>> +            for (int k = 0; k < 4; k++)
>> +                tag = (tag << 8) | jxl_bits(8);
>> +            if (tag == MKBETAG('j','x','l','p')) {
>> +                jxl_bits(32);
>> +                break;
>> +            }
>> +            if (tag == MKBETAG('j','x','l','c'))
>> +                break;
>> +            if (size == 1) {
>> +                size = 0;
>> +                for (int k = 0; k < 8; k++)
>> +                    size = (size << 8) | jxl_bits(8);
>> +                if (size > INT_MAX)
>> +                    break;
>> +                size -= 8;
>> +            }
>> +            if (jxlr->bits_read / 8 > jxlr->buflen)
>> +                break;
> This code looks weird: You read the size of an isobmff-style box, but
> then you actually ignore the size.
> (Apart from that: You are always byte-aligned here, so you could just as
> well read the stuff bytewise and skip the corresponding number of bits
> lateron.)
>
Uh, there's supposed to be a jxl_bits(size * 8) that somehow fell 
through the cracks (with appropriate overflow checking). I'm not sure 
why it still works. Perhaps I got lucky on my samples. See earlier 
comment about jxl_bits(n) for n > 64.
>> +    if (buf_size == 0 || s1->flags & PARSER_FLAG_COMPLETE_FRAMES) {
>> +        /* eof is a frame boundary */
>> +        next = buf_size;
>> +    } else if (!jxlr->pc.frame_start_found) {
>> +        /* look for stream signature */
>> +        uint64_t state64 = jxlr->pc.state64;
>> +        for (; i < buf_size; i++) {
>> +            state64 = (state64 << 8) | buf[i];
>> +            if ((state64 & 0xFFFF) == FF_JPEGXL_CODESTREAM_SIGNATURE_BE
>> +                    || state64 == FF_JPEGXL_CONTAINER_SIGNATURE_BE) {
>> +                jxlr->pc.frame_start_found = 1;
>> +                break;
>> +            }
>> +        }
>> +        jxlr->pc.state64 = state64;
> Did you test this part? Did it work? I am wondering because you are
> actually supposed to indicate how many bytes of input you have consumed,
> yet you always indicate END_NOT_FOUND.
I did, and it did, although I suppose I did not test against things like 
junk in front of a valid file.
>
>> +    }
>> +
>> +    if (jxlr->pc.frame_start_found && s1->pict_type == AV_PICTURE_TYPE_NONE) {
>> +        jpegxl_init_pc(jxlr, buf, buf_size);
>> +        status = jpegxl_parse_header(NULL, jxlr, &header, 5);
> Why don't you forward the logcontext?
Will do. Also, the init line should read buf + i, and buf_size - i, per 
earlier comment.

>> +const AVCodecParser ff_jpegxl_parser = {
>> +    .codec_ids      = { AV_CODEC_ID_JPEGXL },
>> +    .priv_data_size = sizeof(JpegXLParseContext),
>> +    .parser_init    = jpegxl_parse_init,
>> +    .parser_parse   = jpegxl_parse,
>> +    .parser_close   = ff_parse_close,
>> +};
>> +
>> +#else /* CONFIG_JPEGXL_PARSER */
>> +
>> +int avpriv_jpegxl_verify_codestream_header(void *avctx, uint8_t *buf, size_t buflen, int level)
>> +{
> We don't provide fallback-stubs for avpriv functions; instead we just
> add configure/Makefile depencies.
I'm not sure how to do that for this specific task, so I'll ask on IRC.

Also, thanks for taking the time to do a thorough code review, I very 
much appreciate it.

-Leo Izen