[FFmpeg-devel] [PATCH V2] [RFC] GSoC: FLIF16 Image format parser

Mon Mar 2 21:58:57 EET 2020

Anamitra Ghorui (12020-03-02):
> According to the specification of the file format, there is no mention of an
> upper bound for the integer: https://flif.info/spec.html#_part_1_main_header

But they are mapped to FFmpeg values: width and height are ints, frame
counts are uint64_t at best. Therefore, I think you should drop the
varint_t structure entirely, and just read varints into uint64_t,
returning an error (INVALIDDATA) if it exceeds.

> >> + * magnitude of the varint, similar to UTF-8 encoding.
> > This is not how UTF-8 works.
> I was talking about it in the context of the 'prefix coding' it uses. UTF-8
> uses the MSB bits to indicate how many successive bits aside from the starting
> bit are included in a character. The varint does a similar thing in the manner
> that it also uses an MSB bit to indicate that a successive bit is included in
> the number. Hovever aside from this superficial similarity there is not much
> in common.

I'd prefer if there is no mention of UTF-8 rather than a misleading one.
Actually, this particular coding of integers seems quite common, it must
have a name.

>     AVPacket *pkt = av_mallocz(sizeof(AVPacket));
>     if (!pkt)
>         return pkt;
> 
> Is there a special reason for this?

I don't think so.

> I see. the stucture should therefore be defined with buf after buf_size
> (hence putting pad + 1 bytes in contiguous storage but there may be a problem
> with the byte padding of the struct), or should buf point to v + sizeof(*v)?
> (* See follow-up content below)

See below.

> I have tested it separately. In the given code, the function will set the value 
> to { 128, 0, 0 }, which is an incorrect representation. But for the puropses of 
> the parser and the decoder where we are only concerned with bound checking with 
> an existing number, such a situation will never occur. However I can see that 
> there will be issues when the encoder will be written since then we will have 
> to make the varint and write the varint to a file.

Either fix the function to make it work correctly, or document that it
has an unusual behavior.

> >An assert cannot be used to check for a memory allocation failure.
> I think I have to use av_log then.

No, not av_log() either, because it does not prevent the code from
continuing and dereferencing NULL (the asserts do, unless they are
disabled at build time). You need to add an error return, and return
AVERROR(ENOMEM).

> Should I double the size everytime v->buf_size becomes
> equal to the actual number of byte-segments of the varint?

Double the size every time you need to increase it, every time what you
need to put in the buffer does not fit.

> If I add it to the end of the struct, the compiler will likely add padding
> bits to it, but I don't think its a problem since the padding bits will be set
> to zero anyway and therefore can be used in the buffer.

Exactly. You could even make a sizeof() expression to avoid allocating
bytes if there is already room enough in the padding. It is quite a
common pattern. Modern C even allows 0-sized arrays at the end of
structs to make it more elegant, but we don't use it in FFmpeg.

Note: all these considerations about allocation, incrementation, etc.,
would be moot if you decide, as I think you should, to just drop
varint_t and use a large enough integer instead.

> FLIF16Varint should be fine then?

I think so; as Anton points, the FF prefix is not mandatory for
structures, only the ff_ prefix for functions.

> >> +typedef enum FLIF16ParseStates {
> >> +    FLIF16_HEADER = 1,
> >> +    FLIF16_METADATA,
> >> +    FLIF16_BITSTREAM,
> >> +    // FLIF16_TRANSFORM,
> >> +    FLIF16_PIXELDATA,
> >> +    FLIF16_CHECKSUM,
> >> +    FLIF16_VARINT
> >> +} flif16_states;
> > We usually name the enum and the corresponding types the same way.
> I had modelled most of the code after the GIF parser, and it names the enum
> values in the same manner:
> https://ffmpeg.org/doxygen/trunk/gif__parser_8c_source.html
> I will probably prefix them with 'FLIF16_STATE_' then.

The name of the values are fine. I meant the name of the enum itself,
FLIF16ParseStates compared to the name of the type, flif16_states.

By the way, the name of enums is usually singular.

Regards,

-- 
  Nicolas George
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20200302/50bf1dec/attachment.sig>