[FFmpeg-devel] [PATCH] [RFC] GSoC: FLIF16 Image format parser

Thu Feb 27 13:15:04 EET 2020

Hi Anamitra,

On Wed, Feb 26, 2020 at 12:26:37PM +0530, Anamitra Ghorui wrote:
>This is a buildable "skeleton" of my component (the FLIF16 parser)
>i.e. everything is present aside from the logic itself.
>
>***
>
>Hello, I am trying to implement a parser for the FLIF16 file format as
>a GSoC 2020 qualification project. So far I think I have managed to
>register the parser (along with the format) and the basic structure
>of the parser code.
>
>I have now reached a point where moving forward is going to be quite
>difficult without outside help and references, and so I have a number
>of questions regarding the conceptual understanding of FFmpeg:
>
>a. Please tell me if I am right or wrong here:
>1. Each audio/video/image file format has a parser for converting the
>   file data into a format that can be understood by a decoder.

Yes

>
>2. A Decoder converts a given, recognised encoded data stream into a
>   form that can be processed by physical hardware.

Yes. To be exact, a decoder turns encoded data packets into raw frames or
samples, which can then be displayed/played or re-encoded with another codec.

>
>3. File formats can be independent of what sort of encoding it uses.
>   Eg: WebM

Yes, a single container format can support different codecs.

>
>4. The general Audio parsing/decoding process is as follows:
>     i. Allocate space for a packet of data
>    ii. Try to find a hit for the codec of the given data format
>   iii. Now, with the codec id, attempt to init a parser
>    iv. Allocate a context for the codec
>     v. Initialize the codec context
>    vi. Initialize the codec
>   vii. Allocate space for frame data
>  viii. Open the input file
>    ix. While file pointer isn't EOF:
>            Read data into buffer
>            Parse data into a single frame
>            Decode the data
>     x. Flush the file and free stuff.

Yes. There may also be some form of probing taking place, i.e. checking the
first few packets to find out which file format and codec are used.
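
To make the probing idea concrete, here is a minimal, self-contained sketch of
a signature check. It assumes only that a FLIF file begins with the four ASCII
bytes "FLIF". The function name flif16_probe_magic is hypothetical and is not
part of FFmpeg; a real probe would live in libavformat and return a score.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg function): returns 1 if the buffer
 * starts with the 4-byte signature "FLIF", 0 otherwise. */
static int flif16_probe_magic(const uint8_t *buf, size_t size)
{
    static const uint8_t magic[4] = { 'F', 'L', 'I', 'F' };

    if (size < sizeof(magic))
        return 0;                       /* not enough data to decide */
    return memcmp(buf, magic, sizeof(magic)) == 0;
}
```

Checking only the signature is cheap but weak; a stricter probe could also
validate the bytes that follow it.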

>
>5. Every parser has its own parser context extended from the default parser
>   context. The byte offsets/positions in the file are kept by the parser
>   context.
>
>6. An image can be thought of as a video with a single frame

For some purposes this high-level view works, but many image formats, such as
GIF and FLIF itself, also support multiple frames and animation.

>
>b. In libavcodec/parser.h:
>
>    typedef struct ParseContext{
>        ...
>        int frame_start_found;
>        ...
>    } ParseContext;
>
>Is frame_start_found the determined position of the start of the frame
>in the data stream?
>
>
>c. I have been looking at the decoder/encoder/parser of the BMP format
>   (which is one of the simplest image formats), the actual decoding work
>   (according to me), i.e. Finding the magic numbers, seeing the various
>   segments is being done by the decoder function and not the parser.
>
>   The parser function from what I can see from the png_parser and
>   bmp_parser, simply manipulates the ParseContext for appropriate
>   values, and does not much else. What is it exactly doing over here?

You are correct. A parser is usually used for video formats, to find and
iterate over the encoded packets/frames in a bitstream. The main decoding work,
including filling in the contexts for a particular packet, is usually done
within the decoder module.
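
As a rough illustration of what such a parser does, the sketch below scans
incoming buffers for a 4-byte signature and reports where it ends, keeping its
state between calls so that a signature split across two buffers is still
found. This is the general pattern only, using "FLIF" as the marker of the next
image in a concatenated stream; it says nothing about how frames are laid out
inside a FLIF file. All names are hypothetical, and SKETCH_END_NOT_FOUND merely
stands in for the END_NOT_FOUND value used in the patch.

```c
#include <stdint.h>

#define SKETCH_END_NOT_FOUND (-100)     /* stand-in for END_NOT_FOUND */
#define SKETCH_MAGIC 0x464C4946u        /* 'F' 'L' 'I' 'F', big-endian */

typedef struct SketchParseState {
    uint32_t state;                     /* last four bytes seen so far */
} SketchParseState;

/* Returns the index in buf of the byte right after the signature, or
 * SKETCH_END_NOT_FOUND if this buffer does not complete a signature. */
static int sketch_find_marker(SketchParseState *ps,
                              const uint8_t *buf, int buf_size)
{
    for (int i = 0; i < buf_size; i++) {
        /* Shift the new byte into the rolling 32-bit window. */
        ps->state = (ps->state << 8) | buf[i];
        if (ps->state == SKETCH_MAGIC) {
            ps->state = 0;              /* start fresh for the next search */
            return i + 1;
        }
    }
    return SKETCH_END_NOT_FOUND;
}
```

The caller keeps one SketchParseState per stream, zero-initialised, and feeds
each incoming buffer to sketch_find_marker in order.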

FLIF does have multiple frames, so having a parser is a good idea. You may
instead choose to read the other header information into the decoder context;
that is up to you, whichever you find better.
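
A sketch of reading the main header fields into a context might look like the
following. The layout used here is an assumption based on my reading of the
FLIF16 specification and should be checked against it: the "FLIF" signature,
one byte whose high nibble gives the image kind (3/4 still, 5/6 animated, even
values interlaced) and whose low nibble gives the channel count, one
bytes-per-channel character, then big-endian base-128 varints for width - 1,
height - 1 and, for animations, frame count - 2. All struct and function names
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct SketchFLIFHeader {
    int      animated, interlaced, channels;
    uint32_t width, height, nb_frames;
} SketchFLIFHeader;

/* Reads one varint (7 bits per byte, high bit set = more bytes follow).
 * Returns 0 on success, -1 on truncated or oversized input. */
static int sketch_read_varint(const uint8_t *buf, size_t size,
                              size_t *pos, uint32_t *out)
{
    uint64_t v = 0;

    while (*pos < size) {
        uint8_t b = buf[(*pos)++];
        v = (v << 7) | (b & 0x7F);
        if (v > 0x7FFFFFFF)
            return -1;                  /* refuse implausibly large values */
        if (!(b & 0x80)) {
            *out = (uint32_t)v;
            return 0;
        }
    }
    return -1;                          /* ran out of data mid-varint */
}

/* Fills *h from the start of a FLIF file; returns 0 on success, -1 on error. */
static int sketch_parse_header(const uint8_t *buf, size_t size,
                               SketchFLIFHeader *h)
{
    size_t pos = 6;                     /* signature + 2 format bytes */
    uint32_t v;
    int kind;

    if (size < 6 || memcmp(buf, "FLIF", 4) != 0)
        return -1;
    kind = buf[4] >> 4;
    if (kind < 3 || kind > 6)
        return -1;
    h->interlaced = (kind == 4 || kind == 6);
    h->animated   = (kind >= 5);
    h->channels   = buf[4] & 0x0F;
    /* buf[5] is the bytes-per-channel character; ignored in this sketch. */

    if (sketch_read_varint(buf, size, &pos, &v) < 0)
        return -1;
    h->width = v + 1;
    if (sketch_read_varint(buf, size, &pos, &v) < 0)
        return -1;
    h->height = v + 1;
    h->nb_frames = 1;
    if (h->animated) {
        if (sketch_read_varint(buf, size, &pos, &v) < 0)
            return -1;
        h->nb_frames = v + 2;
    }
    return 0;
}
```

Whether these fields are filled by the parser or by the decoder's init code is
the design choice mentioned above; the reading logic is the same either way.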

>
>If there are any books or articles I should read, please tell me.
>---
> libavcodec/Makefile        |  1 +
> libavcodec/avcodec.h       |  1 +
> libavcodec/flif16_parser.c | 51 ++++++++++++++++++++++++++++++++++++++
> libavcodec/parsers.c       |  1 +
> libavformat/img2.c         |  1 +
> 5 files changed, 55 insertions(+)
> create mode 100644 libavcodec/flif16_parser.c
>
>diff --git a/libavcodec/Makefile b/libavcodec/Makefile
>index 1e894c8049..ce18632d2c 100644
>--- a/libavcodec/Makefile
>+++ b/libavcodec/Makefile
>@@ -1045,6 +1045,7 @@ OBJS-$(CONFIG_DVD_NAV_PARSER)          += dvd_nav_parser.o
> OBJS-$(CONFIG_DVDSUB_PARSER)           += dvdsub_parser.o
> OBJS-$(CONFIG_FLAC_PARSER)             += flac_parser.o flacdata.o flac.o \
>                                           vorbis_data.o
>+OBJS-$(CONFIG_FLIF16_PARSER)           += flif16_parser.o
> OBJS-$(CONFIG_G723_1_PARSER)           += g723_1_parser.o
> OBJS-$(CONFIG_G729_PARSER)             += g729_parser.o
> OBJS-$(CONFIG_GIF_PARSER)              += gif_parser.o
>diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>index 978f36d12a..c6b8c6a1eb 100644
>--- a/libavcodec/avcodec.h
>+++ b/libavcodec/avcodec.h
>@@ -461,6 +461,7 @@ enum AVCodecID {
>     AV_CODEC_ID_MVDV,
>     AV_CODEC_ID_MVHA,
>     AV_CODEC_ID_CDTOONS,
>+    AV_CODEC_ID_FLIF16,
>
>     /* various PCM "codecs" */
>     AV_CODEC_ID_FIRST_AUDIO = 0x10000,     ///< A dummy id pointing at the start of audio codecs
>diff --git a/libavcodec/flif16_parser.c b/libavcodec/flif16_parser.c
>new file mode 100644
>index 0000000000..54bd93d499
>--- /dev/null
>+++ b/libavcodec/flif16_parser.c
>@@ -0,0 +1,51 @@
>+/*
>+ * FLIF16 parser
>+ * Copyright (c) 2020 Anamitra Ghorui
>+ *
>+ * This file is part of FFmpeg.
>+ *
>+ * FFmpeg is free software; you can redistribute it and/or
>+ * modify it under the terms of the GNU Lesser General Public
>+ * License as published by the Free Software Foundation; either
>+ * version 2.1 of the License, or (at your option) any later version.
>+ *
>+ * FFmpeg is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>+ * Lesser General Public License for more details.
>+ *
>+ * You should have received a copy of the GNU Lesser General Public
>+ * License along with FFmpeg; if not, write to the Free Software
>+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
>+ */
>+
>+ /**
>+  * @file
>+  * FLIF16 parser
>+  */
>+
>+#include "parser.h"
>+#include <stdio.h>
>+
>+typedef struct FLIF16ParseContext {
>+    ParseContext pc;
>+
>+} FLIF16ParseContext;
>+
>+static int flif16_parse(AVCodecParserContext *s, AVCodecContext *avctx,
>+                     const uint8_t **poutbuf, int *poutbuf_size,
>+                     const uint8_t *buf, int buf_size)
>+{
>+    FLIF16ParseContext *fpc = s->priv_data;
>+    int next = END_NOT_FOUND;
>+
>+    return next;
>+}
>+
>+AVCodecParser ff_flif16_parser = {
>+    .codec_ids      = { AV_CODEC_ID_FLIF16 },
>+    .priv_data_size = sizeof(FLIF16ParseContext),
>+    .parser_parse   = flif16_parse,
>+    .parser_close   = ff_parse_close,
>+};
>+
>diff --git a/libavcodec/parsers.c b/libavcodec/parsers.c
>index 33a71de8a0..8b6eb954b3 100644
>--- a/libavcodec/parsers.c
>+++ b/libavcodec/parsers.c
>@@ -40,6 +40,7 @@ extern AVCodecParser ff_dvbsub_parser;
> extern AVCodecParser ff_dvdsub_parser;
> extern AVCodecParser ff_dvd_nav_parser;
> extern AVCodecParser ff_flac_parser;
>+extern AVCodecParser ff_flif16_parser;
> extern AVCodecParser ff_g723_1_parser;
> extern AVCodecParser ff_g729_parser;
> extern AVCodecParser ff_gif_parser;
>diff --git a/libavformat/img2.c b/libavformat/img2.c
>index 16bc9d2abd..14c11d0c82 100644
>--- a/libavformat/img2.c
>+++ b/libavformat/img2.c
>@@ -81,6 +81,7 @@ const IdStrMap ff_img_tags[] = {
>     { AV_CODEC_ID_XPM,        "xpm"      },
>     { AV_CODEC_ID_XFACE,      "xface"    },
>     { AV_CODEC_ID_XWD,        "xwd"      },
>+    { AV_CODEC_ID_FLIF16,     "flif16"   },
>     { AV_CODEC_ID_NONE,       NULL       }
> };
>
>-- 
>2.17.1
>

Looks good to me. Try to parse an animated FLIF file and see if you can find
the right frame boundaries. Then move on to reading the other parameters from
the bitstream headers into a context.

Cheers!

--
Jai (darkapex)