[FFmpeg-devel] Question regarding ogg cover art implementation

Jesse Obic jesseaobic at gmail.com
Mon Aug 23 02:49:04 EEST 2021


Hi,

First time using a mailing list and (properly) working with C, so please 
forgive me if I've done something wrong.

I'm looking at implementing https://trac.ffmpeg.org/ticket/4448, and I 
understand what I need to do, but I'm on the fence about how to go 
about it.

For the Ogg container, cover art is embedded by adding a 
METADATA_BLOCK_PICTURE tag to the Vorbis comment section. The value of 
this tag is a base64 encoding of the FLAC picture metadata block, which 
contains the image data itself.
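To make that concrete, here is a rough, untested sketch of how that tag 
value could be built with the existing avio/base64 helpers (the function 
name and the image-field parameters are just placeholders I made up):

#include <string.h>
#include "libavformat/avio.h"   /* avio_open_dyn_buf(), avio_wb32(), ... */
#include "libavutil/base64.h"   /* av_base64_encode(), AV_BASE64_SIZE() */
#include "libavutil/mem.h"

/* Serialize one FLAC picture block and base64-encode it.
 * Returns the METADATA_BLOCK_PICTURE tag value (caller frees it). */
static char *make_metadata_block_picture(const uint8_t *img, int img_size,
                                         const char *mime, const char *desc,
                                         int w, int h, int depth)
{
    AVIOContext *pb;
    uint8_t *blk;
    int blk_size;
    char *b64;

    if (avio_open_dyn_buf(&pb) < 0)
        return NULL;

    avio_wb32(pb, 3);                           /* picture type: front cover */
    avio_wb32(pb, strlen(mime));
    avio_write(pb, (const uint8_t *)mime, strlen(mime));
    avio_wb32(pb, strlen(desc));
    avio_write(pb, (const uint8_t *)desc, strlen(desc));
    avio_wb32(pb, w);                           /* width */
    avio_wb32(pb, h);                           /* height */
    avio_wb32(pb, depth);                       /* bits per pixel */
    avio_wb32(pb, 0);                           /* indexed colours, 0 = n/a */
    avio_wb32(pb, img_size);
    avio_write(pb, img, img_size);

    blk_size = avio_close_dyn_buf(pb, &blk);

    b64 = av_malloc(AV_BASE64_SIZE(blk_size));  /* the ~1.33x allocation */
    if (b64)
        av_base64_encode(b64, AV_BASE64_SIZE(blk_size), blk, blk_size);
    av_free(blk);
    return b64;
}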

I looked at how Vorbis comments are currently written, and it led me to 
`ff_vorbiscomment_write`. It takes an AVDictionary for metadata and an 
array of AVChapters for chapter writing. This is where I get a bit 
concerned.

I see two options as to how I could handle this:


1. Place the METADATA_BLOCK_PICTURE tag in the metadata AVDictionary.

This would be the easiest option functionality-wise, but my gut says it's 
a bad idea: the dictionary would be holding a value roughly 1.33 times the 
size of the image(s) being embedded, because of the base64 overhead. For 
example, embedding a 12 MB image would require a ~16 MB allocation, which 
doesn't feel right.

The specification allows multiple embedded images by repeating the same 
METADATA_BLOCK_PICTURE tag, which would require me to use AV_DICT_MULTIKEY 
when adding values to the dictionary. I couldn't find any examples of it 
being used in the source code for either writing or reading (which I will 
also have to handle in oggdec.c), so I'm not sure whether other parts of 
the codebase handle it well. This is probably more paranoia than anything.
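For reference, a minimal, untested sketch of what I have in mind for both 
sides (the helper names are placeholders, and `b64` is the value built by 
the sketch above):

#include "libavutil/dict.h"

/* Muxer side: one tag per embedded picture, so the key is repeated. */
static int add_picture_tag(AVDictionary **metadata, char *b64)
{
    /* b64 was allocated with av_malloc(), so ownership can be handed over */
    return av_dict_set(metadata, "METADATA_BLOCK_PICTURE", b64,
                       AV_DICT_MULTIKEY | AV_DICT_DONT_STRDUP_VAL);
}

/* Demuxer side: walk every entry that shares the key. */
static void read_picture_tags(const AVDictionary *metadata)
{
    const AVDictionaryEntry *e = NULL;
    while ((e = av_dict_get(metadata, "METADATA_BLOCK_PICTURE", e, 0))) {
        /* av_base64_decode() e->value, parse the FLAC picture block and
           expose it as an AV_DISPOSITION_ATTACHED_PIC stream */
    }
}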


2. Change ff_vorbiscomment_write to take an array of AVStream / AVPacket 
and write the picture data directly to the output IO stream

As far as I know this would classify as an API/ABI change. However, it 
would mean I don't have to allocate such a huge buffer upfront, and can 
encode smaller chunks at a time as I write them to the output stream.

To mitigate that, I could add a new function with the additional 
parameters and have the old function forward to the new one. I'm not 
entirely sure how APIs / ABIs are exposed and consumed in C, so I don't 
know whether that would actually avoid the problem.
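To illustrate what I mean by encoding smaller chunks, something along 
these lines (untested sketch, the function name is made up):

#include <string.h>
#include "libavformat/avio.h"
#include "libavutil/base64.h"
#include "libavutil/common.h"   /* FFMIN */

/* Stream the base64 of a large buffer to the muxer's AVIOContext in
 * fixed-size pieces instead of one big allocation. Each input piece is
 * a multiple of 3 bytes, so no '=' padding appears mid-stream. */
static void write_base64_chunked(AVIOContext *pb, const uint8_t *buf, int size)
{
    char out[AV_BASE64_SIZE(3 * 1024)];   /* ~4 KiB of output at a time */

    while (size > 0) {
        int in_len = FFMIN(size, 3 * 1024);
        av_base64_encode(out, sizeof(out), buf, in_len);
        avio_write(pb, (const uint8_t *)out, strlen(out));
        buf  += in_len;
        size -= in_len;
    }
}

One wrinkle: each Vorbis comment is prefixed by its length, but the base64 
length is computable up front (4 * ceil(n / 3) for n input bytes), so it 
could be written before streaming the chunks.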


I'm interested in what people think about how I should tackle this 
issue. It was annoying enough that I downloaded the source code and set 
up a compilation environment, so I'd like to get working on this.


