[Libav-user] About audio decoding

Sat Aug 10 17:40:47 CEST 2013

On 10 August 2013 at 16:32, Nicolas George <nicolas.george at normalesup.org> wrote:

> On Tridi, 23 Thermidor, year CCXXI, Lucas Soltic wrote:
>> Hello,
>> However, there is one part of the audio decode example that I do not
>> understand:
>> if (avpkt.size < AUDIO_REFILL_THRESH) {
>>    /* Refill the input buffer, to avoid trying to decode
>>     * incomplete frames. Instead of this, one could also use
>>     * a parser, or use a proper container format through
>>     * libavformat. */
>>    memmove(inbuf, avpkt.data, avpkt.size);
>>    avpkt.data = inbuf;
>>    len = fread(avpkt.data + avpkt.size, 1,
>>                AUDIO_INBUF_SIZE - avpkt.size, f);
>>    if (len > 0)
>>        avpkt.size += len;
>> }
>> 
>> Does that mean that AVPacket can contain incomplete frames? I would have
>> expected an AVPacket to contain one or several frames, but not incomplete
>> ones…
> 
> It depends on what exactly you mean by "can".
> 
> If you read a file using libavformat, the packets it returns are supposed to
> always contain exactly one frame, and never incomplete ones. In that sense,
> a packet cannot contain incomplete frames.
> 
> But if you give a packet with incomplete frames to libavcodec, it will
> decode what can be decoded, return the number of bytes consumed and properly
> leave the incomplete frame for later. In that sense, a packet can contain
> incomplete frames.
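
The "consume what you can, keep the rest for later" behaviour described above can be sketched without libavcodec at all. The following is only a toy model under stated assumptions: `toy_decode`, `toy_count_frames`, `TOY_FRAME_SIZE` and `TOY_INBUF_SIZE` are hypothetical names invented for illustration, frames are assumed to have a fixed length, and the input comes from memory rather than from `fread()`. It mirrors the refill logic of the quoted example, not the actual decoder API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FRAME_SIZE 4   /* hypothetical fixed frame length, in bytes */
#define TOY_INBUF_SIZE 10  /* deliberately not a multiple of the frame size */

/* Toy stand-in for a decoder (NOT a libavcodec call): consumes one whole
 * frame if one is available and returns the number of bytes consumed,
 * or 0 when only an incomplete frame is left. */
static size_t toy_decode(const uint8_t *data, size_t size)
{
    (void)data;
    return size >= TOY_FRAME_SIZE ? TOY_FRAME_SIZE : 0;
}

/* Feed src through a small input buffer and count the decoded frames. */
static size_t toy_count_frames(const uint8_t *src, size_t src_len)
{
    uint8_t inbuf[TOY_INBUF_SIZE];
    uint8_t *data = inbuf;   /* plays the role of avpkt.data */
    size_t size = 0;         /* plays the role of avpkt.size */
    size_t src_pos = 0, frames = 0;

    assert(src != NULL);
    for (;;) {
        if (size < TOY_FRAME_SIZE) {
            /* Move the leftover bytes to the start of the buffer and
             * refill behind them, as the quoted example does. */
            size_t room, avail, n;
            memmove(inbuf, data, size);
            data  = inbuf;
            room  = TOY_INBUF_SIZE - size;
            avail = src_len - src_pos;
            n     = avail < room ? avail : room;
            memcpy(inbuf + size, src + src_pos, n);
            src_pos += n;
            size    += n;
        }
        size_t used = toy_decode(data, size);
        if (used == 0)
            break;           /* end of input, or a trailing partial frame */
        data   += used;      /* skip the bytes the decoder consumed */
        size   -= used;
        frames++;
    }
    return frames;
}
```

Because the buffer size is not a multiple of the frame size, every refill leaves a partial frame at the end of the buffer, which is exactly the situation the memmove() in the example handles.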

So if all of my packets are created through av_malloc(sizeof(AVPacket)) + av_init_packet() + av_read_frame(), will I always get exactly one frame (valid or not) per packet? If so, I don't have to worry about the piece of code above. Am I right?

By the way, I just noticed that the documentation of av_read_frame() states the following: "It will split what is stored in the file into frames and return one for each call."

So it seems like I'm right...

>> I also do not understand what the 'parser' and 'proper container format'
>> solutions would be.
> 
> Look at what the code in the example does: it uses stdio to read directly
> from an MP2 file. A proper container format is a file format designed to
> store various streams in various codecs, such as Matroska, Ogg, NUT, MP4,
> AVI, etc. The format has generic data structures to delimit packets and
> store timestamps. That is the best way of storing audio-video content.
> 
> Some formats are just the packets concatenated together, without any header
> or delimiter. It is called an elementary stream, and does not work with all
> codecs. For example, it works with most MPEG codecs, but usually not with
> the Vorbis codecs. A parser is a stripped-down decoder that is used to find
> packet boundaries and various additional information in an elementary
> stream; libavformat uses them for you whenever necessary.
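
To make the idea of a parser concrete, here is a minimal sketch that finds packet boundaries in an MPEG-1 Layer II (MP2) elementary stream by reading frame headers only, without decoding any audio. It is not libavcodec's parser: `mp2_frame_size` and `split_packets` are hypothetical names, free-format bitrates and other MPEG versions or layers are simply rejected, and no CRC or consistency checking is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the length in bytes of the MPEG-1 Layer II frame whose header
 * starts at p, or 0 if p does not point at a supported header. */
static size_t mp2_frame_size(const uint8_t *p, size_t len)
{
    static const int bitrate_kbps[16] = {
        0, 32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 0
    };
    static const int sample_rate[4] = { 44100, 48000, 32000, 0 };
    int br, sr, padding;

    if (len < 4)
        return 0;
    /* 11 sync bits, version = MPEG-1, layer = II; protection bit ignored. */
    if (p[0] != 0xFF || (p[1] & 0xFE) != 0xFC)
        return 0;
    br      = bitrate_kbps[p[2] >> 4];
    sr      = sample_rate[(p[2] >> 2) & 3];
    padding = (p[2] >> 1) & 1;
    if (br == 0 || sr == 0)
        return 0;            /* free-format or reserved values: unsupported */
    return (size_t)(144 * br * 1000 / sr + padding);
}

/* Count the complete packets in buf. *consumed receives the number of
 * bytes covered; anything after that is an incomplete frame (or junk)
 * that the caller should keep and retry once more data has arrived. */
static size_t split_packets(const uint8_t *buf, size_t len, size_t *consumed)
{
    size_t pos = 0, packets = 0;

    assert(buf != NULL && consumed != NULL);
    while (len - pos >= 4) {
        size_t fs = mp2_frame_size(buf + pos, len - pos);
        if (fs == 0) {       /* not a header: resynchronise byte by byte */
            pos++;
            continue;
        }
        if (fs > len - pos)  /* header found, but the frame is incomplete */
            break;
        packets++;
        pos += fs;
    }
    *consumed = pos;
    return packets;
}
```

A caller would hand each complete span to the decoder as one packet and carry the unconsumed tail over to the next read. A container format makes this header inspection unnecessary because the packet sizes are stored explicitly.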

Hmm ok I understand now :), many thanks for the explanation!

> 
> Regards,
> 
> -- 
>  Nicolas George