[FFmpeg-devel] [PATCH+RFC] AVFrame for audio

Måns Rullgård mans
Sun Jul 11 12:12:56 CEST 2010


Peter Ross <pross at xvid.org> writes:

> Hi,
>
> To prototype use of audio in AVFrame, I have modified the PCM encoder/decoder
> and addded new public lavc functions.

Nice work.

> Existing SAMPLE_FMTs are mapping onto (AVFrame*)->data[0]. The obvious
> next step is to support planar audio, and splitting FF_COMMON_FRAME into
> common, audio- and video-specific parts.
>
> avctx->reget_audio_buffer() is implemened using a simple realloc. It

realloc() doesn't maintain 16-byte alignment.  You can't use it for
these buffers.

> probably makes sense to reuse the video InternalBuffer stuff, but i'm not
> sure.
>
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 05c96e8..5e050a9 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -931,6 +931,13 @@ typedef struct AVPanScan{
>       * - decoding: Set by libavcodec\
>       */\
>      void *hwaccel_picture_private;\
> +\
> +    /**\
> +     * number of audio samples (per channel) described by this frame\
> +     * - encoding: Set by user.\
> +     * - decoding: Set by libavcodec.\
> +     */\
> +    int nb_samples;\
>  
>  
>  #define FF_QSCALE_TYPE_MPEG1 0
> @@ -942,6 +949,7 @@ typedef struct AVPanScan{
>  #define FF_BUFFER_TYPE_USER     2 ///< direct rendering buffers (image is (de)allocated by user)
>  #define FF_BUFFER_TYPE_SHARED   4 ///< Buffer from somewhere else; don't deallocate image (data/base), all other tables are not shared.
>  #define FF_BUFFER_TYPE_COPY     8 ///< Just a (modified) copy of some other buffer, don't deallocate anything.
> +#define FF_BUFFER_TYPE_SIMPLE   16  ///< data[0] elements have been allocated using av_malloc()

What is this needed for?  Are you planning on adding more types?

> @@ -3326,6 +3341,7 @@ AVFrame *avcodec_alloc_frame(void);
>  int avcodec_default_get_buffer(AVCodecContext *s, AVFrame *pic);
>  void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic);
>  int avcodec_default_reget_buffer(AVCodecContext *s, AVFrame *pic);
> +int avcodec_default_reget_audio_buffer(AVCodecContext *s, AVFrame *frame, int nb_samples);

I've always wondered why those functions are exposed here.  I can't
think of any valid reason for anyone ever to call them directly.

> +/**
> + * Decode the audio frame of size avpkt->size from avpkt->data into frame.
> + *
> + * Some decoders may support multiple frames in a single AVPacket, such

Start a new sentence at the comma.

> + * decoders would then just decode the first frame. In this case,
> + * avcodec_decode_audio4 has to be called again with an AVPacket that contains
> + * the remaining data in order to decode the second frame etc.

s/that contains/containing/

> + * If no frame
> + * could be outputted, got_frame_ptr is zero.

"could be output", "is set to zero"

> + * @warning The input buffer must be FF_INPUT_BUFFER_PADDING_SIZE larger than
> + * the actual read bytes because some optimized bitstream readers read 32 or 64
> + * bits at once and could read over the end.
> + *
> + * @warning The end of the input buffer avpkt->data should be set to 0 to ensure that
> + * no overreading happens for damaged MPEG streams.

"MPEG streams" seems a bit out of place here.  Sure, there is MPEG
audio, but still...  Just say "streams" plain and simple.

> + * @note You might have to align the input buffer avpkt->data and output buffer
> + * samples. The alignment requirements depend on the CPU: On some CPUs it isn't
> + * necessary at all, on others it won't work at all if not aligned and on others
> + * it will work but it will have an impact on performance.
> + *
> + * In practice, avpkt->data should have 4 byte alignment at minimum and
> + * samples should be 16 byte aligned unless the CPU doesn't need it
> + * (AltiVec and SSE do).

Drop the "in practice".  It is confusing.  Also "samples should be
... aligned" could be read as "each sample should be ... aligned"
which is clearly not the case.  In fact, mentioning the output buffers
here at all is odd, since they are no longer a parameter to the
function.

> + * @param avctx the codec context
> + * @param[out] frame The AVFrame in which the decoded audio frame will be stored.

"in which to store decoded samples"

> + *             The deecoder will allocate memory for the samples.

"decoder"

> + *             //FIXME: user procedure to deallocate frame
> + * @param[in,out] got_frame_ptr Zero if no frame could be decompressed, otherwise, it is nonzero.

That's not an input parameter IIRC.

> + * @param[in] avpkt The input AVPacket containing the input buffer.
> + * @return On error a negative value is returned, otherwise the number of bytes
> + * used or zero if no frame data was decompressed (used) from the input AVPacket.

s/frame data was decompressed (used)/data was consumed/

> + */
> +int avcodec_decode_audio4(AVCodecContext *avctx, AVFrame *frame,
> +                          int *got_frame_ptr,
> +                          AVPacket *avpkt);
> [...] 
> +/**
> + * Encode an audio frame from samples into buf.
> + *
> + * @param avctx the codec context
> + * @param[out] buf the output buffer
> + * @param[in] buf_size the output buffer size
> + * @param[in] frame the input buffer containing the samples
> + * @return On error a negative value is returned, on success zero or the number
> + * of bytes used to encode the data read from the input buffer.
> + */
> +int avcodec_encode_audio2(AVCodecContext *avctx, uint8_t *buf, int buf_size,
> +                          const AVFrame *frame);

At some point, we should do something about the output buffer
allocation too.  Currently, a ridiculous size is allocated in ffmpeg.c
just to be on the safe side.

>  #if CONFIG_ENCODERS
>  #define PCM_ENCODER(id,sample_fmt_,name,long_name_) \
>  AVCodec name ## _encoder = {                    \
> @@ -470,7 +484,7 @@ AVCodec name ## _decoder = {                    \
>      sizeof(PCMDecode),                          \
>      pcm_decode_init,                            \
>      NULL,                                       \
> -    NULL,                                       \
> +    pcm_decode_close,                           \

I recommend using designated initialisers for any changes to these
structs.  It's so much easier to read that way.

>      pcm_decode_frame,                           \
>      .sample_fmts = (const enum SampleFormat[]){sample_fmt_,SAMPLE_FMT_NONE}, \
>      .long_name = NULL_IF_CONFIG_SMALL(long_name_), \
> diff --git a/libavcodec/utils.c b/libavcodec/utils.c
> index c3d701c..f52baad 100644
> --- a/libavcodec/utils.c
> +++ b/libavcodec/utils.c
> @@ -353,6 +353,12 @@ void avcodec_default_release_buffer(AVCodecContext *s, AVFrame *pic){
>      assert(pic->type==FF_BUFFER_TYPE_INTERNAL);
>      assert(s->internal_buffer_count);
>  
> +    if (pic->type == FF_BUFFER_TYPE_SIMPLE) {
> +        for(i = 0; i < 4; i++)
> +            if (pic->data[i]) av_freep(&pic->data[i]);

Useless if().

> +        return;
> +    }
> +
>      buf = NULL; /* avoids warning */
>      for(i=0; i<s->internal_buffer_count; i++){ //just 3-5 checks so is not worth to optimize
>          buf= &((InternalBuffer*)s->internal_buffer)[i];
> @@ -409,6 +415,17 @@ int avcodec_default_reget_buffer(AVCodecContext *s, AVFrame *pic){
>      return 0;
>  }
>  
> +int avcodec_default_reget_audio_buffer(AVCodecContext *s, AVFrame *frame, int nb_samples)
> +{
> +    int sample_size = av_get_bits_per_sample_format(s->sample_fmt)/8;
> +    frame->type    = FF_BUFFER_TYPE_SIMPLE;
> +    frame->data[0] = av_realloc(frame->data[0], nb_samples * s->channels * sample_size);

As I said above, av_realloc() doesn't maintain the 16-byte alignment.

> +    if (!frame->data[0])
> +        return -1;

This leaks memory if realloc() fails.  Then again, we shouldn't use
realloc() here...

> +    frame->nb_samples = nb_samples;
> +    return 0;
> +}
> +
>  int avcodec_default_execute(AVCodecContext *c, int (*func)(AVCodecContext *c2, void *arg2),void *arg, int *ret, int count, int size){
>      int i;
>  
> @@ -533,15 +550,27 @@ free_and_end:
>      goto end;
>  }
>  
> +#if LIBAVCODEC_VERSION_MAJOR < 53
>  int attribute_align_arg avcodec_encode_audio(AVCodecContext *avctx, uint8_t *buf, int buf_size,
>                           const short *samples)
>  {
> +    AVFrame frame;
> +    avcodec_get_frame_defaults(&frame);
> +    frame.data[0]    = (uint8_t*)samples;
> +    frame.nb_samples = buf_size / ( av_get_bits_per_sample(avctx->codec->id)/8 * avctx->channels);

Drop space after '('.

> +    return avcodec_encode_audio2(avctx, buf, buf_size, &frame);
> +}
> +#endif
> +
> +int attribute_align_arg avcodec_encode_audio2(AVCodecContext *avctx, uint8_t *buf, int buf_size,
> +                         const AVFrame *frame)

Please reformat this to fit in 80 columns.  Same applies to all new
function declarations/definitions.

> +{
>      if(buf_size < FF_MIN_BUFFER_SIZE && 0){
>          av_log(avctx, AV_LOG_ERROR, "buffer smaller than minimum size\n");
>          return -1;
>      }
> -    if((avctx->codec->capabilities & CODEC_CAP_DELAY) || samples){
> -        int ret = avctx->codec->encode(avctx, buf, buf_size, samples);
> +    if((avctx->codec->capabilities & CODEC_CAP_DELAY) || frame){
> +        int ret = avctx->codec->encode(avctx, buf, buf_size, frame);
>          avctx->frame_number++;
>          return ret;
>      }else
> @@ -633,31 +662,39 @@ int attribute_align_arg avcodec_decode_audio2(AVCodecContext *avctx, int16_t *sa
>  
>      return avcodec_decode_audio3(avctx, samples, frame_size_ptr, &avpkt);
>  }
> -#endif
>  
>  int attribute_align_arg avcodec_decode_audio3(AVCodecContext *avctx, int16_t *samples,
>                           int *frame_size_ptr,
>                           AVPacket *avpkt)
>  {
> -    int ret;
> +    AVFrame frame;
> +    int ret, got_frame = 0;
> +
> +    avcodec_get_frame_defaults(&frame);
> +    ret = avcodec_decode_audio4(avctx, &frame, &got_frame, avpkt);
> +    if (got_frame) {
> +        *frame_size_ptr = frame.nb_samples * avctx->channels * av_get_bits_per_sample(avctx->codec->id)/8;
> +        if (*frame_size_ptr > AVCODEC_MAX_AUDIO_FRAME_SIZE)
> +            av_log(avctx, AV_LOG_WARNING, "avcodec_decode_audio3 samples truncated to AVCODEC_MAX_AUDIO_FRAME_SIZE\n");
> +        memcpy(samples, frame.data[0], FFMIN(*frame_size_ptr, AVCODEC_MAX_AUDIO_FRAME_SIZE));
> +    } else {
> +        *frame_size_ptr = 0;
> +    }
> +    return ret;
> +}
> +#endif

I find the above block of code impenetrable.  Please try to format it
in a more readable fashion.

-- 
M?ns Rullg?rd
mans at mansr.com



More information about the ffmpeg-devel mailing list