[FFmpeg-devel] Clarification about bits_per_coded_sample vs bits_per_raw_sample

Wed May 3 02:18:09 EEST 2017

On Tue, 2 May 2017 18:01:37 -0400
Shawn Singh <shawnsingh-at-google.com at ffmpeg.org> wrote:

> Hello,
> 
> We are trying to understand two fields in AVCodecParameters:
>  bits_per_coded_sample and bits_per_raw_sample.
> 
> In the comments in libavcodec/avcodec.h, bits_per_coded_sample is described
> as "the bitrate per sample".   This sounds like (encoded bitrate / sample
> rate), for example 128 kbps 44.1 kHz audio stream would be 3 bits per coded
> audio sample.  But, the code usage suggests that this field is actually
> "the bit depth of each sample, if the sample was uncompressed", which is
> also similar to the comments and usage for bits_per_raw_sample.  For
> example, the mov.c demuxing initializes bits_per_coded_sample when parsing
> the "sample size" field of the mp4 AudioSampleEntry (in the stsd atom, for
> both audio and video).
> 
> Various codecs/formats initialize one value, or the other, or both at
> different times.  For example, the pcm.c audio codec sets bits_per_coded_sample
> on encoding, and bits_per_raw_sample on decoding.  But the mov.c demuxer
> and movenc.c muxer both use bits_per_coded_sample for muxing.
> 
> So, what really is the difference between these two values?   Is it
> possible that these fields should just be merged into one field?   Or if
> there is a pattern we don't see, perhaps only the comments need to be
> updated?

That is indeed not very clear. Here's my guess (from my hazy memory and
a superficial look at the code):

bits_per_coded_sample is the "proper" value: it is passed to the decoder
as a parameter and tells it something about how the bitstream needs to be
decoded, so for some codecs it is mandatory. bits_per_raw_sample, on the
other hand, signals that not all bits in the _decoded_ data are used,
i.e. that the decoded PCM stream has a certain bit resolution. For
example, some codecs support 24 bit audio, which is output as 32 bit
samples with bits_per_raw_sample set to 24 (the 8 LSBs are set to 0).
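
To make the 24-in-32 case concrete, here is a rough standalone sketch in
plain C. It does not use any FFmpeg API: pad_to_s32() and
unused_low_bits() are hypothetical helpers made up for illustration, and
treating a bits_per_raw_sample of 0 as "unknown, assume the full
container is used" is my assumption, not something I checked in the code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): place a sample that has
 * `valid_bits` significant bits into the top of a 32 bit container, so
 * that the low (32 - valid_bits) bits end up as zero. */
static int32_t pad_to_s32(int32_t sample, int valid_bits)
{
    assert(valid_bits >= 1 && valid_bits <= 32);
    /* Shift as unsigned: left-shifting a negative signed value is
     * undefined behaviour in C. The conversion back to int32_t assumes
     * the usual two's complement wrap-around. */
    return (int32_t)((uint32_t)sample << (32 - valid_bits));
}

/* Hypothetical helper: how many low-order bits of each decoded sample
 * carry no information, given the container width and the value a
 * decoder reported as bits_per_raw_sample. */
static int unused_low_bits(int container_bits, int bits_per_raw_sample)
{
    /* Assumption: 0 (or an out-of-range value) means "not signalled",
     * so every bit of the container is treated as significant. */
    if (bits_per_raw_sample <= 0 || bits_per_raw_sample > container_bits)
        return 0;
    return container_bits - bits_per_raw_sample;
}
```

With these, pad_to_s32(0x123456, 24) gives 0x12345600, and
unused_low_bits(32, 24) gives 8, which matches the "8 LSBs set to 0"
description above.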

Hopefully others have better explanations.