[FFmpeg-devel] [PATCH 23/31] cbs_mpeg2: Remove zero byte stuffing

Sun Jul 28 20:14:48 EEST 2019

On 20/06/2019 00:45, Andreas Rheinhardt wrote:
> Remove superfluous trailing zeros from slices. Because MPEG-2 slices
> can end with zero bits a safe number of trailing zero bits is always
> kept.
> 
> More explicitly, 6 + max{f_code[i][1] - 1, i = 0,1, f_code[i][1] != 0xf}
> is an upper bound for the number of possible trailing zeros that are
> part of the slice. Here f_code[i][1] is the relevant value of the
> picture coding extension the slice belongs to and the maximum of the
> empty set is zero.
> It is this number of trailing zero bits that is actually kept.
> 
> That this is really an upper bound can be seen as follows:
> 
> a) Every slice actually ends with a macroblock.
> 
> b) If the last macroblock of a slice ends with a block(i) structure
> with pattern_code[i] != 0, then the slice ends with an "End of block"
> VLC code (namely the "End of block" code of the last block with
> pattern_code[i] != 0).
> These codes are 10 and 0110, so that in this case there is exactly one
> trailing zero bit.
> 
> c) Otherwise, all pattern_code[i] are zero. In this case,
> if macroblock_pattern is set for the last macroblock of the slice, then
> by the definition of pattern_code[i] in 6.3.17.4 cbp (derived
> according to table B.9) must be zero and also the
> coded_block_pattern_1/2 (if existing) must consist of zeros alone. The
> value zero for cbp is coded by 0000 0000 1 so that the maximum number of
> trailing zeros in this case is the length of coded_block_pattern_1/2 which
> have a length of two resp. six bits. So six trailing zero bits at most.
> 
> d) Otherwise, if the slice actually ends with the marker bit of the
> last macroblock, then there are certainly no trailing zero bits at
> all.
> 
> e) Otherwise, if the slice ends with a motion_vectors(s) structure
> (with s = 0 or 1 -- it doesn't matter which one), then it ends with a
> motion_vector(r,s) (r, s = 0, 1 -- it doesn't matter) structure. This
> structure ends with motion_code[r][s][1] (always existing) potentially
> followed by motion_residual[r][s][1] and dmvector[1]. If dmvector[1]
> exists, and contains a bit different from 0, there is at most one
> trailing zero bit; if dmvector[1] consists of zeros alone, its length
> is one according to B.11. motion_residual[r][s][1] (if it exists) has
> a length of f_code[s][1] - 1 bits and can consist of zero bits alone.
> Given that the value 0xf for f_code indicates that there is no motion
> vector of the mentioned type, the length of motion_residual[r][s][1] is
> bounded by max{f_code[i][1] - 1, i=1,2, f_code[i][1] != 0xf}. The
> motion_code[r][s][1] can end with at most five zero bits (see B.10)
> and always contains a bit set to one, so that in this case there are
> at most 5 + max{f_code[i][1] - 1, i=1,2, f_code[i][1] != 0xf} + 1
> zero trailing bits.
> 
> f) Otherwise, if the last macroblock of the slice ends with a
> quantiser_scale_code, then there are at most four trailing zero bits,
> because quantiser_scale_code has a length of five bits and must not
> attain the value zero.
> 
> g) Otherwise, the last macroblock ends with the macroblock_modes
> syntax structure. The potentially existing dct_type at the end might
> be a zero bit; the frame/field_motion_type isn't present here, because
> otherwise we would have a motion_vectors(i) (i = 0 or 1 or both) syntax
> structure, so that e) (or b)-d)) would have applied.
> spatial_temporal_weight_code might entirely consist of two zero bits.
> The macroblock_type VLC code always contains a 1 bit and ends with two
> zero bits at most (see B.2-B.8 for this), so we have maximally 2+2+1
> trailing zero bits.
> 
> The fate test cbs-mpeg2-sony-ct3 had to be adapted because the input
> file contains trailing zeros that were stripped away; the filesize is
> reduced from 135 KB to 117 KB. Of course, decoding the smaller output
> still produces the same frames.
> Most of these savings happen in between slices rather than after the
> last slice: The chomp bitstream filter can only reduce the filesize
> by 50 bytes.
> 
> Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
> ---
>  libavcodec/cbs_mpeg2.c            | 26 ++++++++++++++++++++++++--
>  tests/ref/fate/cbs-mpeg2-sony-ct3 |  2 +-
>  2 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/libavcodec/cbs_mpeg2.c b/libavcodec/cbs_mpeg2.c
> index 516b728a64..0630fe17d8 100644
> --- a/libavcodec/cbs_mpeg2.c
> +++ b/libavcodec/cbs_mpeg2.c
> @@ -170,7 +170,7 @@ static int cbs_mpeg2_split_fragment(CodedBitstreamContext *ctx,
>      uint8_t *unit_data;
>      uint32_t start_code = -1, next_start_code = -1;
>      size_t unit_size;
> -    int err, i, unit_type;
> +    int err, i, unit_type, max_trailing_bits = 14;
>  
>      start = avpriv_find_start_code(frag->data, frag->data + frag->data_size,
>                                     &start_code);
> @@ -187,10 +187,32 @@ static int cbs_mpeg2_split_fragment(CodedBitstreamContext *ctx,
>              unit_size = end - (start - 1);
>          } else {
>              // Unit runs from start to the beginning of the start code
> -            // pointed to by end (including any padding zeroes).
> +            // pointed to by end (preliminarily including any padding zeroes).
>              unit_size = (end - 4) - (start - 1);
>          }

The start pointer being offset by one from the start of the unit makes the following code more confusing.  Perhaps you could adjust start here before using it?

>  
> +        if (unit_type == MPEG2_START_EXTENSION && unit_size >= 4 &&
> +            *start >> 4 == MPEG2_EXTENSION_PICTURE_CODING) {
> +            // The values f_code[0][1], f_code[1][1] are used to improve
> +            // the upper bound for the number of trailing zero bits.
> +            // 6 + max{f_code[i][1] - 1, i = 0,1, f_code[i][1] != 0xf} is
> +            // an upper bound. An f_code value of 0xf means that there is
> +            // no motion vector of the respective type.
> +            max_trailing_bits = start[1] >> 4 == 0xf ? 0 : (start[1] >> 4) - 1;
> +            max_trailing_bits = FFMAX(start[2] >> 4 == 0xf ?
> +                                      0 : (start[2] >> 4) - 1,
> +                                      max_trailing_bits) + 6;

The f_code values aren't yet verified, so could be invalid.  Does that matter?

The symmetry between the two start[] expressions is a bit hidden - maybe make sure they have matching code alignment.

> +        }
> +
> +        if (MPEG2_START_IS_SLICE(unit_type)) {
> +            const uint8_t *tmp = start + unit_size - 2;
> +
> +            while (tmp > start && *tmp == 0)
> +                tmp--;
> +            unit_size = FFMIN(unit_size, tmp - start + max_trailing_bits / 8 +
> +                              !!(*tmp & 0xff >> 8 - max_trailing_bits % 8) + 2);

I think a calculation based on max_trailing_bits - ctz(*tmp) might be clearer?

> +        }
> +
>          unit_data = (uint8_t *)start - 1;
>  
>          err = ff_cbs_insert_unit_data(ctx, frag, i, unit_type,

Tbh the tricksiness here feels excessive to me, though I have no objection to it.

Thanks,

- Mark