[FFmpeg-devel] [PATCH 6/7] cbs_mpeg2: Remove zero byte stuffing

Mon Jul 29 22:56:57 EEST 2019

Remove superfluous trailing zeros from slices. Because MPEG-2 slices
can end with zero bits a safe number of trailing zero bits is always
kept.

More explicitly, 6 + max{f_code[i][1] - 1, i = 0,1, f_code[i][1] != 0xf}
is an upper bound for the number of possible trailing zeros that are
part of the slice. Here f_code[i][1] is the relevant value of the
picture coding extension the slice belongs to and the maximum of the
empty set is zero.
It is this number of trailing zero bits that is actually kept.

That this is really an upper bound can be seen as follows:

a) Every slice actually ends with a macroblock.

b) If the last macroblock of a slice ends with a block(i) structure
with pattern_code[i] != 0, then the slice ends with an "End of block"
VLC code (namely the "End of block" code of the last block with
pattern_code[i] != 0).
These codes are 10 and 0110, so that in this case there is exactly one
trailing zero bit.

c) Otherwise, all pattern_code[i] are zero. In this case,
if macroblock_pattern is set for the last macroblock of the slice, then
by the definition of pattern_code[i] in 6.3.17.4 cbp (derived
according to table B.9) must be zero and also the
coded_block_pattern_1/2 (if existing) must consist of zeros alone. The
value zero for cbp is coded by 0000 0000 1 so that the maximum number of
trailing zeros in this case is the length of coded_block_pattern_1/2 which
have a length of two resp. six bits. So six trailing zero bits at most.

d) Otherwise, if the slice actually ends with the marker bit of the
last macroblock, then there are certainly no trailing zero bits at
all.

e) Otherwise, if the slice ends with a motion_vectors(s) structure
(with s = 0 or 1 -- it doesn't matter which one), then it ends with a
motion_vector(r,s) (r, s = 0, 1 -- it doesn't matter) structure. This
structure ends with motion_code[r][s][1] (always existing) potentially
followed by motion_residual[r][s][1] and dmvector[1]. If dmvector[1]
exists, and contains a bit different from 0, there is at most one
trailing zero bit; if dmvector[1] consists of zeros alone, its length
is one according to B.11. motion_residual[r][s][1] (if it exists) has
a length of f_code[s][1] - 1 bits and can consist of zero bits alone.
Given that the value 0xf for f_code indicates that there is no motion
vector of the mentioned type, the length of motion_residual[r][s][1] is
bounded by max{f_code[i][1] - 1, i=1,2, f_code[i][1] != 0xf}. The
motion_code[r][s][1] can end with at most five zero bits (see B.10)
and always contains a bit set to one, so that in this case there are
at most 5 + max{f_code[i][1] - 1, i=1,2, f_code[i][1] != 0xf} + 1
zero trailing bits.

f) Otherwise, if the last macroblock of the slice ends with a
quantiser_scale_code, then there are at most four trailing zero bits,
because quantiser_scale_code has a length of five bits and must not
attain the value zero.

g) Otherwise, the last macroblock ends with the macroblock_modes
syntax structure. The potentially existing dct_type at the end might
be a zero bit; the frame/field_motion_type isn't present here, because
otherwise we would have a motion_vectors(i) (i = 0 or 1 or both) syntax
structure, so that e) (or b)-d)) would have applied.
spatial_temporal_weight_code might entirely consist of two zero bits.
The macroblock_type VLC code always contains a 1 bit and ends with two
zero bits at most (see B.2-B.8 for this), so we have maximally 2+2+1
trailing zero bits.

The fate test cbs-mpeg2-sony-ct3 had to be adapted because the input
file contains trailing zeros that were stripped away; the filesize is
reduced from 135 KB to 117 KB. Of course, decoding the smaller output
still produces the same frames.
Most of these savings happen in between slices rather than after the
last slice: The chomp bitstream filter can only reduce the filesize
by 50 bytes.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt at gmail.com>
---
The fate test result for cbs-mpeg2-sony-ct3 had to be modified compared
to the last patch because the sample ends with a sequence_end_code that
is preceded by several zero bytes. The earlier code thought that the
slice ended at the end of the packet (i.e. at the end of the
sequence_end_code) and therefore did not strip any of these zero bytes
between the last slice and the sequence_start_code away. The now code
does this.

 libavcodec/cbs_mpeg2.c            | 30 ++++++++++++++++++++++++++++--
 tests/ref/fate/cbs-mpeg2-sony-ct3 |  2 +-
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/libavcodec/cbs_mpeg2.c b/libavcodec/cbs_mpeg2.c
index 596e9677b7..559793dc98 100644
--- a/libavcodec/cbs_mpeg2.c
+++ b/libavcodec/cbs_mpeg2.c
@@ -170,7 +170,7 @@ static int cbs_mpeg2_split_fragment(CodedBitstreamContext *ctx,
     CodedBitstreamUnitType unit_type;
     uint32_t start_code = -1;
     size_t unit_size;
-    int err, i, final = 0;
+    int err, i, final = 0, max_trailing_bits = 15;
 
     start = avpriv_find_start_code(frag->data, frag->data + frag->data_size,
                                    &start_code);
@@ -199,7 +199,7 @@ static int cbs_mpeg2_split_fragment(CodedBitstreamContext *ctx,
         // the end of fragment->data).
         if (start_code >> 8 == 0x01) {
             // Unit runs from start to the beginning of the start code
-            // pointed to by end (including any padding zeroes).
+            // pointed to by end (preliminarily including any padding zeroes).
             unit_size = (end - 4) - start;
         } else {
            // We didn't find a start code, so this is the final unit.
@@ -207,6 +207,32 @@ static int cbs_mpeg2_split_fragment(CodedBitstreamContext *ctx,
            final     = 1;
         }
 
+        if (unit_type == MPEG2_START_EXTENSION && unit_size >= 4 &&
+            start[1] >> 4 == MPEG2_EXTENSION_PICTURE_CODING) {
+            // The values f_code[0][1], f_code[1][1] are used to derive
+            // an upper bound for the number of trailing zero bits.
+            // 6 + max{f_code[i][1] - 1, i = 0,1, f_code[i][1] != 0xf} is
+            // an upper bound.  An f_code value of 0xf means that there is
+            // no motion vector of the respective type.  f_code values 10..14
+            // are reserved and 0 is forbidden.  In these cases, no attempt
+            // to strip trailing zeroes is made.
+#define BOUND(f_code) \
+    ((f_code) == 0xf ? 0 : (f_code) >= 10 || !(f_code) ? 9 : (f_code) - 1)
+            max_trailing_bits = 6 + FFMAX(BOUND(start[2] >> 4),
+                                          BOUND(start[3] >> 4));
+        }
+
+        if (MPEG2_START_IS_SLICE(unit_type) && max_trailing_bits < 15) {
+            const uint8_t *tmp = start + unit_size - 1;
+
+            while (tmp > start && *tmp == 0)
+                tmp--;
+            av_assert0(*tmp);
+
+            unit_size = FFMIN(unit_size, tmp + 1 - start + max_trailing_bits / 8
+                                         + (ff_ctz(*tmp) < max_trailing_bits % 8));
+        }
+
         err = ff_cbs_insert_unit_data(ctx, frag, i, unit_type, (uint8_t*)start,
                                       unit_size, frag->data_ref);
         if (err < 0)
diff --git a/tests/ref/fate/cbs-mpeg2-sony-ct3 b/tests/ref/fate/cbs-mpeg2-sony-ct3
index b5b4b12f07..df7af1af31 100644
--- a/tests/ref/fate/cbs-mpeg2-sony-ct3
+++ b/tests/ref/fate/cbs-mpeg2-sony-ct3
@@ -1 +1 @@
-b1e15a09cfffbad801810af0928736ab
+9d55aa8bf36ee193b637001b5ef7e2ab
-- 
2.21.0