[FFmpeg-devel] [PATCH 5/6] avcodec/qtrle: return last frame even if unchanged

Mon Aug 26 23:39:45 EEST 2019

On Mon, Aug 26, 2019 at 9:38 PM Michael Niedermayer
<michael at niedermayer.cc> wrote:
>
> On Mon, Aug 26, 2019 at 01:45:55PM +0200, Hendrik Leppkes wrote:
> > On Mon, Aug 26, 2019 at 11:09 AM Michael Niedermayer
> > <michael at niedermayer.cc> wrote:
> > >
> > > On Mon, Aug 26, 2019 at 09:51:45AM +0200, Hendrik Leppkes wrote:
> > > > On Mon, Aug 26, 2019 at 9:33 AM Michael Niedermayer
> > > > <michael at niedermayer.cc> wrote:
> > > > >
> > > > > On Sun, Aug 25, 2019 at 08:04:03PM -0300, James Almer wrote:
> > > > > > On 8/25/2019 6:46 PM, Michael Niedermayer wrote:
> > > > > > > On Sun, Aug 25, 2019 at 01:22:22PM -0300, James Almer wrote:
> > > > > > >> On 8/24/2019 3:18 PM, Michael Niedermayer wrote:
> > > > > > >>> Fixes: Ticket7880
> > > > > > >>>
> > > > > > >>> Signed-off-by: Michael Niedermayer <michael at niedermayer.cc>
> > > > > > >>> ---
> > > > > > >>>  libavcodec/qtrle.c        | 30 ++++++++++++++++++++++++++----
> > > > > > >>>  tests/ref/fate/qtrle-8bit |  1 +
> > > > > > >>>  2 files changed, 27 insertions(+), 4 deletions(-)
> > > > > > >>>
> > > > > > >>> diff --git a/libavcodec/qtrle.c b/libavcodec/qtrle.c
> > > > > > >>> index 2c29547e5a..c22a1a582d 100644
> > > > > > >>> --- a/libavcodec/qtrle.c
> > > > > > >>> +++ b/libavcodec/qtrle.c
> > > > > > >>> @@ -45,6 +45,7 @@ typedef struct QtrleContext {
> > > > > > >>>
> > > > > > >>>      GetByteContext g;
> > > > > > >>>      uint32_t pal[256];
> > > > > > >>> +    AVPacket flush_pkt;
> > > > > > >>>  } QtrleContext;
> > > > > > >>>
> > > > > > >>>  #define CHECK_PIXEL_PTR(n)                                                            \
> > > > > > >>> @@ -454,11 +455,27 @@ static int qtrle_decode_frame(AVCodecContext *avctx,
> > > > > > >>>      int has_palette = 0;
> > > > > > >>>      int ret, size;
> > > > > > >>>
> > > > > > >>> +    if (!avpkt->data) {
> > > > > > >>> +        if (avctx->internal->need_flush) {
> > > > > > >>> +            avctx->internal->need_flush = 0;
> > > > > > >>> +            ret = ff_setup_buffered_frame_for_return(avctx, data, s->frame, &s->flush_pkt);
> > > > > > >>> +            if (ret < 0)
> > > > > > >>> +                return ret;
> > > > > > >>> +            *got_frame = 1;
> > > > > > >>> +        }
> > > > > > >>> +        return 0;
> > > > > > >>> +    }
> > > > > > >>> +    s->flush_pkt = *avpkt;
> > > > > > >>> +    s->frame->pkt_dts = s->flush_pkt.dts;
> > > > > > >>> +
> > > > > > >>>      bytestream2_init(&s->g, avpkt->data, avpkt->size);
> > > > > > >>>
> > > > > > >>>      /* check if this frame is even supposed to change */
> > > > > > >>> -    if (avpkt->size < 8)
> > > > > > >>> +    if (avpkt->size < 8) {
> > > > > > >>> +        avctx->internal->need_flush = 1;
> > > > > > >>>          return avpkt->size;
> > > > > > >>> +    }
> > > > > > >>> +    avctx->internal->need_flush = 0;
> > > > > > >>>
> > > > > > >>>      /* start after the chunk size */
> > > > > > >>>      size = bytestream2_get_be32(&s->g) & 0x3FFFFFFF;
> > > > > > >>> @@ -471,14 +488,18 @@ static int qtrle_decode_frame(AVCodecContext *avctx,
> > > > > > >>>
> > > > > > >>>      /* if a header is present, fetch additional decoding parameters */
> > > > > > >>>      if (header & 0x0008) {
> > > > > > >>> -        if (avpkt->size < 14)
> > > > > > >>> +        if (avpkt->size < 14) {
> > > > > > >>> +            avctx->internal->need_flush = 1;
> > > > > > >>>              return avpkt->size;
> > > > > > >>> +        }
> > > > > > >>>          start_line = bytestream2_get_be16(&s->g);
> > > > > > >>>          bytestream2_skip(&s->g, 2);
> > > > > > >>>          height     = bytestream2_get_be16(&s->g);
> > > > > > >>>          bytestream2_skip(&s->g, 2);
> > > > > > >>> -        if (height > s->avctx->height - start_line)
> > > > > > >>> +        if (height > s->avctx->height - start_line) {
> > > > > > >>> +            avctx->internal->need_flush = 1;
> > > > > > >>>              return avpkt->size;
> > > > > > >>> +        }
> > > > > > >>>      } else {
> > > > > > >>>          start_line = 0;
> > > > > > >>>          height     = s->avctx->height;
> > > > > > >>> @@ -572,5 +593,6 @@ AVCodec ff_qtrle_decoder = {
> > > > > > >>>      .init           = qtrle_decode_init,
> > > > > > >>>      .close          = qtrle_decode_end,
> > > > > > >>>      .decode         = qtrle_decode_frame,
> > > > > > >>> -    .capabilities   = AV_CODEC_CAP_DR1,
> > > > > > >>> +    .caps_internal  = FF_CODEC_CAP_SETS_PKT_DTS | FF_CODEC_CAP_SETS_PKT_POS,
> > > > > > >>> +    .capabilities   = AV_CODEC_CAP_DR1 | AV_CODEC_CAP_DELAY,
> > > > > > >>>  };
> > > > > > >>> diff --git a/tests/ref/fate/qtrle-8bit b/tests/ref/fate/qtrle-8bit
> > > > > > >>> index 27bb8aad71..39a03b7b6c 100644
> > > > > > >>> --- a/tests/ref/fate/qtrle-8bit
> > > > > > >>> +++ b/tests/ref/fate/qtrle-8bit
> > > > > > >>> @@ -61,3 +61,4 @@
> > > > > > >>>  0,        160,        160,        1,   921600, 0xcfd6ad2b
> > > > > > >>>  0,        163,        163,        1,   921600, 0x3b372379
> > > > > > >>>  0,        165,        165,        1,   921600, 0x36f245f5
> > > > > > >>> +0,        166,        166,        1,   921600, 0x36f245f5
> > > > > > >>
> > > > > > >> Following what i said in the nuv patch, do you still experience timeouts
> > > > > > >> with the current codebase, or even if you revert commit
> > > > > > >> a9dacdeea6168787a142209bd19fdd74aefc9dd6? Creating a reference to an
> > > > > > >> existing frame buffer shouldn't be expensive anymore for the fuzzer
> > > > > > >> after my ref counting changes to target_dec_fuzzer.c
> > > > > > >>
> > > > > > >> This is a very ugly solution to a problem that was already solved when
> > > > > > >> reference counting was introduced. Returning duplicate frames to achieve
> > > > > > >> cfr in decoders where it's expected shouldn't affect performance.
> > > > > > >
> > > > > > > Maybe i should ask this backward to make it clearer what this is trying
> > > > > > > to fix.
> > > > > > >
> > > > > > > Consider a patch that would return every frame twice with the same ref
> > > > > > > of course.
> > > > > > > Would you consider this to have 0 performance impact ?
> > > > > >
> > > > > > No, but it would be negligible.
> > > > >
> > > > > ok, lets see some actual numbers
> > > > >
> > > > > This comparission is about fixing ticket 7880 (just saying so we dont loose
> > > > > track of what we compare here)
> > > > >
> > > > > this qtrle patchset which fixes the last frame
> > > > > time ./ffmpeg -i clock.mov -vf scale=1920x1080 -qscale 2 -y test-new.avi
> > > > > real            0m0.233s
> > > > > user            0m0.404s
> > > > > sys             0m0.036s
> > > > > -rw-r----- 1 michael michael 769212 Aug 26 08:38 test-new.avi
> > > > > time ./ffmpeg -i test-new.avi -f null -
> > > > > real            0m0.095s
> > > > > user            0m0.131s
> > > > > sys             0m0.038s
> > > > >
> > > > > The alternative of reverting the code that discards duplicate frames
> > > > > (this i believe is what the people opposing this patchset here want)
> > > > > git revert  a9dacdeea6168787a142209bd19fdd74aefc9dd6
> > > > > time ./ffmpeg -i clock.mov -vf scale=1920x1080 -qscale 2 -y test.avi
> > > > > real            0m1.208s
> > > > > user            0m2.866s
> > > > > sys             0m0.168s
> > > > > -rw-r----- 1 michael michael 3274430 Aug 26 08:44 test.avi
> > > > > time ./ffmpeg -i test.avi -f null -
> > > > > real            0m0.185s
> > > > > user            0m0.685s
> > > > > sys             0m0.076s
> > > > >
> > > > > https://samples.ffmpeg.org/ffmpeg-bugs/trac/ticket7880/clock.mov
> > > > >
> > > > > As we can see the solution preferred by the opposition to this patchset
> > > > > will make the code 6 times slower and result in 4 times higher bitrate for
> > > > > the same quality.
> > > > > I would say this is not "negligible"
> > > > > you know this is not 0.6%, its not 6% its not 60% it is 600%
> > > > >
> > > > > I do not understand why we have a discussion here about this.
> > > > > Do we want 600% slower code ?
> > > > > Do our users want 600% slower code ?
> > > > >
> > > > > I do not want 600% slower code
> > > > >
> > > >
> > > > Speed at the expense of correctness should not even be an argument.
> > >
> > > and here we are back to square 0 because for all mainstream codecs
> > > we drop duplicates and fail this correctness.
> > > And noone wants to change that no matter which side of the argument one
> > > is on.
> >
> > You claim this is the same thing, but it really is not. If I encode a
> > static scene in h264 CFR, I get X frames in the bitstream, and avcodec
> > gives me X frames on the output, even if every single frame is
> > perfectly identical.
> > If I do the same with QTRLE, or whichever other codec you've already
> > been to, I get X frames in the encoded bitstream, and only Y (Y < X)
> > on the decoder output (at worst, 1 only).
> >
> > Every encoded frame should produce a decoded frame, and it should not
> > be up to the decoder to decide to drop them.
> > H264 repeats are not frames in the bitstream, they are metadata. They
> > do not have a timestamp that is being lost, they did not belong to a
> > distinct input packet. And we convey this metadata to the user, so he
> > can decide to act on it.
> >
>
> > How can you not see this huge gap? This behavior in these decoders
> > actively loses data. The H264 decoder does not lose any data.
>
> Lets check H264
> claim "They do not have a timestamp"
> immedeatly after the pic_struct field which tells you about the repeating
> behavior there is a loop for each repeated value with a timestamp.
> This timestamp is lost, if at CFR one can call it that way.
>

"time of origin, capture" is clearly a timecode, not a timestamp in
the sense we're talking about here (plus that the bitstream codes it
in hours/minutes/seconds). I expect you know the difference.
If these timecodes are considered useful it would be trivial to expose
them from the decoder too, since they are already being parsed and
stored.

> about "Every encoded frame should produce a decoded frame"
> quote about this timestamp from the h264 spec
> "The contents of the clock timestamp syntax elements indicate a time of origin, capture, or alternative ideal display."
>
> "origin, capture" implies that the repeated frame can have come from
> a real input packet because otherwise it wouldnt have a capture or origin
> time.
> So an encoder could use this as a way to represent some input frames
> And in that case only a decoder that follows these repeats exactly would
> have input == output
> which again brings us full circle as this matches what we see in qtrle repeats
>
>

No it does not. There are key differences in how this is handled on a
technical level. I can reproduce the full original video stream from
the output the H264 decoder gives me. I cannot do this with QTRLE.
That is all that it ultimately comes down to, any other arguments are
irrelevant.

- Hendrik