[FFmpeg-devel] [PATCH] Fixed bug encountered when decoding interlaced video

Fri May 5 10:55:48 EEST 2017

On Thu, 4 May 2017 23:46:30 -0700
Aaron Levinson <alevinsn at aracnet.com> wrote:

> On 4/12/2017 6:08 PM, Aaron Levinson wrote:
> > On 3/26/2017 10:34 AM, Aaron Levinson wrote:  
> >> On 3/26/2017 4:41 AM, Matthias Hunstock wrote:  
> >>> Am 26.03.2017 um 11:50 schrieb Aaron Levinson:  
> >>>> When using the following command to play back either file:
> >>>>  ffmpeg -i <mpegts file> -f decklink -pix_fmt uyvy422 "DeckLink SDI
> >>>>  4K", I noticed that with the mpegts file with the AAC audio stream,
> >>>>  it would correctly select an interlaced video mode for the video
> >>>>  output stream, but with the mpegts file with the Opus audio stream,
> >>>>  it would use a progressive video mode (1080p29.97) for the video
> >>>>  output stream.  
> >>>
> >>> Which FFmpeg version did you test this with?
> >>>
> >>> There was a change related to this just short time ago.
> >>>
> >>> Does it happen with current git HEAD?
> >>>
> >>> Matthias  
> >>
> >> This issue occurs with the current git HEAD.  I'm aware of the
> >> Blackmagic improvement that was added in February to add support for
> >> interlaced video modes on output, and actually that's one of the reasons
> >> why I'm using the latest git sources, as opposed to, say, 3.2.4.  This
> >> particular issue has nothing to do with Blackmagic, and I only used
> >> Blackmagic in the example that reproduces the bug because it is
> >> something that can be reproduced on both Windows and Linux (and
> >> presumably also on OS/X).  The issue also occurs if I do something like
> >> -f rawvideo out.avi on Windows, and I'm sure that there are plenty of
> >> other examples.
> >>
> >> Aaron Levinson  
> > 
> > Has anyone had a chance to review this patch yet, which I submitted on March 26th?  To demonstrate the impact of this patch, here's some output of before and after for an .h264 file with interlaced 1080i59.94 video content:
> > 
> > Command-line:  ffmpeg -i test8_1080i.h264 -c:v mpeg2video test8_1080i_mp2.ts
> > 
> > Before patch:
> > 
> > --------------------------------------
> > 
> > Input #0, h264, from 'test8_1080i.h264':
> >   Duration: N/A, bitrate: N/A
> >     Stream #0:0: Video: h264 (High), yuv420p(top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn, 59.94 tbc
> > Stream mapping:
> >   Stream #0:0 -> #0:0 (h264 (native) -> mpeg2video (native))
> > Press [q] to stop, [?] for help
> > Output #0, mpegts, to 'test8_1080i_mp2_2.ts':
> >   Metadata:
> >     encoder         : Lavf57.72.100
> >     Stream #0:0: Video: mpeg2video (Main), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 90k tbn, 29.97 tbc
> >     Metadata:
> >       encoder         : Lavc57.92.100 mpeg2video
> >     Side data:
> >       cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
> > 
> > --------------------------------------
> > 
> > After patch:
> > 
> > --------------------------------------
> > 
> > Input #0, h264, from 'test8_1080i.h264':
> >   Duration: N/A, bitrate: N/A
> >     Stream #0:0: Video: h264 (High), yuv420p(top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn, 59.94 tbc
> > Stream mapping:
> >   Stream #0:0 -> #0:0 (h264 (native) -> mpeg2video (native))
> > Press [q] to stop, [?] for help
> > Output #0, mpegts, to 'test8_1080i_mp2_2.ts':
> >   Metadata:
> >     encoder         : Lavf57.72.100
> >     Stream #0:0: Video: mpeg2video (Main), yuv420p(top coded first (swapped)), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 90k tbn, 29.97 tbc
> >     Metadata:
> >       encoder         : Lavc57.92.100 mpeg2video
> >     Side data:
> >       cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
> > 
> > --------------------------------------
> > 
> > As can be seen, before the patch, after decoding the .h264 file and then re-encoding it as mpeg2video in an mpegts container, the interlaced aspect of the video has been lost in the output, and it is now effectively 1080p29.97, although the video hasn't actually been converted to progressive.  ffmpeg simply thinks that the video is progressive when it is not.  With the patch, the interlaced aspect is not lost and propagates to the output.  So, this conclusively demonstrates that the issue has nothing to do with Blackmagic and is a more general issue with interlaced video and decoding.
> > 
> > I can make the input file available if that would be helpful.
> > 
> > Anyway, it would be great if this bug fix could make it into ffmpeg.
> > 
> > Thanks,
> > Aaron Levinson  
> 
> I've provided a new version of the patch.  When I created the first version of the patch on March 26th, this was the first patch that I submitted to ffmpeg, and some aspects were rough.  I had indicated that the patch passed regression tests, but all I did was run "make fate", instead of "make fate SAMPLES=fate-suite/", and once I understood that I should use fate-suite, I discovered that some of the FATE tests failed with this patch.  I fixed the issues with the patch, adjusted some comments, and adjusted the patch description text.  I've confirmed that this patch passes all fate-suite tests for 64-bit MSVC on Windows and 64-bit gcc on Linux.
> 
> Thanks,
> Aaron Levinson
> 
> ------------------------------------------------------------------------
> 
> From 85aa383f5753795652bae015f4fe91b016f7387e Mon Sep 17 00:00:00 2001
> From: Aaron Levinson <alevinsn at aracnet.com>
> Date: Thu, 4 May 2017 22:54:31 -0700
> Subject: [PATCH] ffmpeg:  Fixed bug with decoding interlaced video
> 
> Purpose: Fixed bug in ffmpeg encountered when decoding interlaced
> video.  In some cases, ffmpeg would treat it as progressive.  As a
> result of the change, the issues with interlaced video are fixed.
> 
> Detailed description of problem: Run the following command: "ffmpeg -i
> test8_1080i.h264 -c:v mpeg2video test8_1080i_mp2.ts".  In this case,
> test8_1080i.h264 is an H.264-encoded file with 1080i59.94
> (interlaced).  Prior to the patch, the following output is generated:
> 
> Input #0, h264, from 'test8_1080i.h264':
>   Duration: N/A, bitrate: N/A
>     Stream #0:0: Video: h264 (High), yuv420p(top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 1200k tbn, 59.94 tbc
> Stream mapping:
>   Stream #0:0 -> #0:0 (h264 (native) -> mpeg2video (native))
> Press [q] to stop, [?] for help
> Output #0, mpegts, to 'test8_1080i_mp2_2.ts':
>   Metadata:
>     encoder         : Lavf57.72.100
>     Stream #0:0: Video: mpeg2video (Main), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 90k tbn, 29.97 tbc
>     Metadata:
>       encoder         : Lavc57.92.100 mpeg2video
> 
> which demonstrates the bug.  The output should instead look like:
> 
>     Stream #0:0: Video: mpeg2video (Main), yuv420p(top coded first (swapped)), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 90k tbn, 29.97 tbc
> 
> The bug is caused by the fact that reap_filters() calls
> init_output_stream(), which in turn calls check_init_output_file(),
> and this is all done prior to the first call to do_video_out().  An
> initial call to do_video_out() is necessary to populate the interlaced
> video information to the output stream's codecpar
> (mux_par->field_order in do_video_out()).  However,
> check_init_output_file() calls avformat_write_header() prior to the
> initial call to do_video_out(), so field_order is populated too late,
> and the header is written with the default field_order value,
> progressive.
> 
> Comments:
> 
> -- ffmpeg.c:
> a) Enhanced init_output_stream() to take an additional input
>    indicating whether or not check_init_output_file() should be
>    called.  There are other places that call init_output_stream(), and
>    it was important to make sure that these continued to work
>    properly.
> b) Adjusted reap_filters() such that, in the case that
>    init_output_stream() is called, it indicates that it should not
>    call check_init_output_file() in init_output_stream().  Instead, it
>    makes the call to check_init_output_file() directly, but after it
>    does the filtered frame setup and processing.
> 
> Signed-off-by: Aaron Levinson <alevinsn at aracnet.com>
> ---
>  ffmpeg.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 53 insertions(+), 8 deletions(-)
> 
> diff --git a/ffmpeg.c b/ffmpeg.c
> index e798d92277..45dbfafc04 100644
> --- a/ffmpeg.c
> +++ b/ffmpeg.c
> @@ -1400,7 +1400,7 @@ static void do_video_stats(OutputStream *ost, int frame_size)
>      }
>  }
>  
> -static int init_output_stream(OutputStream *ost, char *error, int error_len);
> +static int init_output_stream(OutputStream *ost, char *error, int error_len, int do_check_output_file);
>  
>  static void finish_output_stream(OutputStream *ost)
>  {
> @@ -1415,6 +1415,8 @@ static void finish_output_stream(OutputStream *ost)
>      }
>  }
>  
> +static int check_init_output_file(OutputFile *of, int file_index);
> +
>  /**
>   * Get and encode new output from any of the filtergraphs, without causing
>   * activity.
> @@ -1433,6 +1435,7 @@ static int reap_filters(int flush)
>          AVFilterContext *filter;
>          AVCodecContext *enc = ost->enc_ctx;
>          int ret = 0;
> +        int did_init_output_stream = 0;
>  
>          if (!ost->filter || !ost->filter->graph->graph)
>              continue;
> @@ -1440,12 +1443,14 @@ static int reap_filters(int flush)
>  
>          if (!ost->initialized) {
>              char error[1024] = "";
> -            ret = init_output_stream(ost, error, sizeof(error));
> +            ret = init_output_stream(ost, error, sizeof(error), 0);
>              if (ret < 0) {
>                  av_log(NULL, AV_LOG_ERROR, "Error initializing output stream %d:%d -- %s\n",
>                         ost->file_index, ost->index, error);
>                  exit_program(1);
>              }
> +
> +            did_init_output_stream = 1;
>          }
>  
>          if (!ost->filtered_frame && !(ost->filtered_frame = av_frame_alloc())) {
> @@ -1521,6 +1526,44 @@ static int reap_filters(int flush)
>              }
>  
>              av_frame_unref(filtered_frame);
> +
> +            /*
> +             * It is important to call check_init_output_file() here, after
> +             * do_video_out() was called, instead of in init_output_stream(),
> +             * as was done previously.
> +             * If called from init_output_stream(), it is possible that not
> +             * everything will have been fully initialized by the time that
> +             * check_init_output_file() is called, and if
> +             * check_init_output_file() determines that all output streams
> +             * have been initialized, it will write the header.  An example
> +             * of initialization that depends on do_video_out() being called
> +             * at least once is the specification of interlaced video, which
> +             * happens in do_video_out().  This is particularly relevant when
> +             * decoding--without processing a video frame, the interlaced
> +             * video setting is not propagated before the header is written,
> +             * and that causes problems.
> +             * TODO:  should probably handle interlaced video differently
> +             * and not depend on it being setup in do_video_out().  Another
> +             * solution would be to set it up once by examining the input
> +             * header.
> +             */
> +            if (did_init_output_stream) {
> +                ret = check_init_output_file(of, ost->file_index);
> +                if (ret < 0)
> +                    return ret;
> +                did_init_output_stream = 0;
> +            }
> +        }
> +
> +        /*
> +         * Can't wait too long to call check_init_output_file().
> +         * Otherwise, bad things start to occur.
> +         * If didn't do it earlier, do it by the time it gets here.
> +         */
> +        if (did_init_output_stream) {
> +            ret = check_init_output_file(of, ost->file_index);
> +            if (ret < 0)
> +                return ret;
>          }
>      }
>  
> @@ -1888,7 +1931,7 @@ static void flush_encoders(void)
>                  finish_output_stream(ost);
>              }
>  
> -            ret = init_output_stream(ost, error, sizeof(error));
> +            ret = init_output_stream(ost, error, sizeof(error), 1);
>              if (ret < 0) {
>                  av_log(NULL, AV_LOG_ERROR, "Error initializing output stream %d:%d -- %s\n",
>                         ost->file_index, ost->index, error);
> @@ -3396,7 +3439,7 @@ static int init_output_stream_encode(OutputStream *ost)
>      return 0;
>  }
>  
> -static int init_output_stream(OutputStream *ost, char *error, int error_len)
> +static int init_output_stream(OutputStream *ost, char *error, int error_len, int do_check_output_file)
>  {
>      int ret = 0;
>  
> @@ -3564,9 +3607,11 @@ static int init_output_stream(OutputStream *ost, char *error, int error_len)
>  
>      ost->initialized = 1;
>  
> -    ret = check_init_output_file(output_files[ost->file_index], ost->file_index);
> -    if (ret < 0)
> -        return ret;
> +    if (do_check_output_file) {
> +        ret = check_init_output_file(output_files[ost->file_index], ost->file_index);
> +        if (ret < 0)
> +            return ret;
> +    }
>  
>      return ret;
>  }
> @@ -3633,7 +3678,7 @@ static int transcode_init(void)
>          if (output_streams[i]->filter)
>              continue;
>  
> -        ret = init_output_stream(output_streams[i], error, sizeof(error));
> +        ret = init_output_stream(output_streams[i], error, sizeof(error), 1);
>          if (ret < 0)
>              goto dump_format;
>      }

As I said on IRC, I'm skeptic against this, but I'm also not sure
whether I understand the situation.

First, your change seems a bit fragile, and almost relies on
coincidences. This could easily break in the future. (And if we add a
FATE test, it would probably hit random people trying to change
ffmpeg.c, and would cause them more work.)

Looking at the current do_video_out() function (which btw. is way too
long and also has broken indentation), I see the following line:

  mux_par->field_order = AV_FIELD_PROGRESSIVE;

(and similar assignments to field_order to set interlaced modes)

This is apparently run on every frame. This looks pretty wrong to me.
It should _never_ change the codecpar after headers were written. This
is simply invalid API use.

If this is really a parameter that can change per frame, and that the
_muxer_ needs this information (per packet I suppose), a side data
signaling the field order should be added. mpegtsenc.c doesn't seem to
read field_order at all, though.

So I remain confused.