[FFmpeg-user] filter_complex and map. Am i confused or bug?

Thu Jan 31 14:41:42 EET 2019

Hi guys,

What I'm trying to do should, in theory, be very simple, but it seems I don't understand how ffmpeg routes the different tracks.

Basically I have a file with a single video track and a single audio track.

What I want to achieve is:

a) Scale the video track to 720x576 and encode it in H264 @ 2M with 2M minrate and 2M maxrate

b) Scale the video track to 640x480 and encode it in H264 @ 1M with 1M minrate and 1M maxrate

c) Pick the audio track and encode it in aac

d) Mux all the tracks together in this order: first Video 640x480, second Video 720x576, third Audio

So I thought I would run this command (I'll split it section by section and comment on each part):

ffmpeg -i INPUTFILE \    <- this is quite clear :)

-filter_complex "[0:v]scale=720:576[vidout1];[0:v]scale=640:480[vidout2]" \ <- here I pick up the video, scale it to 720x576 and route it to [vidout1], then scale it again to 640x480 and route it to [vidout2]

-c:v libx264 -c:a aac \ <- here I specify that all video tracks have to be encoded in x264, all audio tracks in aac

-b:v:vidout1 2M -minrate:v:vidout1 2M -maxrate:v:vidout1 2M \ <- I specify that [vidout1] has to be encoded in 2M/2M/2M

-b:v:vidout2 1M -minrate:v:vidout2 1M -maxrate:v:vidout2 1M \ <- I specify that [vidout2] has to be encoded in 1M/1M/1M

-map [vidout2] -map [vidout1] -map 0:a \ <- here I specify that the order has to be 640x480, 720x576, Audio

-f mpegts test.ts

So the final command is the following:

ffmpeg -i INPUTFILE -filter_complex "[0:v]scale=720:576[vidout1];[0:v]scale=640:480[vidout2]" -c:v libx264 -c:a aac -b:v:vidout1 2M -minrate:v:vidout1 2M -maxrate:v:vidout1 2M -b:v:vidout2 1M -minrate:v:vidout2 1M -maxrate:v:vidout2 1M -map [vidout2] -map [vidout1] -map 0:a -f mpegts test.ts

but the output is definitely different from what I expected:

Output #0, mpegts, to 'test.ts':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Video: h264 (libx264), yuv420p(top coded first (swapped)), 640x480 [SAR 4:3 DAR 16:9], q=-1--1, 1000 kb/s, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 1000000/0/1000000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Video: h264 (libx264), yuv420p, 720x576 [SAR 64:45 DAR 16:9], q=-1--1, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:2: Audio: aac (LC), 48000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.35.100 aac

The order of the streams is correct (first 640x480, then 720x576, then audio), but the encoding settings were applied correctly only to the first track and completely ignored for the second track.

So I tried changing the order of the video tracks to see what would happen:

-map [vidout1] -map [vidout2] -map 0:a

And the result is:

Output #0, mpegts, to 'test.ts':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Video: h264 (libx264), 1 reference frame, yuv420p(top coded first (swapped)), 720x576 [SAR 64:45 DAR 16:9], q=-1--1, 1000 kb/s, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 1000000/0/1000000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Video: h264 (libx264), 1 reference frame, yuv420p, 640x480 [SAR 4:3 DAR 16:9], q=-1--1, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:2: Audio: aac (LC), 48000 Hz, stereo, fltp, delay 1024, 128 kb/s
    Metadata:
      encoder         : Lavc58.35.100 aac

As before, the tracks are in the expected order. But not only was one of the two tracks not encoded as expected: surprisingly, the first track was now encoded with the characteristics I intended for the second track! So vidout1 was encoded with the parameters I expected to be applied to vidout2, and vidout2 ignored the settings altogether.

But even more surprisingly, if I remove "-b:v:vidout2 1M -minrate:v:vidout2 1M -maxrate:v:vidout2 1M" the result is this:

Output #0, mpegts, to 'test.ts':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Video: h264 (libx264), 1 reference frame, yuv420p(top coded first (swapped)), 720x576 [SAR 64:45 DAR 16:9], q=-1--1, 2000 kb/s, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 2000000/0/2000000 buffer size: 0 vbv_delay: -1
    Stream #0:1: Video: h264 (libx264), 1 reference frame, yuv420p, 640x480 [SAR 4:3 DAR 16:9], q=-1--1, 25 fps, 90k tbn, 25 tbc
    Metadata:
      encoder         : Lavc58.35.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:2: Audio: aac (LC), 48000 Hz, stereo, fltp, delay 1024, 128 kb/s
    Metadata:
      encoder         : Lavc58.35.100 aac

So now vidout1 is encoded almost as expected, except for the minrate (the cpb line still reports min = 0).
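
To rule out that only the console banner is misleading, the written file can also be inspected directly. This is a minimal sketch, assuming ffprobe from the same build is on the PATH and that test.ts is the file produced by the commands above. The bit_rate entry may well be reported as N/A for streams inside MPEG-TS, so the stream order and frame sizes are the reliable part of the listing.

```shell
# Sketch: list index, type, codec and frame size of every stream in the output.
OUT="${OUT:-test.ts}"   # assumed output name, same as in the commands above

# Build the ffprobe argument list once so it can be printed and reused.
set -- -v error \
  -show_entries stream=index,codec_type,codec_name,width,height,bit_rate \
  -of default=noprint_wrappers=1 "$OUT"

echo "ffprobe $*"   # print the command for review

# Run it only when the tool and the file are actually present.
if [ -f "$OUT" ] && command -v ffprobe >/dev/null 2>&1; then
  ffprobe "$@"
fi
```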

What am I doing wrong?

Did I totally misunderstand how to use filter_complex and map, or is there a bug preventing the mechanism from working properly?
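
For comparison, here is a variant that addresses the per-stream options by output stream index instead of by filtergraph label. It is a sketch and not a confirmed fix: it assumes that in a specifier such as -b:v:N the last field is the index among the video streams of the output (counted in -map order), so v:0 would be the 640x480 stream and v:1 the 720x576 one. INPUTFILE is the same placeholder as above.

```shell
# Sketch only: per-stream options addressed by output stream index (v:0, v:1),
# not by the filtergraph labels [vidout1]/[vidout2].
INPUT="${INPUT:-INPUTFILE}"   # placeholder input name, as in the original command

# The -map order defines the output order: v:0 = 640x480, v:1 = 720x576.
set -- -i "$INPUT" \
  -filter_complex "[0:v]scale=720:576[vidout1];[0:v]scale=640:480[vidout2]" \
  -map "[vidout2]" -map "[vidout1]" -map 0:a \
  -c:v libx264 -c:a aac \
  -b:v:0 1M -minrate:v:0 1M -maxrate:v:0 1M \
  -b:v:1 2M -minrate:v:1 2M -maxrate:v:1 2M \
  -f mpegts test.ts

echo "ffmpeg $*"   # print the command for review

# Run it only when ffmpeg is installed and the input file exists.
if [ -f "$INPUT" ] && command -v ffmpeg >/dev/null 2>&1; then
  ffmpeg "$@"
fi
```

If that assumption holds, swapping the two video -map arguments would also require swapping the v:0 and v:1 values, since the indices follow the output order and not the labels.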

Thanks in advance,

Alex Molon