[FFmpeg-user] The concat filter and duplicate frames from prores files
Nick Ludlam
nick at recoil.org
Mon Aug 6 00:08:59 EEST 2018
Hi all,
I’m seeing some puzzling behaviour when joining a set of ProRes QuickTime files with the concat filter and encoding the result down to an mp4.
QuickTime files produced by the video editing software we’re using cannot be concatenated without producing duplicate frames. In a reduced case, I can demonstrate this by joining a video to itself three times: a duplicate frame is reliably inserted between the second and third sections.
If we use Adobe Media Encoder to “rewrap” the original ProRes files, they concatenate correctly with no dupes.
I’ve got a capture of the session at https://gist.github.com/nickludlam/5a8d43f7d54d5f0b626c7b6d0eca7756, where the report of duplicate frames appears at line 133 of the gist, but I’m also going to paste it here for convenience.
Is there a likely culprit for this? Could the audio be fractionally longer than the video, or could the timestamps be causing the concat filter to behave this way? I would ultimately like to remove the dependency on AME from our pipeline, so I’m keen to understand how this is happening.
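One way to test the audio-length theory is to redo the arithmetic with the figures from the log pasted below: 142 video frames at 25 fps and 273408 audio samples at 48000 Hz per input. This is only a sketch. It assumes that concat makes each segment as long as its longest stream (as the filter documentation describes) and, as an unconfirmed assumption about the filter's internals, that the start of each following segment is rounded to the nearest video frame. All names are illustrative.

```python
import math
from fractions import Fraction

# Figures copied from the log in this post (identical for all three inputs).
VIDEO_FRAMES = 142
FPS = 25
AUDIO_SAMPLES = 273408
SAMPLE_RATE = 48000

video_dur = Fraction(VIDEO_FRAMES, FPS)           # 5.68 s
audio_dur = Fraction(AUDIO_SAMPLES, SAMPLE_RATE)  # 5.696 s
# Assumption: a segment lasts as long as its longest stream.
segment_dur = max(video_dur, audio_dur)


def round_to_nearest(x):
    """Round a non-negative Fraction to the nearest integer (halves go up)."""
    return math.floor(x + Fraction(1, 2))


def segment_start_frames(n_segments):
    """Video frame index at which each segment would start.

    Assumption: the accumulated segment duration is converted to the
    1/25 video timebase by rounding to the nearest frame.
    """
    return [round_to_nearest(i * segment_dur * FPS) for i in range(n_segments)]


def gaps_between_segments(n_segments):
    """Empty frame slots left between consecutive segments."""
    starts = segment_start_frames(n_segments)
    return [starts[i + 1] - (starts[i] + VIDEO_FRAMES)
            for i in range(n_segments - 1)]


if __name__ == "__main__":
    print("audio overhang in frames:", float((audio_dur - video_dur) * FPS))
    print("segment start frames:", segment_start_frames(3))
    print("gaps between segments:", gaps_between_segments(3))
```

Under those assumptions the audio overhangs the video by 0.4 of a frame per segment. The accumulated offset rounds down after the first segment (142.4) and up after the second (284.8), which would leave a one-frame hole before the third copy and give 427 output frames in total, consistent with the dup=1 and frame= 427 reported in the log.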
I’ve started to use ffprobe to have a look at frames and packets, but without an idea of what to look for, it’s a bit difficult to make sense of the data.
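As a starting point for what to look for, comparing the per-stream durations that ffprobe reports may be enough. The sketch below assumes ffprobe is on the PATH and that the file has one video and one audio stream; the helper names and the sample report are hypothetical, not output from the files discussed here.

```python
import json
import subprocess


def probe_json(path):
    """Run ffprobe on `path` and return its JSON report as text."""
    cmd = [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=index,codec_type,duration",
        "-of", "json", path,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def stream_durations(report_text):
    """Map codec_type -> duration in seconds for audio and video streams.

    Assumes at most one stream of each type; a later stream of the same
    type would overwrite an earlier one.
    """
    streams = json.loads(report_text)["streams"]
    return {
        s["codec_type"]: float(s["duration"])
        for s in streams
        if s.get("codec_type") in ("audio", "video") and "duration" in s
    }


def audio_overhang_in_frames(report_text, fps):
    """How far the audio runs past the video, in video frames."""
    durations = stream_durations(report_text)
    return (durations["audio"] - durations["video"]) * fps


if __name__ == "__main__":
    # Hypothetical report shaped like ffprobe's JSON output.
    sample = json.dumps({"streams": [
        {"index": 0, "codec_type": "video", "duration": "5.680000"},
        {"index": 1, "codec_type": "audio", "duration": "5.696000"},
        {"index": 2, "codec_type": "data", "duration": "5.680000"},
    ]})
    print(stream_durations(sample))
    print(audio_overhang_in_frames(sample, 25))
```

On a real file, `audio_overhang_in_frames(probe_json("input.mov"), 25)` would do the same comparison; a non-zero fraction of a frame is the kind of mismatch that could accumulate across concatenated segments.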
Thanks,
Nick
$ ffmpeg -loglevel verbose \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-i /Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov \
-t 1 -f lavfi -i anullsrc=r=48000:cl=stereo \
-pix_fmt yuv420p \
-filter_complex "[0:v] [0:a] [1:v] [1:a] [2:v] [2:a] concat=n=3:v=1:a=1[v][a]" \
-map "[v]" -map "[a]" \
-preset fast -c:v libx264 -b:v 2000k -c:a aac -b:a 96k \
/tmp/output.mp4
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
built with Apple LLVM version 9.1.0 (clang-902.0.39.2)
configuration: --prefix=/usr/local/Cellar/ffmpeg/4.0.2 --enable-shared --enable-pthreads --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libfreetype --enable-libmp3lame --enable-libx264 --enable-libxvid --enable-opencl --enable-videotoolbox --disable-lzma
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2018-07-27T15:02:45.000000Z
Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
Stream #0:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
encoder : Apple ProRes 422 HQ
timecode : 07:47:36:03
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
Stream #0:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
timecode : 07:47:36:03
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2018-07-27T15:02:45.000000Z
Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
Stream #1:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
encoder : Apple ProRes 422 HQ
timecode : 07:47:36:03
Stream #1:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
Stream #1:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
timecode : 07:47:36:03
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
creation_time : 2018-07-27T15:02:45.000000Z
Duration: 00:00:05.68, start: 0.000000, bitrate: 187659 kb/s
Stream #2:0(und): Video: prores, 1 reference frame (apch / 0x68637061), yuv422p10le(bt709, progressive), 1080x1920, 187329 kb/s, SAR 1:1 DAR 9:16, 25 fps, 25 tbr, 25 tbn, 25 tbc (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
encoder : Apple ProRes 422 HQ
timecode : 07:47:36:03
Stream #2:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 320 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
Stream #2:2(und): Data: none (tmcd / 0x64636D74), 0 kb/s (default)
Metadata:
creation_time : 2018-07-27T15:02:45.000000Z
handler_name : Core Media Data Handler
timecode : 07:47:36:03
[Parsed_anullsrc_0 @ 0x7f8f3b50ffc0] sample_rate:48000 channel_layout:'stereo' nb_samples:1024
Input #3, lavfi, from 'anullsrc=r=48000:cl=stereo':
Duration: N/A, start: 0.000000, bitrate: 768 kb/s
Stream #3:0: Audio: pcm_u8, 48000 Hz, stereo, u8, 768 kb/s
File '/tmp/output.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
Stream #0:0 (prores) -> concat:in0:v0
Stream #0:1 (aac) -> concat:in0:a0
Stream #1:0 (prores) -> concat:in1:v0
Stream #1:1 (aac) -> concat:in1:a0
Stream #2:0 (prores) -> concat:in2:v0
Stream #2:1 (aac) -> concat:in2:a0
concat:out:v0 -> Stream #0:0 (libx264)
concat:out:a0 -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[graph 0 input from stream 0:0 @ 0x7f8f3b613cc0] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_0_1 @ 0x7f8f3b614380] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[graph 0 input from stream 1:0 @ 0x7f8f3b614500] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_1_1 @ 0x7f8f3b614900] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[graph 0 input from stream 2:0 @ 0x7f8f3b614d40] w:1080 h:1920 pixfmt:yuv422p10le tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
[graph_0_in_2_1 @ 0x7f8f3b615100] tb:1/48000 samplefmt:fltp samplerate:48000 chlayout:0x3
[auto_scaler_0 @ 0x7f8f3b616cc0] w:iw h:ih flags:'bilinear' interl:0
[format @ 0x7f8f3b616000] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_concat_0' and the filter 'format'
[auto_scaler_0 @ 0x7f8f3b616cc0] w:1080 h:1920 fmt:yuv422p10le sar:1/1 -> w:1080 h:1920 fmt:yuv420p sar:1/1 flags:0x2
[libx264 @ 0x7f8f3d01d600] using SAR=1/1
[libx264 @ 0x7f8f3d01d600] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7f8f3d01d600] profile High, level 4.0
[libx264 @ 0x7f8f3d01d600] 264 - core 152 r2854 e9a5903 - H.264/MPEG-4 AVC codec - Copyleft 2003-2017 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=30 rc=abr mbtree=1 bitrate=2000 ratetol=1.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to '/tmp/output.mp4':
Metadata:
major_brand : qt
minor_version : 0
compatible_brands: qt
encoder : Lavf58.12.100
Stream #0:0: Video: h264 (libx264), 1 reference frame (avc1 / 0x31637661), yuv420p(progressive), 1080x1920 [SAR 1:1 DAR 9:16], q=-1--1, 2000 kb/s, 25 fps, 12800 tbn, 25 tbc (default)
Metadata:
encoder : Lavc58.18.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 0 vbv_delay: -1
Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, delay 1024, 96 kb/s (default)
Metadata:
encoder : Lavc58.18.100 aac
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in0:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in0:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=5696000
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in1:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in1:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=11392000
*** 1 dup!
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in2:v0, 1 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] EOF on in2:a0, 0 streams left in segment.
[Parsed_concat_0 @ 0x7f8f3b6139c0] Segment finished at pts=17088000
No more output streams to write to, finishing.
frame= 427 fps= 42 q=-1.0 Lsize= 4550kB time=00:00:17.08 bitrate=2181.0kbits/s dup=1 drop=0 speed= 1.7x
video:4328kB audio:208kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.309756%
Input file #0 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
Input stream #0:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
Input stream #0:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
Input stream #0:2 (data): 0 packets read (0 bytes);
Total: 409 packets (133231695 bytes) demuxed
Input file #1 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
Input stream #1:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
Input stream #1:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
Input stream #1:2 (data): 0 packets read (0 bytes);
Total: 409 packets (133231695 bytes) demuxed
Input file #2 (/Users/nickludlam/Work/data-ingest/Resolve_prores_422/s1-4-1_004_04_25mm_5.mov):
Input stream #2:0 (video): 142 packets read (133003744 bytes); 142 frames decoded;
Input stream #2:1 (audio): 267 packets read (227951 bytes); 267 frames decoded (273408 samples);
Input stream #2:2 (data): 0 packets read (0 bytes);
Total: 409 packets (133231695 bytes) demuxed
Input file #3 (anullsrc=r=48000:cl=stereo):
Input stream #3:0 (audio): 0 packets read (0 bytes);
Total: 0 packets (0 bytes) demuxed
Output file #0 (/tmp/output.mp4):
Output stream #0:0 (video): 427 frames encoded; 427 packets muxed (4431401 bytes);
Output stream #0:1 (audio): 801 frames encoded (820224 samples); 802 packets muxed (212903 bytes);
Total: 1229 packets (4644304 bytes) muxed
[libx264 @ 0x7f8f3d01d600] frame I:9 Avg QP:24.44 size: 40665
[libx264 @ 0x7f8f3d01d600] frame P:111 Avg QP:27.69 size: 16127
[libx264 @ 0x7f8f3d01d600] frame B:307 Avg QP:28.80 size: 7409
[libx264 @ 0x7f8f3d01d600] consecutive B-frames: 3.5% 1.4% 1.4% 93.7%
[libx264 @ 0x7f8f3d01d600] mb I I16..4: 33.7% 60.9% 5.4%
[libx264 @ 0x7f8f3d01d600] mb P I16..4: 5.9% 8.2% 0.5% P16..4: 32.3% 5.0% 3.0% 0.0% 0.0% skip:45.1%
[libx264 @ 0x7f8f3d01d600] mb B I16..4: 2.5% 5.7% 0.0% B16..8: 16.9% 1.6% 0.0% direct:12.2% skip:61.0% L0:44.6% L1:52.6% BI: 2.7%
[libx264 @ 0x7f8f3d01d600] final ratefactor: 26.28
[libx264 @ 0x7f8f3d01d600] 8x8 transform intra:63.6% inter:89.1%
[libx264 @ 0x7f8f3d01d600] coded y,uvDC,uvAC intra: 28.5% 50.8% 8.1% inter: 7.2% 11.5% 0.0%
[libx264 @ 0x7f8f3d01d600] i16 v,h,dc,p: 27% 23% 9% 42%
[libx264 @ 0x7f8f3d01d600] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 21% 19% 28% 5% 5% 6% 6% 6% 4%
[libx264 @ 0x7f8f3d01d600] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 19% 16% 7% 8% 7% 9% 5% 3%
[libx264 @ 0x7f8f3d01d600] i8c dc,h,v,p: 60% 25% 11% 4%
[libx264 @ 0x7f8f3d01d600] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7f8f3d01d600] ref P L0: 70.5% 29.5%
[libx264 @ 0x7f8f3d01d600] ref B L0: 86.8% 13.2%
[libx264 @ 0x7f8f3d01d600] ref B L1: 96.3% 3.7%
[libx264 @ 0x7f8f3d01d600] kb/s:2075.27
[aac @ 0x7f8f3d03e600] Qavg: 523.747