[FFmpeg-user] Cutting & splicing AVC -- read me
Mark Filipak
markfilipak.imdb at gmail.com
Thu Aug 1 23:41:37 EEST 2024
I've spent the past 3 months working on this [note1]. During that time I've submitted 2 bogus bugs
and wasted other people's time. I've tried other people's patience. I'm sorry. I apologize, but it
was not my fault. You see, I believed FFmpeg documentation. I believed x264 status reports [note2].
Success: I've trimmed, spliced, and AVC/m2ts-to-AVC/mp4 converted a 5-hour video (50 G-bytes reduced
to 9 G-bytes) with no visible loss of resolution, even viewed at 4x. I include the -x264-params
[note3] so that everyone can have a method to get a visually lossless reduction of over 5x.
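For anyone who wants to try the conversion step, here is a minimal sketch in Python that only assembles an ffmpeg command line around the -x264-params string. The function name, the file names, and the choice to copy the audio are assumptions for illustration, not the exact command used for the video above; the parameter string is a shortened subset of [note3], and trimming and splicing are not shown.

```python
import shlex

def build_transcode_command(src, dst, x264_params):
    """Assemble (but do not run) an ffmpeg libx264 transcode command.

    Hypothetical helper: src, dst and the audio handling are assumptions.
    """
    return [
        "ffmpeg",
        "-i", src,                    # input, e.g. an AVC/m2ts file
        "-c:v", "libx264",            # re-encode the video with libx264
        "-x264-params", x264_params,  # colon-separated key=value pairs
        "-c:a", "copy",               # assumption: audio is copied, not re-encoded
        dst,                          # output, e.g. an mp4 file
    ]

# Shortened subset of the parameters listed in [note3].
PARAMS = "ref=5:bframes=5:b-pyramid=2:keyint=240:min-keyint=23:bitrate=2850"

if __name__ == "__main__":
    cmd = build_transcode_command("input.m2ts", "output.mp4", PARAMS)
    # Print a shell-quoted command; pass cmd to subprocess.run() to execute it.
    print(shlex.join(cmd))
```

Substitute the full string from [note3] for PARAMS to reproduce the settings described here.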
So, what caused me to waste 3 months? Bogus documentation and bogus status reports [note2]. FFmpeg
led me to believe that AVC has I- P- & B-frames and GOPs. It does not and never did.
MPEG-2 can be altered via remuxing but AVC cannot [note4]. However, FFmpeg documentation makes users
think that AVC is like MPEG-2 and can be altered via remuxing.
Here's the truth about AVC.
An AVC stream is not populated by frames, it is populated by AUs (access units) which are, in turn,
populated by various NAL units (network abstraction layer units). My video has 448297 AUs. They are
not frames, they are AUs. They arrive at the times that frames would arrive if frames were streamed.
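To make the AU/NAL relationship concrete, here is a rough Python sketch that splits an H.264 Annex B byte stream into NAL units and counts access units. It assumes the stream is raw Annex B (start-code delimited, not the length-prefixed form stored in mp4) and that every AU begins with an access unit delimiter (nal_unit_type 9), which is typical of m2ts sources but not guaranteed. The function names and the demo bytes are made up for illustration; the SPS/PPS payloads are placeholders, not decodable parameter sets.

```python
import re
from collections import Counter

# Annex B start code 00 00 01 (the 4-byte form just has one more leading zero).
START_CODE = re.compile(b"\x00\x00\x01")

def split_nal_units(stream):
    """Return the NAL units found between Annex B start codes."""
    matches = list(START_CODE.finditer(stream))
    units = []
    for i, match in enumerate(matches):
        begin = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(stream)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next start code (or are stream padding) and can be dropped.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units

def nal_unit_type(nal):
    """The low 5 bits of the first byte are nal_unit_type."""
    return nal[0] & 0x1F

def count_access_units(stream):
    """Count AUs by counting access unit delimiters (assumes AUDs are present)."""
    types = Counter(nal_unit_type(n) for n in split_nal_units(stream))
    return types[9], types

# Two AUs: AUD + SPS + PPS + IDR slice, then AUD + non-IDR slice.
DEMO = (
    b"\x00\x00\x00\x01\x09\xf0"      # type 9: access unit delimiter
    b"\x00\x00\x00\x01\x67\x42\x80"  # type 7: SPS (placeholder payload)
    b"\x00\x00\x00\x01\x68\xce\x80"  # type 8: PPS (placeholder payload)
    b"\x00\x00\x01\x65\x88\x80"      # type 5: IDR slice
    b"\x00\x00\x00\x01\x09\xf0"      # type 9: access unit delimiter
    b"\x00\x00\x01\x41\x9a\x80"      # type 1: non-IDR slice
)

if __name__ == "__main__":
    au_count, types = count_access_units(DEMO)
    print("access units:", au_count)
    print("NAL unit types:", dict(types))
```

On a stream without AUDs, AU boundaries have to be inferred from the slice headers instead, which this sketch does not attempt.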
I tried to promote discussion in my thread "The future of video" but that went nowhere, and I
didn't want to be pedantic, so I let it drop.
NAL units are what are muxed. AUs are the abstracted equivalents of frames. H.264 mentions that NAL
payloads are RBSPs (raw byte sequence payloads) but of course they are not raw bytes (so ITU/MPEG is
simply using sloppy nomenclature once again). An MPEG-2 GOP (group of pictures) is equivalent to
NAL-5, i.e., IDR-VCL (instantaneous decoding refresh, video coding layer) in AVC. All NAL-VCLs can
have I-, P-, & B-slices. AVC does not distinguish I, P, B except as slices.
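Since the I/P/B distinction lives in the slice header, here is a hedged Python sketch that reads slice_type from a single VCL NAL unit. It assumes the NAL unit has already been extracted from the stream (no start code), handles only nal_unit_type 1 (non-IDR slice) and 5 (IDR slice), and reads just the first two Exp-Golomb fields of the slice header (first_mb_in_slice, then slice_type). The class and function names are hypothetical, and the sample bytes are hand-built for illustration.

```python
SLICE_TYPE_NAMES = {0: "P", 1: "B", 2: "I", 3: "SP", 4: "SI"}

def strip_emulation_prevention(data):
    """Drop each 0x03 that follows two zero bytes (emulation prevention)."""
    out = bytearray()
    zeros = 0
    for byte in data:
        if zeros >= 2 and byte == 3:
            zeros = 0
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)

class BitReader:
    """Reads single bits and unsigned Exp-Golomb codes, MSB first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def bit(self):
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def ue(self):
        leading_zeros = 0
        while self.bit() == 0:
            leading_zeros += 1
        suffix = 0
        for _ in range(leading_zeros):
            suffix = (suffix << 1) | self.bit()
        return (1 << leading_zeros) - 1 + suffix

def describe_nal(nal):
    """Describe one NAL unit: its type and, for coded slices, its slice type."""
    nal_type = nal[0] & 0x1F
    if nal_type not in (1, 5):
        return f"NAL type {nal_type} (not a coded slice handled here)"
    reader = BitReader(strip_emulation_prevention(nal[1:]))
    reader.ue()                  # first_mb_in_slice (ignored)
    slice_type = reader.ue() % 5  # values 5..9 repeat 0..4
    kind = "IDR" if nal_type == 5 else "non-IDR"
    return f"{kind} {SLICE_TYPE_NAMES[slice_type]}-slice"

if __name__ == "__main__":
    for sample in (b"\x65\x88\x80", b"\x41\x9a\x80", b"\x01\x9e\x80"):
        print(sample.hex(), "->", describe_nal(sample))
```

A picture can be coded as several slices (slices=4 in [note3]), so one AU may yield several such results.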
FFmpeg's AVC documentation (and probably VideoLAN's x264 documentation, too) needs to be fixed.
[note1] Believe me, a waste of 3 months is a tragedy to someone 77 years old.
[note2] Evidence of libx264's bogus status.
[libx264 @ 0000000000577fc0] frame I:536 Avg QP:23.92 size:103564
[libx264 @ 0000000000577fc0] frame P:24209 Avg QP:27.01 size: 31261
[libx264 @ 0000000000577fc0] frame B:83229 Avg QP:28.45 size: 9510
[libx264 @ 0000000000577fc0] consecutive B-frames: 0.7% 0.9% 9.0% 33.5% 33.6% 22.4%
[libx264 @ 0000000000577fc0] mb I I16..4: 13.4% 79.8% 6.8%
[libx264 @ 0000000000577fc0] mb P I16..4: 2.7% 16.1% 0.3% P16..4: 40.3% 5.8% 4.4% 0.0% 0.0% skip:30.4%
[libx264 @ 0000000000577fc0] mb B I16..4: 0.3% 1.9% 0.0% B16..8: 34.5% 1.4% 0.2% direct: 2.2% skip:59.7% L0:46.3% L1:52.3% BI: 1.4%
[libx264 @ 0000000000577fc0] final ratefactor: 26.61
[libx264 @ 0000000000577fc0] 8x8 transform intra:84.7% inter:86.2%
[libx264 @ 0000000000577fc0] direct mvs spatial:100.0% temporal:0.0%
[libx264 @ 0000000000577fc0] coded y,uvDC,uvAC intra: 59.1% 57.4% 20.4% inter: 7.7% 16.1% 0.8%
[libx264 @ 0000000000577fc0] i16 v,h,dc,p: 34% 18% 20% 28%
[libx264 @ 0000000000577fc0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 9% 55% 4% 3% 3% 3% 4% 3%
[libx264 @ 0000000000577fc0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 13% 29% 6% 7% 7% 6% 6% 4%
[libx264 @ 0000000000577fc0] i8c dc,h,v,p: 69% 13% 16% 2%
[libx264 @ 0000000000577fc0] Weighted P-Frames: Y:3.3% UV:0.8%
[libx264 @ 0000000000577fc0] ref P L0: 46.0% 11.8% 23.8% 7.9% 10.2% 0.4% 0.0%
[libx264 @ 0000000000577fc0] ref B L0: 79.7% 13.4% 5.2% 1.6%
[libx264 @ 0000000000577fc0] ref B L1: 96.3% 3.7%
[libx264 @ 0000000000577fc0] kb/s:2849.11
[note3] -x264-params
ref=5:deblock=-1,-1:me=umh:subme=7:psy-rd=1.0,0.0:merange=30:trellis=1:cqm=flat:deadzone-inter=21:deadzone-intra=11:chroma-qp-offset=-2:threads=12:lookahead-threads=3:sliced-threads=0:slices=4:nr=0:constrained-intra=0:bframes=5:b-pyramid=2:b-adapt=2:b-bias=0:direct=auto:weightp=2:keyint=240:min-keyint=23:scenecut=40:rc-lookahead=50:bitrate=2850:ratetol=1.0:qcomp=0.60:qpmin=1:qpmax=69:qpstep=4:cplxblur=20.0:qblur=0.5:vbv-maxrate=62500:vbv-bufsize=78125:nal-hrd=vbr:filler=0:ipratio=1.40:aq-mode=1:aq-strength=1.00
[note4] It should be possible to fabricate an IDR NAL in the zeroth AU and thereby avoid
transcoding, but to date, no one has done it.
--Mark.