[FFmpeg-user] Cutting & splicing AVC -- read me

Thu Aug 1 23:41:37 EEST 2024

I've spent the past 3 months working on this [note1]. During that time I've submitted 2 bogus bug 
reports, wasted other people's time, and tried their patience. I'm sorry. I apologize, but it 
was not my fault. You see, I believed the FFmpeg documentation. I believed the x264 status reports 
[note2].

Success: I've trimmed, spliced, and converted a 5-hour video from AVC/m2ts to AVC/mp4 (50 G-bytes 
reduced to 9 G-bytes) with no visible loss of resolution, even viewed at 4x. I include the 
-x264-params [note3] so that everyone can have a method to get a visually lossless reduction of 
over 5x.

So, what caused me to waste 3 months? Bogus documentation and bogus status reports [note2]. FFmpeg 
led me to believe that AVC has I- P- & B-frames and GOPs. It does not and never did.

MPEG-2 can be altered via remuxing but AVC cannot [note4]. However, FFmpeg documentation makes users 
think that AVC is like MPEG-2 and can be altered via remuxing.

Here's the truth about AVC.

An AVC stream is not populated by frames, it is populated by AUs (access units) which are, in turn, 
populated by various NAL units (network abstraction layer units). My video has 448297 AUs. They are 
not frames, they are AUs. They arrive at the times that frames would arrive if frames were streamed. 
I tried to promote discussion of this in my thread "The future of video", but that failed, and I 
didn't want to be pedantic, so I let it drop.
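
A minimal sketch of the AU/NAL layout described above, assuming an Annex B byte stream (NAL units 
separated by 00 00 01 start codes) that carries access unit delimiters (NAL type 9), as m2ts 
streams normally do. The function names and the sample bytes are hypothetical; only the one-byte 
NAL header layout and the type numbers come from H.264.

```python
# Sketch only: split an Annex B byte stream into NAL units and read their headers.
NAL_TYPE_NAMES = {
    1: "non-IDR slice (VCL)",
    5: "IDR slice (VCL)",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}

def split_nal_units(data: bytes):
    """Yield NAL units (start codes removed) from an Annex B byte stream."""
    starts = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = data.find(b"\x00\x00\x01", i + 3)
    for n, start in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(data)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next (4-byte) start code or to padding.
        nal = data[start:end].rstrip(b"\x00")
        if nal:
            yield nal

def nal_header(nal: bytes) -> dict:
    """Decode the 1-byte NAL header: 1 + 2 + 5 bits."""
    b = nal[0]
    return {
        "forbidden_zero_bit": b >> 7,
        "nal_ref_idc": (b >> 5) & 0x03,
        "nal_unit_type": b & 0x1F,
    }

def count_access_units(data: bytes) -> int:
    """Count AUs by counting delimiters (assumes every AU starts with one)."""
    return sum(1 for nal in split_nal_units(data)
               if nal_header(nal)["nal_unit_type"] == 9)

# Hypothetical two-AU stream: AUD, SPS, PPS, IDR slice, then AUD, non-IDR slice.
SC = b"\x00\x00\x00\x01"
stream = SC + SC.join([
    b"\x09\xf0", b"\x67\x42\x80\x1e", b"\x68\xce\x3c\x80", b"\x65\x88\x80",
    b"\x09\xf0", b"\x41\x98\x80",
])

types = [nal_header(nal)["nal_unit_type"] for nal in split_nal_units(stream)]
for t in types:
    print(t, NAL_TYPE_NAMES.get(t, "other"))
print("access units:", count_access_units(stream))
```

On a real file, the same functions could be pointed at the raw .h264 elementary stream rather than 
the m2ts or mp4 container, which wrap the NAL units differently.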

NAL units are what are muxed. AUs are the abstracted equivalents of frames. H.264 says that NAL 
payloads are RBSPs (raw byte sequence payloads) but of course they are not raw bytes (so ITU/MPEG is 
simply using sloppy nomenclature once again). The start of an MPEG-2 GOP (group of pictures) is 
equivalent to NAL type 5, i.e., IDR-VCL (instantaneous decoding refresh, video coding layer) in AVC. 
VCL NAL units carry slices, and the slices are typed I, P, or B (plus the rarer SP and SI); an IDR 
picture carries only I or SI slices. AVC does not distinguish I P B except as slices.
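
To make the slice point concrete, here is a rough sketch of reading slice_type from a VCL NAL unit. 
It assumes the NAL unit has already been cut out of the stream, removes the emulation prevention 
bytes (the 03 in 00 00 03) that separate the NAL payload from the RBSP, and then reads the first 
two Exp-Golomb fields of the slice header (first_mb_in_slice, slice_type). The class and function 
names and the three sample NAL units are hypothetical.

```python
# Sketch only: recover slice_type (P/B/I/SP/SI) from a VCL NAL unit.
SLICE_TYPES = {0: "P", 1: "B", 2: "I", 3: "SP", 4: "SI"}

def strip_emulation_prevention(payload: bytes) -> bytes:
    """Drop each 0x03 that follows two zero bytes, giving the RBSP."""
    out = bytearray()
    zeros = 0
    for byte in payload:
        if zeros >= 2 and byte == 0x03:
            zeros = 0
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)

class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

    def read_ue(self) -> int:
        """Unsigned Exp-Golomb code, ue(v)."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.read_bits(leading_zeros)

def slice_type_of(nal: bytes) -> str:
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type not in (1, 5):
        raise ValueError("not a non-IDR or IDR slice NAL unit")
    reader = BitReader(strip_emulation_prevention(nal[1:]))
    reader.read_ue()               # first_mb_in_slice (ignored here)
    slice_type = reader.read_ue()  # values 5..9 repeat 0..4
    return SLICE_TYPES[slice_type % 5]

# Hypothetical, truncated NAL units: header byte plus the start of a slice header.
idr_nal = b"\x65\x88\x80"  # type 5, slice_type 7
p_nal = b"\x41\x98\x80"    # type 1, slice_type 5
b_nal = b"\x01\x9c\x80"    # type 1, slice_type 6

for nal in (idr_nal, p_nal, b_nal):
    print(nal[0] & 0x1F, slice_type_of(nal))
```

The NAL header alone only says IDR or non-IDR; the I/P/B distinction lives one level down, in each 
slice header.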

FFmpeg's AVC documentation (and probably VideoLAN's x264 documentation, too) needs to be fixed.

[note1] Believe me, a waste of 3 months is a tragedy to someone 77 years old.

[note2] Evidence of libx264's bogus status.
[libx264 @ 0000000000577fc0] frame I:536   Avg QP:23.92  size:103564
[libx264 @ 0000000000577fc0] frame P:24209 Avg QP:27.01  size: 31261
[libx264 @ 0000000000577fc0] frame B:83229 Avg QP:28.45  size:  9510
[libx264 @ 0000000000577fc0] consecutive B-frames:  0.7%  0.9%  9.0% 33.5% 33.6% 22.4%
[libx264 @ 0000000000577fc0] mb I  I16..4: 13.4% 79.8%  6.8%
[libx264 @ 0000000000577fc0] mb P  I16..4:  2.7% 16.1%  0.3%  P16..4: 40.3%  5.8%  4.4%  0.0%  0.0%    skip:30.4%
[libx264 @ 0000000000577fc0] mb B  I16..4:  0.3%  1.9%  0.0%  B16..8: 34.5%  1.4%  0.2%  direct: 2.2%  skip:59.7%  L0:46.3% L1:52.3% BI: 1.4%
[libx264 @ 0000000000577fc0] final ratefactor: 26.61
[libx264 @ 0000000000577fc0] 8x8 transform intra:84.7% inter:86.2%
[libx264 @ 0000000000577fc0] direct mvs  spatial:100.0% temporal:0.0%
[libx264 @ 0000000000577fc0] coded y,uvDC,uvAC intra: 59.1% 57.4% 20.4% inter: 7.7% 16.1% 0.8%
[libx264 @ 0000000000577fc0] i16 v,h,dc,p: 34% 18% 20% 28%
[libx264 @ 0000000000577fc0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15%  9% 55%  4%  3%  3%  3%  4%  3%
[libx264 @ 0000000000577fc0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 13% 29%  6%  7%  7%  6%  6%  4%
[libx264 @ 0000000000577fc0] i8c dc,h,v,p: 69% 13% 16%  2%
[libx264 @ 0000000000577fc0] Weighted P-Frames: Y:3.3% UV:0.8%
[libx264 @ 0000000000577fc0] ref P L0: 46.0% 11.8% 23.8%  7.9% 10.2%  0.4%  0.0%
[libx264 @ 0000000000577fc0] ref B L0: 79.7% 13.4%  5.2%  1.6%
[libx264 @ 0000000000577fc0] ref B L1: 96.3%  3.7%
[libx264 @ 0000000000577fc0] kb/s:2849.11
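
For anyone who wants to total the status lines above, a small sketch follows. It copies the three 
"frame" lines verbatim and assumes that "size" is the average number of bytes per frame of that 
type; the regular expression and the names are hypothetical, not part of libx264 or FFmpeg.

```python
# Sketch only: total the per-type counts in the libx264 "frame" status lines.
import re

LOG = """\
[libx264 @ 0000000000577fc0] frame I:536   Avg QP:23.92  size:103564
[libx264 @ 0000000000577fc0] frame P:24209 Avg QP:27.01  size: 31261
[libx264 @ 0000000000577fc0] frame B:83229 Avg QP:28.45  size:  9510
"""

FRAME_RE = re.compile(
    r"frame ([IPB]):\s*(\d+)\s+Avg QP:\s*([\d.]+)\s+size:\s*(\d+)"
)

def summarize(log: str) -> dict:
    """Return {type: {count, avg_qp, avg_bytes}} for each 'frame' line."""
    stats = {}
    for kind, count, qp, size in FRAME_RE.findall(log):
        stats[kind] = {
            "count": int(count),
            "avg_qp": float(qp),
            "avg_bytes": int(size),  # assumed: average bytes per frame
        }
    return stats

stats = summarize(LOG)
total_count = sum(s["count"] for s in stats.values())
total_bytes = sum(s["count"] * s["avg_bytes"] for s in stats.values())

for kind, s in stats.items():
    share = 100.0 * s["count"] * s["avg_bytes"] / total_bytes
    print(f"{kind}: {s['count']:6d} reported, {share:5.1f}% of the video bytes")
print("total reported:", total_count)
```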

[note3] -x264-params 
ref=5:deblock=-1,-1:me=umh:subme=7:psy-rd=1.0,0.0:merange=30:trellis=1:cqm=flat:deadzone-inter=21:deadzone-intra=11:chroma-qp-offset=-2:threads=12:lookahead-threads=3:sliced-threads=0:slices=4:nr=0:constrained-intra=0:bframes=5:b-pyramid=2:b-adapt=2:b-bias=0:direct=auto:weightp=2:keyint=240:min-keyint=23:scenecut=40:rc-lookahead=50:bitrate=2850:ratetol=1.0:qcomp=0.60:qpmin=1:qpmax=69:qpstep=4:cplxblur=20.0:qblur=0.5:vbv-maxrate=62500:vbv-bufsize=78125:nal-hrd=vbr:filler=0:ipratio=1.40:aq-mode=1:aq-strength=1.00

[note4] It should be possible to fabricate an IDR NAL in the zero-th AU and thereby avoid 
transcoding, but to date, no one has done it.

--Mark.