[FFmpeg-user] Towards better trims & concatenations
Mark Filipak
markfilipak.imdb at gmail.com
Tue Jan 9 23:16:37 EET 2024
On 1/9/24 07:04, Rob Hallam wrote:
> On Mon, 8 Jan 2024 at 23:02, Mark Filipak <markfilipak.imdb at gmail.com> wrote:
>
>> [explanation snipped]
>> Oh, I think I see your difficulty, Rob.
>
> Thank you for taking the time to write the explanations, they are much
> appreciated.
>
> My difficulty, as you guessed, is I don't know about the internals of
> video containers.
I know DVD VOBs but none of the others.
>
> I thought it might work like being handed pages from a book- annoying
> if they're in the wrong order, but fixable to get the right order
> since there are page numbers. From what you say, it sounds like in
> this analogy not only are the pages (packets) in the wrong order, the
> page numbers (timestamps) are wrong- and not even consistently wrong?
That is an excellent analogy. "Yes" to all you wrote. FFmpeg is not just scrambling the 'page
numbers' (DTSs & PTSs), it's changing the 'page numbers' so that the page marked "23" is not page
23. In the hardbound version of the book (m2ts), page 23 has been marked "54" and in the softbound
version of the book (mp4), page 23 has been marked "3845". I'm not saying that that's what must be
happening to explain what I see, I'm saying that's what _is_ happening.
Now I have to be careful and logical at this point because I'm using ffmpeg to look at what ffmpeg
is doing. There's a danger there. If ffmpeg is in error, as it certainly appears to be, then either
the packet table is built wrongly or the packet table is correct but it's being accessed wrongly. I
have no idea which alternative, build or access, is wrong. I could dump the packets and look at the
binary values and parse the headers to see the actual 'page numbers', but I know how to do that
solely for DVD VOB files. Or, I could somehow dump ffmpeg's packet table and see whether it's
correct. I assume that '-framecrc' is doing that. Based on '-framecrc', it appears the packet table
is being built wrongly.
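
A minimal sketch of turning such a dump into records that can be compared. It assumes the text
comes from ffmpeg's framecrc muxer (selected with '-f framecrc') and that each data line carries
stream index, DTS, PTS, duration, size and checksum, in that order. The names Packet and
parse_framecrc are hypothetical, and SAMPLE is hand-written illustration, not output from the
video discussed here.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    stream: int
    dts: int
    pts: int
    duration: int
    size: int
    crc: str


def parse_framecrc(text: str) -> list[Packet]:
    """Parse framecrc-style text into Packet records, skipping '#' header lines."""
    packets = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 6:
            continue  # not a data line in the layout assumed above
        stream, dts, pts, duration, size = (int(f) for f in fields[:5])
        packets.append(Packet(stream, dts, pts, duration, size, fields[5]))
    return packets


# Hand-written sample in the assumed layout (illustrative values only).
SAMPLE = """\
#tb 0: 1/90000
#tb 1: 1/48000
0,      -3003,          0,     3003,    41234, 0x1a2b3c4d
1,          0,          0,     1024,      512, 0x0badf00d
0,          0,       6006,     3003,     9876, 0x99887766
"""
```

Calling parse_framecrc(SAMPLE) gives one record per packet, in the order the packets were written,
so the 'page numbers' (DTS and PTS) can be read next to each packet's checksum.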
I could reliably use ffmpeg to examine ffmpeg if I could get ffmpeg to stop making changes and to
instead make just the changes I want. For example, when I do a trim-cut, ffmpeg automatically
changes all the 'page numbers'.
If you took a physical book and ripped out pages 1..50, the top page would say "51". That doesn't
happen in ffmpeg. The top page is now different, but it's different in inconsistent ways. The top
page might say "0", or it might say "73688" or it might say "-4967". There doesn't seem to be any
consistency about what the top page's 'page number' is.
If I rip out pages 1..50, I want '-framecrc' to show "51" as the top 'page'. It appears there's no
way to do that. It appears there's no way to prevent ffmpeg making wholesale changes to the packets.
Now, there are the packet CRCs. Looking at them is the reason '-framecrc' exists. Tracking those
packet CRCs before the cut and after the cut tells me that, yes indeed, ffmpeg is changing the page
numbers to wrong numbers.
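
The CRC tracking described above can be sketched as follows. This is an illustration under
stated assumptions, not a definitive tool: packets are (stream, pts, size, crc) tuples, both dumps
use the same stream indices and time base, and a payload is identified by its stream, size and
CRC. The name pts_shifts is hypothetical.

```python
from collections import Counter


def pts_shifts(before, after):
    """Count how often each PTS shift (new minus original) occurs among matched packets."""
    original = {}
    for stream, pts, size, crc in before:
        # If two packets share a payload (e.g. identical black frames), keep the first.
        original.setdefault((stream, size, crc), pts)

    shifts = Counter()
    for stream, pts, size, crc in after:
        key = (stream, size, crc)
        if key in original:
            shifts[pts - original[key]] += 1
    return shifts
```

If the cut had renumbered every 'page' by the same amount, the result would hold a single shift.
More than one distinct shift means the 'page numbers' were changed inconsistently.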
For example, when joining two trims, each over an hour long, I get 6 seconds of the join in which
the MPV player stops-starts-stops-starts-stops-starts-stops-starts-stops-starts -- 5 stutters. During
that time, the 1st non-black frame of the second section flashes 3 times. When I look at those
packets, '-framecrc' shows me why that's happening. The 'page numbers' don't match up.
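
One check that could be run on the dump of a joined file: within each stream, DTS is expected to
keep increasing in the order the packets are stored. A minimal sketch, assuming (stream, dts)
pairs listed in file order; dts_breaks is a hypothetical name.

```python
def dts_breaks(packets):
    """Return (index, stream, previous_dts, dts) wherever DTS fails to increase in a stream."""
    last = {}
    breaks = []
    for index, (stream, dts) in enumerate(packets):
        if stream in last and dts <= last[stream]:
            breaks.append((index, stream, last[stream], dts))
        last[stream] = dts
    return breaks
```

Each reported index is a place where the 'page numbers' step backwards or repeat, which is the
kind of mismatch a player could respond to by stalling or re-showing a frame.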
Now I cut on I-frames, so joining two clips should simply abut #2's beginning I-frame to #1's ending
I-frame. That's not what's happening. What's happening is that parts of #1 are spewed into #2 and
parts of #2 are spewed into #1. The videos are 'mixed' in some strange way. It's hard to figure out
because the 'page numbers' are unreliable, and they come out of order.
Imagine you rip out pages 1..50. You find that page 51 now says "398". In addition, what was page 49
has magically become "399" and is in the wrong section.
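
A sketch of how such stray 'pages' could be found, again by matching CRCs. It assumes packets are
(stream, pts, size, crc) tuples, that the original and the trimmed dump share stream indices, and
that cut_pts is the original PTS of the first packet meant to be kept. The name leftover_packets
is hypothetical.

```python
def leftover_packets(before, after, cut_pts):
    """Return (new_pts, original_pts, crc) for output packets that predate the cut."""
    original_pts = {}
    for stream, pts, size, crc in before:
        original_pts.setdefault((stream, size, crc), pts)

    found = []
    for stream, pts, size, crc in after:
        source = original_pts.get((stream, size, crc))
        if source is not None and source < cut_pts:
            found.append((pts, source, crc))
    return found
```

With the book example above (cut at page 51), a trimmed file whose second packet is the old
page 49, now marked "399", would be reported as (399, 49, <crc>). A hit only shows that the
payload survived the cut; it does not say why.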
>> Trimming errors are wrecking concatenations ...
>> [snip]
>> trimming has to take PTS into account so that the cut happens in the right spot with no leftover
>> packets that shouldn't be there, but that apparently isn't happening and I have the proof.
>
> That certainly sounds consistent with behaviour I saw in the past when
> I tried to re-join trimmed clips.
>
>> To be frank, Rob, if you want to help yourself, you may want to help me. I published my procedure.
>> Duplicate it and apply it to some of the videos you've had problems with. Learn how to use
>> '-framecrc' and '-showinfo'. It will take you awhile, but it will be time well spent. It will
>> demystify a lot for you. I'll be here to help if you like.
>
> Thank you. The videos I've worked with in the past were simple
> h264-in-mkv/mp4, but at the time having to do a full re-encode was
> irksome. If I get a chance to replicate your procedure I'll post the
> results.
When I saw what was happening I stopped and tried to partition the problem into cut-problems and
join-problems. In that process I found that there were cut-problems at the beginning of the process,
too, and that concentrating on those beginning cut-problems was 'simpler' -- I'm cutting out the
"Criterion" animation and the succeeding "Janus Films" logo so that my final concatenations will
begin with just the "Svensk Filmindustri" animation: the original Swedish starting point.
There definitely are cut-problems. In some cases, the 'page numbers' are wrong but the 'page
sequences' are correct. In others, both 'page numbers' and 'page sequences' are wrong. In others,
'pages' that should have been ripped out ("49") wind up in the save-section (following "50" but
still marked "49"), and in others, marked with a different 'page number' altogether. Tracking the
CRCs is the way to discover what's really happening. If you look at my 'recipe' script you will see
all those cases. Jim DeLaHunt wants to partition the results and come up with simple cases that
'capture' each case. I don't know how to do that because I don't know the internals of ffmpeg.
Now, are the join-problems caused solely by cut-problems at the beginning of #2 or are there really
separate, independent join-problems? I don't know. I haven't gotten that far. But considering my
I-frame-abuts-to-I-frame scenario, it appears there are separate, independent join-problems. I just
can't be sure.
Now, what if you ripped out pages 1..50 and looked at what was left? What if you discovered that
what was left is pages 23..81 from _another_ book jammed in front of your page 50? That's happening,
too. Audio packets that follow 'page 50' are winding up in a 58-packet run ahead of 'page 50' after
the cut. It's stuff like that that causes the MPV player to _not_ start at "00:00:00.000".
NONE of the cuts I've done on this video have started at "00:00:00.000" in MPV.
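
To see where each stream of a cut actually starts, the earliest PTS per stream can be converted
to seconds. A minimal sketch, assuming (stream, pts) pairs and a mapping from stream index to
time base (as given by the '#tb' header lines of a framecrc dump); stream_start_times is a
hypothetical name.

```python
from fractions import Fraction


def stream_start_times(packets, time_bases):
    """Return the earliest presentation time, in seconds, for each stream."""
    starts = {}
    for stream, pts in packets:
        seconds = pts * time_bases[stream]
        if stream not in starts or seconds < starts[stream]:
            starts[stream] = seconds
    return starts
```

If the streams do not all start at 0, or do not start together, that would be consistent with a
player not showing "00:00:00.000" at the first frame.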