[FFmpeg-user] Illustration review, Streams and GOP Frame Reordering [was: a couple of things to look at]
Mark Filipak
markfilipak.imdb at gmail.com
Mon Mar 4 02:11:11 EET 2024
On 03/03/2024 18.33, Jim DeLaHunt wrote:
> Mark:
>
> On 2024-03-02 19:51, Mark Filipak wrote:
>> I have a couple of things to look at.
>>
>> https://markfilipak.github.io/Video-Object-Notation/Streams.html
>> https://markfilipak.github.io/Video-Object-Notation/GOP%20%26%20Frame%20Reordering.html
>>
>> Comments are welcome. Please be brutal. 'Streams' is crucial.
>
> Good work!
Thank you.
> Regarding the Streams illustration <https://markfilipak.github.io/Video-Object-Notation/Streams.html>:
>
> The macroblock to slice to picdata transition is clear. Showing 45 macroblocks in a horizontal slice
> works. Good work.
>
>... It is hard to count out 45 from single-digit numbers. 00..44 would be much clearer.
I agree, and I would have "0..44" if I could. If I used 2-digit numbers, I'd have to almost double
the table width. The issue is that Firefox doesn't support 'font-size' style, so making the font
smaller to fit can't be done.
> The complete list of 0..29 slices is visually overwhelming, and not necessary. I think you could
> keep slices 0..2, elide slices 3..27 with a vertical ellipsis, and keep slices 28..29. That would
> get the slice structure across.
I'm going for visual impact, too. Do you find what I have confusing?
> The slice structure lacks a comment with size, of the sort you included for macroblock and picdata.
> The full slice structure does not leave any room for such a comment.
Well, I felt that with all 30 slices and all 1350 macroblocks explicitly shown, comments were
superfluous. They will get looked at one time, then ignored for the rest of time.
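For anyone who wants to check the counts, here is a minimal sketch of the arithmetic. It assumes a 720x480 picture, 16x16 macroblocks, and one slice per macroblock row; only the 45, 30 and 1350 figures come from the illustration, and the function name is made up for this example.

```python
MACROBLOCK_SIZE = 16  # assumed luma samples per macroblock side


def macroblock_grid(width, height, mb=MACROBLOCK_SIZE):
    """Return (macroblocks per slice, slices, total macroblocks).

    Assumes one slice per macroblock row, as drawn in the illustration.
    """
    per_slice = -(-width // mb)   # ceiling division: macroblocks across one row
    slices = -(-height // mb)     # ceiling division: macroblock rows in the picture
    return per_slice, slices, per_slice * slices


# Assumed 720x480 picture: 45 macroblocks per slice, 30 slices, 1350 in total.
print(macroblock_grid(720, 480))
```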
> Regarding the GOP & Frame Reordering illustration,
> <https://markfilipak.github.io/Video-Object-Notation/GOP%20%26%20Frame%20Reordering.html>:
>
> Time is plastic in illustration space also. You have term definitions which happen after the first
> use of those terms. It would be easier to follow if the term definitions could come at first use.
>
> The opening text, "an I-frame followed by P-frames and optional B-frames", could be improved by
> adding term definitions. e.g. "an I-frame (complete unto itself, sometimes called keyframe) followed
> by P-frames (predictive based on differences with the preceding I-frame) and optional B-frames
> (bipredictive based on differences with the preceding P-frame and I-frame)".
Thanks, Jim. That's your style.
> The first rectangle, GOP specimen, gives a particular frame order. Which order is this? Is this the
> order of frames in the incoming data stream, before reordering? That specimen seems to be in PTS
> order. Is this necessary, or coincidental?
Yes, frames in the stream are in PTS order.
> What reordering happens in the first step? Is it reordering from incoming stream order to DTS order?
Yes.
> I don't get how the conveyor belt metaphor and illustrations add value.
They can easily be visualized and they are memorable.
> Then show arrows from that sequence down to the same frames, in PTS order.
>
> It is not clear to me why the final two B-frames have later DTSs than the following I-frame, but
> earlier PTSs. Why would these B-frames not be relative to the first I-frame?
They are between the last P-frame and the next I-frame of the next GOP. They have no relation to the
I-frame back at the beginning of their own GOP other than through the P-frame.
> If they are relative to the second I-frame, why would that I-frame not have an earlier DTS?
It does. When reordered, the next GOP's I-frame is decoded before the previous GOP's B-frames. You
see that every time in every video that has B-frames.
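The reordering can be sketched in a few lines. This is a simplified model, not taken from the specification text: it assumes each B-frame depends on the nearest I- or P-frame on either side of it, so it cannot be decoded until the following anchor has been decoded. The frame labels and the function name are hypothetical.

```python
def presentation_to_decode_order(frames):
    """Reorder frame labels from presentation (PTS) order to decode (DTS) order.

    Each label starts with its frame type: 'I', 'P' or 'B'.
    B-frames are held back until the next anchor (I or P) has been emitted.
    """
    decode_order = []
    pending_b = []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)         # wait for the following anchor
        else:
            decode_order.append(frame)      # the anchor is decoded first
            decode_order.extend(pending_b)  # then the B-frames that needed it
            pending_b = []
    decode_order.extend(pending_b)          # trailing B-frames with no later anchor
    return decode_order


# One GOP in presentation order, followed by the I-frame of the next GOP.
gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6", "B7", "B8", "I9"]
print(presentation_to_decode_order(gop))
```

In the result, I9 lands ahead of B7 and B8, which is the case asked about above: the next GOP's I-frame is decoded before the previous GOP's final B-frames.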
> Are the B-frames relative to the final P-frame before them?
To my understanding, yes. That's what the page is about.
> What is going on visually that the encoder would choose to sequence things this way?
To my understanding, it's complying with the specifications: MPEG ISO & ITU.
> It is great to have a reference to the specification which you are illustrating, "ITU-T H.262
> (02/2012)". It would be even better to have that at the beginning. The illustration might explain
> its goal, e.g. "This illustrates the Group of Pictures and frame reordering operations as described
> in ITU-T H.262 (02/2012)."
It's a matter of writing style. I prefer to not justify something until after I've said it, if at all.
> And, these diagrams are amazing works of character graphics. They would be even more amazing as
> works of vector graphics. But drawing them in vector graphics would require a different skill-set.
They can be swipe-copied and pasted as plain text. You can't do that with either tables or vector
graphics. I consider that important.
Thanks for your thoughts, Jim
--Mark.