[Ffmpeg-devel] [PATCH] increase max numbers of B frames
Tue Feb 21 22:07:08 CET 2006
On Tue, 21 Feb 2006, Erik Slagter wrote:
> On Tue, 2006-02-21 at 13:44 +0100, skal wrote:
>>> PBBBPBBBP: again, each frame is bigger than it would be in 2 consecutive,
>>> and now there's a decent chance that 67%->75% B-frames isn't enough to
>>> pay off. Though it's a bit more useful than in mpeg4, due to B-pyramid
>>> (which I won't detail here) and miscellaneous other improvements in the
>> Maybe this point is worth some details: h264 permits using the middle
>> B-slice as a reference, making the reference anchors not so far from
>> the remaining inbetween b-slices (non-refs). So, you're back to
>> deciding PbBbP over PbPbP for the refs, which is quite similar to
>> the PBP vs. PPP decision, as far as refs placement is concerned.
>> Well, this is quite an uncharted (and slippery) land, but some
>> are advocating use of GOPs as big as 32 (!):
> A very crude and non-representative test ;-) reveals that
> - without b_pyramid: 3 b frames is optimal (1784 kb/s)
> - with b_pyramid: 5 b frames is optimal (1636 kb/s)
> Both show no significant improvement with larger amounts of B frames
> inserted. Nice to know :-)
> For this test I took a short clip, with forced B frames, otherwise the
> encoder would not select more than 1 B frame at a time...
Do not compare just bitrate. Remember that one of the reasons B-frames
improve compression is that they allow asymmetric allocation of quality,
whereby the B-frames are quantized more and the P-frames less. But if you
encode with constant QP, then it quantizes B-frames more and doesn't
change the P-frames to match, thus decreasing overall quality a little.
You could encode with a different constant QP, but QP's granularity is too
coarse for such a comparison.
So run 2-pass encodes with the same target bitrate, and compare PSNR or
another quality metric. (That is easier than encoding with a target PSNR
and comparing bitrate.) Keeping in mind that PSNR sucks, it's still better
than ignoring quality entirely.
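
For reference, PSNR over 8-bit samples can be computed as in the sketch
below. This is only an illustration of the metric: the function name and
the flat-list frame representation are assumptions made here, and a real
comparison would average over all decoded frames of both encodes.

```python
import math

def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of sample values.

    `reference` is the source frame and `decoded` the frame after
    encode + decode, both flattened to lists of 8-bit samples.
    """
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error between source and decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```

With both encodes at the same target bitrate, the one with the higher
average PSNR is the better one by this (imperfect) measure.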
> What is the suggested (default) gop size for h264 anyway? I was under
> the impression something like 250 was sort of optimal for (old style)
Confusing choice of terms. Skal's 32 was the repetition period, i.e.
... though I'm sure that in practice you wouldn't use a constant period,
just like a constant number of conventional B-frames isn't optimal. (Ok,
so some sources really do suggest using 32-frame GOPs with no P-frames.
They're just wrong.)
There is no optimal GOP size in h264. Bigger GOP = better compression, up
until 1 GOP = 1 movie scene, at which point increasing the allowed GOP
size won't affect the encode at all, because the codec will choose to put
an I-frame at the scenecut no matter how big the GOP is allowed to be.
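
A toy model of that behaviour, as a sketch only: it assumes the scenecut
positions are known in advance and that an I-frame is placed at frame 0,
at every scenecut, and whenever the maximum GOP length is reached. The
function name and parameters are hypothetical; a real encoder's scenecut
decision is more involved.

```python
def keyframe_positions(n_frames, scene_cuts, max_gop):
    """Return the frame indices that get an I-frame in the toy model."""
    cuts = set(scene_cuts)
    keyframes = []
    last = None  # index of the most recent I-frame
    for i in range(n_frames):
        if last is None or i in cuts or i - last >= max_gop:
            keyframes.append(i)
            last = i
    return keyframes

# 300 frames with scenecuts at frames 100 and 220 (longest scene: 120 frames).
for max_gop in (50, 120, 250, 1000):
    print(max_gop, keyframe_positions(300, [100, 220], max_gop))
```

Once the allowed GOP size reaches the longest scene, the list of
I-frames stops changing.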
The only tradeoff is compression vs seeking and error resilience. There
are other ways to deal with error resilience that are much better than
extra I-frames, so I consider only compression vs seeking. The default is
GOP=250 because 10 seconds (at 25 fps) is a reasonable seeking granularity,
and it's not worth the interface complexity to make it depend on framerate.
Keep in mind that this is only a worst case; if scenes are shorter than 10
seconds, you'll still be able to seek to each scenecut.
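
The arithmetic behind the seeking figure is just the maximum GOP length
divided by the framerate. The sketch below assumes the 10 seconds refers
to 25 fps material, which is an inference from 250 / 25 and not something
stated above; the function name is made up for illustration.

```python
def worst_case_seek_seconds(max_gop_frames, fps):
    """Longest possible gap between I-frames, in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return max_gop_frames / fps

# The same GOP=250 limit at a few common framerates.
for fps in (24, 25, 30, 60):
    print(f"{fps} fps: {worst_case_seek_seconds(250, fps):.2f} s")
```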
This is not the same as in mpeg*, where there is an additional factor
arguing for smaller GOPs: DCT drift. The mpeg4 standard recommends no more
than about 100 P-frames (actually they give a specific number, I don't
know where they got it from), and mpeg2 recommends 12-18 frames, or 4-6
P-frames. As long as you use the same implementation of DCT in both the
encoder and decoder, the length doesn't matter. But if they differ (and
many mpeg2 codecs are pretty sloppy about this), then you get accumulated
error. (For some reason, the most common symptom I've seen of this is
horizontal green and purple stripes.)
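
A deliberately simplified sketch of how such error accumulates. It tracks
a single sample value across a run of P-frames and stands in for the
mismatched inverse transforms with two different rounding rules. That
substitution is an assumption made for illustration; real drift comes
from differences in the arithmetic precision of the implementations, and
all names here are hypothetical.

```python
import math

def round_half_up(x):
    return math.floor(x + 0.5)

def truncate(x):
    return math.trunc(x)

def reconstruct(residuals, round_fn, start=128):
    """Accumulate P-frame residuals onto an I-frame sample value."""
    value = start
    history = []
    for r in residuals:
        # Prediction from the previous frame plus the decoded residual.
        value = value + round_fn(r)
        history.append(value)
    return history

def drift(residuals):
    """Per-frame difference between encoder and decoder reconstructions."""
    enc = reconstruct(residuals, round_half_up)  # encoder's implementation
    dec = reconstruct(residuals, truncate)       # decoder's implementation
    return [e - d for e, d in zip(enc, dec)]

print(drift([0.5] * 4))  # the error grows with every P-frame
```

If both sides use the same rounding function the drift is zero for any
number of P-frames, and an I-frame resets it in any case.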