[FFmpeg-devel] [PATCH] libvpx: alt reference frame / lag

John Koleszar jkoleszar
Thu Jun 17 17:42:28 CEST 2010


On Thu, Jun 17, 2010 at 12:24 AM, Reimar Döffinger
<Reimar.Doeffinger at gmx.de> wrote:
> On Wed, Jun 16, 2010 at 10:30:50PM -0400, John Koleszar wrote:
>> I still think this multiple packet approach is very much KISS, and
>> it's not just libvpx that it's simple for. The other part of that rule
>> is "make it as simple as possible, but no simpler."
>
> No it really isn't. It introduces data packets that produce no data.
> The fact that no other format has this means this needs extensive
> testing and honestly won't work for quite a few applications
> (won't work with MEncoder, won't work with MPlayer if you try to
> override -fps to some fixed value, probably won't work with ffmpeg
> if you try to use -fps, I doubt it will work with any Video for Windows
> application, for which VirtualDub is an example, and I think that
> is not possible to fix).
>

Yes, it doesn't work if you only support fixed frame rates. But VFR is
a valuable feature, and it's required to be supported in WebM
independent of invisible ARFs, so this isn't a new requirement.

>> >> There are existing
>> >> applications that very much care about the contents of each reference
>> >> buffer and what's in each packet, this isn't a hypothetical like
>> >> decoding a single slice.
>> >
>> > Which applications exactly? What exactly are they doing? And why exactly
>> > do they absolutely need to have things in a separate packet?
>>
>> I'm not going to name names, but I'm talking specifically about video
>> conferencing applications. I should have been more precise here --
>> these applications aren't using invisible frames today (though they do
>> use the alt-ref buffer) but I called them out because they're the type
>> of applications that are *very* concerned with what's going on in the
>> encoder, they will want to use invisible frames in the future, and
>> they'll need access to the frames in the most fine-grained way
>> possible.
>
> Do you have any argument except repeating that they will need that?
> I can only assume they will want to drop frames behind anyone's back.
> In that case the questions are
> 1) Do they really need to be able to drop the frame after ARF?
> 2) Do they really have to be able to do that without parsing the ARF?
> 3) Do they really need to be able to do that before having received
>    the full data of the frame following the ARF?
>
> Even if all of them are true, it would be possible to append extra data
> to help this case after each frame that would allow splitting of ARF
> data, which should be backwards-compatible with all existing decoders
> and even for any other application it should only impede their ability
> to drop the frame right after an ARF.
>

Here's another example: consider a VNC-type application. You could use
the ARF buffer to keep a low-quality version of the parts of the
screen that are currently hidden. You might go a long time without
updating any visible part of the screen while still updating the
hidden parts. This could give a better experience when dragging
windows around. In this case there would be many invisible packets,
and no visible packet to pack them with.

>> I've been through a lot of the advantages of keeping the data
>> separate, but it mostly boils down to staying out of the way of
>> applications that know what they're doing
>
> I think you are arguing that we should make things more complex for
> everyone for the sake of simplifying things for a use-case that
> currently does not exist and you don't specify well enough so we
> could suggest an alternative.

I don't want to be in the business of saying what people can and can't
do with the codec, so I'm looking to provide flexibility. Yes, ARFs
are a new feature that other codecs don't have. No, I don't know all
the ways that people will find to use them yet.


