[Libav-user] Why does av_seek_frame() not seek to a keyframe?

Thu Jul 19 10:54:35 CEST 2012

On 07/19/2012 09:25 AM, Carl Eugen Hoyos wrote:
> Hendrik Leppkes <h.leppkes at ...> writes:
>
>> On Thu, Jul 19, 2012 at 8:31 AM, Carl Eugen Hoyos wrote:
>> Isn't this necessary for some codecs / at least for H264?
>> Wouldn't the API do exactly the same?
>>
>> Sure, if you end up seeking to a non-keyframe, it should not
>> decode garbage frames.
>> The point here is that we would want to
>> actually end up on a proper keyframe.
> I thought that valid H264 streams do not necessarily contain
> key-frames. (FFmpeg is supposed to seek correctly in such
> files.)
H264 streams are not required to have IDR frames, but they are required
to have recovery points. So the equivalent operation would be to seek
to the latest recovery point where recovery pos + recovery_frames <=
seek pos. You are guaranteed to have a complete frame after decoding
recovery_frames frames from that point (slight oversimplification).
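
To make the condition concrete, here is a minimal sketch of picking that
recovery point. It assumes positions are counted in frames and that the
recovery points are already known and sorted. The RecoveryPoint struct
and pick_recovery_point() are hypothetical names for illustration, not
libav API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one recovery point; not a libav type. */
typedef struct RecoveryPoint {
    int64_t pos;             /* frame number of the recovery point */
    int     recovery_frames; /* frames to decode before output is complete */
} RecoveryPoint;

/*
 * Return the index of the latest recovery point that satisfies
 * pos + recovery_frames <= seek_pos, or -1 if there is none.
 * Assumes pts[] is sorted by ascending pos.
 */
static int pick_recovery_point(const RecoveryPoint *pts, size_t n,
                               int64_t seek_pos)
{
    int best = -1;

    assert(pts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pts[i].pos > seek_pos)
            break; /* sorted input: nothing later can qualify */
        if (pts[i].pos + pts[i].recovery_frames <= seek_pos)
            best = (int)i;
    }
    return best;
}
```

Note that a recovery point located before the seek position can still be
rejected if its recovery_frames count would carry it past the seek
position; the search then falls back to an earlier one.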

>>> But the idea is to ensure that you can decode as of time X,
>>> which means you need to seek to the last keyframe *before*
>>> time X, which is something that is not supported currently.
>> I see.
>> (But am I wrong to assume that this is not generally possible,
>> assuming large GOPs?)
>>
>> Even a large GOP has a start somewhere. Granted seeking might
>> take a bit longer because reading data backwards to find the
>> keyframe is not as ideal as going forward, but it should be
>> possible.
> So if the file is ~2GB and (as a user) I seek approximately to
> the middle of the video and decide to seek back then, the
> video should be decoded from the beginning to find the keyframe
> before the position I want to seek to?
> (I am just trying to find out if I am correct in my belief that
> it is not generally possible / useful to seek to the latest
> keyframe before the requested position, but that the only
> realistic approach is to seek to the nearest keyframe at or later
> than requested.)
I almost agree with Carl here, but I think there are alternative
conclusions. The "general" case includes worst case scenarios that
would result in undesirable behavior. How far before the seek point you
search for a keyframe should be an application level decision. So one
alternative is that the application seeks to a position before the
desired position and reads until it reaches the desired position. But
in the "typical" case, where a keyframe lies well inside that window,
this can be inefficient.
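
A rough sketch of that application-level approach, using an in-memory
array as a stand-in for the demuxed stream so it runs without libav. The
Frame struct and app_seek_read_forward() are made-up names, and the
"seek" is just a linear scan to the outer edge of the window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a demuxed frame; not a libav type. */
typedef struct Frame {
    int64_t pts; /* presentation timestamp */
    int     key; /* non-zero if this frame is a keyframe */
} Frame;

/*
 * Jump to the outer edge of the window (target - window), then read
 * forward until target is reached. Frames read before the first
 * keyframe are discarded; frames from that keyframe on are counted in
 * *decoded. Returns the number of frames read, or -1 if no keyframe
 * was found inside the window at or before target.
 * Assumes f[] is sorted by ascending pts.
 */
static int app_seek_read_forward(const Frame *f, size_t n, int64_t target,
                                 int64_t window, int *decoded)
{
    size_t i = 0;
    int read = 0, have_key = 0;

    assert(window >= 0 && decoded != NULL);
    *decoded = 0;

    /* Coarse seek: first frame at or after the window's outer limit. */
    while (i < n && f[i].pts < target - window)
        i++;

    /* Read forward up to and including the target position. */
    for (; i < n && f[i].pts <= target; i++) {
        read++;
        if (f[i].key)
            have_key = 1;
        if (have_key)
            (*decoded)++;
    }
    return have_key ? read : -1;
}
```

With frames at pts 0..9, keyframes at 0, 4 and 8, a target of 9 and a
window of 6, this reads 7 frames and decodes 6 of them, even though the
keyframe at 8 is right next to the target.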

Another alternative is for libav to supply a helper function that does
this for you. The API for such a helper would have a parameter to set
the seek window. The advantage of libav providing a helper function for
this is that libav, in many cases, has more information than the app in
the form of an index. There may be keyframes closer to the desired seek
position than the window limit. libav could seek directly to the
keyframe closest to the desired position through the index. The
application cannot do this and would always have to seek first to the
outer limit of its desired window and then read from that point onward.
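
What such a helper might do internally can be sketched as follows. This
is not an existing libav function: find_keyframe_in_window() and the
plain array of keyframe timestamps are assumptions standing in for
whatever index the demuxer has built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a sorted array of keyframe timestamps
 * (a stand-in for a demuxer index), return the index of the latest
 * keyframe at or before target that is no further back than window.
 * Returns -1 if no keyframe lies in [target - window, target].
 */
static int find_keyframe_in_window(const int64_t *key_pts, size_t n,
                                   int64_t target, int64_t window)
{
    size_t lo = 0, hi = n; /* binary search over [lo, hi) */

    assert(window >= 0);
    assert(key_pts != NULL || n == 0);

    /* Find the first entry that is strictly greater than target. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (key_pts[mid] <= target)
            lo = mid + 1;
        else
            hi = mid;
    }

    if (lo == 0)
        return -1; /* every keyframe is after target */
    if (key_pts[lo - 1] < target - window)
        return -1; /* closest keyframe is outside the seek window */
    return (int)(lo - 1);
}
```

With keyframes at 0, 4 and 8, a target of 9 and a window of 6, this
returns the entry for pts 8 directly, so decoding could start there
instead of at the outer limit of the window.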

John