[FFmpeg-devel] [PATCH] libavformat/mov.c: use calculated dts offset when seeking in streams

Wed Mar 7 13:27:45 EET 2018

On Mon, Mar 05, 2018 at 03:53:04PM -0800, Sasi Inguva wrote:
> This patch seems to be doing the wrong thing and breaking seek tests for us.
> 
> As far as I understand, seeking for most containers is based on the
> "decoding timestamp", unless the AVFMT_SEEK_TO_PTS flag is set for the
> container, which it is not for most containers, including MOV. So if PTS
> and DTS are like this,

I think most seeking in libavformat uses whatever was convenient for the
implementation of each demuxer. Some are much more carefully implemented
than others. I am not so sure it is consistently DTS when AVFMT_SEEK_TO_PTS
is not set.

> Pts  Dts
>  0  -2   : frame0
>  1  -1   : frame1
>  2   0   : frame2
>  3   1   : frame3
> ...
> Seeking to timestamp "0" without any flags, I should expect frame2. But
> instead this patch will give me frame0. The patch's intention seems to be
> seeking based on PTS (subtracting sc->time_offset is essentially a
> mapping from PTS to DTS).

IMO, seeking per PTS is more useful to end users: the user gets what they
asked for, a frame that can be displayed at the requested time.
The ultimate goal should be to have proper frame-exact seeking, or as close to
it as is practically implementable.

DTS is sometimes easier to implement on our side.
But with seeking per DTS, getting a frame to display at a specific timestamp
is a heuristic gamble. With seek per DTS it is possible to never get a frame
displayable at the requested time.

For formats with a full index, like mov/mp4, seeking per pts should IMHO be
preferred over dts.
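
To make the difference concrete, here is a minimal sketch in C of the two
lookups applied to the Pts/Dts table quoted above. It is not code from mov.c:
IndexEntry, seek_by_dts() and seek_by_pts() are hypothetical names, and the
sketch assumes that every entry is a keyframe and that the index is stored
in coded order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry, not the real AVIndexEntry layout. */
typedef struct {
    int64_t pts;
    int64_t dts;
} IndexEntry;

/* Index of the last entry whose DTS is <= ts, or -1 if there is none.
 * Assumes the entries are in coded order, so DTS is non-decreasing. */
static int seek_by_dts(const IndexEntry *idx, size_t n, int64_t ts)
{
    int found = -1;

    assert(idx != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (idx[i].dts <= ts)
            found = (int)i;
    return found;
}

/* Index of the entry with the greatest PTS that is <= ts, or -1.
 * PTS need not be monotonic in coded order, so every entry is checked. */
static int seek_by_pts(const IndexEntry *idx, size_t n, int64_t ts)
{
    int found = -1;

    assert(idx != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (idx[i].pts <= ts && (found < 0 || idx[i].pts > idx[found].pts))
            found = (int)i;
    return found;
}
```

With the four entries from the quoted table, a seek to 0 selects frame2 in
the DTS variant and frame0 in the PTS variant. Only frame0 is displayable at
the requested time.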

There are more corner cases that require mentioning. For example, there
can be a B frame after an I frame with a pts prior to the I frame,
as in

coded order:
Pts Dts
 1  -1   I frame
 0   0   B frame
 3   1   P frame
 2   2   B frame

It is desirable and valid to seek to the I frame at pts=1 for a seek targeting
the B frame at pts=0 if and only if this B frame can be displayed and
does not depend on a frame of the prior GOP (streams generally have flags
in their headers that make this specific case detectable).
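
A rough sketch of that rule in C, under stated assumptions: Frame and
keyframe_for_pts() are hypothetical names, and the closed_gop field stands in
for whatever header flag a given stream format provides. Frames are assumed
to be stored in coded order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame record, in coded order. */
typedef struct {
    int64_t pts;
    int64_t dts;
    bool key;        /* I frame that starts a GOP */
    bool closed_gop; /* on key frames: leading B frames need no prior GOP */
} Frame;

/* Return the coded-order index of the keyframe to start decoding from so
 * that the frame shown at target_pts can be displayed, or -1 if none. */
static int keyframe_for_pts(const Frame *f, size_t n, int64_t target_pts)
{
    int target = -1;
    int k;

    assert(f != NULL || n == 0);

    /* The frame to display: greatest pts that is <= target_pts. */
    for (size_t i = 0; i < n; i++)
        if (f[i].pts <= target_pts &&
            (target < 0 || f[i].pts > f[target].pts))
            target = (int)i;
    if (target < 0)
        return -1;

    /* Nearest keyframe at or before that frame in coded order. */
    for (k = target; k >= 0 && !f[k].key; k--)
        ;
    if (k < 0)
        return -1;

    /* A leading frame is displayed before its keyframe. It is only usable
     * from this keyframe if the GOP is closed; otherwise fall back to the
     * previous keyframe. */
    if (f[target].pts < f[k].pts && !f[k].closed_gop)
        for (k--; k >= 0 && !f[k].key; k--)
            ;
    return k;
}
```

For the four frames listed above, a seek to pts=0 returns the I frame at
index 0 when closed_gop is set on it, and -1 when it is not, because there is
no earlier keyframe to fall back to in this short example.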

Furthermore, if the goal is to seek to pts=5, there are keyframes at pts=5
and at pts=4, and there is another stream that can only be decoded at pts=5
if demuxing starts at pts=4, then the demuxer can seek to pts=4 instead of 5.
This is especially the case for containers with subtitles, where their
display may require positioning at an earlier place and then potentially
discarding packets in streams that are not needed.
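
As an illustration only, the choice of seek point across streams could be
sketched like this in C. StreamStarts, latest_start() and common_seek_point()
are hypothetical names; the sketch assumes all streams share one timebase and
works on timestamps, whereas a real demuxer would more likely compare file
positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream list of timestamps at which decoding can start,
 * sorted in ascending order. */
typedef struct {
    const int64_t *start_pts;
    size_t n;
} StreamStarts;

/* Latest start point of one stream that is <= target. */
static bool latest_start(const StreamStarts *s, int64_t target, int64_t *out)
{
    bool found = false;

    for (size_t i = 0; i < s->n && s->start_pts[i] <= target; i++) {
        *out = s->start_pts[i];
        found = true;
    }
    return found;
}

/* Earliest of the per-stream start points, so that every stream can be
 * presented at target. Returns false if some stream has no start point at
 * or before target; *out is left untouched in that case. */
static bool common_seek_point(const StreamStarts *streams, size_t nb_streams,
                              int64_t target, int64_t *out)
{
    int64_t best = 0;

    assert(streams != NULL && nb_streams > 0);
    for (size_t i = 0; i < nb_streams; i++) {
        int64_t start;

        if (!latest_start(&streams[i], target, &start))
            return false;
        if (i == 0 || start < best)
            best = start;
    }
    *out = best;
    return true;
}
```

With a video stream startable at pts 4 and 5 and a subtitle stream startable
only at pts 4, a seek to pts=5 yields 4. With the video stream alone it
yields 5. Packets from streams that do not need the earlier position would
then be discarded until their own start point.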

Also, avformat_seek_file() allows much more flexibility for the user to specify
where and how to seek. And whereas the av_seek_frame() API does not specify
which timestamp is used, avformat_seek_file() does specify
"can be presented successfully", and thus cannot be just DTS.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

It is what and why we do it that matters, not just one of them.