[Libav-user] security camera app: bookmark+seek mp4 by wall-clock? low latency? visual timestamps?
Camera Man
i.like.privacy.too at gmail.com
Fri Aug 26 09:04:18 CEST 2011
Dear List,
I am developing a "security monitoring" type of application, that takes
input from multiple IP cameras (a mix of Sanyo, Axis and Arecont, each
covering a different area, with possibly different resolutions and
framerates; all accessible through rtsp). The application includes a
recording part, which reads RTSP packets and writes them to a disk file;
and a playback part which reads those files and plays them back. I also
want to show the files while recording with minimum latency.
One of the requirements for the playback application is to show
frame-synchronized views from different cameras (at 25fps, this requires
40ms precision). The recorded files might need to be played on an
independent system (using e.g. vlc or ffplay), so I would like to add a
visual timestamp to the file, either onto the image, or as subtitles.
Some of the cameras signal that they need 4 reference frames, although
the output stream consists entirely of I and P frames that never
reference more than the previous reference frame (that is, the streams
ARE low delay streams, despite the SPS/PPS flags).
Now, for the questions, with some discussion below.
(a) is there a general way to bookmark frames while writing a file in
such a way that I can seek to them directly when playing, without
searching? pts is NOT the answer, as explained below.
(b) is there a general way to encode *wall-clock* time into a file I
write? e.g. I want the "pts" (or equivalent) of the first frame of the
file to be 2011-08-26 02:13:57.917 (millisec resolution sufficient - 48
bits more than enough precision).
(c) is there a way to force "low delay" handling of a stream despite its
SPS/PPS description? for some cameras, video decoding lags 4 frames
behind packet arrival, which at 5fps is 800ms, almost 1 second of
delay. (one suggestion given below, looking for more options)
(d) is there a way, other than subtitles, to add a visual timestamp to
the file while writing it, without decoding+overlaying+reencoding?
(e) is there a way to tell, without decoding the video stream, that a
received packet starts a new non-key frame?
Discussion:
for (a) (bookmarking), the solution that I am using so far is:
when recording, after I receive a packet from the RTSP stream, I note
the exact (ntp synchronized) time, and the exact file offset using
avio_tell(), and write them to a database, together with the
AV_PKT_FLAG_KEY of the packet and other data. Every camera has its own
output file.
when playing, to show a specific time, I seek independently in each
file to the nearest preceding key frame using
av_seek_frame(x,y,offset,AVSEEK_FLAG_BYTE), and run the following logic:
do {
    if (av_read_frame(...) < 0)
        break;  /* end of file or read error */
    avcodec_decode_video2(...);
} while (avio_tell(...) < file_offset_at_time_wanted);
it mostly works well for h264 "container" files (not really a
container, as it has no header, footer or structure beyond the packets).
However, if I try to do that in any structured file like avi, mp4 or
3gp, seeking by byte location does not seem to work properly (read_frame
and/or decode_video2 fail).
I'm looking for a solution that would work equally well for mp4 and
avi files. It's possible that one does not exist -- for those formats
that have a pts/dts, perhaps it is possible to use the pts/dts as index?
(raw h264 files don't have pts or dts info at all)
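The database lookup behind this scheme can be sketched as follows. This is only an illustration under assumptions: the Bookmark record, its field names and the plan_seek() helper are hypothetical (they are not part of libav), and the records are assumed to be sorted by arrival time. Given a wanted wall-clock time, it returns the byte offset of the nearest preceding key frame (to pass to av_seek_frame) and the offset at which the read/decode loop should stop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index record, one per packet written to the file. */
typedef struct {
    int64_t wall_ms;  /* NTP-synchronized arrival time, ms since the Unix epoch */
    int64_t offset;   /* file offset noted with avio_tell() for this packet */
    int     is_key;   /* nonzero if the packet had AV_PKT_FLAG_KEY set */
} Bookmark;

/* Index of the last record with wall_ms <= t, or -1 if t precedes all
 * records. The records must be sorted by wall_ms in ascending order. */
ptrdiff_t last_at_or_before(const Bookmark *idx, size_t n, int64_t t)
{
    size_t lo = 0, hi = n;  /* idx[0..lo) are <= t, idx[hi..n) are > t */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx[mid].wall_ms <= t)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (ptrdiff_t)lo - 1;
}

/* For wanted time t, set *seek_offset to the offset of the nearest
 * preceding key frame and *stop_offset to the offset of the packet to
 * show. Returns 0 on success, -1 if no key frame exists at or before t. */
int plan_seek(const Bookmark *idx, size_t n, int64_t t,
              int64_t *seek_offset, int64_t *stop_offset)
{
    assert(idx != NULL && seek_offset != NULL && stop_offset != NULL);
    ptrdiff_t target = last_at_or_before(idx, n, t);
    for (ptrdiff_t k = target; k >= 0; k--) {
        if (idx[k].is_key) {
            *seek_offset = idx[k].offset;
            *stop_offset = idx[target].offset;
            return 0;
        }
    }
    return -1;
}
```

The backward scan for the key frame is linear, which is fine when key frames are a few seconds apart; a separate key-frame-only index would avoid it.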
for (b) (wall clock), assume an mp4 or avi file; if I don't start
pts/dts at 0, I get a delay at the beginning of the file (proportional
to first frame's pts/dts); I suspect there is a way to mark a file
"starting at a late pts", but I must have missed it in the docs? If I
can do that, I can just av_seek_frame() by pts, relying on mp4/avi's
frame index instead of my own. can I do that?
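One possible workaround, which is an assumption here and not something the container is known to support directly, is to keep pts starting at 0 and store the wall-clock time of the first frame outside the file (for example in the same database). The conversion in both directions is then simple arithmetic. The FileClock struct and both function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-file record kept outside the media file. */
typedef struct {
    int64_t start_wall_ms;  /* wall-clock time of the first frame, ms since the Unix epoch */
    int32_t tb_num, tb_den; /* stream time base: one pts tick is tb_num/tb_den seconds */
} FileClock;

/* Wall-clock time (ms) of a frame with the given pts.
 * Integer division truncates; very large pts values could overflow. */
int64_t pts_to_wall_ms(const FileClock *c, int64_t pts)
{
    assert(c->tb_num > 0 && c->tb_den > 0);
    return c->start_wall_ms + pts * 1000 * c->tb_num / c->tb_den;
}

/* pts to seek to in order to reach the given wall-clock time (ms). */
int64_t wall_ms_to_pts(const FileClock *c, int64_t wall_ms)
{
    assert(c->tb_num > 0 && c->tb_den > 0);
    return (wall_ms - c->start_wall_ms) * c->tb_den / (1000 * (int64_t)c->tb_num);
}
```

With a 1/90000 time base, a frame 40 ms after the start maps to pts 3600. Millisecond timestamps since the Unix epoch do fit comfortably in 48 bits, since 2^48 ms is roughly 8,900 years.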
for (c) (low latency), I found this post:
<http://libav-users.943685.n4.nabble.com/Libav-user-latency-of-mpegts-handling-in-libavformat-tp3678681p3695084.html>
with a suggestion for a solution. Seems to work, but looks fragile, and
requires 4 calls to decode for each read_frame. Perhaps there is a
better way?
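For reference, the delay numbers in this post follow from a one-line calculation. The helper name below is hypothetical and the sketch only does the arithmetic; it does not touch the decoder.

```c
#include <assert.h>

/* Delay in milliseconds caused by holding back `frames` frames at `fps`. */
double reorder_delay_ms(int frames, double fps)
{
    assert(frames >= 0 && fps > 0.0);
    return frames * 1000.0 / fps;
}
```

Four frames at 5fps come to 800 ms, and one frame at 25fps is the 40 ms precision needed for frame-synchronized playback.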
for (d) (visual timestamp) and (e) (frame boundary) I have no idea.
Thank you for your time and ideas,
Camera Man.