[MEncoder-users] V210 Codec

Larry Reznick lreznick at idistream.com
Wed Feb 18 22:01:23 CET 2009


Reimar Döffinger wrote:
> On Wed, Feb 18, 2009 at 11:01:29AM -0800, Larry Reznick wrote:
>   
>> Looking inside the MOV reveals that the time scale is 2997, the duration 
>> is 219125, which works out to 73.114781 seconds, and the number of 
>> frames is 1753. Thus, 1753/(219125/2997) = 23.976, or 2997/125. Despite 
>> any internal single-precision rounding error the YUV4MPEG2 header should 
>> have got the 2997:125 value because 1753*2997 = 5253741, which fits into 
>> 32 bits, and 5253741/219125 reduces to 2997/125. The rational fraction 
>> routine MPlayer uses is a problem. Creeping sync errors bit me too many 
>> times in the past.
>>     
>
> Ah, I missed your source was a mov file. The problem with that format is
> that it is inherently variable-framerate. libavformat guesses a fps
> value based on the time-stamps of the first few frames, so it is
> unlikely it's the same value as you just calculated.
>   

That's unfortunate, but I suppose it was the easy way to handle multiple 
formats with varying information.


>>>> Mathematically, this fraction works out to 
>>>> 23.97599983 instead of 23.976. In fact, 23.976 should have been 
>>>> expressed as 2997/125.
>>>>         
>>> 23.976 is a rounded value for display, not exact.
>>>   
>>>       
>> I understand, but, as I showed above, the metadata in the MOV works out 
>> to that number, while single-precision error delivered the number 
>> that went into the YUV4MPEG2 file header. While it may or may not be 
>> accurate to put 24000:1001 in the file when 2997:125 comes out of the 
>> MOV data, it is certainly inaccurate to put 12570329:524288 into the file.
>>     
>
> The value 2997:125 is nowhere in the metadata though. Your guess may be
> more precise, but requires knowing the full length of the video and the
> number of frames (not so hard for mov) and works very badly if you have
> a few frames with excessive duration (e.g. multiple seconds) somewhere
> in the file.
>   

Actually, it is in the metadata, although I've reduced the fraction. The 
timescale (2997 for this clip) is repeated in many atoms (boxes). In the 
MOOV atom, the MVHD subatom has both it and the duration (219125 for 
this clip). Divide the duration by the timescale to get the clip's 
length in seconds. Furthermore, the MVHD timescale is the default when 
no other timescale comes along. An example of that is when the duration 
is repeated in the TRAK atom's TKHD subatom, which does not contain the 
timescale. However, both these values repeat in other atoms. As for the 
length, in the STBL atom, the STTS atom has the length in frames (1753 
for this clip). So does the STSZ, although there are two ways to 
interpret that. My point is that one must read these atoms to know how 
to interpret the MDAT atom. Consequently, the 1753, the 2997, and the 
219125 for this clip are knowable quickly and they reduce to 2997:125 in 
32 bits as I showed previously.
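
The arithmetic can be checked with exact rationals. The sketch below is 
illustrative only: the three values are the ones quoted above for this 
clip, and the function name average_frame_rate is made up for the 
example. It is not how MPlayer or libavformat compute the rate.

```python
from fractions import Fraction

def average_frame_rate(frame_count, timescale, duration):
    """Average frames per second as an exact, reduced fraction.

    duration / timescale is the clip length in seconds, so
    frame_count / (duration / timescale) is the average rate.
    """
    return Fraction(frame_count * timescale, duration)

# Values quoted for this clip: STTS frame count, MVHD timescale and duration.
rate = average_frame_rate(1753, 2997, 219125)

print(rate)                  # 2997/125
print(float(rate))           # 23.976
print(1753 * 2997 < 2**32)   # True: the intermediate product fits in 32 bits
```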


> You could try if -demuxer mov guesses better than libavformat though.
>   

That's an excellent suggestion. I tried it and got the following output:

====8<====
$ time mplayer -nocache -nosound -fps 24000/1001 -demuxer mov \
    -vo yuv4mpeg:file=spirit_720p_3.y4m ~/movies/trailers/the_spirit_h720p.mov
MPlayer SVN-r28635-4.3.0 (C) 2000-2009 MPlayer Team
138 audio & 293 video codecs

Playing /home/lreznick/movies/trailers/the_spirit_h720p.mov.
ISO: File Type Major Brand: Original QuickTime
Quicktime/MOV file format detected.
[mov] Video stream found, -vid 0
[mov] Audio stream found, -aid 1
VIDEO:  [avc1]  1280x544  24bpp  23.976 fps    0.0 kbps ( 0.0 kbyte/s)
Clip info:
 comments: Encoded and delivered by apple.com/trailers/
 copyright:  2008 Lionsgate Films. All Rights Reserved
 name: The Spirit
Using (default) progressive frame mode.
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
Selected video codec: [ffh264] vfm: ffmpeg (FFmpeg H.264)
==========================================================================
Audio: no sound
FPS forced to be 23.976  (ftime: 0.042).
Starting playback...
VDec: vo config request - 1280 x 544 (preferred colorspace: Planar YV12)
VDec: using Planar YV12 as output csp (no 0)
Movie-Aspect is undefined - no prescaling applied.
VO: [yuv4mpeg] 1280x544 => 1280x544 Planar YV12
V:  73.1 1753/1753  8%  6%  0.0% 0 0

Exiting... (End of file)

real    1m13.090s
user    0m7.076s
sys    0m3.321s
====>8====

You'll notice that this time the [mov] demuxer announced itself instead of 
[lavf], and the frame counts appeared, unlike with lavf, but the header was 
the same:

====8<====
$ head -1 spirit_720p_*.y4m
==> spirit_720p_2.y4m <==
YUV4MPEG2 W1280 H544 F6285171:262144 Ip A1:1

==> spirit_720p_3.y4m <==
YUV4MPEG2 W1280 H544 F6285171:262144 Ip A1:1
====>8====

File #2 was generated by lavf by default while #3 was generated by 
explicit "-demuxer mov" yet they have the same F argument in the header 
-- even though I used the -fps option. Oh, well.
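
For what it's worth, both odd-looking fractions in this thread are 
consistent with a rate that passed through a single-precision float 
before being turned into a fraction. The following is a sketch under 
that assumption, not a reading of MPlayer's source; as_float32_fraction 
is a made-up helper name.

```python
import struct
from fractions import Fraction

def as_float32_fraction(value):
    """Round value to IEEE 754 single precision, then return it as an exact fraction."""
    (rounded,) = struct.unpack("f", struct.pack("f", value))
    return Fraction(rounded)

forced = as_float32_fraction(24000 / 1001)   # the rate given with -fps
displayed = as_float32_fraction(23.976)      # the rounded display value

print(forced)                            # 6285171/262144
print(displayed)                         # 12570329/524288
print(forced == Fraction(24000, 1001))   # False: the exact ratio is lost
```

If the assumption holds, F6285171:262144 is what 24000/1001 looks like 
after single-precision rounding, and 12570329:524288 is what 23.976 
looks like after the same rounding.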


>>>> However, one could argue that the rate is 
>>>> supposed to be 24000/1001, which is 23.976023976 (repeating the last six 
>>>> digits). Thus, while 23.976 is theoretically inaccurate enough, the 
>>>> bizarre fraction MPlayer comes up with causes even more error.
>>>>         
>>> Well, it's not the one I get, mine corresponds to
>>> 23.97602462768554687500.
>>> Nevertheless, since the calculation method was already changed in
>>> mencoder I saw no fault in using the same in that code, which works for
>>> all common frame rates (though I notice it might cause issues for frame
>>> rates < 0), e.g. I now get F24000:1001
>>>       
>> It's an interesting problem. As you can see, I'm using an SVN version 
>> from this week and I don't get the number you got.
>>     
>
> Well, IMO this is simply because your source is in a format with
> variable frame-rate and non-exact timestamps. Still, you can check if it
> is any better with very latest SVN. I don't think so, but at least if
> you specify explicitly -fps 24000/1001 it should write the right
> numbers, which is simpler than editing it afterwards (and works with
> fifos).

I've parsed the MOV file with a tool I wrote last year. The mp4dump 
program shows much of this same info. The video STTS uses the 
single-entry format, which means that its setting is uniform across the 
entire clip. That is, STTS specifies a count of 1753 and a duration of 
125. Multiply them to get the duration 219125. Then, of course, take the 
2997 timescale and divide it by the frame duration of 125 to get the 
frame rate of 23.976. So the frame time does not vary from frame to frame, and 
the timestamps work out to exactly 23.976 fps by these rational fractions. However, 
there is a CTTS table for the B-frames that alternates +125 & -125 
entries. Is that what you're thinking of as a variable frame rate? I 
thought the CTTS was used for composition reordering.
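
To make the STTS arithmetic concrete, here is a small sketch. It assumes 
the table is given as (sample count, sample duration) pairs, as in the 
single entry described above; the function name stts_summary is 
hypothetical and no real MOV parsing is done.

```python
from fractions import Fraction

def stts_summary(entries, timescale):
    """Summarise an STTS table given as (sample_count, sample_delta) pairs.

    Returns (frame_count, total_duration, frame_rate), where frame_rate is an
    exact fraction if every frame has the same duration, else None.
    """
    frame_count = sum(count for count, _ in entries)
    total_duration = sum(count * delta for count, delta in entries)
    deltas = {delta for _, delta in entries}
    frame_rate = Fraction(timescale, deltas.pop()) if len(deltas) == 1 else None
    return frame_count, total_duration, frame_rate

# Single-entry table quoted for this clip, with the 2997 timescale.
print(stts_summary([(1753, 125)], 2997))
# (1753, 219125, Fraction(2997, 125))
```

A table with more than one distinct sample duration would return None for 
the rate, which is the case where a single frame rate is not well defined.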

--Larry


