[FFmpeg-devel] FFV1 Specification

Michael Niedermayer michaelni at gmx.at
Sat Apr 7 14:57:15 CEST 2012


On Sat, Apr 07, 2012 at 10:44:45AM +0930, Rodney Baker wrote:
> On Sat, 7 Apr 2012 07:26:47 Michael Niedermayer wrote:
> > On Fri, Mar 30, 2012 at 11:53:58AM +0200, Michael Niedermayer wrote:
> > > Hi
> > > 
> > > Just wanted to announce that I've moved the FFV1 spec to GitHub and
> > > I am working on cleaning it up and updating it to match the existing
> > > implementation.
> > > 
> > > see: https://github.com/FFmpeg/FFV1
> > > 
> > > Patches, pull requests and comments are, as always, welcome.
> > 
> > latest draft at github and at:
> > http://ffmpeg.org/~michael/ffv1-draft/ffv1.html
> > 
> > If someone could read through it and point out where it's unclear or
> > incomplete, that would be very helpful!
> > I imagine I can easily miss gaps given that I know the
> > codec pretty well ...
> > 
> > Also spellcheck/grammar/formatting tips are welcome too!
> > 
> > [...]
> 
> Michael,
> 
> Comments re spelling/grammar/style (I'll leave the technical review to others 
> who know what they're talking about). :-)
> 
> Section 3:
> 
> >"In the case of the JPEG2000-RCT colorspace the lines are interleaved to 
> >reduce cache trashing as most likely the RCT will be immedeatly converted to 
> >RGB during decoding, the order of the lines in the interleaving is again 
> >Y,Cb,Cr. "
> 
> In the case of the JPEG2000-RCT colorspace the lines are interleaved to reduce 
> cache trashing since it is most likely that the RCT will immediately be 
> converted to RGB during decoding; the interleaved coding order is also 
> Y,Cb,Cr.
> 
> [Not sure about "cache trashing" - sounds too "colloquial" for a technical 
> document - is there a better way to say this? Perhaps, "to improve caching 
> efficiency"?]

Yes, "to improve caching efficiency" works great.


> 
> >Samples within a plane are coded in raster scan order (left->right, top-
> >bottom), each sample is predicted by the median predictor from samples in the 
> same plane and the difference is stored 
> 
> s/bottom), each/bottom). Each/ OR s/bottom), each/bottom); each/
> 
> s/stored/stored./ (Apparently missing full-stops in many other places, too). 
> 
> Is this sentence incomplete? How is the difference stored?

reference added
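
For readers without the draft at hand, here is a minimal sketch of the
median predictor named in the quoted sentence. It assumes the usual FFV1
neighbourhood (left, top and top-left samples of the same plane); the
function names are made up for this mail and are not taken from the spec
or from libavcodec.

```python
def median3(a, b, c):
    """Return the middle value of three integers."""
    return sorted((a, b, c))[1]


def predict(left, top, top_left):
    """Median prediction from three already-coded neighbours.

    The third candidate is the planar gradient estimate
    left + top - top_left.
    """
    return median3(left, top, left + top - top_left)


def residual(sample, left, top, top_left):
    """Difference between the actual sample and its prediction."""
    return sample - predict(left, top, top_left)
```

The residual is what is handed to the entropy coder; the decoder computes
the same prediction and adds the residual back.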


> 
> Section 3.1:
> 
> > For the purpose of the predictior and context samples above the coded 
> picture are assumed to be 0, right of the coded picture are identical to the 
> closest left sample. And left of the coded picture are identical to the top 
> right one if such exist or 0. 
> 
> s/0, right/0; samples to the right/
> 
> s/left sample. And/left sample; samples to the left/
> 
> s/top right one if such exist or 0/top right sample (if there is one), 
> otherwise 0./
> 
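
The border rule from Section 3.1 can be sketched as below. This is a
literal reading of the quoted sentence, not a copy of the implementation:
"top right one" is taken to mean the sample one row up and one column to
the right of the requested position, and the helper name sample_at is
hypothetical.

```python
def sample_at(plane, x, y):
    """Return the sample at (x, y), applying the border assumptions.

    plane is a list of rows, each row a list of integer samples.
    """
    width = len(plane[0])
    if y < 0:
        # Above the coded picture: assumed to be 0.
        return 0
    if x < 0:
        # Left of the coded picture: same as the top-right neighbour,
        # which evaluates to 0 on the first row.
        return sample_at(plane, x + 1, y - 1)
    if x >= width:
        # Right of the coded picture: closest sample to the left.
        return plane[y][width - 1]
    return plane[y][x]
```

With this helper, the predictor and context code never need a special
case at the picture edges.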
> Section 3.6:
> 
> >Instead of coding the n+1 (or n+2 in the case of RCT) bits of the sample 
> difference with huffman or range coding only the n (or n+1) least significant 
> bits are used as thats enough the recover the original sample. bits in the 
> equation below is bits_per_raw_sample+1 for RCT and bits_per_raw_sample 
> otherwise. 
> 
> Instead of coding the n+1 bits of the sample difference with huffman or range 
> coding (or n+2 bits, in the case of RCT), only the n (or n+1) least 
> significant bits are used, since this is sufficient to recover the original 
> sample. 
> 
> In the equation below, bits represents bits_per_raw_sample+1 for RCT or 
> bits_per_raw_sample otherwise.
> 
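
To illustrate why the n least significant bits are enough, here is a
small sketch of the wrap-around. It assumes samples and predictions both
lie in 0 .. 2^bits - 1; the function names are illustrative only. For RCT
the same code would be called with bits = bits_per_raw_sample + 1.

```python
def fold(diff, bits):
    """Map a difference into the signed range -2^(bits-1) .. 2^(bits-1)-1,
    keeping only its 'bits' least significant bits."""
    half = 1 << (bits - 1)
    mask = (1 << bits) - 1
    return ((diff + half) & mask) - half


def encode_sample(sample, pred, bits):
    """Value handed to the entropy coder."""
    return fold(sample - pred, bits)


def decode_sample(coded, pred, bits):
    """Recover the original sample; arithmetic is modulo 2^bits."""
    return (pred + coded) & ((1 << bits) - 1)
```

For example, with bits = 8, sample = 250 and pred = 3 the true difference
is 247, which is coded as -9; the decoder computes (3 - 9) modulo 256 and
gets 250 back.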
> 3.6.1:
> 
> s/H.264[2]. But/H.264[2], but/
> 

> s/situation as well as its slightly worse performance CABAC/situation (as well 
> as its slightly worse performance) CABAC/

That makes it look like we care more about patents than performance.


> 
> Non binary values:
> 
> s/integers, we could simply encode/ integers it would be possible to encode/
> 
> s/context, but /context, however/ OR s/context, but/context but/ (I like the 
> first option better). 
> 
> s/symbol which is not only a waste of memory but also requires more past 
> data/symbol which requires both more memory and more past data
> 
> s/reasonable/a reasonably/
> 
> s/Alternatively simply assuming/Alternatively, assuming/
> 
> s/mean like we do in huffman coding mode would be another possibility/
> mean (as in huffman coding) would also be possible/
> 
> s/but due to flexibility and simplicity, another method was chosen, which 
> simply/ however, for maximum flexibility and simplicity, the chosen method/
> 
> s/mantisse and sign, the exact contexts which are used/mantissa and sign. The 
> exact contexts used/
> 
> s/can probably better be described by the following code then by some english 
> text/are best described by the following code, followed by some comments./
> 
> 3.6.2:
> 
> Need definitions in the definitions/glossary section for VLC and ESC (and if 
> we are to be pedantic MSB and any other acronyms used, unless they are 
> considered to be so commonly in use among the target audience as to be 
> completely unambiguous - MSB may well fall into this category).
> 

> Fix spacing between Suffix/non ESC and non ESC/Examples.

?


> 
> run mode/run length coding/level coding - capitalisation? 
> 
>  s/mode, and/mode and/
> 
> s/difference, on/ difference. On/
> 
> s/improved the compression rate a bit/slightly improved the compression rate./ 
> (unless you meant literally "one bit"). 
> 
> 
>  4.2 Header:
> 
> >version 0 or 1 
> >coder_type Coder used, 0 (Golomb Rice), 1 (Range coder), 2 (Range coder with 
> custom state transition table) 
> >state_transition_delta The range coder custom state transition table. If it 
> is not coded, all its elements are assumed to be 0. 
> >colorspace_type 0 (YCbCr), 1 (JPEG2000_RCT) 
> >chroma_planes 1 for color, 0 for grayscale 
> >bits_per_raw_sample The number of bits for each sample, commonly 8, 9, 10 or 
> 16 
> >h_chroma_subsample The subsample factor between luma and chroma width 
> (chroma_width = 2^(-log2_h_chroma_subsample) * luma_width) 
> >v_chroma_subsample The subsample factor between luma and chroma height 
> (chroma_height = 2^(-log2_v_chroma_subsample) * luma_height) 
> >alpha_plane 1 if a transparency plane is stored, 0 otherwise 
> 
> Need delimiters between value names and descriptions. Might be better in a 
> table.

This issue seems specific to the HTML output. I don't know; maybe it can
be fixed by changing the configuration somehow. Tips welcome ...
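
Since the HTML output also lost the exponent in the two subsample
formulas, here is what they are meant to express: chroma_width =
2^(-log2_h_chroma_subsample) * luma_width, and likewise for the height.
The sketch below uses a shift for the power of two. Rounding up when the
luma size is not divisible is an assumption of this example, and the
function name is made up.

```python
def chroma_dimensions(luma_width, luma_height, log2_h, log2_v):
    """Chroma plane size for the given log2 subsample factors.

    -((-n) >> s) is ceiling division of n by 2**s (assumed rounding).
    """
    chroma_width = -((-luma_width) >> log2_h)
    chroma_height = -((-luma_height) >> log2_v)
    return chroma_width, chroma_height
```

For 4:2:0 material both factors are 1, so a 1920x1080 luma plane gives
960x540 chroma planes.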

Other changes integrated.

Thanks a lot!

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus