[MPlayer-dev-eng] Patches for NUT

Fri Feb 10 22:18:38 CET 2006

Hi

On Fri, Feb 10, 2006 at 08:56:27PM +0200, Oded Shimon wrote:
> On Fri, Feb 10, 2006 at 03:51:59PM +0100, Michael Niedermayer wrote:
> > Hi
> > 
> > On Fri, Feb 10, 2006 at 02:48:23AM -0500, Rich Felker wrote:
> > > On Fri, Feb 10, 2006 at 02:34:30AM +0200, Oded Shimon wrote:
> > > > On Thu, Feb 09, 2006 at 06:43:30PM +0100, Michael Niedermayer wrote:
> > > > > On Thu, Feb 09, 2006 at 03:04:18PM +0200, Oded Shimon wrote:
> > > > > [...]
> > > > > > > and iam not happy about the 24bit hack either, either store adler32 or crc32
> > > > > > > everywhere IMHO
> > > > > > 
> > > > > > I suppose crc32 then, it's more common...
> > > > > 
> > > > > ok
> > > > 
> > > > Can I use the lutless crc in libnut or is it unusably slow? The biggest 
> > > > single chunk passed to it is 80kb index, and a total of 200-300kb of 
> > > > syncpoint data, with 15 byte chunks 30,000 times...
> > > 
> > > IMO it's no problem at all. Reading headers does not need to be fast,
> > > and it won't be slow for summing 10-15 bytes. In fact for this
> > > purpose, I expect overall decoding will be faster than with the
> > > table-based one because it won't pollute the cache with tables.
> > 
> > i fully agree with that for the headers, and the same might be true
> > for the syncpoints, but what about the index? its maybe 100k ...
> > 
> > anyway, it shouldnt matter, the user can calculate crcs with as large or as 
> > small tables as he wishes, for example:
> > 
> > static const int tab[16]={0x00000000, 0x00000000, 0x00000000, 0x00000000,
> >   0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
> >   0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000};
> > 
> > for(i=0; i<size; i++){
> >     crc ^= buf[i];
> >     crc = (crc>>4) ^ tab[crc&0xF];
> >     crc = (crc>>4) ^ tab[crc&0xF];
> > }
> > 
> > so this isnt a big issue, we just need to decide which generator polynom we
> > want to use for NUT CRCs, and we should check that its bit ordering and byte
> > ordering matches the most commonly used one
> 
> Hmm, I used the const you said earlier (0x1EDC6F41) . Seems Ogg uses 
> 0x04C11DB7 . 

0x1EDC6F41 is the Castagnoli / iSCSI CRC
0x04C11DB7 is the IEEE one, which seems more popular, so i think we should
switch to that

> Byte ordering is global and well defined for NUT ....

iam not talking about nut but the crc:

#define G 0x04C11DB7
for(i=0; i<size; i++){
    crc ^= buf[i]<<24;
    for(j=0; j<8; j++)
        crc= (crc<<1) ^ (G & (crc>>31));
}
put_byte(crc>>24); put_byte(crc>>16); put_byte(crc>>8); put_byte(crc);

vs.

#define G ((0x04C11DB7>>1) + 0x10000000)
for(i=0; i<size; i++){
    crc ^= buf[i];
    for(j=0; j<8; j++)
        crc= (crc>>1) ^ (G & -(crc&1));
}
put_byte(crc); put_byte(crc>>8); put_byte(crc>>16); put_byte(crc>>24);

and furthermore you can reverse the bits of 0x04C11DB7 to 0xDB710641 for both

we should also maybe use a non zero initial value other wise every all zero 
array will have a matching crc

> 
> Are you still against crc being immediately following syncpoint startcode? 
> It's the most natural position for it. Having it after frame header is very 
> weird, and having it at end of syncpoint is even worse as it is in the 
> middle of the area it checksums.

yes, ill explain the reason why
1.
<data> <crc>
crc is what the crc routine outputs after being feeded with data from left
to right 1 byte at a time
if you now run the crc routine over the data and the following 4 bytes of the
crc too then it will return 0, that obviously wont work if the crc is stored
somewhere else

2. crc have nice burst detection capabilities (every single burst error of
32 or less bits is detected and in the case of 0x04C11DB7 every 2 bursts of
8  or less bits are detected if they are less then 1mbit apart, every 2 
bursts of 10 or less bits are detected if they are <10kbit apart, ...
every 6 bit errors will be detected as long as they are confined to a area
of ~200bits

now these things depend on the mathematically correct placement of the 
crc checksum, if you place it at the begin they will very likely no longer
hold

> I'm also still a bit hesitating about the overhead of adding 4 bytes per 
> syncpoint, but I guess it's ok.

well, suggest an alternative ...
IMHO 4byte checksum vs. nothing is not hard to decide, the problems the 
"nothing" variant causes are too big ...

> 
> One more issue - now that index does not mark EOF as it can appear 
> anywhere, the index_ptr comes out of nowhere, the demuxer can even mistake 
> it for being a frame startcode (It could even appear as a NUT startcode!).
> So I suggest making a end_startcode before the index_ptr...

just put index_ptr into the index, right before the checksum

[...]

-- 
Michael