Incomplete description of checksum algorithm

Hi, the checksum definition in the current spec seems incomplete or unprecise to me. I have prepared the following patch to complement Oded's last change. It is based on the last discussion of the subject on mplayer-dev-eng. It is just a suggestion. Alex (beastd)

Hi On Wed, Feb 15, 2006 at 11:19:41PM +0100, Alexander Strasser wrote:
Hi,
the checksum definition in the current spec seems incomplete or unprecise to me. I have prepared the following patch to complement Oded's last change. It is based on the last discussion of the subject on mplayer-dev-eng.
It is just a suggestion.
hmm, alternative suggestion: crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data btw, why doesnt the syncpoint have a packet header / forward ptr !? [...] -- Michael

Hi, Michael Niedermayer wrote:
On Wed, Feb 15, 2006 at 11:19:41PM +0100, Alexander Strasser wrote:
the checksum definition in the current spec seems incomplete or unprecise to me. I have prepared the following patch to complement Oded's last change. It is based on the last discussion of the subject on mplayer-dev-eng.
It is just a suggestion.
hmm, alternative suggestion:
looks a bit scary :) and i am way too tired to read it closely...
crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data
ok, while we are at it, should we use non zero initial value? if yes which?
btw, why doesnt the syncpoint have a packet header / forward ptr !?
no clue Alex (beastd)

Hi On Thu, Feb 16, 2006 at 01:47:32AM +0100, Alexander Strasser wrote: [...]
crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data
ok, while we are at it, should we use non zero initial value? if yes which?
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a non zero initial value or xor some part of the whole (like the checksum) with some non zero value which? well -1 seems popular ... so one of the following is needed * add some guaranteed non zero element (iam not sure if width, sample_rate,... should be considered enough because IMHO it should be possible to find a undamaged packet purely with the checksum and without parsing the whole) IMHO the forward_ptr is ideal choice for this ... * set the initial value to -1 * xor the checksum with -1 [...] -- Michael

On Thu, Feb 16, 2006 at 04:44:02PM +0100, Michael Niedermayer wrote:
Hi
On Thu, Feb 16, 2006 at 01:47:32AM +0100, Alexander Strasser wrote: [...]
crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data
ok, while we are at it, should we use non zero initial value? if yes which?
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a
Is there an all-zero case? As far as I can tell, there's no NUT packet that's valid as all zeros, except possibly the very first syncpoint/header pair in the file. Rich

Hi On Thu, Feb 16, 2006 at 12:12:11PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 04:44:02PM +0100, Michael Niedermayer wrote:
Hi
On Thu, Feb 16, 2006 at 01:47:32AM +0100, Alexander Strasser wrote: [...]
crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data
ok, while we are at it, should we use non zero initial value? if yes which?
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a
Is there an all-zero case? As far as I can tell, there's no NUT packet that's valid as all zeros, except possibly the very first syncpoint/header pair in the file.
maybe, but having all zero packets with always matching checksum is risky, we will have to check that no change we do might lead to legal all zero packets, and its also more tricky on the demuxer side (search for a packet with matchig checksum vs. search for a packet we can parse with no errors and which has a matching checksum) [...] -- Michael

On Thu, Feb 16, 2006 at 07:47:49PM +0100, Michael Niedermayer wrote:
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a
Is there an all-zero case? As far as I can tell, there's no NUT packet that's valid as all zeros, except possibly the very first syncpoint/header pair in the file.
maybe, but having all zero packets with always matching checksum is risky, we will have to check that no change we do might lead to legal all zero packets, and its also more tricky on the demuxer side (search for a packet with matchig checksum vs. search for a packet we can parse with no errors and which has a matching checksum)
Fine, IMO it's ok to include forward pointer, but what about syncpoints? Do we really want to add another byte to them? :( Rich

Hi On Thu, Feb 16, 2006 at 04:06:22PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 07:47:49PM +0100, Michael Niedermayer wrote:
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a
Is there an all-zero case? As far as I can tell, there's no NUT packet that's valid as all zeros, except possibly the very first syncpoint/header pair in the file.
maybe, but having all zero packets with always matching checksum is risky, we will have to check that no change we do might lead to legal all zero packets, and its also more tricky on the demuxer side (search for a packet with matchig checksum vs. search for a packet we can parse with no errors and which has a matching checksum)
Fine, IMO it's ok to include forward pointer, but what about syncpoints? Do we really want to add another byte to them? :(
a compromise would be a flag in the main header for that, so as long as we havnt added any fields to syncpoints there wont be a forward ptr ... but personally i probably would simply unconditionally add the forward ptr [...] -- Michael

On Fri, Feb 17, 2006 at 03:31:04AM +0100, Michael Niedermayer wrote:
Hi
On Thu, Feb 16, 2006 at 04:06:22PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 07:47:49PM +0100, Michael Niedermayer wrote:
personally i would simply include the forward ptr, so the all zero case would naturally be gone, if the others are against this then we must either use a
Is there an all-zero case? As far as I can tell, there's no NUT packet that's valid as all zeros, except possibly the very first syncpoint/header pair in the file.
maybe, but having all zero packets with always matching checksum is risky, we will have to check that no change we do might lead to legal all zero packets, and its also more tricky on the demuxer side (search for a packet with matchig checksum vs. search for a packet we can parse with no errors and which has a matching checksum)
Fine, IMO it's ok to include forward pointer, but what about syncpoints? Do we really want to add another byte to them? :(
a compromise would be a flag in the main header for that, so as long as we havnt added any fields to syncpoints there wont be a forward ptr ... but personally i probably would simply unconditionally add the forward ptr
Well, I had another idea: having two separate startcodes for syncpoints, one with extensibility and the other with minimal size. However, somehow I doubt it's worth the added complexity in the demuxer to support such nonsense.. Rich

Hi On Wed, Feb 15, 2006 at 10:43:25PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 01:10:41AM +0100, Michael Niedermayer wrote:
btw, why doesnt the syncpoint have a packet header / forward ptr !?
Because the damn thing is already way too big! :(
that makes it non-extendible which breaks one of nuts goals [...] -- Michael

On Thu, Feb 16, 2006 at 04:31:59PM +0100, Michael Niedermayer wrote:
Hi
On Wed, Feb 15, 2006 at 10:43:25PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 01:10:41AM +0100, Michael Niedermayer wrote:
btw, why doesnt the syncpoint have a packet header / forward ptr !?
Because the damn thing is already way too big! :(
that makes it non-extendible which breaks one of nuts goals
IMO no.. frame headers can have arbitrary extendiblity, and syncpoints are unique - their checksum doesn't come at the end of the syncpoint, but at the following frame header... As for forward ptr in checksum, it slightly annoys my coding because i do: header -> temp buffer forward_ptr - length of header+4 -> file temp buffer -> file checksum of temp buffer -> file So I'd have to somehow push the forward ptr to the begginning of the temp buffer or some other solution. It doesn't matter badly though, I'm not really against it. But syncpoints don't need forward ptr though because they are special, they carry themselves AND the following frame header... So you'd have to do some weird thing where the forward ptr only points to the end of the syncpoint, and then there's the frame header, and THEN there is the checksum. it's ugly IMO... - ods15

On Fri, Feb 17, 2006 at 01:56:03PM +0200, Oded Shimon wrote:
On Thu, Feb 16, 2006 at 04:31:59PM +0100, Michael Niedermayer wrote:
Hi
On Wed, Feb 15, 2006 at 10:43:25PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 01:10:41AM +0100, Michael Niedermayer wrote:
btw, why doesnt the syncpoint have a packet header / forward ptr !?
Because the damn thing is already way too big! :(
that makes it non-extendible which breaks one of nuts goals
IMO no.. frame headers can have arbitrary extendiblity, and syncpoints are unique - their checksum doesn't come at the end of the syncpoint, but at the following frame header...
As for forward ptr in checksum, it slightly annoys my coding because i do:
header -> temp buffer forward_ptr - length of header+4 -> file temp buffer -> file checksum of temp buffer -> file
So I'd have to somehow push the forward ptr to the begginning of the temp buffer or some other solution. It doesn't matter badly though, I'm not really against it.
But syncpoints don't need forward ptr though because they are special, they carry themselves AND the following frame header... So you'd have to do some weird thing where the forward ptr only points to the end of the syncpoint, and then there's the frame header, and THEN there is the checksum. it's ugly IMO...
This is all your implementation issue. If the header is enclosed in the checksum, it's natural to consider it part of the syncpoint packet anyway. Also there's no reason a checksum needs to come from just one buffer. As long as you pass a state pointer to your checksum function you can checksum arbitrarily many different buffers as if they were all connected.. Rich

On Fri, Feb 17, 2006 at 09:47:16AM -0500, Rich Felker wrote:
On Fri, Feb 17, 2006 at 01:56:03PM +0200, Oded Shimon wrote:
On Thu, Feb 16, 2006 at 04:31:59PM +0100, Michael Niedermayer wrote:
Hi
On Wed, Feb 15, 2006 at 10:43:25PM -0500, Rich Felker wrote:
On Thu, Feb 16, 2006 at 01:10:41AM +0100, Michael Niedermayer wrote:
btw, why doesnt the syncpoint have a packet header / forward ptr !?
Because the damn thing is already way too big! :(
that makes it non-extendible which breaks one of nuts goals
IMO no.. frame headers can have arbitrary extendiblity, and syncpoints are unique - their checksum doesn't come at the end of the syncpoint, but at the following frame header...
As for forward ptr in checksum, it slightly annoys my coding because i do:
header -> temp buffer forward_ptr - length of header+4 -> file temp buffer -> file checksum of temp buffer -> file
So I'd have to somehow push the forward ptr to the begginning of the temp buffer or some other solution. It doesn't matter badly though, I'm not really against it.
But syncpoints don't need forward ptr though because they are special, they carry themselves AND the following frame header... So you'd have to do some weird thing where the forward ptr only points to the end of the syncpoint, and then there's the frame header, and THEN there is the checksum. it's ugly IMO...
This is all your implementation issue.
BTW, implementation should have some importance, if there's no way to accomplish something without ugly hacks in code it's bad. :/ But this isn't such a scenario.
If the header is enclosed in the checksum, it's natural to consider it part of the syncpoint packet anyway. Also there's no reason a checksum needs to come from just one buffer. As long as you pass a state pointer to your checksum function you can checksum arbitrarily many different buffers as if they were all connected..
Yes, I know, this is why I said I'm not against forward_ptr being in checksum, but I am against forward_ptr being in syncpoints! The point of forward_ptr being in syncpoints is to be able to add fields to syncpoints, not to frame headers, cause we already have that, right? The structure is: <startcode> <forward_ptr> <syncpoint> <reserved> <frame header> <checksum> (reserved being reserved bytes for syncpoint) Now, where the hell does the forward_ptr point to? If it points to after the checksum like all other headers, you won't know where the hell to start parsing the frame header. Pointing to right before the frame header is weird, because the checksum comes later. Putting the reserved bytes of the _syncpoint_ after the frame header is even weirder (data of syncpoint cut in two pieces??) - ods15

On Fri, Feb 17, 2006 at 07:05:48PM +0200, Oded Shimon wrote:
If the header is enclosed in the checksum, it's natural to consider it part of the syncpoint packet anyway. Also there's no reason a checksum needs to come from just one buffer. As long as you pass a state pointer to your checksum function you can checksum arbitrarily many different buffers as if they were all connected..
Yes, I know, this is why I said I'm not against forward_ptr being in checksum, but I am against forward_ptr being in syncpoints!
The point of forward_ptr being in syncpoints is to be able to add fields to syncpoints, not to frame headers, cause we already have that, right? The structure is:
<startcode> <forward_ptr> <syncpoint> <reserved> <frame header> <checksum>
(reserved being reserved bytes for syncpoint) Now, where the hell does the forward_ptr point to? If it points to after the checksum like all other headers, you won't know where the hell to start parsing the frame header. Pointing to right before the frame header is weird, because the checksum comes later. Putting the reserved bytes of the _syncpoint_ after the frame header is even weirder (data of syncpoint cut in two pieces??)
OK, I agree. Michael, do you have comments? Rich

Hi On Fri, Feb 17, 2006 at 12:36:31PM -0500, Rich Felker wrote:
On Fri, Feb 17, 2006 at 07:05:48PM +0200, Oded Shimon wrote:
If the header is enclosed in the checksum, it's natural to consider it part of the syncpoint packet anyway. Also there's no reason a checksum needs to come from just one buffer. As long as you pass a state pointer to your checksum function you can checksum arbitrarily many different buffers as if they were all connected..
Yes, I know, this is why I said I'm not against forward_ptr being in checksum, but I am against forward_ptr being in syncpoints!
The point of forward_ptr being in syncpoints is to be able to add fields to syncpoints, not to frame headers, cause we already have that, right? The structure is:
<startcode> <forward_ptr> <syncpoint> <reserved> <frame header> <checksum>
(reserved being reserved bytes for syncpoint) Now, where the hell does the forward_ptr point to? If it points to after the checksum like all other headers, you won't know where the hell to start parsing the frame header. Pointing to right before the frame header is weird, because the checksum comes later. Putting the reserved bytes of the _syncpoint_ after the frame header is even weirder (data of syncpoint cut in two pieces??)
OK, I agree. Michael, do you have comments?
none which will help the clean way to do this is 2 checksums 1 for the syncpoint, 1 for the frame header, and ptr points to the first checksum/frame header so i agree that no forward ptr in syncpoints is ok, unless we come up with a clean&low overhead solution [...] -- Michael

On Thu, Feb 16, 2006 at 01:10:41AM +0100, Michael Niedermayer wrote:
Hi
On Wed, Feb 15, 2006 at 11:19:41PM +0100, Alexander Strasser wrote:
Hi,
the checksum definition in the current spec seems incomplete or unprecise to me. I have prepared the following patch to complement Oded's last change. It is based on the last discussion of the subject on mplayer-dev-eng.
It is just a suggestion.
hmm, alternative suggestion:
crc32 checksum The checksum is choosen so that c(x) is a multiple of x^32+x^26+x^23+x^22+x^16+x^12+x^11+x^10+x^8+x^7+x^5+x^4+x^2+x+1 over GF(2) c(x) is the polynom coresponding to the block from and including the forward ptr and upto and including the checksum c(x)= (1&(d[0]>>8))*x^(n-1) + (1&(d[0]>>7))*x^(n- 2) + ... + (1&(d[1]>>8))*x^(n-9) + (1&(d[1]>>7))*x^(n-10) + ... + ... + ... + (1&(d[n/8-1]>>1))*x^1 + (1&(d[n/8-1]>>0))*x^0 alternatively you can simply run crc=0; for(i=0; i<size; i++){ crc ^= buf[i]<<24; for(j=0; j<8; j++) crc= (crc<<1) ^ (0x04C11DB7 & (crc>>31)); } over the data
... Comments on this, do we really need such a technichal explanation in NUT spec? Can we point to some other spec? - ods15
participants (4)
-
Alexander Strasser
-
Michael Niedermayer
-
Oded Shimon
-
Rich Felker