[MPlayer-dev-eng] Patches for NUT

Fri Feb 10 09:58:35 CET 2006

On Fri, Feb 10, 2006 at 02:48:23AM -0500, Rich Felker wrote:
> On Fri, Feb 10, 2006 at 02:34:30AM +0200, Oded Shimon wrote:
> > On Thu, Feb 09, 2006 at 06:43:30PM +0100, Michael Niedermayer wrote:
> > > On Thu, Feb 09, 2006 at 03:04:18PM +0200, Oded Shimon wrote:
> > > [...]
> > > > > and I am not happy about the 24-bit hack either; either store adler32
> > > > > or crc32 everywhere IMHO
> > > > 
> > > > I suppose crc32 then, it's more common...
> > > 
> > > ok
> > 
> > Can I use the lutless crc in libnut or is it unusably slow? The biggest 
> > single chunk passed to it is 80kb index, and a total of 200-300kb of 
> > syncpoint data, with 15 byte chunks 30,000 times...
> 
> IMO it's no problem at all. Reading headers does not need to be fast,
> and it won't be slow for summing 10-15 bytes. In fact for this
> purpose, I expect overall decoding will be faster than with the
> table-based one because it won't pollute the cache with tables.

Cool...
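
For reference, a lutless CRC is only a few lines. The sketch below is the
generic zlib/PNG-style CRC-32 (reflected polynomial 0xEDB88320, initial value
and final XOR of 0xFFFFFFFF); that parameter choice and the name
crc32_lutless are assumptions for illustration, not necessarily what libnut
will end up using.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise ("lutless") CRC-32, zlib/PNG flavour.
 * Assumption: generic CRC-32 parameters, not necessarily NUT's variant. */
uint32_t crc32_lutless(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, otherwise zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}
```

It costs 8 shift/xor steps per byte and touches no table, which is the cache
argument above.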

> > > hmm somehow I don't like this, convert_ts & compare_ts are slow, we
> > > should avoid executing them too often, this would need 3x convert_ts() per
> > > frame on the _demuxer_ side just to check if the pts are ok
> > > a common timebase would only need 1 convert_ts()
> 
> Agree, but Oded's implementation already does convert_ts to
> check/enforce monotonicity rule (MN). That's why...

compare_ts, not convert_ts.

> > Rich suggested an implementation for compare_ts that does not require 
> > convert_ts or division (schoolbook long multiplication). As for convert_ts, 
> > its use could be eliminated by preparing the threshold in every timebase at 
> > init, and:
> > (t2 - t1 > thres)
> > 
> > Is identical to:
> > (t2 - thres > t1)
> 
> It's not identical because thres can't necessarily be expressed in the
> timebase of t2. But maybe it can be used somehow.
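
A division-free compare_ts along those lines could look roughly like the
sketch below. It cross-multiplies instead of dividing and does the 64x64->128
bit product with schoolbook long multiplication. The timebase_t and u128_t
types, the mul64 helper, and the assumption that timestamps are unsigned and
that num and den each fit in 32 bits are mine, not taken from libnut.

```c
#include <stdint.h>

typedef struct { uint32_t num, den; } timebase_t; /* hypothetical: seconds = ts * num / den */
typedef struct { uint64_t hi, lo; } u128_t;       /* hypothetical 128-bit result */

/* Schoolbook 64x64 -> 128 bit multiplication using 32-bit halves. */
static u128_t mul64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;
    /* middle column plus the carry out of the low column */
    uint64_t mid = (ll >> 32) + (lh & 0xFFFFFFFFu) + (hl & 0xFFFFFFFFu);
    u128_t r;
    r.lo = (mid << 32) | (ll & 0xFFFFFFFFu);
    r.hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return r;
}

/* Returns -1, 0 or 1 as a (in tb_a) is earlier than, equal to or later
 * than b (in tb_b). a*na/da < b*nb/db  <=>  a*(na*db) < b*(nb*da). */
int compare_ts(uint64_t a, timebase_t tb_a, uint64_t b, timebase_t tb_b)
{
    u128_t x = mul64(a, (uint64_t)tb_a.num * tb_b.den);
    u128_t y = mul64(b, (uint64_t)tb_b.num * tb_a.den);
    if (x.hi != y.hi) return x.hi < y.hi ? -1 : 1;
    if (x.lo != y.lo) return x.lo < y.lo ? -1 : 1;
    return 0;
}
```

For example, compare_ts(25, {1,25}, 1000, {1,1000}) compares one second
against one second and returns 0, with no rounding involved.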

Given the implementation I showed, it is identical. This is what I hated 
about this threshold idea: it's very imprecise, which is why I now suggest 
a pts_threshold, per stream, measured from last_pts. It's the same idea 
Michael had before my suggestion; however, last_pts is _NOT_ the pts of the 
last frame in the same stream (which would be very far away in the case of 
subtitles), it is the last_pts CONTEXT, which is reset by syncpoints all 
over. So, even for subtitles, it will never be more than 1 second away or 
so. This won't even need an extra variable. I think it is the best 
solution...
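
A rough sketch of that check, to show that nothing needs converting per
frame. All names here (stream_ctx_t, stream_reset_at_syncpoint,
stream_check_pts) are hypothetical, and treating the threshold as a symmetric
distance is my assumption. Both fields are in the stream's own timebase; the
syncpoint timestamp is assumed to be converted into it once per syncpoint,
not once per frame.

```c
#include <stdint.h>

/* Hypothetical per-stream state, everything in the stream's own timebase. */
typedef struct {
    uint64_t last_pts;       /* last_pts context, reset at every syncpoint */
    uint64_t pts_threshold;  /* largest allowed distance from last_pts */
} stream_ctx_t;

/* ts must already be expressed in this stream's timebase. */
void stream_reset_at_syncpoint(stream_ctx_t *s, uint64_t ts)
{
    s->last_pts = ts;
}

/* Returns 1 and updates the context if pts is within the threshold,
 * returns 0 and leaves the context alone otherwise. */
int stream_check_pts(stream_ctx_t *s, uint64_t pts)
{
    uint64_t dist = pts > s->last_pts ? pts - s->last_pts
                                      : s->last_pts - pts;
    if (dist > s->pts_threshold)
        return 0;
    s->last_pts = pts;
    return 1;
}
```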

> > > I think we need to better understand the distribution of errors on common
> > > media, it would be silly to complicate the index unnecessarily but it
> > > would also be silly if a 1k lost block in a 100k index makes a file
> > > sloooow-seekable
> 
> "sloooow-seekable" is only for extremely slow media. We've been
> testing libnut on files on hdd, and seeking with the binary search is
> instantaneous (not noticeably slower than with index) and gives perfect
> results. Still waiting for some tests to be done on cdrom/dvd, but I
> expect the results will be acceptable.

Using a CD-RW and a 350 MB file, unsurprisingly, index seek is far more 
enjoyable than binary search. Index seeking is nearly immediate; binary 
search takes about 1-2 seconds for each seek (not horrible, but an annoying 
frozen window...), and the syncpoint cache isn't very helpful except when 
seeking back to a spot you've actually already played (in which case 
there is no binary search at all). It's just barely helpful when seeking 
very close (<30 seconds) to a syncpoint already seen. For anything else, 
it's about 8 underlying seeks, the same as without any syncpoint cache at 
all... Maybe the linear interpolation code can be improved.

I just played with it a bit; it seems pure linear interpolation is better 
than what it was, which was a mixture of linear interpolation and binary 
search (7/8 weight to linear interpolation). Unfortunately, in some rare 
cases, linear interpolation alone has cost me as many as 24 seeks. With the 
mixture I never got higher than 10 seeks, but rarely lower than 6, whereas 
linear interpolation usually gives me 3-5 seeks... I have now tried 
changing the weight to 19/20. It's somewhat better.
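
To make the weighting concrete, here is a minimal sketch of how such a mixed
guess can be computed. This is not the libnut code; guess_pos and its
parameters are made up for illustration. It assumes lo_ts <= target_ts <=
hi_ts, lo_pos <= hi_pos, w_num <= w_den, and file offsets small enough that
offset * w_den does not overflow 64 bits.

```c
#include <stdint.h>

/* Next byte offset to probe between two known points (lo_pos, lo_ts) and
 * (hi_pos, hi_ts). w_num/w_den is the weight given to linear interpolation,
 * the remainder goes to the plain binary-search midpoint. */
uint64_t guess_pos(uint64_t lo_pos, uint64_t lo_ts,
                   uint64_t hi_pos, uint64_t hi_ts,
                   uint64_t target_ts, unsigned w_num, unsigned w_den)
{
    uint64_t span = hi_pos - lo_pos;
    uint64_t mid_off = span / 2;
    uint64_t lin_off = mid_off; /* fallback when both timestamps are equal */

    if (hi_ts > lo_ts)
        lin_off = (uint64_t)((double)span * (double)(target_ts - lo_ts)
                             / (double)(hi_ts - lo_ts));

    return lo_pos + (lin_off * w_num + mid_off * (w_den - w_num)) / w_den;
}
```

With w_num/w_den = 1/1 this is pure linear interpolation, with 0/1 it is pure
binary search, and 7/8 or 19/20 give the mixtures described above.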

> BTW if you're using the reference implementation, seeking gets
> progressively faster each time you seek, due to caching of syncpoints
> and building of a dynamic index in memory. Unless there's a major
> obstacle to using this (like very restricted memory space) I expect
> people will just use libnut in most cases because it performs so well.
> 
> > If you are talking about index, then you are not talking about 
> > streaming, then the only damage left is either p2p or local storage. 
> > AFAIK, those 2 only ever give damage in chunks, usually big chunks... BTW, 
> > if you truncate a NUT file by a single byte (or, even append a single byte), 
> > libnut will not be able to read the index, because there is no index_ptr...
> 
> It could still search for the index; however, the likelihood of 1-4
> bytes being truncated without part of the index itself also being
> truncated is very low, I think.. :)

Yeah. That reminds me, I want to update libnut to search for the index 
immediately after the headers it just read (just read 8 bytes and see if 
they are a startcode before going to EOF to look for index_ptr).
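
Roughly, the check would be something like the sketch below. It assumes the
startcode is stored as a big-endian 64-bit value; the function names are
hypothetical and the expected startcode is passed in by the caller instead of
being hardcoded here.

```c
#include <stdint.h>

/* Assemble 8 bytes into a 64-bit value, most significant byte first. */
static uint64_t read_be64(const uint8_t buf[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}

/* Returns nonzero if the 8 bytes just read match the expected startcode.
 * Only when this fails would the demuxer go to EOF and look for the index
 * there. */
int looks_like_startcode(const uint8_t buf[8], uint64_t expected)
{
    return read_be64(buf) == expected;
}
```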

- ods15