[MPlayer-dev-eng] [PATCH] RFC: CrystalHD decoder support

Tue Dec 21 04:36:15 CET 2010

 On 2010-12-20 19:15:13, Reimar Döffinger  wrote:
>
> Really? That amount of buffered frames would mean this hardware is 
> completely
> useless for video telephone or any other interactive application.

 It seems ridiculous but I have not been able to convince it to give me 
 a decoded
 frame without shoving more data into it - no matter how long I wait.

> This is more than 4 seconds, you are going to run into _serious_ 
> issues,
> with that, e.g. if combined with uncompressed multichannel PCM audio
> 4 seconds is probably more than MPlayer will be able to buffer, 
> causing
> playback to fail.

 Highly likely :-/

> I'm also not sure how
> QUERY_UNSEEN_FRAMES should be used. I don't
> know how many complete frames are in the decoder waiting to be
> decoded

 Why don't you know that? At least for the non-interlaced, 
 non-error-concealment
 case it should be just a matter of "number of packets in - number of 
 frames out".

> and mplayer seems to be
> comparing the result with buffered_pts which comes from the pts
> passed to each decode call,
> so I'm pretty confused.

> Figure out how timestamp reordering is supposed to be done, then the 
> right
> solution should become more obvious.
> Unless of course they went the cheap way and don't have any way to 
> figure out
> with input time stamp belongs to which output frame, in which case 
> that solution
> would become completely useless for any variable frame-rate 
> content...

 So, It wasn't clear to me that I could assume every call to decode gave 
 me a
 complete encoded frame (as opposed to just a bunch of bytes) but it 
 seems clear
 that is true. Unfortunately, the current crystalhd code will not return 
 timestamps
 on decoded frames. It's supposed to support passing in a pts to the 
 input function
 and having it come back on the output but this seems to only work with 
 70012 hardware.
 So yes, I am screwed in that regard.

> A too large value will work most of the time, but is likely to cause
> issues with timestamp wraps.

 Ok - well, I can now shrink this down to the real value but it may 
 still be large
 enough to cause problems.

> demuxers aren't concerned with what comes after. Take a look
> at mpi_no_picture in vd_ffmpeg.c, it's likely you have to do
> something like that (though actually that should only
> really apply to the interlaced case you mention below).
> Hm, I just saw you actually use that, but I have some doubts
> if it is correct to use it for the ReadyListCount case.
> And the behaviour of the native demuxers and lavf should behave
> the same if you use -nocorrect-pts.

 I actually thought I removed it from the ReadyListCount case - I will 
 get
 rid of that.

> The point would be to have code sharing with e.g. VLC, at least in
> the long run.
> There is not enough manpower available in the MPlayer project
> to maintain this, I guarantee you that the moment you don't actively
> look after it it will bit-rot and become unusable in no time,
> while in the process probably wasting a good bit of mine and other's
> time.

 Understood. Once I've got the other problems sorted out in the mplayer
 context, I will look at pushing it down - but we'll definitely need to
 change how vd_ffmpeg does QUERY_UNSEEN_FRAMES as part of that.

> That is not quite where the letter "c" goes in my alphabet...

 Will fix.

> Can they be convinced to fix their header instead?

 It would be nice. Apparently this was fallout from a change to avoid 
 that types
 header getting pulled into kernel driver builds. Jarod hasn't been 
 responsive to
 any of my emails so far, so we'll have to see.

> With regards to FFmpeg integration a LGPL version would be 
> preferable.

 No problem on my part.

> MPlayer already has a fast_memcpy function (two actually, depending 
> on
> whether the source/target is normal memory or a memory-mapped 
> device).
> And in difference to this variant using intrinsics it
> 1) will have predictable performance (yes, compilers still sometimes 
> compile
>    intrinsics in a way that negates any speed advantage)
> 2) it actually compiles on older compilers

 Ok, I'll use that.

> <A number of comments on SPS/PPS decoding>

 I took this function as-is from xbmc so I can't take credit for the 
 code style.
 I'll rework it to address your comments.

> Pointless, query_format already checks that the format is supported.
> Also, why only IMGFMT_YUY2? That wastes a huge amount of bandwidth
> (unless the input is actual 4:2:2 - if the is even supported).

 Cute isn't it? The 70015 only outputs YUY2 - even though it only 
 supports 420
 input. The 70012 can apparently output YV12/NV12. *sigh*. So I had the 
 QUERY_FORMAT
 handling there with an eye to supporting YV12 on the 70012 and then you 
 need the
 dynamic check, but all the other implementations seem to have given up 
 and use YUY2
 on both decoders. :-/

>> +        format.height = sh->disp_h;
>> +        if (format.height == 1088) {
>> +           format.height = 1080;
>> +        }
>
> That certainly isn't correct.

 I need to investigate this more - the logic came from xbmc. Certainly 
 if I feed
 it a stream where the raw metadata says 1088, it doesn't automatically 
 crop to 1080
 on output.

> Shouldn't you do some stuff like closing the device and freeing some 
> stuff
> in all those error cases?

 Yes, I should.

> I have quite some doubts about this.
> The obvious one is: Why is the API designed in a way that you can end 
> up with the
> two top_field == bottom_field cases and what are they supposed to 
> mean.
> The less obvious one is: what exactly does interlaced mean?
> There is PAFF coding, there is MBAFF coding and then interlaced can 
> be
> coded as progressive (even if it is a bad idea), what exactly does 
> this
> one indicate (I have a suspicion that it actually mean PAFF).

 I think that libcrystalhd is buggy with respect to setting these flags 
 - at least
 on 70015. But it's resulted in some very strange values in my tests. I 
 haven't had
 a chance to run through the samples in the ffmpeg collection to fully 
 establish what
 the behaviour is in each of the three cases. I've tested on mpeg2 
 interlaced and an
 h.264 stream where I'm not sure what the real encoding is - all I know 
 is I get full
 frames comprised of interlaced fields and some frames are marked 
 progressive and
 others are marked as top fields (and then I get three top fields in a 
 row...)

> I suspect that memcpy_pic is going to solve this easier.
> Still, a memcpy is not exactly fast at high resolutions.

 No, and it's probably a big part of why this thing requires quite a lot 
 of work to
 get it to decode 1080p in real time (sad).

 Thanks for all the comments. I will address them and post an updated 
 diff, probably
 this weekend. As I mentioned above, I'm not going to move it to ffmpeg 
 until I've got
 it working properly - it's easier to debug the mplayer interactions in 
 a native codec.

 Thanks!

 --phil