[FFmpeg-devel] [PATCH v11 1/3] libavcodec/dnxucdec: DNxUncompressed decoder

Fri Oct 11 17:12:54 EEST 2024

On 11.10.24 10:57, Anton Khirnov wrote:
> It seems rather obvious to me - you're making a demuxer export something
> that IS raw video, yet you're tagging it as a new codec ID with a new
> decoder that (for these specific pixel formats) duplicates what the
> rawvideo decoder does.
> 
> Just to be clear, I'm not saying the entirety of your decoder should be
> handled in the demuxer - just those pixel formats that can be. That
> would also have the advantage that the remainder of your decoder could
> now enable direct rendering.

**Wherever it is possible**, I simply set the correct pixel format found
by the dnxuc_parser in the codec context (avctx) and use a simple
pass_through() routine to transfer the raw packet data as frame data
without any further processing!

That's more or less the same mechanism as used in 
libavcodec/bitpacked_dec.c for uyvy422 data.
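
For illustration, a pass-through step of this kind can be sketched in
plain C without any FFmpeg headers. The name copy_rows() and its
parameters are hypothetical and not taken from the patch or from
bitpacked_dec.c; the sketch only assumes tightly packed rows in the
packet and a destination buffer whose rows may be padded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a pass-through step (not the patch's code):
 * rows that are tightly packed in the packet are copied unchanged into
 * a frame buffer whose rows may carry padding (dst_stride >= row_bytes).
 * Returns 0 on success, -1 if the sizes do not fit. */
static int copy_rows(uint8_t *dst, ptrdiff_t dst_stride,
                     const uint8_t *src, size_t src_size,
                     size_t row_bytes, int height)
{
    assert(dst && src);

    if (height < 0 || dst_stride < 0 || (size_t)dst_stride < row_bytes)
        return -1;
    if (src_size < row_bytes * (size_t)height)
        return -1;              /* packet too short for the claimed size */

    for (int y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, src + (size_t)y * row_bytes, row_bytes);

    return 0;
}
```

No sample is touched or reordered here, which is why this path is only
an option when the payload layout already matches an existing pixel
format exactly.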

But if you look at the central dispatcher in my decoder, you'll see
that this pass_through solution can only be used in 2 of the 8 kinds
of payload supported so far. For all other types of data there is
either no exactly fitting FFmpeg pixel format available, or it doesn't
work satisfactorily in practice.

In the case of all 10- and 12-bit variants, DNxUncompressed uses a very
uncommon kind of line-based bitpacking, which you will hardly find
anywhere else. This has to be preprocessed and translated into more
common binary arrangements by this decoder.

In the case of the float variants, however, I had to fight missing
support and disappointing defects in swscale.

>>>> +static int float2planes(AVCodecContext *avctx, AVFrame *frame, const AVPacket *pkt)
>>>> +{
>>>> +    int lw;
>>>> +    const size_t sof = 4;
>>>> +
>>>> +    lw = frame->width;
>>>> +
>>>> +    for(int y = 0; y < frame->height; y++){
>>>> +        for(int x = 0; x < frame->width; x++){
>>>> +            memcpy(&frame->data[2][sof*(lw*y + x)], &pkt->data[sof* 3*(lw*y + x)], sof);
>>>> +            memcpy(&frame->data[0][sof*(lw*y + x)], &pkt->data[sof*(3*(lw*y + x) + 1)], sof);
>>>> +            memcpy(&frame->data[1][sof*(lw*y + x)], &pkt->data[sof*(3*(lw*y + x) + 2)], sof);
>>>> +        }
>>>> +    }
>>>
>>> Same here, deinterleaving packed to planar is a job for swscale.

Do you really think I'm a stupid idiot and didn't try to handle it in
this much simpler, unmodified manner first?

Sure, I did!!
But I immediately stumbled over yet another nasty swscale-related bug! :(

Just try it yourself:

First make a simple test with my last posted patch set:

   ./ffmpeg_g -i ./fate-suite/dnxuc/cb_rgb_float.mxf -c:v prores /tmp/out.mov

Everything should simply work as expected.

And now change line 348 of dnxucdec.c for float32 sub-format:

          case MKTAG(' ','r','g','f'):
-            ret = fmt_frame(avctx, frame, avpkt, AV_PIX_FMT_GBRPF32LE, 96, float2planes);
+            ret = fmt_frame(avctx, frame, avpkt, AV_PIX_FMT_RGBF32LE, 96, pass_through);

Now it uses the pass_through() mechanism and the unmodified packed
float input data instead of this clumsy rearrangement, but the same
simple test command as before will not work anymore!

You'll get an error report saying that ffmpeg couldn't figure out any
working pipeline or pixel format transformation to connect the input
data with anything else...

[vist#0:0/dnxuc @ 0x55e980fd7a00] [dec:dnxuc @ 0x55e981460f40] Decoder thread received EOF packet
[graph -1 input from stream 0:0 @ 0x7fb644002380] w:512 h:256 pixfmt:rgbf32le tb:1/24 fr:24/1 sar:1/1 csp:bt709 range:unknown
[format @ 0x7fb6440029c0] Setting 'pix_fmts' to value 'yuv422p10le|yuv444p10le|yuva444p10le'
[format @ 0x7fb6440029c0] Setting 'color_ranges' to value 'tv'
[auto_scale_0 @ 0x7fb644004340] w:iw h:ih flags:'' interl:0
[format @ 0x7fb6440029c0] auto-inserting filter 'auto_scale_0' between the filter 'Parsed_null_0' and the filter 'format'
Impossible to convert between the formats supported by the filter 'Parsed_null_0' and the filter 'auto_scale_0'

That's why I had to add this crazy workaround in this particular case!
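
The rearrangement itself can be shown as a small self-contained C
sketch. It is not the float2planes() from the patch: the name
rgbf32_to_gbrpf32() and the explicit byte strides are assumptions added
for illustration, so that padded rows are handled too. The plane order
G, B, R is the one used by the planar GBR pixel formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: split packed R,G,B
 * float32 samples into three planes ordered G, B, R. All strides are
 * given in bytes, so rows with padding are handled as well. */
static void rgbf32_to_gbrpf32(uint8_t *const dst[3],
                              const ptrdiff_t dst_stride[3],
                              const uint8_t *src, ptrdiff_t src_stride,
                              int width, int height)
{
    /* packed component index -> destination plane: R->2, G->0, B->1 */
    static const int plane_of[3] = { 2, 0, 1 };
    const size_t sof = sizeof(float);

    assert(width >= 0 && height >= 0);

    for (int y = 0; y < height; y++) {
        const uint8_t *in = src + y * src_stride;
        for (int c = 0; c < 3; c++) {
            const int p  = plane_of[c];
            uint8_t *out = dst[p] + y * dst_stride[p];
            /* memcpy avoids alignment assumptions on the packet data */
            for (int x = 0; x < width; x++)
                memcpy(out + sof * x, in + sof * (3 * x + c), sof);
        }
    }
}
```

Working with byte strides rather than the frame width keeps the sketch
valid when the destination rows are wider than width * 4 bytes.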

But I already told you about all these really disappointing experiences:

>> Some of these rather inefficient interleave<->planar conversions were
>> just necessary in practice, because swscale wasn't able to figure out a
>> working pipeline construction otherwise, although in theory the affected
>> pixel format closer to the data input should be supported and work as
>> well!:(
>>
>> In the end I just looked for something that simply works in the real world, too!
> 
> But we do have a packed RGBF32 pixel format, how is it different from
> this?

Martin