[Libav-user] Extracting DCT coefficients from H264 videos

Wed Jan 3 04:35:18 EET 2024

Hi all,

I've been trying to find a way to extract dequantized DCT coefficients, and
here's what I've come up with so far:

static void print_macroblocks(const H264Context *h, H264SliceContext *sl) {
    int frame_width = h->width;
    int frame_height = h->height;
    int mb_width = h->mb_width;   // Number of macroblocks horizontally
    int mb_height = h->mb_height; // Number of macroblocks vertically
    int pixel_shift = h->pixel_shift; // 0 for 8-bit, 1 for higher bit depth
    int block_width;
    int block_height;

    printf("frame_width: %d, frame_height: %d, mb_width: %d, mb_height: %d\n",
           frame_width, frame_height, mb_width, mb_height);

    for (int channel = 0; channel < 3; channel++) {
        printf("Channel");
        if (channel == 0) {
            printf(" Y:\n");
        } else if (channel == 1) {
            printf(" Cb:\n");
        } else {
            printf(" Cr:\n");
        }

        for (int mb_y = 0; mb_y < mb_height; mb_y++) {
            for (int mb_x = 0; mb_x < mb_width; mb_x++) {
                printf("Macroblock [%d, %d]:\n", mb_y, mb_x);

                // Adjust block dimensions based on chroma format
                block_width = (channel == 0) ? 16
                            : 16 >> (CHROMA422(h) ? 1 : CHROMA444(h) ? 0 : 1);
                block_height = (channel == 0) ? 16
                             : 16 >> (CHROMA422(h) ? 0 : CHROMA444(h) ? 0 : 1);

                for (int sub_y = 0; sub_y < block_height; sub_y++) {
                    for (int sub_x = 0; sub_x < block_width; sub_x++) {
                        int index = ((mb_y * block_height + sub_y) * frame_width
                                     + (mb_x * block_width + sub_x)) << pixel_shift;
                        if (channel == 0 && sl->intra16x16_pred_mode == 1
                            && sub_x < 4 && sub_y < 4) {
                            printf("%d ", sl->mb_luma_dc[0][sub_y * 4 + sub_x]);
                        } else {
                            printf("%d ", sl->mb[index]);
                        }
                    }
                    printf("\n");
                }
            }
        }
    }
}
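
To make the block-dimension logic easier to check in isolation, here is a minimal standalone sketch of the same mapping. It assumes, as I understand the decoder macros, that CHROMA422(h) and CHROMA444(h) correspond to chroma_format_idc values 2 and 3. BlockSize and plane_block_size are hypothetical names of mine, not FFmpeg API.

```c
#include <assert.h>

/* chroma_format_idc as defined by the H.264 spec:
 * 0 = monochrome, 1 = 4:2:0, 2 = 4:2:2, 3 = 4:4:4. */
typedef struct BlockSize {
    int width;
    int height;
} BlockSize;

/* Hypothetical helper (not part of FFmpeg): size in samples of one
 * macroblock's worth of data in the given plane (0 = Y, 1 = Cb, 2 = Cr). */
static BlockSize plane_block_size(int plane, int chroma_format_idc)
{
    BlockSize bs = { 16, 16 };      /* luma is always 16x16 */

    assert(plane >= 0 && plane < 3);
    assert(chroma_format_idc >= 0 && chroma_format_idc <= 3);

    if (plane == 0)
        return bs;

    switch (chroma_format_idc) {
    case 0: bs.width = 0; bs.height = 0;  break; /* monochrome: no chroma */
    case 1: bs.width = 8; bs.height = 8;  break; /* 4:2:0: halved both ways */
    case 2: bs.width = 8; bs.height = 16; break; /* 4:2:2: halved horizontally */
    case 3:                               break; /* 4:4:4: same as luma */
    }
    return bs;
}
```

For 4:2:0 content, plane_block_size(1, 1) gives an 8x8 block, which matches what the shift expressions in the function above produce.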

I'm calling this function at the end of ff_h264_decode_mb_cabac and
ff_h264_decode_mb_cavlc. Could anyone tell me if I'm on the right track? If
so, I'm thinking of returning this as another type of side data a la
AV_FRAME_DATA_MOTION_VECTORS. Would that be the right direction? Thank you
so much for your help in advance!
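
To make the side-data idea concrete, here is a rough sketch of the kind of per-macroblock record I have in mind, loosely modelled on how AVMotionVector entries are packed into one flat array. Everything in it is an assumption for illustration: MBCoeffRecord, coeff_payload_size and store_block are hypothetical names, no such side-data type exists in libavutil, and I'm assuming coefficients are widened to 32 bits so one layout covers all bit depths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: one plane of one macroblock. */
typedef struct MBCoeffRecord {
    uint16_t mb_x;            /* macroblock column */
    uint16_t mb_y;            /* macroblock row */
    uint8_t  plane;           /* 0 = Y, 1 = Cb, 2 = Cr */
    uint8_t  width;           /* valid coefficients per row */
    uint8_t  height;          /* valid rows */
    uint8_t  reserved;        /* padding, keeps coeffs 4-byte aligned */
    int32_t  coeffs[16 * 16]; /* row-major, only width*height entries used */
} MBCoeffRecord;

/* Bytes needed for a whole frame: three records per macroblock. */
static size_t coeff_payload_size(int mb_width, int mb_height)
{
    return (size_t)mb_width * mb_height * 3 * sizeof(MBCoeffRecord);
}

/* Copy one block of coefficients into its slot in the flat array.
 * Records are ordered by macroblock (raster order), then by plane. */
static void store_block(MBCoeffRecord *records, int mb_width,
                        int mb_x, int mb_y, int plane,
                        int width, int height, const int32_t *src)
{
    MBCoeffRecord *r = &records[((size_t)mb_y * mb_width + mb_x) * 3 + plane];

    assert(plane >= 0 && plane < 3);
    assert(width >= 0 && width <= 16 && height >= 0 && height <= 16);

    r->mb_x     = (uint16_t)mb_x;
    r->mb_y     = (uint16_t)mb_y;
    r->plane    = (uint8_t)plane;
    r->width    = (uint8_t)width;
    r->height   = (uint8_t)height;
    r->reserved = 0;
    memcpy(r->coeffs, src, (size_t)width * height * sizeof(*src));
}
```

In the decoder, the buffer would presumably be obtained with av_frame_new_side_data() under a new AVFrameSideDataType value, sized with something like coeff_payload_size(h->mb_width, h->mb_height), and filled one macroblock at a time from the decode functions.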

Cheers,
Peter