[FFmpeg-devel] PAFF support h264 - preliminary patch as notes
Neil Brown
neilb
Tue Jul 17 01:24:19 CEST 2007
Don't get too excited....
As I have a camera (Sony HDR-SR1) that produces AVCHD files with PAFF
interlacing, I thought I'd see what is involved in getting ffmpeg to
work with them. I learnt a lot in the process.
There is quite a bit of code in h264.c to support some of PAFF
already.
I'm not sure if I have all the terminology right here, but:
A stream contains a collection of slices, which are contiguous
regions of macro-blocks (16x16 pixels).
A number of slices comprise a picture, but usually there is just
one slice per picture.
A picture is either a field or a frame, which is two fields of
opposite parity encoded together. Fields are either TOP (even
numbered lines) or BOTTOM (odd numbered lines).
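To make the parity rule concrete, here is a minimal sketch of how a
line inside a field maps to a line of the full frame. The names
(field_parity, field_line_to_frame_line) are made up for illustration
and are not ffmpeg identifiers.

```c
#include <assert.h>

/* Hypothetical names for illustration only; not taken from ffmpeg. */
enum field_parity { FIELD_TOP = 0, FIELD_BOTTOM = 1 };

/* Return the frame line occupied by line `field_line` of a field.
 * TOP fields hold the even frame lines, BOTTOM fields the odd ones. */
static int field_line_to_frame_line(enum field_parity parity, int field_line)
{
    assert(field_line >= 0);
    return 2 * field_line + (int)parity;
}
```

So line 3 of a TOP field is frame line 6, and line 3 of a BOTTOM
field is frame line 7.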
So what we need to handle PAFF is to be able to correctly handle the
TOP and BOTTOM FIELDS - the FRAMES are largely handled properly
already.
I have a patch, included below, which fixes a number of errors in
mapping macro-block numbers to picture line numbers, and other related
code. With this, the first field gets decoded properly.
However it is only one field, half the scan-lines of a full frame. So
it looks a bit washed out. But it is progress.
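As a standalone illustration of the mapping the patch applies when a
slice starts, the sketch below turns first_mb_in_slice into a
macro-block position and then spreads field rows over the full frame
height. It ignores the MBAFF shift, uses made-up type and function
names, and only mirrors what this patch does, which may well not be
the right approach.

```c
#include <assert.h>

/* Hypothetical stand-ins for the decoder's picture_structure values. */
enum pict_structure { STRUCT_TOP_FIELD, STRUCT_BOTTOM_FIELD, STRUCT_FRAME };

struct mb_pos { int x, y; };

/* Compute the starting macro-block column and row of a slice.
 * For field pictures the row is doubled (plus one for the bottom
 * field), so field rows are interleaved across the frame buffer. */
static struct mb_pos slice_start_pos(int first_mb_in_slice, int mb_width,
                                     enum pict_structure ps)
{
    struct mb_pos p;

    assert(mb_width > 0 && first_mb_in_slice >= 0);
    p.x = first_mb_in_slice % mb_width;
    p.y = first_mb_in_slice / mb_width;
    if (ps == STRUCT_TOP_FIELD)
        p.y = 2 * p.y;
    else if (ps == STRUCT_BOTTOM_FIELD)
        p.y = 2 * p.y + 1;
    return p;
}
```

With a picture 120 macro-blocks wide, macro-block 250 sits at column
10, row 2 of its field, which this mapping places at row 4 for a top
field and row 5 for a bottom field.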
The next steps, as I see them, are:
- Get the second field to decode properly.
In my files, it is a P field rather than an I field.
i.e. while the first is coded with 'intra-coding', without
reference to any other frame/field, the second is coded
with predictive 'inter-coding', with reference to the
preceding field. This brings up some issues.
Presumably it is encoded based on the first field, which
is of the opposite parity. That means that every pixel is
offset by one line. I don't know how to handle that.
There is some code missing from fill_default_ref_list that
I think is important. It reads:
if(s->picture_structure == PICT_FRAME){
.... do useful stuff
}else{ //FIELD
if(h->slice_type==B_TYPE){
}else{
//FIXME second field balh
}
}
I think this needs to be fleshed out so that decoding
the P field has the right reference frames available.
I also have a suspicion that fill_caches needs some extra handling
to get left_block set up correctly in the PICT_FIELD case, but as yet I
have no idea what 'left_block' is for, so I cannot be sure.
- Combine fields into frames. I really don't know what the desired
result is here. The decoding process will produce a series of
interlace fields, first a top field (lines 0,2,4,6,...) then a
bottom field (lines 1,3,5,7,...). How should these be presented
to the application?
One option is to combine them into a single frame. However this
loses information as the fields should be separated by 1/50th of a
second (for the PAL case).
The other option would be to pass them back as individual fields
with half the expected number of lines and tell the application
that they are fields to be interlaced together. However I have no
idea how to do that or if it is even possible.
If the first option is best, then we need to hold on to the first
field until the second field is ready. Then merge them together.
If the second option is best (which I suspect to be the case), then
we need to decode fields densely (not leaving blank lines between
content lines, only using the top half of the buffer), which means
that my following patch is completely wrong as it gets the
decoding to use the full height, but only half the lines.
It would also mean that if we find a FRAME picture while expecting
interlacing, we need to split it into the two component fields to
return it to the application. I think this would be a very
substantial code change. It might make the code cleaner though.
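For the first option (hold the first field, then merge), a minimal
sketch of the merge step might look like the following. It assumes
each field has been decoded densely into its own single-plane 8-bit
buffer; weave_fields and its parameters are hypothetical and do not
correspond to any existing ffmpeg function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Interleave two densely stored fields into one frame buffer.
 * Top field line y goes to frame line 2*y, bottom field line y to
 * frame line 2*y + 1. All strides and widths are in bytes. */
static void weave_fields(unsigned char *frame, size_t frame_stride,
                         const unsigned char *top,
                         const unsigned char *bottom,
                         size_t field_stride, size_t width,
                         size_t field_height)
{
    size_t y;

    assert(frame_stride >= width && field_stride >= width);
    for (y = 0; y < field_height; y++) {
        memcpy(frame + (2 * y) * frame_stride,
               top + y * field_stride, width);
        memcpy(frame + (2 * y + 1) * frame_stride,
               bottom + y * field_stride, width);
    }
}
```

Splitting a FRAME picture into two fields for the second option would
be the same loop with source and destination swapped. Note that
weaving alone does nothing about the 1/50th of a second between the
two fields.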
I don't know if/when I might find time to work on this again so I am
doing this brain dump now in case it might help someone else. I would
really like input on the question of how to return interlaced video
fields before I even consider hacking on the code any more.
This patch contains a hack to mpegvideo.c so that decoding h264 with
PAFF doesn't crash copying data from NULL, which it probably does
because the reference frames aren't being set up properly.
With this patch, I can 'ffplay' a .mts file and the first frame looks
recognisable, though a bit washed out. Following frames degrade quite
quickly. It still crashes on exit with a bad 'free'.
As I suggest above, it is entirely possible that this patch is
completely wrong as it tries to make a field look like a frame, and we
possibly shouldn't be doing that. It was a useful learning experience
though.
Thanks for your time,
NeilBrown
Index: libavcodec/h264.c
===================================================================
--- libavcodec/h264.c (revision 9692)
+++ libavcodec/h264.c (working copy)
@@ -172,6 +172,7 @@
//wow what a mess, why didn't they simplify the interlacing&intra stuff, i can't imagine that these complex rules are worth it
top_xy = mb_xy - s->mb_stride;
+ if (PICT_FIELD) top_xy -= s->mb_stride;
topleft_xy = top_xy - 1;
topright_xy= top_xy + 1;
left_xy[1] = left_xy[0] = mb_xy-1;
@@ -247,6 +248,9 @@
left_block[7]= 10;
}
}
+ }
+ else if (PICT_FIELD) {
+ /* MBAFF-FIXME Do we need different values for 'left_block' here? */
}
h->top_mb_xy = top_xy;
@@ -4348,6 +4352,11 @@
}
s->resync_mb_x = s->mb_x = first_mb_in_slice % s->mb_width;
s->resync_mb_y = s->mb_y = (first_mb_in_slice / s->mb_width) << h->mb_aff_frame;
+ if (s->picture_structure == PICT_BOTTOM_FIELD)
+ s->resync_mb_y = s->mb_y = s->mb_y *2 + 1;
+ else if (s->picture_structure == PICT_TOP_FIELD)
+ s->resync_mb_y = s->mb_y = s->mb_y *2;
+
assert(s->mb_y < s->mb_height);
if(s->picture_structure==PICT_FRAME){
@@ -4355,6 +4364,8 @@
h->max_pic_num= 1<< h->sps.log2_max_frame_num;
}else{
h->curr_pic_num= 2*h->frame_num;
+ if (s->picture_structure == PICT_BOTTOM_FIELD)
+ h->curr_pic_num++;
h->max_pic_num= 1<<(h->sps.log2_max_frame_num + 1);
}
@@ -4390,7 +4401,7 @@
if(h->slice_type == P_TYPE || h->slice_type == SP_TYPE || h->slice_type == B_TYPE){
if(h->slice_type == B_TYPE){
h->direct_spatial_mv_pred= get_bits1(&s->gb);
- if(h->sps.mb_aff && h->direct_spatial_mv_pred)
+ if(h->mb_aff_frame && h->direct_spatial_mv_pred)
av_log(h->s.avctx, AV_LOG_ERROR, "MBAFF + spatial direct mode is not implemented\n");
}
num_ref_idx_active_override_flag= get_bits1(&s->gb);
@@ -5390,6 +5401,8 @@
int mb_xy = mb_x + mb_y*s->mb_stride;
mba_xy = mb_xy - 1;
mbb_xy = mb_xy - s->mb_stride;
+ if (PICT_FIELD)
+ mbb_xy -= s->mb_stride;
}
if( h->slice_table[mba_xy] == h->slice_num && !IS_SKIP( s->current_picture.mb_type[mba_xy] ))
@@ -5522,16 +5535,9 @@
return 1 + get_cabac_noinline( &h->cabac, &h->cabac_state[77 + ctx] );
}
static int decode_cabac_mb_dqp( H264Context *h) {
- MpegEncContext * const s = &h->s;
- int mbn_xy;
int ctx = 0;
int val = 0;
- if( s->mb_x > 0 )
- mbn_xy = s->mb_x + s->mb_y*s->mb_stride - 1;
- else
- mbn_xy = s->mb_width - 1 + (s->mb_y-1)*s->mb_stride;
-
if( h->last_qscale_diff != 0 )
ctx++;
@@ -5885,6 +5891,8 @@
if (left_mb_frame_flag != curr_mb_frame_flag) {
h->left_mb_xy[0] = pair_xy - 1;
}
+ } else if (s->picture_structure == PICT_BOTTOM_FIELD) {
+ h->top_mb_xy -= s->mb_stride;
}
return;
}
@@ -7117,7 +7125,7 @@
s->mb_x = 0;
ff_draw_horiz_band(s, 16*s->mb_y, 16);
++s->mb_y;
- if(FRAME_MBAFF) {
+ if(FRAME_MBAFF || PICT_FIELD) {
++s->mb_y;
}
}
@@ -7154,7 +7162,7 @@
s->mb_x=0;
ff_draw_horiz_band(s, 16*s->mb_y, 16);
++s->mb_y;
- if(FRAME_MBAFF) {
+ if(FRAME_MBAFF || PICT_FIELD) {
++s->mb_y;
}
if(s->mb_y >= s->mb_height){
Index: libavcodec/h264.h
===================================================================
--- libavcodec/h264.h (revision 9692)
+++ libavcodec/h264.h (working copy)
@@ -58,10 +58,12 @@
#define MB_MBAFF h->mb_mbaff
#define MB_FIELD h->mb_field_decoding_flag
#define FRAME_MBAFF h->mb_aff_frame
+#define PICT_FIELD (h->s.picture_structure != PICT_FRAME)
#else
#define MB_MBAFF 0
#define MB_FIELD 0
#define FRAME_MBAFF 0
+#define PICT_FIELD 0
#undef IS_INTERLACED
#define IS_INTERLACED(mb_type) 0
#endif
Index: libavcodec/mpegvideo.c
===================================================================
--- libavcodec/mpegvideo.c (revision 9692)
+++ libavcodec/mpegvideo.c (working copy)
@@ -1925,6 +1925,7 @@
if (!s->mb_intra) {
/* motion handling */
/* decoding or more than one mb_type (MC was already done otherwise) */
+ if (s->last_picture.data[0]){ /* Hack to stop h264/PAFF from crashing */
if(!s->encoding){
if(lowres_flag){
h264_chroma_mc_func *op_pix = s->dsp.put_h264_chroma_pixels_tab;
@@ -1953,7 +1954,7 @@
}
}
}
-
+}
/* skip dequant / idct if we are really late ;) */
if(s->hurry_up>1) goto skip_idct;
if(s->avctx->skip_idct){