[Libav-user] unsynchronous for H.264+AAC:

Li Zhang lizhang at utelisys.com
Fri Mar 23 17:44:47 CET 2012


Hi everyone,

I used ffmpeg to transcode an MPEG-2 TS to H.264+AAC (in a TS container) and send the result to a multicast address. I found that the video and audio are out of sync: the audio lags the video by about 1 second. I have no idea what the problem is.
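
To put a number on the offset, the PTS of a video packet and of an audio packet that should play together can be compared directly. The sketch below is only an illustration: the helper names pts_diff_90k and skew_ms are made up for this example and are not part of libavformat. It assumes both timestamps are in the 90 kHz MPEG-TS time base (the "90k tbn" in the log below) and that PTS values wrap at 33 bits, as they do in MPEG-TS.

```c
#include <assert.h>
#include <stdint.h>

/* MPEG-TS PTS/DTS values are 33-bit counters running at 90 kHz. */
#define PTS_WRAP ((int64_t)1 << 33)

/* Hypothetical helper: signed difference a - b in 90 kHz ticks,
 * taking the 33-bit wrap-around into account. */
static int64_t pts_diff_90k(int64_t a, int64_t b)
{
    int64_t d;

    assert(a >= 0 && a < PTS_WRAP && b >= 0 && b < PTS_WRAP);
    d = a - b;                      /* lies in (-PTS_WRAP, PTS_WRAP) */
    if (d >= PTS_WRAP / 2)
        d -= PTS_WRAP;              /* b has already wrapped, a has not */
    else if (d < -PTS_WRAP / 2)
        d += PTS_WRAP;              /* a has already wrapped, b has not */
    return d;
}

/* Hypothetical helper: audio-minus-video offset in milliseconds.
 * A positive result means the audio is stamped later than the video. */
static int64_t skew_ms(int64_t audio_pts, int64_t video_pts)
{
    return pts_diff_90k(audio_pts, video_pts) / 90;  /* 90 ticks per ms */
}
```

With these definitions, skew_ms(90000, 0) is 1000, which is the size of offset described above. Whether the offset is visible in the packet timestamps at all, or only appears at playback, is not known from the information here.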

During transcoding, video decoding (avcodec_decode_video2) fails 7 times (meaning got_picture_ptr comes back as 0).

For encoding, video encoding (avcodec_encode_video) fails 50 times (meaning out_size is less than 0), so 50 frames are thrown away. These 50 frames include 3 I frames. The 50th decoded frame is a P frame, and it is the first frame for which encoding succeeds. After encoding, this frame is an I frame.

Audio encoding (avcodec_encode_audio2) fails 3 times (meaning got_packet is 0).

I do not know why these calls fail, and especially why the I frames were ignored.
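
One possible explanation for the calls that return nothing at start-up is encoder delay rather than lost frames. With options such as rc_lookahead=40, bframes=3 and threads=7 (see the x264 options line in the log below), an encoder holds back a number of input frames before it emits its first packet, and that first packet then belongs to the first frame that was submitted, not to the frame just passed in. The toy model below illustrates the idea in plain C. ToyEncoder, toy_init and toy_encode are invented for this sketch and do not call libavcodec; the delay of 50 used in the usage note is taken from the count reported above, not derived from x264.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_DELAY 64

/* Toy model of an encoder that buffers `delay` frames before output. */
typedef struct ToyEncoder {
    int     delay;                  /* frames held back before first output */
    int64_t queue[TOY_MAX_DELAY];   /* pts of buffered frames, oldest first */
    int     count;                  /* number of frames currently buffered  */
} ToyEncoder;

static void toy_init(ToyEncoder *e, int delay)
{
    assert(delay >= 0 && delay < TOY_MAX_DELAY);
    e->delay = delay;
    e->count = 0;
}

/* Submit one frame (has_frame = 1) or flush (has_frame = 0).
 * Returns 1 and stores the pts of the emitted packet in *out_pts,
 * or returns 0 when no packet is produced by this call. */
static int toy_encode(ToyEncoder *e, int has_frame, int64_t in_pts,
                      int64_t *out_pts)
{
    int i;

    if (has_frame) {
        e->queue[e->count++] = in_pts;  /* buffer the new frame */
        if (e->count <= e->delay)
            return 0;                   /* still filling the delay */
    } else if (e->count == 0) {
        return 0;                       /* flushing, nothing left */
    }
    *out_pts = e->queue[0];             /* emit the oldest buffered frame */
    for (i = 1; i < e->count; i++)
        e->queue[i - 1] = e->queue[i];
    e->count--;
    return 1;
}
```

In this model, with toy_init(&e, 50), the first 50 calls return 0 and the 51st returns the packet for frame 0, so no frame is skipped. The remaining 50 frames only come out when the encoder is flushed at end of stream; with avcodec_encode_video that would correspond to passing NULL as the picture until nothing more is returned.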

After these failures, the transcoding works normally except for the sync problem. Can anyone give me some suggestions? Could these failures have caused the loss of sync? If you need more information, please let me know.




Here is the information I got from ffmpeg during transcoding:
==============================================================================================================================
 Format mpegts probed with size=2048 and score=100
Unable to seek back to the start
stream=0 stream_type=2 pid=1b77 prog_reg_desc=
stream=1 stream_type=3 pid=1b78 prog_reg_desc=
stream=2 stream_type=6 pid=1b79 prog_reg_desc=
stream=3 stream_type=5 pid=1b7a prog_reg_desc=
parser not found for codec dvb_teletext, packets or times may be invalid.
mpeg_decode_postinit() failure
mpeg_decode_postinit() failure
mpeg_decode_postinit() failure
mpeg_decode_postinit() failure
mpeg_decode_postinit() failure
max_analyze_duration 5000000 reached at 5000000
Could not find codec parameters (Unknown: none ([5][0][0][0] / 0x0005))
Estimating duration from bitrate, this may be inaccurate
Input #0, mpegts, from 'udp://239.100.0.2:1234':
  Duration: N/A, start: 27086.974700, bitrate: 15160 kb/s
  Program 1103
    Metadata:
      service_name    :  3
      service_provider: Digitenne
    Stream #0:0[0x1b77], 130, 1/90000: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p, 704x576 [SAR 16:11 DAR 16:9], 1/50, 15000 kb/s, 25.60 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1b78], 208, 1/90000: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, 2 channels, s16, 160 kb/s
    Stream #0:2[0x1b79](dut), 133, 1/90000: Subtitle: dvb_teletext ([6][0][0][0] / 0x0006)
    Stream #0:3[0x1b7a], 0, 1/90000: Unknown: none ([5][0][0][0] / 0x0005)
detected 8 logical cores
Output #0, mpegts, to 'udp://217.117.234.134:1235':
    Stream #0:0, 0, 1/90000: Video: h264 (hq), yuv420p, 1280x720, 1/25, q=10-51, 1500 kb/s, 90k tbn, 25 tbc
    Stream #0:1, 0, 1/90000: Audio: aac (LC), 48000 Hz, 2 channels, s16, 96 kb/s
using mv_range_thread = 56
using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2 AVX
profile High, level 3.2
264 - core 122 r2183 c522ad1 - H.264/MPEG-4 AVC codec - Copyleft 2003-2012 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=umh subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=7 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=2 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=cbr mbtree=1 bitrate=1500 ratetol=1.0 qcomp=0.60 qpmin=10 qpmax=51 qpstep=4 vbv_maxrate=1500 vbv_bufsize=1500 nal_hrd=none ip_ratio=1.40 aq=1:1.00
muxrate VBR, pcr every 2 pkts, sdt every 200, pat/pmt every 40 pkts
using mv_range_thread = 56
===================================================================================================================
Here is the main code which I used:
===================================================================================================================
        AVFormatContext *oc;
        AVCodecContext *oVcc,*oAcc;
        AVStream *video_st,*audio_st;
        int out_size = 0;
        int audio_out_size = 0;
        uint8_t *video_outbuf = NULL;
        uint8_t *buftmp = NULL;

        int frameFinished = 0;
        int len = 0;
        uint64_t frame_index = (uint64_t)0;
        int is_valid_encode = 0;
        int scale_need = 0;
        int resample_need = 0;
        AVFrame *oVFrame;
        oVFrame = avcodec_alloc_frame();
        AVFrame *oAFrame;
        oAFrame = avcodec_alloc_frame();
        //used to record the output video stream index and audio stream index
        unsigned int out_video_index;
        unsigned int out_audio_index;

        //used to record the input video and audio stream index
        int videoindex;
        int audioindex;


while(av_read_frame(ic, &packet)>=0) {
                if(packet.stream_index == videoindex) {
                        avcodec_get_frame_defaults(oVFrame);
                        len = avcodec_decode_video2(vCodecCtx, oVFrame, &frameFinished, &packet);
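                        if(len<0) {
                                printf("Error while decoding\n");
                                av_free_packet(&packet); //free the packet before skipping it
                                continue;
                        }
                        else if(0 == frameFinished ) {
                                //no picture yet: the decoder needs more input before it can output a frame
                                printf("video: no frame decoded yet, frameFinished is 0\n");
                                av_free_packet(&packet);
                                continue;
                        }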
                    else if(frameFinished) {
                                //if it is needed, rescale the frame
                                AVFrame encode_frame;
                                AVFrame temp_frame;
                                if(scale_need && video_scale_context) {
                                        if(avpicture_alloc((AVPicture*)&temp_frame, oVcc->pix_fmt, oVcc->width, oVcc->height)) {
                                                printf("the temporary frame cannot be allocated\n");
                                                return 1;
                                        }
                                        //scale the video size for the encoder
                                        sws_scale(video_scale_context, oVFrame->data,oVFrame->linesize,
                                                     0, vCodecCtx->height,
                                                     temp_frame.data,
                                                     temp_frame.linesize);
                                        encode_frame = temp_frame;
                                        printf("the frame is scaled !!!\n" );
                                }
                                else {
                                        encode_frame = *oVFrame;
                                }
                                encode_frame.interlaced_frame = oVFrame->interlaced_frame;
                                encode_frame.pict_type =  AV_PICTURE_TYPE_NONE;
                                encode_frame.quality = oVFrame->quality;
                                encode_frame.pts = frame_index;

                                video_outbuf = (unsigned char *) av_malloc(video_outbuf_size);
                                memset(video_outbuf, 0, video_outbuf_size);

                                out_size = avcodec_encode_video(oVcc, video_outbuf, video_outbuf_size, &encode_frame);
                                //if (is_valid_encode)
                                if(out_size > 0) {
                                        //encapsulate the data into a packet and configure some parameters
                                        AVPacket output_video_packet;
                                        av_init_packet(&output_video_packet);
                                        output_video_packet.data = video_outbuf;
                                        output_video_packet.size = out_size;
                                        //configure the key frame
                                        if(oVcc->coded_frame && oVcc->coded_frame->key_frame)
                                                output_video_packet.flags |= AV_PKT_FLAG_KEY;
                                        output_video_packet.stream_index = video_st->index;
                                        //configure the pts and dts
                                        if(oVcc->coded_frame && oVcc->coded_frame->pts != AV_NOPTS_VALUE) {
                                                output_video_packet.pts = av_rescale_q(oVcc->coded_frame->pts, oVcc->time_base,video_st->time_base);
                                        }

                                        output_video_packet.dts = AV_NOPTS_VALUE;
                                        //output_video_packet.dts = 3600 * (frame_index - 25);
                                        int ret = write_packet(oc, &output_video_packet, oVcc, video_st);

                                        if(ret!=0) {
                                                printf("while write video frame error\n");
                                                continue;
                                        }
                                        if (oVcc->coded_frame->pict_type == AV_PICTURE_TYPE_B)
                                                printf("AV_PICTURE_TYPE_B\n");
                                        else if (oVcc->coded_frame->pict_type == AV_PICTURE_TYPE_I)
                                                printf("AV_PICTURE_TYPE_I\n");
                                        else if (oVcc->coded_frame->pict_type == AV_PICTURE_TYPE_P)
                                                printf("AV_PICTURE_TYPE_P\n");
                                        //pts and dts are int64_t, so cast them and print with %lld rather than %d
                                        printf("packet: pts is:%lld, dts is: %lld\n", (long long)packet.pts, (long long)packet.dts);
                                        printf("the pts is: %lld, and the dts is: %lld, oVcc->coded_frame->pts is: %lld\n", (long long)output_video_packet.pts, (long long)output_video_packet.dts, (long long)oVcc->coded_frame->pts);
                                        //printf("video: the decoded packet and has been encoded has the packet size is: %d\n", packet.size);
                                        printf("video packet is written and frame_index is %llu\n", (unsigned long long)frame_index);
                                        //printf("  video: the packet size is: %d\n", output_video_packet.size);
                                }
                                else {
                                        printf("video: encoding is failed, is_valid_encode is: %d\n", is_valid_encode);
                                        printf("video: encoding is failed, out_size is: %d\n", out_size);
                                }
                                frame_index++;
                                av_free(video_outbuf);

                                //temp_frame is only allocated when the frame was scaled
                                if(scale_need && video_scale_context)
                                        avpicture_free((AVPicture*)&temp_frame);
                        }
                        else {
                                printf("video:frameFinished is %d\n", frameFinished);
                        }

                }

                else if(packet.stream_index==audioindex) {
                        int ret=0;
                        int is_audio_valid = 0;
                        avcodec_get_frame_defaults(oAFrame);

                        ret=avcodec_decode_audio4(aCodecCtx,oAFrame,&is_audio_valid, &packet);

                        if(ret<0 || is_audio_valid < 0) {
                                printf("audio: while decode audio failure\n");
                                continue;
                        }
                        else if(0 == is_audio_valid) {
                                // Audio was not decoded
                                printf("audio: Error decoding audio frame result_bytes=0 bytes_decoded= %d\n", ret);
                        }

                        else {
                                //buffer the audio data in a FIFO in case the input codec context->frame_size is not equal to the output codec context->frame_size
                                uint8_t *buf = oAFrame->data[0];
                                //decoded audio frame size
                                int tmp_size = aCodecCtx->frame_size * aCodecCtx->channels * av_get_bytes_per_sample(aCodecCtx->sample_fmt);

                                //encoding audio frame size
                                int frame_bytes = oAcc->frame_size * oAcc->channels * av_get_bytes_per_sample(oAcc->sample_fmt);
                                uint8_t *buftmp = NULL;
                                buftmp = (uint8_t*) av_realloc(buftmp, frame_bytes);

                                // write into the fifo buffer
                                av_fifo_generic_write(fifo_buf, buf, tmp_size, NULL);

                                while(av_fifo_size(fifo_buf)>=frame_bytes) { // loop if there are enough audio data
                                        av_fifo_generic_read(fifo_buf, buftmp, frame_bytes, NULL);
                                        //initialize encode_audio_frame
                                        AVFrame *encode_audio_frame = NULL;
                                        encode_audio_frame = avcodec_alloc_frame();
                                        if(!encode_audio_frame) {
                                                printf("Can not allocate encode_audio_frame.\n");
                                                continue;
                                        }

                                        avcodec_get_frame_defaults(encode_audio_frame);
                                        //configure the filling length for encode_audio_frame
                                        encode_audio_frame->nb_samples = 1024;
                                        //fill the data in tmp into the encode_audio_frame
                                        if(avcodec_fill_audio_frame(encode_audio_frame,
                                                                                                oAcc->channels,
                                                                                                oAcc->sample_fmt,
                                                                                                buftmp, frame_bytes, 1) < 0) {
                                                printf("audio: encoding_audio_frame is failed.\n");
                                                continue;
                                        }
                                        AVPacket pkt;
                                        av_init_packet(&pkt);
                                        audio_outbuf = (uint8_t*)av_malloc(audio_outbuf_size);
                                        memset(audio_outbuf, 0, audio_outbuf_size);
                                        pkt.size = audio_outbuf_size;
                                        pkt.data = audio_outbuf;
                                        encode_audio_frame->pts = AV_NOPTS_VALUE;
                                        audio_out_size= avcodec_encode_audio2(oAcc, &pkt, encode_audio_frame ,&is_audio_valid);
                                        if(0 > audio_out_size){
                                                printf("audio: the encoding is error \n");
                                                continue;
                                        }
                                        else if(0 == is_audio_valid) {
                                                printf("audio: Error encoding audio, the is_audio_valid is 0\n");
                                                continue;
                                        }
                                        else if(is_audio_valid) {
                                                if(oAcc->coded_frame && (oAcc->coded_frame->pts != AV_NOPTS_VALUE)) {
                                                        //pkt.pts = av_rescale_q(oAcc->coded_frame->pts, oAcc->time_base, audio_st->time_base);

                                                }
                                                pkt.flags |= AV_PKT_FLAG_KEY;
                                                pkt.stream_index= audio_st->index;
                                                pkt.dts = AV_NOPTS_VALUE;
                                                #if 1
                                                if (write_packet(oc, &pkt,oVcc,audio_st) != 0) {
                                                        fprintf(stderr, "audio: Error while writing audio frame\n");
                                                        continue;
                                                }
                                                #endif
                                                printf("audio: audio packet is written.\n");
                                        }

                                        av_free(pkt.data);
                                        pkt.size = 0;
                                        audio_outbuf = NULL;
                                        av_free(encode_audio_frame);
                                }// the end of while
                                av_free(buftmp);
                                buftmp = NULL;
                        }
                }
                av_free_packet(&packet);
        }

        if(0 != av_write_trailer(oc)) {
                printf("the trailer can not be written!");
        }
=============================================================================================================================================
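
For reference, the tick arithmetic behind the timestamps in the code above can be checked in isolation. The sketch below assumes the time bases shown in the log (1/25 for the video encoder, 48000 Hz audio, 1/90000 for the MPEG-TS streams) and an AAC frame of 1024 samples, as hard-coded in the audio loop. rescale_ticks is a simplified stand-in written for this example, not av_rescale_q itself: it rounds to nearest and has no overflow protection. Rational, video_pts_90k and audio_pts_90k are likewise hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal rational type for this sketch (not AVRational). */
typedef struct Rational {
    int num;
    int den;
} Rational;

/* Simplified stand-in for av_rescale_q(): convert v from time base
 * `from` to time base `to`, rounding to nearest. No overflow checks. */
static int64_t rescale_ticks(int64_t v, Rational from, Rational to)
{
    int64_t num = (int64_t)from.num * to.den;
    int64_t den = (int64_t)from.den * to.num;

    assert(v >= 0 && num > 0 && den > 0);
    return (v * num + den / 2) / den;
}

/* pts of video frame number frame_index, in 90 kHz ticks (25 fps assumed). */
static int64_t video_pts_90k(int64_t frame_index)
{
    Rational enc = {1, 25};
    Rational mux = {1, 90000};

    return rescale_ticks(frame_index, enc, mux);
}

/* pts of the audio frame that starts after samples_sent samples per
 * channel have been encoded, in 90 kHz ticks (48000 Hz assumed). */
static int64_t audio_pts_90k(int64_t samples_sent)
{
    Rational enc = {1, 48000};
    Rational mux = {1, 90000};

    return rescale_ticks(samples_sent, enc, mux);
}
```

One video frame comes out as 3600 ticks, which matches the constant in the commented-out dts line, and one 1024-sample AAC frame as 1920 ticks. Stamping audio packets from a running sample counter in this way is one option for putting both streams on the same 90 kHz clock; whether write_packet (not shown) already does something similar is unknown.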

Best regards,

Li Zhang

