[FFmpeg-user] Resampling of audio not working correctly

Julian Gardner joolzg at gardnersweden.com
Mon Dec 21 11:41:09 EET 2020


Problem:

I have a video with these parameters:
Input #0, mpegts, from 'videoFiles/Ashbury Heights - Spiders.ts':
   Duration: 00:04:03.63, start: 1.368000, bitrate: 1106 kb/s
   Program 1
     Metadata:
       service_name    : Service01
       service_provider: FFmpeg
     Stream #0:0[0x100]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(progressive), 720x576 [SAR 64:45 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc
     Stream #0:1[0x101]: Audio: aac (LC) ([15][0][0][0] / 0x000F), 32000 Hz, stereo, fltp, 108 kb/s

Based on the transcode_aac.c example, I have initialised the resampler, 
and this is what it is being set to:
init_resampler Input:3 fltp 32000, Output:3 fltp 44100

So we have 32000 Hz in and 44100 Hz out.

Now, when I run my code, the video ends at 176 seconds, which is 
243*(32000/44100), so for some reason my write_audio_frame is not 
getting the right pts values.
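
As a rough check of that figure, here is a minimal sketch (plain C, no 
FFmpeg calls; the function name is made up for illustration) of what 
happens if each input sample is written out as exactly one output sample 
and then played back at the output rate. That is only a guess at the 
failure mode, but it does land on the same 176 seconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Duration in milliseconds of the output when every input sample is
 * emitted as exactly one output sample but played at out_rate.
 * This models the suspected fault, not correct resampling.
 */
int64_t duration_ms_if_one_to_one(int64_t in_duration_ms, int in_rate, int out_rate)
{
    assert(in_rate > 0 && out_rate > 0);
    /* number of samples read from the input */
    int64_t in_samples = in_duration_ms * in_rate / 1000;
    /* the same number of samples, played back at the output rate */
    return in_samples * 1000 / out_rate;
}
```

Calling duration_ms_if_one_to_one(243000, 32000, 44100) gives a value 
just over 176000 ms.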

Looking into the transcode_aac.c code, I see it has this caveat:

"/*
     * Perform a sanity check so that the number of converted samples is
     * not greater than the number of samples to be converted.
     * If the sample rates differ, this case has to be handled differently
     */"

Can someone tell me what I need to do to get this 32000->44100 
conversion working correctly? I am using the FIFO as in the example but, 
as I said, I run out of audio at 148 seconds.
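
One possible reading of the caveat, if the converted buffer is sized to 
the input frame as in the example: 1024 samples at 32000 Hz should 
become roughly 1411 samples at 44100 Hz, so asking for only 1024 per 
call would leave the remainder inside the resampler. The libswresample 
resampling example sizes the output as 
av_rescale_rnd(swr_get_delay(ctx, in_rate) + in_samples, out_rate, 
in_rate, AV_ROUND_UP) and uses the return value of swr_convert() as the 
number of samples actually produced. The sketch below only redoes that 
rounding-up arithmetic in plain C; max_out_samples is a hypothetical 
helper, not an FFmpeg function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on the samples one conversion call can produce.
 * delay:      samples still buffered in the resampler, counted at in_rate
 * in_samples: samples per channel in the incoming frame
 */
int64_t max_out_samples(int64_t delay, int64_t in_samples, int in_rate, int out_rate)
{
    assert(in_rate > 0 && out_rate > 0);
    /* ceil((delay + in_samples) * out_rate / in_rate) using integers only */
    return ((delay + in_samples) * out_rate + in_rate - 1) / in_rate;
}
```

For a 1024-sample frame with nothing buffered, 
max_out_samples(0, 1024, 32000, 44100) is 1412, so the output buffer and 
the count pushed into the FIFO would both need to allow for more than 
1024 samples.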

The muxer part is based on muxing.c in the way it uses the PTS to decide 
which packet is needed next.
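
For reference, muxing.c makes that decision with av_compare_ts(), which 
compares two timestamps expressed in different time bases. A minimal 
stand-alone sketch of the same kind of comparison follows; compare_ts is 
a hypothetical name, and overflow handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compare a_ts (in ticks of a_num/a_den seconds) with b_ts (in ticks of
 * b_num/b_den seconds). Returns -1, 0 or 1 when a is earlier than,
 * equal to or later than b.
 */
int compare_ts(int64_t a_ts, int a_num, int a_den,
               int64_t b_ts, int b_num, int b_den)
{
    assert(a_den > 0 && b_den > 0);
    /* cross-multiply so no floating point is needed (overflow not handled) */
    int64_t a = a_ts * a_num * b_den;
    int64_t b = b_ts * b_num * a_den;
    return (a > b) - (a < b);
}
```

With video in a 1/25 time base and audio in 1/44100, a video pts of 25 
and an audio pts of 44100 both mean one second, so the comparison 
returns 0. If the audio pts advances too slowly for the amount of audio 
actually written, this test keeps picking audio as the earlier stream.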

A bit more debug data from the audio thread:
Preparing videoFiles/Ashbury Heights - Spiders.ts
init_resampler Input:3 fltp 32000, Output:3 fltp 44100
Output #0, matroska, to 'a.mkv':
     Stream #0:0: Video: h264 (Main), yuv420p, 1280x720, q=2-31, 2000 kb/s, 25 tbn
     Stream #0:1: Audio: aac (LC), 44100 Hz, stereo, fltp, 128 kb/s
KEY  : K pts:126000   pts_time:1.4      dts:118800   dts_time:1.32     duration:3600     duration_time:0.04      stream_index:0
A_RAW:   pts:123120   pts_time:1.368    dts:123120   dts_time:1.368    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:126000   pts_time:1.4      dts:126000   dts_time:1.4      duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:128880   pts_time:1.432    dts:128880   dts_time:1.432    duration:2880     duration_time:0.032     stream_index:1
POPped Audio 0 0.02322
POPped Audio 1024 0.02322
A_RAW:   pts:131760   pts_time:1.464    dts:131760   dts_time:1.464    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:134640   pts_time:1.496    dts:134640   dts_time:1.496    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:137520   pts_time:1.528    dts:137520   dts_time:1.528    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:140400   pts_time:1.56     dts:140400   dts_time:1.56     duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:143280   pts_time:1.592    dts:143280   dts_time:1.592    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:146160   pts_time:1.624    dts:146160   dts_time:1.624    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:149040   pts_time:1.656    dts:149040   dts_time:1.656    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:151920   pts_time:1.688    dts:151920   dts_time:1.688    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:154800   pts_time:1.72     dts:154800   dts_time:1.72     duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:157680   pts_time:1.752    dts:157680   dts_time:1.752    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:160560   pts_time:1.784    dts:160560   dts_time:1.784    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:163440   pts_time:1.816    dts:163440   dts_time:1.816    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:166320   pts_time:1.848    dts:166320   dts_time:1.848    duration:2880     duration_time:0.032     stream_index:1
POPped Audio 2048 0.02322
A_RAW:   pts:169200   pts_time:1.88     dts:169200   dts_time:1.88     duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:172080   pts_time:1.912    dts:172080   dts_time:1.912    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:174960   pts_time:1.944    dts:174960   dts_time:1.944    duration:2880     duration_time:0.032     stream_index:1
POPped Audio 3072 0.02322
A_RAW:   pts:177840   pts_time:1.976    dts:177840   dts_time:1.976    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:180720   pts_time:2.008    dts:180720   dts_time:2.008    duration:2880     duration_time:0.032     stream_index:1
A_RAW:   pts:183600   pts_time:2.04     dts:183600   dts_time:2.04     duration:2880     duration_time:0.032     stream_index:1
POPped Audio 4096 0.02322
POPped Audio 5120 0.02322
POPped Audio 6144 0.02322
POPped Audio 7168 0.02322
POPped Audio 8192 0.02322
POPped Audio 9216 0.02322
POPped Audio 10240 0.02322
POPped Audio 11264 0.02322
POPped Audio 12288 0.02322
POPped Audio 13312 0.02322
POPped Audio 14336 0.02322
POPped Audio 15360 0.02322
POPped Audio 16384 0.02322
POPped Audio 17408 0.02322
POPped Audio 18432 0.02322
POPped Audio 19456 0.02322
POPped Audio 20480 0.02322
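
A rough consistency check on this excerpt, assuming each A_RAW packet 
carries 1024 samples (duration 2880 in the 90k time base is 0.032 s at 
32000 Hz) and each POPped line is one 1024-sample encoder frame 
(0.02322 s at 44100 Hz). The excerpt has 22 A_RAW packets and 21 POPped 
lines. The sketch below estimates how many encoder frames 22 packets 
should yield once resampled. expected_frames is a hypothetical helper; 
it ignores resampler delay, and the threads may simply not be in step 
in so short a log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Whole encoder frames obtainable from n_packets input packets after
 * resampling from in_rate to out_rate (resampler delay ignored).
 */
int64_t expected_frames(int64_t n_packets, int in_frame, int in_rate,
                        int out_rate, int enc_frame)
{
    assert(in_rate > 0 && enc_frame > 0);
    /* total output samples after rate conversion, rounded down */
    int64_t out_samples = n_packets * in_frame * (int64_t)out_rate / in_rate;
    return out_samples / enc_frame;
}
```

expected_frames(22, 1024, 32000, 44100, 1024) is 30, whereas a 
one-for-one sample count would give about 22, which is closer to the 21 
frames seen here.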


Is anyone able to point me at a fix, or an example which shows what the 
caveat means?

-- 
BR

Joolz

