[FFmpeg-user] Large-sized output files received while encoding the audio

Shubham Tiwari shubham.tiwari at observe.ai
Wed Apr 20 19:31:01 EEST 2022


Hi Ferdi,

My use case is redaction of audio. To achieve this, our backend adds a
filter to the ffmpeg command to mute the audio and then runs the command
on the audio files. For some audio files this takes more than 30 seconds,
which causes high CPU usage and low throughput. The problem is the same
one I described in my original message: when the output audio file has the
mp3 extension, the time taken is high; when it keeps the original
extension (wav), the file size is high (in some cases more than 250 MB).
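
For illustration, a muting command of this kind could look like the sketch
below. It is not the exact command from our backend: the helper name
redact_cmd, the 30 to 45 second range, and the choice of the volume filter
with a timeline enable expression are assumptions. The helper only prints
the command. It sets the MP3 bitrate explicitly with -b:a (64k here, to
match the input), because the run in my log below fell back to 24 kb/s.

```shell
#!/bin/sh
# Hypothetical helper (not our real backend code): print an ffmpeg command
# that silences the range START..END (in seconds) and writes MP3 output
# at an explicit bitrate.
redact_cmd() {
    in=$1; out=$2; start=$3; end=$4; bitrate=${5:-64k}
    # volume filter with timeline editing: gain 0 only between start and end
    filter="volume=enable='between(t,$start,$end)':volume=0"
    printf 'ffmpeg -i %s -af "%s" -c:a libmp3lame -b:a %s %s\n' \
        "$in" "$filter" "$bitrate" "$out"
}

# Example: mute seconds 30 to 45 of the file from my log.
redact_cmd call-redacted.wav output.mp3 30 45
```

Printing instead of executing keeps the sketch runnable on a machine
without ffmpeg installed.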

Do you know what we are doing wrong? Any help would be appreciated.



*Your second command actually encodes the already encoded file again,
reducing the quality of the audio even further as 64 kb is already very
low quality for an mp3 file. This is also why it takes longer.*

Do you know the reason why reducing the quality even further takes longer?
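
As a rough check, the numbers in the two logs quoted below are at least
consistent with each other. The sketch recomputes the PCM bitrate, the two
output sizes, and the MP3 encode time from the stream parameters and the
speed= value in the logs. It assumes that the kB in ffmpeg's size= line
means 1024 bytes, and it ignores container headers.

```shell
#!/bin/sh
# Figures copied from the quoted logs: 8000 Hz, stereo, s16, time=00:13:06.17
rate=8000; channels=2; bits=16
centis=$(( (13 * 60 + 6) * 100 + 17 ))   # 13:06.17 in hundredths of a second

pcm_bps=$(( rate * channels * bits ))                 # PCM bit rate in bit/s
pcm_kib=$(( pcm_bps / 8 / 100 * centis / 1024 ))      # WAV payload in KiB
mp3_kib=$(( 24000 / 8 / 100 * centis / 1024 ))        # 24 kb/s MP3 in KiB
mp3_secs=$(( centis / 130 / 100 ))                    # wall time at speed=130x

echo "pcm bitrate: $(( pcm_bps / 1000 )) kb/s"        # log: 256 kb/s
echo "wav size: about $pcm_kib KiB"                   # log: size= 24569kB
echo "mp3 size: about $mp3_kib KiB"                   # log: size= 2304kB
echo "mp3 encode time: about $mp3_secs s"             # measured: 6 seconds
```

This prints 256 kb/s, about 24567 KiB, about 2303 KiB and about 6 s, which
is close to the figures in the logs.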

Regards,
Shubham

On Wed, Apr 20, 2022 at 8:08 PM Ferdi Scholten <ferdi at sttc-nlp.nl> wrote:

> Hi,
> > When I run the ffmpeg command on a wav file, the output file I receive
> > is large compared to the original file. The command and the output are
> > below,
> >
> > *FFMPEG command*
> >
> >   % ~/Downloads/audio/ffmpeg -i call-redacted.wav output.wav
> >
> > ffmpeg version 4.4.1-tessus  https://evermeet.cx/ffmpeg/  Copyright (c)
> > 2000-2021 the FFmpeg developers
> >
> >    built with Apple clang version 11.0.0 (clang-1100.0.33.17)
> >
> >    configuration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg
> > --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl
> > --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d
> > --enable-libfreetype --enable-libgsm --enable-libmodplug
> > --enable-libmp3lame --enable-libmysofa --enable-libopencore-amrnb
> > --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg
> > --enable-libopus --enable-librubberband --enable-libshine
> > --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora
> > --enable-libtwolame --enable-libvidstab --enable-libvmaf
> > --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp
> > --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid
> > --enable-libzimg --enable-libzmq --enable-libzvbi --enable-version3
> > --pkg-config-flags=--static --disable-ffplay
> >
> >    libavutil      56. 70.100 / 56. 70.100
> >
> >    libavcodec     58.134.100 / 58.134.100
> >
> >    libavformat    58. 76.100 / 58. 76.100
> >
> >    libavdevice    58. 13.100 / 58. 13.100
> >
> >    libavfilter     7.110.100 /  7.110.100
> >
> >    libswscale      5.  9.100 /  5.  9.100
> >
> >    libswresample   3.  9.100 /  3.  9.100
> >
> >    libpostproc    55.  9.100 / 55.  9.100
> >
> > Input #0, mp3, from 'call-redacted.wav':
> >
> >    Metadata:
> >
> >      encoder         : Lavf58.45.100
> >
> >    Duration: 00:13:06.38, start: 0.138125, bitrate: 64 kb/s
> >
> >    Stream #0:0: Audio: mp3, 8000 Hz, stereo, fltp, 64 kb/s
> >
> > Stream mapping:
> >
> >    Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
> >
> > Press [q] to stop, [?] for help
> >
> > Output #0, wav, to 'output.wav':
> >
> >    Metadata:
> >
> >      ISFT            : Lavf58.76.100
> >
> >    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, stereo, s16, 256 kb/s
> >
> >      Metadata:
> >
> >        encoder         : Lavc58.134.100 pcm_s16le
> >
> > size=   24569kB time=00:13:06.17 bitrate= 256.0kbits/s speed=2.97e+03x
> >
> > video:0kB audio:24569kB subtitle:0kB other streams:0kB global headers:0kB
> > muxing overhead: 0.000310
> >
> >
> > *Input and output file sizes*
> >
> > du -sh call-redacted.wav
> >
> > 6.5M call-redacted.wav
> >
> > du -sh output.wav
> >
> >   24M output.wav
> >
> >
> > Alternatively, when I run the same command but change the output
> > file's extension to mp3, a correctly sized file is returned. But the
> > time taken (6 seconds) in this case is much higher than in the previous
> > case (less than 1 second). The command and output are below,
> >
> >
> >
> > ~/Downloads/audio/ffmpeg -i call-redacted.wav output.mp3
> >
> > ffmpeg version 4.4.1-tessus  https://evermeet.cx/ffmpeg/  Copyright (c)
> > 2000-2021 the FFmpeg developers
> >
> >    built with Apple clang version 11.0.0 (clang-1100.0.33.17)
> >
> >    configuration: --cc=/usr/bin/clang --prefix=/opt/ffmpeg
> > --extra-version=tessus --enable-avisynth --enable-fontconfig --enable-gpl
> > --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d
> > --enable-libfreetype --enable-libgsm --enable-libmodplug
> > --enable-libmp3lame --enable-libmysofa --enable-libopencore-amrnb
> > --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg
> > --enable-libopus --enable-librubberband --enable-libshine
> > --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora
> > --enable-libtwolame --enable-libvidstab --enable-libvmaf
> > --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwebp
> > --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid
> > --enable-libzimg --enable-libzmq --enable-libzvbi --enable-version3
> > --pkg-config-flags=--static --disable-ffplay
> >
> >    libavutil      56. 70.100 / 56. 70.100
> >
> >    libavcodec     58.134.100 / 58.134.100
> >
> >    libavformat    58. 76.100 / 58. 76.100
> >
> >    libavdevice    58. 13.100 / 58. 13.100
> >
> >    libavfilter     7.110.100 /  7.110.100
> >
> >    libswscale      5.  9.100 /  5.  9.100
> >
> >    libswresample   3.  9.100 /  3.  9.100
> >
> >    libpostproc    55.  9.100 / 55.  9.100
> >
> > Input #0, mp3, from 'call-redacted.wav':
> >
> >    Metadata:
> >
> >      encoder         : Lavf58.45.100
> >
> >    Duration: 00:13:06.38, start: 0.138125, bitrate: 64 kb/s
> >
> >    Stream #0:0: Audio: mp3, 8000 Hz, stereo, fltp, 64 kb/s
> >
> > Stream mapping:
> >
> >    Stream #0:0 -> #0:0 (mp3 (mp3float) -> mp3 (libmp3lame))
> >
> > Press [q] to stop, [?] for help
> >
> > Output #0, mp3, to 'output.mp3':
> >
> >    Metadata:
> >
> >      TSSE            : Lavf58.76.100
> >
> >    Stream #0:0: Audio: mp3, 8000 Hz, stereo, fltp
> >
> >      Metadata:
> >
> >        encoder         : Lavc58.134.100 libmp3lame
> >
> > size=    2304kB time=00:13:06.17 bitrate=  24.0kbits/s speed= 130x
> >
> > video:0kB audio:2304kB subtitle:0kB other streams:0kB global headers:0kB
> > muxing overhead: 0.011063
> >
> > du -sh output.mp3
> >
> > 3.3M output.mp3
> >
> >
> > Please help me out in identifying the right command options for optimum
> > time and filesize.
> >
> >
> > Regards,
> >
> > Shubham
> > _______________________________________________
> > ffmpeg-user mailing list
> > ffmpeg-user at ffmpeg.org
> > https://ffmpeg.org/mailman/listinfo/ffmpeg-user
> >
> > To unsubscribe, visit link above, or email
> > ffmpeg-user-request at ffmpeg.org with subject "unsubscribe".
> The fact that the file extension is .wav does not mean the file is an
> uncompressed PCM wav file. In this case, as ffmpeg correctly identified,
> it is an mp3 file (lossy compressed audio), which will always get bigger
> when converted to uncompressed PCM. That is what your first command
> does: it decodes the mp3 and stores the uncompressed stream in PCM
> format.
>
> Your second command actually encodes the already encoded file again,
> reducing the quality of the audio even further as 64 kb is already very
> low quality for an mp3 file. This is also why it takes longer.
>
> So what are you trying to achieve?
>
> Greetings Ferdi

