[FFmpeg-devel] [PATCH] Whisper audio filter
Michael Niedermayer
michael at niedermayer.cc
Thu Jul 10 02:41:10 EEST 2025
Hi
On Wed, Jul 09, 2025 at 09:23:48AM +0200, Vittorio Palmisano wrote:
> It adds a new audio filter for running audio transcriptions with the whisper model.
> Documentation and examples are included into the patch.
>
> Signed-off-by: Vittorio Palmisano <vpalmisano at gmail.com>
> ---
> configure | 5 +
> doc/filters.texi | 101 ++++++++
> libavfilter/Makefile | 2 +
> libavfilter/af_whisper.c | 494 +++++++++++++++++++++++++++++++++++++++
> libavfilter/allfilters.c | 2 +
> 5 files changed, 604 insertions(+)
> create mode 100644 libavfilter/af_whisper.c
[...]
> +static void run_transcription(AVFilterContext *ctx, AVDictionary **metadata, int end_pos)
> +{
> + WhisperContext *wctx = ctx->priv;
> + end_pos = FFMIN(end_pos, wctx->audio_buffer_fill_size);
> +
> + if (!wctx->ctx_wsp || end_pos == 0)
> + {
> + return;
> + }
> +
> + if (!wctx->ctx_wsp)
> + {
> + return;
> + }
> +
> + float duration = (float)end_pos / WHISPER_SAMPLE_RATE;
float should not be used here.
end_pos and audio_buffer_fill_size are both integers,
and the timestamp is an integer as well, so
exact integer / rational math can and should be used here.
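
As an illustration only (not code from the patch), a minimal sketch of the integer version could look like the following. It assumes the 16 kHz rate that whisper.h defines as WHISPER_SAMPLE_RATE and a millisecond result; the SAMPLE_RATE macro and the samples_to_ms() helper are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Assumption: 16 kHz input, mirroring WHISPER_SAMPLE_RATE from whisper.h. */
#define SAMPLE_RATE 16000

/*
 * Hypothetical helper: convert a non-negative sample count to milliseconds
 * using only integer arithmetic. Adding SAMPLE_RATE / 2 before the division
 * rounds to the nearest millisecond instead of truncating.
 */
static int64_t samples_to_ms(int64_t nb_samples)
{
    return (nb_samples * 1000 + SAMPLE_RATE / 2) / SAMPLE_RATE;
}
```

Inside the filter the same conversion would more likely go through av_rescale_q() with AVRational time bases, e.g. rescaling end_pos from (AVRational){1, WHISPER_SAMPLE_RATE} to the time base of the output timestamps, so that no float is involved at any step.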
thx
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Give a rich man 100$ and he will turn it into 1000$.
Give a poor man 1000$ and he will spend it.