[FFmpeg-devel] [PATCH 8/8] Make mime-type award a bonus probe score

Tomas Härdin git at haerdin.se
Fri Feb 21 11:15:56 EET 2025


On Thu, 2025-02-20 at 22:08 +0100, Michael Niedermayer wrote:
> On Thu, Feb 13, 2025 at 10:29:33PM +0100, Tomas Härdin wrote:
> > Might be better to leverage afl-fuzz since it is more wily in its
> > tricks to provoke different program behavior. Then exit(1) whenever
> > the test program probes something incorrectly. For example you could
> > start with a small, valid MPEG-PS file and have afl-fuzz generate
> > slightly different versions of it that don't probe as such
> 
> A real fuzzer will make every probe probe incorrectly. Maybe I
> misunderstood what you suggested

What I'm getting at is that we can use fuzzing to generate files that
straddle the line between valid and not-valid, and then deliberately
decide not to support them. Maybe this isn't as useful as I initially
thought though.
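
A minimal sketch of the harness idea, for illustration only. toy_probe()
is a hypothetical stand-in for the real prober (it just counts MPEG-PS
pack start codes), and check() and main() are names made up for this
example. Exit status 1 follows the suggestion quoted above; whether the
fuzzer records a nonzero exit as a finding depends on how it is set up.

```python
import sys
from typing import Optional

PACK_START = b"\x00\x00\x01\xba"  # MPEG-PS pack start code


def toy_probe(data: bytes) -> Optional[str]:
    """Hypothetical stand-in for a real prober: needs two pack start codes."""
    return "mpeg" if data.count(PACK_START) >= 2 else None


def check(data: bytes, expected: str) -> int:
    """Return the exit status the harness would use: 1 means misprobed."""
    return 0 if toy_probe(data) == expected else 1


def main(argv: list) -> int:
    """Harness entry point: probe the file the fuzzer hands over."""
    with open(argv[1], "rb") as f:
        return check(f.read(), "mpeg")


# Tiny seed input and a one-byte mutation of it, as a fuzzer might produce.
seed = (PACK_START + b"\x44" * 10) * 2
mutated = seed[:17] + b"\xbb" + seed[18:]
```

A real harness would end with sys.exit(main(sys.argv)) and be started
from a small, valid seed file. Inputs like `mutated` are the ones that
straddle the line: one byte away from valid, yet no longer probed as
MPEG-PS.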

> what we want is that
> 1. Random binary, random ASCII, random UTF-8 and intermediates do not
>    get detected as any format (that's what probetest does)

Right, probing /dev/urandom shouldn't really return meaningful scores.
Except probetest is deterministic but whatever.
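
As a rough illustration of the deterministic variant: a seeded PRNG
yields the same "random" buffers on every run, so any false positive is
reproducible. This is not how probetest is implemented. toy_score() and
random_false_positives() are hypothetical names, and the scoring rule is
invented for the example.

```python
import random

PACK_START = b"\x00\x00\x01\xba"  # MPEG-PS pack start code


def toy_score(data: bytes) -> int:
    """Hypothetical prober: nonzero only if two or more start codes appear."""
    hits = data.count(PACK_START)
    return min(100, 25 * hits) if hits >= 2 else 0


def random_false_positives(seed: int, rounds: int = 200, size: int = 2048) -> list:
    """Probe seeded pseudo-random buffers; return indices that scored > 0."""
    rng = random.Random(seed)  # fixed seed: identical buffers every run
    bad = []
    for i in range(rounds):
        buf = bytes(rng.getrandbits(8) for _ in range(size))
        if toy_score(buf) > 0:
            bad.append(i)
    return bad
```

Calling random_false_positives() twice with the same seed returns the
same list, which is what makes a failure worth reporting.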

> 2. that format A is detected more as format A than as format B, where
>    B != A. We and our users test this by simply using ffmpeg and FATE

A and B are not always cleanly separated. In fact some formats are
deliberately designed to be polyglots. MXF is the obvious example. In
these cases what is important is workflows, not formats.


> having a really large corpus of real-world odd files and testing
> probing on them seems the "ideal" way to test probing to me

Probably yeah. This is more of a philosophical point anyway I think.

Anyway to get back to the point of this patch: I think MIME type is an
excellent hint of user intent. The patch also doesn't break any
existing tests. If I don't see any explicit objection I'll push this
patch as well and we'll see how it plays out in the real world. If
anyone objects then they need to at least provide samples, and
preferably also explain their workflow.
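
To make the idea concrete, here is a toy model of a MIME bonus, not the
patch itself. MIME_BONUS, the FORMAT_MIME table and pick_format() are
assumptions made up for this sketch; the actual bonus value and any
clamping are whatever the patch does.

```python
from typing import Dict, Optional, Tuple

MIME_BONUS = 30  # assumed value, for illustration only

# Hypothetical format table: demuxer name -> MIME type
FORMAT_MIME = {"mxf": "application/mxf", "mpegts": "video/mp2t"}


def pick_format(content_scores: Dict[str, int],
                mime_type: Optional[str]) -> Tuple[Optional[str], int]:
    """Add the bonus to formats matching the MIME hint, then take the best."""
    best_name, best_score = None, 0
    for name, score in content_scores.items():
        if mime_type is not None and FORMAT_MIME.get(name) == mime_type:
            score += MIME_BONUS  # the hint nudges the score, it does not replace it
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

With equal content scores, pick_format({"mxf": 50, "mpegts": 50},
"video/mp2t") picks mpegts, so the hint settles the polyglot case. With
{"mxf": 100, "mpegts": 50} and the same hint, mxf still wins, because a
bonus of this size cannot outvote strong evidence from the data itself.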

/Tomas


