[FFmpeg-devel] [PATCH 2/3] doc/dict2: Add doc and api change for AVDictionary2

Sat Apr 19 16:43:26 EEST 2025

> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> Michael Niedermayer
> Sent: Samstag, 19. April 2025 04:29
> To: FFmpeg development discussions and patches <ffmpeg-
> devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/3] doc/dict2: Add doc and api
> change for AVDictionary2
> 
> Hi
> 
> On Thu, Apr 17, 2025 at 10:38:32PM +0000, softworkz . wrote:
> >
> >
> > > -----Original Message-----
> > > From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of
> [...]
> > > the LLM would probably mix and confuse things and hallucinate
> > > a lot of nonsense.
> >
> > That's less of a problem by now, as the available context
> > windows have increased and operating on trac ticket discussions
> > does not create such long conversations that the context
> > window overflows and important parts fall off.
> > Care would only need to be taken that it doesn't ingest the
> > really large log outputs that are sometimes included in tickets.
> >
> > At this time, it would still be too bold to let it work fully
> > autonomously, but that's not necessary because its
> > operations could easily be arbitrated by conventional logic.
> >
> > It could be controlled by a set of tags - something like:
> >
> > - tracbot-error
> > - tracbot-inconclusive
> > - tracbot-needs-manual-review
> > - tracbot-awaiting-user-response
> > - tracbot-reproduced-in-master
> > - tracbot-fixed-in-master
> >
> > Then, a scheduler service would run over all open issues and
> > invoke the AI on it (see below).
> >
> > The scheduler would exclude tickets which already have one of
> > those tags assigned.
> > Additionally, it would include tickets that are tagged with
> > "tracbot-awaiting-user-response" and have been updated since
> > the tag was assigned.
> >
> >
> > When the AI is invoked on a ticket, it has clear instructions
> > to follow. The primary directive is to reproduce the reported
> > issue. If the specified information is unclear or incomplete,
> > or when no test file is provided, it posts a message asking
> > for the missing information and applies the awaiting-user-response
> > tag.
> >
> > The AI would have an execution environment in a Docker
> > container where it has access to a library with daily builds
> > from the past 5 years.
> > If the issue doesn't reproduce with the latest daily build,
> > it adds the tracbot-fixed-in-master tag.
> >
> > If it can be reproduced with the latest build, it "bisects"
> > the issue using the daily binaries.
> > It adds a message like: "Issue reproducible since version
> > 20xx-xx-xx" and the tag tracbot-reproduced-in-master.
> >
> > If it can't make sense of the ticket, or if the issue is
> > platform-specific or needs certain hardware, or if an error
> > occurs, it adds one of the other tags.
> >
> > Some safeguards must be added to avoid anybody getting
> > into a longer chat with it (always ending with
> > awaiting-user-response), but otherwise, I don't think
> > that there's much that can go wrong.
> >
> > A mailing list could be set up to which it reports its
> > operations, and to which interested members (or anybody)
> > can subscribe. This would provide a kind of real-time
> > monitoring by the community.
> >
> >
> > All-in-all I think it's well doable.
> >
> > Unfortunately though, I cannot spend that much time.
> > Perhaps a candidate for GSoC?
> 
> GSoC would need a mentor and a student/contributor wanting to work on
> this.
> Also, this would need someone (ideally either the mentor or the
> contributor) willing to maintain it after GSoC.
> 
> And it would not surprise me if it's more work for us to do this in
> GSoC than to just do it ourselves.

Hi Michael,

yeah, that's also one of the reasons why I don't consider myself a
good teacher: I tend to think that I'll be able to get it done by myself
even before I'm done explaining it to someone else.

The other day I let the same setup that I used for the dictionary run
on it, and it struggled, saying it cannot get past the "Anubis" bot
protection on trac.ffmpeg.org. Is that right, do we have that kind of
protection for the trac server? If yes, is there another way to access
the trac site via an API?

After it couldn't access it, it started writing code to just "mock" the
API access, and when I forbade that, it resorted to a browser automation
approach (via Puppeteer). That's suboptimal of course, even though
it succeeded in retrieving ticket content.
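
For illustration, the scheduler rule and the bisection over daily builds
from the quoted proposal could be sketched roughly like this. This is
only a sketch under assumptions: the Ticket class, its field names and
the reproduces() callback are hypothetical and not part of trac or any
existing tool; only the tag names are taken from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Optional, Sequence

# Tag names as listed in the quoted proposal.
TRACBOT_TAGS = {
    "tracbot-error",
    "tracbot-inconclusive",
    "tracbot-needs-manual-review",
    "tracbot-awaiting-user-response",
    "tracbot-reproduced-in-master",
    "tracbot-fixed-in-master",
}
AWAITING = "tracbot-awaiting-user-response"


@dataclass
class Ticket:
    """Hypothetical ticket record; the field names are assumptions."""
    ticket_id: int
    tags: set = field(default_factory=set)
    updated_at: Optional[datetime] = None
    tag_assigned_at: Optional[datetime] = None  # when AWAITING was applied


def needs_processing(ticket: Ticket) -> bool:
    """Scheduler rule: skip tickets that already carry a tracbot tag,
    except awaiting-user-response tickets updated after tagging."""
    bot_tags = ticket.tags & TRACBOT_TAGS
    if not bot_tags:
        return True
    if AWAITING in bot_tags:
        return (
            ticket.updated_at is not None
            and ticket.tag_assigned_at is not None
            and ticket.updated_at > ticket.tag_assigned_at
        )
    return False


def first_bad_build(builds: Sequence[str],
                    reproduces: Callable[[str], bool]) -> str:
    """Binary search over daily builds sorted oldest to newest.

    Assumes the newest build reproduces the issue and that the issue
    persists in every build after the one where it first appeared.
    """
    lo, hi = 0, len(builds) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if reproduces(builds[mid]):
            hi = mid        # issue already present, look earlier
        else:
            lo = mid + 1    # issue not yet present, look later
    return builds[lo]
```

The bisection only gives a meaningful answer when the latest build has
already been confirmed to reproduce the issue; otherwise the ticket
would get tracbot-fixed-in-master and no search would be run.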

sw