[FFmpeg-devel] [RFC] threading API

Michael Niedermayer michaelni
Wed Oct 7 18:13:46 CEST 2009


On Tue, Oct 06, 2009 at 04:34:12PM +0200, Reimar D?ffinger wrote:
> On Tue, Oct 06, 2009 at 03:23:14PM +0200, Michael Niedermayer wrote:
> > On Tue, Oct 06, 2009 at 03:13:46PM +0200, Reimar D?ffinger wrote:
> > > On Tue, Oct 06, 2009 at 01:52:47PM +0200, Michael Niedermayer wrote:
> > > > On Tue, Oct 06, 2009 at 09:17:30AM +0200, Reimar D?ffinger wrote:
> > > > > On Tue, Oct 06, 2009 at 09:01:15AM +0200, Reimar D?ffinger wrote:
> > > > > > The later is only a very minor optimization, and if a API change should
> > > > > > be avoided by all means, the former can be achieved also by adding to
> > > > > > AVCodecContext a set_thread_number callback, that if set is executed
> > > > > > each time before the main worker function with the same void *arg, i.e.
> > > > > > void (*set_thread_number)(struct AVCodecContext *c, void *arg, int thread)
> > > > > > 
> > > > > > So, what are the opinions? Other ideas?
> > > > > 
> > > > > Actually, a better idea: add a
> > > > > void *(get_worker_arg)(struct AVCodecContext *c, void *arg_array, int arg_size,
> > > > > int arg_elem_size, int jobnr, int threadnr)
> > > > > where the arg_ stuff matches with the avctx->execute (arg2, count, size)
> > > > > arguments.
> > > > 
> > > > > The default implementation would then be
> > > > > default_get_worker_arg:
> > > > > return (char*)arg_array + jobnr*arg_elem_size;
> > > > > whereas dnxhdenc would use something like
> > > > > DNXHDContext *c = ((DNXHDContext **)arg_array)[threadnr];
> > > > > c->start_slice = jobnr;
> > > > > return c;
> > > > 
> > > > I see no reason why we ever should do 
> > > > return (char*)arg_array + jobnr*arg_elem_size
> > > > its always better to reuse contexts
> > > 
> > > Well, reason 1: it allows to change the existing code gradually
> > > reason 2: it might be possible that some codecs do not need
> > > to have a separate context for each thread anyway, they can all
> > > use the same AVCodecContext (and its priv_data), but they need some
> > > per-task data.
> > 
> > maybe we should have 2 arrays, 1 per thread, 1 per task, and they can be NULL
> 
> Again, that requires an API change. Though on thinking again my approach
> isn't really possible without either...

i have no matches in ffmpeg.c for execute() ...


> Still, at least it wouldn't require changing all decoders/encoders at
> once.
> Apart from that it seems less flexible than just having a void * that is
> the same for all and passing thread and task number.

iam fine with that as well


> 
> > > I think that is what the current implementation
> > > was thought for and which seems like a useful feature to me.
> > 
> > the current implementation was mostly toward threads == tasks, it has its
> > advantages if one assumes that the OS keeps each thread on the same CPU
> > then if the image is split in 2 the top is handled always on CPU #1 and the
> > botton always on cpu #2, you have a advanatge due to better cache use for
> > P and B frames
> 
> You'd have to also assume that thread 1 also gets task 1, for which I
> think there is no reason at all with the current code (haven't checked
> though).

i think thread 1 gets task 1 if not that would be a bug IMO


> Also with low_delay (or in general without B-frames) only the
> B-frames get any advantage unless you assume the decoded data isn't
> (immediately) used by the application.

the app could also have 2 threads on the same cpus or whatever to
use the data split to maintain cache efficiency ...

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20091007/f760cfe4/attachment.pgp>



More information about the ffmpeg-devel mailing list