[FFmpeg-devel] [PATCH 01/13] lavc/jpeg2000dec: Finer granularity threading

Tomas Härdin tjoppen at acc.umu.se
Fri Jun 24 11:19:46 EEST 2022


On Sat, 2022-06-18 at 16:50 +0200, Anton Khirnov wrote:
> Quoting Tomas Härdin (2022-06-14 16:39:00)
> > Patch 12 in this series is optional since it's just me getting the
> > speed up on a specific machine
> > 
> > /Tomas
> > 
> > From 115aa26c343419e81c1b5ba0bfdb1615cbec27e9 Mon Sep 17 00:00:00 2001
> > From: =?UTF-8?q?Tomas=20H=C3=A4rdin?= <git at haerdin.se>
> > Date: Fri, 10 Jun 2022 14:10:02 +0200
> > Subject: [PATCH 01/13] lavc/jpeg2000dec: Finer granularity threading
> > 
> > Decoding and dequant is now threaded on codeblock level.
> > IDWT is threaded on component level.
> > MCT and write_frame() remain threaded on tile level.
> > 
> > This brings lossless 4K J2K with -lowres 2 -thread_type slice
> > -threads 96 on an AMD EPYC 7R32 from 4.8 fps (177% CPU) to 31 fps
> > (1284% CPU).
> 
> Any measurable impact on single-threaded or frame-threaded decoding?
> 

Median of 11 runs with -threads 1 -vframes 100 on a 4K file:
before: real    0m38,664s
 after: real    0m39,139s

That is about 1.2% slower with a single thread.
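
For anyone who wants to repeat the comparison, here is a minimal sketch in C of how the median and the relative change could be computed. The helper names are made up for illustration and are not part of FFmpeg; the only measured numbers are the two medians quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles (hypothetical helper, not FFmpeg code). */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n wall-clock times in seconds; sorts a private copy so the
 * caller's array is left untouched. */
static double median_seconds(const double *runs, size_t n)
{
    assert(n > 0);
    double *tmp = malloc(n * sizeof(*tmp));
    assert(tmp != NULL);
    memcpy(tmp, runs, n * sizeof(*tmp));
    qsort(tmp, n, sizeof(*tmp), cmp_double);
    double med = (n % 2) ? tmp[n / 2]
                         : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return med;
}

/* Relative change from before to after in percent (positive = slower). */
static double percent_change(double before, double after)
{
    return 100.0 * (after - before) / before;
}
```

With the two medians above, percent_change(38.664, 39.139) comes out at roughly 1.2.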

I plan to try merging the last step of the IDWT code with the
av_clip() in write_frame(), which should improve run time in all
cases.
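
To illustrate the general idea, here is a rough sketch in C of fusing a final per-sample step with the clip. This is not the jpeg2000dec code: last_step() is a hypothetical placeholder for whatever the final IDWT step computes, clip_int() is a local stand-in for av_clip() so the sketch builds on its own, and the level shift before clipping is assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for libavutil's av_clip(). */
static int clip_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical placeholder for the last IDWT step on one sample. */
static int last_step(int coeff)
{
    return coeff / 2;
}

/* Two passes: finish the transform in place, then level-shift and clip. */
static void two_pass(int *buf, uint16_t *dst, size_t n, int bits)
{
    assert(bits > 0 && bits <= 16);
    for (size_t i = 0; i < n; i++)
        buf[i] = last_step(buf[i]);
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)clip_int(buf[i] + (1 << (bits - 1)),
                                    0, (1 << bits) - 1);
}

/* Fused: same arithmetic, but the intermediate value is never stored,
 * so each sample is read once and written once. */
static void fused(const int *buf, uint16_t *dst, size_t n, int bits)
{
    assert(bits > 0 && bits <= 16);
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)clip_int(last_step(buf[i]) + (1 << (bits - 1)),
                                    0, (1 << bits) - 1);
}
```

Both functions produce the same output; the fused one just skips the extra store and reload of the intermediate buffer, which is where any saving would come from.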

/Tomas


