[FFmpeg-devel] SDR->HDR tone mapping algorithm?
Niklas Haas
ffmpeg at haasn.xyz
Fri Feb 8 19:30:20 EET 2019
Hi,
The important thing to consider is what constraints we are trying to
solve. And I think the expected behavior is that an SDR signal in SDR
mode should look identical to an SDR signal in HDR mode, to the end
user.
This is, of course, an impossible constraint to solve, since we don't
know anything about the display, either in HDR or in SDR mode. At best,
in the absence of this knowledge, we could make a guess (e.g. it's
roughly described by sRGB in SDR mode, and for HDR mode it roughly
follows the techniques outlined in ITU-R Report BT.2390). Better yet
would be to actually obtain this information from somewhere, but where?
(The user? ICC profile? EDID?).
But the bottom line is that to solve the "make SDR in HDR mode appear
identical to SDR in SDR mode" constraint, the curve you are trying to
invert is not your own tone mapping operator, but the tone mapping
operator implemented by the display (in HDR mode), which definitely
depends on what brightness level the display is targeting (in both SDR
and HDR modes).
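To make the dependence on brightness concrete, here is a minimal sketch. It assumes the HDR side is an ideal PQ (SMPTE ST 2084) display with no tone mapping of its own, and that the SDR side is a pure power-law display. The names `sdr_to_pq`, `sdr_white_nits` and `sdr_gamma` are made up for illustration, and the default values are guesses about the display, not measurements.

```python
# PQ (SMPTE ST 2084) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_inverse_eotf(nits):
    """Absolute luminance in cd/m^2 (0..10000) -> PQ signal in 0..1."""
    y = min(max(nits / 10000.0, 0.0), 1.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def pq_eotf(signal):
    """PQ signal in 0..1 -> absolute luminance in cd/m^2."""
    p = min(max(signal, 0.0), 1.0) ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def sdr_to_pq(sdr_signal, sdr_white_nits=100.0, sdr_gamma=2.2):
    """Encode an SDR signal so that an ideal PQ display emits the same
    light the assumed SDR display would have emitted.

    sdr_white_nits and sdr_gamma are assumptions about the SDR display.
    """
    linear = min(max(sdr_signal, 0.0), 1.0) ** sdr_gamma
    return pq_inverse_eotf(linear * sdr_white_nits)


if __name__ == "__main__":
    # The PQ code value needed for "SDR white" moves with the assumed
    # SDR brightness, which is exactly the information we do not have.
    for white in (80.0, 100.0, 203.0):
        print(white, round(sdr_to_pq(1.0, sdr_white_nits=white), 4))
```

A real display in HDR mode will add its own tone mapping on top of the PQ decode, so this only covers the ideal case.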
For an ideal HDR display, this would simply be the PQ curve's exact
definition (i.e. noop tone mapping). But in practice, the display will
almost surely not be capable of displaying up to 10,000 nits, so it will
implement a tone mapping operator of some kind (even if it's as simple
as clipping the extra range). Some colorimetric/reference displays
actually do the latter, since they prefer clipping out-of-range signals
over distorting in-range ones. But most consumer displays will probably
do something similar to the Hable curve, most likely applied per
channel.
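As a rough illustration of what "inverting the display" could look like, the sketch below models a display that applies a Hable-style curve per channel and inverts it numerically. The constants are the ones commonly published for Hable's filmic operator. Whether any given display uses this shape or these values is purely an assumption, and the helper names are hypothetical.

```python
# Hable filmic curve constants as commonly published (assumed here;
# a real display's curve is unknown).
A, B, C, D, E, F = 0.15, 0.50, 0.10, 0.20, 0.02, 0.30
WHITE = 11.2  # linear input that maps to output 1.0


def _hable_raw(x):
    return (x * (A * x + C * B) + D * E) / (x * (A * x + B) + D * F) - E / F


def hable(x):
    """Map linear input in [0, WHITE] to output in [0, 1]."""
    return _hable_raw(x) / _hable_raw(WHITE)


def invert_monotonic(curve, target, lo=0.0, hi=WHITE, iterations=60):
    """Find x in [lo, hi] with curve(x) ~= target, by bisection.

    Only valid if curve is monotonically increasing on [lo, hi].
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def display_per_channel(rgb, curve=hable):
    """Hypothetical display: applies the curve to each channel."""
    return tuple(curve(c) for c in rgb)


def compensate_per_channel(rgb, curve=hable):
    """Pre-distort each channel so the display's curve cancels out."""
    return tuple(invert_monotonic(curve, c) for c in rgb)


if __name__ == "__main__":
    wanted = (0.2, 0.5, 0.8)
    sent = compensate_per_channel(wanted)
    print(display_per_channel(sent))
```

Note that a display which clips instead cannot be inverted above its peak, since everything brighter collapses to the same output.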
For an ideal SDR display, it depends on who you ask (w.r.t. what "ideal"
means). In the ITU-R world, an ideal SDR reference display implements
the BT.1886 transfer function. In practice, it's probably closer to a
pure power gamma 2.2 curve. Or maybe sRGB. We really have nothing else
to do here except either consult an ICC profile or just stick our head
in the sand and guess randomly.
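To show how much the three candidates disagree, here is a small sketch of each EOTF. The BT.1886 form follows the published formula with a 2.4 exponent and configurable white and black levels. The default levels (white 1.0, black 0.0) are placeholders chosen for comparison, not a claim about any real display.

```python
def gamma22_eotf(v):
    """Pure power-law display, exponent 2.2."""
    return max(v, 0.0) ** 2.2


def srgb_eotf(v):
    """Piecewise sRGB decoding function."""
    v = max(v, 0.0)
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def bt1886_eotf(v, white=1.0, black=0.0):
    """ITU-R BT.1886 reference EOTF.

    white and black are the display's luminance for full-scale white
    and black, in whatever unit the caller wants back.
    """
    g = 2.4
    span = white ** (1.0 / g) - black ** (1.0 / g)
    a = span ** g
    b = black ** (1.0 / g) / span
    return a * max(v + b, 0.0) ** g


if __name__ == "__main__":
    # Same mid-gray signal, three different amounts of light.
    for name, eotf in (("gamma 2.2", gamma22_eotf),
                       ("sRGB", srgb_eotf),
                       ("BT.1886", bt1886_eotf)):
        print(name, round(eotf(0.5), 4))
```

With a non-zero black level, BT.1886 lifts the shadows compared to the pure power curve, so the guess matters most in dark scenes.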
--------------------------------------------------------------------------
I'd also like to comment on your compositor design proposal. A few notes:
1. It's always beneficial to do as few color conversion steps as
possible, to minimize cumulative errors and optimize performance. If
you use a 3DLUT as any step (e.g. for implementing an ICC-profile
based mapping), the 3DLUT should be as "wide" as possible and cover
as many operations as possible, so that the 3DLUT can be end-to-end
optimized (by the CMM).
If you insist on doing compositing in linear light, then I would
probably composite in display-referred linear light and convert it to
non-linear light during scanout (either by implementing the needed
OETF + linear tone mapping operator via the VCGTs, or by doing a
non-linear tone mapping pass). But I would recommend trying to avoid
any second gamut conversion step (e.g. from BT.2020 to the display's
space after compositing).
Otherwise, I would composite directly in the target color space
(saving us one final conversion step), which would obviously be
preferable if there are no transparency effects to worry about.
Maybe we could even switch dynamically between the two depending on
whether any blending needs to occur? Assuming we can update the VCGTs
atomically and without meaningful latency.
2. Rec 2020 is not (inherently) HDR. Also, the choice of color gamut has
nothing to do with the choice of transfer function. I might have Rec
709 HDR content. In general, when ingesting a buffer, the user should
be responsible for tagging both its color primaries and its transfer
function.
3. If you're compositing in linear light, then you most likely want to
be using at least 16-bit per channel floating point buffers, with 1.0
mapping to "SDR white", and HDR values being treated as above 1.0.
This is also a good color space to use for ingesting buffers, since
it allows treating SDR and HDR inputs "identically", but extreme
caution must be applied due to the fact that with floating point
buffers, we're left at the mercy of what the client wants to put into
them (10^20? NaN? Negative values?). Extra metadata must still be
communicated between the client and the compositor to ensure both
sides agree on the signal range of the floating point buffer
contents.
4. Applications need a way to bypass the color pipeline in the
compositor, i.e. applications need a way to tag their buffers as
"this buffer is in display N's native (SDR|HDR) color space". This of
course only makes sense if applications both have a way of knowing
what display N's native SDR/HDR color space is, as well as which
display N they're being displayed (more) on. Such buffers should be
preserved as much as possible end-to-end, ideally being just directly
scanned out as-is.
5. Implementing a "good" HDR-to-SDR tone mapping operator, and even
deciding whether to use the display's HDR or SDR mode, requires
knowledge of what brightness range your composited buffer contains.
Crucially, I think applications should be allowed to tag their
buffers with the brightest value that they "can" contain. If they
fail to do so, we should assume the highest possible value permitted
by the transfer function specified (e.g. 10,000 nits for PQ). Putting
this metadata into the protocol early would allow us to explore
better tone mapping functions later on.
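Points 2, 3 and 5 together suggest what per-buffer metadata might look like. The following is only a sketch: `BufferTag`, its field names and `sanitize_linear` are hypothetical and do not correspond to any existing Wayland protocol. The only default taken from the text above is 10,000 nits for PQ.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Highest luminance the transfer function itself permits, in cd/m^2.
# Only the PQ entry is given above; other transfers must be tagged
# explicitly in this sketch.
TRANSFER_PEAK_NITS = {"pq": 10000.0}


@dataclass
class BufferTag:
    primaries: str                    # e.g. "bt709", "bt2020"
    transfer: str                     # e.g. "srgb", "pq", "linear"
    max_nits: Optional[float] = None  # brightest value the buffer "can" contain

    def effective_max_nits(self):
        """Client-provided peak, else the transfer function's limit."""
        if self.max_nits is not None:
            return self.max_nits
        if self.transfer not in TRANSFER_PEAK_NITS:
            raise ValueError("no default peak known for " + self.transfer)
        return TRANSFER_PEAK_NITS[self.transfer]


def sanitize_linear(value, max_value):
    """Clamp one float sample (1.0 == SDR white) to [0, max_value].

    NaN becomes 0.0; negative and infinite values are clamped.
    """
    if math.isnan(value):
        return 0.0
    return min(max(value, 0.0), max_value)


if __name__ == "__main__":
    # Rec 709 primaries with a PQ transfer: gamut and transfer function
    # are tagged independently.
    tag = BufferTag(primaries="bt709", transfer="pq")
    print(tag.effective_max_nits())
    print([sanitize_linear(v, 100.0) for v in (0.5, 1e20, float("nan"), -3.0)])
```

Clamping is just one possible policy for hostile float contents; rejecting the buffer would be another.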
Some final words of advice,
1. The protocol suggestions for color management in Wayland have all
seemed terribly over-engineered compared to the problem they are
trying to solve. I have had some short discussions with Link Mauve on
the topic of how to design a protocol that's as simple as possible
while still fulfilling its purpose, and we started drafting our own
protocol for this, but it's sitting in a WIP state somewhere.
2. I see that Graeme Gill has posted a bit in at least some of these
threads. I recommend listening to his advice as much as possible.
On Fri, 08 Feb 2019 22:01:49 +0530, Harish Krupo <harish.krupo.kps at intel.com> wrote:
> Hi Vittorio,
>
> Vittorio Giovara <vittorio.giovara at gmail.com> writes:
>
> > On Fri, Feb 8, 2019 at 3:22 AM Harish Krupo <harish.krupo.kps at intel.com>
> > wrote:
> >
> >> Hello,
> >>
> >> We are in the process of implementing HDR rendering support in the
> >> Weston display compositor [1] (HDR discussion here [2]). When HDR
> >> and SDR surfaces like a video buffer and a subtitle buffer are presented
> >> together, the composition would take place as follows:
> >> - If the display does not support HDR metadata:
> >> in-coming HDR surfaces would be tone mapped using opengl to SDR and
> >> blended with the other SDR surfaces. We are currently using the Hable
> >> operator for tone mapping.
> >> - If the display supports setting HDR metadata:
> >> SDR surfaces would be tone mapped to HDR and blended with HDR surfaces.
> >>
> >> The literature available for SDR->HDR tone mapping varies from simple
> >> linear expansion of luminance to CNN based approaches. We wanted to know
> >> your recommendations for an acceptable algorithm for SDR->HDR tone mapping.
> >>
> >> Any help is greatly appreciated!
> >>
> >> [1] https://gitlab.freedesktop.org/wayland/weston
> >> [2]
> >> https://lists.freedesktop.org/archives/wayland-devel/2019-January/039809.html
> >>
> >> Thank you
> >> Regards
> >> Harish Krupo
> >>
> >
> > In *theory* the tonemapping functions should be reversible, so if you use
> > vf_tonemap or vf_tonemap_opencl and properly expand the range via zimg
> > (vf_zscale) before compression it should work fine. However I have never
> > tried it myself, so I cannot guarantee that those filters will work as is.
> > Of course haasn from the libplacebo project might have better suggestions,
> > so you should really reach out to him.
>
> Thanks, will try reversing the algorithms. Sure, will contact Haasn.
>
> Regards
> Harish Krupo