[FFmpeg-devel] Scaling PAL8 images with alpha

Soft Works softworkz at hotmail.com
Fri Sep 24 19:59:46 EEST 2021



> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-bounces at ffmpeg.org> On Behalf Of Tomas
> Härdin
> Sent: Friday, 24 September 2021 17:34
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] Scaling PAL8 images with alpha
> 
> fre 2021-09-24 klockan 10:30 +0000 skrev Soft Works:
> > Hi,
> >
> > for a new filter, I want to rescale PAL8 subtitle bitmaps where the
> > palette includes
> > colors with alpha.
> 
> Isn't this typically done at the player level? What's your use case?

OK, the full story: this is also one of several reasons for
moving forward with subtitle filtering.

One thing I often had to tell users doing live transcoding with
burn-in of graphical subtitles on systems with weak CPUs was:

"This is expensive processing which requires a significant amount
of CPU resources and may sometimes lead to transcoding speeds below
realtime (1.0x)."

But the truth is that it's not that expensive at all - it's only
expensive because of the way ffmpeg currently does it: with sub2video.

Let's assume there's a 4k video with graphical subtitles that need
to be overlaid on top of the video.
For this purpose (and due to the lack of subtitle filtering), sub2video
generates a full-size RGBA frame (4K) with transparency, onto which it
renders the graphical subtitles - or just an empty 4K frame when no
subtitles are to be shown.
These RGBA frames are sent into a filtergraph, where they can be
used with the overlay filter to blend over the video frames.

A rather small problem is the conversion required when overlaying
over e.g. yuv420p (which happens automatically).
The bigger one is that this creates a situation where every
single 4K video frame needs to be blended with a 4K overlay frame,
iterating over every single pixel - even during periods when no
subtitles are visible at all.

Looking at bitmap subtitles and how they work, we can see that
they actually cover just one or a few small rectangular regions
of PAL8 images, making up only a fraction of the frame area.

While my patchset includes a 'graphicsub2video' filter which
replicates the sub2video behavior (still useful when uploading
to hw, for example), the new way to go is the 'overlay_graphicsubs'
filter, which has a video and a subtitle input and a video output.
It receives the subtitle rects directly instead of a pre-rendered
full-size frame.
When there are no subtitle rects at a given time, it simply does
nothing and passes the input video unchanged to the output.
When there are bitmap subtitles to display, it iterates over
them and blends only the region of each subtitle rect with
the main image.
When blending over a yuv video, there's no need to convert a
full-size RGBA frame to yuv - not even the subtitle region:
I just convert the PAL8 RGBA palette to yuv and use it for
lookup when blending.
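To illustrate the idea (this is just a sketch with made-up function
names, not the actual patchset code - and it blends only the luma
plane for brevity), converting the palette once and then blending
only the rect area could look roughly like this:

```c
#include <stdint.h>

/* Convert a 256-entry RGBA palette (as used by PAL8 subtitle rects)
 * to YUVA once, so per-pixel blending needs only a table lookup.
 * Coefficients are the usual BT.601 integer approximations. */
typedef struct { uint8_t y, u, v, a; } YUVA;

static void palette_rgba_to_yuva(const uint32_t *rgba_pal, YUVA *yuva_pal)
{
    for (int i = 0; i < 256; i++) {
        uint32_t c = rgba_pal[i];
        int a = (c >> 24) & 0xff;
        int r = (c >> 16) & 0xff;
        int g = (c >>  8) & 0xff;
        int b =  c        & 0xff;
        yuva_pal[i].y = (( 66 * r + 129 * g +  25 * b + 128) >> 8) + 16;
        yuva_pal[i].u = ((-38 * r -  74 * g + 112 * b + 128) >> 8) + 128;
        yuva_pal[i].v = ((112 * r -  94 * g -  18 * b + 128) >> 8) + 128;
        yuva_pal[i].a = a;
    }
}

/* Blend one PAL8 subtitle rect into the luma plane of a yuv frame.
 * Only the rect area is touched; the rest of the frame stays as-is. */
static void blend_rect_luma(uint8_t *dst, int dst_linesize,
                            const uint8_t *sub, int sub_linesize,
                            const YUVA *pal, int x, int y, int w, int h)
{
    for (int j = 0; j < h; j++) {
        uint8_t *d = dst + (y + j) * dst_linesize + x;
        const uint8_t *s = sub + j * sub_linesize;
        for (int i = 0; i < w; i++) {
            const YUVA *p = &pal[s[i]];
            d[i] = (p->y * p->a + d[i] * (255 - p->a) + 127) / 255;
        }
    }
}
```

The point is that the per-pixel cost is a lookup plus one blend, and
it is paid only for the rect area instead of the whole 4K frame.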


Now that the "why" is clear, there's one problem with it:

How do we handle cases where the video needs to be scaled?
Previously, we simply scaled that transparent full-size
frame in the same way as the main video. Needless to say,
this produces great results, since there's a full byte per
color per pixel.

But that suffers from the same problem as before, so we need
to stick to handling individual subtitle bitmaps.
Also, it's not guaranteed that these will end up being
used for overlay - they could also be re-encoded
as a graphical subtitle stream.


For those purposes I'm working on a sub_scale filter that allows
scaling those PAL8 subtitle bitmaps down or up.
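One obvious pipeline is to expand the PAL8 rect to RGBA via its
palette and resample that. The sketch below uses hypothetical helpers
and nearest-neighbor sampling purely for brevity - a real filter would
use a proper scaler (e.g. swscale) on the expanded RGBA data:

```c
#include <stdint.h>

/* Expand a PAL8 rect to RGBA using its palette (one lookup per pixel). */
static void pal8_expand_rgba(const uint8_t *src, int src_linesize,
                             const uint32_t *palette,
                             uint32_t *dst, int w, int h)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = palette[src[y * src_linesize + x]];
}

/* Resample the RGBA rect to the target size (nearest-neighbor only
 * to keep the sketch short; not what a quality scaler would do). */
static void scale_rgba_nn(const uint32_t *src, int sw, int sh,
                          uint32_t *dst, int dw, int dh)
{
    for (int y = 0; y < dh; y++) {
        int sy = y * sh / dh;
        for (int x = 0; x < dw; x++)
            dst[y * dw + x] = src[sy * sw + x * sw / dw];
    }
}
```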

Since the previous (current) implementation uses RGBA images
for blending and scaling, this endeavor is practically
"doomed" to deliver results of at least similar quality - but
with PAL8 instead of RGBA32 images.

Before concluding that this isn't achievable, there's
one important point to consider: these bitmaps are not like
photos. They contain text with a primary color, possibly a border
color and possibly a shadow color - sometimes plus intermediate
colors for smooth blending at the edges.

In such cases it doesn't require a large number of colors
to achieve results equal to full RGBA, but it does
require a good algorithm for the color quantization
(palettization) after scaling (with temporary RGBA conversion),
or a scaler that can work with PAL8 images directly
and is able to produce an adaptive palette for the output.
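For the first variant (quantize after scaling), the low-color nature
of subtitle bitmaps means an exact palette often suffices. Here's a
simplified sketch with a hypothetical helper - a real implementation
would fall back to proper quantization (median cut, or ffmpeg's ELBG)
when the color count exceeds 256:

```c
#include <stdint.h>

/* Map a scaled RGBA rect back to PAL8 by building an exact palette.
 * Returns the number of palette entries used, or -1 when more than
 * 256 distinct colors are present (real quantization needed then).
 * The linear palette search is fine here because ncolors stays small
 * for typical subtitle bitmaps. */
static int rgba_to_pal8(const uint32_t *rgba, int w, int h,
                        uint8_t *pal8, uint32_t *palette)
{
    int ncolors = 0;
    for (int i = 0; i < w * h; i++) {
        uint32_t c = rgba[i];
        int idx = -1;
        for (int j = 0; j < ncolors; j++)
            if (palette[j] == c) { idx = j; break; }
        if (idx < 0) {
            if (ncolors == 256)
                return -1; /* too many colors: quantize for real */
            palette[ncolors] = c;
            idx = ncolors++;
        }
        pal8[i] = (uint8_t)idx;
    }
    return ncolors;
}
```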



I'm not sure whether that adds much to the actual problem,
but you got the full story at least :-)

Thanks again for all the suggestions I've received so far!

softworkz



