[FFmpeg-devel] [libav-devel] shared api for exposing a texture

Mon May 18 15:46:47 CEST 2015

Hi all

I have coincidentally been working on Hap support for libavcodec as well.

> There are a couple of formats that are based on texture compression,
> usually called DXTn or BCn, and described here:
> http://en.wikipedia.org/wiki/S3_Texture_Compression. Currently in
> libavcodec only txd uses this style, but there are others I am working
> on, namely Hap and DSS.
>
> What I thought while working on them (and later found out actually
> being of commercial interest) is that the texture could be potentially
> left intact and rather than being decoded (or encoded) internally by
> libavcodec. The user might want to skip decoding the texture
> altogether and decode it him or herself, possibly exploiting gpu
> acceleration.
>

Just to be clear, as one respondent seemed to misunderstand, these
compressed texture formats are passed to the graphics layer (OpenGL or
DirectX) as a final buffer for display. Unlike hardware decoding of e.g.
H.264, they don't return as buffers in a traditional pixel format (so they
won't be involved with HWAccel). The formats are not limited to the S3
types - other formats exist, some only for use on particular desktop or
mobile platforms (BC7, PVRTC, etc).

The advantage they offer over traditional pixel formats is the reduced
bandwidth requirement from main memory to the GPU, and the subsequent
reduced graphics memory usage. Decoding at draw time is a fairly cheap
operation for the GPU. By using these formats, systems are able to play
more streams of higher resolution video than they could with RGB or Y'CbCr
formats.
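
To put rough numbers on that, here is a minimal sketch of the size
arithmetic. It assumes the usual S3TC layout of 4x4 pixel blocks at 8 bytes
per block for DXT1 and 16 bytes per block for DXT5; the function names are
made up for illustration and are not libavcodec API.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a block-compressed image made of 4x4 pixel blocks.
 * block_bytes is 8 for DXT1 (BC1) and 16 for DXT5 (BC3).
 * Dimensions are rounded up to a whole number of blocks. */
static size_t texture_size(int width, int height, size_t block_bytes)
{
    assert(width > 0 && height > 0);
    size_t blocks_wide = ((size_t)width  + 3) / 4;
    size_t blocks_high = ((size_t)height + 3) / 4;
    return blocks_wide * blocks_high * block_bytes;
}

/* Bytes needed for the same image as packed 8-bit RGBA. */
static size_t rgba_size(int width, int height)
{
    assert(width > 0 && height > 0);
    return (size_t)width * (size_t)height * 4;
}
```

For a 1920x1080 frame this gives 8,294,400 bytes as RGBA, 2,073,600 bytes
as DXT5 and 1,036,800 bytes as DXT1, i.e. a quarter to an eighth of the
data to upload per frame.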

> Unfortunately these formats often employ additional compression or add
> custom headers so the user can't just demux and accelerate the output
> frame as is. Interested codecs could let the user choose this with a
> private option.
>
>
> There are a couple of strategies here.
> 1. Introduce a pixel format for each texture: this has the advantage
> of getting appropriately-sized buffers, but in the end it would
> require having a different pixel format for each variant of each
> texture compression. Our users tend to dislike this option and I am
> afraid this would require us of constantly adding new texture formats
> when support is added.
>

From the point of view of simplicity within libavcodec and simplicity for
API users, this would be my preferred choice:

- AVPixelFormat defines the primary output of a decoder, and these formats
would be the primary output of their decoders (when requested)
- These formats are block-based and can be described by AVPixFmtDescriptor
to work with existing allocation mechanisms
- Because they can work with existing allocation mechanisms they would work
with FramePools
- API users may wish to use custom allocators via CODEC_CAP_DR1 to decode
directly to graphics memory, which this would permit them to do
- API users would want to opt in to receive these formats and a mechanism
already exists for selecting pixel formats (AVCodecContext's get_format
callback)
- Decoders could emit traditional pixel formats by default unless a
compressed texture format is requested by get_format, so existing API users
would not have to deal with the new formats unless they wanted to
- API clients can be agnostic as to the codec used when negotiating pixel
formats through get_format: they will automatically support new codecs
which can emit these pixel formats when they are added to libavcodec.
- They are much more like traditional pixel formats than the existing
hardware acceleration formats
- If wanted, libswscale could be extended to handle them
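
As a rough illustration of the opt-in negotiation, here is a self-contained
model of what a client's choice could look like. It only imitates the shape
of a get_format callback (the decoder offers a terminated list, the client
returns one entry). The enum and every FMT_* name are hypothetical
stand-ins, not existing AVPixelFormat values, and choose_format() is not
libavcodec API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for AVPixelFormat values; none of the
 * FMT_TEX_* names exist in libavcodec. */
enum PixFmt {
    FMT_NONE = -1,   /* list terminator, playing the role of AV_PIX_FMT_NONE */
    FMT_RGBA,
    FMT_TEX_DXT1,
    FMT_TEX_DXT5,
    FMT_TEX_BC7
};

/* Client-side choice, shaped like a get_format callback.
 * offered:   formats the decoder can emit, FMT_NONE-terminated.
 * supported: formats the client can draw, in order of preference,
 *            FMT_NONE-terminated.
 * Returns the client's most preferred format that the decoder offers,
 * or FMT_NONE if the two lists have nothing in common. */
static enum PixFmt choose_format(const enum PixFmt *offered,
                                 const enum PixFmt *supported)
{
    assert(offered != NULL && supported != NULL);
    for (const enum PixFmt *s = supported; *s != FMT_NONE; s++)
        for (const enum PixFmt *o = offered; *o != FMT_NONE; o++)
            if (*o == *s)
                return *o;
    return FMT_NONE;
}
```

A client that lists only FMT_RGBA keeps getting RGBA, so existing users are
unaffected, and nothing in the function refers to a particular codec.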

> 2. Introduce a single opaque pixel format: this has the advantage of
> not having to update API at every new format, but leaves users in the
> dark to what the texture actually is, and requires to know what he or
> she is actually doing.
>

This is not helpful because knowing "what he or she is actually doing" for
formats such as TXD (which can contain several compressed texture types)
would require parsing encoded frames, which defeats the purpose of using
libavcodec in the first place.

For other codecs it would require API clients to maintain a hard-coded mapping
between codecs and compressed texture formats, so they would not
automatically support new codecs of these types added to libavcodec.

An AVFrame in this format would not contain all the information needed to
draw or otherwise process it.

> 3. Introduce a single opaque pixel format and a side data: as a
> variant of above, the side data would contain which variant of the
> texture and would let the user know how to deal with data without
> anything special.
>

An improvement on 2, but it still loses many of the advantages of expressing
compressed textures as a pixel format in the first place:

- Using CODEC_CAP_DR1 / AVCodecContext's get_buffer would become complicated
- The different formats would not be described by AVPixFmtDescriptor so
buffers couldn't be allocated using e.g. av_image_alloc() or ff_get_buffer(),
and checks would have to be added to return an error from those functions
- Would not work with FramePools
- If the pixel format is to be negotiated through get_format, this doesn't
allow API clients to opt in to only particular compressed texture formats -
it is likely they will only want to receive those formats that their
system/hardware supports
- Internally codecs would be duplicating code to correctly size and
allocate these buffers.
- Would require a new enum type to describe the compressed texture formats
- Would require extra documentation and tests

> 4. Write in the normal data buffers: instead of filling in rgba
> buffers with decoded data, the raw texture could be written in
> data[0], and the side data would help understand how to interpret it.
> This could be somewhat hacky since it's not something users would
> normally expect.
>

- Hacky
- If allocated normally, the buffers would be horribly oversized

> 5. Introduce refcounted side data: just write everything in a side
> data buffer and let the user act upon it on demand. Similar to idea 3,
> but without introducing new pixel formats. Could be potentially
>

Presumably coupled with a codec option to disable filling the RGBA buffers?

- Wouldn't work at all with CODEC_CAP_DR1 / AVCodecContext's get_buffer
- Wouldn't work with FramePools
- Would require further API changes
- Would require extra documentation
- Would require each codec that wants to emit these formats to add an option
to enable/disable it

> 6. Write in the 'special' data buffer : similar to what is done for
> paletted formats, write the texture in data[1], so that normal users
> don't have to worry about anything and special users might just access
> the appropriate buffers.
>

Most API users who want the compressed texture data will not want the RGBA
buffers as well - decoding to RGBA would needlessly use time and memory. This
would still require agreeing on and documenting a way to indicate the type of
compressed texture data the 'special' data buffer contains.

Cheers - Tom