[FFmpeg-devel] [PATCH v2 1/2] avformat: add muxer support for H266/VVC
Thomas Siedel
thomas.ff at spin-digital.com
Mon Jan 29 17:35:19 EET 2024
On Fri, 26 Jan 2024 at 16:22, Nuo Mi <nuomi2021 at gmail.com> wrote:
>
>
> On Fri, Jan 26, 2024 at 10:04 PM Thomas Siedel <thomas.ff at spin-digital.com>
> wrote:
>
>> Thanks for picking up the patch set!
>>
>> On Thu, 25 Jan 2024 at 13:26, Nuo Mi <nuomi2021 at gmail.com> wrote:
>>
>>> From: Thomas Siedel <thomas.ff at spin-digital.com>
>>>
>>> Add muxer for vvcc byte stream format.
>>> Add AV_CODEC_ID_VVC to ff_mp4_obj_type.
>>> Add AV_CODEC_ID_VVC to ISO Media codec (VvcConfigurationBox vvi1,
>>> vvc1 defined in ISO/IEC 14496-15:2021).
>>> Add VvcConfigurationBox vvcC which extends FullBox type in
>>> ISO/IEC 14496-15:2021.
>>> Add ff_vvc_muxer to RAW muxers.
>>>
>>> Tested with:
>>> ffmpeg -i NovosobornayaSquare_1920x1080.mp4 -c:v libvvenc test.mp4
>>> && ffmpeg -i test.mp4 -f null -
>>> ffmpeg -i NovosobornayaSquare_1920x1080.mp4 -c:v copy test.mp4
>>> && ffmpeg -i test.mp4 -f md5 -
>>>
>>> Signed-off-by: Thomas Siedel <thomas.ff at spin-digital.com>
>>> Co-Authored-By: Nuo Mi <nuomi2021 at gmail.com>
>>> ---
>>> libavformat/Makefile | 6 +-
>>> libavformat/isom.c | 1 +
>>> libavformat/isom_tags.c | 3 +
>>> libavformat/mov.c | 6 +
>>> libavformat/movenc.c | 41 +-
>>> libavformat/vvc.c | 971 ++++++++++++++++++++++++++++++++++++++++
>>> libavformat/vvc.h | 99 ++++
>>> 7 files changed, 1123 insertions(+), 4 deletions(-)
>>> create mode 100644 libavformat/vvc.c
>>> create mode 100644 libavformat/vvc.h
>>>
>>> diff --git a/libavformat/Makefile b/libavformat/Makefile
>>> index dcc99eeac4..05b9b8a115 100644
>>> --- a/libavformat/Makefile
>>> +++ b/libavformat/Makefile
>>> @@ -343,7 +343,7 @@ OBJS-$(CONFIG_MATROSKA_DEMUXER) +=
>>> matroskadec.o matroska.o \
>>> oggparsevorbis.o
>>> vorbiscomment.o \
>>> qtpalette.o replaygain.o
>>> dovi_isom.o
>>> OBJS-$(CONFIG_MATROSKA_MUXER) += matroskaenc.o matroska.o \
>>> - av1.o avc.o hevc.o \
>>> + av1.o avc.o hevc.o vvc.o\
>>> flacenc_header.o
>>> avlanguage.o \
>>> vorbiscomment.o wv.o
>>> dovi_isom.o
>>> OBJS-$(CONFIG_MCA_DEMUXER) += mca.o
>>> @@ -365,7 +365,7 @@ OBJS-$(CONFIG_MODS_DEMUXER) += mods.o
>>> OBJS-$(CONFIG_MOFLEX_DEMUXER) += moflex.o
>>> OBJS-$(CONFIG_MOV_DEMUXER) += mov.o mov_chan.o mov_esds.o
>>> \
>>> qtpalette.o replaygain.o
>>> dovi_isom.o
>>> -OBJS-$(CONFIG_MOV_MUXER) += movenc.o av1.o avc.o hevc.o
>>> vpcc.o \
>>> +OBJS-$(CONFIG_MOV_MUXER) += movenc.o av1.o avc.o hevc.o
>>> vvc.o vpcc.o \
>>> movenchint.o mov_chan.o
>>> rtp.o \
>>> movenccenc.o movenc_ttml.o
>>> rawutils.o \
>>> dovi_isom.o evc.o
>>> @@ -520,7 +520,7 @@ OBJS-$(CONFIG_RTP_MUXER) += rtp.o
>>> \
>>> rtpenc_vp8.o \
>>> rtpenc_vp9.o
>>> \
>>> rtpenc_xiph.o \
>>> - avc.o hevc.o
>>> + avc.o hevc.o vvc.o
>>> OBJS-$(CONFIG_RTSP_DEMUXER) += rtsp.o rtspdec.o httpauth.o
>>> \
>>> urldecode.o
>>> OBJS-$(CONFIG_RTSP_MUXER) += rtsp.o rtspenc.o httpauth.o
>>> \
>>> diff --git a/libavformat/isom.c b/libavformat/isom.c
>>> index 6d019881e5..9fbccd4437 100644
>>> --- a/libavformat/isom.c
>>> +++ b/libavformat/isom.c
>>> @@ -36,6 +36,7 @@ const AVCodecTag ff_mp4_obj_type[] = {
>>> { AV_CODEC_ID_MPEG4 , 0x20 },
>>> { AV_CODEC_ID_H264 , 0x21 },
>>> { AV_CODEC_ID_HEVC , 0x23 },
>>> + { AV_CODEC_ID_VVC , 0x33 },
>>> { AV_CODEC_ID_AAC , 0x40 },
>>> { AV_CODEC_ID_MP4ALS , 0x40 }, /* 14496-3 ALS */
>>> { AV_CODEC_ID_MPEG2VIDEO , 0x61 }, /* MPEG-2 Main */
>>> diff --git a/libavformat/isom_tags.c b/libavformat/isom_tags.c
>>> index a575b7c160..705811e950 100644
>>> --- a/libavformat/isom_tags.c
>>> +++ b/libavformat/isom_tags.c
>>> @@ -123,6 +123,9 @@ const AVCodecTag ff_codec_movvideo_tags[] = {
>>> { AV_CODEC_ID_HEVC, MKTAG('d', 'v', 'h', 'e') }, /* HEVC-based
>>> Dolby Vision derived from hev1 */
>>> /* dvh1 is handled
>>> within mov.c */
>>>
>>> + { AV_CODEC_ID_VVC, MKTAG('v', 'v', 'i', '1') }, /* VVC/H.266 which
>>> indicates parameter sets may be in ES */
>>> + { AV_CODEC_ID_VVC, MKTAG('v', 'v', 'c', '1') }, /* VVC/H.266 which
>>> indicates parameter shall not be in ES */
>>> +
>>> { AV_CODEC_ID_H264, MKTAG('a', 'v', 'c', '1') }, /* AVC-1/H.264 */
>>> { AV_CODEC_ID_H264, MKTAG('a', 'v', 'c', '2') },
>>> { AV_CODEC_ID_H264, MKTAG('a', 'v', 'c', '3') },
>>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>>> index 4cffd6c7db..cf931d4594 100644
>>> --- a/libavformat/mov.c
>>> +++ b/libavformat/mov.c
>>> @@ -2123,6 +2123,11 @@ static int mov_read_glbl(MOVContext *c,
>>> AVIOContext *pb, MOVAtom atom)
>>> if ((uint64_t)atom.size > (1<<30))
>>> return AVERROR_INVALIDDATA;
>>>
>>> + if (atom.type == MKTAG('v','v','c','C')) {
>>> + avio_skip(pb, 4);
>>> + atom.size -= 4;
>>> + }
>>> +
>>> if (atom.size >= 10) {
>>> // Broken files created by legacy versions of libavformat will
>>> // wrap a whole fiel atom inside of a glbl atom.
>>> @@ -8129,6 +8134,7 @@ static const MOVParseTableEntry
>>> mov_default_parse_table[] = {
>>> { MKTAG('s','g','p','d'), mov_read_sgpd },
>>> { MKTAG('s','b','g','p'), mov_read_sbgp },
>>> { MKTAG('h','v','c','C'), mov_read_glbl },
>>> +{ MKTAG('v','v','c','C'), mov_read_glbl },
>>> { MKTAG('u','u','i','d'), mov_read_uuid },
>>> { MKTAG('C','i','n', 0x8e), mov_read_targa_y216 },
>>> { MKTAG('f','r','e','e'), mov_read_free },
>>> diff --git a/libavformat/movenc.c b/libavformat/movenc.c
>>> index 8a27afbc57..40be71f3e0 100644
>>> --- a/libavformat/movenc.c
>>> +++ b/libavformat/movenc.c
>>> @@ -68,6 +68,7 @@
>>> #include "ttmlenc.h"
>>> #include "version.h"
>>> #include "vpcc.h"
>>> +#include "vvc.h"
>>>
>>> static const AVOption options[] = {
>>> { "brand", "Override major brand", offsetof(MOVMuxContext,
>>> major_brand), AV_OPT_TYPE_STRING, {.str = NULL}, .flags =
>>> AV_OPT_FLAG_ENCODING_PARAM },
>>> @@ -1473,6 +1474,23 @@ static int mov_write_evcc_tag(AVIOContext *pb,
>>> MOVTrack *track)
>>> return update_size(pb, pos);
>>> }
>>>
>>> +static int mov_write_vvcc_tag(AVIOContext *pb, MOVTrack *track)
>>> +{
>>> + int64_t pos = avio_tell(pb);
>>> +
>>> + avio_wb32(pb, 0);
>>> + ffio_wfourcc(pb, "vvcC");
>>> +
>>> + avio_w8 (pb, 0); /* version */
>>> + avio_wb24(pb, 0); /* flags */
>>> +
>>> + if (track->tag == MKTAG('v','v','c','1'))
>>> + ff_isom_write_vvcc(pb, track->vos_data, track->vos_len, 1);
>>> + else
>>> + ff_isom_write_vvcc(pb, track->vos_data, track->vos_len, 0);
>>> + return update_size(pb, pos);
>>> +}
>>> +
>>> /* also used by all avid codecs (dv, imx, meridien) and their variants
>>> */
>>> static int mov_write_avid_tag(AVIOContext *pb, MOVTrack *track)
>>> {
>>> @@ -2382,6 +2400,8 @@ static int mov_write_video_tag(AVFormatContext *s,
>>> AVIOContext *pb, MOVMuxContex
>>> avid = 1;
>>> } else if (track->par->codec_id == AV_CODEC_ID_HEVC)
>>> mov_write_hvcc_tag(pb, track);
>>> + else if (track->par->codec_id == AV_CODEC_ID_VVC)
>>> + mov_write_vvcc_tag(pb, track);
>>> else if (track->par->codec_id == AV_CODEC_ID_H264 &&
>>> !TAG_IS_AVCI(track->tag)) {
>>> mov_write_avcc_tag(pb, track);
>>> if (track->mode == MODE_IPOD)
>>> @@ -6170,6 +6190,7 @@ int ff_mov_write_packet(AVFormatContext *s,
>>> AVPacket *pkt)
>>> if ((par->codec_id == AV_CODEC_ID_DNXHD ||
>>> par->codec_id == AV_CODEC_ID_H264 ||
>>> par->codec_id == AV_CODEC_ID_HEVC ||
>>> + par->codec_id == AV_CODEC_ID_VVC ||
>>> par->codec_id == AV_CODEC_ID_VP9 ||
>>> par->codec_id == AV_CODEC_ID_EVC ||
>>> par->codec_id == AV_CODEC_ID_TRUEHD) && !trk->vos_len &&
>>> @@ -6235,6 +6256,18 @@ int ff_mov_write_packet(AVFormatContext *s,
>>> AVPacket *pkt)
>>> size = ff_hevc_annexb2mp4(pb, pkt->data, pkt->size, 0,
>>> NULL);
>>> }
>>> }
>>> + } else if (par->codec_id == AV_CODEC_ID_VVC && trk->vos_len > 6 &&
>>> + (AV_RB24(trk->vos_data) == 1 || AV_RB32(trk->vos_data) ==
>>> 1)) {
>>> + /* extradata is Annex B, assume the bitstream is too and convert
>>> it */
>>> + if (trk->hint_track >= 0 && trk->hint_track < mov->nb_tracks) {
>>> + ret = ff_vvc_annexb2mp4_buf(pkt->data, &reformatted_data,
>>> + &size, 0, NULL);
>>> + if (ret < 0)
>>> + return ret;
>>> + avio_write(pb, reformatted_data, size);
>>> + } else {
>>> + size = ff_vvc_annexb2mp4(pb, pkt->data, pkt->size, 0, NULL);
>>> + }
>>> } else if (par->codec_id == AV_CODEC_ID_AV1) {
>>> if (trk->hint_track >= 0 && trk->hint_track < mov->nb_tracks) {
>>> ret = ff_av1_filter_obus_buf(pkt->data, &reformatted_data,
>>> @@ -6281,6 +6314,9 @@ int ff_mov_write_packet(AVFormatContext *s,
>>> AVPacket *pkt)
>>> } else if(par->codec_id == AV_CODEC_ID_HEVC &&
>>> par->extradata_size > 21) {
>>> int nal_size_length = (par->extradata[21] & 0x3) + 1;
>>> ret = ff_mov_cenc_avc_write_nal_units(s, &trk->cenc,
>>> nal_size_length, pb, pkt->data, size);
>>> + } else if(par->codec_id == AV_CODEC_ID_VVC &&
>>> par->extradata_size > 21) {
>>> + int nal_size_length = (par->extradata[21] & 0x3) + 1;
>>>
>>
>> This is wrong for VVC (was noticed by James Almer in previous version of
>> the patch set). Instead, it should be this:
>> int nal_size_length = ((par->extradata[4]>>1) & 0x3) + 1;
>>
> Hi Thomas,
> Thank you for the reply.
> 4 is relatively small compared to 21. Is there any place where we can find
> details about CENC?
>
This just used the same approach as for HEVC, but with
nal_size_length adjusted for VVC.
After more thought, I think this was still wrong and should be this instead:
nal_size_length = ((par->extradata[0]>>1) & 0x3) + 1;
Unfortunately, I am not really familiar with the CENC part, so I am not sure
whether ff_mov_cenc_avc_write_nal_units() can work for VVC like this.
Perhaps it would be better to just remove this part for now.
Regarding nal_size_length, I based it on this:
spec: ISO/IEC 14496-15:2021(E)
Information technology — Coding of audio-visual objects — Part 15:
Carriage of network abstraction layer (NAL) unit structured video in the
ISO base media file format
in 11.2.4.2.2 Syntax
aligned(8) class VvcDecoderConfigurationRecord {
    bit(5) reserved = '11111'b;
    unsigned int(2) LengthSizeMinusOne;
    unsigned int(1) ptl_present_flag;
    if (ptl_present_flag) {
        ...
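
To make the bit layout above concrete, here is a minimal sketch (not part of
the patch) of reading LengthSizeMinusOne from the first byte of the record.
The helper name vvcc_nal_length_size is hypothetical, and the sketch assumes
the buffer starts at the first byte of VvcDecoderConfigurationRecord, i.e.
after the vvcC FullBox version/flags have been stripped:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an FFmpeg API.
 * Assumes record[0] is the first byte of VvcDecoderConfigurationRecord:
 *   bits 7..3: reserved ('11111'b)
 *   bits 2..1: LengthSizeMinusOne
 *   bit  0   : ptl_present_flag
 * Returns the NAL length field size in bytes (1..4), or -1 on empty input. */
static int vvcc_nal_length_size(const uint8_t *record, size_t size)
{
    if (!record || size < 1)
        return -1;
    /* Drop ptl_present_flag, then keep the two LengthSizeMinusOne bits. */
    return ((record[0] >> 1) & 0x3) + 1;
}
```

With this layout, a first byte of 0xFF (LengthSizeMinusOne = 3) gives 4 and
0xF8 (LengthSizeMinusOne = 0) gives 1. If the buffer still carried the 4-byte
FullBox header, the same byte would sit at index 4 instead of 0.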