[Ffmpeg-devel] [RFC] ZMBV encoder

Kostya kostya.shishkov
Tue Dec 5 13:17:11 CET 2006


On Mon, Dec 04, 2006 at 04:59:35PM +0100, Michael Niedermayer wrote:
> Hi
> 
> On Mon, Dec 04, 2006 at 08:13:57AM +0200, Kostya wrote:
> > Here is my ZMBV encoder for encoding palettized videos.
> 
> > Index: libavcodec/zmbvenc.c
> > ===================================================================
> > --- libavcodec/zmbvenc.c	(revision 0)
> > +++ libavcodec/zmbvenc.c	(revision 0)
> [...]
> > +
> > +#define ZMBV_KEYFRAME 1
> > +#define ZMBV_DELTAPAL 2
> 
> duplicate from zmbv.c

What do you suggest? Creating a separate header for two defines seems like overkill.
> 
> [...]
> > +/** Block comparing function
> > + * XXX should be optimized and moved to DSPContext
> > + * TODO handle out of edge ME
> > + */
> > +int block_cmp(uint8_t *src, int stride, uint8_t *src2, int stride2, int bw, int bh)
> > +{
> > +    int sum = 0;
> > +    int i, j;
> > +    for(j = 0; j < bh; j++){
> > +        for(i = 0; i < bw; i++)
> > +            sum += src[i] ^ src2[i];
> > +        src += stride;
> > +        src2 += stride2;
> > +    }
> > +    return sum;
> > +}
> 
> I don't think sum += src[i] ^ src2[i]; is optimal compression-wise ...
> _maybe_

Maybe, but on some files it compresses ~1% better than
 sum+=FFABS(src[i]-src2[i]);

> 
> sum += score[src[i] ^ src2[i]];
> and
> for(i=0; i<256; i++)
>     for(j=0; j<256; j++)
>         score[i^j] += ABS(i-j);
> 
> or even simply
> sum += ABS(src[i] - src2[i])
> 
> might be better, and of course a 
> 
> for() 
>     dst[i]= src[i] ^ src2[i]; 
> sum= lzw_size(dst, len);
> 
> might be interesting ...

Yes, but the code currently relies on the fact that a return value of zero
means the blocks are identical.
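
To make that constraint concrete: both the XOR sum and the FFABS sum are zero
exactly when the two blocks match byte for byte, while a size estimate such as
lzw_size() would presumably never return zero. A minimal sketch;
block_cmp_xor and block_cmp_sad are illustrative names, not functions from
the patch:

```c
#include <stdint.h>
#include <stdlib.h>

/* XOR cost, as in the patch: zero only when the blocks are identical. */
static int block_cmp_xor(const uint8_t *a, int astride,
                         const uint8_t *b, int bstride, int bw, int bh)
{
    int sum = 0;
    for (int j = 0; j < bh; j++) {
        for (int i = 0; i < bw; i++)
            sum += a[i] ^ b[i];
        a += astride;
        b += bstride;
    }
    return sum;
}

/* Sum of absolute differences: also zero only for identical blocks. */
static int block_cmp_sad(const uint8_t *a, int astride,
                         const uint8_t *b, int bstride, int bw, int bh)
{
    int sum = 0;
    for (int j = 0; j < bh; j++) {
        for (int i = 0; i < bw; i++)
            sum += abs(a[i] - b[i]);
        a += astride;
        b += bstride;
    }
    return sum;
}
```

The two costs rank mismatches differently, though: pixel values 7 and 8 cost
15 under XOR but only 1 under SAD.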
 
> 
> [...]
> > +        uint8_t tpal[3];
> > +        for(i = 0; i < 256; i++){
> > +            tpal[0] = (palptr[i] >> 16) & 0xFF;
> > +            tpal[1] = (palptr[i] >>  8) & 0xFF;
> > +            tpal[2] = palptr[i] & 0xFF;
> 
> the &0xFF is unneeded

fixed
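
The mask can go because assignment to a uint8_t already keeps only the low
8 bits. A small sketch, assuming palette entries are packed as 0xAARRGGBB in
a uint32_t; split_rgb is a hypothetical helper, not part of the patch:

```c
#include <stdint.h>

/* Split a packed palette entry into R, G, B bytes.
 * Converting to uint8_t reduces the value modulo 256,
 * so no explicit & 0xFF is required. */
static void split_rgb(uint32_t entry, uint8_t rgb[3])
{
    rgb[0] = entry >> 16;
    rgb[1] = entry >>  8;
    rgb[2] = entry;
}
```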
 
> [...]
> > +    }else{
> > +        int x, y, bh2, bw2;
> > +        uint8_t *tsrc, *tprev;
> > +        uint8_t *mv;
> > +        int bmvx, bmvy, bv, tx, ty, tv, dx, dy;
> > +
> > +        bw = (avctx->width + ZMBV_BLOCK - 1) / ZMBV_BLOCK;
> > +        bh = (avctx->height + ZMBV_BLOCK - 1) / ZMBV_BLOCK;
> > +        mv = c->work_buf + work_size;
> > +        memset(c->work_buf + work_size, 0, (bw * bh * 2 + 3) & ~3);
> > +        work_size += (bw * bh * 2 + 3) & ~3;
> > +        /* for now just XOR'ing */
> > +        for(y = 0; y < avctx->height; y += ZMBV_BLOCK) {
> > +            bh2 = FFMIN(avctx->height - y, ZMBV_BLOCK);
> > +            for(x = 0; x < avctx->width; x += ZMBV_BLOCK, mv += 2) {
> > +                bw2 = FFMIN(avctx->width - x, ZMBV_BLOCK);
> > +
> > +                tsrc = src + x;
> > +                tprev = prev + x;
> > +
> > +                bmvx = bmvy = 0;
> > +                bv = block_cmp(tsrc, p->linesize[0], tprev, c->pstride, ZMBV_BLOCK, ZMBV_BLOCK);
> > +                if(bv) for(ty = FFMAX(y - 16, 0); ty < FFMIN(y + 16, avctx->height - ZMBV_BLOCK); ty++){
> > +                    for(tx = FFMAX(x - 16, 0); tx < FFMIN(x + 16, avctx->width - ZMBV_BLOCK); tx++){
> 
> use avctx->me_range

done

> 
> > +                        if(tx == x && ty == y) continue; // we already tested this block
> > +                        dx = tx - x;
> > +                        dy = ty - y;
> > +                        tv = block_cmp(tsrc, p->linesize[0], tprev + dx + dy*c->pstride, c->pstride, ZMBV_BLOCK, ZMBV_BLOCK);
> > +                        if(tv < bv){
> > +                            bv = tv;
> > +                            bmvx = dx;
> > +                            bmvy = dy;
> > +                            if(!bv) break;
> > +                        }
> > +                    }
> > +                    if(!bv) break;
> > +                }
> 
> hmmmmm
> maybe try using the motion estimation code from lavc or at least implement
> an optional more practical variant ...
> 
> practical:
> take motion vector from left, top, top right, last frame and (0,0)
> try cmp on all, take best
> try (x,y+1), (x,y-1), (x+1,y) (x-1,y) where (x,y) is the best so far
> continue this until the best is in the middle of this small set in
> which case you are done
> 
> improvements:
> try median(top,left,top right)
> try right one from last frame and try bottom one from last frame

For now I have moved the ME code into a separate function so it will be easy
to experiment with. The current lavc ME scheme requires the encoder context to
contain an MpegEncContext, as motion estimation is tied to it.
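
For experimenting, the "practical" scheme suggested above could look roughly
like the sketch below. It is not lavc's ME code. The MV type, mv_cost(),
me_search() and the range parameter are illustrative names; the candidate
list (left, top, top right, last frame) is assumed to be supplied by the
caller, and the current block is assumed to lie fully inside the frame:

```c
#include <limits.h>
#include <stdint.h>

#define BLK 16 /* block size, same value as ZMBV_BLOCK in the patch */

typedef struct { int x, y; } MV;

/* XOR cost of the block at (x,y) in cur against (x+mv.x, y+mv.y) in prev.
 * Returns INT_MAX when the vector exceeds range or leaves the frame. */
static int mv_cost(const uint8_t *cur, const uint8_t *prev, int stride,
                   int w, int h, int x, int y, int range, MV mv)
{
    int rx = x + mv.x, ry = y + mv.y, sum = 0;

    if (mv.x < -range || mv.x > range || mv.y < -range || mv.y > range)
        return INT_MAX;
    if (rx < 0 || ry < 0 || rx + BLK > w || ry + BLK > h)
        return INT_MAX;
    for (int j = 0; j < BLK; j++)
        for (int i = 0; i < BLK; i++)
            sum += cur[(y + j) * stride + x + i] ^
                   prev[(ry + j) * stride + rx + i];
    return sum;
}

/* Test (0,0) and the candidate vectors, keep the best, then probe the four
 * neighbours of the best vector until none of them improves the cost. */
static int me_search(const uint8_t *cur, const uint8_t *prev, int stride,
                     int w, int h, int x, int y,
                     const MV *cand, int ncand, int range, MV *best)
{
    static const MV step[4] = { {0, 1}, {0, -1}, {1, 0}, {-1, 0} };
    MV zero = {0, 0};
    int bv, improved;

    *best = zero;
    bv = mv_cost(cur, prev, stride, w, h, x, y, range, zero);
    for (int i = 0; i < ncand; i++) {
        int v = mv_cost(cur, prev, stride, w, h, x, y, range, cand[i]);
        if (v < bv) { bv = v; *best = cand[i]; }
    }
    do {
        MV center = *best;
        improved = 0;
        for (int i = 0; i < 4; i++) {
            MV t = { center.x + step[i].x, center.y + step[i].y };
            int v = mv_cost(cur, prev, stride, w, h, x, y, range, t);
            if (v < bv) { bv = v; *best = t; improved = 1; }
        }
    } while (improved);
    return bv;
}
```

The refinement loop always terminates because the cost strictly decreases on
every accepted step. Unlike the exhaustive search, it can stop in a local
minimum, so results would need to be compared on real files.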

> [...]
> > +    c->width = avctx->width;
> > +    c->height = avctx->height;
> > +    c->curfrm = 0;
> > +    c->keyint = avctx->keyint_min;
> 
> why is this copied from avctx? the copied variants dont seem to be used
> much ...

Dropped unused variables
 
> [...]
> > +/*
> > + *
> > + * Uninit zmbv decoder
> > + *
> > + */
> 
> not doxygen compatible

fixed

> [...]
> 
> > +
> > +AVCodec zmbv_encoder = {
> > +    "zmbv",
> > +    CODEC_TYPE_VIDEO,
> > +    CODEC_ID_ZMBV,
> > +    sizeof(ZmbvEncContext),
> > +    encode_init,
> > +    encode_frame,
> > +    encode_end
> > +};
> 
> supported pix_fmt entry is missing

added

> [...]
> 
> -- 
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> In the past you could go to a library and read, borrow or copy any book
> Today you'd get arrested for mere telling someone where the library is
> 
-------------- next part --------------
Index: Changelog
===================================================================
--- Changelog	(revision 7203)
+++ Changelog	(working copy)
@@ -38,7 +38,7 @@
 - ADTS AAC file reading and writing
 - Creative VOC file reading and writing
 - American Laser Games multimedia (*.mm) playback system
-- Zip Blocks Motion Video decoder
+- Zip Blocks Motion Video decoder and encoder
 - Improved Theora/VP3 decoder
 - True Audio (TTA) decoder
 - AVS demuxer and video decoder
Index: libavcodec/zmbvenc.c
===================================================================
--- libavcodec/zmbvenc.c	(revision 0)
+++ libavcodec/zmbvenc.c	(revision 0)
@@ -0,0 +1,348 @@
+/*
+ * Zip Motion Blocks Video (ZMBV) encoder
+ * Copyright (c) 2006 Konstantin Shishkov
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ */
+
+/**
+ * @file zmbvenc.c
+ * Zip Motion Blocks Video encoder
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "common.h"
+#include "avcodec.h"
+#include "dsputil.h"
+
+#ifdef CONFIG_ZLIB
+#include <zlib.h>
+#endif
+
+#define ZMBV_KEYFRAME 1
+#define ZMBV_DELTAPAL 2
+
+#define ZMBV_BLOCK 16
+
+/*
+ * Encoder context
+ */
+typedef struct ZmbvEncContext {
+    AVCodecContext *avctx;
+    AVFrame pic;
+
+    int range;
+    uint8_t *comp_buf, *work_buf;
+    uint8_t pal[768];
+    uint32_t pal2[256]; //for quick comparisons
+    uint8_t *prev;
+    int pstride;
+    int comp_size;
+    int keyint, curfrm;
+#ifdef CONFIG_ZLIB
+    z_stream zstream;
+#endif
+} ZmbvEncContext;
+
+/** Block comparing function
+ * XXX should be optimized and moved to DSPContext
+ * TODO handle out of edge ME
+ */
+static inline int block_cmp(uint8_t *src, int stride, uint8_t *src2, int stride2, int bw, int bh)
+{
+    int sum = 0;
+    int i, j;
+
+    for(j = 0; j < bh; j++){
+        for(i = 0; i < bw; i++)
+            sum += src[i] ^ src2[i];
+        src += stride;
+        src2 += stride2;
+    }
+    return sum;
+}
+
+/** Motion estimation function
+ * TODO make better ME decisions
+ */
+static int zmbv_me(ZmbvEncContext *c, uint8_t *src, int sstride, uint8_t *prev, int pstride,
+                    int x, int y, int *mx, int *my)
+{
+    int dx, dy, tx, ty, tv, bv;
+
+    *mx = *my = 0;
+    bv = block_cmp(src, sstride, prev, pstride, ZMBV_BLOCK, ZMBV_BLOCK);
+    if(!bv) return 0;
+    for(ty = FFMAX(y - c->range, 0); ty < FFMIN(y + c->range, c->avctx->height - ZMBV_BLOCK); ty++){
+        for(tx = FFMAX(x - c->range, 0); tx < FFMIN(x + c->range, c->avctx->width - ZMBV_BLOCK); tx++){
+            if(tx == x && ty == y) continue; // we already tested this block
+            dx = tx - x;
+            dy = ty - y;
+            tv = block_cmp(src, sstride, prev + dx + dy*pstride, pstride, ZMBV_BLOCK, ZMBV_BLOCK);
+            if(tv < bv){
+                bv = tv;
+                *mx = dx;
+                *my = dy;
+                if(!bv) return 0;
+            }
+        }
+    }
+    return bv;
+}
+
+static int encode_frame(AVCodecContext *avctx, uint8_t *buf, int buf_size, void *data)
+{
+    ZmbvEncContext * const c = (ZmbvEncContext *)avctx->priv_data;
+    AVFrame *pict = data;
+    AVFrame * const p = &c->pic;
+    uint8_t *src, *prev;
+    uint32_t *palptr;
+#ifdef CONFIG_ZLIB
+    int zret = Z_OK; // Zlib return code
+#endif
+    int len = 0;
+    int keyframe, chpal;
+    int fl;
+    int work_size = 0;
+    int bw, bh;
+    int i, j;
+
+    keyframe = !c->curfrm;
+    c->curfrm++;
+    if(c->curfrm == c->keyint)
+        c->curfrm = 0;
+    *p = *pict;
+    p->pict_type= keyframe ? FF_I_TYPE : FF_P_TYPE;
+    p->key_frame= keyframe;
+    chpal = !keyframe && memcmp(p->data[1], c->pal2, 1024);
+
+    fl = (keyframe ? ZMBV_KEYFRAME : 0) | (chpal ? ZMBV_DELTAPAL : 0);
+    *buf++ = fl; len++;
+    if(keyframe){
+        deflateReset(&c->zstream);
+        *buf++ = 0; len++; // hi ver
+        *buf++ = 1; len++; // lo ver
+        *buf++ = 1; len++; // comp
+        *buf++ = 4; len++; // format - 8bpp
+        *buf++ = ZMBV_BLOCK; len++; // block width
+        *buf++ = ZMBV_BLOCK; len++; // block height
+    }
+    palptr = (uint32_t*)p->data[1];
+    src = p->data[0];
+    prev = c->prev;
+    if(chpal){
+        uint8_t tpal[3];
+        for(i = 0; i < 256; i++){
+            tpal[0] = palptr[i] >> 16;
+            tpal[1] = palptr[i] >>  8;
+            tpal[2] = palptr[i];
+            c->work_buf[work_size++] = tpal[0] ^ c->pal[i * 3 + 0];
+            c->work_buf[work_size++] = tpal[1] ^ c->pal[i * 3 + 1];
+            c->work_buf[work_size++] = tpal[2] ^ c->pal[i * 3 + 2];
+            c->pal[i * 3 + 0] = tpal[0];
+            c->pal[i * 3 + 1] = tpal[1];
+            c->pal[i * 3 + 2] = tpal[2];
+        }
+        memcpy(c->pal2, p->data[1], 1024);
+    }
+    if(keyframe){
+        for(i = 0; i < 256; i++){
+            c->pal[i*3 + 0] = palptr[i] >> 16;
+            c->pal[i*3 + 1] = palptr[i] >>  8;
+            c->pal[i*3 + 2] = palptr[i];
+        }
+        memcpy(c->work_buf, c->pal, 768);
+        memcpy(c->pal2, p->data[1], 1024);
+        work_size = 768;
+        for(i = 0; i < avctx->height; i++){
+            memcpy(c->work_buf + work_size, src, avctx->width);
+            src += p->linesize[0];
+            work_size += avctx->width;
+        }
+    }else{
+        int x, y, bh2, bw2;
+        uint8_t *tsrc, *tprev;
+        uint8_t *mv;
+        int mx, my, bv;
+
+        bw = (avctx->width + ZMBV_BLOCK - 1) / ZMBV_BLOCK;
+        bh = (avctx->height + ZMBV_BLOCK - 1) / ZMBV_BLOCK;
+        mv = c->work_buf + work_size;
+        memset(c->work_buf + work_size, 0, (bw * bh * 2 + 3) & ~3);
+        work_size += (bw * bh * 2 + 3) & ~3;
+        /* for now just XOR'ing */
+        for(y = 0; y < avctx->height; y += ZMBV_BLOCK) {
+            bh2 = FFMIN(avctx->height - y, ZMBV_BLOCK);
+            for(x = 0; x < avctx->width; x += ZMBV_BLOCK, mv += 2) {
+                bw2 = FFMIN(avctx->width - x, ZMBV_BLOCK);
+
+                tsrc = src + x;
+                tprev = prev + x;
+
+                bv = zmbv_me(c, tsrc, p->linesize[0], tprev, c->pstride, x, y, &mx, &my);
+                mv[0] = (mx << 1) | !!bv;
+                mv[1] = my << 1;
+                tprev += mx + my * c->pstride;
+                if(bv){
+                    for(j = 0; j < bh2; j++){
+                        for(i = 0; i < bw2; i++)
+                            c->work_buf[work_size++] = tsrc[i] ^ tprev[i];
+                        tsrc += p->linesize[0];
+                        tprev += c->pstride;
+                    }
+                }
+            }
+            src += p->linesize[0] * ZMBV_BLOCK;
+            prev += c->pstride * ZMBV_BLOCK;
+        }
+    }
+    /* save the previous frame */
+    src = p->data[0];
+    prev = c->prev;
+    for(i = 0; i < avctx->height; i++){
+        memcpy(prev, src, avctx->width);
+        prev += c->pstride;
+        src += p->linesize[0];
+    }
+
+#ifdef CONFIG_ZLIB
+    c->zstream.next_in = c->work_buf;
+    c->zstream.avail_in = work_size;
+    c->zstream.total_in = 0;
+
+    c->zstream.next_out = c->comp_buf;
+    c->zstream.avail_out = c->comp_size;
+    c->zstream.total_out = 0;
+    if((zret = deflate(&c->zstream, Z_SYNC_FLUSH)) != Z_OK){
+        av_log(avctx, AV_LOG_ERROR, "Error compressing data\n");
+        return -1;
+    }
+
+    memcpy(buf, c->comp_buf, c->zstream.total_out);
+    return len + c->zstream.total_out;
+#else
+    return -1;
+#endif
+}
+
+
+/**
+ * Init zmbv encoder
+ */
+static int encode_init(AVCodecContext *avctx)
+{
+    ZmbvEncContext * const c = (ZmbvEncContext *)avctx->priv_data;
+    int zret; // Zlib return code
+    int lvl = 9;
+
+    c->avctx = avctx;
+    avctx->has_b_frames = 0;
+
+    c->pic.data[0] = NULL;
+    c->curfrm = 0;
+    c->keyint = avctx->keyint_min;
+    c->range = 8;
+    if(avctx->me_range > 0)
+        c->range = FFMIN(avctx->me_range, 16);
+
+    if(avctx->compression_level >= 0)
+        lvl = avctx->compression_level;
+    if(lvl < 0 || lvl > 9){
+        av_log(avctx, AV_LOG_ERROR, "Compression level should be 0-9, not %i\n", lvl);
+        return 1;
+    }
+
+    if (avcodec_check_dimensions(avctx, avctx->width, avctx->height) < 0) {
+        return 1;
+    }
+
+#ifdef CONFIG_ZLIB
+    // Needed if zlib unused or init aborted before deflateInit
+    memset(&(c->zstream), 0, sizeof(z_stream));
+#else
+    av_log(avctx, AV_LOG_ERROR, "Zlib support not compiled.\n");
+    return 1;
+#endif
+    c->comp_size = avctx->width * avctx->height + 1024 +
+        ((avctx->width + ZMBV_BLOCK - 1) / ZMBV_BLOCK) * ((avctx->height + ZMBV_BLOCK - 1) / ZMBV_BLOCK) * 2 + 4;
+    if ((c->work_buf = av_malloc(c->comp_size)) == NULL) {
+        av_log(avctx, AV_LOG_ERROR, "Can't allocate work buffer.\n");
+        return 1;
+    }
+    /* Conservative upper bound taken from zlib v1.2.1 source via lcl.c */
+    c->comp_size = c->comp_size + ((c->comp_size + 7) >> 3) +
+                           ((c->comp_size + 63) >> 6) + 11;
+
+    /* Allocate compression buffer */
+    if ((c->comp_buf = av_malloc(c->comp_size)) == NULL) {
+        av_log(avctx, AV_LOG_ERROR, "Can't allocate compression buffer.\n");
+        return 1;
+    }
+    c->pstride = (avctx->width + 15) & ~15;
+    if ((c->prev = av_malloc(c->pstride * avctx->height)) == NULL) {
+        av_log(avctx, AV_LOG_ERROR, "Can't allocate picture.\n");
+        return 1;
+    }
+
+#ifdef CONFIG_ZLIB
+    c->zstream.zalloc = Z_NULL;
+    c->zstream.zfree = Z_NULL;
+    c->zstream.opaque = Z_NULL;
+    zret = deflateInit(&(c->zstream), lvl);
+    if (zret != Z_OK) {
+        av_log(avctx, AV_LOG_ERROR, "Deflate init error: %d\n", zret);
+        return 1;
+    }
+#endif
+
+    return 0;
+}
+
+
+
+/**
+ * Uninit zmbv encoder
+ */
+static int encode_end(AVCodecContext *avctx)
+{
+    ZmbvEncContext * const c = (ZmbvEncContext *)avctx->priv_data;
+
+    av_freep(&c->comp_buf);
+    av_freep(&c->work_buf);
+
+#ifdef CONFIG_ZLIB
+    deflateEnd(&(c->zstream));
+#endif
+    av_freep(&c->prev);
+
+    return 0;
+}
+
+AVCodec zmbv_encoder = {
+    "zmbv",
+    CODEC_TYPE_VIDEO,
+    CODEC_ID_ZMBV,
+    sizeof(ZmbvEncContext),
+    encode_init,
+    encode_frame,
+    encode_end,
+    .pix_fmts = (enum PixelFormat[]){PIX_FMT_PAL8, -1},
+};
+
Index: libavcodec/allcodecs.c
===================================================================
--- libavcodec/allcodecs.c	(revision 7203)
+++ libavcodec/allcodecs.c	(working copy)
@@ -155,7 +155,7 @@
     REGISTER_ENCODER(XVID, xvid);
 #endif
     REGISTER_ENCDEC (ZLIB, zlib);
-    REGISTER_DECODER(ZMBV, zmbv);
+    REGISTER_ENCDEC (ZMBV, zmbv);
 
     /* audio codecs */
 #ifdef CONFIG_FAAD
Index: libavcodec/Makefile
===================================================================
--- libavcodec/Makefile	(revision 7203)
+++ libavcodec/Makefile	(working copy)
@@ -167,6 +167,7 @@
 OBJS-$(CONFIG_ZLIB_DECODER)            += lcl.o
 OBJS-$(CONFIG_ZLIB_ENCODER)            += lcl.o
 OBJS-$(CONFIG_ZMBV_DECODER)            += zmbv.o
+OBJS-$(CONFIG_ZMBV_ENCODER)            += zmbvenc.o
 
 OBJS-$(CONFIG_PCM_S32LE_DECODER)       += pcm.o
 OBJS-$(CONFIG_PCM_S32LE_ENCODER)       += pcm.o
Index: libavcodec/avcodec.h
===================================================================
--- libavcodec/avcodec.h	(revision 7203)
+++ libavcodec/avcodec.h	(working copy)
@@ -37,8 +37,8 @@
 #define AV_STRINGIFY(s)         AV_TOSTRING(s)
 #define AV_TOSTRING(s) #s
 
-#define LIBAVCODEC_VERSION_INT  ((51<<16)+(25<<8)+0)
-#define LIBAVCODEC_VERSION      51.25.0
+#define LIBAVCODEC_VERSION_INT  ((51<<16)+(26<<8)+0)
+#define LIBAVCODEC_VERSION      51.26.0
 #define LIBAVCODEC_BUILD        LIBAVCODEC_VERSION_INT
 
 #define LIBAVCODEC_IDENT        "Lavc" AV_STRINGIFY(LIBAVCODEC_VERSION)
@@ -2303,6 +2303,7 @@
 extern AVCodec bmp_decoder;
 extern AVCodec mmvideo_decoder;
 extern AVCodec zmbv_decoder;
+extern AVCodec zmbv_encoder;
 extern AVCodec avs_decoder;
 extern AVCodec smacker_decoder;
 extern AVCodec smackaud_decoder;
Index: doc/ffmpeg-doc.texi
===================================================================
--- doc/ffmpeg-doc.texi	(revision 7203)
+++ doc/ffmpeg-doc.texi	(working copy)
@@ -989,7 +989,7 @@
 @item Fraps FPS1             @tab     @tab  X @tab
 @item CamStudio              @tab     @tab  X @tab fourcc: CSCD
 @item American Laser Games Video  @tab    @tab X @tab Used in games like Mad Dog McCree
-@item ZMBV                   @tab     @tab  X @tab
+@item ZMBV                   @tab   X @tab  X @tab Encoder works only on PAL8
 @item AVS Video              @tab     @tab  X @tab Video encoding used by the Creature Shock game.
 @item Smacker Video          @tab     @tab  X @tab Video encoding used in Smacker.
 @item RTjpeg                 @tab     @tab  X @tab Video encoding used in NuppelVideo files.


