[FFmpeg-devel] [PATCH] adpcm: Collapse nodes with similar state only after checking ssd
Michael Niedermayer
michaelni
Wed Nov 17 00:29:58 CET 2010
On Mon, Nov 15, 2010 at 03:47:20PM +0200, Martin Storsj? wrote:
> Hi,
>
> In the attached patch, I shuffled around some checks in the adpcm trellis
> code, moving the costly check for collapsing nodes with similar state to
> after checking if the ssd is worse than the one to be replaced.
>
> With this patch in place, the run time for -trellis 8 on one test sample
> drops from 158 to 79 seconds, on my machine.
>
> Runtime vs PSNR graphs are available at
> http://albin.abo.fi/~mstorsjo/adpcm-graphs2/. For the music1 and music2
> samples, this gives quite clear improvments in performance. For the
> trailer sample, I notice that all three algorithms seem to converge to the
> same output quality at the end.
>
> // Martin
> adpcm.c | 28 ++++++++++++++++------------
> 1 file changed, 16 insertions(+), 12 deletions(-)
> a8a50c076fb95bc35637592b1a192d374e4862a8 0001-adpcm-Collapse-nodes-with-similar-state-only-after-c.patch
> From 66a48591cd79d9b4269efe272fd7591bb5036975 Mon Sep 17 00:00:00 2001
> From: Martin Storsjo <martin at martin.st>
> Date: Sun, 14 Nov 2010 12:41:06 +0200
> Subject: [PATCH] adpcm: Collapse nodes with similar state only after checking the ssd
>
> If the heap is full, insertion often is skipped if the ssd is larger
> than for the node to be replaced.
>
> Since we know the size of the heap, we don't need to check if
> node_next[k] is non-null in each iteration.
>
> For -trellis 8, this lowers the run time from 158 to 79 seconds,
> for a 30 second 44 kHz mono sample, on my machine.
> ---
> libavcodec/adpcm.c | 28 ++++++++++++++++------------
> 1 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/libavcodec/adpcm.c b/libavcodec/adpcm.c
> index 56eb602..f37a104 100644
> --- a/libavcodec/adpcm.c
> +++ b/libavcodec/adpcm.c
> @@ -370,29 +370,33 @@ static void adpcm_compress_trellis(AVCodecContext *avctx, const short *samples,
> #define STORE_NODE(NAME, STEP_INDEX)\
> int d;\
> uint32_t ssd;\
> - int pos;\
> + int pos, heap_size = FFMIN(heap_pos, frontier);\
> TrellisNode *u;\
> dec_sample = av_clip_int16(dec_sample);\
> d = sample - dec_sample;\
> ssd = nodes[j]->ssd + d*d;\
> - /* Collapse any two states with the same previous sample value. \
> - * One could also distinguish states by step and by 2nd to last
> - * sample, but the effects of that are negligible. */\
> - for(k=0; k<frontier && nodes_next[k]; k++) {\
> - if(dec_sample == nodes_next[k]->sample1) {\
> - assert(ssd >= nodes_next[k]->ssd);\
> - goto next_##NAME;\
> - }\
> - }\
> if (heap_pos < frontier) {\
> - pos = heap_pos++;\
> + pos = heap_pos;\
> } else {\
> /* Try to replace one of the leaf nodes with the new \
> * one, but try a different slot each time. */\
> - pos = (frontier >> 1) + (heap_pos++ & ((frontier >> 1) - 1));\
> + pos = (frontier >> 1) + (heap_pos & ((frontier >> 1) - 1));\
> if (ssd > nodes_next[pos]->ssd)\
> goto next_##NAME;\
> }\
> + /* Collapse any two states with the same previous sample value. \
> + * One could also distinguish states by step and by 2nd to last \
> + * sample, but the effects of that are negligible. \
> + * Since nodes in the previous generation are iterated \
> + * through a heap, they're roughly ordered from better to \
> + * worse, but not strictly ordered. Therefore, an earlier \
> + * node with the same sample value is better in most cases \
> + * (and thus the current is skipped), but not strictly \
> + * in all cases. */ \
> + for (k = 0; k < heap_size; k++)\
> + if (dec_sample == nodes_next[k]->sample1)\
> + goto next_##NAME;\
should this not check ssd too, so that better ones are kept?
also a hashtable could be used to speed this up in theory
like:
h=&hash[dec_sample];
if(h->generation == generation && h->ssd < ssd){
goto next_##NAME;
}
h->generation= generation;
h->ssd= ssd
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101117/51573f47/attachment.pgp>
More information about the ffmpeg-devel
mailing list