[FFmpeg-devel] pre discussion around Blackfin dct_quantize_bfin routine
Reimar Doeffinger
Reimar.Doeffinger
Tue Jun 12 11:28:09 CEST 2007
Hello,
On Mon, Jun 11, 2007 at 10:58:00PM -0400, Marc Hoffman wrote:
> This is my rough draft it works well with the small precision errors
> due to 16bit arithmetic, actually its the same (as far as I can tell)
> as the MMX quantizer. I ran my reference fixpoint against the mmx and
> it seems to work on the cases I tried, its 1 bit off which seems
> about right considering the truncation of the coeffs from 22 to
> 16bits. Can someone look over my shoulder and give a peak. I know I
> have to clean up the codes a bit but I wanted to get someone else to
> review my work up until now. I have used inline asm, Michael this
> should be pretty straight forward for you to get your head around.
> This doubles the performance roughly.
Hmm. Ok, most is cleanup suggestions so I might be telling you things
you already know, still...
> #define clock() ({ int _t; asm volatile ("%0=cycles;" : "=d" (_t)); _t; })
If you leave that in the final version you should probably use a less
collision-freindly name like bfin_clock or so.
Also that ({ }) construction is a nasty gcc-ism, you could avoid it by
making it a static always_inline fuction, though that increases the risk
of the compiler messing up...
And actually, maybe you could just implement START_TIMER and STOP_TIMER
in libavutil (or does that do something else)?
If not, maybe naming it START_PROF and END_PROF is at least more
consistent, and least the EPROF name is not quite obvious in its meaning
;-)
[...]
> av_log (0,0,"%-20s: %12.4f\t%12.4f\n", TelemNames[i],v,v/64);
> av_log (0,0,"%-20s: %12.4f\t%12.4f\n%20.4f\t%d\n", "total",s/TelemCnt,s/TelemCnt/64,s,TelemCnt);
it might have the same result but using NULL and one of the loglevel
defines as first av_log arguments instead of 0 looks probably better
> #define abs(x) (((x)<0)?-(x):(x))
I think FFABS from libavutil does exactly the same...
> #define L1CODE __attribute__ ((l1_text))
for consistency, I'd name it e.g. attribute_l1_text
[...]
> /* for(i=start_i; i<64; i++) { */
> /* sign = (block[i]>>15)|1; */
> /* level = ((abs(block[i])+bias[0])*qmat[i])>>16; */
> /* if (level < 0) level = 0; */
> /* max |= level; */
> /* level = level * sign; */
> /* block[i] = level; */
> /* } */
below you put the C code to the right of the asm. I'd say make it
consistent (possibly just remove it, but if not C is not necessarily the
best way to annotate it).
Unfortunately I don't know what it does so I'm not sure it really makes
sense, but I'd suggest splitting the long asm lines, e.g.:
> " r1=abs r1 (v) || r2=[%2++];\n\t"
vs.
> " r1=abs r1 (v)"
> " || r2=[%2++];\n\t"
or
> " r1.h=(a1 =r1.h*r2.h), r1.l=(a0 =r1.l*r2.l) (tfu); \n\t"
vs.
> " r1.h=(a1 =r1.h*r2.h),
> " r1.l=(a0 =r1.l*r2.l)
> "(tfu); \n\t"
The best way of doing this of course depends on the actual meaning of
this, and my suggestions might be rather stupid thus ;-)
Greetings,
Reimar Doeffinger
More information about the ffmpeg-devel
mailing list