[FFmpeg-devel] [PATCH] support encrypted asf
Reimar Döffinger
Reimar.Doeffinger
Wed Oct 10 00:05:23 CEST 2007
Hello,
On Tue, Oct 09, 2007 at 09:36:07PM +0200, Michael Niedermayer wrote:
> On Tue, Oct 09, 2007 at 07:53:04PM +0200, Reimar D?ffinger wrote:
> > except for the missing license headers attached is the "final" (pending
> > comments/objections) version.
> > Please comment.
> [...]
> > +/**
> > + * \brief find multiplicative inverse modulo 2 ^ 32
> > + * \param v number to invert, must be odd!
> > + * \return number so that result * v = 1 (mod 2^32)
> > + */
> > +static uint32_t inverse(uint32_t v) {
> > + // this initialization reduces the number of iterations
> > + // to 3 compared to 4 for = 2 - v and 5 for = 1
> > + uint32_t inverse = v * v * v;
> > + // uses a fixpoint-iteration to find the inverse very quickly
> > + int i = 3;
> > + do {
> > + inverse *= 2 - v * inverse;
> > + } while (--i);
>
> is the resulting code (ever) smaller than if its unrolled?
gcc does not generate an non-unrolled version for me except with -O0
which would be an unfair comparison, but the unrolled code is 13
instructions (32 bytes) and the loop for i = 5 (gcc does not unroll then
with -O2) has 7 instructions (19 bytes) on x86_64.
The idea was to make it clearer it is an iteration, but I will write it
out along with improved comments.
> > +/**
> > + * \brief read keys from keybuf into keys
> > + * \param keybuf buffer of 48 bytes containing the keys
> > + * \param keys output key array of size 12 ints containing the keys
> > + * in native endianness for encryption
> > + */
> > +static void multiswap_init(const uint8_t *keybuf, uint32_t *keys) {
>
> nitpick:
> static void multiswap_init(const uint8_t keybuf[48], uint32_t keys[12]) {
Ok, and simplified the comments accordingly.
[... DES code ...]
> hmm, this doesnt look remotely optimal, maybe you should look at
> john the ripper IIRC it contains quite fast DES code. though surely far too
> messy for us to use directly but i guess the ideas used in it could be
> applied to our code as well
DES is a mess as it is, it seems unlikely it well ever get any new use,
it was designed for hardware that was a factor 100 slower than todays,
the fast implementations I have seen use several KB of messy tables, and
for ASF it is used to decode about 8 bytes out of over 1000.
I really don't see any point in optimizing the _speed_ of this thing,
there really are enough fast DES implementations, this one seems rather
unique in that it uses less than 1 KB of tables and is very close to the
specification, I would very much prefer to keep it like that (well, at
least the general idea, not in the details).
[...RC4...]
> and i dont think the use of uint8_t as index of arrays is very fast
> have you benchmarked that against normal int &0xFF ?
no, though I had a look at the code gcc generates, if I use long there,
it does not do any and operation, instead it uses movzbl etc. anyway (so
it basically uses uint8_t itself).
It seems that it's optimizer works better though and produces simpler
code.
I must admit that besides my all-time mantra of simple code (to me at least
;-) ) this is also uint8_t because I started implementing it for an 8
bit microcontroller (though I likely won't use it, AES is fast and small
enough for now).
> also des/rc4/asfcrypt belong to seperate commits/patches
Will split it before next round.
Greetings,
Reimar D?ffinger
More information about the ffmpeg-devel
mailing list