[FFmpeg-devel] [PATCH] strict-aliasing-safe aes.c
Michael Niedermayer
michaelni
Wed Jun 30 00:17:06 CEST 2010
On Wed, Jun 30, 2010 at 12:05:26AM +0200, Reimar D?ffinger wrote:
> On Tue, Jun 29, 2010 at 11:29:42PM +0200, Michael Niedermayer wrote:
> > On Tue, Jun 29, 2010 at 07:27:44PM +0200, Reimar D?ffinger wrote:
> > > On Tue, Jun 29, 2010 at 06:16:03PM +0200, Reimar D?ffinger wrote:
> > > > On Tue, Jun 29, 2010 at 05:10:09PM +0100, M?ns Rullg?rd wrote:
> > > > > Reimar D?ffinger <Reimar.Doeffinger at gmx.de> writes:
> > > > > >> Are these API changes necessary? Can't the rest of the fixes be done
> > > > > >> without this?
> > > > > >
> > > > > > No, they are not necessary, however I am in favour of at least having
> > > > > > av_aes_block available to allow for easy allocation of data
> > > > > > with suitable alignment.
> > > > >
> > > > > Exposing internals like that is less than ideal. Simply documenting
> > > > > the 8-byte alignment requirement should be enough.
> > > >
> > > > I guess I agree, I'll make the change if the idea is ok'd in principle.
> > >
> > > Oh well, I did it anyway already since I noticed I broke the more-than-one
> > > block case.
> >
> > > aes.c | 74 +++++++++++++++++++++++++++++++++++++-----------------------------
> > > 1 file changed, 42 insertions(+), 32 deletions(-)
> > > 172e32357bec3f033440078a57b908ff696472d7 aesalias.diff
> >
> > if its confirmed that there is no speedloss and the object file does not
> > become larger than i wont object to the patch. But its still ugly
>
> Well, I had to test with gcc-4.3, since 4.4 does not produce working code...
> Pure assembler size increases from 3957 bytes to 4025, for what I can tell
> mostly because gcc randomly decides to use some extra bytes for the local
> variables, which just makes it having to use 4-byte instead of 1-byte offsets
> to address function parameters, which in turn then causes some alignment
> to increase as well.
> As for speed, with aes-test the results seem to indicate a surprising
> 2% performance advantage for the new code (I'd expect due to some
> other random code or data re-shuffling):
> old:
> 13377 dezicycles in aes, 8191 runs, 1 skips
> 13387 dezicycles in aes, 8192 runs, 0 skips
> 13368 dezicycles in aes, 8191 runs, 1 skips
> 13360 dezicycles in aes, 8191 runs, 1 skips
> 14279 dezicycles in aes, 8191 runs, 1 skips
>
> new:
> 12929 dezicycles in aes, 8191 runs, 1 skips
> 12926 dezicycles in aes, 8191 runs, 1 skips
> 12916 dezicycles in aes, 8191 runs, 1 skips
> 12913 dezicycles in aes, 8191 runs, 1 skips
> 12973 dezicycles in aes, 8191 runs, 1 skips
>
> I'd welcome someone else to test.
> Does this seem good enough to you?
commit it before i change my mind
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I have never wished to cater to the crowd; for what I know they do not
approve, and what they approve I do not know. -- Epicurus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20100630/228bd327/attachment.pgp>
More information about the ffmpeg-devel
mailing list