[FFmpeg-devel] [patch v4, v5] libpostproc: mmx code uses stack below %esp
Michael Niedermayer
michaelni at gmx.at
Sat Sep 24 01:32:24 CEST 2011
On Tue, Sep 20, 2011 at 09:40:48PM +0200, Michael Niedermayer wrote:
> On Tue, Feb 02, 2010 at 09:57:26PM +0100, Michael Niedermayer wrote:
> > On Tue, Feb 02, 2010 at 11:55:46PM +0300, Yuriy Kaminskiy wrote:
> > > On 02.02.2010 23:12, Michael Niedermayer wrote:
> > > > On Tue, Feb 02, 2010 at 11:08:01PM +0300, Yuriy Kaminskiy wrote:
> > > >> On 02.02.2010 22:20, Michael Niedermayer wrote:
> > > >>> On Tue, Feb 02, 2010 at 04:47:44PM +0300, Yuriy Kaminskiy wrote:
> > > >>> After reading over this again, I think the best solution would be to
> > > >>> use the context as temporary space. We have a "m"(c->pQPb) anyway, so if
> > > >>> we put a pointer to the context into a register we could address pQPb
> > > >>> and the aligned temp space easily.
> > > >>> That way we don't need to create any aligned space on the stack ...
> > > >> Eww, but context here is actually "aligned space on stack" (I think that's why
> > > >> some code can work at all: else gcc would die with "not enough registers" even
> > > >> without -fPIC):
> > > >> ==== cut postprocess_template.c:3163 ===
> > > >> static void RENAME(postProcess)(const uint8_t src[], int srcStride, uint8_t
> > > >> dst[], int dstStride, int width, int height,
> > > >> const QP_STORE_T QPs[], int QPStride, int
> > > >> isColor, PPContext *c2)
> > > >> {
> > > >> DECLARE_ALIGNED(8, PPContext, c)= *c2; //copy to stack for faster access
> > > > hmm, i forgot that ...
> > > > anyway does gcc add additional instructions with your code to align the new
> > > > variable or not?
> > > hmm. no, it seems to use an "aligned" offset, but does not /realign/ the stack,
> > > so if the stack /was/ misaligned ($esp % 8 != 0) - it will be bad.
> > > on the other hand, DECLARE_ALIGNED (with the same alignment) is already used for
> > > this ppcontext copy, so i doubt it will make the situation worse (and i doubt it
> > > is possible - such a misaligned stack would be bad for double variables [I know
> > > that larger alignment - 16 bytes for SSE - is certainly problematic on some OSes])
> >
> > then patch ok
>
> v5 applied; if it fails somewhere I'll switch to v4
v5 fails for Ivan with
libpostproc/postprocess_template.c: In function 'dering_MMX2':
libpostproc/postprocess_template.c:1045:5: error: can't find a register in class 'GENERAL_REGS' while reloading 'asm'
libpostproc/postprocess_template.c:1045:5: error: 'asm' operand has impossible constraints
v4 works for him, so I will switch to v4
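
For readers following the thread: the error above is gcc running out of
general-purpose registers for the asm operands, which is easy to hit on
32-bit x86. The idea quoted earlier (one pointer to the context, with pQPb
and the temporary space both addressed relative to it) can be sketched in
plain C roughly as below. This is only an illustration and not the code from
either patch: DemoContext, tmp, load_at, store_at, is_aligned and demo_filter
are hypothetical names; only pQPb comes from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PPContext. Only the field name pQPb is taken
 * from the thread; the scratch array is an assumption for illustration. */
typedef struct DemoContext {
    uint64_t pQPb;    /* value the asm already reads via "m"(c->pQPb) */
    uint64_t tmp[4];  /* scratch space kept inside the context */
} DemoContext;

/* Load 8 bytes at base+offset, mimicking a disp(%reg) memory operand. */
static uint64_t load_at(const void *base, size_t offset)
{
    uint64_t v;
    memcpy(&v, (const uint8_t *)base + offset, sizeof v);
    return v;
}

/* Store 8 bytes at base+offset, mimicking a disp(%reg) memory operand. */
static void store_at(void *base, size_t offset, uint64_t v)
{
    memcpy((uint8_t *)base + offset, &v, sizeof v);
}

/* The check discussed in the thread: is an address a multiple of
 * `alignment`, i.e. (address % alignment) == 0? */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Every access goes through the single base pointer `c` plus a
 * compile-time offset, so only one register would be needed to reach
 * both pQPb and the scratch space. */
static uint64_t demo_filter(DemoContext *c)
{
    const size_t off_qp  = offsetof(DemoContext, pQPb);
    const size_t off_tmp = offsetof(DemoContext, tmp);

    /* The scratch slots sit at a multiple of 8 bytes from the base. */
    assert(off_tmp % 8 == 0);

    uint64_t qp = load_at(c, off_qp);
    store_at(c, off_tmp, qp + 1);                     /* spill one temporary */
    store_at(c, off_tmp + sizeof(uint64_t), qp * 2);  /* spill a second one */
    return load_at(c, off_tmp) + load_at(c, off_tmp + sizeof(uint64_t));
}
```

The offsets are constants, so in inline asm they would become displacements
on one base register instead of separate memory operands. Whether the scratch
slots end up 8-byte aligned at run time still depends on the alignment of the
context itself, which is the point raised in the quoted discussion.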
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
I do not agree with what you have to say, but I'll defend to the death your
right to say it. -- Voltaire