[FFmpeg-devel] [PATCH] lavfi/hqdn3d/x86: use full register for lowpass.

Michael Niedermayer michaelni at gmx.at
Sat Sep 22 03:33:40 CEST 2012


On Fri, Sep 21, 2012 at 03:45:00AM +0000, Loren Merritt wrote:
> On Fri, 21 Sep 2012, Michael Niedermayer wrote:
> 
> > On Wed, Sep 19, 2012 at 05:08:10PM +0200, Nicolas George wrote:
> > > LOWPASS starts using the full register for prevsample and cursample,
> > > but currently only returns a valid 32-bits register; later calls will
> > > reuse the value as full register, leading to possibly incorrect
> > > high-order bits.
> > >
> > > Fix trac ticket #1752.
> > >
> > > Signed-off-by: Nicolas George <nicolas.george at normalesup.org>
> > > ---
> > >  libavfilter/x86/hqdn3d.asm |    4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > >
> > > I am rather out of my league here. Using gdb, I observed this for the second
> > > line of the offending sample:
> > >
> > > "LOWPASS   t1, pixelant, spatial" becomes:
> > >
> > > sub    %r10,%rbx              rbx = 0xfffffffffffffff9 = -7
> > > sar    $0x4,%rbx              rbx = 0xffffffffffffffff = -1
> > > movswl (%r9,%rbx,2),%ebx      rbx = 0xfffffff8 = (uint32_t)-8
> > > add    %r10d,%ebx             rbx = 0xffffffff
> > >
> > > t1 is r10, pixelant is rbx.
> > >
> > > Then, a few lines below "LOWPASS   t0, t1, temporal" becomes:
> > >
> > > sub    %rbx,%r11              r11 = 0xffffffff00000001
> > > sar    $0x4,%r11              r11 = 0xfffffffff0000000
> > > movswl (%rax,%r11,2),%r11d    SEGV
> > > add    %ebx,%r11d
> > >
> > > The first subtraction on rbx seems to result in an incorrect sign extension,
> > > leading to a segfault when it is used as an index in the "temporal" array.
> > >
> > >
> > > diff --git a/libavfilter/x86/hqdn3d.asm b/libavfilter/x86/hqdn3d.asm
> > > index 88b9b0d..324642d 100644
> > > --- a/libavfilter/x86/hqdn3d.asm
> > > +++ b/libavfilter/x86/hqdn3d.asm
> > > @@ -27,8 +27,8 @@ SECTION .text
> > >  %if lut_bits != 8
> > >      sar    %1q, 8-lut_bits
> > >  %endif
> > > -    movsx  %1d, word [%3q+%1q*2]
> > > -    add    %1d, %2d
> > > +    movsx  %1q, word [%3q+%1q*2]
> > > +    add    %1q, %2q
> > >  %endmacro
> 
> This value should never be negative.
> Different fix that more closely tracks the fact that pixels are unsigned:

patch applied

thanks

PS: another way to fix this would be to make LUT_BITS 8 always, so
each table entry would only serve one case, but that seems slower ...


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Why not whip the teacher when the pupil misbehaves? -- Diogenes of Sinope
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20120922/8f4a0c52/attachment.asc>


More information about the ffmpeg-devel mailing list