[FFmpeg-devel] [PATCH] Fixed PNG decoding with Paeth filter for RGBA 64-bits.
Michael Niedermayer
michaelni at gmx.at
Sun Dec 7 05:39:16 CET 2014
On Fri, Dec 05, 2014 at 12:20:31AM +0000, Dominique Leroux wrote:
> Found and fixed an artifact in the last column of decoded RGBA 64-bits PNG images.
>
> The code was dealing with a SIMD-optimized version of the function without taking into account that we can have RGB/RGBA images that are respectively 6 or 8 bytes per pixel (not just 3 or 4).
>
> Dominique
> Changelog | 1 +
> libavcodec/pngdec.c | 16 ++++++++++++----
> libavcodec/pngdsp.c | 7 +++++--
> libavcodec/pngdsp.h | 5 +++++
> libavcodec/x86/pngdsp_init.c | 17 +++++++++++------
> 5 files changed, 34 insertions(+), 12 deletions(-)
> a3c13eb0d8d35a5f23a05c3005b7792893f1dc2f 0001-Fixed-PNG-decoding-with-Paeth-filter-for-RGBA-64-bit.patch
> From 5d3b88d2a9bad1f9e9431b6244493cbc6dda5701 Mon Sep 17 00:00:00 2001
> From: Dominique Leroux <dominique.leroux at autodesk.com>
> Date: Thu, 4 Dec 2014 19:05:44 -0500
> Subject: [PATCH] Fixed PNG decoding with Paeth filter for RGBA 64-bits.
>
> I noticed an artifact on the resulting image when passing a RGBA 64-bits PNG
> through the copy filter. The last column, although obviously "derived" from the
> original image's last column (i.e. not completely garbage), had way too much
> yellow in it; a telltale sign of an off-by-X somewhere. Original scenario used
> the scale filter, but taking it out was still exhibiting the problem.
>
> So my boiled-down test consisted in running:
>
> ./ffmpeg.exe -i original_rgba64.png -vf copy result.png
>
> and inspecting the result image with a viewer. I realized the image had some
> lines filtered with Paeth, and only those lines would get the artifact on their
> rightmost pixel.
>
> In libavcodec/pngdec.c, png_filter_row() has buffer-sizing arithmetic for
> feeding the Paeth prediction function with a buffer that has a multiple of 4
> bytes. This is done so the optimized version of the algorithm, using MMX/SSE3,
> can work on 4 bytes at a time without causing a write past the end of the buffer
> when writing back into the result buffer. This arithmetic assumed that
> situations where the buffer is improperly sized were limited to RGB 24-bits
> (bpp, or bytes per pixel, = 3) and that all other formats had buffers properly
> sized for SIMD, i.e. a multiple of 4.
>
> There are 2 problems with this:
>
> . bpp == 8 (RGBA 64-bits, my case) also has the proper size for SIMD. But it
> got handled as if it had a bpp of 3 because the code was testing for bpp == 4
> (with ternary operator's operands reversed).
> . bpp == 6 (RGB 48-bits) can also cause writes past the end of the buffer when
> handling the last 2 of 6 bytes for an image that has an odd number of columns.
>
> So I am reformulating the SIMD-adjusted width to ignore the trailing 3 (or less)
> bytes when the total width is not a multiple of 4. These bytes are fed to the
> unoptimized version of the algorithm.
>
> I was tempted to apply this to bpp == 1 as well (i.e. take out the bpp > 2 test
> in the enclosing if). But looking at pngdec.c's history, this has been done
> earlier and differences between the results of the optimized and non-optimized
> versions of the algorithm were observed. So I am leaving this untouched.
>
> It then looked to me that knowing the SIMD stride requirements of the x86
> platform implementation was a bit of a long-distance coupling between the
> platform-independent pngdec.c and the x86-specific x86/pngdsp.asm. So I
> abstracted this out through a pair of members in the PNGDSPContext structure:
> SIMD stride as well as a mask for adjusting the buffer size.
>
> Signed-off-by: Dominique Leroux <dominique.leroux at autodesk.com>
> ---
> Changelog | 1 +
> libavcodec/pngdec.c | 16 ++++++++++++----
> libavcodec/pngdsp.c | 7 +++++--
> libavcodec/pngdsp.h | 5 +++++
> libavcodec/x86/pngdsp_init.c | 17 +++++++++++------
> 5 files changed, 34 insertions(+), 12 deletions(-)
this is not correct
iam a bit tired so i just quickly read your elaborate text but the asm
works in steps of bpp not steps of 4 and writes 4 bytes at a time
you seem to assume the asm is simpler and more systematic than it
actually is, also theres are bugs in the asm i think
ive a fix for some of this locally but id like to test it more before
pushing
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
Breaking DRM is a little like attempting to break through a door even
though the window is wide open and the only thing in the house is a bunch
of things you dont want and which you would get tomorrow for free anyway
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 181 bytes
Desc: Digital signature
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20141207/5bfca9a5/attachment.asc>
More information about the ffmpeg-devel
mailing list