[FFmpeg-devel] [FFMpeg-Devel] Ideas for changes to libpostproc

Michael Niedermayer michaelni at gmx.at
Wed Mar 18 02:30:29 CET 2015


On Tue, Mar 17, 2015 at 08:39:02PM -0400, Tucker DiNapoli wrote:
> This isn't really a patch, but it's easiest to express my ideas in the form of
> code. As a patch it creates a single file which is mostly composed of a rewrite
> of the main postprocessing loop. I've tried to express most of my ideas in
> the form of changes to the code, but in cases where that would be too much
> work, or wouldn't make sense in this file I've written my ideas in comments.
> 
> I'm mostly looking for opinions/criticisms on my ideas, not necessarily the
> code itself. I'm fully willing to change code, but I'm more interested in
> whether my ideas make sense.
> 

> Updating libpostproc is something I plan to do for the google summer of code,
> so I can't make all the changes I'd like now. I need to have some sort of 
> qualification task complete within the next week, I've submitted some patches to

It would be good to have some patch next week, sure, but there is
more time than that. The 27th is the deadline for submitting an
application to Google; the qualification task has more time.


[...]

> +                else if(mode & V_DEBLOCK){
> +                    //Not sure how to convert this to simd, I was thinking vertClassify
> +                    //would return a mask classifying multiple blocks, but even if it
> +                    //does I'm not sure how to run the filters
> +
> +                    //I guess I could test the mask, and if it's not uniform
> +                    //run both filters and choose which one to use for each block
> +                    //based on the mask

Yes, you have analyzed the situation correctly.
One option is to fall back to calling the MMX code multiple times
when the types differ and make AVX/SSE impossible. The other is to run
both filters in AVX/SSE and then use a mask & combine step to pick the
right result for each block.
One way to move towards this in manageable steps would be to first
change the existing code, which currently does

for each "8x8" block do
    h filter (categorize and apply filter based on that)
    transpose
    v filter (categorize and apply filter based on that)
    transpose
    dering
    ...

so that the outer loop handles 4 blocks at a time:

-for each "8x8" block do
+for each 4 8x8 blocks do
+    for i in 4 do
Then the next step hoists the categorization out of the inner loop:
+    H categorize 4 blocks
     for i in 4 do
-       H categorize
        H Filter depending on categorize

Then one could add the wide paths here:
     H categorize 4 blocks
+    if all have the same categorization
+       H Filter in AVX2
+    else if 2 match
+       H Filter in SSE2
+    else
     for i in 4 do
        H Filter depending on categorize
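
The dispatch above could be sketched in C roughly as follows. This is
only an illustration: plan_runs is a hypothetical name, and it assumes
that the 2-block path can only cover the aligned pairs (0,1) and (2,3),
which the pseudocode does not specify. It also applies the pair check to
each half independently, so one half may use the 2-block path while the
other falls back to single blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical planner: cat[i] is the category of block i.
 * On return, run[i] is the number of blocks to filter together starting
 * at block i (4 = AVX2-width, 2 = SSE2-width, 1 = single block), or 0
 * if block i is already covered by an earlier run. */
static void plan_runs(const int cat[4], int run[4])
{
    assert(cat != NULL && run != NULL);

    /* All four blocks agree: one wide call covers everything. */
    if (cat[0] == cat[1] && cat[1] == cat[2] && cat[2] == cat[3]) {
        run[0] = 4;
        run[1] = run[2] = run[3] = 0;
        return;
    }

    /* Otherwise decide per aligned pair (assumption, see above). */
    for (int i = 0; i < 4; i += 2) {
        if (cat[i] == cat[i + 1]) {
            run[i]     = 2;
            run[i + 1] = 0;
        } else {
            run[i]     = 1;
            run[i + 1] = 1;
        }
    }
}
```

The caller would then walk run[] and call the 4-block, 2-block or
single-block filter for each nonzero entry, passing cat[i] to select the
filter type.
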

Or the same could be done with the next step in the filtering
pipeline.


Also, I am not sure it's worth having a variable main loop block size;
it might be easier to always go in steps of 4 8-pixel blocks
horizontally and 1 8-pixel block vertically.
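
A fixed-step main loop of that shape might look like the sketch below.
for_each_group and group_fn are hypothetical names, and the sketch
assumes the buffers are padded so that the last group in a row or column
may extend past the picture width or height.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_W 32  /* 4 blocks of 8 pixels horizontally */
#define GROUP_H 8   /* 1 block of 8 pixels vertically */

/* Hypothetical per-group callback: (x, y) is the top-left pixel. */
typedef void (*group_fn)(int x, int y, void *opaque);

/* Visit the picture in fixed 32x8 steps and return the number of groups.
 * Assumes padded buffers, so partial groups at the edges are still
 * visited as full groups. */
static int for_each_group(int width, int height, group_fn fn, void *opaque)
{
    int groups = 0;

    assert(width > 0 && height > 0);
    for (int y = 0; y < height; y += GROUP_H) {
        for (int x = 0; x < width; x += GROUP_W) {
            if (fn)
                fn(x, y, opaque);
            groups++;
        }
    }
    return groups;
}
```

Because the step is a compile-time constant, the per-group code never
has to deal with a block count other than 4.
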



[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus