[FFmpeg-devel] [PATCH 1/2] lavu/tx: rewrite internal code as a tree-based codelet constructor

Lynne dev at lynne.ee
Wed Jan 26 05:31:53 EET 2022


25 Jan 2022, 18:17 by onemda at gmail.com:

> On Tue, Jan 25, 2022 at 11:46 AM Lynne <dev at lynne.ee> wrote:
>
>> 21 Jan 2022, 09:51 by dev at lynne.ee:
>>
>> > 21 Jan 2022, 09:33 by dev at lynne.ee:
>> >
>> >> This commit rewrites the internal transform code into a constructor
>> >> that stitches transforms (codelets).
>> >> This allows for transforms to reuse arbitrary parts of other
>> >> transforms, and allows transforms to be stacked onto one
>> >> another (such as a full iMDCT using a half-iMDCT which in turn
>> >> uses an FFT). It also permits each step to be individually
>> >> replaced by assembly or a custom implementation (such as an ASIC).
>> >>
>> >> Patch attached.
>> >>
>> >
>> > Forgot that I disabled double and int32 transforms to speed up
>> > testing, reenabled locally and on my github tx_tree branch.
>> > Also removed some inactive debug code.
>> > https://github.com/cyanreg/FFmpeg/tree/tx_tree
>> >
>>
>> I fixed bugs and improved the code more, and I think it's ready
>> for merging now.
>> The rdft is no longer bound by any convention, and its
>> scale may be changed by the user, eliminating after-transform
>> multiplies that are used pretty much everywhere in our code.
>>
>> If someone (looks at Paul) gives it a test or converts a filter,
>> that would be nice. I've only tested it on my synthetic benchmarks:
>> https://github.com/cyanreg/lavu_fft_test
>>
>
>
> Will try it once it's applied. Thanks.
>

Applied.
It's around 20% faster than lavc's rdft for power-of-two lengths.
Non-power-of-two lengths are partially SIMD'd, so they're usable too.
I'll SIMD the small O(n) rdft loop once I'm done with NEON's and
PFA's SIMD. If you find bugs, ping me on IRC.

