[FFmpeg-devel] [PATCH] add dumpwave filter

Tobias Rapp t.rapp at noa-archive.com
Mon Jan 22 15:23:44 EET 2018

On 20.01.2018 20:17, Dmytro Humeniuk wrote:
>> On 18 Jan 2018, at 17:32, Dmytro Humeniuk <dmitry.gumenyuk at gmail.com> wrote:
>>> On 18 Jan 2018, at 08:56, Tobias Rapp <t.rapp at noa-archive.com> wrote:
>>> On 15.01.2018 13:48, Dmytro Humeniuk wrote:
>>>>> On 15 Jan 2018, at 09:14, Tobias Rapp <t.rapp at noa-archive.com> wrote:
>>>>> On 13.01.2018 23:52, Дмитрий Гуменюк wrote:
>>>>>> Hi,
>>>>>>> On 13 Jan 2018, at 01:37, Дмитрий Гуменюк <dmitry.gumenyuk at gmail.com> wrote:
>>>>>>> Hi
>>>>>>>> On 12 Jan 2018, at 13:32, Дмитрий Гуменюк <dmitry.gumenyuk at gmail.com> wrote:
>>>>>>>>> On 12 Jan 2018, at 13:17, Tobias Rapp <t.rapp at noa-archive.com> wrote:
>>>>>>>>> On 12.01.2018 12:16, Дмитрий Гуменюк wrote:
>>>>>>>>>> Hi
>>>>>>>>>>> On 11 Jan 2018, at 09:20, Tobias Rapp <t.rapp at noa-archive.com> wrote:
>>>>>>>>>>> On 10.01.2018 18:18, Kyle Swanson wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> For this to be a part of libavfilter the output needs to be more generic
>>>>>>>>>>>> than just the Soundcloud format. If we want this to be generally useful
>>>>>>>>>>>> it should probably just output an array of floats between 0.0 and 1.0. The
>>>>>>>>>>>> consumer of this data (JS library, or whatever) can use this in whatever
>>>>>>>>>>>> way it wants.
>>>>>>>>>>> I agree. If the BWF Peak Envelope output which was suggested in the other thread does not match your demands and a filter implementation is actually necessary, I would prefer that the filter attach the RMS value(s) as frame metadata instead of directly dumping to file. Frame metadata can then be re-used by other filters or dumped into a file by using the existing "ametadata" filter.
>>>>>>>>>> RMS values may be counted for several frames or only for a half of a frame
>>>>>>>>>>> This would be similar to:
>>>>>>>>>>> ffmpeg -i input-file -f null -filter:a "asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat" /dev/null
>>>>>>>>>> I like this idea, but won't asetnsamples affect performance by creating a FIFO queue? And it may require some effort to parse the long output
>>>>>>>>> I added asetnsamples to define the audio frame size (interval of values from astats). You can reduce the number of lines printed by ametadata by using the "key=lavfi.astats.foo" option.
>>>>>>>> I used asetnsamples as well, and I measured performance while transcoding - it appears to be slightly slower
>>>>>>> I think the output is now more generic and I got rid of the long switch/case, thanks for the support
>>>>>> Here is the most recent patch. It seems all comments are addressed; did I miss something?
>>>>> I still would prefer to have the value attached as frame metadata, then dumped into file via the existing "ametadata" filter. Even better would be to integrate the statistic value (if missing) into the "astats" filter.
>>>>> If your concern is the output format of "ametadata" then some output format extension (CSV/JSON) needs to be discussed for ametadata/metadata.
>>>>> If your concern is performance then please add some numbers. In my tests using an approx. 5-minute input WAV file (48kHz, stereo), the run with "asetnsamples" was considerably faster than the run without (1.7s vs. 13.9s).
>>>> Hi
>>>> As I mentioned previously, adding metadata to each frame is not possible,
>>>> as a value may be counted for several frames or only for a half of a frame
>>>> I used a 2-hour-long 48kHz mp3 https://s3-eu-west-1.amazonaws.com/balamii/SynthSystemSystersJAN2018.mp3
>>>> For this purpose I set up a CentOS AWS EC2 nano instance
>>>> Then I transcoded it while filtering as follows (just to recreate a real situation):
>>>> 1. -filter:a "asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat" out.mp3
>>>> 2. -filter:a "dumpwave=n=192197:f=-" out.mp3
>>>> Results:
>>>> 1. 244810550046 nanoseconds (approx. 244.8 s)
>>>> 2. 87494286740 nanoseconds (approx. 87.5 s)
>>>> One possible use case is to set up 2 chains of asetnsamples->ametadata, for example:
>>>> "asetnsamples=192197,astats=metadata=on,ametadata=print:file=stats-file.dat,asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file1.dat" which will certainly affect performance
>>>> compared with "dumpwave=n=192197:f=out1,dumpwave=n=22050:f=out2"
>>> Sorry, I misunderstood your concerns regarding asetnsamples filter performance. The numbers I provided have been for
>>> "asetnsamples=22050,astats=metadata=on,ametadata=print:file=stats-file.dat"
>>> versus
>>> "astats=metadata=on,ametadata=print:file=stats-file.dat"
>>> When comparing astats+ametadata versus dumpwave it is obvious that a specialized filter which only calculates one statistic value is faster than a filter that calculates multiple statistics. But still my opinion is that if the dumpwave filter is to be added to the codebase it should be more generic (i.e. output frame metadata similar to the psnr/ssim filters for video).
>> Actually, the current output (normalised float values in the range 0...1) was proposed by Kyle as more generic.
> Ping

What I wrote is my personal opinion. I acknowledge that you have put 
good effort into implementing the patch and even added FATE tests -- so 
my words must sound disappointing to you. Rest assured that almost all 
non-trivial patches need multiple iterations.

From my side, improving the existing astats+ametadata code would be the 
preferred way to continue. If that is absolutely unacceptable to you, I 
suggest taking a look at the FFmpeg (public) API; the code in 
doc/examples/filtering_audio.c might be a good starting point.
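
For illustration only, here is a minimal sketch of how the astats+ametadata output discussed above could be reduced to the normalised 0...1 values that were proposed as the generic format. It assumes the file written by "ametadata=print" contains "key=value" lines between "frame:..." header lines, and that the statistic of interest is lavfi.astats.Overall.RMS_level expressed in dB. The function name rms_envelope is hypothetical and not part of FFmpeg.

```python
import math

# Assumption: this is the astats metadata key of interest (value in dB).
RMS_KEY = "lavfi.astats.Overall.RMS_level"


def rms_envelope(lines, key=RMS_KEY):
    """Convert 'key=value' dB lines into linear values in [0.0, 1.0]."""
    values = []
    for line in lines:
        line = line.strip()
        if not line.startswith(key + "="):
            continue  # skip "frame:... pts:..." headers and other keys
        text = line.split("=", 1)[1]
        try:
            level_db = float(text)  # float() also accepts "-inf" and "nan"
        except ValueError:
            continue  # ignore values that are not numbers
        if math.isnan(level_db):
            continue
        # 0 dB or above is treated as full scale; silence (-inf dB) maps to 0.0.
        linear = 1.0 if level_db >= 0 else 10.0 ** (level_db / 20.0)
        values.append(linear)
    return values


if __name__ == "__main__":
    sample = [
        "frame:0    pts:0       pts_time:0",
        "lavfi.astats.Overall.RMS_level=-20.000000",
        "frame:1    pts:22050   pts_time:0.5",
        "lavfi.astats.Overall.RMS_level=-inf",
    ]
    print(rms_envelope(sample))
```

Restricting the printed lines with the "key=" option of ametadata, as mentioned earlier in the thread, would keep the input to such a script small.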

Best regards,