[FFmpeg-devel] [PATCH 1/2] aptx: implement the aptX bluetooth codec

Fri Nov 10 00:48:37 EET 2017

On Thu, Nov 09, 2017 at 02:32:44PM +0000, Rostislav Pehlivanov wrote:
> On 9 November 2017 at 13:37, Aurelien Jacobs <aurel at gnuage.org> wrote:
> 
> > On Thu, Nov 09, 2017 at 12:52:34AM +0000, Rostislav Pehlivanov wrote:
> > > On 8 November 2017 at 22:41, Aurelien Jacobs <aurel at gnuage.org> wrote:
> > >
> > > > On Wed, Nov 08, 2017 at 06:26:03PM +0100, Michael Niedermayer wrote:
> > > > > On Wed, Nov 08, 2017 at 02:06:09PM +0100, Aurelien Jacobs wrote:
> > > > > [...]
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Compute the convolution of the signal with the coefficients,
> > and
> > > > reduce
> > > > > > + * to 24 bits by applying the specified right shifting.
> > > > > > + */
> > > > > > +av_always_inline
> > > > > > +static int32_t aptx_qmf_convolution(FilterSignal *signal,
> > > > > > +                                    const int32_t
> > coeffs[FILTER_TAPS],
> > > > > > +                                    int shift)
> > > > > > +{
> > > > > > +    int32_t *sig = &signal->buffer[signal->pos];
> > > > > > +    int64_t e = 0;
> > > > > > +
> > > > >
> > > > > > +    for (int i = 0; i < FILTER_TAPS; i++)
> > > > >
> > > > > "for (int" is something we avoided as some comilers didnt like it,
> > > > > iam not sure if this is still true but there are none in the codebase
> > > >
> > > > If you insist on this I will of course change it, but hey, we require
> > > > a C99 compiler since a long time and we use so many features of C99
> > > > that I really don't get why we couldn't use "for (int".
> > > > I can't imagine that any relevant C compiler would not support this
> > > > nowadays !
> > > > What I propose is to get this in as is, and see if anyone encounter
> > > > issue with any compiler. If any issue arise, I will of course send a
> > > > patch to fix it.
> > > >
> > > > Here is an updated patch.
> > > >
> > > >
> > > Send another patch because some people are beyond convincing and its
> > really
> > > pissing me off. Really. In particular maybe those compiler writers at
> > > Microsoft who seem to think shipping something that doesn't support such
> > a
> > > simple yet important feature is important.
> > > Or maybe users who don't want to update a 6 year old version of msvc.
> > > Or maybe us because we support compiling with msvc at all when its
> > clearly
> > > _not_ a C compiler and this project is written in C.
> >
> > Here you go.
> >
> > Just for reference, splitting the déclaration out of the for loop added
> > 19 lines in this file, which is a 2.3 % increase of the line count.
> > (I'm not sure this file is representative of the rest of the code base)
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at ffmpeg.org
> > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> >
> >
> Still some issues:
> 
> 1.) You need to set AVCodec->supported_samplerates for the encoder, since
> in the raw aptx muxer you always set the samplerate as 44100. Does the
> codec/container support other samplerates?

The codec is actually samplerate agnostic.
It only convert each group of 4 samples to a 16 bit integer.
There is no other information than the samples themselves in the
encoded stream. No header, no frame size, no samplerate.
And there is no standard container used to store aptX along with
the needed metadata such as the samplerate. It is meant to be streamed
thru bluetooth using the A2DP "protocol" which includes its own metadata
signaling with samplerate.
The raw muxer/demuxer that I implemented is mostly for testing purpose,
not really for practical use.
So with this in mind, I don't think that it make any sense to set
AVCodec->supported_samplerates.
I don't think I set the samplerate anywhere in the encoder.
I only set it in the demuxer to a default 44100 just to be able to
playback test files even though there is no way to know its actual
samplerate.

> 2.) "Frame must have a multiple of 4 samples" gets printed out when trying
> to resample a 48000 file and encode it to aptx.
> What you need to do is to remove AV_CODEC_CAP_VARIABLE_FRAME_SIZE from the
> capabilities and set some sane default for avctx->frame_size if
> avctx->frame_size is unset. Users can override it prior to encoder init and
> you can do it through the command line using -frame_size. If the user sets
> the frame_size, check if its a multiple of 4 and use it. Else, error out.
> If it isn't set (e.g. its 0), use the default (1024 is a good value).

OK, I will try this.
And I should probably document that for acheiving the lowest possible
latency for realtime bluetooth streaming, frame_size shoud be set to
4 samples.

> 3.) The packet timestamps on the encoder side are missing. Use the
> AudioFrameQueue API (libavcodec/audio_frame_queue.h) which at every frame
> counts how many samples you're given via the avframe and calculates what
> dts/pts to set on the avpacket.
> Just call ff_af_queue_init at the end of your init function (after the
> frame size is set and checked), call ff_af_queue_add at the start of your
> encode function and ff_af_queue_remove with the number of samples you
> encoded and pointers to the packet pts/duration fields. And make sure to
> call ff_af_queue_close on uninit, which will print out an error if you've
> encoded more samples than you've received or less samples than you've
> recevied.

I will look at this.