[Ffmpeg-devel-old] RE: [Ffmpeg-devel] Speex proposed addition to ffmpeg and suggestions

Dario Andrade dario
Wed Jun 22 21:34:52 CEST 2005


Glad you mentioned it: I did finish something, and I use it for my own
purposes.
You're right, the reception to the project was somewhat extreme, so I
decided to keep it private. (In any case, writing a native decoder did not
make sense to me. Speex is evolving really fast, with many ports to DSP
chips and an included fixed-point version, if needed.)

Are you interested in it? I still have a few issues that I need to
clarify, such as:
> > 4) In order for the speex codec to interpolate frames (in case of a lost
> > packet), I am expecting NULL to the "buf" parameter when decoding it.
> > That will create a single interpolated frame with no bytes consumed. May
> > I do that? Or should I just use the "flush" function for that?

The avcodec API does not specify a way for the user to request a "lost frame"
(a codec-dependent interpolated frame). What do you think about that? (I guess
nobody will ever pass NULL to that function otherwise.)
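
For illustration, here is a minimal, self-contained sketch of the convention
proposed above: a decode callback that treats buf == NULL as a lost packet,
emits one concealment frame, and reports zero bytes consumed. Everything in it
(toy_decoder, toy_decode_frame, FRAME_SAMPLES, the one-byte-per-sample payload,
and the half-amplitude repeat used as "interpolation") is hypothetical and only
stands in for the real codec. In libspeex itself, the analogous request is made
by passing a NULL SpeexBits pointer to speex_decode().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SAMPLES 4  /* hypothetical frame size, kept tiny for the example */

/* Hypothetical decoder state: remembers the last frame it produced. */
typedef struct {
    int16_t last[FRAME_SAMPLES];
} toy_decoder;

/*
 * Decode one frame into out[].
 * buf == NULL means "packet lost": synthesize a frame from the previous one
 * (here simply the previous frame at half amplitude) and consume no bytes.
 * Returns the number of bytes consumed, or -1 if buf holds less than one
 * full frame. *out_samples receives the number of samples written.
 */
static int toy_decode_frame(toy_decoder *d, int16_t *out, int *out_samples,
                            const uint8_t *buf, int buf_size)
{
    int i;

    assert(d != NULL && out != NULL && out_samples != NULL);

    if (buf == NULL) {
        /* Lost packet: conceal it, consume nothing. */
        for (i = 0; i < FRAME_SAMPLES; i++) {
            d->last[i] = (int16_t)(d->last[i] / 2);
            out[i] = d->last[i];
        }
        *out_samples = FRAME_SAMPLES;
        return 0;
    }

    if (buf_size < FRAME_SAMPLES) {
        /* Not enough input for a whole frame. */
        *out_samples = 0;
        return -1;
    }

    /* Toy payload: one byte per sample, scaled up to 16 bits. */
    for (i = 0; i < FRAME_SAMPLES; i++) {
        d->last[i] = (int16_t)(buf[i] * 100);
        out[i] = d->last[i];
    }
    *out_samples = FRAME_SAMPLES;
    return FRAME_SAMPLES;
}
```

A caller that detects a gap in the packet sequence would call
toy_decode_frame(&d, out, &n, NULL, 0) once per missing frame.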

I've attached my version, which may contain errors.
Along with speex.c, the following additions are needed:

-=-=-=-=-=-=--=
to avcodec.h:

177a178
    CODEC_ID_SPEEX=0x17000,
344a346,352
#define CODEC_FLAG2_AUDIO_PREPROC_AGC       0x01000000 // automatic gain control (encoder or decoder)
#define CODEC_FLAG2_AUDIO_PREPROC_VAD       0x02000000 // voice activity detection (no activity -> returns 0 bytes encoded)
#define CODEC_FLAG2_AUDIO_PREPROC_DENOISE   0x04000000 // denoising (preprocessing)
#define CODEC_FLAG2_AUDIO_DTX               0x08000000 // discontinuous transmission (DTX) (no activity -> generates 5 bits/frame, only if vbr or cng enabled)
#define CODEC_FLAG2_AUDIO_CNG               0x10000000 // comfort noise generation (CNG) (no activity (codec vad) -> generates comfort noise, always when vbr is enabled)
#define CODEC_FLAG2_AUDIO_ENH               0x20000000 // perceptual enhancement (ENH) (decoder)
 
1945a1961
extern AVCodec speex_encoder;
2046a2063
extern AVCodec speex_decoder;

-=-=-=-=-=-=-=-=-=
to libavformat/wav.c:

38a39
    { CODEC_ID_SPEEX, ('S'<<8)+'x' }, //dats: speex wave format (?)
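
As a rough illustration of how a wrapper might consume the proposed flags2
bits, the sketch below maps them onto a small options struct. The flag values
are copied from the avcodec.h additions listed above; speex_opts and
speex_opts_from_flags2 are hypothetical names, not taken from the attached
speex.c, and flags2 is passed as a plain integer so the example compiles
without the ffmpeg headers.

```c
/* Values copied from the proposed avcodec.h additions. */
#define CODEC_FLAG2_AUDIO_PREPROC_AGC     0x01000000
#define CODEC_FLAG2_AUDIO_PREPROC_VAD     0x02000000
#define CODEC_FLAG2_AUDIO_PREPROC_DENOISE 0x04000000
#define CODEC_FLAG2_AUDIO_DTX             0x08000000
#define CODEC_FLAG2_AUDIO_CNG             0x10000000
#define CODEC_FLAG2_AUDIO_ENH             0x20000000

/* Hypothetical per-codec options, one boolean per proposed flag. */
typedef struct {
    int agc;      /* automatic gain control */
    int vad;      /* voice activity detection */
    int denoise;  /* denoising preprocessor */
    int dtx;      /* discontinuous transmission */
    int cng;      /* comfort noise generation */
    int enh;      /* perceptual enhancement (decoder) */
} speex_opts;

/* Translate a flags2 bit mask into individual on/off options. */
static speex_opts speex_opts_from_flags2(unsigned int flags2)
{
    speex_opts o;

    o.agc     = (flags2 & CODEC_FLAG2_AUDIO_PREPROC_AGC)     != 0;
    o.vad     = (flags2 & CODEC_FLAG2_AUDIO_PREPROC_VAD)     != 0;
    o.denoise = (flags2 & CODEC_FLAG2_AUDIO_PREPROC_DENOISE) != 0;
    o.dtx     = (flags2 & CODEC_FLAG2_AUDIO_DTX)             != 0;
    o.cng     = (flags2 & CODEC_FLAG2_AUDIO_CNG)             != 0;
    o.enh     = (flags2 & CODEC_FLAG2_AUDIO_ENH)             != 0;
    return o;
}
```

Since each flag occupies its own bit, they can be OR-ed together freely, for
example CODEC_FLAG2_AUDIO_PREPROC_DENOISE | CODEC_FLAG2_AUDIO_ENH.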


Let me know if something is missing.
I also welcome suggestions on the proposed API changes (flags, etc.).


Cheers,


Dario Andrade
IP.TV
Mobile +55.21.9453.5005
Office +55.21.2141.9525


> -----Original Message-----
> From: Michel Bardiaux [mailto:mbardiaux at peaktime.be]
> Sent: Wednesday, June 22, 2005 12:57 PM
> To: ffmpeg-devel at lists.sourceforge.net; dario at sinistro.net
> Subject: Re: [Ffmpeg-devel] Speex proposed addition to ffmpeg and
> suggestions
> 
> Dario Andrade wrote:
> >
> >
> > Hi Fabrice and ffmpeg maintainers,
> >
> >
> >
> > I am developing a Speex (http://www.speex.org) wrapper for the
> > libavcodec, speex.c, and I would like some feedback and/or suggestions:
> 
> What is the current state of your project? I know from my archive that
> you were, well not exactly flamed but somewhat singed... for going for
> an external library; but since then, the GSM codec using an external lib
> has been accepted, and anyway mp3lame was already in...
> 
> 
> >
> >
> >
> > First of all, if it interests you, I'd love to hand it over to you for
> > publishing in the ffmpeg library. In any case I could use some help and
> > I'd appreciate any feedback:
> >
> >
> >
> > 1) In the
> >
> >
> >
> > int avcodec_encode_audio(AVCodecContext *avctx,
> >
> >                       unsigned char *frame, int buf_size, void *data);
> >
> >
> >
> > I wonder if the frame size I should expect to receive in the "data"
> > parameter is the same one I filled in when the user called avcodec_open
> > (ctx->frame_size).
> >
> > Should I expect something greater than that? If the user sets it in the
> > ctx before opening, may I just encode everything asked for? In that
> > case, what if buf_size is not large enough?
> >
> >
> >
> > 2) Now the most difficult one: in the
> >
> >
> >
> > static int speex_decode_frame(AVCodecContext *avctx,
> >
> >                         void *data, int *data_size,
> >
> >                         uint8_t* buf, int buf_size)
> >
> >
> >
> > I am getting 4096 bytes from ffmpeg.exe when decoding from wav files.
> > What should I expect when decoding? Should I just decode a single
> > frame and return the number of bytes consumed, or should I decode all
> > frames that "buf" points to (as long as buf_size is large enough)?
> > Note that if I decode everything, I may overflow the "data" buffer, since
> > speex has a very low bitrate encoding.
> >
> > If I just return one single frame and the number of bytes consumed,
> > ffmpeg.exe will feed the function with (4096 - cbLastUsed, i.e. 4096
> > minus the number of bytes I consumed in the last call). That will
> > eventually lead to a call with too few bytes available to decode
> > (underflowing my decoder bitstream). In this case, how should I proceed?
> > Should I just report that all bytes have been consumed and set data_size
> > to 0?
> >
> >
> >
> > 3) In order to use some advanced speex features such as the preprocessor
> > and some codec features, I've added the following definitions and reused
> > some of the AVCodecContext fields already published:
> >
> >
> >
> > AVCodecContext::level (for the AGC preprocess level)
> >
> > AVCodecContext::noise_reduction (for the denoise preprocessor)
> >
> > AVCodecContext::rc_max_rate (the maximum VBR bitrate used, i.e. ABR)
> >
> > AVCodecContext::global_quality (for the speex quality property)
> >
> > AVCodecContext::flags2 (with the addition of the following constants)
> >
> >
> >
> > #define CODEC_FLAG2_AUDIO_PREPROC_AGC       0x01000000 // automatic gain control (encoder or decoder)
> >
> > #define CODEC_FLAG2_AUDIO_PREPROC_VAD       0x02000000 // voice activity detection (no activity -> returns 0 bytes encoded)
> >
> > #define CODEC_FLAG2_AUDIO_VBR               0x04000000 // variable bitrate
> >
> > #define CODEC_FLAG2_AUDIO_DTX               0x08000000 // discontinuous transmission (DTX) (no activity -> generates 5 bits/frame, only if vbr or cng enabled)
> >
> > #define CODEC_FLAG2_AUDIO_CNG               0x10000000 // comfort noise generation (CNG) (no activity (codec vad) -> generates comfort noise, always when vbr is enabled)
> >
> > #define CODEC_FLAG2_AUDIO_ENH               0x20000000 // perceptual enhancement (ENH) (decoder)
> >
> >
> >
> > 4) In order for the speex codec to interpolate frames (in case of a lost
> > packet), I am expecting NULL to the "buf" parameter when decoding it.
> > That will create a single interpolated frame with no bytes consumed. May
> > I do that? Or should I just use the "flush" function for that?
> >
> >
> >
> > Thanks for any feedback.
> >
> >
> >
> > P.S. Where is the opts.c file? I love that functionality. Being able to
> > set codec properties from a string (avoptions_parse) was one of
> > the best additions to ffmpeg, after the codecs themselves of course.
> > Anyway, I'd love to hear whether there's an alternative for that.
> >
> >
> >
> > Dario Andrade
> >
> > IP.TV
> >
> > Mobile +55.21.9453.5005
> >
> > Office +55.21.2141.9525
> >
> >
> >
> 
> 
> --
> Michel Bardiaux
> Peaktime Belgium S.A.  Bd. du Souverain, 191  B-1160 Bruxelles
> Tel : +32 2 790.29.41
-------------- next part --------------
A non-text attachment was scrubbed...
Name: speex.c
Type: application/octet-stream
Size: 15339 bytes
Desc: not available
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20050622/41b2b96f/attachment.obj>


