[FFmpeg-devel] [RFC] Generic psychoacoustic model interface
Michael Niedermayer
michaelni
Wed Aug 27 16:33:17 CEST 2008
On Wed, Aug 27, 2008 at 11:35:20AM +0300, Kostya wrote:
> Here's my first attempt to define codec-agnostic psy model.
> Here's an interface for it. I'm not sure about AC3, but
> it should be possible to use it with DCA, Vorbis,
> MPEG Audio Layers I-III and NBC, maybe WMA too.
> In case somebody codes an implementation, of course.
> Personally I plan to make my encoder use it backed with
> already implemented 3GPP model.
[...]
> /**
> * windowing related information
> */
> typedef struct FFWindowInfo{
> int window_type[2]; ///< window type (short/long/transitional, etc.) - current and previous
> int window_shape; ///< window shape (sine/KBD/whatever)
> void *additional_info; ///< codec-dependent window information
Passing opaque data from the psy model to the encoder is not clean: it requires
both sides to maintain a "hidden" compatible API.
> }FFWindowInfo;
>
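As a rough sketch of the alternative, the opaque pointer could be replaced by
explicit fields that any encoder can read. The names FFWindowInfoSketch,
FF_PSY_MAX_WINDOWS, num_windows and grouping are made up for illustration and
are not part of the proposal; which fields are really needed depends on the
codecs that end up using the model.

```c
#include <assert.h>
#include <string.h>

#define FF_PSY_MAX_WINDOWS 8 /* hypothetical limit, not from the proposal */

/* Variant of FFWindowInfo with the opaque pointer replaced by explicit
 * fields, so model and encoder share one visible API.
 * Field names are illustrative only. */
typedef struct FFWindowInfoSketch {
    int window_type[2];               /* current and previous window type */
    int window_shape;                 /* sine/KBD/... */
    int num_windows;                  /* number of windows in this frame */
    int grouping[FF_PSY_MAX_WINDOWS]; /* 1 = window starts a new group */
} FFWindowInfoSketch;

/* Fill the structure for a frame coded as a single window. */
static FFWindowInfoSketch ff_sketch_single_window(int type, int prev_type,
                                                  int shape)
{
    FFWindowInfoSketch wi;
    memset(&wi, 0, sizeof(wi));
    wi.window_type[0] = type;
    wi.window_type[1] = prev_type;
    wi.window_shape   = shape;
    wi.num_windows    = 1;
    wi.grouping[0]    = 1;
    return wi;
}

/* Count window groups using only the public fields. */
static int ff_sketch_count_groups(const FFWindowInfoSketch *wi)
{
    int i, groups = 0;
    assert(wi->num_windows >= 1 && wi->num_windows <= FF_PSY_MAX_WINDOWS);
    for (i = 0; i < wi->num_windows; i++)
        groups += wi->grouping[i];
    return groups;
}
```

With something like this the encoder never has to cast a void pointer to a
type that only one particular model knows about.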
> /**
> * context used by psychoacoustic model
> */
> typedef struct FFPsyContext{
> AVCodecContext *avctx; ///< encoder context
>
> FFPsyBand bands[MAX_BANDS]; ///< frame bands information
> FFWindowInfo *win_info; ///< frame window info
>
> const uint8_t *long_bands; ///< scalefactor band sizes for long frame
> int num_long_bands; ///< number of scalefactor bands for long frame
> const uint8_t *short_bands; ///< scalefactor band sizes for short frame
> int num_short_bands; ///< number of scalefactor bands for short frame
Having only 2 band lists would be a problem for any codec that has more
than 2 window lengths (like WMA).
[...]
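One possible way around the 2-list limit, sketched under assumptions: keep an
array of band layouts, one per window length, and look the layout up by size.
FFPsyBandLayoutSketch, ff_sketch_find_layout and the band sizes below are
placeholders I made up, not real codec tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One scalefactor band layout per supported window length. */
typedef struct FFPsyBandLayoutSketch {
    int            window_size; /* window length in samples */
    const uint8_t *bands;       /* band sizes for this window length */
    int            num_bands;   /* number of entries in bands */
} FFPsyBandLayoutSketch;

/* Return the layout matching window_size, or NULL if none matches. */
static const FFPsyBandLayoutSketch *
ff_sketch_find_layout(const FFPsyBandLayoutSketch *layouts, int num_layouts,
                      int window_size)
{
    int i;
    assert(layouts && num_layouts > 0);
    for (i = 0; i < num_layouts; i++)
        if (layouts[i].window_size == window_size)
            return &layouts[i];
    return NULL;
}

/* Placeholder band sizes; each list sums to its window length. */
static const uint8_t sketch_bands_512[] = { 32, 32, 64, 128, 128, 128 };
static const uint8_t sketch_bands_256[] = { 16, 16, 32, 64, 128 };
static const uint8_t sketch_bands_128[] = { 8, 8, 16, 32, 64 };

static const FFPsyBandLayoutSketch sketch_layouts[] = {
    { 512, sketch_bands_512, 6 },
    { 256, sketch_bands_256, 5 },
    { 128, sketch_bands_128, 5 },
};
```

A codec with only long and short windows would simply register 2 entries, so
nothing is lost for AAC.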
> /**
> * Suggest window sequence for channel.
> *
> * @param ctx model context
> * @param audio samples for the current frame
> * @param la lookahead samples (NULL when unavailable)
> * @param channel number of channel element to analyze
> * @param prev_type previous window type
> *
> * @return suggested window information in a structure
> */
> FFWindowInfo* ff_psy_suggest_window(AACPsyContext *ctx, int16_t *audio, int16_t *la,
> int channel, int prev_type);
The name should be ...get/find/calculate_suggested...
audio and la should be const,
and maybe the return type should be FFWindowInfo instead of FFWindowInfo* to
avoid memleak issues.
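Roughly what I have in mind, as a sketch only: const inputs and a structure
returned by value, so there is nothing for the caller to free. The names
FFWindowInfoByValue and ff_sketch_get_suggested_window, the num_samples
parameter and the peak comparison are all invented here; the decision rule is
a stand-in and has nothing to do with the 3GPP model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { SKETCH_WINDOW_LONG, SKETCH_WINDOW_SHORT }; /* hypothetical types */

/* Reduced window info without any pointer members, safe to copy. */
typedef struct FFWindowInfoByValue {
    int window_type[2]; /* current and previous window type */
    int window_shape;   /* sine/KBD/... */
} FFWindowInfoByValue;

/* Largest absolute sample value in buf. */
static int sketch_peak(const int16_t *buf, int num_samples)
{
    int i, peak = 0;
    for (i = 0; i < num_samples; i++) {
        int a = abs(buf[i]);
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Inputs are const and the result is returned by value.
 * Placeholder rule: suggest a short window when the lookahead is
 * much louder than the current frame. */
static FFWindowInfoByValue
ff_sketch_get_suggested_window(const int16_t *audio, const int16_t *la,
                               int num_samples, int prev_type)
{
    FFWindowInfoByValue wi;
    int peak;

    assert(audio && num_samples > 0);
    peak = sketch_peak(audio, num_samples);

    wi.window_type[0] = SKETCH_WINDOW_LONG;
    if (la && sketch_peak(la, num_samples) > 4 * (peak + 1))
        wi.window_type[0] = SKETCH_WINDOW_SHORT;
    wi.window_type[1] = prev_type;
    wi.window_shape   = 0;
    return wi;
}
```

The structure is small, so copying it costs next to nothing compared with the
analysis itself.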
>
> /**
> * Perform psychoacoustic analysis and set band info.
> *
> * @param ctx model context
> * @param tag number of channel element to analyze
> * @param type channel element type (e.g. ID_SCE or ID_CPE)
> * @param cpe pointer to the current channel element
> */
> void ff_psy_analyze(AACPsyContext *ctx, int tag, int type, ChannelElement *cpe);
ChannelElement is AAC-specific, so it does not belong in a generic interface.
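A codec-neutral variant could take plain arrays instead of a codec structure.
The following is only a sketch under assumptions: SketchPsyBand and
ff_sketch_psy_analyze are made-up names, the real FFPsyBand surely carries
more than an energy, and a real model would do much more than sum squares.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-band result; stands in for FFPsyBand. */
typedef struct SketchPsyBand {
    float energy; /* sum of squared coefficients in the band */
} SketchPsyBand;

/* Analyze one channel from plain arrays, with no codec-specific types.
 * coefs holds the spectral coefficients, band_sizes the width of each
 * band, and out receives one entry per band. */
static void ff_sketch_psy_analyze(const float *coefs,
                                  const uint8_t *band_sizes, int num_bands,
                                  SketchPsyBand *out)
{
    int b, i, start = 0;

    assert(coefs && band_sizes && out);
    for (b = 0; b < num_bands; b++) {
        float e = 0.0f;
        for (i = 0; i < band_sizes[b]; i++)
            e += coefs[start + i] * coefs[start + i];
        out[b].energy = e;
        start += band_sizes[b];
    }
}
```

The AAC encoder would then call this once per channel of its ChannelElement,
and other codecs could call it with their own buffers.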
[...]
> /**
> * Preprocess several channel in audio frame in order to compress it better.
> *
> * @param ctx preprocessing context
> * @param audio samples to preprocess
> * @param dest place to put filtered samples
> * @param tag number of channel group
> * @param channels number of channel to preprocess (some additional work may be done on stereo pair)
> */
> void ff_aac_psy_preprocess(struct FFPsyPreprocessContext *ctx, int16_t *audio, int16_t *dest, int tag, int channels);
audio is missing a const
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
There will always be a question for which you do not know the correct answer.