[FFmpeg-devel] [PATCH] AAC Encoder, Round 2

Michael Niedermayer michaelni
Sun Aug 24 16:10:12 CEST 2008


On Sun, Aug 24, 2008 at 09:21:26AM +0300, Kostya wrote:
> On Sat, Aug 23, 2008 at 10:28:03PM +0200, Michael Niedermayer wrote:
> > On Sat, Aug 23, 2008 at 06:31:30PM +0300, Kostya wrote:
> > > I'm back (feeling even worse than before but nm).
> > > 
> > > Here is $subj in the form of a diff against FFmpeg SVN.
> > 
> > now the psy model:
> > 
> > [...]
> > 
> > > +/**
> > >   * Calculate Bark value for given line.
> > >   */
> > >  static inline float calc_bark(float f)
> > >  {
> > >      return 13.3f * atanf(0.00076f * f) + 3.5f * atanf((f / 7500.0f) * (f / 7500.0f));
> > >  }
> > 
> > why does vorbis_dec.c use a slightly different one?
> 
> I use the generic formula available everywhere.
> There's a comment in http://svn.xiph.org/trunk/vorbis/lib/scales.h:
> 
> /* The bark scale equations are approximations, since the original
>    table was somewhat hand rolled.  The below are chosen to have the
>    best possible fit to the rolled tables, thus their somewhat odd
>    appearance (these are more accurate and over a longer range than
>    the oft-quoted bark equations found in the texts I have).  The
>    approximations are valid from 0 - 30kHz (nyquist) or so.
> 
>    all f in Hz, z in Bark */

vorbis_dec uses
#define BARK(x) \
    (13.1f*atan(0.00074f*(x))+2.24f*atan(1.85e-8f*(x)*(x))+1e-4f*(x))

does anyone happen to know why there is a difference?
One would think a text from xiph would match a codec from xiph ...


>  
> > Apart from that, I think the previous reviews have not been dealt with yet.
> > That is, the various suggestions for quality improvement should be tried,
> > and whatever is better should be adopted.
> > Also, everything that Gabriel Bouvigne suggested should be tried.
> 
> Err, when I find a way to download them. $20 for a three-page paper is a bit
> high for me.

Forget the papers; implement what does not depend on pay-per-view papers.
IIRC he said something about scalefactors and 3GPP as well.


>  
> > I do not mind if we leave some of the harder things, like the Viterbi-based
> > window decision, until after svn ci, but the majority of the things
> > suggested should be tried before the code is committed.
> 
> Comment on the interface then, or propose your own.
> It will be needed to plug in any psychoacoustic model.
> It would also allow finishing the encoder faster and then concentrating on
> the model(s).

The split between psy and encoder is odd, to say the least.

Things psy can provide, IMHO:
* find perceptual weights per band or per coefficient used for RD
* find the perceptual distortion between 2 time domain signals
* find the perceptual distortion between 2 freq domain signals, possibly
  just a single band or coeff
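
As a purely illustrative sketch of how those three services might look as a
C interface: every name below (PsySketch, flat_psy_sketch and the helper
functions) is hypothetical and taken neither from the patch nor from FFmpeg.
The placeholder implementation uses flat weights and a plain squared error,
which is not a perceptual model; it is only there so that the interface
compiles and can be exercised.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface covering the three services listed above. */
typedef struct PsySketch {
    /* Perceptual weight per band, for use in RD decisions.
     * band_offsets has num_bands + 1 entries. */
    void  (*band_weights)(const float *coefs, const int *band_offsets,
                          int num_bands, float *weights_out);
    /* Distortion between two time-domain signals of n samples. */
    float (*time_distortion)(const float *ref, const float *test, size_t n);
    /* Distortion between two frequency-domain spans of n coefficients
     * (possibly a single band or a single coefficient). */
    float (*freq_distortion)(const float *ref, const float *test, size_t n,
                             float weight);
} PsySketch;

/* Placeholder: every band gets weight 1.0 (no perceptual modelling). */
static void flat_band_weights(const float *coefs, const int *band_offsets,
                              int num_bands, float *weights_out)
{
    int b;
    (void)coefs;
    (void)band_offsets;
    assert(weights_out || num_bands == 0);
    for (b = 0; b < num_bands; b++)
        weights_out[b] = 1.0f;
}

/* Placeholder: plain sum of squared differences. */
static float plain_sse(const float *ref, const float *test, size_t n)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < n; i++) {
        float d = ref[i] - test[i];
        sum += d * d;
    }
    return sum;
}

/* Placeholder: squared error scaled by a caller-supplied weight. */
static float weighted_sse(const float *ref, const float *test, size_t n,
                          float weight)
{
    return weight * plain_sse(ref, test, n);
}

static const PsySketch flat_psy_sketch = {
    flat_band_weights, plain_sse, weighted_sse
};
```

With a table like this the encoder would only ever ask for weights and
distortions, and a real model could replace the placeholders without the
encoder changing.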



[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

I hate to see young programmers poisoned by the kind of thinking
Ulrich Drepper puts forward since it is simply too narrow -- Roman Shaposhnik