[FFmpeg-soc] [soc] libavsequencer [PATCH 02/08] Sub-song public API header file.
vitor1001 at gmail.com
Sat Aug 14 19:44:37 CEST 2010
On 08/14/2010 02:50 PM, Sebastian Vater wrote:
> Vitor Sessak a écrit :
>> On 08/13/2010 10:11 PM, Sebastian Vater wrote:
>>> Vitor Sessak a écrit :
>>>> Shouldn't be the job of whoever fills up the BSS to translate the
>>>> format commands to ours, instead of having two possible interpretation
>>>> of the commands by the player?
>>> This is not only a flag for demuxers, but also for composers, sometimes
>>> composers are such familar with the usual 0-64 volume range, they simply
>>> want to continue using it instead of being forced to new 0-255 range.
>> Composers are not using the BSS, they will use a composing tool. In
>> the worse case, when transcoding from a 0-64 range format to another
>> we will do the conversion to the 0-255 range and back.
> Regarding the volume stuff I'm long time thinking about to extend that
> to at least 16-bit at all, i.e. union of volume and sub-volume, the same
> would apply to panning and sub-panning, etc.
Let's leave this to later.
> Up to now, they're distinct entities, sub-volume and sub-panning is only
> considered during slides but never actually taken into account on final
> This stuff will surely cause interesting discussions. Or to be short,
> yes, there's a lot of things to improve! But all this please, when the
> basic stuff here works, i.e. module playback as original TuComposer did.
>>> This also affects slides, consider a initial volume of 0xFF and 0x40
>>> (old volume).
>> This should be possible to set elsewhere in the BSS, no?
> Hmm where? Would you like that extra defined in each track or sth. like
> that? I think once in a song is enough.
> Maybe I also should mention, that in TuComposer the relationship between
> module and sub-song is like the relationship of album between track.
>>> Now we volume slide down by 04 (which is 01 in old volume). 0xFF will
>>> become 0xFB then and 0x40 will become 0x3F. But 0x3F is 0xFC in 8-bit
>>> volume so it is off by one level.
>> Well, converting 8-bit to 6-bit is lossly, but what about if we
>> convert everything to 8-bit?
> Well, removing this would not actually be a problem, because I never
> used that, in TuComposer all loaders, indeed, have a flag to tell if to
> use 8-bit or 6-bit volumes, but my current implementation always used
> 8-bit though.
> Maybe this is really a candidate to fully remove.
Everything that is not used by any code _should_ be fully removed! It is
easy to start simple and improve afterwards than to try to commit a
complicated patch with unused parts.
>>> To be honest, this flag is for those: "I'm a hardcore MOD correct
>>> playback freak!" guys ;)
>>>>> + AVSEQ_SONG_COMPAT_FLAG_GLOBAL_NEW_ONLY = 0x10, ///< Global
>>>>> volume/panning changes affect new notes only (S3M)
>>>> Hmm, a flag for making "global X affect only Y" looks a sign of bad
>>> This again, is a compatibility flag, S3M global volume/panning effects
>>> are different than used in other trackers. Removing these would require
>>> an intelligent parser of the whole song track data which is a nightmare
>>> to do.
>> However is filling the BSS up wouldn't need to do this parsing anyway?
> No, it requires fully understanding of what's being parsed. Again, this
> is hard to describe in words (this problem was detected after
> disassembly nightmares almost 10 years ago, so I barely remember what
> was the exact cause for this flag, now).
> Let me explain the problem:
> In XM/IT when you use the global volume/panning commands, they only
> affect the next note being played, but not in S3M, here it affects all
> notes being currently played already!
Then it looks clear to me to duplicate all the volume formats to have an
"immediate" version and a "next note" version. Do you agree?
>>> Also OctaMED has a strange feature switching just the timing type (SPD
>>> <=> BpM) without changing the actual value.
>>>>> + AVSEQ_SONG_FLAG_MONO = 0x04, ///< Use mono
>>>>> instead of stereo output
>>>>> + AVSEQ_SONG_FLAG_SURROUND = 0x08, ///< Initial global
>>>>> surround instead of stereo panning
>>>> I don't really understand the comment...
>>> Should mean: Use initial global surround panning instead of stereo, i.e.
>>> all channels default to surround panning. Should I fix this to this
>>>>> + /** AVSequencerOrderList pointer to list of order data. */
>>>>> + AVSequencerOrderList *order_list;
>>>> How does one find the end of the list?
>>> order_list has number of channels elements
>> So this should be expressed in the comment.
>>>>> + /** Stack size, i.e. maximum recursion depth of GoSub command
>>>>> + defaults to 4. */
>>>>> + uint16_t gosub_stack_size;
>>>> Quoting what we discussed:
>>>>>>>>>> Doesn't this depend on the player implementation? Or is it
>>>>>>>>>> format-specific? Or is it read from the file?
>>>>>>>> GOSUB is a TuComposer only feature right now. I thought this to
>>>>>>>> be a
>>>>>>>> nice default value, for creating a new sub-song.
>>>>>> Ok, but does the player need to know this value or it could just fail
>>>>>> if the song use more recursion depth than the player support?
>>> GOSUB is TuComposer only right now. The player needs to know this value
>>> to determine if it reaches end of stack pointer (i.e. allow no further
>>> increase of it), or if it's safe to push the new order list onto a new
>>> element on the stack.
>>> GOSUB is basically a position jump command which pushes next order entry
>>> (before jumping) onto the gosub stack and return will pop it and
>>> continue at that position.
>>> This is useful to call refrains, in old trackers you had to replicate
>>> the whole pattern sequence for the refrain part of a song. In TuComposer
>>> you just can do a GOSUB to the refrain and that's done.
>>> Regarding the default of 4, that's an intuitive impression I got to use
>>> at default, in fact, we can change that to any value we like. ;-)
>> Wait, you are adding a new MOD command and you are expecting people to
>> use it to exploit the fact that the stack pointer is limited to write
>> songs? The clean way to do it is everybody supposes that the stack
>> depth is infinite, if ever a song shows up that uses more than the
>> player supports the player fail with an error. Anyway, is there any
>> sense having a depth of more than, ie, 256?
> Ehm, here you got me really wrong! The initial default does NOT mean,
> you're stick to it. You can set that to any valid 16-bit value you want to.
> The point why I did choose 4 entries is that it will sastify 99% (of my
> felt) needs, since most will call that for refrains and maybe for some
> nice drum loops, etc.
> For those who need more, can simply use the GUI slider to increase the
> stack and then use the gosub with that.
> But good question, if a depth of more than 256 is necessary, maybe some
> professors want to do recursive fibonacci with it? *gg*
> Anyway there will be public API functions for setting new stack sizes.
Let me repeat my question again: Why is the stack depth should be
described in the BSS? Why not set it in player.h?
>>>>> + /** Compatibility flags for playback. There are rare cases
>>>>> + where effect handling can not be mapped into internal
>>>>> + playback engine and have to be handled specially. For
>>>>> + each sub-song which needs this, this will define new
>>>>> + flags which tag the player to handle it to that special
>>>>> + way. */
>>>>> + uint8_t compat_flags;
>>>> enum AVSequencerSongFlags compat_flags;
>>>>> + /** Song playback flags. Some sequencers use a totally
>>>>> + different timing scheme which has to be taken care
>>>>> + specially in the internal playback engine. Also
>>>>> + sequencers differ in how they handle slides. */
>>>>> + uint8_t flags;
>>>>> + /** Maximum number of host channels, as edited in the track view.
>>>>> + to be allocated and usable for order list (defaults to
>>>>> 16). */
>>>>> + uint16_t channels;
>>>> Who sets this, based on what? Does anything fails if one sets this to
>>>> twice the correct value?
>>> Valid channels for sub-songs are 0-255 right now, i.e. channels = 256
>>> being the maximum value. The reason is that the effects use only 8-bit
>>> values to set channel data, i.e. making this value larger would make
>>> them impossible to address them with effects, etc.
>> Again, is this a limitation of the player? If yes, it does not belong
>> to the BSS.
> The BSS should not pass actually data to the player which it can't
No, of course it should! How can the BSS avoid doing it if it is not
supposed to know anything about the player? If the players limitations
should be defined somewhere, it should be in player.h!
> and more than 8-bit channels is not handled correctly, it will
> fallback a lots of features normally available, i.e. exact channel
> control via effects.
> The engines itself don't care much about numbers actually there, in
> theory both the module virtual channels and song host channels can
> handle infinite channels.
> But the effects are designed to access and control them, also, which is
> done currently by 8-bit values (other 8-bit decide control mechanism).
> But I do expect, as a musician that I can control what I play.
Again, what the player can or cannot do is completely irrelevant for the
discussion of the BSS. The BSS should be a way to describe a song, no
matter who (or if) it is going to be played!
> Imagine you're driving a car and I tell you, well, you can control 3 of
> your hoops, but the 4th one is beyond access boundary. Would you
> buy/take such one? No, right?
Imagine you are driving a car. When you are reading your map (that you
brought in some random shop), you find instructions of how bad the 4th
gear in Toyota cars behave, right in the middle of the map! Ok, your car
happens to be a Toyota and the problem happens to be real, but this does
not means such info about belong to the map, it belongs to the car manual!!!
> That's why I keep the limitations to be consistent, i.e. fitting in each
I don't really care much at this point about _what_ the limitations are,
I care now about _where_ they are, and IMHO it is not in the BSS.
> We can't simply say, ok let's change that to uint16_t from
> uint8_t. Or vice versa. I did carefully choose the data structures when
> writing TuComposer. I know, that this not set in stone, but they're
> useful and the whole engine currently is based on this as is.
> To summarize, the current limitations ensure that you can at least can
> control all channels, etc. by effects, which wouldn't be possible
> anymore if we extend that blindnessly.
Again, I'm not discussing the limitations. I'm discussing their presence
in the BSS.
>> To see if it really belongs to the BSS or to the player API, imagine
>> the following situation: someday someone decides to write wrappers so
>> FFmpeg can play MOD using either libavsequence, libmodplug or mikmod
>> (at user's choice). All it would do is to pass the BSS to the wrapper
>> that would call the external library. Would the 256 limitation still
>> makes sense? What if libmodplug supports up to one million channels?
> This limitation is not hard, it's only a soft-limitation in the sense
> of: the commands can't address more channels for direct control.
> Well, I know, I just added a check to avseq_song_set_channels which
> enforces these limits, but these can be removed if necessary. External
> apps are not necessary required to enforce that limits.
> Well, one million channels is quite a lot, you'ld need an CPU 100 times
> faster than most modern ones, even with the low quality mixer.
>>>>> + /** Initial speed divider, i.e. denominator which defaults
>>>>> + to disabled = 0. */
>>>>> + uint8_t speed_div;
>>>>> + /** Initial MED style SPD speed (defaults to 33 as in
>>>>> + OctaMED Soundstudio). */
>>>>> + uint16_t spd_speed;
>>>>> + /** Initial number of rows per beat (defaults to 4 rows are a
>>>>> beat). */
>>>>> + uint16_t bpm_tempo;
>>>>> + /** Initial beats per minute speed (defaults to 50 Hz => 125
>>>>> BpM). */
>>>>> + uint16_t bpm_speed;
>>>> Writing the defaults in the comment is a bad idea, since there is 99%
>>>> of change that if someday one changes them they will forget to update
>>>> the comment.
>>> It's the all trackers default, I encountered so far. ANY tracker I seen
>>> for now, sets BpM speed to 125 and BpM Tempo 4 (most even don't BpM
>>> tempo but only BpM speed, but for even for those BpM tempo 4 is
>>> Anyway, have you a better idea where to put the defaults then?
>> Nowhere. There is already a function that all it does is setting the
>> defaults. People who want to know what the defaults are should read
>> its source.
>> For the comment, I'd suggest:
>> /** Initial beats per minute speed (125 BpM for the vast majority of
>> formats). */
>> uint16_t bpm_speed;
>> Note how my example is _not_ one line of code away to being wrong,
>> while yours is.
> Fixed! Anyway it's the default value which is used when the structure is
> initialized (the same value as used in all trackers as initial value).
>>>>> + /** Minimum and lower limit of number of frames per row
>>>>> + (defaults to 1), a value of zero is pointless, since
>>>>> + that would mean to play unlimited rows and tracks in
>>>>> + just one tick. */
>>>>> + uint16_t frames_min;
>>>> Again, why is this needed? What happens if the correct value is 10 and
>>>> someone sets it to 1?
>>> There is no "correct" value in this sense regarding to the BSS. The
>>> correct ranges solely are based on the input format.
>> So this field do not below in the BSS. The whole idea of the BSS is to
>> be input-format independent!
> Yes, and that's why these values are there. Also composers can set this
> manually. Please don't forgot that ALL fields in the BSS are supposed to
> be editable by the user, too.
> Please note, removing any of these fields will break compatibility with
> IFF-TCM1 file format since TuComposer stores the values.
How so? If it is defined somewhere in IFF-TCM1 files, the decoder are
free to ignore this field, no?
>>> Some trackers allow
>>> 1-255, some 2-255, some 1-65535 (like TuComposer).
>>> This field is also
>>> supposed to be user editable to ensure that speed slides never exceed
>>> min/max ranges.
>> What if a user is editing a MOD file that he intends to save as a
>> completely different format? Again, the idea is the BSS should be
>> format-agnostic. Whoever is trying to create a file (or filling up the
>> BSS for the creation of some file) should take care of that.
> Well in that case, it can save the limits with the new target format (if
> it supports it), or it simply has to discard them. Also the
> muxer/encoder could issue a warning that certain value are not supported.
>>>>> + /** Maximum and upper limit of number of frames per row
>>>>> + (defaults to 255) since a larger value would not make
>>>>> + sense (see track effects, they all set 8-bit values only),
>>>>> + if we would allow a higher speed here, we could never
>>>>> + change the speed values which are larger than 255. */
>>>>> + uint16_t frames_max;
>>>> Again, looks like a limitation of the player, not something that
>>>> should be part of the sub-song description.
>>> It's in fact, a compatibility flag, too. Most trackers support only
>>> 8-bit tempo setting, requiring that values at 255 keep at 255.
>> So this should be named frames_clip_limit and the comment should
>> reflect that. You should then clearly say that for some formats, when
>> dealing with frames per row, 255+1 == 255.
>>> TuComposer however supports a maximum value of 65535. In order not to
>>> break compatibility, this field was introduced. It will retain
>>> compatibility with 8-bit data trackers.
>> This is reasonable.
>>>> These last comments apply to all the limits fields.
>>> The min/max fields are all supposed (as anything in the BSS) to be
>>> directly user editable, e.g. for preventing slides to exceed certain
>> This is a problem of the GUI editor, not of the BSS. You might define
>> later (very later, after playing is working and committed) a new
>> struct containing the different limits of each file format, but should
>> be separated from the BSS.
> The BSS is supposed to be 100% compatible with the IFF-TCM1 file format
> (in fact it's the only defined format right now which is able to store
> the whole BSS), that's also why I don't want to extend the sizes of the
> structs, because that would break 100% IFF-TCM1 compatibility and would
> also requiring introduction of a new format.
??? How so? How hard is to read an integer field from the BSS, convert
it so uint8_t (or fails with an error if it doesn't fit) and write it in
the TCM file? Conversely, for decoding, how hard is converting an
uint8_t read to an int in the BSS?
More information about the FFmpeg-soc