[NUT-devel] questions about "Language" info packets
Clemens Ladisch
cladisch at fastmail.net
Tue Feb 13 16:26:11 CET 2007
Michael Niedermayer wrote:
> On Tue, Feb 13, 2007 at 12:38:35PM +0100, Clemens Ladisch wrote:
> > nut.txt says:
> > | "Language"
> > | ISO 639 and ISO 3166 for language/country code
> >
> > Does "ISO 639" mean ISO 639-1 or ISO 639-2?
> > Are both codes required or allowed? If yes, in what format?
>
> that is a very good question, as the example below is an ISO 639-2 code
> i think it's clear that ISO 639-2 is allowed
>
> furthermore there is a link
> http://www.loc.gov/standards/iso639-2/englangn.html
> pointing to 639-2 but none to 639-1 so id say 639-1 is not allowed
> also all 639-1 codes have a code in 639-2 while many 639-2 codes
> do not have one in 639-1
> comments are of course welcome ...
>
> > | something like "eng" (US English)
> >
> > When using a three-letter code from ISO 639-2, should a nut writer use
> > the bibliographic or the terminology code?
>
> that is also a very good question, i think none of us was aware that there
> are 2 different codes for some languages (that is one based on the native
> word for the language and one based on the english word) but luckily the
> majority of the languages has just 1 code

And we Germans are out of luck and cannot use nut? ;-)

If the language code were used only as a machine-readable identifier,
it wouldn't matter which variant is chosen. But some players display the
raw code instead of converting it to a language name, so I think it
makes sense to let the encoder choose which one to use.

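
If the encoder is free to choose, a player that wants to match tracks by
language has to treat both variants as the same language. A rough sketch
of that comparison follows; the table is only a small subset of the
bibliographic/terminology pairs, and the function names are made up for
illustration, not taken from any existing demuxer:

```python
# Illustrative subset of ISO 639-2 codes that exist in both a
# bibliographic (B) and a terminology (T) form. Not the complete list.
B_TO_T = {
    "ger": "deu",  # German
    "fre": "fra",  # French
    "dut": "nld",  # Dutch
    "gre": "ell",  # Greek, Modern
    "chi": "zho",  # Chinese
    "cze": "ces",  # Czech
}


def to_terminology(code: str) -> str:
    """Return the T form of a code; codes without a B/T pair pass through."""
    return B_TO_T.get(code, code)


def same_language(a: str, b: str) -> bool:
    """Compare two codes, treating B and T variants as equal."""
    return to_terminology(a) == to_terminology(b)
```

With this, a file tagged "ger" and a user preference of "deu" would
still match, while the raw code shown to the user stays whatever the
encoder wrote.
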
> > Are two-letter codes allowed at all?
>
> id say no

So ISO 3166 is out, too?

> > | can be 0 if unknown
> >
> > Does this mean that there is no Language entry, or that it is an empty
> > string, or that it is a string containing a zero byte, or that the
> > string is "0"?
>
> hmm ISO 639-2 contains a "und" for undetermined and nothing in our spec
> forbids its use so i am tempted to say that "und" must/should be used if
> unknown and applications must treat an empty string like "und"
>
> > | and "multi" if several languages
> >
> > ISO 639-2 already has "mul" for multiple languages.
> > Does this mean that both "mul" and "multi" are allowed?
>
> id handle this like above:
> "mul" must/should be used if multiple languages but demuxers must
> treat "multi" like "mul"

OK. Proposed new description:

"Language"
An ISO 639-2 (three-letter) language code, e.g. "eng" for English
(see <http://www.loc.gov/standards/iso639-2/php/code_list.php>).
All codes defined in ISO 639-2 are allowed, including "und"
(Undetermined), "mul" (Multiple languages) and the bibliographic/
terminology variants.
For historical reasons, demuxers MUST treat "multi" like "mul" and
"" (the empty string) like "und".
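
On the demuxer side, the last paragraph could be handled roughly as
sketched below. This is only an illustration of the proposed wording:
normalize_language and the alias table are hypothetical names, and the
check covers only the shape of the code (three lowercase letters), not
whether it is actually assigned in ISO 639-2.

```python
import re

# Legacy spellings that the proposed text requires demuxers to accept.
LEGACY_ALIASES = {
    "multi": "mul",  # old nut.txt wording for several languages
    "": "und",       # empty string, treated as undetermined
}

# Shape check only: three lowercase ASCII letters.
_CODE_SHAPE = re.compile(r"[a-z]{3}")


def normalize_language(value: str) -> str:
    """Map legacy values to ISO 639-2 codes and reject malformed ones."""
    value = LEGACY_ALIASES.get(value, value)
    if not _CODE_SHAPE.fullmatch(value):
        raise ValueError(f"not a three-letter language code: {value!r}")
    return value
```

Bibliographic and terminology variants such as "ger" and "deu" both pass
through unchanged, since the proposal allows either.
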

Regards,
Clemens