[NUT-devel] questions about "Language" info packets

Clemens Ladisch cladisch at fastmail.net
Tue Feb 13 16:26:11 CET 2007


Michael Niedermayer wrote:
> On Tue, Feb 13, 2007 at 12:38:35PM +0100, Clemens Ladisch wrote:
> > nut.txt says:
> > |   "Language"
> > |       ISO 639 and ISO 3166 for language/country code
> > 
> > Does "ISO 639" mean ISO 639-1 or ISO 639-2?
> > Are both codes required or allowed?  If yes, in what format?
> 
> that is a very good question, as the example below is an ISO 639-2 code
> i think it's clear that ISO 639-2 is allowed
> 
> furthermore there is a link
> http://www.loc.gov/standards/iso639-2/englangn.html
> pointing to 639-2 but none to 639-1 so i'd say 639-1 is not allowed
> also all 639-1 codes have a code in 639-2 while many 639-2 codes
> do not have one in 639-1
> comments are of course welcome ...
> 
> > |       something like "eng" (US English)
> > 
> > When using a three-letter code from ISO 639-2, should a nut writer use
> > the bibliographic or the terminology code?
> 
> that is also a very good question, i think none of us was aware that there
> are 2 different codes for some languages (that is, one based on the native
> word for the language and one based on the English word) but luckily the
> majority of the languages have just 1 code

And we Germans are out of luck and cannot use nut?  ;-)

If the language code were just used as a code, it wouldn't matter which
one is to be used, but there are certain players that just display the
raw code instead of converting it to a language name, so I think it
makes sense to let the encoder choose which one to use.
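
To illustrate the "used as a code" case: a player that only compares
languages could fold each bibliographic (B) code onto its terminology (T)
code internally and still display whatever string the file contains.  A
rough sketch in Python; the function names are made up for illustration,
and the table is my own transcription of the 20 B/T pairs, so it should be
checked against the official ISO 639-2 code list before anyone relies on it:

```python
# Bibliographic (B) -> terminology (T) codes for the ISO 639-2 languages
# that have two codes. Transcribed by hand; verify against the official list.
B_TO_T = {
    "alb": "sqi", "arm": "hye", "baq": "eus", "bur": "mya", "chi": "zho",
    "cze": "ces", "dut": "nld", "fre": "fra", "geo": "kat", "ger": "deu",
    "gre": "ell", "ice": "isl", "mac": "mkd", "mao": "mri", "may": "msa",
    "per": "fas", "rum": "ron", "slo": "slk", "tib": "bod", "wel": "cym",
}


def canonical_code(code: str) -> str:
    """Fold a B code onto its T code; leave every other code unchanged."""
    return B_TO_T.get(code, code)


def same_language(a: str, b: str) -> bool:
    """Compare two Language values without caring about the B/T variant."""
    return canonical_code(a) == canonical_code(b)
```

With this, same_language("ger", "deu") is true, while the raw string from
the file stays available for players that display it as-is.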

> > Are two-letter codes allowed at all?
> 
> i'd say no

So ISO 3166 is out, too?

> > |       can be 0 if unknown
> > 
> > Does this mean that there is no Language entry, or that it is an empty
> > string, or that it is a string containing a zero byte, or that the
> > string is "0"?
> 
> hmm ISO 639-2 contains an "und" for undetermined and nothing in our spec
> forbids its use so i am tempted to say that "und" must/should be used if
> unknown and applications must treat an empty string like "und"
> 
> > |       and "multi" if several languages
> > 
> > ISO 639-2 already has "mul" for multiple languages.
> > Does this mean that both "mul" and "multi" are allowed?
> 
> i'd handle this like above:
> "mul" must/should be used if multiple languages but demuxers must
> treat "multi" like "mul"

OK.  Proposed new description:

    "Language"
        An ISO 639-2 (three-letter) language code, e.g. "eng" for English
        (see <http://www.loc.gov/standards/iso639-2/php/code_list.php>).
        All codes defined in ISO 639-2 are allowed, including "und"
        (Undetermined), "mul" (Multiple languages) and the bibliographic/
        terminology variants.
        For historical reasons, demuxers MUST treat "multi" like "mul" and
        "" (the empty string) like "und".

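On the demuxer side, the two compatibility rules in the proposed text could
be handled roughly as follows.  This is only a sketch in Python:
normalize_language and LEGACY_ALIASES are hypothetical names, the check
covers only the shape of a code (three lowercase ASCII letters, which is my
assumption about how codes are written) rather than membership in the
ISO 639-2 list, and raising an error on anything else is one possible
policy, not something the description requires:

```python
import re

# Legacy spellings that the proposed text requires demuxers to accept.
LEGACY_ALIASES = {"multi": "mul", "": "und"}

# Only the shape of a code is checked here (three lowercase ASCII letters),
# not membership in the actual ISO 639-2 list.
_THREE_LETTERS = re.compile(r"[a-z]{3}")


def normalize_language(value: str) -> str:
    """Map legacy values to ISO 639-2 codes; reject anything not code-shaped."""
    code = LEGACY_ALIASES.get(value, value)
    if not _THREE_LETTERS.fullmatch(code):
        raise ValueError(f"not a three-letter language code: {value!r}")
    return code
```

A lenient demuxer might prefer to fall back to "und" instead of raising;
bibliographic and terminology codes pass through unchanged either way.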

Regards,
Clemens


