[FFmpeg-devel] Suggestion for a centralized language-tag facility in libavformat

Michael Niedermayer michaelni
Tue Apr 14 19:56:42 CEST 2009


On Tue, Apr 14, 2009 at 05:56:47PM +0200, cyril comparon wrote:
> Please find the updated patch.

> No more parasite code in it, and more memory-efficient language-code
> storage structure.

the changes to asf should still be split off he addition of the language
code remapping stuff


> Anyway, I did not sort it in any way since it is not bound to be used
> in tight loops or any code that may suffer performance penalty because
> of non-optimal look-up time.

i dont think thats a valid excuse, there is no disadvantage in sorting


tabs
avlanguages_2.patch:136:+    	AVMetadataTag *lang_tag = av_metadata_get(s->streams[n]->metadata, "language", NULL, 0);
avlanguages_2.patch:138:+        	const char *language = avlanguage_ISO6392toISO6391(lang_tag->value);
avlanguages_2.patch:145:+    	AVMetadataTag *lang_tag = av_metadata_get(s->streams[n]->metadata, "language", NULL, 0);

not gcc 2.95 compatible
avlanguages_2.patch:675:+        for (int i=0; languageTable[i].iso6392[0] != '\0'; i++)
avlanguages_2.patch:711:+        for (int i=0; languageTable[i].iso6392[0] != '\0'; i++)

Non static with no ff_/av_ prefix
avlanguages_2.patch:686:+const char *avlanguage_ISO6392toISO6391(const char *lang)
avlanguages_2.patch:697:+const char *avlanguage_ISO6392toHumanReadable(const char *lang)
avlanguages_2.patch:708:+const char *avlanguage_ISO6391toISO6392(const char *lang)
avlanguages_2.patch:744:+const char *avlanguage_ISO6392toISO6391(const char *lang);
avlanguages_2.patch:745:+const char *avlanguage_ISO6392toHumanReadable(const char *lang);
avlanguages_2.patch:746:+const char *avlanguage_ISO6391toISO6392(const char *lang);
avlanguages_2.patch:758:+HEADERS = avformat.h avio.h avlanguages.h
avlanguages_2.patch:761:+OBJS = allformats.o cutils.o metadata.o metadata_compat.o options.o os_support.o sdp.o utils.o avlanguages.o

missing } prior to else
avlanguages_2.patch:60:+    else PRINT_IF_GUID(g, ff_asf_language_guid);

 Non doxy comments
avlanguages_2.patch:202:+    const char iso6392[4];    // 3-char code of language as per ISO/IEC 639-2 (always exists)
avlanguages_2.patch:203:+    const char iso6392alt[4]; // alternative 3-char code of language as per ISO/IEC 639-2 (may be empty)
avlanguages_2.patch:204:+    const char iso6391[3];    // 2-char code of language as per ISO/IEC 639-1, (may be empty)
avlanguages_2.patch:205:+    const char english[60];   // readable English name for the language (always exists)
avlanguages_2.patch-206-+} TableEntry;
avlanguages_2.patch-207-+
NULL

Missing changelog entry (ignore if minor change)



[...]

> Index: libavformat/avformat.h
> ===================================================================
> --- libavformat/avformat.h	(revision 18510)
> +++ libavformat/avformat.h	(working copy)
> @@ -390,7 +390,7 @@
>      int64_t duration;
>  
>  #if LIBAVFORMAT_VERSION_INT < (53<<16)
> -    char language[4]; /** ISO 639 3-letter language code (empty string if undefined) */
> +    char language[4]; /** ISO639-2 3-letter language code (empty string if undefined) */
>  #endif

cosmetic

[...]
> +typedef struct {
> +    const char iso6392[4];    // 3-char code of language as per ISO/IEC 639-2 (always exists)
> +    const char iso6392alt[4]; // alternative 3-char code of language as per ISO/IEC 639-2 (may be empty)
> +    const char iso6391[3];    // 2-char code of language as per ISO/IEC 639-1, (may be empty)

these could be 1 byte shorter i think


> +    const char english[60];   // readable English name for the language (always exists)

is 60 not a little long for a sane name of a language
in that sense i think that if this cant be reduced to 1-2 words max
it could maybe be droped

also what does all te (Other) mean?

[...]

> +  { "crh", ""   , ""  , "Crimean Turkish" },
> +  { "ara", ""   , "ar", "Arabic" },
> +  { "mis", ""   , ""  , "Miscellaneous languages" },
> +  { "lim", ""   , ""  , "Lezghian (lezLimburgan - limLimburger - limlimburgish)" },
[...]
> +  { "cai", ""   , ""  , "Central American Indian (Other)" },
[...]
> +  { "myn", ""   , ""  , "Mayan languages" },
[...]
> +  { "grc", ""   , ""  , "Greek, Ancient (to 1453)" },

its Ancient Greek not Greek Ancient, the order is correct in the others above
otherwise it would be "languages Mayan"
Also i doubt the year really should be in there, i doubt
languages switch in exactly 1 year in all greek speaking countries


[...]
> Index: libavformat/Makefile
> ===================================================================
> --- libavformat/Makefile	(revision 18510)
> +++ libavformat/Makefile	(working copy)
> @@ -3,9 +3,9 @@
>  NAME = avformat
>  FFLIBS = avcodec avutil
>  
> -HEADERS = avformat.h avio.h
> +HEADERS = avformat.h avio.h avlanguages.h

iam not sure if this stuff should be part of the public API


[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

He who knows, does not speak. He who speaks, does not know. -- Lao Tsu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20090414/adb66475/attachment.pgp>



More information about the ffmpeg-devel mailing list