Gettext/Language-Implementors
Next: Programmers for other Languages, Up: Programming Languages [Contents][Index]
15.1 The Language Implementor’s View
All programming and scripting languages that have the notion of strings
are eligible to support gettext. Supporting gettext means the following:
- You should add to the language a syntax for translatable strings. In
principle, a function call of gettext would do, but a shorthand syntax
helps keep internationalized programs legible. For example, in C we use
the syntax _("string"), and in GNU awk we use the shorthand _"string".
- You should arrange that evaluation of such a translatable string at
runtime calls the gettext function, or performs equivalent processing.
- Similarly, you should make the functions ngettext, dcgettext, and
dcngettext available from within the language. These functions are used
less often, but are nevertheless necessary for particular purposes:
ngettext for correct plural handling, and dcgettext and dcngettext for
obeying locale-related environment variables other than LC_MESSAGES,
such as LC_TIME or LC_MONETARY. For these latter functions, you need to
make the LC_* constants, available in the C header <locale.h>,
referenceable from within the language, usually either as enumeration
values or as strings.
- You should allow the programmer to designate a message domain, either by
making the textdomain function available from within the language, or by
introducing a magic variable called TEXTDOMAIN. Similarly, you should
allow the programmer to designate where to search for message catalogs,
by providing access to the bindtextdomain function or, on native Windows
platforms, to the wbindtextdomain function.
- You should either perform a setlocale (LC_ALL, "") call during the
startup of your language runtime, or allow the programmer to do so.
Remember that gettext will act as a no-op if the LC_MESSAGES and LC_CTYPE
locale categories are not both set.
A programmer should have a way to extract translatable strings from a
program into a PO file. The GNU
xgettext program is being extended to support a wide range of programming
languages. Please contact the GNU gettext maintainers to help them do this.
The GNU gettext maintainers will need from you a formal description of the
lexical structure of source files. It should answer the questions:
- What does a token look like?
- What does a string literal look like? What escape characters exist inside a string?
- What escape characters exist outside of strings? If Unicode escapes are supported, are they applied before or after tokenization?
- What is the syntax for function calls? How are consecutive arguments in the same function call separated?
- What is the syntax for comments?
Based on this description, the GNU gettext maintainers can add support to
xgettext.
If the string extractor is best integrated into your language’s parser, GNU
xgettext can function as a front end to your string extractor.
The language’s library should have a string formatting facility. Additionally:
- There must be a way, in the format string, to denote the arguments by a positional number or a name. This is needed because for some languages and some messages with more than one substitutable argument, the translation will need to output the substituted arguments in a different order. See c-format Flag.
- The syntax of format strings must be documented in a way that translators
can understand. The GNU gettext manual will be extended to include a
pointer to this documentation.
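The need for named or numbered arguments can be illustrated with a short sketch. It uses Python's str.format syntax purely as an example of a format facility with named arguments; the message, its "translation", and the helper names placeholders and same_arguments are hypothetical and not part of GNU gettext. The second helper shows, in simplified form, the kind of comparison an equivalence checker could perform.

```python
from string import Formatter

# Hypothetical catalog entry: the translation mentions the arguments
# in the opposite order from the original message.
msgid = "Cannot copy {source} to {target}"
msgstr = "{target}: copy from {source} failed"


def placeholders(fmt):
    """Return the set of argument names or numbers used in a format string."""
    return {field for _, field, _, _ in Formatter().parse(fmt)
            if field is not None}


def same_arguments(original, translation):
    """Simplified check: both strings must refer to the same arguments."""
    return placeholders(original) == placeholders(translation)


# Because arguments are named, reordering them in the translation is safe.
print(msgstr.format(source="a.txt", target="b.txt"))
print(same_arguments(msgid, msgstr))             # True
print(same_arguments(msgid, "{target} failed"))  # False: {source} is missing
```

With purely sequential placeholders, the translator could not have moved the target in front of the source without changing the program.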
Based on this documentation, the GNU gettext maintainers can add a format string equivalence checker to msgfmt, so that translators are told immediately when they have made a mistake during the translation of a format string.
- If the language has more than one implementation, and not all of the
implementations use gettext, but the programs should be portable across
implementations, you should provide a no-i18n emulation that makes the
other implementations accept programs written for yours, without actually
translating the strings.
- To help the programmer in the task of marking translatable strings,
which is sometimes performed using the Emacs PO mode (see Marking),
you are welcome to contact the GNU gettext maintainers, so they can add
support for your language to po-mode.el.
On the implementation side, two approaches are possible, with different effects on portability and copyright:
- You may link against GNU gettext functions if they are found in the C
library. For example, an autoconf test for gettext() and ngettext() will
detect this situation. For the moment, this test will succeed on GNU
systems and on Solaris 11 platforms. No severe copyright restrictions
apply, except if you want to distribute statically linked binaries.
- You may emulate or reimplement the GNU gettext functionality. This has
the advantage of full portability and no copyright restrictions, but also
the drawback that you have to reimplement the GNU gettext features (such
as the LANGUAGE environment variable, the locale aliases database, the
automatic charset conversion, and plural handling).
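To give a feeling for the work the second approach involves, here is a simplified sketch of just one of the features listed: deciding which languages to try, based on the LANGUAGE variable and the locale environment variables. The function names expand and candidate_languages are hypothetical, and the sketch deliberately ignores the locale aliases database, charset conversion, and other details that a real reimplementation would have to handle.

```python
def expand(name):
    """Strip codeset and modifier, then add the bare language as a fallback.

    For example, "de_AT.UTF-8@euro" yields ["de_AT", "de"].
    """
    base = name.split("@", 1)[0].split(".", 1)[0]
    language = base.split("_", 1)[0]
    return [base] if language == base else [base, language]


def candidate_languages(env):
    """Return catalog languages to try, in priority order (simplified)."""
    # The locale for messages: first non-empty of LC_ALL, LC_MESSAGES, LANG.
    locale_name = next(
        (env[var] for var in ("LC_ALL", "LC_MESSAGES", "LANG") if env.get(var)),
        "",
    )
    base = locale_name.split("@", 1)[0].split(".", 1)[0]
    if base in ("", "C", "POSIX"):
        return []  # no translation in the C locale; LANGUAGE is ignored
    # LANGUAGE is a colon-separated priority list that overrides the locale.
    names = [n for n in env.get("LANGUAGE", "").split(":") if n] or [locale_name]
    result = []
    for name in names:
        for candidate in expand(name):
            if candidate not in result:
                result.append(candidate)
    return result


print(candidate_languages({"LANG": "de_AT.UTF-8"}))                     # ['de_AT', 'de']
print(candidate_languages({"LANG": "de_AT.UTF-8", "LANGUAGE": "sv:de"}))  # ['sv', 'de']
print(candidate_languages({"LANG": "C", "LANGUAGE": "sv"}))              # []
```

The function takes the environment as a plain dictionary so that it can be exercised without touching the process environment; a runtime would typically pass os.environ.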