Gettext/Mark-Keywords

4.4 How Marks Appear in Sources

All strings requiring translation should be marked in the C sources. Marking is done in such a way that each translatable string appears to be the sole argument of some function or preprocessor macro. Only a few such functions or macros are meant for translation, and their names are called marking keywords. The marking is attached to the strings themselves, rather than to what we do with them. This approach is more versatile. A clear example is an error message produced by formatting. The format string needs translation, as do some strings inserted through a ‘%s’ specification in the format, while the result from sprintf may have so many different instances that it is impractical to list them all in some ‘error_string_out()’ routine, say.
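As a rough illustration of marking the strings themselves rather than the formatted result, consider the sketch below. The function name ‘format_open_error’ and the message texts are made up for this example; only ‘gettext’ comes from <libintl.h>.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper, not part of GNU gettext.  It formats an error
   message into BUF and returns what snprintf returns.  */
static int
format_open_error (char *buf, size_t size, const char *file_name, int missing)
{
  /* A string inserted through %s is marked where it appears.  */
  const char *reason = missing ? gettext ("no such file")
                               : gettext ("permission denied");

  /* The format string is marked too.  The formatted result is not:
     it has too many possible instances to enumerate.  */
  return snprintf (buf, size, gettext ("cannot open %s: %s"),
                   file_name, reason);
}
```

With no message catalog installed, ‘gettext’ returns its argument unchanged, so the function simply produces the English text.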

This marking operation has two goals. The first goal of marking is to trigger the retrieval of the translation at run time. The keyword may be resolved into a routine able to dynamically return the proper translation, as far as possible or wanted, for the argument string. Most localizable strings are found in executable positions, that is, assigned to variables or given as parameters to functions. But this is not universal usage, and some translatable strings appear in structured initializations. See Special cases.
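The difficulty with structured initializations can be sketched as follows. A function call is not allowed in a static initializer, so the strings are only tagged there and looked up later. The no-op macro ‘gettext_noop’ is defined locally here and is an assumption of this sketch, as are the names ‘day_names’ and ‘day_name’; xgettext must know the chosen keyword for extraction to work.

```c
#include <libintl.h>
#include <stddef.h>

/* Assumption: a no-op marker defined by the program itself.  It tags
   the string for extraction and performs no lookup.  */
#define gettext_noop(String) String

/* Not an executable position: gettext () cannot be called here.  */
static const char *const day_names[] = {
  gettext_noop ("Monday"),
  gettext_noop ("Tuesday"),
};

/* Hypothetical accessor: the translation is retrieved at the point
   of use, at run time.  */
static const char *
day_name (size_t i)
{
  return gettext (day_names[i]);
}
```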

The second goal of the marking operation is to help xgettext properly extract all translatable strings when it scans a set of program sources and produces PO file templates.

The canonical keyword for marking translatable strings is ‘gettext’; it gave its name to the whole GNU gettext package. For packages making only light use of the ‘gettext’ keyword, macro or function, it is easily used as is. However, for packages using the gettext interface more heavily, it is usually more convenient to give the main keyword a shorter, less obtrusive name. Indeed, the keyword might appear on a lot of strings all over the package, and programmers usually neither want nor need their program sources to remind them forcefully, all the time, that they are internationalized. Further, a long keyword has the disadvantage of using more horizontal space, forcing more indentation work on sources for those trying to keep them within 79 or 80 columns.

Many packages use ‘_’ (a simple underscore) as a keyword, and write ‘_("Translatable string")’ instead of ‘gettext ("Translatable string")’. Further, the coding rule from the GNU standards requiring a space between the keyword and the opening parenthesis is relaxed, in practice, for this particular usage. So, the textual overhead per translatable string is reduced to only three characters: the underscore and the two parentheses. However, even if GNU gettext uses this convention internally, it does not offer it officially. The genuine keyword remains ‘gettext’. It is fairly easy for those wanting to use ‘_’ instead of ‘gettext’ to declare:

#include <libintl.h>
#define _(String) gettext (String)

instead of merely using ‘#include <libintl.h>’.

The marking keywords ‘gettext’ and ‘_’ take the translatable string as sole argument. It is also possible to define marking functions that take it at another argument position. It is even possible to make the marked argument position depend on the total number of arguments of the function call; this is useful in C++. All this is achieved using xgettext’s ‘--keyword’ option. How to pass such an option to xgettext, assuming that gettextize is used, is described in the sections on po/Makevars and AM_XGETTEXT_OPTION.
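A marking function with the translatable string at another argument position might look like the sketch below. The function ‘report’ and the file name in the comment are hypothetical; the ‘--keyword=report:2’ form tells xgettext to extract the second argument of calls to ‘report’.

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical reporting function: the translatable string is the
   second argument of each call, such as report (buf, size, 2, "disk full")
   would be if BUF and SIZE were dropped.  In this sketch the message
   is the fourth argument, so the sources would be scanned with
   something like:

       xgettext --keyword=report:4 file.c

   The number after the colon is the argument position to extract.  */
static int
report (char *buf, size_t size, int severity, const char *msgid)
{
  /* The lookup happens inside the function, so callers pass the
     untranslated string literal.  */
  return snprintf (buf, size, "[%d] %s", severity, gettext (msgid));
}
```

A call such as ‘report (buf, sizeof buf, 2, "disk full")’ then needs no ‘_()’ around the literal, since the keyword declaration already covers it.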

Note also that long strings can be split across lines, into multiple adjacent string tokens. Automatic string concatenation is performed at compile time according to ISO C and ISO C++; xgettext also supports this syntax.
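For instance, a long usage message can be written as adjacent string tokens inside a single keyword call. The function name ‘usage_text’ and the message itself are invented for this sketch.

```c
#include <libintl.h>

/* Hypothetical helper returning a two-line usage message.  */
static const char *
usage_text (void)
{
  /* The two adjacent literals are concatenated by the compiler, so
     gettext () receives one string.  */
  return gettext ("Usage: prog [OPTION]... FILE\n"
                  "Process FILE and write the result to standard output.\n");
}
```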

Later on, the maintenance is relatively easy. If, as a programmer, you add or modify a string, you will have to ask yourself if the new or altered string requires translation, and include it within ‘_()’ if you think it should be translated. For example, ‘"%s"’ is a string not requiring translation. But ‘"%s: %d"’ does require translation, because in French, unlike in English, it’s customary to put a space before a colon.
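The distinction between the two format strings can be sketched like this. The helper names ‘show_name’ and ‘show_count’ are hypothetical, and the ‘_’ macro is the one defined earlier in this section.

```c
#include <libintl.h>
#include <stdio.h>

#define _(String) gettext (String)

/* Hypothetical helper: "%s" carries no language-specific text, so it
   is left unmarked.  */
static int
show_name (char *buf, size_t size, const char *name)
{
  return snprintf (buf, size, "%s", name);
}

/* Hypothetical helper: "%s: %d" is marked, since the punctuation
   around the colon may differ between languages.  */
static int
show_count (char *buf, size_t size, const char *name, int count)
{
  return snprintf (buf, size, _("%s: %d"), name, count);
}
```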