In Emacs, charset is short for “character set”. Emacs
supports most popular charsets (such as
addition to some charsets of its own (such as
eight-bit). All supported characters
belong to one or more charsets.
Emacs normally does the right thing with respect to charsets, so that you don’t have to worry about them. However, it is sometimes helpful to know some of the underlying details about charsets.
One example is font selection (see Fonts). Each language
environment (see Language Environments) defines a priority
list for the various charsets. When searching for a font, Emacs
initially attempts to find one that can display the highest-priority
charsets. For instance, in the Japanese language environment, the
japanese-jisx0208 has the highest priority, so Emacs
tries to use a font whose
registry property is
There are two commands that can be used to obtain information about charsets. The command M-x list-charset-chars prompts for a charset name, and displays all the characters in that character set. The command M-x describe-character-set prompts for a charset name, and displays information about that charset, including its internal representation within Emacs.
M-x list-character-sets displays a list of all supported charsets. The list gives the names of charsets and additional information to identity each charset; for more details, see the [https://www.itscj.ipsj.or.jp/itscj_english/iso-ir/ISO-IR.pdf ISO International Register of Coded Character Sets to be Used with Escape Sequences (ISO-IR)] maintained by the [https://www.itscj.ipsj.or.jp/itscj_english/ Information Processing Society of Japan/Information Technology Standards Commission of Japan (IPSJ/ITSCJ)]. In this list, charsets are divided into two categories: normal charsets are listed first, followed by supplementary charsets. A supplementary charset is one that is used to define another charset (as a parent or a subset), or to provide backward-compatibility for older Emacs versions.
To find out which charset a character in the buffer belongs to, put point before it and type C-u C-x = (see International Chars).