Previous: Create Tags Table, Up: Tags Tables [Contents][Index]
28.4.2.3 Etags Regexps
The ‘--regex’ option to etags allows tags to be recognized by regular expression matching. You can intermix this option with file names; each one applies to the source files that follow it. If you specify multiple ‘--regex’ options, all of them are used in parallel. The syntax is:
--regex=[{language}]/tagregexp/[nameregexp/]modifiers
The essential part of the option value is tagregexp, the regexp for matching tags. It is always used anchored, that is, it only matches at the beginning of a line. If you want to allow indented tags, use a regexp that matches initial whitespace; start it with ‘[ \t]*’.
In these regular expressions, ‘\’ quotes the next character, and all the C character escape sequences are supported: ‘\a’ for bell, ‘\b’ for backspace, ‘\e’ for escape, ‘\f’ for formfeed, ‘\n’ for newline, ‘\r’ for carriage return, ‘\t’ for tab, and ‘\v’ for vertical tab. In addition, ‘\d’ stands for the DEL character.
Ideally, tagregexp should not match more characters than are needed to recognize what you want to tag. If the syntax requires you to write tagregexp so it matches more characters beyond the tag itself, you should add a nameregexp, to pick out just the tag. This will enable Emacs to find tags more accurately and to do completion on tag names more reliably. In nameregexp, it is frequently convenient to use “back references” (see Regexp Backslash) to parenthesized groupings ‘\( … \)’ in tagregexp. For example, ‘\1’ refers to the first such parenthesized grouping. You can find some examples of this below.
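As a rough illustration of why a nameregexp helps, the sketch below tags a made-up file format in which the keyword has to be part of the match. The file names `widgets.txt` and `widgets.TAGS` and the `widget NAME {` syntax are assumptions invented for this example; only the option syntax comes from the text above.

```shell
# Hypothetical input: definitions of the form "widget NAME {" (invented syntax).
printf 'widget alpha {\n}\n  widget beta {\n}\n' > widgets.txt

# tagregexp has to match the keyword as well as the name, so the
# nameregexp '\1' picks out just the text captured by the first \( ... \).
# The leading [ \t]* lets the indented second definition match too.
if command -v etags >/dev/null 2>&1; then
  etags --language=none \
        --regex='/[ \t]*widget[ \t]+\([^ \t{]+\)/\1/' \
        -o widgets.TAGS widgets.txt \
    || echo 'etags failed (this sketch assumes the etags shipped with GNU Emacs)'
else
  echo 'etags not found; skipping'
fi
```

With the nameregexp in place, the tag names should be the bare identifiers rather than the whole matched text, which is what completion on tag names works from.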
The modifiers are a sequence of zero or more characters that modify the way etags does the matching. A regexp with no modifiers is applied sequentially to each line of the input file, in a case-sensitive way. The modifiers and their meanings are:
- ‘i’ - Ignore case when matching this regexp.
- ‘m’ - Match this regular expression against the whole file, so that multi-line matches are possible.
- ‘s’ - Match this regular expression against the whole file, and allow ‘.’ in tagregexp to match newlines.
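The effect of the modifiers can be sketched with a small experiment. The input file `procs.txt`, its contents, and the output file names are hypothetical; the regexps are only meant to show where each modifier goes in the option value.

```shell
# Hypothetical input: mixed-case keywords, plus one definition whose
# name sits on the line after the keyword.
printf 'PROC one\nproc two\nproc\n  three\n' > procs.txt

if command -v etags >/dev/null 2>&1; then
  # 'i': still matched line by line, but ignoring case, so the
  # upper-case "PROC one" line is intended to match as well.
  etags --language=none --regex='/proc[ \t]+\([a-z]+\)/\1/i' \
        -o procs-i.TAGS procs.txt || echo 'etags failed'

  # 'm': matched against the whole file, so [ \t\n]+ may span the
  # line break between "proc" and "three".
  etags --language=none --regex='/proc[ \t\n]+\([a-z]+\)/\1/m' \
        -o procs-m.TAGS procs.txt || echo 'etags failed'
else
  echo 'etags not found; skipping'
fi
```

Comparing the two resulting tags files is one way to see which definitions each modifier picks up.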
The ‘-R’ option cancels all the regexps defined by preceding ‘--regex’ options. It too applies to the file names following it. Here’s an example:
etags --regex=/reg1/i voo.doo --regex=/reg2/m \
    bar.ber -R --lang=lisp los.er
Here etags chooses the parsing language for voo.doo and bar.ber according to their contents. etags also uses reg1 to recognize additional tags in voo.doo, and both reg1 and reg2 to recognize additional tags in bar.ber. reg1 is checked against each line of voo.doo and bar.ber, in a case-insensitive way, while reg2 is checked against the whole bar.ber file, permitting multi-line matches, in a case-sensitive way. etags uses only the Lisp tags rules, with no user-specified regexp matching, to recognize tags in los.er.
You can restrict a ‘--regex’ option to match only files of a given language by using the optional prefix {language}. (‘etags --help’ prints the list of languages recognized by etags.) This is particularly useful when storing many predefined regular expressions for etags in a file. The following example tags the DEFVAR macros in the Emacs source files, for the C language only:
--regex='{c}/[ \t]*DEFVAR_[A-Z_ \t(]+"\([^"]+\)"/\1/'
When you have complex regular expressions, you can store the list of them in a file. The following option syntax instructs etags to read two files of regular expressions. The regular expressions contained in the second file are matched without regard to case.
--regex=@case-sensitive-file --ignore-case-regex=@ignore-case-file
A regex file for etags contains one regular expression per line. Empty lines, and lines beginning with space or tab are ignored. When the first character in a line is ‘@’, etags assumes that the rest of the line is the name of another file of regular expressions; thus, one such file can include another file. All the other lines are taken to be regular expressions. If the first non-whitespace text on the line is ‘--’, that line is a comment.
For example, we can create a file called ‘emacs.tags’ with the following contents:
        -- This is for GNU Emacs C source files
{c}/[ \t]*DEFVAR_[A-Z_ \t(]+"\([^"]+\)"/\1/
and then use it like this:
etags --regex=@emacs.tags *.[ch] */*.[ch]
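The inclusion mechanism for regex files can be sketched in the same way. The file names `c-rules.tags`, `all-rules.tags`, `demo.c`, and `demo.TAGS` are hypothetical, and `demo.c` is a one-line stand-in rather than real Emacs source; the regexps themselves are taken from this section.

```shell
# A regex file holding only the C rule. The comment line is indented,
# since lines beginning with space or tab are ignored.
cat > c-rules.tags <<'EOF'
    -- Rules for GNU Emacs C source files
{c}/[ \t]*DEFVAR_[A-Z_ \t(]+"\([^"]+\)"/\1/
EOF

# A second regex file that includes the first with '@' and adds one
# more rule. The empty line is ignored.
cat > all-rules.tags <<'EOF'
    -- Pull in the C rules, then add a rule for "proc NAME" lines
@c-rules.tags

/proc[ \t]+\([^ \t]+\)/\1/
EOF

# Stand-in source file with one DEFVAR-style line (invented for the sketch).
printf '  DEFVAR_LISP ("demo-var", Vdemo_var, doc: /* demo */);\n' > demo.c

if command -v etags >/dev/null 2>&1; then
  etags --regex=@all-rules.tags -o demo.TAGS demo.c \
    || echo 'etags failed (this sketch assumes the etags shipped with GNU Emacs)'
else
  echo 'etags not found; skipping'
fi
```

Keeping language-specific rules in separate files and pulling them together with ‘@’ lines is one possible layout; a single file with {language} prefixes would work equally well.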
Here are some more examples. The regexps are quoted to protect them from shell interpretation.
- Tag Octave files:
etags --language=none \
      --regex='/[ \t]*function.*=[ \t]*\([^ \t]*\)[ \t]*(/\1/' \
      --regex='/###key \(.*\)/\1/' \
      --regex='/[ \t]*global[ \t].*/' \
      *.m
- Tag Tcl files:
etags --language=none --regex='/proc[ \t]+\([^ \t]+\)/\1/' *.tcl
- Tag VHDL files:
etags --language=none \
  --regex='/[ \t]*\(ARCHITECTURE\|CONFIGURATION\) +[^ ]* +OF/' \
  --regex='/[ \t]*\(ATTRIBUTE\|ENTITY\|FUNCTION\|PACKAGE\( BODY\)?\|PROCEDURE\|PROCESS\|TYPE\)[ \t]+\([^ \t(]+\)/\3/'