Regexp Summary (The GNU Awk User’s Guide)


3.9 Summary

  • Regular expressions describe sets of strings to be matched. In awk, regular expression constants are written between slashes, as in /foo/.
  • Regexp constants may be used standalone in patterns and in conditional expressions, or as part of matching expressions using the ‘~’ and ‘!~’ operators.
  • Escape sequences let you represent nonprintable characters and also let you represent regexp metacharacters as literal characters to be matched.
  • Regexp operators provide grouping, alternation, and repetition.
  • Bracket expressions give you a shorthand for specifying sets of characters that can match at a particular point in a regexp. Within bracket expressions, POSIX character classes let you specify certain groups of characters in a locale-independent fashion.
  • Regular expressions match the leftmost longest text in the string being matched. This matters for cases where you need to know the extent of the match, such as for text substitution and when the record separator is a regexp.
  • Matching expressions may use dynamic regexps (i.e., string values treated as regular expressions).
  • gawk’s IGNORECASE variable lets you control the case sensitivity of regexp matching. In other awk versions, use tolower() or toupper().
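
The points above can be sketched with a few one-liners. This is a minimal illustration, not taken from the manual: the sample inputs and the shell variable names (standalone, matched, longest, dynamic, nocase) are hypothetical, and the commands assume a POSIX-compatible awk on the PATH.

```sh
# Hypothetical sample data throughout; only standard awk features are used.

# A regexp constant used standalone as a pattern: print records containing "foo".
standalone=$(printf 'foo\nbar\n' | awk '/foo/')

# An explicit matching expression with ~ applied to one field.
matched=$(printf 'a 1\nb x\n' | awk '$2 ~ /^[0-9]+$/ { print $1 }')

# Leftmost-longest matching: sub() replaces the whole run of a's, not just the first one.
longest=$(echo 'aaab' | awk '{ sub(/a+/, "X"); print }')

# A dynamic regexp: the string value of the variable re is treated as a regexp.
dynamic=$(printf 'cat\ndog\n' | awk -v re='^d' '$0 ~ re')

# Portable case-insensitive matching: fold the record to lowercase before matching.
nocase=$(printf 'Hello\nworld\n' | awk 'tolower($0) ~ /hello/')

printf '%s\n' "$standalone" "$matched" "$longest" "$dynamic" "$nocase"
```

The last command uses tolower() so that it works in any awk. In gawk, setting IGNORECASE to a nonzero value (for example in a BEGIN rule) should give the same result without folding the record.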