DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 

(mysql.info) regexp

Info Catalog (mysql.info) environment-variables (mysql.info) Top (mysql.info) limits
 
 Appendix G Regular Expressions
 ******************************
 
 A regular expression is a powerful way of specifying a pattern for a
 complex search.
 
 MySQL uses Henry Spencer's implementation of regular expressions, which
 is aimed at conformance with POSIX 1003.2. See  credits. MySQL
 uses the extended version to support pattern-matching operations
 performed with the `REGEXP' operator in SQL statements. See 
 pattern-matching, and  string-comparison-functions.
 
 This appendix is a summary, with examples, of the special characters
 and constructs that can be used in MySQL for `REGEXP' operations. It
 does not contain all the details that can be found in Henry Spencer's
 `regex(7)' manual page. That manual page is included in MySQL source
 distributions, in the `regex.7' file under the `regex' directory.
 
 A regular expression describes a set of strings. The simplest regular
 expression is one that has no special characters in it. For example,
 the regular expression `hello' matches `hello' and nothing else.
 
 Non-trivial regular expressions use certain special constructs so that
 they can match more than one string. For example, the regular
 expression `hello|word' matches either the string `hello' or the string
 `word'.
 
 As a more complex example, the regular expression `B[an]*s' matches any
 of the strings `Bananas', `Baaaaas', `Bs', and any other string
 starting with a `B', ending with an `s', and containing any number of
 `a' or `n' characters in between.
 
 A regular expression for the `REGEXP' operator may use any of the
 following special characters and constructs:
 
    * `^'
 
      Match the beginning of a string.
 
           mysql> SELECT 'fo\nfo' REGEXP '^fo$';                   -> 0
           mysql> SELECT 'fofo' REGEXP '^fo';                      -> 1
 
    * `$'
 
      Match the end of a string.
 
           mysql> SELECT 'fo\no' REGEXP '^fo\no$';                 -> 1
           mysql> SELECT 'fo\no' REGEXP '^fo$';                    -> 0
 
    * `.'
 
      Match any character (including carriage return and newline).
 
           mysql> SELECT 'fofo' REGEXP '^f.*$';                    -> 1
           mysql> SELECT 'fo\r\nfo' REGEXP '^f.*$';                -> 1
 
    * `a*'
 
      Match any sequence of zero or more `a' characters.
 
           mysql> SELECT 'Ban' REGEXP '^Ba*n';                     -> 1
           mysql> SELECT 'Baaan' REGEXP '^Ba*n';                   -> 1
           mysql> SELECT 'Bn' REGEXP '^Ba*n';                      -> 1
 
    * `a+'
 
      Match any sequence of one or more `a' characters.
 
           mysql> SELECT 'Ban' REGEXP '^Ba+n';                     -> 1
           mysql> SELECT 'Bn' REGEXP '^Ba+n';                      -> 0
 
    * `a?'
 
      Match either zero or one `a' character.
 
           mysql> SELECT 'Bn' REGEXP '^Ba?n';                      -> 1
           mysql> SELECT 'Ban' REGEXP '^Ba?n';                     -> 1
           mysql> SELECT 'Baan' REGEXP '^Ba?n';                    -> 0
 
    * `de|abc'
 
      Match either of the sequences `de' or `abc'.
 
           mysql> SELECT 'pi' REGEXP 'pi|apa';                     -> 1
           mysql> SELECT 'axe' REGEXP 'pi|apa';                    -> 0
           mysql> SELECT 'apa' REGEXP 'pi|apa';                    -> 1
           mysql> SELECT 'apa' REGEXP '^(pi|apa)$';                -> 1
           mysql> SELECT 'pi' REGEXP '^(pi|apa)$';                 -> 1
           mysql> SELECT 'pix' REGEXP '^(pi|apa)$';                -> 0
 
    * `(abc)*'
 
      Match zero or more instances of the sequence `abc'.
 
           mysql> SELECT 'pi' REGEXP '^(pi)*$';                    -> 1
           mysql> SELECT 'pip' REGEXP '^(pi)*$';                   -> 0
           mysql> SELECT 'pipi' REGEXP '^(pi)*$';                  -> 1
 
    * `{1}', `{2,3}'
 
      `{n}' or `{m,n}' notation provides a more general way of writing
      regular expressions that match many occurrences of the previous
      atom (or `piece') of the pattern. `m' and `n' are integers.
 
         * `a*'
 
           Can be written as `a{0,}'.
 
         * `a+'
 
           Can be written as `a{1,}'.
 
         * `a?'
 
           Can be written as `a{0,1}'.
 
      To be more precise, `a{n}' matches exactly `n' instances of `a'.
      `a{n,}' matches `n' or more instances of `a'. `a{m,n}' matches `m'
      through `n' instances of `a', inclusive.
 
      `m' and `n' must be in the range from `0' to `RE_DUP_MAX' (default
      255), inclusive. If both `m' and `n' are given, `m' must be less
      than or equal to `n'.
 
           mysql> SELECT 'abcde' REGEXP 'a[bcd]{2}e';              -> 0
           mysql> SELECT 'abcde' REGEXP 'a[bcd]{3}e';              -> 1
           mysql> SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';           -> 1
 
    * `[a-dX]', `[^a-dX]'
 
      Matches any character that is (or is not, if ^ is used) either
      `a', `b', `c', `d' or `X'. A `-' character between two other
      characters forms a range that matches all characters from the
      first character to the second. For example, `[0-9]' matches any
      decimal digit. To include a literal `]' character, it must
      immediately follow the opening bracket `['. To include a literal
      `-' character, it must be written first or last. Any character
      that does not have a defined special meaning inside a `[]' pair
      matches only itself.
 
           mysql> SELECT 'aXbc' REGEXP '[a-dXYZ]';                 -> 1
           mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]$';               -> 0
           mysql> SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';              -> 1
           mysql> SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';             -> 0
           mysql> SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';            -> 1
           mysql> SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';           -> 0
 
    * `[.characters.]'
 
      Within a bracket expression (written using `[' and `]'), matches
      the sequence of characters of that collating element. `characters'
      is either a single character or a character name like `newline'.
      You can find the full list of character names in the
      `regexp/cname.h' file.
 
           mysql> SELECT '~' REGEXP '[[.~.]]';                     -> 1
           mysql> SELECT '~' REGEXP '[[.tilde.]]';                 -> 1
 
    * `[=character_class=]'
 
      Within a bracket expression (written using `[' and `]'),
      `[=character_class=]' represents an equivalence class. It matches
      all characters with the same collation value, including itself.
      For example, if `o' and `(+)' are the members of an equivalence
      class, then `[[=o=]]', `[[=(+)=]]', and `[o(+)]' are all
      synonymous. An equivalence class may not be used as an endpoint of
      a range.
 
    * `[:character_class:]'
 
      Within a bracket expression (written using `[' and `]'),
      `[:character_class:]' represents a character class that matches
      all characters belonging to that class. The following table lists
      the standard class names. These names stand for the character
      classes defined in the `ctype(3)' manual page. A particular locale
      may provide other class names. A character class may not be used as
      an endpoint of a range.
 
      `alnum' Alphanumeric characters
      `alpha' Alphabetic characters
      `blank' Whitespace characters
      `cntrl' Control characters
      `digit' Digit characters
      `graph' Graphic characters
      `lower' Lowercase alphabetic characters
      `print' Graphic or space characters
      `punct' Punctuation characters
      `space' Space, tab, newline, and carriage return
      `upper' Uppercase alphabetic characters
      `xdigit'Hexadecimal digit characters
 
           mysql> SELECT 'justalnums' REGEXP '[[:alnum:]]+';       -> 1
           mysql> SELECT '!!' REGEXP '[[:alnum:]]+';               -> 0
 
    * `[[:<:]]', `[[:>:]]'
 
      These markers stand for word boundaries. They match the beginning
      and end of words, respectively. A word is a sequence of word
      characters that is not preceded by or followed by word characters.
      A word character is an alphanumeric character in the `alnum' class
      or an underscore (`_').
 
           mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';   -> 1
           mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';  -> 0
 
 To use a literal instance of a special character in a regular
 expression, precede it by two backslash (\) characters. The MySQL
 parser interprets one of the backslashes, and the regular expression
 library interprets the other. For example, to match the string `1+2'
 that contains the special `+' character, only the last of the following
 regular expressions is the correct one:
 
      mysql> SELECT '1+2' REGEXP '1+2';                       -> 0
      mysql> SELECT '1+2' REGEXP '1\+2';                      -> 0
      mysql> SELECT '1+2' REGEXP '1\\+2';                     -> 1
 
Info Catalog (mysql.info) environment-variables (mysql.info) Top (mysql.info) limits
automatically generated byinfo2html