MG Mud User | 88f1247 | 2016-06-24 23:31:02 +0200 | 1 | SYNOPSIS |
| 2 | Regular Expressions |
| 3 | |
| 4 | |
| 5 | DESCRIPTION |
| 6 | LDMud supports both the traditional regular expressions as |
| 7 | implemented by Henry Spencer ("HS" or "traditional"), and the |
| 8 | Perl-compatible regular expressions by Philip Hazel ("PCRE"). |
| 9 | Both packages can be used concurrently, with the selection |
| 10 | being made through an extra option flags argument to the |
| 11 | efuns. One of the two packages can be selected as the default |
| 12 | package at compile time, by command line argument, or by |
| 13 | driver hook. |
| 14 | |
| 15 | The packages differ in the expressivity of their expressions |
| 16 | (PCRE offering more options than Henry Spencer's package), |
| 17 | though they both implement the common subset outlined below. |
| 18 | |
| 19 | All regular expression efuns take an additional options |
| 20 | parameter, which is a number composed of bitflags, and is |
| 21 | used to modify the exact behaviour of the expression |
| 22 | evaluation. In addition, certain efuns may accept additional |
| 23 | specific options. |
| 24 | |
| 25 | For details, refer to the detailed manpages: hsregexp(C) for |
| 26 | the Henry Spencer package, pcre(C) for the PCRE package. |
| 27 | |
| 28 | |
| 29 | REGULAR EXPRESSION DETAILS |
| 30 | A regular expression is a pattern that is matched against a |
| 31 | subject string from left to right. Most characters stand for |
| 32 | themselves in a pattern, and match the corresponding |
| 33 | characters in the subject. As a trivial example, the pattern |
| 34 | |
| 35 | The quick brown fox |
| 36 | |
| 37 | matches a portion of a subject string that is identical to |
| 38 | itself. The power of regular expressions comes from the |
| 39 | ability to include alternatives and repetitions in the |
| 40 | pattern. These are encoded in the pattern by the use of |
| 41 | metacharacters, which do not stand for themselves but instead |
| 42 | are interpreted in some special way. |
| 43 | |
| 44 | The following metacharacters are 'universal' in that both regexp |
| 45 | packages understand them in the same way: |
| 46 | |
| 47 | . Match any character. |
| 48 | |
| 49 | ^ Match beginning of line. |
| 50 | |
| 51 | $ Match end of line. |
| 52 | |
| 53 | x|y Match regexp x or regexp y. |
| 54 | |
| 55 | () Match enclosed regexp like a 'simple' one. |
| 56 | |
| 57 | x* Match any number (0 or more) of regexp x. |
| 58 | |
| 59 | x+ Match any number (1 or more) of regexp x. |
| 60 | |
| 61 | [..] Match one of the characters enclosed. |
| 62 | |
| 63 | [^ ..] Match none of the characters enclosed. The .. are to |
| 64 | be replaced by single characters or character ranges: |
| 65 | |
| 66 | [abc] matches a, b or c. |
| 67 | |
| 68 | [ab0-9] matches a, b or any digit. |
| 69 | |
| 70 | [^a-z] matches any character except a lowercase letter. |
| 71 | |
| 72 | \B not a word boundary |
| 73 | |
| 74 | \c match character c even if it's one of the special |
| 75 | characters. |
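
As an illustration, the common subset can be exercised through the
regexp() efun with either package. This is a minimal sketch, assuming
the mudlib provides <regexp.h>; the function name common_subset_demo()
is made up for this example and is not part of the driver.

```lpc
#include <regexp.h>

// Hypothetical helper: filters a list with a pattern that uses only
// metacharacters from the common subset (^, [..], x+ and $).
string *common_subset_demo(int pkg)
{
    string *words = ({ "cat", "cot", "coat", "dog" });

    // <pkg> is expected to be RE_TRADITIONAL or RE_PCRE.
    // regexp() returns the elements of <words> that match the pattern.
    return regexp(words, "^c[ao]+t$", pkg);
}
```

Calling common_subset_demo(RE_TRADITIONAL) and
common_subset_demo(RE_PCRE) should both yield
({ "cat", "cot", "coat" }), since the pattern stays within the subset
that both packages share.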
| 76 | |
| 77 | The following metacharacters or metacharacter combinations implement |
| 78 | similar functions in the two regexp packages: |
| 79 | |
| 80 | \b PCRE: word boundary, also used in conjunction with |
| 81 | \w (any "word" character) and \W (any "non-word" |
| 82 | character). |
| 83 | |
| 84 | \< HS: Match beginning of word. |
| 85 | \> HS: Match end of word. |
| 86 | |
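
The difference in word-boundary spelling can be sketched as follows.
Both helpers are hypothetical and only assume the regexp() efun and
the flags from <regexp.h>; each selects the strings that contain
"fox" as a whole word, written in the form its package understands.

```lpc
#include <regexp.h>

// Backslashes are doubled because they sit inside LPC string literals.

// PCRE spelling: \b marks a word boundary on either side.
string *whole_word_pcre(string *lines)
{
    return regexp(lines, "\\bfox\\b", RE_PCRE);
}

// Henry Spencer spelling: \< and \> mark beginning and end of a word.
string *whole_word_hs(string *lines)
{
    return regexp(lines, "\\<fox\\>", RE_TRADITIONAL);
}
```

Given ({ "the fox", "foxes", "a fox ran" }), both helpers should keep
the first and third string and drop "foxes".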
| 87 | |
| 88 | OPTIONS |
| 89 | The package is selected with these option flags: |
| 90 | |
| 91 | RE_PCRE |
| 92 | RE_TRADITIONAL |
| 93 | |
| 94 | These flags are also used for the H_REGEXP_PACKAGE driver |
| 95 | hook. |
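
A minimal sketch of selecting the default package by driver hook,
assuming the code runs in the master object and that <driver_hook.h>
and <regexp.h> are available in the mudlib. The function names are
invented for this example.

```lpc
#include <driver_hook.h>
#include <regexp.h>

// Hypothetical helper, intended to be called from the master object
// (for example while it sets up its other driver hooks).
void make_pcre_default()
{
    set_driver_hook(H_REGEXP_PACKAGE, RE_PCRE);
}

// Returns 1 if PCRE is currently the default package, 0 otherwise.
int pcre_is_default()
{
    return regexp_package() == RE_PCRE;
}
```

Efun calls that pass neither RE_PCRE nor RE_TRADITIONAL then use the
package chosen this way.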
| 96 | |
| 97 | |
| 98 | Traditional regular expressions understand one option: |
| 99 | |
| 100 | RE_EXCOMPATIBLE |
| 101 | |
| 102 | |
| 103 | PCRE understands these options: |
| 104 | |
| 105 | RE_ANCHORED |
| 106 | RE_CASELESS |
| 107 | RE_DOLLAR_ENDONLY |
| 108 | RE_DOTALL |
| 109 | RE_EXTENDED |
| 110 | RE_MULTILINE |
| 111 | RE_UNGREEDY |
| 112 | RE_NOTBOL |
| 113 | RE_NOTEOL |
| 114 | RE_NOTEMPTY |
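
Since the options are bitflags, the package selection and the
package-specific options are or'ed into a single number. The
following is a sketch under the assumption that <regexp.h> defines
the flags listed above; caseless_demo() is a hypothetical name.

```lpc
#include <regexp.h>

// Hypothetical helper: case-insensitive whole-string match using PCRE.
string *caseless_demo()
{
    string *words = ({ "Fox", "fox", "box" });

    // RE_PCRE selects the package, RE_CASELESS modifies the matching.
    return regexp(words, "^fox$", RE_PCRE | RE_CASELESS);
}
```

The call should return ({ "Fox", "fox" }); without RE_CASELESS only
"fox" would be expected to match.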
| 115 | |
| 116 | HISTORY |
| 117 | LDMud 3.3.596 implemented the concurrent use of both packages. |
| 118 | |
| 119 | SEE ALSO |
| 120 | hsregexp(C), pcre(C), regexp_package(H), regexp(E), regexplode(E), |
| 121 | regmatch(E), regreplace(E), regexp_package(E), invocation(D) |