doc/concepts/regexp - mudlib-public - Gitiles

 SYNOPSIS
         Regular Expressions


 DESCRIPTION
         LDMud supports both the traditional regular expressions as
         implemented by Henry Spencer ("HS" or "traditional"), and the
         Perl-compatible regular expressions by Philip Hazel ("PCRE").
         Both packages can be used concurrently, with the selection
         being made through extra option flags argument to the efuns.
         One of the two packages can be selected at compile time, by
         commandline argument, and by driver hook to be the default
         package.

         The packages differ in the expressivity of their expressions
         (PCRE offering more options that Henry Spencer's package),
         though they both implement the common subset outlined below.

         All regular expression efuns take an additional options
         parameter, which is a an number composed of bitflags, and is
         used to modify the exact behaviour of the expression
         evaluation. In addition, certain efuns may accept additional
         specific options.

         For details, refer to the detailed manpages: hsregexp(C) for
         the Henry Spencer package, pcre(C) for the PCRE package.


 REGULAR EXPRESSION DETAILS
         A regular expression is a pattern that is matched against  a
         subject string from left to right. Most characters stand for
         themselves in a pattern, and match the corresponding charac-
         ters in the subject. As a trivial example, the pattern

           The quick brown fox

         matches a portion of a subject string that is  identical  to
         itself.  The  power  of  regular  expressions comes from the
         ability to include alternatives and repetitions in the  pat-
         tern.  These  are encoded in the pattern by the use of meta-
         characters, which do not stand for  themselves  but  instead
         are interpreted in some special way.

         The following metacharacters are 'universal' in that both regexp
         packages understand them in the same way:

           .       Match any character.

           ^       Match begin of line.

           $       Match end of line.

           x|y     Match regexp x or regexp y.

           ()      Match enclosed regexp like a 'simple' one.

           x*      Match any number (0 or more) of regexp x.

           x+      Match any number (1 or more) of regexp x.

           [..]    Match one of the characters enclosed.

           [^ ..]  Match none of the characters enclosed. The .. are to
                   replaced by single characters or character ranges:

           [abc]   matches a, b or c.

           [ab0-9] matches a, b or any digit.

           [^a-z]  does not match any lowercase character.

           \B      not a word boundary

           \c      match character c even if it's one of the special
                   characters.

         The following metacharacters or metacharacter combinations implement
         similar functions in the two regexp packages;

           \b      PCRE: word boundary, also used inconjunction with
                         \w (any "word" character) and \W (any "non-word"
                         character).

           \<      HS: Match begin of word.
           \>      HS: Match end of word.


 OPTIONS
         The package is selected with these option flags:

           RE_PCRE
           RE_TRADITIONAL

         These flags are also used for the H_REGEXP_PACKAGE driver
         hook.


         Traditional regular expressions understand one option:

           RE_EXCOMPATIBLE


         PCRE understands these options:

           RE_ANCHORED
           RE_CASELESS
           RE_DOLLAR_ENDONLY
           RE_DOTALL
           RE_EXTENDED
           RE_MULTILINE
           RE_UNGREEDY
           RE_NOTBOL
           RE_NOTEOL
           RE_NOTEMPTY

 HISTORY
         LDMud 3.3.596 implemented the concurrent use of both packages.

 SEE ALSO
         hsregexp(C), pcre(C), regexp_package(H), regexp(E), regexplode(E),
         regmatch(E), regreplace(E), regexp_package(E), invocation(D)
	SYNOPSIS
	Regular Expressions


	DESCRIPTION
	LDMud supports both the traditional regular expressions as
	implemented by Henry Spencer ("HS" or "traditional"), and the
	Perl-compatible regular expressions by Philip Hazel ("PCRE").
	Both packages can be used concurrently, with the selection
	being made through extra option flags argument to the efuns.
	One of the two packages can be selected at compile time, by
	commandline argument, and by driver hook to be the default
	package.

	The packages differ in the expressivity of their expressions
	(PCRE offering more options that Henry Spencer's package),
	though they both implement the common subset outlined below.

	All regular expression efuns take an additional options
	parameter, which is a an number composed of bitflags, and is
	used to modify the exact behaviour of the expression
	evaluation. In addition, certain efuns may accept additional
	specific options.

	For details, refer to the detailed manpages: hsregexp(C) for
	the Henry Spencer package, pcre(C) for the PCRE package.


	REGULAR EXPRESSION DETAILS
	A regular expression is a pattern that is matched against a
	subject string from left to right. Most characters stand for
	themselves in a pattern, and match the corresponding charac-
	ters in the subject. As a trivial example, the pattern

	The quick brown fox

	matches a portion of a subject string that is identical to
	itself. The power of regular expressions comes from the
	ability to include alternatives and repetitions in the pat-
	tern. These are encoded in the pattern by the use of meta-
	characters, which do not stand for themselves but instead
	are interpreted in some special way.

	The following metacharacters are 'universal' in that both regexp
	packages understand them in the same way:

	. Match any character.

	^ Match begin of line.

	$ Match end of line.

	x\|y Match regexp x or regexp y.

	() Match enclosed regexp like a 'simple' one.

	x* Match any number (0 or more) of regexp x.

	x+ Match any number (1 or more) of regexp x.

	[..] Match one of the characters enclosed.

	[^ ..] Match none of the characters enclosed. The .. are to
	replaced by single characters or character ranges:

	[abc] matches a, b or c.

	[ab0-9] matches a, b or any digit.

	[^a-z] does not match any lowercase character.

	\B not a word boundary

	\c match character c even if it's one of the special
	characters.

	The following metacharacters or metacharacter combinations implement
	similar functions in the two regexp packages;

	\b PCRE: word boundary, also used inconjunction with
	\w (any "word" character) and \W (any "non-word"
	character).

	\< HS: Match begin of word.
	\> HS: Match end of word.


	OPTIONS
	The package is selected with these option flags:

	RE_PCRE
	RE_TRADITIONAL

	These flags are also used for the H_REGEXP_PACKAGE driver
	hook.


	Traditional regular expressions understand one option:

	RE_EXCOMPATIBLE


	PCRE understands these options:

	RE_ANCHORED
	RE_CASELESS
	RE_DOLLAR_ENDONLY
	RE_DOTALL
	RE_EXTENDED
	RE_MULTILINE
	RE_UNGREEDY
	RE_NOTBOL
	RE_NOTEOL
	RE_NOTEMPTY

	HISTORY
	LDMud 3.3.596 implemented the concurrent use of both packages.

	SEE ALSO
	hsregexp(C), pcre(C), regexp_package(H), regexp(E), regexplode(E),
	regmatch(E), regreplace(E), regexp_package(E), invocation(D)