Webmaster in a Nutshell

15.13 Regular Expressions

Each character matches itself, unless it is one of the special characters + ? . * ^ $ ( ) [ ] { } | \. The special meaning of these characters can be escaped using a \.

.

Matches any single character, but not a newline unless it is a single-line match (see m//s).

(...)

Groups a series of pattern elements to a single element.

^

Matches the beginning of the target. In multiline mode (see m//m) also matches after every newline character.

$

Matches the end of the target, or just before a trailing newline. In multiline mode (see m//m) also matches before every newline character.

[...]

Denotes a class of characters to match. [^...] negates the class.

(... | ... | ...)

Matches one of the alternatives.

(?# text)

Comment.

(?: regexp)

Like (regexp) but does not make back-references.

(?= regexp)

Zero width positive look-ahead assertion.

(?! regexp)

Zero width negative look-ahead assertion.

(? modifier)

Embedded pattern-match modifier. modifier can be one or more of i, m, s, or x.
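The elements above can be tried out with a short script. This is only a sketch: the sample strings and variable names are made up for illustration and are not part of the reference.

```perl
use strict;
use warnings;

# "." does not match a newline unless the s modifier is given.
my $dot_plain  = ("a\nb" =~ /a.b/)  ? 1 : 0;   # 0
my $dot_single = ("a\nb" =~ /a.b/s) ? 1 : 0;   # 1

# "^" matches only at the start of the target unless the m modifier is given.
my $text = "first line\nsecond line\n";
my @starts_plain = ($text =~ /^(\w+)/g);    # ("first")
my @starts_multi = ($text =~ /^(\w+)/mg);   # ("first", "second")

# (?:...) groups the alternatives without capturing, so $1 is the extension.
my ($ext) = ("index.html" =~ /^(?:index|home)\.(html?|php)$/);   # "html"

# (?=...) and (?!...) test what follows without consuming it.
my ($stem)  = ("report.txt" =~ /^(\w+)(?=\.txt$)/);              # "report"
my $not_bak = ("report.bak" =~ /^\w+\.(?!bak$)/) ? 1 : 0;        # 0

# (?i) turns on case-insensitive matching from inside the pattern.
my $ci = ("HELLO" =~ /(?i)hello/) ? 1 : 0;                       # 1

print "dot: $dot_plain $dot_single\n";
print "line starts: @starts_plain | @starts_multi\n";
print "ext=$ext stem=$stem not_bak=$not_bak ci=$ci\n";
```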

Quantified subpatterns match as many times as possible. When followed with a ? they match the minimum number of times. These are the quantifiers:

+

Matches the preceding pattern element one or more times.

?

Matches zero or one times.

*

Matches zero or more times.

{n,m}

Denotes the minimum n and maximum m match count. {n} means exactly n times; {n,} means at least n times.
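The difference between maximal and minimal matching is easiest to see side by side. The following sketch uses invented sample strings; only the quantifier behavior itself comes from the descriptions above.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# .+ is greedy: it runs to the last ">" in the string.
my ($greedy)  = $html =~ /<(.+)>/;     # "b>bold</b> and <i>italic</i"
# .+? is minimal: it stops at the first ">".
my ($minimal) = $html =~ /<(.+?)>/;    # "b"

# ? makes the preceding element optional.
my @spellings = "color colour" =~ /\bcolou?r\b/g;    # ("color", "colour")

# {n} and {n,m} set explicit repeat counts.
my $date_ok  = ("2024-01-15" =~ /^\d{4}-\d{2}-\d{2}$/) ? 1 : 0;   # 1
my ($most)   = "abc12345" =~ /(\d{2,3})/;     # "123"
my ($fewest) = "abc12345" =~ /(\d{2,3}?)/;    # "12"

print "greedy=$greedy\nminimal=$minimal\n";
print "spellings=@spellings date_ok=$date_ok most=$most fewest=$fewest\n";
```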

A \ escapes any special meaning of the following character if non-alphanumeric, but it turns most alphanumeric characters into something special:

\w

Matches an alphanumeric character, including _; \W matches a non-alphanumeric character.

\s

Matches whitespace; \S matches non-whitespace.

\d

Matches a digit; \D matches a non-digit.

\A

Matches the beginning of the string; \Z matches the end.

\b

Matches word boundaries; \B matches non-boundaries.

\G

Matches where the previous m//g search left off.

\n, \r, \f, \t, etc.

Have their usual meaning.

\w, \s, and \d

May be used within character classes. In this context, \b denotes a backspace.
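A few of these escapes in use, as an illustrative sketch. The input strings and variable names are hypothetical.

```perl
use strict;
use warnings;

my $line = "id=42 name=alice_b\n";

# \b, \d and \w: pick out a number and a word (the underscore counts as \w).
my ($id)   = $line =~ /\bid=(\d+)\b/;    # "42"
my ($user) = $line =~ /name=(\w+)/;      # "alice_b"

# \s: split on runs of whitespace.
my @fields = split /\s+/, $line;         # ("id=42", "name=alice_b")

# \A is not affected by the m modifier, unlike ^.
my $caret_m = ("a\n42" =~ /^\d+/m)  ? 1 : 0;   # 1
my $slash_a = ("a\n42" =~ /\A\d+/m) ? 1 : 0;   # 0

# \G anchors each m//g match where the previous one ended,
# so scanning stops at the first item that does not fit.
my $csv = "10,20,x,30";
my @nums;
push @nums, $1 while $csv =~ /\G(\d+),?/g;     # ("10", "20")

# Inside a character class, \d keeps its meaning and \b is a backspace.
my ($hex)  = "0x1F" =~ /0x([\da-fA-F]+)/;      # "1F"
my $has_bs = ("a\bc" =~ /a[\b]c/) ? 1 : 0;     # 1

print "id=$id user=$user fields=", scalar(@fields), "\n";
print "caret_m=$caret_m slash_a=$slash_a nums=@nums hex=$hex bs=$has_bs\n";
```

Without the \G, the same loop would skip over the x and also collect 30.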

Back-references:

\1...\9

Refer to matched subexpressions, grouped with ( ), inside the match.

\10 and up

Can also be used if the pattern matches that many subexpressions.

See also $1...$9, $+, $&, $`, and $' in the section called "Special Variables".
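As a rough illustration, \1 is used inside the pattern itself, while $1 and the other match variables are read after a successful match. The sample strings below are invented.

```perl
use strict;
use warnings;

# \1 inside the pattern: find a word that is immediately repeated.
my $text = "this is is a test";
my ($dup) = $text =~ /\b(\w+)\s+\1\b/;    # "is"

# After a successful match, the special variables describe it.
my ($key, $value, $before, $matched, $after, $last);
if ("<< key=value >>" =~ /(\w+)=(\w+)/) {
    ($key, $value) = ($1, $2);                  # "key", "value"
    ($before, $matched, $after) = ($`, $&, $'); # "<< ", "key=value", " >>"
    $last = $+;                                 # last group matched: "value"
}

print "dup=$dup\n";
print "key=$key value=$value last=$last\n";
print "before=[$before] matched=[$matched] after=[$after]\n";
```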

With modifier x, whitespace can be used in patterns for readability.
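For example, a time-of-day pattern can be laid out over several lines. This sketch assumes a Perl recent enough to provide qr//; the name $time_re and the test strings are made up. Under x, text from # to the end of the line is also ignored, which allows comments inside the pattern.

```perl
use strict;
use warnings;

# Whitespace and "#" comments inside the pattern are ignored under x.
my $time_re = qr/
    ^ (\d{1,2})          # hours
    : (\d{2})            # minutes
    (?: : (\d{2}) )?     # optional seconds
    $
/x;

my ($h1, $m1, $s1) = "9:05"     =~ $time_re;   # "9", "05", undef
my ($h2, $m2, $s2) = "12:30:45" =~ $time_re;   # "12", "30", "45"
my $bad = ("12:3" =~ $time_re) ? 1 : 0;        # 0

print "first: $h1:$m1\n";
print "second: $h2:$m2:$s2\n";
print "bad=$bad\n";
```

To match a literal space or # under x, escape it with a \ or put it in a character class.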

