You can use the following rules to build multicharacter regular expressions:
<cfoutput>REReplace("Hello","[T]*","7","ALL") -
#REReplace("Hello","[T]*","7","ALL")#<BR></cfoutput>
results in the following output:
REReplace("Hello","[T]*","7","ALL") - 7H7e7l7l7o
Here the regular expression [T]* can match empty strings. It first matches the empty string before "H" in "Hello". Next, because the "ALL" argument tells REReplace to replace all instances of an expression, it matches the empty string before "e", and so on until it matches the empty string before "o".
This result might be unexpected. The workarounds for these types of problems are specific to each case. In some cases you can use [T]+, which requires at least one "T", instead of [T]*. Alternatively, you might be able to specify an additional pattern after [T]*. In the following example the regular expression has a "W" at the end:
<cfoutput>REReplace("Hello World","[T]*W","7","ALL") -
#REReplace("Hello World","[T]*W","7","ALL")#<BR></cfoutput>
This expression results in the following more predictable output:
REReplace("Hello World","[T]*W","7","ALL") - Hello 7orld
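The [T]+ workaround can be sketched the same way. This is a minimal illustration, not taken from the original examples; the "Total" input string is an assumption added only to show a case where a match does occur:

```cfml
<!--- Sketch: [T]+ requires at least one "T", so it can never match an empty string. --->
<cfoutput>
REReplace("Hello","[T]+","7","ALL") - #REReplace("Hello","[T]+","7","ALL")#<BR>
<!--- "Total" is a hypothetical input; REReplace is case-sensitive, so only the uppercase "T" matches. --->
REReplace("Total","[T]+","7","ALL") - #REReplace("Total","[T]+","7","ALL")#<BR>
</cfoutput>
```

Because "Hello" contains no "T", the first call should return the string unchanged, while the second should replace only the leading "T".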
An excellent reference on regular expressions is Mastering Regular Expressions, by Jeffrey E. F. Friedl (O'Reilly & Associates, Inc., 1997; ISBN 1-56592-257-3; http://www.oreilly.com).
In CFML regular expression functions, large input strings (greater than approximately 20,000 characters) cause a debug assertion failure, and a regular expression error occurs. To avoid this, break your input into smaller chunks, as the following example shows. Here the variable input contains more than 50,000 characters.
<cfset test = mid(input, 1, 20000)>
<cfset out1 = REReplace(test, "[ #Chr(9)##Chr(13)##Chr(10)#]+#Chr(13)##Chr(10)#", "#chr(10)#", "ALL")>
<cfset test = mid(input, 20001, 20000)>
<cfset out2 = REReplace(test, "[ #Chr(9)##Chr(13)##Chr(10)#]+#Chr(13)##Chr(10)#", "#chr(10)#", "ALL")>
<cfset test = mid(input, 40001, len(input) - 40000)>
<cfset out3 = REReplace(test, "[ #Chr(9)##Chr(13)##Chr(10)#]+#Chr(13)##Chr(10)#", "#chr(10)#", "ALL")>
<cfset result = out1 & out2 & out3>
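When the input length is not known in advance, the same chunking idea might be written as a loop. The following is a sketch under stated assumptions rather than a prescribed method: the variable names chunkSize, pattern, start, and chunk are hypothetical, and the 20,000-character chunk size simply mirrors the approximate limit described above.

```cfml
<!--- Sketch only: chunkSize, pattern, start, and chunk are illustrative names. --->
<cfset chunkSize = 20000>
<cfset pattern = "[ #Chr(9)##Chr(13)##Chr(10)#]+#Chr(13)##Chr(10)#">
<cfset result = "">
<!--- Step through the input one chunk at a time, starting at character 1. --->
<cfloop index="start" from="1" to="#len(input)#" step="#chunkSize#">
    <!--- Take up to chunkSize characters; the last chunk may be shorter. --->
    <cfset chunk = mid(input, start, chunkSize)>
    <!--- Apply the same replacement to each chunk and append it to the result. --->
    <cfset result = result & REReplace(chunk, pattern, "#chr(10)#", "ALL")>
</cfloop>
```

As with the fixed three-chunk version, a match that straddles a chunk boundary is not found, because each chunk is searched independently.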
You can anchor all or part of a regular expression to either the beginning or end of the string being searched: