Regex
Regular Expressions.
Added in 0.4.3
No other changes yet.
Values
Functions for working with regular expressions.
Regex.make
Added in 0.4.3
No other changes yet.
Compiles the given pattern string into a regular expression object.
For a general overview of regular expressions, refer to “Mastering Regular Expressions” by Friedl, or other online resources.
Regular expressions are a combination of normal and special characters. A normal character in a pattern will match a one-character string containing that character. Moreover, if there are two regular expressions `A` and `B`, they can be concatenated into a regular expression `AB`. If a string `p` matches `A` and `q` matches `B`, then `pq` will match `AB`.
The special character sequences are as follows:
- `.` - Matches any character, except for a newline in multi-line mode
- `^` - Matches the beginning of the input, or after a newline (`\n`) in multi-line mode
- `$` - Matches the end of the input, or right before a newline (`\n`) in multi-line mode
- `«re»*` - Matches `«re»` zero or more times
- `«re»+` - Matches `«re»` one or more times
- `«re»?` - Matches `«re»` zero or one times
- `«re»{«n»}` - Matches `«re»` exactly `«n»` times
- `«re»{«n»,}` - Matches `«re»` `«n»` or more times
- `«re»{,«m»}` - Matches `«re»` zero to `«m»` times
- `«re»{«n»,«m»}` - Matches `«re»` between `«n»` and `«m»` times
- `«re»{}` - Matches `«re»` zero or more times
- `[«rng»]` - Matches any character in `«rng»` (see below)
- `[^«rng»]` - Matches any character not in `«rng»` (see below)
- `\«n»` - Matches the latest match for group `«n»` (one-indexed)
- `\b` - Matches the boundary of `\w*` (`\w` defined below, under “basic classes”)
- `\B` - Matches where `\b` does not
- `\p{«property»}` - Matches any character with Unicode property `«property»` (see below)
- `\P{«property»}` - Matches any character without Unicode property `«property»` (see below)
- `(«re»)` - Matches `«re»`, storing the result in a group
- `(?:«re»)` - Matches `«re»` without storing the result in a group
- `(?«mode»:«re»)` - Matches `«re»` with the mode settings specified by `«mode»` using the following syntax:
  - `«mode»i` - The same as `«mode»`, but with case-insensitivity enabled (temporarily not supported until grain-lang/grain#661 is resolved)
  - `«mode»-i` - The same as `«mode»`, but with case-insensitivity disabled (the default)
  - `«mode»m` / `«mode»-s` - The same as `«mode»`, but with multi-line mode enabled
  - `«mode»-m` / `«mode»s` - The same as `«mode»`, but with multi-line mode disabled
  - An empty string, which will not change any mode settings
- `(?«tst»«re1»|«re2»)` - Will match `«re1»` if `«tst»`, otherwise will match `«re2»`. The following options are available for `«tst»`:
  - `(«n»)` - Will be true if group `«n»` has a match
  - `(?=«re»)` - Will be true if `«re»` matches the next sequence
  - `(?!«re»)` - Will be true if `«re»` does not match the next sequence
  - `(?<=«re»)` - Will be true if `«re»` matches the preceding sequence
  - `(?<!«re»)` - Will be true if `«re»` does not match the preceding sequence
- `(?«tst»«re»)` - Equivalent to `(?«tst»«re»|)`
- Finally, basic classes (defined below) can also appear outside of character ranges.
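
As a rough illustration of a few of the sequences above, the following sketch combines anchors, a counted repetition, and a backreference. It assumes Grain 0.5-style `import` syntax and uses `Result.unwrap` from the `result` module to extract the compiled expression; the patterns and input strings are made up for illustration.

```grain
import Regex from "regex"
import Result from "result"

// `{2}` repeats the preceding group exactly twice; `^` and `$` anchor the input
let twice = Result.unwrap(Regex.make("^(ab){2}$"))
assert Regex.isMatch(twice, "abab")
assert !Regex.isMatch(twice, "ababab")

// `\1` (written "\\1" inside a string literal) must repeat what group 1 matched
let doubled = Result.unwrap(Regex.make("^([a-z]+) \\1$"))
assert Regex.isMatch(doubled, "hey hey")
assert !Regex.isMatch(doubled, "hey you")
```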
Character ranges (referred to as `«rng»` above) have the following syntax:
- `«c»` - Matches the character `«c»` exactly
- `«c1»-«c2»` - Matches any character with a character code between the character code for `«c1»` and the code for `«c2»`

These forms can be repeated any number of times, which will construct a range of their union. That is, `[ba-c]` and `[a-c]` are equivalent ranges.
Additionally, there are the following special cases:
- A `]` as the first character of the range will match a `]`
- A `-` as the first or last character of the range will match a `-`
- A `^` in any position other than the first position will match a `^`
- `\«c»`, where `«c»` is a non-alphabetic character, will match `«c»`
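
The range rules above can be sketched as follows. This is a minimal example assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module; the patterns and inputs are illustrative only.

```grain
import Regex from "regex"
import Result from "result"

// `[ba-c]` is the union of `b` and `a-c`, so it is equivalent to `[a-c]`
let abc = Result.unwrap(Regex.make("^[ba-c]+$"))
assert Regex.isMatch(abc, "abcab")
assert !Regex.isMatch(abc, "abd")

// A `-` in the first position of a range is a literal dash
let dashed = Result.unwrap(Regex.make("^[-0-9]+$"))
assert Regex.isMatch(dashed, "555-0100")
```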
Furthermore, ranges can include character classes, which are predefined commonly-used sets of characters. There are two “flavors” of these: basic classes and POSIX classes. Both are provided for ease of use and to maximize compatibility with other regular expression engines, so feel free to use whichever is most convenient.
The basic classes are as follows:
- `\d` - Matches `0-9`
- `\D` - Matches characters not in `\d`
- `\w` - Matches `a-z`, `A-Z`, `0-9`, and `_`
- `\W` - Matches characters not in `\w`
- `\s` - Matches space, tab, formfeed, and return
- `\S` - Matches characters not in `\s`
The POSIX classes are as follows:
- `[:alpha:]` - Matches `a-z` and `A-Z`
- `[:upper:]` - Matches `A-Z`
- `[:lower:]` - Matches `a-z`
- `[:digit:]` - Matches `0-9`
- `[:xdigit:]` - Matches `0-9`, `a-f`, and `A-F`
- `[:alnum:]` - Matches `a-z`, `A-Z`, and `0-9`
- `[:word:]` - Matches `a-z`, `A-Z`, `0-9`, and `_`
- `[:blank:]` - Matches space and tab
- `[:space:]` - Matches space, tab, newline, formfeed, and return
- `[:cntrl:]` - Contains all characters with code points < 32
- `[:ascii:]` - Contains all ASCII characters
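
A short sketch of both class flavors, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module. It also assumes that a POSIX class is written inside a range as `[[:alpha:]]`, which is the usual convention but is not spelled out above.

```grain
import Regex from "regex"
import Result from "result"

// Basic class used outside of a range ("\\d" is `\d` in a string literal)
let digits = Result.unwrap(Regex.make("^\\d+$"))
assert Regex.isMatch(digits, "2024")
assert !Regex.isMatch(digits, "20x4")

// POSIX class used inside a range (assumed `[[:alpha:]]` form)
let letters = Result.unwrap(Regex.make("^[[:alpha:]]+$"))
assert Regex.isMatch(letters, "Grain")
assert !Regex.isMatch(letters, "Grain1")
```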
Parameters:
param | type | description |
---|---|---|
regexString | String | The regular expression to compile |
Returns:
type | description |
---|---|
Result<RegularExpression, String> | The compiled regular expression |
Examples:
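
A minimal sketch of compiling a pattern and handling both branches of the returned `Result`, assuming Grain 0.5-style `import` syntax. The pattern and the test string are illustrative.

```grain
import Regex from "regex"

match (Regex.make("(foo|bar)[0-9]+")) {
  // The pattern compiled; `rx` can be passed to the functions below
  Ok(rx) => assert Regex.isMatch(rx, "bar42"),
  // The pattern was rejected; `msg` describes the problem
  Err(msg) => fail msg,
}
```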
Regex.MatchResult
This object contains the results of a regular expression match. The results can be obtained using the following accessors:
Returns the contents of the given group. Note that group 0 contains the entire matched substring, and group 1 contains the first parenthesized group.
Returns the position of the given group.
The number of defined groups in this match object (including group 0).
Returns the contents of all groups matched in this match object.
Returns the positions of all groups matched in this match object.
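
The accessor names are not shown in this section. The sketch below assumes the match object is a record whose fields are named `group` and `numGroups`, matching the first and third descriptions above; treat those names as assumptions. It also assumes Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module.

```grain
import Regex from "regex"
import Result from "result"

let rx = Result.unwrap(Regex.make("([a-z]+)=([0-9]+)"))
match (Regex.find(rx, "key=42")) {
  Some(m) => {
    // Group 0 is the entire matched substring (assumed field name: `group`)
    assert m.group(0) == Some("key=42")
    // Groups 1 and 2 are the parenthesized groups, in order
    assert m.group(1) == Some("key")
    assert m.group(2) == Some("42")
    // The count includes group 0 (assumed field name: `numGroups`)
    assert m.numGroups == 3
  },
  None => fail "expected a match",
}
```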
Regex.isMatch
Added in 0.4.3
No other changes yet.
Determines if the given regular expression has a match in the given string.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search within |
Returns:
type | description |
---|---|
Bool | true if the RegExp matches the string or false otherwise |
Examples:
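
A small sketch, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module. The pattern and inputs are illustrative.

```grain
import Regex from "regex"
import Result from "result"

let rx = Result.unwrap(Regex.make("ca+[at]"))
// "c", one or more "a", then "a" or "t"
assert Regex.isMatch(rx, "caaat")
// No "c" anywhere in the input, so there is no match
assert !Regex.isMatch(rx, "dog")
```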
Regex.isMatchRange
Added in 0.4.3
No other changes yet.
Determines if the given regular expression has a match in the given string between the given start/end offsets.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search |
start | Number | The start offset to search between |
end | Number | The end offset to search between |
Returns:
type | description |
---|---|
Bool | true if the RegExp matches the string in the given range, otherwise false |
Examples:
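
A small sketch showing how the start offset affects the result, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module. The offsets are chosen for illustration.

```grain
import Regex from "regex"
import Result from "result"

let rx = Result.unwrap(Regex.make("ca+[at]"))
// The whole string is searched, so the leading "c" is available
assert Regex.isMatchRange(rx, "caaat", 0, 5)
// Starting at offset 1 skips the "c", so the pattern cannot match
assert !Regex.isMatchRange(rx, "caaat", 1, 5)
```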
Regex.find
Added in 0.4.3
No other changes yet.
Returns the first match for the given regular expression contained within the given string.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search |
Returns:
type | description |
---|---|
Option<MatchResult> | The match result, if any |
Examples:
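
A minimal sketch that only checks whether a match was found, assuming Grain 0.5-style `import` syntax, `Result.unwrap` from the `result` module, and `Option.isSome`/`Option.isNone` from the `option` module.

```grain
import Regex from "regex"
import Result from "result"
import Option from "option"

let rx = Result.unwrap(Regex.make("ca+[at]"))
// A match exists, so the result is `Some(...)`
assert Option.isSome(Regex.find(rx, "caaat"))
// No match, so the result is `None`
assert Option.isNone(Regex.find(rx, "dog"))
```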
Regex.findRange
Added in 0.4.3
No other changes yet.
Returns the first match for the given regular expression contained within the given string between the given start/end range.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search |
start | Number | The start offset to search between |
end | Number | The end offset to search between |
Returns:
type | description |
---|---|
Option<MatchResult> | The match result, if any |
Examples:
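
A minimal sketch, assuming Grain 0.5-style `import` syntax, `Result.unwrap` from the `result` module, and `Option.isSome`/`Option.isNone` from the `option` module. The offsets are illustrative.

```grain
import Regex from "regex"
import Result from "result"
import Option from "option"

let rx = Result.unwrap(Regex.make("ca+[at]"))
// Searching the full range finds the match
assert Option.isSome(Regex.findRange(rx, "caaat", 0, 5))
// Starting past the "c" leaves nothing for the pattern to match
assert Option.isNone(Regex.findRange(rx, "caaat", 1, 5))
```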
Regex.findAll
Returns all matches for the given regular expression contained within the given string.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search |
Returns:
type | description |
---|---|
List<MatchResult> | The list of matches |
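
As an illustration, the sketch below counts the matches returned for a simple pattern. It assumes Grain 0.5-style `import` syntax, `Result.unwrap` from the `result` module, and `List.length` from the `list` module.

```grain
import Regex from "regex"
import Result from "result"
import List from "list"

let rx = Result.unwrap(Regex.make("[0-9]+"))
// Three separate runs of digits: "1", "22", and "333"
let matches = Regex.findAll(rx, "a1 b22 c333")
assert List.length(matches) == 3
```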
Regex.findAllRange
Added in 0.4.3
No other changes yet.
Returns all matches for the given regular expression contained within the given string between the given start/end range.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
string | String | The string to search |
start | Number | The start offset to search between |
end | Number | The end offset to search between |
Returns:
type | description |
---|---|
List<MatchResult> | The list of matches |
Examples:
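
A minimal sketch that restricts the search to the first part of the input, assuming Grain 0.5-style `import` syntax, `Result.unwrap` from the `result` module, and `List.length` from the `list` module. The offsets are illustrative.

```grain
import Regex from "regex"
import Result from "result"
import List from "list"

let rx = Result.unwrap(Regex.make("[0-9]+"))
// Only "a1 b22" is searched, so "333" is outside the range
let matches = Regex.findAllRange(rx, "a1 b22 c333", 0, 6)
assert List.length(matches) == 2
```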
Regex.replace
Added in 0.4.3
No other changes yet.
Replaces the first match for the given regular expression contained within the given string with the specified replacement. Replacement strings support the following syntax:
- `$&` - Replaced with the text of the matching portion of input (e.g. for `(foo)`, the search string `foo bar`, and the replacement `baz $&`, the result will be `baz foo bar`)
- `$n` / `$nn` (where `n` is a digit) - Replaced with the text of group `nn`
- `$$` - Replaced with a literal `$`
- `$.` - Does nothing (this exists to support replacement strings such as `$4$.0`, which will place the contents of group 4 prior to a zero)
- `` $` `` - Replaced with the text preceding the matched substring
- `$'` - Replaced with the text following the matched substring
- Any other character will be placed as-is in the replaced output.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
toSearch | String | The string to search |
replacement | String | The string that replaces matches |
Returns:
type | description |
---|---|
String | The given string with the appropriate replacements, if any |
Examples:
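
A small sketch, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module. The second assertion reproduces the `$&` case described in the syntax list above.

```grain
import Regex from "regex"
import Result from "result"

// Only the first "o" is replaced
let o = Result.unwrap(Regex.make("o"))
assert Regex.replace(o, "foo", "a") == "fao"

// `$&` inserts the matched text into the replacement
let foo = Result.unwrap(Regex.make("(foo)"))
assert Regex.replace(foo, "foo bar", "baz $&") == "baz foo bar"
```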
Regex.replaceAll
Added in 0.4.3
No other changes yet.
Replaces all matches for the given regular expression contained within the given string with the specified replacement.
See `replace` for replacement string syntax.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to search for |
toSearch | String | The string to search |
replacement | String | The string that replaces matches |
Returns:
type | description |
---|---|
String | The input string with the appropriate replacements, if any |
Examples:
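
A small sketch, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module. The input is illustrative.

```grain
import Regex from "regex"
import Result from "result"

let o = Result.unwrap(Regex.make("o"))
// Every "o" is replaced, not just the first
assert Regex.replaceAll(o, "skoot", "r") == "skrrt"
```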
Regex.split
Added in 0.5.5
No other changes yet.
Splits the given string at the first match for the given regular expression.
If the regex pattern contains capture groups, the content of the groups will be included in the output list.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to match |
str | String | The string to split |
Returns:
type | description |
---|---|
List<String> | A list of the split segments |
Examples:
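
A small sketch using a pattern without capture groups, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module.

```grain
import Regex from "regex"
import Result from "result"

let comma = Result.unwrap(Regex.make(","))
// Only the first comma is used as a split point
assert Regex.split(comma, "a,b,c") == ["a", "b,c"]
```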
Regex.splitAll
Added in 0.5.5
No other changes yet.
Splits the given string at every match for the given regular expression.
If the regex pattern contains capture groups, the content of the groups will be included in the output list.
Parameters:
param | type | description |
---|---|---|
rx | RegularExpression | The regular expression to match |
str | String | The string to split |
Returns:
type | description |
---|---|
List<String> | A list of the split segments |
Examples:
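
A small sketch using a pattern without capture groups, assuming Grain 0.5-style `import` syntax and `Result.unwrap` from the `result` module.

```grain
import Regex from "regex"
import Result from "result"

let comma = Result.unwrap(Regex.make(","))
// Every comma is used as a split point
assert Regex.splitAll(comma, "a,b,c") == ["a", "b", "c"]
```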