JavaScript regexp find all matches

Patterns and flags

Regular expressions are patterns that provide a powerful way to search and replace in text.

In JavaScript, they are available via the RegExp object, as well as being integrated in methods of strings.

Regular Expressions

A regular expression (also “regexp”, or just “reg”) consists of a pattern and optional flags.

There are two syntaxes that can be used to create a regular expression object.

regexp = new RegExp("pattern", "flags");

And the “short” one, using slashes "/":

regexp = /pattern/; // no flags
regexp = /pattern/gmi; // with flags g, m and i (to be covered soon)

Slashes /.../ tell JavaScript that we are creating a regular expression. They play the same role as quotes for strings.

In both cases regexp becomes an instance of the built-in RegExp class.

The main difference between these two syntaxes is that a pattern written with slashes /.../ does not allow expressions to be inserted (like string template literals with ${...}). It is fully static.

Slashes are used when we know the regular expression at the time of writing the code, which is the most common situation. new RegExp is more often used when we need to create a regexp “on the fly” from a dynamically generated string. For instance:

let tag = prompt("What tag do you want to find?", "h2");
let regexp = new RegExp(`<${tag}>`); // same as /<h2>/ if answered "h2" in the prompt above
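As a minimal sketch of the same idea outside the browser, the snippet below takes the tag name from a plain variable instead of prompt. The helpers escapeRegExp and makeTagRegExp are hypothetical names introduced here, not built-ins: the first escapes characters that are special in a pattern, which is worth doing whenever the inserted string comes from user input.

```javascript
// Hypothetical helper (not part of the language): escape characters that
// are special in a regexp pattern so the string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a regexp for an opening tag from a dynamic name.
function makeTagRegExp(tag) {
  return new RegExp(`<${escapeRegExp(tag)}>`, "g");
}

const html = "<h2>Title</h2><p>Text</p><h2>Other</h2>";
const tagRegExp = makeTagRegExp("h2");

console.log(tagRegExp.source);      // <h2>
console.log(html.match(tagRegExp)); // [ '<h2>', '<h2>' ]
```

Without the escaping step, a value such as "a.b" would be treated as a pattern in which the dot matches any character.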

Flags

Regular expressions may have flags that affect the search.

This chapter covers six of them (newer engines also support d and v, which are not covered here):

  • i: With this flag the search is case-insensitive: no difference between A and a (see the example below).
  • g: With this flag the search looks for all matches; without it, only the first match is returned.
  • m: Multiline mode (covered in the chapter Multiline mode of anchors ^ $, flag "m").
  • s: Enables “dotall” mode, which allows a dot . to match the newline character \n (covered in the chapter Character classes).
  • u: Enables full Unicode support. The flag enables correct processing of surrogate pairs. More about that in the chapter Unicode: flag "u" and class \p.
  • y: “Sticky” mode: searching at the exact position in the text (covered in the chapter Sticky flag "y", searching at position).
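The effect of each flag can be checked with a few one-liners. This is only an illustrative sketch; the sample string and the variable names are assumptions made for the example.

```javascript
const text = "Hello\nhello";

// i: case-insensitive; g: all matches instead of only the first.
console.log(text.match(/hello/g));  // [ 'hello' ]
console.log(text.match(/hello/gi)); // [ 'Hello', 'hello' ]

// m: ^ and $ match at every line start/end, not only at the string edges.
console.log(text.match(/^hello$/gim).length); // 2

// s: the dot also matches the newline character.
console.log(/Hello.hello/.test(text));  // false
console.log(/Hello.hello/s.test(text)); // true

// u: a surrogate pair is treated as one character.
const smiley = "\u{1F604}";
console.log(smiley.match(/./g).length);  // 2 (two surrogate halves)
console.log(smiley.match(/./gu).length); // 1

// y: match only at exactly lastIndex, without scanning ahead.
const sticky = /hello/y;
sticky.lastIndex = 6; // "hello" starts at index 6 in text
console.log(sticky.test(text)); // true
```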


Searching: str.match

As mentioned previously, regular expressions are integrated with string methods.

The method str.match(regexp) finds matches of regexp in the string str.

It has three working modes:

    If the regular expression has flag g, it returns an array of all matches:

let str = "We will, we will rock you";
alert( str.match(/we/gi) ); // We,we (an array of 2 substrings that match)

    If there is no such flag, it returns only the first match, as an array with the full match at index 0 and additional details in properties:

let str = "We will, we will rock you";
let result = str.match(/we/i); // without flag g
alert( result[0] ); // We (1st match)
alert( result.length ); // 1
// Details:
alert( result.index ); // 0 (position of the match)
alert( result.input ); // We will, we will rock you (source string)

    If there are no matches, null is returned, whether or not flag g is set. Because the result is null rather than an empty array, reading a property of it throws:

let matches = "JavaScript".match(/HTML/); // = null
if (!matches.length) { // Error: Cannot read property 'length' of null
  alert("Error in the line above");
}

To always get an array, fall back to an empty one:

let matches = "JavaScript".match(/HTML/) || [];
if (!matches.length) {
  alert("No matches"); // now it works
}

Replacing: str.replace

The method str.replace(regexp, replacement) replaces matches found using regexp in string str with replacement (all matches if there’s flag g, otherwise only the first one).

// no flag g
alert( "We will, we will".replace(/we/i, "I") ); // I will, we will
// with flag g
alert( "We will, we will".replace(/we/ig, "I") ); // I will, I will

The second argument is the replacement string. We can use special character combinations in it to insert fragments of the match:

Symbols    Action in the replacement string
$&         inserts the whole match
$`         inserts the part of the string before the match
$'         inserts the part of the string after the match
$n         if n is a 1-2 digit number, inserts the contents of the n-th parentheses (more about it in the chapter Capturing groups)
$<name>    inserts the contents of the parentheses with the given name (more about it in the chapter Capturing groups)
$$         inserts the character $

For instance:

alert( "I love HTML".replace(/HTML/, "$& and JavaScript") ); // I love HTML and JavaScript
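The other combinations from the table can be tried the same way. The following is a small sketch with made-up sample strings; it uses console.log instead of alert so it also runs outside the browser.

```javascript
const fullName = "John Smith";

// $1 and $2 refer to the first and second parentheses.
const byNumber = fullName.replace(/(\w+) (\w+)/, "$2, $1");
console.log(byNumber); // Smith, John

// $<first> and $<last> refer to named groups.
const byName = fullName.replace(/(?<first>\w+) (?<last>\w+)/, "$<last>, $<first>");
console.log(byName); // Smith, John

// $` is the text before the match, $' is the text after it.
const around = "a-b".replace(/-/, "[$`|$']");
console.log(around); // a[a|b]b

// $$ inserts a literal dollar sign, $& the whole match.
const price = "Price: 10".replace(/\d+/, "$$$&");
console.log(price); // Price: $10
```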

Testing: regexp.test

The method regexp.test(str) looks for at least one match: if one is found, it returns true, otherwise false.

let str = "I love JavaScript";
let regexp = /LOVE/i;
alert( regexp.test(str) ); // true
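One detail worth knowing when combining test with the g flag: a global regexp remembers where the last match ended in its lastIndex property, so repeated calls on the same string can give different answers. A minimal sketch (variable names are illustrative only):

```javascript
const sentence = "I LOVE JavaScript";

// Without g, every call starts from the beginning of the string.
const plainRe = /love/i;
const plainResults = [plainRe.test(sentence), plainRe.test(sentence)];
console.log(plainResults); // [ true, true ]

// With g, the second call resumes after the first match and finds nothing.
const globalRe = /love/gi;
const first = globalRe.test(sentence);      // true
const indexAfterFirst = globalRe.lastIndex; // 6, just past "LOVE"
const second = globalRe.test(sentence);     // false
console.log(first, indexAfterFirst, second, globalRe.lastIndex); // true 6 false 0
```

For a plain yes/no check, a regexp without the g flag avoids this state.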

Later in this chapter we’ll study more regular expressions, walk through more examples, and also meet other methods.

Full information about the methods is given in the article Methods of RegExp and String.

Summary

  • A regular expression consists of a pattern and optional flags: g, i, m, u, s, y.
  • Without flags and special symbols (that we’ll study later), the search by a regexp is the same as a substring search.
  • The method str.match(regexp) looks for matches: all of them if there’s the g flag, otherwise only the first one.
  • The method str.replace(regexp, replacement) replaces matches found using regexp with replacement: all of them if there’s the g flag, otherwise only the first one.
  • The method regexp.test(str) returns true if there’s at least one match, otherwise it returns false.


String.prototype.matchAll()

The matchAll() method returns an iterator of all results matching a string against a regular expression, including capturing groups.


Syntax

matchAll(regexp)

Parameters

regexp: A regular expression object, or any object that has a Symbol.matchAll method.

If regexp is not a RegExp object and does not have a Symbol.matchAll method, it is implicitly converted to a RegExp by using new RegExp(regexp, 'g').

If regexp is a regex, then it must have the global ( g ) flag set, or a TypeError is thrown.
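A short sketch of what these rules mean in practice, using made-up inputs: a string argument is treated as a pattern rather than as literal text, and a regex without the g flag is rejected.

```javascript
// A string argument is converted with new RegExp(string, "g"),
// so "." is a pattern meaning "any character", not a literal dot.
const fromString = [..."a.b".matchAll(".")].map((m) => m[0]);
console.log(fromString); // [ 'a', '.', 'b' ]

// Escaping the dot in the string matches it literally.
const literalDot = [..."a.b".matchAll("\\.")].map((m) => m[0]);
console.log(literalDot); // [ '.' ]

// A regex without the g flag throws a TypeError.
let errorName = null;
try {
  "a.b".matchAll(/\./);
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // TypeError
```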

Return value

An iterable iterator object (which is not restartable) of matches. Each match is an array with the same shape as the return value of RegExp.prototype.exec() .

Exceptions

TypeError: Thrown if regexp is a regex that does not have the global (g) flag set (its flags property does not contain "g").

Description

The implementation of String.prototype.matchAll itself is very simple — it simply calls the Symbol.matchAll method of the argument with the string as the first parameter (apart from the extra input validation that the regex is global). The actual implementation comes from RegExp.prototype[@@matchAll]() .

Examples

RegExp.prototype.exec() and matchAll()

Without matchAll() , it’s possible to use calls to regexp.exec() (and regexes with the g flag) in a loop to obtain all the matches:

const regexp = /foo[a-z]*/g;
const str = "table football, foosball";
let match;

while ((match = regexp.exec(str)) !== null) {
  console.log(
    `Found ${match[0]} start=${match.index} end=${regexp.lastIndex}.`,
  );
}
// Found football start=6 end=14.
// Found foosball start=16 end=24.

With matchAll() available, you can avoid the while loop and exec with g. Instead, you get an iterator to use with the more convenient for...of, array spreading, or Array.from() constructs:

const regexp = /foo[a-z]*/g;
const str = "table football, foosball";
const matches = str.matchAll(regexp);

for (const match of matches) {
  console.log(
    `Found ${match[0]} start=${match.index} end=${
      match.index + match[0].length
    }.`,
  );
}
// Found football start=6 end=14.
// Found foosball start=16 end=24.

// matches iterator is exhausted after the for...of iteration
// Call matchAll again to create a new iterator
Array.from(str.matchAll(regexp), (m) => m[0]);
// [ "football", "foosball" ]

matchAll will throw an exception if the g flag is missing.

const regexp = /[a-c]/;
const str = "abc";
str.matchAll(regexp);
// TypeError

matchAll internally makes a clone of the regexp — so, unlike regexp.exec() , lastIndex does not change as the string is scanned.

const regexp = /[a-c]/g;
regexp.lastIndex = 1;
const str = "abc";
Array.from(str.matchAll(regexp), (m) => `${regexp.lastIndex} ${m[0]}`);
// [ "1 b", "1 c" ]

However, this means that unlike using regexp.exec() in a loop, you can’t mutate lastIndex to make the regex advance or rewind.
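The difference can be seen side by side in the following sketch, which uses an invented input string. In the exec() loop, assigning to lastIndex skips a word; during matchAll() iteration the same assignment is ignored because the iterator works on its own clone.

```javascript
const wordRe = /\w+/g;
const input = "one two three four";

// exec() reads and writes wordRe.lastIndex, so the loop can steer it.
const viaExec = [];
let m;
while ((m = wordRe.exec(input)) !== null) {
  viaExec.push(m[0]);
  if (m[0] === "one") {
    wordRe.lastIndex = input.indexOf("three"); // jump past "two"
  }
}
console.log(viaExec); // [ 'one', 'three', 'four' ]

// matchAll() iterates over a clone, so changing lastIndex on the
// original regexp during iteration has no effect on the results.
const viaMatchAll = [];
for (const match of input.matchAll(wordRe)) {
  viaMatchAll.push(match[0]);
  wordRe.lastIndex = input.length; // ignored by the running iterator
}
console.log(viaMatchAll); // [ 'one', 'two', 'three', 'four' ]
```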

Better access to capturing groups (than String.prototype.match())

Another compelling reason for matchAll is the improved access to capture groups.

Capture groups are ignored when using match() with the global g flag:

const regexp = /t(e)(st(\d?))/g;
const str = "test1test2";
str.match(regexp); // ['test1', 'test2']

Using matchAll , you can access capture groups easily:

const array = [...str.matchAll(regexp)];

array[0];
// ['test1', 'e', 'st1', '1', index: 0, input: 'test1test2', length: 4]
array[1];
// ['test2', 'e', 'st2', '2', index: 5, input: 'test1test2', length: 4]
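Named groups are available in the same way through each match's groups property. As a sketch, with a pattern and input invented for illustration, every match can be turned into a plain object that carries its text, position, and group values:

```javascript
const dateRe = /(?<year>\d{4})-(?<month>\d{2})/g;
const log = "released 2019-06, patched 2020-11";

// Array.from accepts the iterator plus a mapping function,
// so each match is converted as it is read.
const found = Array.from(log.matchAll(dateRe), (match) => ({
  text: match[0],
  index: match.index,
  year: match.groups.year,
  month: match.groups.month,
}));

console.log(found);
// [
//   { text: '2019-06', index: 9, year: '2019', month: '06' },
//   { text: '2020-11', index: 26, year: '2020', month: '11' }
// ]
```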

Using matchAll() with a non-RegExp implementing @@matchAll

If an object has a Symbol.matchAll method, it can be used as a custom matcher. The return value of Symbol.matchAll becomes the return value of matchAll() .

const str = "Hmm, this is interesting.";

str.matchAll({
  [Symbol.matchAll](str) {
    return [["Yes, it's interesting."]];
  },
}); // returns [["Yes, it's interesting."]]


This page was last modified on Apr 5, 2023 by MDN contributors.

Portions of this content are ©1998– 2023 by individual mozilla.org contributors. Content available under a Creative Commons license.
