Regular Expressions, or RegExps, are patterns used to match, locate, and extract parts of text. They are essential when working with strings that follow a particular structure — like emails, URLs, or HTML tags.


1. What is a Regular Expression?

A Regular Expression (RegExp) is a symbolic pattern that defines a set of strings. It helps identify specific sequences within text. Typical uses include:

  • Validating input (e.g., email addresses or phone numbers).
  • Extracting structured data from unstructured text.
  • Searching and replacing text efficiently.
  • Splitting or cleaning large text inputs.

2. Capturing Groups

Parentheses ( ) in regex not only group expressions but also capture portions of the match. This means the data matched inside ( ) is stored separately, allowing easy extraction.

Example:

(John) (Doe)

Applied to "John Doe", the results are:

  • match[0]: "John Doe" – full match
  • match[1]: "John" – first captured group
  • match[2]: "Doe" – second captured group

Capturing groups make regex useful for data extraction, not just text matching.
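
The example above can be tried directly in JavaScript. The following is a minimal sketch using exec(), which is covered in the next section; the variable names nameRegex and nameMatch are chosen here for illustration only.

```javascript
// Sketch: apply the (John) (Doe) pattern and read each captured group.
const nameRegex = /(John) (Doe)/;
const nameMatch = nameRegex.exec("John Doe");

console.log(nameMatch[0]);     // "John Doe" (full match)
console.log(nameMatch[1]);     // "John"     (first group)
console.log(nameMatch[2]);     // "Doe"      (second group)
console.log(nameMatch.length); // 3          (full match + two groups)
```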


3. The exec() Method

In JavaScript, exec() is a method of RegExp objects: regex.exec(string) searches the string for a match and returns detailed results as an array.

The returned array includes:

  • match[0]: the full match.
  • match[1], match[2], …: captured groups.
  • Properties such as index (the zero-based position where the match starts) and input (the original string).

If no match is found, exec() returns null.
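
To make the shape of the result concrete, here is a small sketch. The year-month pattern and the sample strings are assumptions made up for this illustration; they are not part of the examples that follow.

```javascript
// Hypothetical pattern: a four-digit year, a hyphen, and a two-digit month.
const dateRegex = /(\d{4})-(\d{2})/;
const text = "Released 2024-05 in beta";

const found = dateRegex.exec(text);
console.log(found[0]);    // "2024-05" (full match)
console.log(found[1]);    // "2024"    (first group)
console.log(found[2]);    // "05"      (second group)
console.log(found.index); // 9         (where the match starts)
console.log(found.input === text); // true

// No date in this string, so exec() returns null.
const missing = dateRegex.exec("no date here");
console.log(missing); // null
```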


4. Extracting Data Example

Suppose you want to extract both the URL and text from an HTML anchor tag:

<a href="http://goalkicker.com">goalkicker</a>

The JavaScript code:

const html = '<a href="http://goalkicker.com">goalkicker</a>';
const regex = /<a href="(https?:\/\/[^"]+)">([^<]+)<\/a>/;

const result = regex.exec(html);

console.log(result[0]); // <a href="http://goalkicker.com">goalkicker</a>
console.log(result[1]); // http://goalkicker.com
console.log(result[2]); // goalkicker

Explanation:

  • (https?:\/\/[^"]+) captures the URL.
  • ([^<]+) captures the visible link text.
  • The parentheses ensure these values are stored separately in the match array.

5. Extracting Multiple Matches

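Without the global flag g, exec() always starts from the beginning of the string, so it returns the first match every time it is called. Adding g makes each call resume where the previous match ended, so multiple matches can be processed sequentially in a loop.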

const html = `
  <a href="http://goalkicker.com">goalkicker</a>
  <a href="http://example.com">Example</a>
`;

const regex = /<a href="(https?:\/\/[^"]+)">([^<]+)<\/a>/g;

let match;
while ((match = regex.exec(html)) !== null) {
  console.log("URL:", match[1]);
  console.log("Text:", match[2]);
}

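Each loop iteration retrieves the next match because a regex with the g flag records where the last match ended in its lastIndex property, and the next exec() call starts searching from that position. When no further match is found, exec() returns null and resets lastIndex to 0.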
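
To see how lastIndex moves between calls, the following sketch records it after every match. The digit pattern and the sample string are assumptions chosen to keep the positions easy to count.

```javascript
// Sketch: watch lastIndex advance as exec() walks through the string.
const digitRegex = /\d+/g;
const sample = "a1 b22 c333";
const steps = [];

let hit;
while ((hit = digitRegex.exec(sample)) !== null) {
  // hit.index is where the match starts; lastIndex is where the next search begins.
  steps.push({ value: hit[0], index: hit.index, lastIndex: digitRegex.lastIndex });
}

console.log(steps);
// [ { value: "1",   index: 1, lastIndex: 2 },
//   { value: "22",  index: 4, lastIndex: 6 },
//   { value: "333", index: 8, lastIndex: 11 } ]

console.log(digitRegex.lastIndex); // 0 (reset after the final, failed search)
```

Because lastIndex lives on the regex object, reusing the same g-flagged regex on a different string right after a successful match would start from that stale position; setting lastIndex back to 0 first avoids this.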


6. Common Mistakes and Fixes

  1. Missing quotes in patterns: the attribute value in the HTML is quoted, so the pattern must include the quotes as well.
     Incorrect: /href=(https?:\/\/[^"]+)/
     Correct:   /href="(https?:\/\/[^"]+)"/

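  2. Omitting the g flag: Without g, only the first match is found. In a while loop like the one in section 5, this also causes an infinite loop, because exec() keeps returning that same first match and never returns null.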

  3. Using greedy quantifiers: .+ matches as much as possible, often too much. Use the lazy version .+? to stop at the nearest match.
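
The difference between greedy and lazy quantifiers from item 3 is easiest to see side by side. This is a small sketch; the <b> tags and the sample string are assumptions chosen for illustration.

```javascript
// Sketch: the same input matched with a greedy and a lazy quantifier.
const tags = "<b>one</b> and <b>two</b>";

// Greedy: .+ runs to the LAST </b> in the string.
const greedy = /<b>(.+)<\/b>/.exec(tags);
console.log(greedy[1]); // "one</b> and <b>two"

// Lazy: .+? stops at the FIRST </b> it can reach.
const lazy = /<b>(.+?)<\/b>/.exec(tags);
console.log(lazy[1]); // "one"
```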


7. When exec() Returns null

If the target text does not contain any substring that matches the given regex, the return value is null.

Example:

/<img src="(.+?)">/.exec("<a href='link'>link</a>")
// Returns null

This happens because there is no <img> tag in the input string.
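
Since reading match[1] from a null result throws a TypeError, it is worth checking the return value before using it. One possible way to do this is sketched below; extractSrc is a hypothetical helper name introduced only for this example.

```javascript
// Sketch: guard against null before reading captured groups.
const imgRegex = /<img src="(.+?)">/;

// Hypothetical helper: returns the src value, or null when there is no <img> tag.
function extractSrc(markup) {
  const imgMatch = imgRegex.exec(markup);
  return imgMatch === null ? null : imgMatch[1];
}

console.log(extractSrc("<a href='link'>link</a>")); // null
console.log(extractSrc('<img src="logo.png">'));    // "logo.png"
```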


8. Practice Tasks

  1. Extract the href and link text from

    <a href="https://site.com">Site</a>
    

    What is stored in match[1]?

  2. Write a regex to capture the src and alt values from

    <img src="logo.png" alt="Company Logo">
    
  3. Correct the following regex:

    const regex = /<a href=(https?:\/\/[^"]+)>([^<]*)<\/a>/;
    
  4. Use exec() in a loop to extract all email addresses from:

    "Contact us at info@company.com or support@help.org"
    
  5. Explain why exec() sometimes returns null.


9. Review: Fill-in-the-Gap Questions

  1. Regular expressions are used to __ patterns within text.
  2. Parentheses ( ) in regex are known as __ groups.
  3. The method used to execute a regex in JavaScript is __.
  4. The full matched text is always found in match[__].
  5. Captured groups begin from index number __.
  6. Adding the g flag allows you to find __ matches.
  7. The regex property that tracks progress between matches is __.
  8. The result returned by exec() is an __ containing match details.
  9. When no match is found, exec() returns __.
  10. Using (.+?) instead of (.+) makes the match __ (non-greedy).
