
Have you ever needed to pull just the important bits out of a wall of text or code? Imagine you’re scanning through lines of HTML like this:

<a href="http://goalkicker.com">goalkicker</a>

Now, you don’t want the whole tag — just the useful parts:

  • The link address: http://goalkicker.com
  • The visible text: goalkicker

That’s where Regular Expressions (RegExp) and the exec() method become your best tools. They let you write patterns that hunt down specific text formats, and then capture the exact pieces you care about.

Let’s explore how it works.


🧩 1. What is a Regular Expression?

A Regular Expression (RegExp) is a pattern that describes a set of possible strings. You use it to search, extract, or validate text according to certain rules.

Think of it as writing a small “search formula” — one that says:

“Find text that looks like this.”

You can use RegExp for tasks like:

  • Checking if an email address is valid.
  • Extracting URLs or phone numbers.
  • Splitting logs into data fields.
  • Cleaning text input before saving it.
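
Here is a minimal sketch of two of those tasks, validating and splitting. The date pattern and the sample log line are made up for illustration, and the pattern only checks the shape of the text (digits and dashes), not whether the date really exists.

```javascript
// Hypothetical sample pattern: four digits, dash, two digits, dash, two digits.
// It checks the shape only, so "2024-99-99" would also pass.
const datePattern = /^\d{4}-\d{2}-\d{2}$/;

// Validate: does the whole string have the expected shape?
const looksLikeDate = datePattern.test("2024-05-17"); // true
const notADate = datePattern.test("17 May 2024");     // false

// Split: break a (made-up) log line into fields on runs of whitespace.
const fields = "ERROR 2024-05-17 disk full".split(/\s+/);

console.log(looksLikeDate, notADate, fields);
```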

🪄 2. Capturing Groups — Your Data Nets 🎣

Parentheses ( ) in regex aren’t just for grouping; they capture parts of what you matched. That means:

Whatever falls inside those parentheses gets saved separately.

It’s like casting a net in a river — you might be searching for fish, but you only keep the ones that fall inside your net.

Example:

(John) (Doe)

If you apply that to the string "John Doe", you get:

  • match[0]: "John Doe" (the entire match)
  • match[1]: "John"
  • match[2]: "Doe"

So parentheses turn your regex into a data extractor, not just a matcher.
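
The same example as a runnable sketch. It uses exec(), which section 3 introduces; the variable names are just for illustration.

```javascript
const pattern = /(John) (Doe)/;
const match = pattern.exec("John Doe");

console.log(match[0]); // "John Doe" (the entire match)
console.log(match[1]); // "John"     (first pair of parentheses)
console.log(match[2]); // "Doe"      (second pair of parentheses)
```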


🧠 3. Introducing exec() — The Smart Match Finder

JavaScript gives you a special method to work with regex:

regex.exec(string)

It’s like a powerful detective. When you call exec(), it searches through your text once, and returns a detailed match report.

That report (an array) contains:

  1. match[0]: the full match
  2. match[1], match[2], ...: the captured groups (from parentheses)
  3. extra properties: index (where the match starts) and input (the string that was searched)

If no match is found, it simply returns null.
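
To see that report in practice, here is a small sketch that prints the extra details. The sample sentence is invented for illustration.

```javascript
const text = "Call John Doe today";
const report = /(John) (Doe)/.exec(text);

console.log(Array.isArray(report)); // true: the report is a real array
console.log(report.length);         // 3: full match plus two groups
console.log(report.index);          // 5: the match starts at character 5
console.log(report.input);          // "Call John Doe today"
```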


🧪 4. Example — Extracting an Anchor Tag

Let’s look at a real-world scenario:

You have this HTML code:

<a href="http://goalkicker.com">goalkicker</a>

You want:

  • The URL inside the href attribute
  • The text between the <a> and </a> tags

Here’s how you do it:

const html = '<a href="http://goalkicker.com">goalkicker</a>';
const regex = /<a href="(https?:\/\/[^"]+)">([^<]+)<\/a>/;

const m = regex.exec(html);

console.log(m[0]); // Full match: <a href="http://goalkicker.com">goalkicker</a>
console.log(m[1]); // First capture: http://goalkicker.com
console.log(m[2]); // Second capture: goalkicker

🧩 Explanation:

  • https? → matches http or https
  • :\/\/ → matches the literal ://
  • [^"]+ → matches one or more characters that are not a quote (")
  • ([^<]+) → captures the visible text until it sees < (the start of </a>)

The parentheses () around both patterns make them capturing groups, so exec() neatly returns both the URL and the text as separate results.
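
If you need this in more than one place, you could wrap it in a small function. The sketch below is one possible way to do it; extractLink is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: returns { url, text } for the first link, or null.
function extractLink(html) {
  const m = /<a href="(https?:\/\/[^"]+)">([^<]+)<\/a>/.exec(html);
  if (m === null) return null; // no match, nothing to extract
  return { url: m[1], text: m[2] };
}

const link = extractLink('<a href="http://goalkicker.com">goalkicker</a>');
console.log(link.url);  // "http://goalkicker.com"
console.log(link.text); // "goalkicker"

// The pattern expects href to come right after "<a " and to use double quotes,
// so a tag with another attribute first does not match.
const other = extractLink('<a class="nav" href="http://goalkicker.com">goalkicker</a>');
console.log(other); // null
```

Returning null for "no link" mirrors what exec() itself does, so callers can check the result the same way.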


🔁 5. Finding All Matches with g (Global Flag)

By default, exec() finds just the first match. But what if your HTML has multiple links?

You can add the global flag g to your regex. Then, each time you call exec(), it picks up where it left off — giving you the next match.

const html = `
  <a href="http://goalkicker.com">goalkicker</a>
  <a href="http://example.com">Example</a>
`;

const regex = /<a href="(https?:\/\/[^"]+)">([^<]+)<\/a>/g;

let match;
while ((match = regex.exec(html)) !== null) {
  console.log("URL:", match[1]);
  console.log("Text:", match[2]);
}

This loop keeps calling exec() until there are no more matches (null).

Result:

URL: http://goalkicker.com
Text: goalkicker
URL: http://example.com
Text: Example

🧠 Why does this work?

Because when the regex has the g flag, the regex object stores the position where the next search should start in its lastIndex property. Each new exec() call starts scanning from that position. When no further match is found, exec() returns null and resets lastIndex to 0.
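
You can watch lastIndex move with a tiny sketch. The word "banana" and the pattern /a/g are just a convenient illustration.

```javascript
const finder = /a/g;
const word = "banana";
const steps = [];

let hit;
while ((hit = finder.exec(word)) !== null) {
  // Record where the match was found and where the next search will start.
  steps.push({ foundAt: hit.index, nextStart: finder.lastIndex });
}

console.log(steps);
// [ { foundAt: 1, nextStart: 2 },
//   { foundAt: 3, nextStart: 4 },
//   { foundAt: 5, nextStart: 6 } ]

console.log(finder.lastIndex); // 0: reset after exec() returned null
```

Because lastIndex lives on the regex object, reusing the same global regex on a different string starts from wherever the previous search stopped, unless it has already been reset to 0.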


⚡ 6. Common Mistakes (and Fixes)

  1. Forgetting the quotes in your regex

    /<a href=(https?:\/\/[^"]+)>([^<]*)<\/a>/
    

    🔧 Fix: Always include the quotes:

    /<a href="(https?:\/\/[^"]+)">([^<]*)<\/a>/
    
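  2. Forgetting the g flag

    Without g, exec() always returns the first match, no matter how many exist. In a while loop like the one in section 5, that means the result never becomes null and the loop runs forever.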

  3. Using greedy .+ without limits

    A greedy (.+) stretches to the farthest possible match. When the content could repeat, prefer the lazy quantifier (.+?), which stops at the nearest match, or a negated class such as [^<]+.
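
The greedy versus lazy difference is easiest to see side by side. This sketch uses a made-up snippet with two <b> tags.

```javascript
const snippet = "<b>one</b> and <b>two</b>";

const greedy = /<b>(.+)<\/b>/.exec(snippet);
const lazy = /<b>(.+?)<\/b>/.exec(snippet);
const negated = /<b>([^<]+)<\/b>/.exec(snippet);

console.log(greedy[1]);  // "one</b> and <b>two" (ran to the last </b>)
console.log(lazy[1]);    // "one" (stopped at the first </b>)
console.log(negated[1]); // "one" (cannot cross a < at all)
```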


🧩 7. When exec() Returns null

When no text in your string matches the pattern, exec() returns null. That’s JavaScript’s way of saying: “I found nothing that fits your rule.”

For example:

/<img src="(.+?)">/.exec("<a href='link'>link</a>")
// → null (because there’s no <img> tag)
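
Since null has no [1], reading a group from a failed match throws a TypeError. A simple guard avoids that; the fallback text below is just a placeholder for illustration.

```javascript
const result = /<img src="(.+?)">/.exec("<a href='link'>link</a>");

// result is null here, so result[1] would throw a TypeError.
// Check first, then read the group.
const src = result !== null ? result[1] : "(no image found)";

console.log(src); // "(no image found)"
```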

🧠 8. Summary Table

Concept               | Explanation
RegExp.exec()         | Searches the string for a match and returns detailed info
Capturing Groups ( )  | Parts of the pattern saved separately
match[0]              | Full matched text
match[1], match[2]    | Values captured by parentheses
g flag                | Enables multiple searches using a loop
null                  | Returned when no match is found

🧪 Practice Tasks

Let’s solidify what you’ve learned. Try solving these:

  1. Extract the href and link text from:

    <a href="https://site.com">Site</a>
    

    Using exec(), what does m[1] contain?

  2. Write a regex to capture the src and alt values from:

    <img src="logo.png" alt="Company Logo">
    
  3. Fix this broken regex so it captures both parts correctly:

    const regex = /<a href=(https?:\/\/[^"]+)>([^<]*)<\/a>/;
    
  4. Use a loop with exec() (and the g flag) to extract all email addresses from:

    "Contact us at info@company.com or support@help.org"
    
  5. Explain why exec() sometimes returns null.


🧩 Review — Fill-Gap Questions

  1. Regular expressions are used to __ patterns within strings.
  2. Parentheses ( ) in regex are known as __ groups.
  3. The method used to execute a regex and return match details is __.
  4. The full matched text is always stored in match[____].
  5. The captured groups start from index number __.
  6. The global flag g allows you to find __ matches, not just one.
  7. The property lastIndex in regex helps track the search __.
  8. The result of exec() is an __ containing match info.
  9. If no match is found, exec() returns the value __.
  10. Using (.+?) instead of (.+) makes the match __ (non-greedy).
