Search code examples
javascriptregexgoogle-apps-scriptregexp-replace

Google App Scripts - Regex - Remove all letters, replace all spaces, or new lines with comma not working


INTRO

I have a Google App Script Web App which has a form which contains a TextArea, and when submitted saves the data in a Google Sheet.

This text-area collects Employee ID's and will receive input from non-tech savvy users.

The users may use the following formats when entering the ID's into the textarea, including typos such as extra spaces, commas, or a mixture of all 3 formats.

A) Separated by a space.

A123456 B001234 TMP00123456

B) Separated by a comma (with or without a space too)

A123456,B001234, TMP00123456

C) One / line.

A123456
B001234
TMP00123456

THE OBJECTIVE

I need to cleanse the data above before its written to the Google Sheet. The cleansed (expected outcome) data should look like

123456,001234,00123456

THE PROBLEM

I cant seem to get REGEX to work correctly. I have tried many variations.

var agentIds = form.agentIds.replace(/[^\d]+/g, ',').replace(/^,+|,+$/g, '');

In a nutshell, I am looking for regex to...

  1. Replace all spaces with comma.
  2. Replace all line-breaks with commas.
  3. Remove all letters.
  4. IF there are 2 or more commas next to each other, remove the extras.

II am not sure if this is the most efficient way...but the only way I could think of. I am open to suggestions.

Thanks for your help!!! :)


Solution

  • I believe your goal is as follows.

    • You want to retrieve a value of 123456,001234,001234567 from "A123456 B001234 TMP00123456", "A123456,B001234, TMP001234567", "A123456\nB001234\nTMP001234567".

    In this case, how about the following sample script? In this sample script, I used split and replace.

    Sample script:

    const values = ["A123456 B001234 TMP00123456", "A123456,B001234, TMP00123456", "A123456\nB001234\nTMP00123456"];
    const res = values.map(e => e.split(/[\s,]+/g).map(f => f.trim().replace(/[A-Z]/ig, "")).join(","));
    // or const res = values.map(e => e.replace(/[\s,]+/g, ",").replace(/[A-Z]/ig, ""));
    console.log(res);

    • When this script is run, ["123456,001234,00123456","123456,001234,00123456","123456,001234,00123456"] is obtained.

    • In your sample input value and output value, it seems that a value of 001234567 is retrieved from TMP00123456 of A123456 B001234 TMP00123456. I'm worried that you might have miscopied TMP001234567 as TMP00123456. I'm not sure about the detail of this.

    • If you want to use this script for one value, how about the following sample script?

      const sample = "A123456,B001234, TMP00123456";
      const res = sample.split(/[\s,]+/g).map(f => f.trim().replace(/[A-Z]/ig, "")).join(",");
      // or const res = sample.replace(/[\s,]+/g, ",").replace(/[A-Z]/ig, "");
      console.log(res);
      

    References: