regex notepad++

Finding Ten Digit Number using regex in notepad++


I am trying to strip everything out of a data dump and keep only the ten-digit numbers, using Notepad++ regex.

I tried something like (?<!\d)0\d{7}(?!\d), but no luck.


Solution

  • Foreword

    Older versions of Notepad++ had problems handling PCRE expressions. This proposed solution was tested in Notepad++ v6.8.8, but should work in any version later than v6.2.

    Description

    ([0-9]{10})|.
    


    Replace with: $1

    This expression will do the following:

    • captures each 10-digit number into capture group 1, which is then reinserted into the output string
    • matches everything else, one character at a time, and removes it
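
The same capture-or-discard idea can be tried outside the editor. The following is a minimal sketch in Python, whose `re` module is not the engine Notepad++ uses, so treat it as an approximation. The function name `keep_ten_digit_runs` and the sample text are made up for illustration.

```python
import re

# Same pattern as in the answer: capture a run of ten digits,
# or otherwise match any single character.
PATTERN = re.compile(r"([0-9]{10})|.")

def keep_ten_digit_runs(text):
    # group(1) is None whenever the '.' alternative matched, so those
    # characters are replaced with an empty string.
    return PATTERN.sub(lambda m: m.group(1) or "", text)

# Hypothetical sample input, not taken from the original question.
sample = "id=0123456789; note: call 555\nref 9876543210!"
print(keep_ten_digit_runs(sample))
```

In this sketch the line break survives, because `.` does not match `\n` in Python unless `re.DOTALL` is set. Numbers that sit on the same line would be joined together with nothing between them.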

    How To in Notepad++

    From Notepad++

    1. Press Ctrl+H to open the Find and Replace dialog

    2. Select the Regular Expression option

    3. In the "Find what" field place the regular expression

    4. In the "Replace with" field enter $1

    5. Click Replace All

    Example

    Live Demo

    https://regex101.com/r/fZ9vH7/1

    Source Text

    fdsafasfa1234567890zzzzzzz12345
    

    After Replacement

    1234567890
    

    Explanation

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        [0-9]{10}                any character of: '0' to '9' (10 times)
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
     |                        OR
    ----------------------------------------------------------------------
      .                        any character except \n
    ----------------------------------------------------------------------
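
To watch the alternation in the table above make its choice match by match, here is a small sketch using Python's `re.finditer`. Python's engine stands in for Notepad++'s here, which is an assumption; the helper name `classify` and the input string are hypothetical.

```python
import re

PATTERN = re.compile(r"([0-9]{10})|.")

def classify(text):
    """Return (matched_text, kept) pairs, one per regex match.

    kept is True when capture group 1 took part in the match,
    i.e. the ten-digit alternative won over the '.' alternative.
    """
    return [(m.group(0), m.group(1) is not None)
            for m in PATTERN.finditer(text)]

for matched, kept in classify("ab1234567890c12"):
    print(repr(matched), "kept" if kept else "dropped")
```

Each non-captured character shows up as its own one-character match, which is why replacing with $1 leaves only the captured runs behind.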
    

    Extra credit

    The OP wasn't clear on what to do with runs of digits longer than 10 characters. If such runs are undesirable and need to be removed in their entirety, then use this:

    ([0-9]{10})(?![0-9])|[0-9]+|.
    


    Replace with: $1

    Live Demo: https://regex101.com/r/aS4sN1/1
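
The difference between the two expressions is easiest to see side by side. Below is a rough Python sketch, again assuming Python's `re` behaves like the Notepad++ engine for these constructs. The names `LOOSE`, `STRICT`, and `extract` are illustrative only.

```python
import re

# First expression: any ten consecutive digits are kept,
# even when they are part of a longer run.
LOOSE = re.compile(r"([0-9]{10})|.")

# Extra-credit expression: ten digits are kept only when no digit follows;
# any other run of digits is consumed whole by [0-9]+ and dropped.
STRICT = re.compile(r"([0-9]{10})(?![0-9])|[0-9]+|.")

def extract(pattern, text):
    # Keep capture group 1 when it matched, drop everything else.
    return pattern.sub(lambda m: m.group(1) or "", text)

# Hypothetical input: an 11-digit run, a 10-digit run, and a 3-digit run.
text = "a12345678901b1234567890c123"
print(extract(LOOSE, text))   # keeps the first ten digits of the long run too
print(extract(STRICT, text))  # keeps only the exact ten-digit run
```

The `[0-9]+` alternative matters because it swallows an over-long run in one match, so the engine never gets a chance to restart in the middle of that run and capture its last ten digits.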