Finding numbers with a specific length from csv file


I'm working with a CSV file from a customer that holds a large amount of data. The data was extracted from an SQL database, so the commas separate the columns. One of these columns contains 10-digit numbers. For some reason, every 10-digit number starting with 0 has been converted to a 9-digit number with the leading 0 removed. I need to find all these instances and insert a 0 at the beginning of each 9-digit number.

A complication is that another column also contains 9-digit numbers, and these must not be modified. I can assume, however, that all of those numbers start with 0 and that none of the numbers I need to find start with 0.

I'm currently using Notepad++ to fix the problem and found the regular expression \d{9}, which finds all numbers with 9 digits, but that is not what I'm looking for.

Below is an example of how the data could look. The column whose 9-digit numbers need to be converted is on the left, and the other column with 9-digit numbers is on the right:

Column 1 Column 2
2323232323 002132413
231985313 004542435

In this example I need to find the second line of column 1 and insert a 0 in front of the number.


Solution

    • Ctrl+H
    • Find what: \b(?!0)\d{9}\b
    • Replace with: 0$0
    • TICK Wrap around
    • SELECT Regular expression
    • Replace all

    Explanation:

    \b          # word boundary: make sure there is no digit immediately before
    (?!0)       # negative lookahead: make sure the next character is not 0
    \d{9}       # exactly 9 digits
    \b          # word boundary: make sure there is no digit immediately after
    

    Replacement:

    0           # 0 to be inserted
    $0          # the whole match (i.e. the 9 digits)
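
    If the file is too large to edit comfortably in Notepad++, the same find-and-replace can be sketched in Python. This is only an illustration, not part of the original answer: the function name pad_nine_digit_numbers and the comma-separated sample rows are assumptions, and Python writes the whole-match reference as \g<0> rather than $0.

```python
import re

# Same pattern as in the "Find what" field above.
PATTERN = re.compile(r"\b(?!0)\d{9}\b")


def pad_nine_digit_numbers(text: str) -> str:
    """Prefix a 0 to every standalone 9-digit number that does not start with 0."""
    # \g<0> refers to the whole match, like $0 in the Notepad++ replacement.
    return PATTERN.sub(r"0\g<0>", text)


# Hypothetical sample rows modelled on the example data in the question.
sample = "2323232323,002132413\n231985313,004542435\n"
print(pad_nine_digit_numbers(sample))
```

    Only 231985313 is changed here: the 10-digit number fails the trailing \b, and the numbers that start with 0 are rejected by the lookahead.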
    
