UltraEdit regular expression help: extracting 2 values from comma-separated lines


I have a file containing our client list, and I only want to extract the email address and first name from each line.

So a sample from the file:

a@abc.com,www.abc.com,2011-11-15 00:00:00,8.8.8.8,John,Doe,209 Park Rd,See,FL,33870,,,
b@abc.com,cde.com,2011-11-07 00:00:00,4.4.4.4,Erickson,Crast,136 Kua St # 1367,Pearl,HI,96782,,8084568190,

I would like to get back

a@abc.com,John
b@abc.com,Erickson

So basically email address and First Name

I know I can do this in PowerShell, but maybe a find and replace in UltraEdit will be faster.

Note: some fields are not provided, so the line shows ",," where a field was left empty when the user signed up. The number of commas in each line is always the same, namely 12.


Solution

  • So basically there are fields separated by ",". Without validating the content (the email, timestamp, etc. each have a certain format which could also be checked), let's just extract the values of the first and fifth fields.

    I'd suggest a Replace operation where you search for

    ^([^,]*),[^,]*,[^,]*,[^,]*,([^,]*),.*$
    

    and replace it with

    \1 # \2
    

    Options: "Regular Expressions: Unix".

    (I inserted the # only to have a visible separator; a single space would be sufficient. To get the comma-separated output asked for in the question, replace with \1,\2 instead.)

    Result:

    a@abc.com # John
    b@abc.com # Erickson
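
For comparison, the same pattern can be tried outside the editor. The following is a minimal Python sketch, not part of the original answer: it assumes Python's `re` module treats this particular pattern the same way as UltraEdit's Unix-style engine, and the function name `extract_email_and_first_name` is hypothetical. It joins the two captured fields with a comma, which is the output format requested in the question.

```python
import re

# Same pattern as the UltraEdit search string:
# capture field 1, skip fields 2-4, capture field 5, ignore the rest.
PATTERN = re.compile(r"^([^,]*),[^,]*,[^,]*,[^,]*,([^,]*),.*$")


def extract_email_and_first_name(line):
    """Return 'email,first_name' for one record, or None if the line does not match.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.match(line)
    if match is None:
        return None
    return f"{match.group(1)},{match.group(2)}"


# The two sample lines from the question.
sample = [
    "a@abc.com,www.abc.com,2011-11-15 00:00:00,8.8.8.8,John,Doe,209 Park Rd,See,FL,33870,,,",
    "b@abc.com,cde.com,2011-11-07 00:00:00,4.4.4.4,Erickson,Crast,136 Kua St # 1367,Pearl,HI,96782,,8084568190,",
]

results = [extract_email_and_first_name(line) for line in sample]
for result in results:
    print(result)
```

Like the editor version, this relies on no field containing a comma of its own; a quoted field with an embedded comma would need a real CSV parser.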