
Remove a column from CSV file in Notepad++


First of all, I'm a newbie in Notepad++. I'm trying to edit a dataset stored as CSV in Notepad++. I can't open the file in Excel because some cells contain numbers longer than 15 digits, and Excel would convert them.

The column headings are:

BNFUniqueID,username,CollectTeam,Hos_methodID,CollectData,hhstatus,hhupazila,hhunion,hhlocationsitetype,hhsitename,hhlocalblock,villagename,hhlandmark,hhmajiname,hhmajitel,HoHname,Hhsize,HoHcontact,cardtype,cardnum,olduniqueID,BNFname,BNFage,BNFsex,BNFagegroup

In this dataset there is a column (hhlandmark) in the middle that I want to delete entirely. One problem is that not all of its cells contain data, so column (vertical) selection with ALT + SHIFT isn't suitable: it would also grab cells from other columns.

I'm looking for a way to avoid this and delete only the column I want to delete.


Solution

  • There are 12 columns before the hhlandmark column, so we can try the following find and replace in regex mode:

    Find:    ^((?:[^,]*,){12})[^,]*,(.*)$
    Replace: $1$2
    

    This pattern says to match:

    • ^ from the start of the line
    • ((?:[^,]*,){12}) match and capture in $1 the first 12 columns, each with its trailing comma
    • [^,]*, then match the 13th column (hhlandmark) and its trailing comma
    • (.*) match and capture in $2 the rest of the line
    • $ end of the line

    We then replace with $1$2 to effectively splice out the 13th hhlandmark column.

    Here is a running regex demo.
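
If scripting is an option, the same splice can be sketched in Python. This is only an illustration, not part of the Notepad++ workflow: the function names `drop_13th_column_regex` and `drop_column_csv` are made up for this example. The regex variant uses the same pattern as above, except that the character class also excludes line breaks (`[^,\r\n]`), so a row with fewer than 13 fields cannot make the match run on into the next line. Both the pattern above and this regex variant assume that no field contains a quoted comma; the second function uses the standard `csv` module, which handles quoted fields and keeps every value as text, so long digit strings are not altered.

```python
import csv
import io
import re

# Same idea as the Notepad++ search: capture 12 fields, skip the 13th,
# capture the rest. \r and \n are excluded so a match stays on one line.
PATTERN = re.compile(r"^((?:[^,\r\n]*,){12})[^,\r\n]*,(.*)$", re.MULTILINE)


def drop_13th_column_regex(text: str) -> str:
    """Splice out the 13th comma-separated field on every line.

    Assumes no field contains a quoted comma.
    """
    return PATTERN.sub(r"\1\2", text)


def drop_column_csv(text: str, name: str) -> str:
    """Drop the column whose header is `name`, honouring quoted fields."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    idx = rows[0].index(name)  # position of the column in the header row
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # Slicing is safe even for rows shorter than the header.
        writer.writerow(row[:idx] + row[idx + 1:])
    return out.getvalue()
```

Looking the column up by header name, as in `drop_column_csv(text, "hhlandmark")`, avoids counting columns by hand, which is the main thing that can go wrong with the `{12}` in the regex.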