Search code examples
powershellcsvxlsx

Powershell delete column from CSV


I want to write a script for our Telephone system (3CX) that should convert a xlsx to a csv (this works fine), but I don't know how to replace a column
with a "," if its a certain value.

The format of our Contacts in the xlsx:

Name, Surname, Company, Mobile, Zip Code, Territory, Language, Telephone, Country, Email, Minor, Nr., Sellerscode, Currency

John;Doe;Microsoft;+49 (0)170 11223344;24558;;DEU;+49 (0)89 777778;DE;j.doe@Microsoft.com;Nein;KT001122;;

I need this format:

FirstName;LastName;Company;Mobile;Mobile2;Home;Home2;Business;Business2;Email;Other;BusinessFax;HomeFax;Pager

John,Doe,Microsoft,004917011223344,,,,00494193750860,,j.doe@microsoft.com,,,,

Can anyone help me please, to replace the following columns with a ","?

Zip-Code, Territory, Language, Country, Minor, Nr., Sellerscode, Currency 

This is what I get if I run it:

FirstName,LastName,Company,Mobile,Mobile2,Home,Home2,Business,Business2,Email,Other,BusinessFax,HomeFax,Pager

john,doe,Microsoft,0049 151 119842547,24558,,DEU,0049 4193 546860,DE,j.doe@microsoft.de,Nein,KT001122,,

jane,doe,Apple,0049 151 1215362547,24558,,DEU,0049 4193 8464860,DE,j.doe@apple.de,Nein,KT001122,,

max,doe,Microsoft,0049 151 11864357,24558,,DEU,0049 4193 535163560,DE,max.doe@lenovo.de,Nein,KT001122,,

This is what I need to get:

FirstName,LastName,Company,Mobile,Mobile2,Home,Home2,Business,Business2,Email,Other,BusinessFax,HomeFax,Pager

John,doe,Microsoft,0049151119842547,,,,0049419546860,,j.doe@microsoft.de,,,,

Jane,doe,Apple,00491511215362547,,,,004941938464860,,j.doe@apple.de,,,,

max,doe,lenovo,004915111864357,,,,00494193535163560,,max.doe@lenovo.de,,,,

This is my script:

$xlsx = 'U:\Tools\3cx_contacts\Kontakte-3cxExport.xlsx'

# Excel should be installed on the system
$InFile = Get-Item $xlsx
$OutFile= $InFile.FullName.replace($InFile.Extension,".csv")
$Excel = new-object -ComObject "Excel.Application"
$Excel.DisplayAlerts = $True
$Excel.Visible = $False # $True while testing
$WorkBook = $Excel.Workbooks.Open($InFile.FullName)
$WorkBook.SaveAs($OutFile, 6) # 6 -> type csv
$WorkBook.Close($True)
$Excel.Quit()
[void][System.Runtime.Interopservices.Marshal]::ReleaseComObject($Excel)

$file  = 'U:\Tools\3cx_contacts\Kontakte-3cxExport.csv'
$file2 = 'U:\Tools\3cx_contacts\Kontakte-3cxExport2.csv'
$file3 = 'U:\Tools\3cx_contacts\Kontakte-3cxExport3.csv'

$search1 = '  '
$replace1 = ' '
$search2 = ';'
$replace2 = ','
$search3 = '\(0\)'
$replace3 = ''
$search4 = 'Ä'
$replace4 = 'Ae'
$search5 = 'ä'
$replace5 = 'ae'
$search6 = 'Ü'
$replace6 = 'Ue'
$search7 = 'ü'
$replace7 = 'ue'
$search8 = 'Ö'
$replace8 = 'Oe'
$search9 = 'ö'
$replace9 = 'oe'
$search10 = "\+"
$replace10 = '00'



# remove first two lines
get-content $file |
    select -Skip 2 |
    set-content "$file-temp"
move "$file-temp" $file2 -Force


#Import-Csv -Path $file -Delimiter "," | 
#    Select-Object -ExcludeProperty "PLZ-Code", Gebietscode, Sprachcode, Länder-/Regionscode, Minderjährig, Nr.,    Verkäufercode, Währungscode | 
#        Export-Csv -Path  $file2 -Delimiter "," -Encoding UTF8 -NoTypeInformation


# insert specific first line
$x = Get-Content $file2
$x[0] = "FirstName;LastName;Company;Mobile;Mobile2;Home;Home2;Business;Business2;Email;Other;BusinessFax;HomeFax;Pager"
$x | Out-File $file2

# remove dual spaces
for ($i = 0; $i -lt 10; $i++)
{
    (Get-Content -Path $file2) -replace $search1,$replace1 | Set-Content -Path $file2

}
   
# replace german chars
(Get-Content -Path $file2) -replace $search2,$replace2 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search3,$replace3 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search4,$replace4 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search5,$replace5 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search6,$replace6 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search7,$replace7 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search8,$replace  | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search9,$replace9 | Set-Content -Path $file2
(Get-Content -Path $file2) -replace $search10,$replace10 | Set-Content -Path $file2

$purchaser = 'Einkauf,'
$content = Get-Content $file2

Get-Content $file2 | 
    Where-Object { $_ -notmatch $purchaser } | 
    Set-Content $file3

Solution

  • This is a solution using regex: as you see ,for your replace , i have created an object with key = value to search and value = replace, i have set just some couples (key,value)

    $file2 ="C:\Documents\tel.csv"
    #use -Raw to put all the file in a string instead of an array of string
    $x = Get-Content $file2 -Raw
    $OFS = "`r`n"
    $title = "FirstName,LastName,Company,Mobile,Mobile2,Home,Home2,Business,Business2,Email,Other,BusinessFax,HomeFax,Pager"
    $title += $OFS
    
    #add your search,replace couples
    $r = @{
        '  '= ' ';
        ";" = ",";
        "\(0\)"= "";
        "Ä" ="Ae";
        "\+" = "00"
    }
    #do the job for each couple
    foreach ($k in $r.Keys) {
      $x = $x -replace $k, $r.$k
    }
    
    #work on final job, to reset some values
    $option = [System.Text.RegularExpressions.RegexOptions]::Multiline 
    $pattern = [regex]::new("^(?<four>(?:.*?,){4})(?:.*?,){3}(?<Business>.*?,).*?,(?<Email>.*?,).+$", $option)
    
    #do the replace final
    $x = $pattern.Replace($x, '${four},,,${Business},${Email},,,');
    
    #add line of title
    $x = $title + $x
    

    explanation about final regex:

    ^                                    Beginning of the line
    (?<four>(?:.*?,){4})                 captured group named four -> 4 times .*,
    (?:.*?,){3}                          3 times .*, to discard
    (?<Business>.*?,)                    captured group named Business
    .*?,                                 1 time .*, to discard
    (?<Email>.*?,)                       captured group named Email
    .+$                                  rest of the line 
    

    If you want to work on other fields: use this pattern to help you

    $pattern = ""
    $pattern += "^(?<firstname>.*?,)"
    $pattern += "(?<lastname>.*?,)"
    $pattern += "(?<company>.*?,)"
    $pattern += "(?<mobile>.*?,)"
    $pattern += "(?<mobile2>.*?,)"
    $pattern += "(?<home>.*?,)"
    $pattern += "(?<home2>.*?,)"
    $pattern += "(?<business>.*?,)"
    $pattern += "(?<business2>.*?,)"
    $pattern += "(?<email>.*?,)"
    $pattern += "(?<other>.*?,)"
    $pattern += "(?<businessfax>.*?,)"
    $pattern += "(?<homefax>.*?,)"
    $pattern += "(?<pager>.*)$"
      
    # becareful: use simple quote
    $newvalue =''
    $newvalue += '${firstname}'
    $newvalue += '${lastname}'
    $newvalue += '${company}'
    $newvalue += '${mobile}'
    $newvalue += ','  #nothing for mobile2
    $newvalue += ','  #nothing for home
    $newvalue += ','  #nothing for home2
    $newvalue += '${busisness}'
    $newvalue += ','  #nothing for busisness2
    $newvalue += '${email}'
    $newvalue += ','  #nothing for other
    $newvalue += ','  #nothing for businessfax
    $newvalue += ','  #nothing for homefax
    $newvalue += ''   #nothing for pager
    
    $resultfinal = $pattern.Replace("textoffile", $newvalue)