I have been trying to make changes to a csv invoice that my company gets on a monthly basis. However, the File is 1.5 GB in size and contains over 1.500.000 rows with 36 columns. While I can open the file in Microsoft Excel, trying to edit the data results in a majority of the rows/columns getting deleted without my knowledge. Trying to go in and edit it results in me loosing about 16MB of data. I wrote a PowerShell script to split the file up into two different files, but the problem with that is that one of the CSVs is missing a header and I have to manually add it in. I also have the same issue of some data being deleted despite being under the 1,048,576 rows limit that Excel has.
I have tried updating the CSV in Visual Studio Code and using the Rainbow CSV extension, but I'm seeing Visual Studio crash every time I try to open the CSV. I have also tried using Notepad++ and using the CSV Lite plugin, but that doesn't have a filter option to make the operation easier. I was wondering if anyone had suggestions for editing a CSV file this big or if they know a way to edit the data in Excel without the risk of data being deleted on accident?
Thank you!
@MagineMozios had the right idea. It is very easy to write a PowerShell script to edit the CSV data without deleting anything. Here is the PowerShell script I made to edit the data of my 1.5GB CSV File.
Code:
# Define the path to the CSV file
$csvPath = "C:\Users\yourfile.csv" # Replace 'yourfile.csv' with your actual CSV file name
# Load the CSV into a variable
$csvData = Import-Csv -Path $csvPath
# Iterate through each row in the CSV
foreach ($row in $csvData) {
# Check if any column contains "SubscriptionName1" and update the "CostCenter" column to "####"
if ($row.PSObject.Properties.Value -contains "SubscriptionName1") {
$row."CostCenter" = "####"
}
# Check if any column contains "SubscriptionName2" and update the "CostCenter" column to "####"
elseif ($row.PSObject.Properties.Value -contains "SubscriptionName2") {
$row."CostCenter" = "####"
}
elseif ($row.PSObject.Properties.Value -contains "SubscriptionName3") {
$row."CostCenter" = "####"
}
}
# Export the updated CSV
$csvData | Export-Csv -Path $csvPath -NoTypeInform
Thank you to everyone who contributed!