I was provided with a .txt file that represents a set data delimited with a semicolon. The first line consists of 5 data fields, independent from the rest of the lines. Meanwhile, the other lines represent a data set delimited in 16 fields (no header line is included). The issue is that I'm having trouble to write the code to do the following in excel:
I've run out of ideas and I don't know how to approach the issue. Many thanks!
Ps. the data consists of arround 20.000+ entries so I don't know if iterating through the delimiters is efficient.
I've tried to modify the original input file, but I've run into some issue removing the first line and adding the headers while conserving the data of the errased line in the workbook. Furthermore, it seems that I cannot add a List object table unless the data is imported from a csv file, which implies a file conversion.
SAMPLE CSV
COUNTRY;31/12/2024;31/12/2024;1;NA002
EUR;1;XXXX;;ZZZZ;YY;;-99374;;;-84750;DEP;;-20474
EUR;1;XXXX;;ZZZZ;YY;;-72843;;;-55582;DEP;;-25107
EUR;1;XXXX;;ZZZZ;YY;;-43929;;;-58231;DEP;;-19313
EUR;1;XXXX;;ZZZZ;YY;;-63419;;;-23753;DEP;;-58099
EUR;1;XXXX;;ZZZZ;YY;;-15823;;;-10315;DEP;;-29964
EUR;1;XXXX;;ZZZZ;YY;;-25517;;;-51326;DEP;;-9411
EUR;1;XXXX;;ZZZZ;YY;;-72852;;;-46947;DEP;;-46277
EUR;1;XXXX;;ZZZZ;YY;;-62011;;;-58741;DEP;;-67422
EUR;1;XXXX;;ZZZZ;YY;;-11243;;;-85323;DEP;;-85026
EUR;1;XXXX;;ZZZZ;YY;;-54993;;;-34625;DEP;;-20409
EUR;1;XXXX;;ZZZZ;YY;;-17647;;;-58004;DEP;;-49106
EUR;1;XXXX;;ZZZZ;YY;;-78266;;;-31639;DEP;;-37452
EUR;1;XXXX;;ZZZZ;YY;;-95180;;;-32508;DEP;;-79336
EUR;1;XXXX;;ZZZZ;YY;;-50861;;;-33265;DEP;;-33790
EUR;1;XXXX;;ZZZZ;YY;;-8484;;;-98502;DEP;;-99399
EUR;1;XXXX;;ZZZZ;YY;;-55098;;;-52802;DEP;;-4358
EUR;1;XXXX;;ZZZZ;YY;;-81368;;;-86121;DEP;;-86581
Here is a Power Query version:
I set up three queries
Queries get pasted into the Advanced Editor in Power Query (Get & Transform
) as new queries from blank
For the first query, there are many ways to select the file. However, I just hard-coded it:
Also, rename this Query something like Read CSV
//Read in the CSV file as a single column
//Buffering might improve performance
// then this will be fed to the two queries that return the first line and the other lines
let
Source = Csv.Document(File.Contents("C:\Users\ron\Desktop\New csv.csv"),[Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None])
in
Table.Buffer(Source)
When you go to save this query, select Close and Load to
and when the next dialog opens select Connection Only
Then the First Row
query:
let
Source = Table.FromRecords({Table.First(#"Read CSV")}),
Split = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByDelimiter(";",QuoteStyle.Csv)),
//Set the data types
#"Type Data" = Table.TransformColumnTypes(Split,
List.Zip({Table.ColumnNames(Split), {type text, type date,type date, Int64.Type, type text }}),"en-GB")
in
#"Type Data"
Close and Load
will send the Query to a new worksheet.
You may want to rename the column headers.
Then for the other rows, another simple query:
let
Source = Table.RemoveFirstN(#"Read CSV",1),
Split = Table.SplitColumn(Source,"Column1", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv))
in
Split
Close and Load to
and instead of New worksheet select Existing Worksheet
and select the first cell.
Results of the above on the Excel Worksheet: