Search code examples
powerquerym

Reading the first n rows of a csv without parsing the whole file in Power Query


I have these csv data files with my relevant data in the first five rows and a bunch of mal formmatted data below it. When I use filter rows it still reads in all of the data below which causes problems for me. This is problematic because I am reading in a whole folder's worth and some of these files have a different number of columns below the rows I'm interested in. Those columns or data below those first five rows I do not need, but power query throws errors when it is looking for the same number of columns as the last file. I would like it to just read in the first n rows which are uniform between all files in the folder. 

Is there a way to do this or bypass the error? Let me know if there is anything else I can provide to help my question be better understood. 

I have already tried filter rows, but that still reads the whole document and throws some errors.

This is similar to what I'm looking for, but it's not clear how I can edit this to achieve what I want.

Skip 6 rows before "reading" into powerquery

Ultimately I'm going to be reading in the first 5 rows of all documents in the file folder. This is much easier in Pandas, but I need an Excel solution for a coworker. The error I get is "unexpected number of columns." I have confirmed this with doing a subset of files that has the same number of columns for the lower extra data I don't need. I would like a solution robust enough to handle all of the files.


Solution

  • When you load a CSV into the Query Editor, it will likely generate M code like this:

    let
        Source = Csv.Document(File.Contents("C:\FilePath\FileName.csv"), [Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
        #"Changed Type" = Table.TransformColumnTypes(Source, {{"Column1", type text}, {"Column2", type text}, {"Column3", type text}})
    in
        #"Changed Type"
    

    Delete the last step, #Changed Type, and change Columns=3 in the first step to the number you actually want instead of what it automatically detected.