I am comparing a local and a OneDrive version of the same drive, in order to identify discrepancies in the sync. There are nearly 20,000 files in total, among deeply nested folders.
I have tried other solutions, but I prefer to keep everything in Excel, for reasons too numerous to detail here. As such, I am using PowerQuery to list the contents of each drive, and I will then use various Table.*Join()
s to compare those contents.
Unfortunately, no native option meets my needs. While Folder.Files()
does list all files recursively, it fails to include the folders. By contrast, Folder.Contents()
does include the folders, but it fails to list the contents recursively—rather, it shows only the "first level".
As such, I created the custom function Folder_FullContents()
, to recursively list both files and folders:
let Folder_FullContents = (
path as text,
optional options as nullable record
) as any =>
let
contents = Folder.Contents(path, options),
subfolders = Table.AddColumn(
Table.SelectRows(contents, each [Attributes][Directory]),
"File Path",
each Text.Combine({[Folder Path], [Name]}),
type text
)[File Path],
result = Table.Combine(List.Combine({{contents},
List.Transform(subfolders, Folder_FullContents)
}))
in result
in Folder_FullContents
While Folder_FullContents()
does technically work, it is prohibitively slow for 20,000 files.
Is there a reliable solution in PowerQuery M, that
.Files()
or .Contents()
at scale?There are some empty folders on these drives. As such, it is insufficient to simply append the .Distinct()
set of [Folder Path]
s from Folder.Files()
, to the dataset of the files themselves.
Doing so would omit the empty folders, which do not appear in any filepaths.
Buffering usually solves problems like this. I have tried the following which returns 121k rows in about 6 seconds.
let Folder_FullContents = (path as text, optional options as nullable record) as any =>
let
contents = Table.Buffer( Folder.Contents(path, options)),
subfolders = Table.AddColumn(
Table.SelectRows(contents, each [Attributes][Directory]),
"File Path",
each Text.Combine({[Folder Path], [Name]}),
type text
)[File Path],
result = Table.Combine(List.Combine({{contents},
List.Transform(subfolders, @Folder_FullContents)
}))
in result
in Folder_FullContents