I am working on an Excel add-in that talks to an intranet server.
I have a list of employee names. Each employee has a folder on the intranet, and that folder may or may not contain a PowerPoint file, so I need to look for the files for each name.
The problem is with the names. Each folder name follows this pattern:
surname, firstname
The difficulty is with people whose first name or surname consists of several words,
e.g. samy jack sammour: the first name is "samy jack" and the last name is "sammour",
so the folder would be: sammour, samy jack
But I only have the full name in a single field, so I don't know which part is the last name and which is the first name (the folder could be "jack sammour, samy" or "sammour, samy jack"). So I tried this code to work around it:
string[] dirs = System.IO.Directory.GetFiles(@"/samy*jack*sammour/","*file*.pptx");
if (dirs.Length > 0)
{
    MessageBox.Show("true");
}
But it gave me an error:
file is not illegal
How can I fix this problem and search all the possibilities?
That should do the trick:
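// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
var path = @"C:\Users\";
var name = "samy jack sammour";

// Recursively builds every ordering of the items, joined by spaces
// (no space is inserted in front of the comma).
Func<IEnumerable<string>, IEnumerable<string>> permutate = null;
permutate = items =>
    items.Count() > 1 ?
        items.SelectMany(
            (_, ndx1) => permutate(items.Where((__, ndx2) => ndx1 != ndx2)),
            (item1, item2) => item1 + (item2.StartsWith(",") ? "" : " ") + item2) :
        items;

// The name parts plus a "," that acts as the surname/firstname separator.
var names = name.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Concat(new[] { "," }).ToArray();
// Keep only the orderings where the comma sits between two parts.
var dirs = new HashSet<string>(permutate(names).Where(n => !n.StartsWith(",") && !n.EndsWith(",")), StringComparer.OrdinalIgnoreCase);

// True if any matching directory contains at least one .pptx file.
if (new DirectoryInfo(path).EnumerateDirectories().Any(dir => dirs.Contains(dir.Name) && dir.EnumerateFiles("*.pptx").Any()))
    MessageBox.Show("true");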
In my opinion, you shouldn't do this with a regex, because regexes can't match permutations very well. Instead, you can create a HashSet that contains all permutations matching your pattern, compared case-insensitively:
surname, firstname
(Case sensitivity isn't needed because the Windows file system doesn't care whether a directory or file name is upper or lower case.)
For the sake of simplicity, I just add the comma as one more part to permute and, in a second step, filter out the results that start or end with a comma. If performance matters, or if the names can consist of many parts, I'm sure there's a way to rule these cases out earlier so that a large share of the unnecessary permutations is never generated.
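One possible way to skip the invalid orderings entirely, as a rough sketch rather than a tuned implementation: permute only the name parts, then place the comma at every gap between two parts. The class `FolderNameCandidates` and its methods `Permute` and `Build` are hypothetical names introduced for this illustration, and the sketch assumes the parts are separated by single spaces as in the example name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class FolderNameCandidates
{
    // Yields every ordering of the given parts.
    private static IEnumerable<IReadOnlyList<string>> Permute(IReadOnlyList<string> parts)
    {
        if (parts.Count <= 1)
        {
            yield return parts;
            yield break;
        }
        for (int i = 0; i < parts.Count; i++)
        {
            // All parts except the one at position i.
            var rest = parts.Where((_, j) => j != i).ToList();
            foreach (var tail in Permute(rest))
                yield return new[] { parts[i] }.Concat(tail).ToList();
        }
    }

    // Builds "surname, firstname" candidates: each ordering is split
    // once at every gap between two parts, so the comma never ends up
    // at the start or the end.
    public static HashSet<string> Build(string fullName)
    {
        var parts = fullName.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var order in Permute(parts))
            for (int split = 1; split < order.Count; split++)
                result.Add(string.Join(" ", order.Take(split)) + ", " + string.Join(" ", order.Skip(split)));
        return result;
    }
}
```

Calling `FolderNameCandidates.Build("samy jack sammour")` should produce the same candidate set as the permute-then-filter approach, without ever building the strings that would be discarded.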
In the last step you enumerate the directory names and check whether each one is in this HashSet of all possible names. Once you've found a matching directory, you just need to search it for .pptx files. If necessary, replace "*.pptx" with your own file name pattern.
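If you need the files themselves rather than just a yes/no answer, the lookup step could be sketched roughly as follows. `PresentationFinder` and `FindFiles` are hypothetical names for this illustration, `root` stands for whatever share or folder holds the employee directories, and `candidates` is assumed to be a case-insensitive set of possible folder names like the HashSet described above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only.
public static class PresentationFinder
{
    // Returns the full paths of all files matching the pattern that sit
    // directly inside a sub-directory of root whose name is a candidate.
    public static List<string> FindFiles(string root, ISet<string> candidates, string pattern = "*.pptx")
    {
        return new DirectoryInfo(root)
            .EnumerateDirectories()
            .Where(dir => candidates.Contains(dir.Name))
            .SelectMany(dir => dir.EnumerateFiles(pattern))
            .Select(file => file.FullName)
            .ToList();
    }
}
```

An empty result then means that either no folder matched any candidate name or the matching folder holds no file with that pattern.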