Search code examples
c#linq.net-3.5sequences

Does Linq provide a way to easily spot gaps in a sequence?


I am managing a directory of files. Each file will be named similarly to Image_000000.png, with the numeric portion being incremented for each file that is stored.

Files can also be deleted, leaving gaps in the number sequence. The reason I am asking is because I recognize that at some point in the future, the user could use up the number sequence unless I takes steps to reuse numbers when they become available. I realize that it is a million, and that's a lot, but we have 20-plus year users, so "someday" is not out of the question.

So, I am specifically asking whether or not there exists a way to easily determine the gaps in the sequence without simply looping. I realize that because it's a fixed range, I could simply loop over the expected range.

And I will unless there is a better/cleaner/easier/faster alternative. If so, I'd like to know about it.

This method is called to obtain the next available file name:

public static String GetNextImageFileName()
{
    String retFile = null;
    DirectoryInfo di = new DirectoryInfo(userVars.ImageDirectory);
    FileInfo[] fia = di.GetFiles("*.*", SearchOption.TopDirectoryOnly);
    String lastFile = fia.Where(i => i.Name.StartsWith("Image_") && i.Name.Substring(6, 6).ContainsOnlyDigits()).OrderBy(i => i.Name).Last().Name;
    if (!String.IsNullOrEmpty(lastFile))
    {
        Int32 num;
        String strNum = lastFile.Substring(6, 6);
        String strExt = lastFile.Substring(13);
        if (!String.IsNullOrEmpty(strNum) && 
            !String.IsNullOrEmpty(strExt) && 
            strNum.ContainsOnlyDigits() &&
            Int32.TryParse(strNum, out num))
        {
            num++;
            retFile = String.Format("Image_{0:D6}.{1}", num, strExt);
            while (num <= 999999 && File.Exists(retFile))
            {
                num++;
                retFile = String.Format("Image_{0:D6}.{1}", num, strExt);
            }
        }
    }

    return retFile;
}

EDIT: in case it helps anyone, here is the final method, incorporating Daniel Hilgarth's answer:

public static String GetNextImageFileName()
{
    DirectoryInfo di = new DirectoryInfo(userVars.ImageDirectory);
    FileInfo[] fia = di.GetFiles("Image_*.*", SearchOption.TopDirectoryOnly);
    List<Int32> fileNums = new List<Int32>();
    foreach (FileInfo fi in fia)
    {
        Int32 i;
        if (Int32.TryParse(fi.Name.Substring(6, 6), out i))
            fileNums.Add(i);
    }
    var result = fileNums.Select((x, i) => new { Index = i, Value = x })
                .Where(x => x.Index != x.Value)
                .Select(x => (Int32?)x.Index)
                .FirstOrDefault();

    Int32 index;
    if (result == null)
        index = fileNums.Count - 1;
    else
        index = result.Value - 1;

    var nextNumber = fileNums[index] + 1;

    if (nextNumber >= 0 && nextNumber <= 999999)
        return String.Format("Image_{0:D6}", result.Value);

    return null;
}

Solution

  • A very simple approach to find the first number of the first gap would be the following:

    int[] existingNumbers = /* extract all numbers from all filenames and order them */
    var allNumbers = Enumerable.Range(0, 1000000);
    var result = allNumbers.Where(x => !existingNumbers.Contains(x)).First();
    

    This will return 1,000,000 if all numbers have been used and no gaps exist.

    This approach has the drawback that it performs rather badly, as it iterates existingNumbers multiple times.

    A somewhat better approach would be to use Zip:

    allNumbers.Zip(existingNumbers, (a, e) => new { Number = a, ExistingNumber = e })
              .Where(x => x.Number != x.ExistingNumber)
              .Select(x => x.Number)
              .First();
    

    An improved version of DuckMaestro's answer that actually returns the first value of the first gap - and not the first value after the first gap - would look like this:

    var tmp = existingNumbers.Select((x, i) => new { Index = i, Value = x })
                             .Where(x => x.Index != x.Value)
                             .Select(x => (int?)x.Index)
                             .FirstOrDefault();
    
    int index;
    if(tmp == null)
        index = existingNumbers.Length - 1;
    else
        index = tmp.Value - 1;
    
    var nextNumber = existingNumbers[index] + 1;