ruby windows recursion file-handling directory-structure

Ruby Script Recursively Creates Deeply Nested Directories


I'm working on a Ruby script, specifically with a method named copy2tmp (method definition copy-pasted below)

define_singleton_method(:copy2tmp) do |files|
  files.each do |f|
    if File.symlink?(f)
      # Avoid trouble with symlink loops

      # Delete old symlink if there is one, because:
      # If a proper file or directory has been replaced with a symlink,
      # remove the obsolete stuff.
      # If there already is a symlink, delete because it might have been
      # relinked.
      if File.exist?("#{PARAMS[:tmpdir]}/#{f}")
        FileUtils.rm("#{PARAMS[:tmpdir]}/#{f}")
      end

      # Create new symlink instead of copying
      File.symlink("#{PARAMS[:jobpath]}/#{f}", "#{PARAMS[:tmpdir]}/#{f}")
    elsif File.directory?(f)
      FileUtils.mkdir_p("#{PARAMS[:tmpdir]}/#{f}")
      copy2tmp(Dir.entries(f)\
                  .delete_if do |s| ['.', '..', ]\
                  .include?(s) end.map { |s| "#{f}/#{s}" })
      # TODO: Is this necessary? Why not just copy? (For now, safer and more adaptable.)
    else
      FileUtils.cp(f,"#{PARAMS[:tmpdir]}/#{f}")
    end
  end
end

which seems to create excessively deeply nested directories, leading to errors. The script is intended for processing LaTeX documents using ltx2any. When it copies files to a temporary directory, it ends up creating a deeply nested structure like ../folderName/folderName/folderName/... instead. The top-level call is:

copy2tmp(Dir.entries('.').delete_if { |f| exceptions.include?(f) })

What I Have Tried:

  1. Debugging with Recursion Depth: Added debugging statements to track the recursion depth in copy2tmp. This revealed that with each recursive call, the depth increased, and the directory path grew longer (../folderName/...).
     define_singleton_method(:copy2tmp) do |files, depth = 0|
       puts "Debug: Recursion depth #{depth}, files: #{files.inspect}"
       # ...
       copy2tmp(Dir.entries(f).delete_if do |s| ['.', '..', ].include?(s) end.map { |s| "#{f}/#{s}" }, depth + 1)
       # ...
     copy2tmp(Dir.entries('.').delete_if { |f| exceptions.include?(f) }, 0)
  2. Manual Inspection: Checked for symbolic links or unusual directory structures that might cause recursive loops. Didn't find any obvious issues here. I also tried running the script in folders with and without spaces in their names.

  3. Checking copy2tmp Calls: Ensured that copy2tmp was being called with appropriate parameters and reviewed its recursive calls.

  4. Expectation vs. Reality: I expected the script to copy files to a temporary directory without creating an unnecessary nested structure. Instead, the script created a deeply nested directory path, leading to a "No such file or directory" error.

Question:

I suspect the issue lies in the copy2tmp method, either in its definition or the way it is being called. The method is meant to copy files into a temporary directory, but it seems to be causing a recursive loop of directory creation. Can anyone help identify the flaw in the copy2tmp method or its invocation that's leading to this problem? Any insights into stopping this unintended recursion would be greatly appreciated.


Solution

  • So it took a bit of digging but I found the crux of your issue.

    You are using this line

    exceptions = ignore + ignore.map { |s| "./#{s}" } + Dir['.*'] + Dir['./.*']

    to attempt to filter out some ignored directories as well as all dot files.

    In Ruby < 3.1 Dir['.*'] included the current directory ('.') and the parent directory ('..') so this filter was effective.

    As of Ruby 3.1 Dir['.*'] no longer includes the parent directory ('..'). See This Commit

    However Dir.entries('.') does include the parent directory.

    This means when you call

    Dir.entries('.').delete_if { |f| exceptions.include?(f) }
    

    Since '..' is no longer in exceptions, it survives the filter and stays in the files array, so you are passing the parent directory to the method. The method recurses into every directory it receives and prefixes each entry with that directory's path when calling itself again, so on Ruby 3.1+ you were stuck in a never-ending loop that kept building longer and longer paths.
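    The leak can be checked in isolation. The following is only a minimal sketch, not part of the original script: surviving_entries is a hypothetical helper, the file names are made up, and the exceptions line is copied from the question. It builds the same filter inside a scratch directory and reports whether '..' slips through on the Ruby version you run it with.

```ruby
require 'tmpdir'

# Hypothetical helper (not from the original script): applies the same
# filter as the top-level call and returns whatever survives it.
def surviving_entries(dir, ignore = [])
  Dir.chdir(dir) do
    exceptions = ignore + ignore.map { |s| "./#{s}" } + Dir['.*'] + Dir['./.*']
    Dir.entries('.').delete_if { |f| exceptions.include?(f) }
  end
end

Dir.mktmpdir do |dir|
  # Made-up sample content: one regular file and one dot file.
  File.write(File.join(dir, 'main.tex'), '')
  File.write(File.join(dir, '.hidden'), '')

  survivors = surviving_entries(dir)
  puts "Ruby #{RUBY_VERSION}: #{survivors.sort.inspect}"
  puts "'..' leaked through the filter" if survivors.include?('..')
end
```

    If the second message is printed, the top-level call hands the parent directory to copy2tmp, which is the situation described above.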

    While there are other ways to fix this issue, the simplest and most backwards-compatible one is to use Dir.children('.'), because Dir::children guarantees that it will not include '.' or '..'. From the documentation:

    Returns an array containing all of the filenames except for “.” and “..” in the given directory.
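    As a rough sketch of that change (top_level_files and the sample exceptions list are hypothetical stand-ins, not names from the original script), the top-level listing could be built on Dir.children instead of Dir.entries:

```ruby
require 'tmpdir'

# Hypothetical helper: lists the top-level entries of dir without '.' or '..'
# and drops anything named in exceptions.
def top_level_files(dir, exceptions)
  Dir.children(dir).reject { |f| exceptions.include?(f) }
end

Dir.mktmpdir do |dir|
  # Made-up sample content for illustration.
  File.write(File.join(dir, 'main.tex'), '')
  Dir.mkdir(File.join(dir, 'figures'))
  File.write(File.join(dir, '.hidden'), '')

  exceptions = ['.hidden'] # stand-in for the script's real exceptions list
  puts top_level_files(dir, exceptions).sort.inspect
  # prints ["figures", "main.tex"]
end
```

    In the script itself, this would presumably amount to replacing Dir.entries('.') with Dir.children('.') in the top-level call, leaving the exceptions filter in place for the ignored and dot files.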

    Working Example