
Extracting a path from a string using Ruby


I have written a Ruby script that iterates through folders and looks for file names ending in ".xyz". Within these files I then search for lines with the following structure:

<ClCompile Include="..\..\..\Projects\Project_A\Applications\Modules\Sources\myfile.c"/>

This works so far with the script:

def parse_xyz_files
  files = Dir["./**/*.xyz"]
  files.each do |file_name|
    puts file_name
    File.open(file_name) do |f|
      f.each_line do |line|
        if line =~ /<ClCompile Include=/
          puts "Found #{line}"
        end
      end
    end
  end
end

Now I would like to extract only the string between double quotes, in this example:

..\..\..\Projects\Project_A\Applications\Modules\Sources\myfile.c

I'm trying to do it with something like this (using the match method):

def parse_xyz_files
  files = Dir["./**/*.xyz"]
  files.each do |file_name|
    puts file_name
    File.open(file_name) do |f|
      f.each_line { |line|
        if line =~ /<ClCompile Include=/.match(/"([^"]*)"/)
          puts "Found #{line}"
        end
      }
    end
  end
end

The regular expression itself seems fine (checked with Rubular). Any idea how to do this in a simple way? I'm relatively new to Ruby.


Solution

  • You can use the String#scan method:

    line = '<ClCompile Include="..\..\..\Projects\Project_A\Applications\Modules\Sources\myfile.c"/>'
    
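    # The capture group returns only the text between the quotes, without the quotes themselves
    path = line.scan(/"([^"]*)"/).first.first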
    

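    Or, if your <ClCompile> tag can have other attributes, anchor the match on the attribute name:

    # [^"]* stops at the closing quote, so attributes after Include are not swallowed
    path = line.scan(/Include="([^"]*)"/).first.first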
    

    But using an XML parser is definitely a much better idea.
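
Putting the pieces together, here is a minimal sketch of both approaches, not a definitive implementation. The helper names extract_include_path and include_paths_from_xml are made up for illustration. The parser variant uses REXML, which ships with standard Ruby installs (as a bundled gem since Ruby 3.0), and assumes the .xyz files are well-formed XML without a default namespace; if yours declare one, the XPath query may need adjusting.

```ruby
require "rexml/document"

# Regex approach: return the Include path found on one line,
# or nil when the line has no <ClCompile Include="..."> tag.
def extract_include_path(line)
  line[/<ClCompile\s+Include="([^"]*)"/, 1]
end

# Parser approach: collect the Include attribute of every
# ClCompile element in an XML string.
def include_paths_from_xml(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//ClCompile").map { |el| el.attributes["Include"] }
end

# Walk all .xyz files below root and gather the paths line by line.
def parse_xyz_files(root = ".")
  Dir[File.join(root, "**", "*.xyz")].flat_map do |file_name|
    File.foreach(file_name).map { |line| extract_include_path(line) }.compact
  end
end
```

Calling parse_xyz_files returns an array of the extracted paths, so you can print them with puts or process them further. String#[] with a capture index returns nil instead of raising when a line does not match, which avoids the error that .first.first would hit on non-matching lines.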