
Can't filter sensitive data with VCR


I've got the following in my spec_helper

c.filter_sensitive_data("<FILTERED>") { keys['s3_key'] }
c.filter_sensitive_data("<REDACTED>") { keys['s3_secret'] }

Yet when I run my spec I find that it creates the following entry in the cassette:

Authorization:
- AWS <FILTERED>:this_part_has_not_been_filtered=

As you can see, part of the value has not been filtered. I'm not sure whether it contains useful information, but I don't want to paste it just in case. I can say, however, that it doesn't contain my key or my secret. Is it just fluff? Should I care? Is this what normally happens when filtering S3 requests made with the aws-sdk gem? If not, how can I get it to filter all of the authorization data?

Is there a special set of instructions for filtering S3 keys? I really don't want to mess this up.


Solution

  • Well, it looks safe if your key isn't in there. To be sure, you could use a regexp matcher to replace the whole string, something like %r<#{keys['s3_key']}:.*?=>. Bad news: filter_sensitive_data doesn't accept a regexp. Good news: you can use lower-level methods to implement that yourself.
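
To see what that regexp would match, here is a minimal sketch in plain Ruby, outside VCR. The key and the header value are made-up stand-ins shaped like the cassette entry in the question, and the pattern assumes the signature ends in "=" as it does there.

```ruby
# Hypothetical stand-in for keys['s3_key']; not a real credential.
s3_key = "AKIAEXAMPLEKEY"

# Made-up header value in the same shape as the cassette entry.
header = "AWS #{s3_key}:c2lnbmF0dXJlLWJ5dGVz="

# Escape the key so any regexp metacharacters in it are matched literally,
# then match everything up to and including the first "=".
pattern = /#{Regexp.escape(s3_key)}:.*?=/

filtered = header.sub(pattern, "<FILTERED>")
puts filtered
```

Regexp.escape is an addition to the pattern above; it only matters if the key happens to contain characters that are special in a regexp.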

    This is the current implementation of filter_sensitive_data:

    # @param placeholder [String] The placeholder string.
    # @param tag [Symbol] Set this to apply this only to cassettes
    #  with a matching tag; otherwise it will apply to every cassette.
    # @yield block that determines what string to replace
    # @yieldparam interaction [(optional) VCR::HTTPInteraction::HookAware] the HTTP interaction
    # @yieldreturn the string to replace
    def define_cassette_placeholder(placeholder, tag = nil, &block)
      before_record(tag) do |interaction|
        orig_text = call_block(block, interaction)
        log "before_record: replacing #{orig_text.inspect} with #{placeholder.inspect}"
        interaction.filter!(orig_text, placeholder)
      end
    
      before_playback(tag) do |interaction|
        orig_text = call_block(block, interaction)
        log "before_playback: replacing #{placeholder.inspect} with #{orig_text.inspect}"
        interaction.filter!(placeholder, orig_text)
      end
    end
    alias filter_sensitive_data define_cassette_placeholder
    

    Source

    Which leads us to these methods

      # Replaces a string in any part of the HTTP interaction (headers, request body,
      # response body, etc) with the given replacement text.
      #
      # @param [#to_s] text the text to replace
      # @param [#to_s] replacement_text the text to put in its place
      def filter!(text, replacement_text)
        text, replacement_text = text.to_s, replacement_text.to_s
        return self if [text, replacement_text].any? { |t| t.empty? }
        filter_object!(self, text, replacement_text)
      end
    
    private
    
      def filter_object!(object, text, replacement_text)
        if object.respond_to?(:gsub)
          object.gsub!(text, replacement_text) if object.include?(text)
        elsif Hash === object
          filter_hash!(object, text, replacement_text)
        elsif object.respond_to?(:each)
          # This handles nested arrays and structs
          object.each { |o| filter_object!(o, text, replacement_text) }
        end
    
        object
      end
    

    Source

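    Oh well, we might just try monkey patching this method.

    Somewhere in your spec_helper:

    class VCR::HTTPInteraction::HookAware
      def filter!(text, replacement_text)
        # Leave a Regexp as-is; coerce anything else to a String as before.
        text = text.to_s unless text.is_a?(Regexp)
        replacement_text = replacement_text.to_s
        return self if replacement_text.empty? || (text.is_a?(String) && text.empty?)
        filter_object!(self, text, replacement_text)
      end
    end

    Note that filter_object! (quoted above) guards gsub! with object.include?(text), and String#include? raises a TypeError when given a Regexp, so that guard would need the same treatment before a Regexp works end to end.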
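
If patching VCR feels too fragile, another option is a plain before_record hook that overwrites the whole Authorization value. The sketch below is only an illustration: redact_aws_authorization! is a hypothetical helper name, and it assumes the headers are a Hash of header name to array of strings, which is how VCR exposes interaction.request.headers.

```ruby
# Hypothetical helper: replaces every AWS Authorization value with a placeholder.
# Assumes `headers` maps header names to arrays of strings.
def redact_aws_authorization!(headers, placeholder = "AWS <FILTERED>")
  headers.each do |name, values|
    next unless name.casecmp?("Authorization")
    # Reassigning an existing key while iterating is allowed in Ruby.
    headers[name] = Array(values).map { |v| v.start_with?("AWS") ? placeholder : v }
  end
  headers
end

# Wiring it up in spec_helper, inside the existing VCR.configure block:
#   c.before_record do |interaction|
#     redact_aws_authorization!(interaction.request.headers)
#   end

# Stand-alone demonstration with made-up values.
headers = {
  "Authorization" => ["AWS AKIAEXAMPLEKEY:c2lnbmF0dXJl="],
  "Accept"        => ["*/*"]
}
redact_aws_authorization!(headers)
puts headers.inspect
```

Unlike filter_sensitive_data, this is one-way: the placeholder is not swapped back on playback. That should not matter with VCR's default request matching on method and URI, but it would if you match on headers.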
    

    Of course, you can simply opt out of messing with the deep internals of a third-party library and not feel too paranoid, knowing that some random alphanumeric data is written to the cassette next to your token (but not including the token itself).
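
As for what that leftover data is: the `AWS key:signature` shape suggests the older S3 REST signing scheme (signature version 2), where the part after the colon is a Base64-encoded HMAC-SHA1 of a description of the request, keyed with your secret. That is an assumption about your setup, not something visible in the question. A minimal sketch with made-up credentials and a made-up request description:

```ruby
require "openssl"

# Hypothetical values for illustration only; none of these are real.
access_key     = "AKIAEXAMPLEKEY"
secret         = "not-a-real-secret"
string_to_sign = "GET\n\n\nTue, 27 Mar 2007 19:36:42 +0000\n/examplebucket/photo.jpg"

# HMAC-SHA1 over the request description, keyed with the secret.
digest = OpenSSL::HMAC.digest("SHA1", secret, string_to_sign)

# Strict Base64 without line breaks; a 20-byte digest encodes to 28 characters.
signature = [digest].pack("m0")

puts "AWS #{access_key}:#{signature}"
```

Under that assumption the signature is derived from the secret but does not contain it, and it is bound to one specific request, which fits the observation that neither the key nor the secret shows up in the unfiltered part.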