
File with random data but specific size


I am trying to generate a file in ruby that has a specific size. The content doesn't matter.

Here is what I got so far (and it works!):

File.open("done/#{NAME}.txt", 'w') do |f|
  contents = "x" * (1024*1024)
  SIZE.to_i.times { f.write(contents) }
end

The problem is: once I zip or rar this file, the resulting archive is only a few KB in size. I guess that's because the repetitive data in the file compresses so well.

How do I create data that is more random, as if it were a normal file (for example, a movie file)? To be specific: how do I create a file with random data that keeps its size when archived?


Solution

  • You cannot guarantee an exact file size when compressing. However, as you suggest in the question, completely random data does not compress.

    You can generate a random String using most random number generators. Even simple ones are capable of making hard-to-compress data, but you would have to write your own string-creation code. Luckily, Ruby's standard library includes SecureRandom, which already has a convenient byte-generating method, and you can use it in a variation of your code:

    require 'securerandom'
    one_megabyte = 2 ** 20 # or 1024 * 1024, if you prefer
    
    # Note: use 'wb' (binary) mode to prevent problems with character encoding
    File.open("done/#{NAME}.txt", 'wb') do |f|
      SIZE.to_i.times { f.write( SecureRandom.random_bytes( one_megabyte ) ) }
    end
    

    This file is not going to compress much, if at all. Many compressors will detect that and just store the file as-is (making a .zip or .rar file slightly larger than the original).
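To see the difference without creating an archive by hand, you could compare how well the two kinds of data deflate in memory. This is only a rough sketch: `compression_ratio` is a hypothetical helper name, and it uses Ruby's `Zlib::Deflate` as a stand-in for zip, so rar or other archivers may give somewhat different numbers.

```ruby
require 'securerandom'
require 'zlib'

# Hypothetical helper (not part of the original answer):
# returns compressed size divided by original size.
# A value near 0 means the data compresses very well;
# a value near (or slightly above) 1 means it barely compresses.
def compression_ratio(data)
  Zlib::Deflate.deflate(data).bytesize.to_f / data.bytesize
end

one_megabyte = 2 ** 20

repetitive = "x" * one_megabyte                        # the data from the question
random     = SecureRandom.random_bytes(one_megabyte)   # the data from the answer

puts format("repeated 'x': %.4f", compression_ratio(repetitive))
puts format("random bytes: %.4f", compression_ratio(random))
```

The repeated "x" string should shrink to a tiny fraction of its size, while the random bytes should stay at roughly their original size.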