How can I delete all null characters from a file?


I had a directory with a lot of PGN chess files, from which I wanted to remove the move times (written as [%emt {a_number}]). I wrote this script:

regex = /\[.emt[^\]]+\]/
directory = "path/to/files"
extension = ".pgn"

Dir.chdir(directory)
Dir.foreach(directory) do |file_name|
    file_object = File.open(file_name, "r+")
    contents = file_object.read
    new_contents = contents.gsub(regex, "")
    File.truncate(directory + "/" + file_name, 0)
    file_object.puts(new_contents)
    file_object.close
end

This removed all the move times, but curiously it also added a large number of null characters at the beginning of each file (I suspect this number is equal to the number of bytes in the file). So I replaced the line new_contents = contents.gsub(regex, "") with contents.delete("\0"), but this only made it worse, adding even more null characters at the beginning of the files. How can I remove them?


Solution

  • It should work OK if you replace:

    File.truncate(directory + "/" + file_name, 0)
    

    with:

    file_object.rewind
    

    or

    file_object.seek(0)
    

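    The null characters appear because read leaves the file position at the end of the old contents, and truncating the file to zero length does not move that position. The next write therefore lands at the old offset, and the gap in front of it is filled with null bytes. File.truncate should not be applied to open files (as here), and file_object.truncate should not be followed by any file operation other than file_object.close. Since the new contents are shorter than the old ones, call file_object.truncate(file_object.pos) after writing, just before closing, so that the leftover tail of the old contents is cut off.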
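
    The rewind approach can be sketched roughly as follows. This is a minimal sketch, not the asker's exact script: the helper name strip_in_place is hypothetical, and it assumes the files are small enough to read into memory and are opened on a system without newline translation. It truncates as the last step before the file is closed, so nothing of the old, longer contents is left behind.

    ```ruby
    # Hypothetical helper: rewrites the file at `path` in place,
    # removing every match of `regex`.
    def strip_in_place(path, regex)
      File.open(path, "r+") do |f|
        contents = f.read                  # position is now at the end of the file
        f.rewind                           # go back to offset 0 before writing
        f.write(contents.gsub(regex, ""))  # overwrite from the start
        f.flush                            # push buffered bytes out before truncating
        f.truncate(f.pos)                  # cut off the tail of the old contents
      end                                  # the block form closes the file
    end
    ```

    With the regex from the question, a call would look like strip_in_place(file_name, /\[.emt[^\]]+\]/).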

    If you already have a file with nulls that you want to remove, read the file into a string str, close the file, execute

    str.delete!("\000")
    

    and then write str back to the file.
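
    The cleanup steps above might look like this in practice. It is a sketch under assumptions: the helper name strip_nulls is hypothetical, and the file is read and written in binary mode so that no encoding or newline conversion gets in the way.

    ```ruby
    # Hypothetical helper: removes every null byte from the file at `path`.
    def strip_nulls(path)
      str = File.binread(path)   # read the raw bytes; the file is closed on return
      str.delete!("\0")          # returns nil if nothing was deleted, so ignore the result
      File.binwrite(path, str)   # reopens the file truncated and writes the cleaned bytes
    end
    ```

    To process a whole directory, something like Dir.glob(File.join(directory, "*" + extension)).each { |path| strip_nulls(path) } should do, using the directory and extension variables from the question.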