I have files to convert to Unix format. What differences or issues could I face if I do the conversion in Python:
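import sys

filename = sys.argv[1]
# Work on bytes: no encoding is assumed and Python does no newline translation.
with open(filename, 'rb') as f:
    text = f.read().replace(b'\r\n', b'\n')
with open(filename, 'wb') as f:
    f.write(text)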
instead of calling the dos2unix Unix command in a subprocess?
Thanks!
From man dos2unix:
The Dos2unix package includes utilities "dos2unix" and "unix2dos" to convert plain text files in DOS or Mac format to Unix format and vice versa.
In DOS/Windows text files a line break, also known as newline, is a combination of two characters: a Carriage Return (CR) followed by a Line Feed (LF). In Unix text files a line break is a single character: the Line Feed (LF). In Mac text files, prior to Mac OS X, a line break was a single Carriage Return (CR) character. Nowadays Mac OS uses Unix style (LF) line breaks.
Besides line breaks Dos2unix can also convert the encoding of files. A few DOS code pages can be converted to Unix Latin-1. And Windows Unicode (UTF-16) files can be converted to Unix Unicode (UTF-8) files.
...
-ascii Convert only line breaks. This is the default conversion mode.
dos2unix can thus do more than convert line breaks, but by default that is all it does.
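For comparison, the line-break-only conversion can be approximated in Python by working on raw bytes. This is a minimal sketch, not a drop-in replacement: the names `crlf_to_lf` and `convert_file` are made up for illustration, and it does not detect binary files or handle UTF-16 input.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone CR bytes are left alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with LF line breaks."""
    # Binary mode on both sides, so Python performs no newline translation
    # and no assumption is made about the text encoding.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(crlf_to_lf(data))
```

Note that the whole file is read into memory, which is fine for small files but may matter for very large ones.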
If your file is in the wrong encoding, you would have to deal with that explicitly with dos2unix too.
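On the Python side, an encoding change has to be spelled out as well. A hedged sketch, assuming the input is UTF-16 with a byte order mark and the desired output is UTF-8 without one (the function name `utf16_crlf_to_utf8_lf` is hypothetical):

```python
def utf16_crlf_to_utf8_lf(data: bytes) -> bytes:
    """Decode UTF-16 bytes, normalise CRLF to LF, and re-encode as UTF-8."""
    # Assumption: the data starts with a BOM, which the "utf-16" codec
    # uses to pick the byte order and then strips from the decoded text.
    text = data.decode("utf-16")
    return text.replace("\r\n", "\n").encode("utf-8")
```

The replacement is done on decoded text here because in UTF-16 each of CR and LF occupies two bytes, so a byte-level search for `b"\r\n"` would not match.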