Tags: sorting, batch-file, comm

Linux Comm on Windows - Output should be Zero


Over the last few days I have done a lot with comm in Windows batch files to compare text files with each other, so normally there is nothing I could be doing wrong. In my other projects the code below works fine, but not in the current case, and I can't see any reason for it.

What I learned about comm is that both files need to be sorted, so I added a sort step for both files. Now I am trying to compare the new file against the archive, and the output should be the lines that are NOT in all.txt.

D:/filetype/core/sort.exe -b D:\filetype\test\all.txt -oD:\filetype\test\all.txt

D:/filetype/core/sort.exe -b D:\filetype\test\listfile_export_tmp2.txt -oD:\filetype\test\listfile_export_tmp2.txt

D:/filetype/core/comm.exe -2 -3 D:\filetype\test\listfile_export_tmp2.txt D:\filetype\test\all.txt > D:\filetype\test\output.txt

For testing, I added the text that I want to compare to my all.txt, so the output should be empty because there is nothing new. But output.txt ends up containing exactly what is in the first text file. I checked all.txt by hand, and the lines I am trying to compare are in there. I also checked that sort works correctly, using a test file with different letters.

So here is what I think:

  1. There are differences from my other projects that I can't see, and it's my fault.
  2. comm is not able to compare two files if one of them is too small; I am trying to compare a 50 MB file with a 1 KB file.

I can provide both files for testing on request.


Solution

  • Okay, I was able to figure it out. Grep and comm also match blank lines when comparing. So if you have only a small text file as input and a few blank lines in the other file (in my case the bigger one), it will likely match all the blank lines, and as a result you will see your input again.

    To remove the blank lines I used sed:

    Windows

    sed.exe "/^\s*$/d" 
    

    Linux

    sed '/^\s*$/d'
    

    And yup, now everything is working fine again. (But don't forget to sort.)
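
For reference, the whole workflow (strip blank lines, sort, then comm) could look roughly like the sketch below on Linux. This is only a minimal illustration, not the exact script from the question: the function name `new_lines`, the temporary directory and the sample file contents are made up, and forcing `LC_ALL=C` is my own assumption so that sort and comm agree on one ordering. It also uses a plain `sort` without `-b`, since comm compares whole lines including leading blanks. The `\s` in the sed pattern is a GNU sed extension.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of "$1" that do not occur in "$2".
new_lines() {
    work=$(mktemp -d) || return 1
    # Drop blank and whitespace-only lines (the sed filter from the answer), then sort.
    # LC_ALL=C is an assumption of this sketch: one byte-wise order for sort and comm.
    sed '/^\s*$/d' "$1" | LC_ALL=C sort > "$work/new.txt"
    sed '/^\s*$/d' "$2" | LC_ALL=C sort > "$work/archive.txt"
    # -2 -3 hides lines unique to the archive and lines common to both files,
    # so only lines that exist solely in the first file are printed.
    LC_ALL=C comm -2 -3 "$work/new.txt" "$work/archive.txt"
    rm -rf "$work"
}

# Demo with made-up sample data: every non-blank line of new.txt is also in all.txt.
demo=$(mktemp -d)
printf 'b.dat\n\na.dat\n' > "$demo/new.txt"
printf 'c.dat\n\n\na.dat\nb.dat\n' > "$demo/all.txt"
new_lines "$demo/new.txt" "$demo/all.txt" > "$demo/output.txt"
# Print the size of output.txt; it should be empty when nothing is new.
wc -c < "$demo/output.txt"
```

Writing the cleaned, sorted copies to temporary files instead of sorting in place keeps the original files untouched, which makes it easier to rerun the comparison while debugging.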