glob checksum sha factor-lang

Get the same SHA-224 sum in Factor as coreutils sha224sum


$ echo *
a b c
$ cat *
file 1
file 2
file 3
$ factor -e=" \
> USING: globs io sequences sorting io.files io.encodings.utf8 ; \
> \"*\" glob natural-sort [ utf8 file-lines ] map concat [ print ] each "
file 1
file 2
file 3

The outputs are the same using Factor's glob and the shell's glob. A diff on the outputs shows they match exactly.

$ factor -e=" \
> USING: math.parser checksums checksums.sha globs io sequences sorting io.files io.encodings.utf8 ; \
> \"*\" glob natural-sort [ utf8 file-lines ] map concat sha-224 checksum-lines bytes>hex-string print "

0feaf7d5c46b802404760778091ed1312ba82d4206b9f93c35570a1a
$ cat * | sha224sum
d1240479399e5a37f8e62e2935a7ac4b9352e41d6274067b27a36101

But the checksums don't match, and neither do MD5 checksums. Why is this? How do I get the same checksum in Factor as in coreutils sha224sum?

Changing the encoding to ascii doesn't change the output, and neither does replacing checksum-lines with "\n" join sha-224 checksum-bytes.


Solution

  • This odd behaviour is due to a bug in checksum-lines; see factor/factor#1708.

    Thanks to jonenst for finding the problem, and to calsioro for posting the code below on the Factor mailing list.

    This code:

    [
        { "a" "b" "c" } 3 [1,b]
        [ number>string "file " prepend [ write ] curry
          ascii swap with-file-writer ] 2each
    
        "*" glob natural-sort [ utf8 file-lines ] map concat
        [ "\n" append ] map "" join  ! Add newlines between and at the end
    
        sha-224 checksum-bytes bytes>hex-string print
    ] with-test-directory
    

    gives the same hash:

    d1240479399e5a37f8e62e2935a7ac4b9352e41d6274067b27a36101
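
The comment in the Factor code says newlines are added both between the lines and at the end. The effect of that final newline can be checked from the shell alone, without Factor. This is only a rough sketch: it assumes each file holds a single line terminated by a newline (as the `cat *` output suggests), and the variable names are made up for illustration.

```shell
# Assumption: the three files contain "file 1", "file 2", "file 3", each ending in a newline.
# printf emits exactly the bytes given, so the two inputs differ only in the last newline.
all_terminated=$(printf 'file 1\nfile 2\nfile 3\n' | sha224sum | cut -d' ' -f1)
joined_only=$(printf 'file 1\nfile 2\nfile 3' | sha224sum | cut -d' ' -f1)

echo "newline after every line:    $all_terminated"
echo "newlines between lines only: $joined_only"
```

If the first digest equals what `cat * | sha224sum` prints, the trailing newline is what the hashed input needs in order to match coreutils; the second digest corresponds to joining the lines with "\n" and nothing after the last one.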