perl, frequency-distribution

Unexpected character when running one-liner on Windows


I want to generate an output file that shows the frequency of each word in an input file. After some searching, I found that Perl is an ideal language for this problem, but I don't know the language.

After some more searching, I found the following code here on Stack Overflow. It supposedly provides the solution I want, and does so very efficiently:

perl -lane '$h{$_}++ for @F; END{for $w (sort {$h{$b}<=>$h{$a} || $a cmp $b} keys %h) {print "$h{$w}\t$w"}}' file > freq

I tried running this command line using the form below:

perl -lane 'code' input.txt > output.txt

The execution halts due to an unexpected '>' (the one in '<=>'). I did some research but can't understand what is wrong. Could someone enlighten me? Thanks!

Here is the topic from where I got the code: Elegant ways to count the frequency of words in a file

If it's relevant, my words use letters and numbers and are separated by a single white space.


Solution

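  • You are probably using Windows. The Windows command prompt (cmd.exe) does not treat single quotes ' as quoting characters, so the > in <=> is parsed as an output redirection. You therefore need to use double quotes " instead of single quotes ' around your code: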

    perl -lane "$h{$_}++ for @F; END{for $w (sort {$h{$b}<=>$h{$a} || $a cmp $b} keys %h) {print qq($h{$w}\t$w)}}" file > freq
    

    Also note that I used qq(...) instead of "..." inside the code, as suggested by @mob, so that the inner double quotes do not end the outer ones. Another option is to escape the inner quotes with \".
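
If the quoting keeps getting in the way, another way around it is to put the code in a file, so that the shell never sees it. Below is a minimal sketch of the same logic as a script. The file name freq.pl and the helper names count_words and format_counts are my own choices for illustration, not part of the original one-liner.

```perl
#!/usr/bin/perl
# freq.pl (hypothetical file name): the one-liner's logic as a script,
# so no shell quoting is involved.
use strict;
use warnings;

# Count whitespace-separated words in a list of lines.
sub count_words {
    my %count;
    for my $line (@_) {
        # split ' ' splits on runs of whitespace, like the -a switch does
        $count{$_}++ for split ' ', $line;
    }
    return \%count;
}

# Return "count<TAB>word" strings, most frequent first,
# ties broken alphabetically (same ordering as the one-liner).
sub format_counts {
    my ($count) = @_;
    return map  { "$count->{$_}\t$_" }
           sort { $count->{$b} <=> $count->{$a} || $a cmp $b }
           keys %$count;
}

# Run only when the file is executed directly, not when it is loaded.
unless (caller) {
    my @lines = <>;    # reads the files named on the command line
    print "$_\n" for format_counts(count_words(@lines));
}

1;
```

It would be run the same way on Windows and on Unix-like systems:

    perl freq.pl input.txt > output.txt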