I have the following AWK script that counts occurrences of each value in field 1 and, once it has read the entire file, prints each value along with the number of times it appears.
awk '{a[$1]++} END{ for(i in a){print i"-->"a[i]} }' file
I'm very new to Perl and I don't know what the equivalent would be. What I have so far is below, but its syntax is incorrect. Thanks in advance.
perl -lane '$a{$F[1]}++ END{foreach $a {print $a} }' file
____________________________________UPDATE ______________________________________
Hi, thanks to both of you for your answers. The real input file has 34 million lines, and awk runs roughly 3 times faster than Perl on it. Is awk faster than Perl?
awk '{a[$1]++}END{for(i in a){print i"-->"a[i]}}' file # --> 2:45 approx.
perl -lane '$a{$F[0]}++;END{foreach my $k (keys %a){ print "$k --> $a{$k}" } }' file # --> 7 min approx.
perl -lanE'$a{$F[0]}++; END { say "$_ => $a{$_}" for keys %a }' file # --> 9 min approx.
Okay, Ger, one more time :-) I upgraded my Perl to the latest version available to me and made a file like the one you described (34.5 million lines, each with a 16-digit integer in the first and only column):
schumack@linux2 52> wc -l listbig
34521909 listbig
schumack@linux2 53> head -3 listbig
1111111111111111
3333333333333333
4444444444444444
I then ran a specialized Perl line (it works for this file but is not equivalent to the awk line, since it counts whole lines rather than field 1). As before, I timed the runs using /usr/bin/time:
schumack@linux2 54> /usr/bin/time -f '%E %P' /usr/local/bin/perl -lne 'chomp; $a{$_}++; END{foreach $i (keys %a){print "$i-->$a{$i}"}}' listbig
5555555555555555-->4547796
1111111111111111-->9715747
9999999999999999-->826872
3333333333333333-->9922465
1212121212121212-->826872
4444444444444444-->5374669
2222222222222222-->1653744
8888888888888888-->826872
7777777777777777-->826872
0:12.20 99%
schumack@linux2 55> /usr/bin/time -f '%E %P' awk '{a[$1]++} END{ for(i in a){print i"-->"a[i]} }' listbig
1111111111111111-->9715747
2222222222222222-->1653744
3333333333333333-->9922465
4444444444444444-->5374669
5555555555555555-->4547796
1212121212121212-->826872
7777777777777777-->826872
8888888888888888-->826872
9999999999999999-->826872
0:12.61 99%
Both Perl and awk ran very fast on the 34.5 million line file and finished within half a second of each other. I'm curious what type of machine, OS, and Perl version you are currently using. I tested on an ASUS laptop that is about 4 years old with an Intel i7, running Ubuntu 16.04 and Perl v5.26.1.
Anyways, thanks for the reason to play around with Perl!
Have fun, Ken