bash unix cut

How can I count and display only the words that are repeated more than once using Unix commands?


I am trying to count and display only the words that are repeated more than once in a file. The basic idea is:

  • You are given a file with names and characters like commas, colons, slashes, etc.
  • Use the cut command to display only the first names in the file (other commands are also allowed).
  • Count and then display only the names repeated more than once.

I got to the point of counting and displaying all the names. However, I haven't found a way to count and display only the names that are repeated more than once.

Here is a section of the file:

user1:x:80:200:Mia,Spurs:/home/user1:/bin/bash
user2:x:80:200:Martha,Dalton:/home/user2:/bin/bash
user3:x:80:200:Lucy,Carlson:/home/user3:/bin/bash
user4:x:80:200:Carl,Bingo:/home/user4:/bin/bash

Here is what I have been able to do:

Daniel@Daniel-MacBook-Pro Files % cut -d ":" -f 5-5 file1 | cut -d "," -f 1-1 | sort -n | uniq -c
   1 Mia
   3 Martha
   1 Lucy
   1 Carl
   1 Jessi
   1 Joke
   1 Jim
   2 Race
   1 Sem
   1 Shirly
   1 Susan
   1 Tim

Solution

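  • You can filter out the rows with a count of 1 using grep -v. The pattern '^ *1 ' matches the leading padding that uniq -c adds, then the count 1, then the space before the name, so counts such as 10 or 11 are not removed.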

    cut -d ":" -f 5 file1 | cut -d "," -f 1 | sort | uniq -c | grep -v '^ *1 '
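
An alternative to matching the text of the count is to compare it as a number. The following is a minimal sketch, not part of the original answer: it assumes awk is available, and the sample file and its contents are made up for illustration in the same layout as the question's file. The first field of each uniq -c line is the count, so awk can keep only rows where that field is greater than 1.

```shell
# Hypothetical sample data in the same passwd-like layout as the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
user1:x:80:200:Mia,Spurs:/home/user1:/bin/bash
user2:x:80:200:Martha,Dalton:/home/user2:/bin/bash
user3:x:80:200:Lucy,Carlson:/home/user3:/bin/bash
user4:x:80:200:Martha,Bingo:/home/user4:/bin/bash
user5:x:80:200:Race,Hill:/home/user5:/bin/bash
user6:x:80:200:Race,Stone:/home/user6:/bin/bash
user7:x:80:200:Martha,Lee:/home/user7:/bin/bash
EOF

# Field 5 holds "First,Last"; the second cut keeps the first name.
# sort groups identical names so uniq -c can count them.
# awk keeps only the rows whose count (field 1) is greater than 1.
repeated=$(cut -d ':' -f 5 "$sample" | cut -d ',' -f 1 | sort | uniq -c | awk '$1 > 1 {print $1, $2}')

printf '%s\n' "$repeated"
rm -f "$sample"
```

Comparing numerically makes it easy to change the threshold, for example '$1 > 2' to show only names that appear at least three times.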