Tags: linux, unix, command-line, cut, ls

How to only display owner of file when using ls command with special edge case


My objective is to find all files in a directory recursively and display only each file's owner, so that I can use uniq to count the number of files each user owns in that directory. The command I am using is the following:

command = "find " + subdirectory.directoryPath + "/ -type f -exec ls -lh {} + | cut -f 3 -d' ' | sort | uniq -c | sort -n"

This command prints only the owner of each file, one per line, which lets me count how many times each owner name is repeated and hence how many files each user owns in the subdirectory. cut uses ' ' as the delimiter and keeps only the 3rd column of the ls output, which is the file's owner.

However, there is a special edge case in which I'm not able to obtain the owner name:

-rw-r-----  1             31122918 group 20169510233 Mar 17 06:02                                              
-rw-r-----  1 user1                group 20165884490 Mar 25 11:11                                      
-rw-r-----  1 user1                group 20201669165 Mar 31 04:17                                     
-rwxr-x---  1 user3                group 20257297418 Jun  2 13:25             
-rw-r-----  1 user2                group 20048291543 Mar  4 22:04                                          
-rw-r-----  1             14235912 group 20398346003 Mar 10 04:47 

The special edge cases are the lines above where the owner is a number. The command I'm currently using detects user1, user2, and user3 perfectly, but because the numbers are right-aligned in the owner column, it doesn't pick them up and prints an empty owner instead. Example output:

1  
1 user3
1 user2
1
2 user1

Can anyone help me parse the ls output so I'm able to detect these #'s when trying to only print the file owner column?


Solution

  • cut -d' ' won't capture the third field when the owner is padded with leading spaces: cut treats every single space as a separator, so each padding space starts another (empty) field and field 3 comes out empty.

    Alternatives:

    1. cut -c

       123456789X123456789X123456789X123456789X123456789L123456789X12345
       -rw-r-----  1             31122918 group 20169510233 Mar 17 06:02
       -rw-r-----  1 user1                group 20165884490 Mar 25 11:11
      

    The data you seek is between characters 15 and 34 on each line, so you can say

        cut -c15-34
    
    2. perl/awk: both split each line on runs of whitespace by default, so the leading padding does not shift the fields. Try one of

       perl -lane 'print $F[2]'
       awk '{print $3}'
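
To see the difference between the two splitting rules side by side, here is a minimal sketch. The sample lines are hypothetical data modelled on the listing in the question (with a single space after the permissions and made-up file names a, b, c), and count_owners is just an illustrative helper name, not anything from the original command.

```shell
# Hypothetical `ls -l` lines modelled on the question's listing.
sample='-rw-r----- 1             31122918 group 20169510233 Mar 17 06:02 a
-rw-r----- 1 user1                group 20165884490 Mar 25 11:11 b
-rw-r----- 1 user1                group 20201669165 Mar 31 04:17 c'

# cut treats every single space as a delimiter, so the padded numeric
# owner on the first line comes out as an empty field.
printf '%s\n' "$sample" | cut -d' ' -f3

# awk splits on runs of whitespace, so field 3 is the owner whether it
# is left-aligned (a name) or right-aligned (a number).
count_owners() {
    awk '{print $3}' | sort | uniq -c | sort -n
}

printf '%s\n' "$sample" | count_owners
```

In the real pipeline the same helper would replace the cut stage, for example find "$dir" -type f -exec ls -l {} + | count_owners, where $dir stands in for whatever directory path you are scanning.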