
How to load an HBase table with a shell script


I am new to Big Data and I am trying to load a file into an HBase table. The file content looks like this:

U100,A300&A301&A302    
U101,A301&A302    
U102,A302    
U103,A303&A301&A302

The file is on the local file system. I want this data loaded into an HBase table like this:

[image: desired table layout]

I am trying the script below, but I cannot get this exact output:

echo "create 'uid-map', 'users'" | hbase shell
file="/home/abc/lookupfiles/uid.txt"
touch /home/abc/lookupfiles/uid1.txt
chmod 775 /home/abc/lookupfiles/uid1.txt
file1="/home/abc/lookupfiles/uid1.txt"
awk '$1=$1' FS="&" OFS=" " $file > $file1
num=1
while IFS= read -r line
do
 uid=`echo $line | cut -d',' -f1`
 users=`echo $line | cut -d'&' -f2`
 for row in $users
 do
 #artist= 'echo $row | cut -d',' -f$num
 echo "put 'uid-map', '$uid', 'users:artist$num', '$row'" | hbase shell
 let "num=num+1"
done
num=1
done <"$file"

I am getting this output:

[image: actual table contents]

Please let me know what I am doing wrong.


Solution
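
A likely reason the loop in the question stores the wrong values: it reads $file (the original file, not the rewritten uid1.txt), and cut -d'&' -f2 returns only the second '&'-separated field. When a line contains no '&' at all, cut passes the whole line through unless -s is given. Each put also starts a separate hbase shell, which is slow. The minimal sketch below reproduces the cut behaviour on two of the sample lines; the variable names line1 and line2 are only illustrative.

```shell
# Sample lines from the question (trailing spaces omitted).
line1='U100,A300&A301&A302'
line2='U102,A302'

# Field 2 of an '&'-delimited line is only the middle artist ID.
echo "$line1" | cut -d'&' -f2      # prints: A301

# With no '&' in the line, cut prints the whole line unchanged.
echo "$line2" | cut -d'&' -f2      # prints: U102,A302
```

So U100 would end up with a single artist1 cell, and U102 with its whole input line as the value.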

  • Optimized solution with a single Awk program ($file is the input path variable set in the question's script):

    echo "create 'uid-map', 'users'" | hbase shell
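    awk -F'[,&]' -v cmd="hbase shell" '{
            # \047 is the octal escape for a single quote
            fmt="put \047uid-map\047, \047%s\047, \047users:artist%d\047, \047%s\047\n";
            # $1 is the user ID; fields 2..NF are the artist IDs
            for (i=2; i<=NF; i++) 
                printf(fmt, $1, ++c, $i ) | cmd; 
            c=0   # restart the artist counter for the next input line
        }' "$file"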
    

    The commands piped to hbase shell (each printf ... | cmd call writes one line, and awk reuses the same hbase shell process for all of them):

    put 'uid-map', 'U100', 'users:artist1', 'A300'
    put 'uid-map', 'U100', 'users:artist2', 'A301'
    put 'uid-map', 'U100', 'users:artist3', 'A302'
    put 'uid-map', 'U101', 'users:artist1', 'A301'
    put 'uid-map', 'U101', 'users:artist2', 'A302'
    put 'uid-map', 'U102', 'users:artist1', 'A302'
    put 'uid-map', 'U103', 'users:artist1', 'A303'
    put 'uid-map', 'U103', 'users:artist2', 'A301'
    put 'uid-map', 'U103', 'users:artist3', 'A302'
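
For readers who would rather stay close to the original while-loop, the same commands can probably be generated in plain POSIX shell. This is only a sketch, not part of the answer above: gen_puts is a hypothetical helper name, and the trailing-whitespace trim assumes the input lines may end in spaces, as the sample in the question appears to.

```shell
# Hypothetical helper: print one HBase put command per artist ID in the file "$1".
# The body runs in a subshell, so the IFS and set -f changes do not leak out.
gen_puts() (
    set -f      # disable filename globbing for the unquoted expansion below
    IFS='&'     # split the artist list on '&'
    while IFS=, read -r uid artists; do
        [ -n "$uid" ] || continue              # skip blank lines
        artists=${artists%%[[:space:]]*}       # assumption: drop trailing whitespace
        num=1
        for artist in $artists; do             # intentionally unquoted: split on '&'
            printf "put 'uid-map', '%s', 'users:artist%d', '%s'\n" "$uid" "$num" "$artist"
            num=$((num + 1))
        done
    done < "$1"
)
```

Running gen_puts "$file" on its own previews the commands without touching HBase; gen_puts "$file" | hbase shell then sends them all to one shell process, in the same way the question pipes its create command.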