I have loaded a CSV file of nearly 50 GB into a Hadoop cluster, and I want to see some sample records to identify the columns.
I have tried using
hadoop fs -cat employees.csv | head -n 10
My question is about

head -n 10

Will it load all 50 GB of data and then filter out the first 10 lines? How does this work?

This depends on your Hadoop version.
For older Hadoop versions (before 3.1.0):

hadoop fs -cat employees.csv | head -n 10

This does not pull the whole 50 GB through the pipe: as soon as head has printed 10 lines it exits, the pipe closes, and hadoop fs -cat hits a broken-pipe error and stops streaming data from HDFS.
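Because the transfer stops early, the same pattern can also be used to pull a slightly larger sample down to the local filesystem for inspection; the file name sample_employees.csv below is just an illustrative choice.

hadoop fs -cat employees.csv | head -n 100 > sample_employees.csv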
For newer Hadoop versions (3.1.0 and later):
hadoop fs -head employees.csv
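Note that hadoop fs -head prints roughly the first kilobyte of the file rather than a fixed number of lines, so if you only want the column names (assuming employees.csv has a header row within that first kilobyte), you can still pipe it through head:

hadoop fs -head employees.csv | head -n 1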