How can I get the second column of a very large CSV file using a Linux command?

I was given this question during an interview. I said I could do it in Java or Python, for example by using a function like xreadlines() to traverse the whole file and fetch the column, but the interviewer wanted me to use only Linux commands. How can I achieve that?

Solution

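You can use the awk command. It reads its input one line at a time, so it does not need to load the whole file into memory, which suits a very large file.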

The following prints the second column of a file. The -F, option sets the field separator to a comma, and $2 refers to the second field:

awk -F, '{print $2}' file.txt

To store the result, redirect the output to a file:

awk -F, '{print $2}' file.txt > output.txt
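
As an alternative to awk, the cut command can do the same job for simple comma-separated data. The sketch below is only an illustration: the sample file and its contents are made up for the example, and it assumes no field contains a comma inside quotes.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf 'id,name,city\n1,Alice,Paris\n2,Bob,Berlin\n' > "$sample"

# cut splits each line on the delimiter given by -d
# and keeps the field(s) selected by -f.
cut -d, -f2 "$sample"

# The awk command from above, run on the same file for comparison.
awk -F, '{print $2}' "$sample"
```

Both commands split on every comma, so a quoted field such as "Smith, John" would be cut in two. For files like that, a real CSV parser is the safer choice.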
