I have a list of data and I need to cut certain characters out of certain columns.
Here is the list:
JCG2380 GREEN, JULIE C JR-II BISS CPSC BS INFO TECH XXX/XXX-9445
JAG1936 GREEN, JOE A. SO-I BISS CPSC BS INFO TECH XXX/XXX-7993
ACG4636 GREEN, ADAM C. JR-II BISS CPSC BS COMP SCI XXX/XXX-0437
SPG1696 GREEN, SEAN P. JR-I BISS CPSC BS COMP SCI XXX/XXX-2398
SEG8835 GREEN, SHAWN E. FR-II BISS CPSC BS COMP SCI XXX/XXX-7149
MCGo599 GREEN, MICHAEL C. JR-I BISS CPSC BS COMP SCI XXX/XXX-OOOO
GJG1887 GREEN, GREGORY J. SO-II BISS CPSC BS INFO TECH XXX/XXX-4354
NGG5479 GREEN, NICHOLAS G JR-I BISS CPSC BS INFO TECH XXX/XXX-8268
ZTG7190 GREEN, ZACHARY T. FR-II BISS CPSC BS INFO TECH XXX/XXX-1298
AXG9097 GREEN, ALEXANDER SO-I BISS CPSC BS INFO TECH XXX/XXX-0313
RJG6624 GREEN, ROBERT J. SO-II BISS CPSC BS COMP SCI XXX/XXX-ZOZI
MWG1990 GREEN, MATTHEW W SO-II BISS CPSC BS INFO TECH XXX/XXX-0581
The problem is that not all the lines have the same number of fields. Notice how Alexander Green (3rd from the bottom) does not have a middle initial, so every field after the name is shifted by one. This prevents me from using the same awk field numbers on every line. My idea is to work from the right side of each line instead, where the fields line up, so that the varying field count on the left won't mess everything up.
So how can I use the cut command to start at the right-most column and cut back 7 columns?
You can use cut with character positions (-c), since your data has fixed-width fields.
Here is what I got with the OCR'd text:
$ cut -c 33-51,73-77 input
JR-II BISS CPSC BS 9445
SO-I BISS CPSC BS 7993
JR-II BISS CPSC BS 0437
JR-I BISS CPSC BS 2398
FR-II BISS CPSC BS 7149
JR-I BISS CPSC BS OOOO
SO-II BISS CPSC BS 4354
JR-I BISS CPSC BS 8268
FR-II BISS CPSC BS 1298
SO-I BISS CPSC BS 0313
SO-II BISS CPSC BS ZOZI
SO-II BISS CPSC BS 0581
and to match the requirement you wrote in a comment:
Exactly what I'm trying to do is get the first character out of the columns that start (from the top entry) with JR, BISS, CPSC, INFO. Then I need the last 4 digits from the phone numbers on the right.
$ cut -c 32-33,38-39,43-44,48-49,64-64,73-77 input
J B C B 9445
S B C B 7993
J B C B 0437
J B C B 2398
F B C B 7149
J B C B OOOO
S B C B 4354
J B C B 8268
F B C B 1298
S B C B 0313
S B C B ZOZI
S B C B 0581
You'll need to adjust the ranges for your actual data, since the character positions above come from the OCR'd text and may not match your file exactly.
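If the character positions turn out not to be reliable, counting from the end of each line (which is what the question literally asks for) is another option. The following is only a sketch under some assumptions that are not confirmed by the real file: fields are separated by whitespace, the major is always two words (INFO TECH, COMP SCI), and there is no trailing whitespace or carriage return at the end of a line. The variable names sample, last4 and firsts are just for illustration, and rev comes from util-linux/BSD rather than POSIX.

```shell
# Two rows copied from the question; the second has no middle initial.
sample='JCG2380 GREEN, JULIE C JR-II BISS CPSC BS INFO TECH XXX/XXX-9445
AXG9097 GREEN, ALEXANDER SO-I BISS CPSC BS INFO TECH XXX/XXX-0313'

# Last 4 characters of each line: reverse the line, keep the first
# 4 characters, then reverse back.
last4=$(printf '%s\n' "$sample" | rev | cut -c 1-4 | rev)

# Address fields relative to NF (the last field) so the missing middle
# initial does not matter: first character of the class, BISS, CPSC and
# INFO/COMP columns, then the last 4 characters of the phone number.
firsts=$(printf '%s\n' "$sample" | awk '{
    print substr($(NF-6), 1, 1), substr($(NF-5), 1, 1),
          substr($(NF-4), 1, 1), substr($(NF-2), 1, 1),
          substr($NF, length($NF) - 3)
}')

printf '%s\n' "$last4" "$firsts"
```

To run this against a file, replace the printf '%s\n' "$sample" part with the file name as an argument to rev or awk. If the lines do end in spaces or a carriage return, the rev variant will pick those up, so the awk variant is probably the safer of the two.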