I have a sheet with the struct below.
COLUMN1 | COLUMN2 | COLUMN3
0 | 3 | 1
1 | 3 |
2 | 5 | 4
4 | 2 | 2
How can I return just the column 1 and 2 by its column name? Not by the index. Using the wks.get_all_values() as shown bellow, it will return all columns of the sheet.
My code:
ss = gs.open(args['name'])
wks = ss.worksheet(args['worksheet'])
data = wks.get_all_values()
headers = data.pop(0)
df = pd.Dataframe(data, columns=headers)
I was needing something like, wks.get_columns(['COLUMN1','COLUMN2'])
I hope someone can help me. Thanks
I believe your goal is as follows.
In this case, how about the following modification?
In this modification, the specific columns are retrieved from the values retrieved with wks.get_all_values()
. In this case, one API call is used. For example, when the specific columns are retrieved after the header row was retrieved, 2 API calls are retrieved. So I proposed this method.
headers = ["COLUMN1", "COLUMN3"] # Please set the header values you want to retrieve.
ss = gs.open(args['name'])
wks = ss.worksheet(args['worksheet'])
values = wks.get_all_values()
data = zip(*(e for e in zip(*values) if e[0] in headers))
df = pd.DataFrame(data, columns=headers)
When this script is run using your sample sheet in your question, the following result is obtained.
0 COLUMN1 COLUMN3
1 0 1
2 1
3 2 4
4 4 2
When you want to retrieve data
as a list, please modify data = zip(*(e for e in zip(*values) if e[0] in headers))
to data = [list(f) for f in zip(*(e for e in zip(*values) if e[0] in headers))]
.