I have a tokenized dataset named tokenized_datasets, as follows:
I want to add a column named "labels" that is a copy of "input_ids" within the features. I'm aware of the following method from the post "Add new column to a HuggingFace dataset":
new_dataset = dataset.add_column("labels", tokenized_datasets['input_ids'].copy())
But I first need to access the Dataset Dictionary. This is what I have so far but it doesn't seem to do the trick:
def new_column(example):
    example["labels"] = example["input_ids"].copy()
    return example

dataset_new = tokenized_datasets.map(new_column)
KeyError: 'input_ids'
Try one of the two options below:
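# first option: return only the new column; map() merges it into each example
def new_column(example):
    return {"labels": example["input_ids"]}

# second option: add the key to the example and return the whole example
def new_column(example):
    example["labels"] = example["input_ids"]
    return example

dataset_new = tokenized_datasets.map(new_column)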
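
To see what the mapped function receives and returns, here is a minimal sketch that runs on plain dicts, so it can be tried without loading the dataset. The names add_labels and add_labels_batched are hypothetical, and the token ids are made-up sample values. The batched variant assumes the function will be passed to map() with batched=True, in which case each value is a list of rows.

```python
# Sketch only: helper names and sample token ids are illustrative assumptions.

def add_labels(example):
    """Per-example version: `example` is a dict holding a single row."""
    # list(...) makes a shallow copy so labels and input_ids are separate lists
    return {"labels": list(example["input_ids"])}


def add_labels_batched(batch):
    """Batched version: each value in `batch` is a list with one entry per row."""
    return {"labels": [list(ids) for ids in batch["input_ids"]]}


# Quick check on plain dicts standing in for dataset rows
row = {"input_ids": [101, 2023, 102]}
rows = {"input_ids": [[101, 2023, 102], [101, 102]]}
print(add_labels(row))
print(add_labels_batched(rows))
```

With the real dataset, the batched helper would be passed as tokenized_datasets.map(add_labels_batched, batched=True). If the KeyError persists, printing tokenized_datasets.column_names should show whether every split actually contains an input_ids column.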