Extracting dictionary values into .txt files


I want to create .txt files from a dictionary, writing each entry's text on its own line in each file. The dictionary structure looks like:

{'id': 0,
 'text': 'Mtendere Village was inspired by the vision'}

I am using this code:

from tqdm.auto import tqdm  #loading bar

text_data = []
file_count = 0

for sample in tqdm(new_dict):
    # remove newline characters from each sample, as we need to use them exclusively as separators
    sample = sample['text'].replace('\n', '\s')
    text_data.append(sample)
    if len(text_data) == 5_000:
        # once we hit the 5K mark, save to file
        with open('file_path\oscar_data\oscar_%s.txt' %file_count, 'w', encoding='utf-8') as fp:
            fp.write('\n'.join(text_data)) 
        text_data = []
        file_count += 1

However, this gives me an error:

---> 12     sample = sample['text'].replace('\n', '\s') 
TypeError: 'int' object is not subscriptable

Although I understand what the error is telling me, I'm not sure how to correct it...


Solution

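  • I think the loop expects a list of dictionaries, but new_dict is actually a single dictionary. Iterating over a dictionary yields its keys, not its values, so each sample is a key (an int, according to the error) and sample['text'] fails.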

    from tqdm.auto import tqdm  #loading bar
    
    new_dict = [
        {
            'id': 0,
            'text': 'Mtendere Village was inspired by the vision'
        }
    ]
    
    text_data = []
    file_count = 0
    
    for sample in tqdm(new_dict):
        # Remove newline characters from each sample, since newlines are used exclusively as separators.
        # '\s' is not a valid Python escape sequence, so a plain space is used instead.
        sample = sample['text'].replace('\n', ' ')
        text_data.append(sample)
        if len(text_data) == 5000:
            # Once we hit the 5K mark, save it to file (raw string keeps the backslashes literal)
            with open(r'file_path\oscar_data\oscar_%s.txt' % file_count, 'w', encoding='utf-8') as fp:
                fp.write('\n'.join(text_data)) 
            
            text_data = []
            file_count += 1
    

    I changed new_dict to a list of dictionaries, so each sample is now a dictionary and sample['text'] works. That fixed the issue.
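
If rebuilding new_dict as a list is inconvenient, another option is to iterate over the dictionary's values. The sketch below assumes new_dict is keyed by integer ids with dictionaries as values, which would explain the 'int' error but is not confirmed by the question. The function name write_batches, its parameters, and the output directory are hypothetical. It also writes the final partial batch, which a loop that only saves at exactly 5000 samples would skip.

```python
from pathlib import Path


def write_batches(samples, out_dir, batch_size=5000):
    """Write each sample's text on its own line, batch_size lines per file.

    `samples` may be a list of dicts, or a dict whose values are dicts
    (assumption: the original new_dict was keyed by integer ids).
    Returns the list of paths written.
    """
    # Iterating over a dict yields its keys, so take the values explicitly.
    if isinstance(samples, dict):
        samples = samples.values()

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    batch = []

    def flush():
        # Files are numbered in the order they are written: oscar_0.txt, oscar_1.txt, ...
        path = out_dir / f'oscar_{len(written)}.txt'
        path.write_text('\n'.join(batch), encoding='utf-8')
        written.append(path)
        batch.clear()

    for sample in samples:
        # Newlines separate samples in the output, so replace any inside the text.
        batch.append(sample['text'].replace('\n', ' '))
        if len(batch) == batch_size:
            flush()

    # Write whatever is left over after the last full batch.
    if batch:
        flush()
    return written
```

Calling write_batches(new_dict, 'oscar_data') should then work whether new_dict is a list of dictionaries or a dictionary of them. The tqdm progress bar is left out to keep the sketch dependency-free; wrapping samples in tqdm(...) inside the loop would restore it.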