
Python Script Only Reads the First 100 Files


I have a folder with 616 files, but my script only reads the first 100. What do I need to change to get it to read them all? It may be relevant that I'm using Anaconda Navigator's Jupyter Notebook.

Here's my code:

import re
import string
from collections import Counter
import os
import glob

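def word_count(file_tokens):
    # Counter tallies every token in one pass, so no explicit loop is needed
    # (the original loop also left `count` undefined for an empty file)
    return Counter(file_tokens)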

files_list = glob.glob("german/test/*/negative/*")
print(files_list)
for path in files_list:
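    # assumes Windows backslash separators and exactly four path components
    corpus, tache, classe, file_name = path.split("\\")
    # the with-block closes each file as soon as it has been read
    with open(path, mode="r", encoding="utf-8") as file:
        read_file = file.read()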

    ##lowercase
    file_clean = read_file.lower()


    ##tokenize
    file_tokens = file_clean.split()

    ##word count and sort
    print(word_count(file_tokens))

Solution

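  • Have you tried printing the length of the files_list variable to check whether it is 616 or 100? If it prints 100, the glob pattern is matching fewer files than you expect; if it prints 616, every file is being found and the problem lies elsewhere.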

    print(len(files_list))
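
To take that check one step further, here is a minimal sketch that reports how many paths the pattern matches, how many of those are regular files, and how the matches are spread across parent folders. The helper names summarize_matches and count_by_parent are made up for illustration and are not part of the original script; the pattern is the one from the question.

```python
import glob
import os


def summarize_matches(pattern):
    """Return (number of paths matched, number of those that are regular files)."""
    paths = glob.glob(pattern)
    files = [p for p in paths if os.path.isfile(p)]
    return len(paths), len(files)


def count_by_parent(pattern):
    """Count matched paths per parent directory (hypothetical helper)."""
    counts = {}
    for path in glob.glob(pattern):
        parent = os.path.dirname(path)
        counts[parent] = counts.get(parent, 0) + 1
    return counts


if __name__ == "__main__":
    pattern = "german/test/*/negative/*"  # pattern taken from the question
    matched, regular = summarize_matches(pattern)
    print(f"{matched} paths matched, {regular} are regular files")
    for parent, n in sorted(count_by_parent(pattern).items()):
        print(f"{n:5d}  {parent}")
```

If the per-folder counts add up to 616, glob is finding everything and the loop is the place to look next. If they add up to 100, compare the folders listed against what you see on disk to find which ones the pattern is missing.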