Tags: python, string, frequency, letter, frequency-distribution

Counting subsequent letters


I am trying to write Python code that counts, for each letter in a sentence, which letters come directly after it. For instance, given the sentence

"""So I am trying to implement code that will count the next letter in a sentence, using 
python"""

the letters that follow each letter, with their counts, would be:

  1. for 's'

    • 'o' :1
    • 'e' :1
    • 'i' :1
  2. for 'o'

    • ' ' :2
    • 'd' :1
    • 'u' :1
    • 'n' :1

I think you get the idea.

I have already written code that counts the occurrences of a single letter:

def count_letters(word, char):
    count = 0
    for c in word:
        if char == c:
            count += 1
    return count

As you can see, this only counts occurrences of a given letter, not the letters that follow it. Can someone give me a hand with this?


Solution

  • from collections import Counter, defaultdict
    
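    # counts[c1][c2] is the number of times c2 directly follows c1
    counts = defaultdict(Counter)
    
    s = """So I am trying to implement code that will count the next letter in a sentence, using
    python""".lower()
    
    # Pair every character with its successor and tally each pair
    for c1, c2 in zip(s, s[1:]):
        counts[c1][c2] += 1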
    

    Apart from being simpler, this should be significantly faster than pault's answer, because it walks the string once instead of iterating over it again for every letter.

    Concepts to google that aren't named in the code:

    • for c1, c2 in ... (namely the fact that there are two variables): tuple unpacking
    • s[1:]: slicing. This is a copy of the string without its first character, so zipping it with s lines each character up with the one that follows it.
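
To make the pairing and the resulting structure concrete, here is a minimal sketch that wraps the same idea in a function and reads the counts back out. The helper name next_letter_counts and the shortened sample sentence are assumptions for illustration only; they are not part of the original answer.

```python
from collections import Counter, defaultdict


def next_letter_counts(text):
    """Map each character to a Counter of the characters that directly follow it.

    Hypothetical helper name; the logic mirrors the loop in the answer above.
    """
    counts = defaultdict(Counter)
    lowered = text.lower()
    # zip stops at the shorter argument, so the last character starts no pair.
    for current, following in zip(lowered, lowered[1:]):
        counts[current][following] += 1
    return counts


# Slicing plus tuple unpacking on a tiny input:
pairs = list(zip("abc", "abc"[1:]))
print(pairs)  # [('a', 'b'), ('b', 'c')]

# Shortened sample sentence (an assumption, to keep the output readable).
counts = next_letter_counts("So I am trying to implement code")
print(counts["o"].most_common())  # [(' ', 2), ('d', 1)]
```

Counter.most_common() returns the followers sorted from most to least frequent, which is probably the form the question is after. Note that spaces and punctuation are counted like any other character; filter them out before counting if only letters matter.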