python-2.7 math listings

How to increase count as script reads lines?


I have this table of values and I was wondering how I can make the program read each line. For each line that starts with 'a', 'g', 'c', or 'u', I want it to increase the count by one. For this example, when I run it, the result should be 12.

a  1    0.000 S
g  2    0.260 S
a  3    0.990 S
a  4    0.980 S
c  5    0.000 S
u  6    1.000 S
c  7    0.000 S
a  8    1.000 S
a  9    1.000 T
u 10    0.820 S
a 11    1.000 T
g 12    0.000 S
F 13    1.000 S
S 14    1.000 S
T 15    1.000 S

The code that I tried is below:

rna_residues = ['a','c','g','u']
count_dict = {}
        #Making the starting number 0
        rna_count = 0
        #if any lines of the file starts with one of the rna_residue
        if line.startswith(tuple(rna_residues)):
            for residue in line:
                if residue in rna_residues:
                    rna_count += 1
            count_dict[line] = [rna_count]  
            print count_dict    

Somehow, when I run it, the count never goes above 1; each line is printed with its own count of 1:

{'a  1    0.000 S\n': [1]}
{'g  2    0.260 S\n': [1]}
{'a  3    0.990 S\n': [1]}
{'a  4    0.980 S\n': [1]}
{'c  5    0.000 S\n': [1]}
{'u  6    1.000 S\n': [1]}
{'c  7    0.000 S\n': [1]}
{'a  8    1.000 S\n': [1]}
{'a  9    1.000 T\n': [1]}
{'u 10    0.820 S\n': [1]}
{'a 11    1.000 T\n': [1]}
{'g 12    0.000 S\n': [1]}

I know this is a lot of information, but are there any tips that can help me with this? Thanks a lot!!


Solution

  • You are using the whole line as a key in the dictionary, and rna_count is reset to 0 for every line, so each line can only ever contribute its own single residue character and all values will be 1. Why do you need the dictionary at all? I was under the impression that you want to count the number of lines that start with any one of the characters 'a', 'c', 'g', 'u'.

    For this, the following code suffices:

    rna_residues = ['a','c','g','u']
    rna_count = 0
    with open('/path/to/file') as opened_file:    
        for line in opened_file:
            # or if line[0] in rna_residues
            if any(line.startswith(residue) for residue in rna_residues):
                rna_count += 1
    print rna_count
    # 12
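
If the dictionary in the question was meant to hold a count per residue rather than per line, the same loop can be sketched with collections.Counter. This is only an illustration under that assumption: the function name count_rna_lines and the inline sample data (copied from the table in the question) are hypothetical, and print is called with a single argument so the snippet runs on both Python 2.7 and Python 3.

```python
from collections import Counter

# Residues that mark an RNA line (taken from the question).
RNA_RESIDUES = ('a', 'c', 'g', 'u')


def count_rna_lines(lines):
    """Return (total, per_residue) for lines that start with an RNA residue.

    `lines` can be any iterable of strings, e.g. an open file object.
    """
    per_residue = Counter()
    for line in lines:
        # str.startswith accepts a tuple of prefixes, so no inner loop is needed.
        if line.startswith(RNA_RESIDUES):
            per_residue[line[0]] += 1
    return sum(per_residue.values()), per_residue


# Sample data copied from the table in the question.
sample = """\
a  1    0.000 S
g  2    0.260 S
a  3    0.990 S
a  4    0.980 S
c  5    0.000 S
u  6    1.000 S
c  7    0.000 S
a  8    1.000 S
a  9    1.000 T
u 10    0.820 S
a 11    1.000 T
g 12    0.000 S
F 13    1.000 S
S 14    1.000 S
T 15    1.000 S
""".splitlines()

total, per_residue = count_rna_lines(sample)
print(total)                        # 12
print(sorted(per_residue.items()))  # [('a', 6), ('c', 2), ('g', 2), ('u', 2)]
```

To run it on a real file, pass the open file object instead of the sample list, for example inside a with open(...) block as in the code above. The counters are created once, before the loop, which is what lets them accumulate across lines.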