I used the dict comprehension below
dimer = {(ab+cd):{"1":0,"2":0,"3":0} for cd in 'ACGT' for ab in 'ACGT'}
to generate a dict of dicts, dimer:
dimer = {"AA":{"1":0,"2":0,"3":0}, "AC":{"1":0,"2":0,"3":0}, "AG":{"1":0,"2":0,"3":0}, "AT":{"1":0,"2":0,"3":0}, "CA":{"1":0,"2":0,"3":0}, "CC":{"1":0,"2":0,"3":0}, "CG":{"1":0,"2":0,"3":0}, "CT":{"1":0,"2":0,"3":0}, "GA":{"1":0,"2":0,"3":0}, "GC":{"1":0,"2":0,"3":0}, "GG":{"1":0,"2":0,"3":0}, "GT":{"1":0,"2":0,"3":0}, "TA":{"1":0,"2":0,"3":0}, "TC":{"1":0,"2":0,"3":0}, "TT":{"1":0,"2":0,"3":0}, "TG":{"1":0,"2":0,"3":0}}
However, now I want to sum up selected elements.
If I hardcode them, it looks like this:
total_A = dimer["AA"]["1"]+dimer["CA"]["1"]+dimer["GA"]["1"]+dimer["TA"]["1"]+dimer["AA"]["2"]+dimer["CA"]["2"]+dimer["GA"]["2"]+dimer["TA"]["2"]+dimer["AA"]["3"]+dimer["CA"]["3"]+dimer["GA"]["3"]+dimer["TA"]["3"]
total_C = dimer["AC"]["1"]+dimer["CC"]["1"]+dimer["GC"]["1"]+dimer["TC"]["1"]+dimer["AC"]["2"]+dimer["CC"]["2"]+dimer["GC"]["2"]+dimer["TC"]["2"]+dimer["AC"]["3"]+dimer["CC"]["3"]+dimer["GC"]["3"]+dimer["TC"]["3"]
total_G = dimer["AG"]["1"]+dimer["CG"]["1"]+dimer["GG"]["1"]+dimer["TG"]["1"]+dimer["AG"]["2"]+dimer["CG"]["2"]+dimer["GG"]["2"]+dimer["TG"]["2"]+dimer["AG"]["3"]+dimer["CG"]["3"]+dimer["GG"]["3"]+dimer["TG"]["3"]
total_T = dimer["AT"]["1"]+dimer["CT"]["1"]+dimer["GT"]["1"]+dimer["TT"]["1"]+dimer["AT"]["2"]+dimer["CT"]["2"]+dimer["GT"]["2"]+dimer["TT"]["2"]+dimer["AT"]["3"]+dimer["CT"]["3"]+dimer["GT"]["3"]+dimer["TT"]["3"]
The best approach I have come up with to simplify it is using nested for-loops:
total_0 = {i:0 for i in 'ACGT'}
for i in 'ACGT':
    for j in 'ACGT':
        for k in '123':
            total_0[i] += dimer[j+i][k]
Is there a way to sum them up using a one-liner?
I also have another set of nested for-loops:
row_sum = {i:{"1":0,"2":0,"3":0} for i in 'ACGT'}
for i in 'ACGT':
    for j in 'ACGT':
        for k in '123':
            row_sum[i][k] += float(dimer[i+j][k])
The hardcoded version looks like this:
row_sum = {"A":{"1":0,"2":0,"3":0},"C":{"1":0,"2":0,"3":0},"G":{"1":0,"2":0,"3":0},"T":{"1":0,"2":0,"3":0}}
for i in range(1,4,1):
    row_sum["A"][str(i)] = float(dimer["AA"][str(i)]+dimer["AC"][str(i)]+dimer["AG"][str(i)]+dimer["AT"][str(i)])
    row_sum["C"][str(i)] = float(dimer["CA"][str(i)]+dimer["CC"][str(i)]+dimer["CG"][str(i)]+dimer["CT"][str(i)])
    row_sum["G"][str(i)] = float(dimer["GA"][str(i)]+dimer["GC"][str(i)]+dimer["GG"][str(i)]+dimer["GT"][str(i)])
    row_sum["T"][str(i)] = float(dimer["TA"][str(i)]+dimer["TC"][str(i)]+dimer["TG"][str(i)]+dimer["TT"][str(i)])
Is there also a way to write the second set of nested for-loops as a one-liner?
Sorry I am really new to Python. Any help will be appreciated!
First, you can collapse the three loops into one using a Cartesian product, like this:
from itertools import product
row_sum = {i: {"1": 0, "2": 0, "3": 0} for i in 'ACGT'}
for i, j, k in product('ACGT', 'ACGT', '123'):
    row_sum[i][k] += float(dimer[i + j][k])
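The same trick should work for the first loop in the question (total_0), where the letter being totalled is the second character of the key. Here is a minimal, self-contained sketch; the counts written into dimer are made up purely for illustration.

```python
from itertools import product

# Build the dict of dicts as in the question.
dimer = {a + b: {"1": 0, "2": 0, "3": 0} for a in "ACGT" for b in "ACGT"}

# Sample counts, invented for this example only.
dimer["CA"]["1"] = 2
dimer["GA"]["3"] = 5
dimer["AT"]["2"] = 1

# One loop over every (second letter, first letter, position) combination.
total_0 = {i: 0 for i in "ACGT"}
for i, j, k in product("ACGT", "ACGT", "123"):
    total_0[i] += dimer[j + i][k]  # j + i: i is the second letter of the key

print(total_0)  # {'A': 7, 'C': 0, 'G': 0, 'T': 1}
```

product yields the same (i, j, k) tuples, in the same order, as the three nested loops, so the result is identical; only the nesting goes away.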
Here is a one-liner that totals all counts for each first letter, though it's probably hard to follow if you are new to Python:
{i: sum(sum(dimer[i + j].values()) for j in 'ACGT') for i in 'ACGT'}
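Note that the expression above groups by the first letter of each key (dimer[i + j]), whereas total_A through total_T in the question group by the second letter (dimer[j + i]). As a sketch of how comprehensions could reproduce both loops from the question exactly, assuming dimer has the structure shown there (the sample counts and the name total are made up for illustration):

```python
# Build the dict of dicts as in the question.
dimer = {a + b: {"1": 0, "2": 0, "3": 0} for a in "ACGT" for b in "ACGT"}

# Sample counts, invented for this example only.
dimer["CA"]["1"] = 2
dimer["GA"]["3"] = 5
dimer["AT"]["2"] = 1

# Equivalent of total_0: sum over all first letters j and all positions,
# grouped by the second letter i.
total = {i: sum(sum(dimer[j + i].values()) for j in "ACGT") for i in "ACGT"}

# Equivalent of row_sum: for each first letter i and position k,
# sum over all second letters j, stored as a float.
row_sum = {i: {k: float(sum(dimer[i + j][k] for j in "ACGT")) for k in "123"}
           for i in "ACGT"}

print(total)         # {'A': 7, 'C': 0, 'G': 0, 'T': 1}
print(row_sum["A"])  # {'1': 0.0, '2': 1.0, '3': 0.0}
```

Whether these read better than the explicit loops is a matter of taste; the product version is arguably easier to debug.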