Here is my problem:
I have a dataframe like this:
id tfidf_weights
1 {word1: 0.01, word2: 0.01, word3: 0.01, ...}
2 {word4: 0.01, word5: 0.01, word6: 0.01, ...}
3 {word7: 0.01, word8: 0.01, word9: 0.01, ...}
4 {word10: 0.01, word11: 0.01, word12: 0.01, ...}
5 {word13: 0.01, word14: 0.01, word15: 0.01, ...}
.
.
.
The 'id' column holds the document ids, and 'tfidf_weights' holds the tf-idf weight of each word in each document.
From this dataframe, I can obtain a dict with the following structure:
mydict = {1:{word1: 0.01, word2: 0.01, word3: 0.01, ...}, 2:{word4: 0.01, word5: 0.01, word6: 0.01, ...}, 3:{word7: 0.01, word8: 0.01, word9: 0.01, ...}, 4:{word10: 0.01, word11: 0.01, word12: 0.01, ...}, 5:{word13: 0.01, word14: 0.01, word15: 0.01, ...}, ...}
What I want to do is build, from this dictionary, a matrix like this:
word1 word2 word3 word4 ...
1 0.01 0.01 0.01 0.01
2 0.01 0.01 0.01 0.01
3 0.01 0.01 0.01 0.01
4 0.01 0.01 0.01 0.01
5 0.01 0.01 0.01 0.01
.
.
.
Thank you for your help!
You can convert a list of dictionaries into a dataframe by passing it directly to the pandas DataFrame constructor:
import pandas as pd
# Each dictionary becomes one row; its keys become the column names.
a = [{"0": 0}, {"1": 1}]
df = pd.DataFrame(a)
To apply this to your problem, all you have to do is turn mydict
into a list of dictionaries instead of a dictionary of dictionaries.
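
As a minimal sketch of that conversion, assuming mydict looks like the one in the question: the words and weights below are made-up sample values, and the name matrix is only illustrative. The document ids are passed as the index so they are not lost when the outer dictionary is flattened into a list.

```python
import pandas as pd

# Hypothetical sample data standing in for the real tf-idf weights.
mydict = {
    1: {"word1": 0.01, "word2": 0.02},
    2: {"word2": 0.03, "word3": 0.04},
    3: {"word1": 0.05, "word4": 0.06},
}

# One row per document: the inner dicts become the rows,
# and the outer keys (the document ids) are kept as the index.
matrix = pd.DataFrame(list(mydict.values()), index=list(mydict.keys()))

# A word that does not occur in a document comes out as NaN;
# here it is treated as a weight of zero.
matrix = matrix.fillna(0.0)

# Optional: put the word columns in alphabetical order.
matrix = matrix.sort_index(axis=1)

print(matrix)
```

Whether missing words should be filled with 0 depends on what you do with the matrix afterwards. pd.DataFrame.from_dict(mydict, orient="index") should give a similar result without building the list first.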